Thanks Ryan for your answer.
Yes, I was mistaken. I also thought that the default install of HBase set up a single-node HDFS, and it seems that is wrong:

Running

ps auwx | grep java

shows only two Java processes:

org.apache.hadoop.hbase.zookeeper.HQuorumPeer
and
org.apache.hadoop.hbase.master.HMaster

In the default HBase distribution, hbase-default.xml contains:

<name>hbase.rootdir</name>
<value>file:///tmp/hbase-${user.name}/hbase</value>

I thought that the dependency of HBase on HDFS was much stronger. From the HBase configuration point of view, is the hbase.rootdir parameter the only parameter that hooks HBase to HDFS?
Or does ZooKeeper also bind HBase to HDFS?
Is it fair to say that HBase plays well with HDFS, but that it also plays well with any POSIX-compliant filesystem? For a small cluster, is it a good idea to *not* use HDFS as storage for the HBase data? If I accept losing one hour of HBase data, is it OK to point hbase.rootdir at a local (ext3) filesystem on the node and then rsync that directory to another node every hour? I guess rsync is not ideal given the file structure used (it will generate a lot of network traffic).
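
To make the hbase.rootdir question concrete, here is my understanding of what an HDFS-backed setting would look like as a property inside the <configuration> element of hbase-site.xml. This is only a sketch: the host name namenode.example.com and the port 9000 are placeholders I made up, not values from a real cluster.

```xml
<!-- hbase-site.xml: overrides the file:/// default from hbase-default.xml -->
<property>
  <name>hbase.rootdir</name>
  <!-- hypothetical NameNode host and port; replace with your own -->
  <value>hdfs://namenode.example.com:9000/hbase</value>
</property>
```

If that is right, switching between local storage and HDFS would come down to changing the URI scheme of this one value.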

Thanks in advance,
TR

Ryan Rawson wrote:
I think you might be mistaken a bit - HBase layers on top of, and uses
hadoop.  HBase uses HDFS for persistence, and thus the balancer config
and the other things you point out belong in the hadoop config.

3 nodes is a little light for HDFS... With r=3, there are no spares.
