Hello,

I am a bit confused about how the HBase replication and DFS replication
configurations work together.

My application is deployed on an HBase cluster (0.94.3) with two region
servers. The two Hadoop datanodes run on the same two region servers.

Because we only have two datanodes, dfs.replication was set to 2.

The person who configured the small cluster didn't explicitly set the HBase
replication configs, which include:

(1) in ${HBASE_HOME}/conf/hbase-site.xml, hbase.replication is not set. I
think the default value is "false" according to
http://hbase.apache.org/replication.html.

(2) in the table, REPLICATION_SCOPE is set to 0 (the default).
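
For reference, my understanding is that setting these explicitly would look
roughly like the sketch below. None of the HBase part has been applied on our
cluster, and the table name 'mytable' and column family 'cf' are placeholders
I made up for illustration.

```xml
<!-- hdfs-site.xml: what we currently have (block replication inside the cluster) -->
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>

<!-- hbase-site.xml: NOT set on our cluster; shown only as it would appear if enabled -->
<property>
  <name>hbase.replication</name>
  <value>true</value>
</property>

<!-- HBase shell, per column family (placeholder names, not from our cluster):
     disable 'mytable'
     alter 'mytable', {NAME => 'cf', REPLICATION_SCOPE => 1}
     enable 'mytable'
-->
```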

However, even without setting hbase.replication and REPLICATION_SCOPE, it
appears that the tables are duplicated on the two region servers (I can
open a shell on either region server and find the same rows in a scan).

My question is: does the default DFS replication take care of replicating
HBase tables within the same cluster, so that we don't need to set up the
HBase replication configs? And should we set up the HBase replication
configs (1) and (2) above only when we need to replicate HBase from one
cluster to another?

thanks!

Jason
