[ 
https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977904#comment-14977904
 ] 

Ashish Singhi commented on HBASE-13153:
---------------------------------------

The following problems were faced in our internal testing:

*Problem 1:*
Replication for bulk loaded data was not working in an HDFS HA cluster.

*Solution:*
The sink cluster needs the source cluster's HDFS client configuration to be 
able to perform its operation. So we have taken the approach below:

a) Configure a unique replication cluster ID <hbase.replication.cluster.id> 
for each source cluster.  Example: dc1, dc2
b) Each source cluster needs to place its HDFS client configurations in the 
peer cluster, in a directory named after the source cluster's 
<hbase.replication.cluster.id> under the <hbase.replication.conf.dir> 
directory.
c) During replication, the source cluster will send its unique replication 
cluster ID to the peer cluster in the request.  The peer cluster will locate 
this source cluster's HDFS client configurations based on this ID and the 
configured replication configuration directory, and will perform the 
replication.

!https://imageshack.com/i/p3Ik2xanp!

This approach requires two more configurations:
i. At the source cluster, hbase.replication.cluster.id (mandatory when 
hbase.replication.bulkload.enabled is set to true)
ii. At the sink cluster, hbase.replication.conf.dir (defaults to the HBase 
configuration directory)
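
As a sketch of what the settings above could look like in hbase-site.xml (the 
property names are the ones proposed here; the values dc1 and 
/etc/hbase/replication are only illustrative):

```xml
<!-- Source cluster hbase-site.xml -->
<property>
  <name>hbase.replication.bulkload.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hbase.replication.cluster.id</name>
  <value>dc1</value>
</property>

<!-- Sink cluster hbase-site.xml (optional; defaults to the HBase conf dir) -->
<property>
  <name>hbase.replication.conf.dir</name>
  <value>/etc/hbase/replication</value>
</property>
```

With these values, the source cluster's HDFS client files (core-site.xml, 
hdfs-site.xml) would be placed on the sink under /etc/hbase/replication/dc1/.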

*Problem 2:* 
If the source and sink clusters share the same HDFS, then the hfile from the 
source cluster is moved to the sink cluster instead of copied, as per the 
logic in {{SecureBulkLoadEndPoint}}.

*Solution:*
To avoid this we will copy the hfiles from source cluster into sink cluster 
staging directory parallely in the {{HFileReplicationCallable}} class itself 
before sending the request to {{LoadIncrementalHFiles}} and will ensure to 
avoid copy/rename again in {{SecureBulkLoadEndPoint}} to staging directory. 
This will also solve the problem "the bulk load to peer cluster needs a split, 
we will do split by reading each cell from remote src cluster HFile. This will 
be a costly op" found during a offline discussion with [~anoopsamjohn].
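
A minimal sketch of the parallel copy step, using plain JDK classes; the class 
and method names here are illustrative, not the actual 
{{HFileReplicationCallable}} API, which would use the Hadoop FileSystem API 
rather than java.nio:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: copy each hfile into the staging directory
// concurrently, never moving the source file.
public class ParallelHFileCopy {
  static void copyToStaging(List<Path> hfiles, Path stagingDir, int threads)
      throws IOException, InterruptedException, ExecutionException {
    Files.createDirectories(stagingDir);
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Path>> results = new ArrayList<>();
      for (Path hfile : hfiles) {
        results.add(pool.submit(() ->
            // Copy, not move, so the source cluster's file stays intact
            // even when source and sink share the same filesystem.
            Files.copy(hfile, stagingDir.resolve(hfile.getFileName()),
                StandardCopyOption.REPLACE_EXISTING)));
      }
      for (Future<Path> f : results) {
        f.get(); // surface any copy failure before bulk load proceeds
      }
    } finally {
      pool.shutdown();
    }
  }
}
```

Because the files already sit in the staging directory when 
{{LoadIncrementalHFiles}} is invoked, the endpoint can skip its own 
copy/rename step.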

*Problem 3:*
If add_peer is done before enabling replication for bulk loaded data, then 
bulk loaded data was not getting replicated to that peer.

*Solution:*
As part of RS startup we init {{ReplicationSourceManager}}; there we will 
check that all the peers in the cluster have their node in the hfile-refs 
znode, and create it for any peer that is missing one.
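
The core of that startup check is a set difference between the configured peer 
ids and the children of the hfile-refs znode. A sketch, with hypothetical 
names; the real code would read both sets through the replication ZooKeeper 
helpers rather than take them as parameters:

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the RS-startup reconciliation for Problem 3.
public class HFileRefsStartupCheck {
  // Returns the peer ids that have no node under the hfile-refs znode,
  // so the caller can create a node for each of them.
  static Set<String> missingHFileRefNodes(Set<String> peerIds,
      Set<String> hfileRefsChildren) {
    Set<String> missing = new TreeSet<>(peerIds);
    missing.removeAll(hfileRefsChildren);
    return missing;
  }
}
```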



Please review the above solutions; if they seem OK then I will post a new 
patch based on them and update the doc.

> Bulk Loaded HFile Replication
> -----------------------------
>
>                 Key: HBASE-13153
>                 URL: https://issues.apache.org/jira/browse/HBASE-13153
>             Project: HBase
>          Issue Type: New Feature
>          Components: Replication
>            Reporter: sunhaitao
>            Assignee: Ashish Singhi
>             Fix For: 2.0.0
>
>         Attachments: HBASE-13153-v1.patch, HBASE-13153-v10.patch, 
> HBASE-13153-v11.patch, HBASE-13153-v2.patch, HBASE-13153-v3.patch, 
> HBASE-13153-v4.patch, HBASE-13153-v5.patch, HBASE-13153-v6.patch, 
> HBASE-13153-v7.patch, HBASE-13153-v8.patch, HBASE-13153-v9.patch, 
> HBASE-13153.patch, HBase Bulk Load Replication-v1-1.pdf, HBase Bulk Load 
> Replication-v2.pdf, HBase Bulk Load Replication.pdf
>
>
> Currently we plan to use the HBase Replication feature to deal with a 
> disaster tolerance scenario. But we encounter an issue: we use bulkload very 
> frequently, and because bulkload bypasses the write path and does not 
> generate WAL, the data will not be replicated to the backup cluster. It's 
> inappropriate to bulkload twice, on both the active cluster and the backup 
> cluster. So I advise doing some modification to the bulkload feature to 
> enable bulkload to both the active cluster and the backup cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
