[
https://issues.apache.org/jira/browse/HBASE-13153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977904#comment-14977904
]
Ashish Singhi commented on HBASE-13153:
---------------------------------------
The following problems were encountered in our internal testing.
*Problem 1:*
Replication of bulk loaded data was not working on an HDFS HA cluster.
*Solution:*
The sink cluster needs the source cluster's HDFS client configuration to be
able to perform its part of the operation. So we have taken the approach below:
a) Configure a unique replication cluster ID <hbase.replication.cluster.id> for
each source cluster. Example: dc1, dc2.
b) Each source cluster needs to place its HDFS client configurations in the
peer cluster, in a directory named after the source cluster's
<hbase.replication.cluster.id>, under the peer's <hbase.replication.conf.dir>
directory.
c) During replication, the source cluster sends its unique replication cluster
ID to the peer cluster in the request. The peer cluster locates that source
cluster's HDFS client configuration based on this ID and the configured
replication configuration directory, and performs the replication (see the
sketch below).
!https://imageshack.com/i/p3Ik2xanp!
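A minimal sketch of how the sink side could resolve a source cluster's HDFS
client configuration, assuming the directory layout from (b); the class name,
method name and the exact file list here are illustrative, not from the patch:
{code:java}
import java.io.File;
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class SourceClusterConfResolver {
  /**
   * Builds a Configuration carrying the source cluster's HDFS client
   * settings, located via the replication cluster id sent in the request.
   */
  public static Configuration resolveSourceClusterConf(Configuration sinkConf,
      String sourceClusterId) throws IOException {
    // hbase.replication.conf.dir defaults to the sink's HBase conf directory.
    String confDir = sinkConf.get("hbase.replication.conf.dir",
        System.getenv("HBASE_CONF_DIR"));
    // Per (b), each source cluster's client configs live in a sub-directory
    // named after its hbase.replication.cluster.id, e.g. <confDir>/dc1.
    File clusterConfDir = new File(confDir, sourceClusterId);
    if (!clusterConfDir.isDirectory()) {
      throw new IOException("No replication conf dir for cluster "
          + sourceClusterId + " under " + confDir);
    }
    Configuration conf = HBaseConfiguration.create(sinkConf);
    for (String file : new String[] { "core-site.xml", "hdfs-site.xml" }) {
      File f = new File(clusterConfDir, file);
      if (f.exists()) {
        conf.addResource(new Path(f.getAbsolutePath()));
      }
    }
    return conf;
  }
}
{code}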
This approach requires two new configurations:
i. At the source cluster: hbase.replication.cluster.id (mandatory when
hbase.replication.bulkload.enabled is set to true).
ii. At the sink cluster: hbase.replication.conf.dir (defaults to the HBase
configuration directory).
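For instance, a minimal source-side sanity check implied by (i) could look
like the following (illustrative only; the exact check in the patch may
differ):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class BulkLoadReplicationConfigCheck {
  // Illustrative only: when bulk load replication is enabled, the source
  // cluster must also have a replication cluster id configured.
  public static void validate(Configuration conf) {
    if (conf.getBoolean("hbase.replication.bulkload.enabled", false)
        && conf.get("hbase.replication.cluster.id", "").isEmpty()) {
      throw new IllegalArgumentException(
          "hbase.replication.cluster.id must be configured when "
              + "hbase.replication.bulkload.enabled is set to true");
    }
  }
}
{code}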
*Problem 2:*
If the source and sink clusters share the same HDFS, then the HFile from the
source cluster is moved to the sink cluster instead of being copied, as per
the logic in {{SecureBulkLoadEndpoint}}.
*Solution:*
To avoid this, we will copy the HFiles from the source cluster into the sink
cluster's staging directory in parallel, in the {{HFileReplicationCallable}}
class itself, before sending the request to {{LoadIncrementalHFiles}}, and
will ensure they are not copied/renamed to the staging directory again in
{{SecureBulkLoadEndpoint}}, as sketched below.
This also solves a problem raised in an offline discussion with
[~anoopsamjohn]: if the bulk load into the peer cluster needs a split, we
would have to do the split by reading each cell from the remote source
cluster's HFile, which would be a costly operation.
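A minimal sketch of that copy step, assuming {{sourceClusterConf}} has been
resolved as in Problem 1 (the class, variable names and the sequential loop
are illustrative; the actual patch copies in parallel):
{code:java}
import java.io.IOException;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class HFileCopySketch {
  /** Copies (never moves) the source HFiles into the sink staging dir. */
  public static void copyToStaging(Configuration sourceClusterConf,
      Configuration sinkConf, List<Path> hfilePaths, Path stagingDir)
      throws IOException {
    FileSystem sourceFs = FileSystem.get(sourceClusterConf); // source HDFS
    FileSystem sinkFs = FileSystem.get(sinkConf);            // sink HDFS
    for (Path hfile : hfilePaths) {
      Path staged = new Path(stagingDir, hfile.getName());
      // deleteSource=false: copy, not move, so the source cluster's file
      // stays intact even when both clusters share the same HDFS.
      FileUtil.copy(sourceFs, hfile, sinkFs, staged,
          false /* deleteSource */, sinkConf);
    }
  }
}
{code}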
*Problem 3:*
If add_peer is done before enabling replication of bulk loaded data, then bulk
loaded data was not getting replicated to that peer.
*Solution:*
As part of RS startup, we initialize {{ReplicationSourceManager}}; there we
will check whether all the peers in the cluster have their node under the
hfile-refs znode, and create it for any peer that is missing one (see the
sketch below).
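A hedged sketch of that check ({{replicationPeers}}, {{zookeeper}} and
{{hfileRefsZNode}} are assumed fields of {{ReplicationSourceManager}}; the
znode layout and helper calls follow the design above, not the exact patch
code):
{code:java}
// For every known peer, make sure its node exists under the hfile-refs
// znode, creating it if the peer was added before bulk load replication
// was enabled.
for (String peerId : replicationPeers.getAllPeerIds()) {
  String peerNode = ZKUtil.joinZNode(hfileRefsZNode, peerId);
  if (ZKUtil.checkExists(zookeeper, peerNode) == -1) {
    ZKUtil.createWithParents(zookeeper, peerNode);
  }
}
{code}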
Please review the above solutions; if they seem OK, I will post a new patch
based on them and update the doc.
> Bulk Loaded HFile Replication
> -----------------------------
>
> Key: HBASE-13153
> URL: https://issues.apache.org/jira/browse/HBASE-13153
> Project: HBase
> Issue Type: New Feature
> Components: Replication
> Reporter: sunhaitao
> Assignee: Ashish Singhi
> Fix For: 2.0.0
>
> Attachments: HBASE-13153-v1.patch, HBASE-13153-v10.patch,
> HBASE-13153-v11.patch, HBASE-13153-v2.patch, HBASE-13153-v3.patch,
> HBASE-13153-v4.patch, HBASE-13153-v5.patch, HBASE-13153-v6.patch,
> HBASE-13153-v7.patch, HBASE-13153-v8.patch, HBASE-13153-v9.patch,
> HBASE-13153.patch, HBase Bulk Load Replication-v1-1.pdf, HBase Bulk Load
> Replication-v2.pdf, HBase Bulk Load Replication.pdf
>
>
> Currently we plan to use the HBase Replication feature to deal with a
> disaster tolerance scenario. But we have encountered an issue: we use bulk
> load very frequently, and because bulk load bypasses the write path, it does
> not generate WALs, so the data will not be replicated to the backup cluster.
> It is inappropriate to bulk load twice, on both the active cluster and the
> backup cluster. So I advise making some modifications to the bulk load
> feature to enable bulk loading to both the active cluster and the backup
> cluster.