[ 
https://issues.apache.org/jira/browse/HBASE-22380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916507#comment-16916507
 ] 

Wellington Chevreuil commented on HBASE-22380:
----------------------------------------------

[~anoop.hbase]
{quote}I mean why the user/client need to pass clusterId? When the bulk load 
req comes to the 1st cluster, that RS can add its clusterId in the WALEdit cell.
{quote}
We are already doing that, in a way. The user/client who first triggers the 
bulk load (by calling the LoadIncrementalHFiles class) does not set the cluster 
id. It is set only on the server side, when the bulk load reaches the 1st 
cluster, as you mentioned above. But we can't set it on the WAL edit, because 
bulk load replication does not send WAL edits to peer clusters. When a bulk 
load happens, it generates a single special WAL edit at the original cluster 
only. The replication source reads that edit, then triggers a bulk load on the 
peer cluster directly (it never ships any WAL entry for the original bulk load 
event in the source). That's why we had to put the cluster id in the bulk load 
request.
{quote}Only passing the 1st clusterId through all the clusters might not be 
enough. We need to add current clusterId into the existing list while passing 
to next. And same way check to see whether this cluster already saw/handled 
this replication or not.
{quote}
I can't yet see the need for that, but let me try changing the current test to 
include a 3rd cluster in the circle. Or do you mean to avoid replication if the 
same hfiles are bulk loaded again on the original cluster?
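
For illustration only, the check described in the quote above (append the 
current cluster id to the list, and skip the load if this cluster's id is 
already in it) could be sketched roughly as below. This is not the code in the 
patch: BulkLoadRequest, Cluster, simulateRing and the hfile path are 
hypothetical names made up for the sketch, and replication is modelled as a 
direct synchronous call rather than through a replication source reading the 
WAL.

```java
import java.util.ArrayList;
import java.util.List;

class BulkLoadLoopSketch {

    /** Hypothetical stand-in for a bulk load request carrying the ids of clusters already visited. */
    static final class BulkLoadRequest {
        final String hfilePath;
        final List<String> clusterIds;

        BulkLoadRequest(String hfilePath, List<String> clusterIds) {
            this.hfilePath = hfilePath;
            this.clusterIds = clusterIds;
        }
    }

    /** Hypothetical cluster with a single replication peer. */
    static final class Cluster {
        final String id;
        Cluster peer;
        int loads = 0;

        Cluster(String id) {
            this.id = id;
        }

        void bulkLoad(BulkLoadRequest req) {
            // This cluster already handled the event: break the circle here.
            if (req.clusterIds.contains(id)) {
                return;
            }
            loads++;
            // The id is added on the server side, never by the client.
            List<String> seen = new ArrayList<>(req.clusterIds);
            seen.add(id);
            if (peer != null) {
                // Stand-in for the replication source triggering a bulk load on the peer.
                peer.bulkLoad(new BulkLoadRequest(req.hfilePath, seen));
            }
        }
    }

    /** Builds a ring of n clusters, bulk loads once on the first, returns the load count per cluster. */
    static int[] simulateRing(int n) {
        Cluster[] ring = new Cluster[n];
        for (int i = 0; i < n; i++) {
            ring[i] = new Cluster("cluster-" + i);
        }
        for (int i = 0; i < n; i++) {
            ring[i].peer = ring[(i + 1) % n];
        }
        // The client request starts with an empty id list.
        ring[0].bulkLoad(new BulkLoadRequest("/hypothetical/hfile", new ArrayList<String>()));
        int[] loads = new int[n];
        for (int i = 0; i < n; i++) {
            loads[i] = ring[i].loads;
        }
        return loads;
    }

    public static void main(String[] args) {
        int[] loads = simulateRing(3);
        for (int i = 0; i < loads.length; i++) {
            System.out.println("cluster-" + i + " loads=" + loads[i]);
        }
    }
}
```

In this toy model, simulateRing(2) corresponds to the master-master setup from 
the issue description and simulateRing(3) to the 3-cluster circle mentioned 
above; in both, the request stops once it comes back to a cluster whose id is 
already in the list.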
{quote}There's a scenario we should consider about, when MOB feature enabled, 
after compaction PartitionedMobCompactor#bulkloadRefFile will be called
{quote}
Let me review that scenario as well.

> break circle replication when doing bulkload
> --------------------------------------------
>
>                 Key: HBASE-22380
>                 URL: https://issues.apache.org/jira/browse/HBASE-22380
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 3.0.0, 1.5.0, 2.2.0, 1.4.10, 2.0.5, 2.3.0, 2.1.5, 1.3.5
>            Reporter: chenxu
>            Assignee: Wellington Chevreuil
>            Priority: Critical
>              Labels: bulkload
>             Fix For: 3.0.0, 1.5.0, 2.3.0, 1.4.11, 2.1.7, 2.2.2
>
>
> When master-master bulkload replication is enabled, HFiles are replicated 
> circularly between the two clusters



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
