[
https://issues.apache.org/jira/browse/HBASE-22380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916507#comment-16916507
]
Wellington Chevreuil commented on HBASE-22380:
----------------------------------------------
[~anoop.hbase]
{quote}I mean why the user/client need to pass clusterId? When the bulk load
req comes to the 1st cluster, that RS can add its clusterId in the WALEdit cell.
{quote}
We are already doing that, in a way. The user/client that first triggers the
bulk load (by calling the LoadIncrementalHFiles class) does not set the cluster
ID. It is set only on the server side, when the bulk load reaches the first
cluster, as you mentioned above. But we can't set it on the WAL edit, because
bulk load replication does not send WAL edits to peer clusters. When a bulk
load happens, it generates a single special WAL edit, on the original cluster
only. The replication source reads that edit, then triggers a bulk load on the
peer cluster directly (it never ships any WAL entry for the original bulk load
event from the source). That's why we had to put the cluster ID in the bulk
load request.
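To make the flow above concrete, here is a minimal sketch in plain Java. It is not HBase code: `Cluster`, `bulkLoad`, `trackOrigin` and `hopsLeft` are hypothetical names invented for illustration, and the hop cap exists only so that the untracked case terminates. It assumes two clusters that replicate to each other, with the origin cluster ID stamped on the server side and carried in the bulk load request.

```java
import java.util.ArrayList;
import java.util.List;

final class BulkLoadOriginSketch {
    /** Hypothetical stand-in for a cluster; not an HBase class. */
    static final class Cluster {
        final String id;
        final List<Cluster> peers = new ArrayList<>();
        int loads = 0;

        Cluster(String id) { this.id = id; }

        /** originId == null models a client call, which carries no cluster ID. */
        void bulkLoad(String originId, boolean trackOrigin, int hopsLeft) {
            loads++;
            // The first cluster to receive the request stamps its own ID.
            String origin = (originId == null) ? id : originId;
            for (Cluster peer : peers) {
                // Break the circle: never send the load back to its origin.
                if (trackOrigin && peer.id.equals(origin)) continue;
                // Safety cap so the untracked case stops in this toy model.
                if (hopsLeft == 0) continue;
                // Replication triggers a bulk load on the peer directly.
                peer.bulkLoad(origin, trackOrigin, hopsLeft - 1);
            }
        }
    }

    /** Returns {loads on A, loads on B} for a master-master pair. */
    static int[] run(boolean trackOrigin) {
        Cluster a = new Cluster("A");
        Cluster b = new Cluster("B");
        a.peers.add(b);
        b.peers.add(a);
        a.bulkLoad(null, trackOrigin, 5);
        return new int[] { a.loads, b.loads };
    }

    public static void main(String[] args) {
        int[] tracked = run(true);
        int[] untracked = run(false);
        System.out.println("with origin id:    A=" + tracked[0] + " B=" + tracked[1]);
        System.out.println("without origin id: A=" + untracked[0] + " B=" + untracked[1]);
    }
}
```

In this toy model each cluster loads the files once when the origin ID is carried in the request, and the loads bounce back and forth until the hop cap is reached when it is not.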
{quote}Only passing the 1st clusterId through all the clusters might not be
enough. We need to add current clusterId into the existing list while passing
to next. And same way check to see whether this cluster already saw/handled
this replication or not.
{quote}
I can't see the need for that yet, but let me try changing the current test to
include a third cluster in the circle. Or do you mean this is to avoid
replicating again if the same HFiles are bulk loaded a second time on the
original cluster?
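As a way to reason about the two options before changing the test, here is a toy model in plain Java. It is not HBase code, and it says nothing about how the real replication source or sink behaves: `Request`, `countLoads`, `appendEachHop` and `maxLoads` are hypothetical names, and the topologies are assumptions chosen for illustration. It compares carrying only the first cluster ID with appending each cluster's ID to a list on every hop.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BulkLoadClusterIdListSketch {
    /** Hypothetical bulk load request: target cluster plus the IDs carried so far. */
    static final class Request {
        final String target;
        final List<String> clusterIds;
        Request(String target, List<String> clusterIds) {
            this.target = target;
            this.clusterIds = clusterIds;
        }
    }

    /** Counts bulk loads triggered by one client load on 'first' (capped at maxLoads). */
    static int countLoads(Map<String, List<String>> peers, String first,
                          boolean appendEachHop, int maxLoads) {
        Deque<Request> queue = new ArrayDeque<>();
        queue.add(new Request(first, new ArrayList<String>()));
        int loads = 0;
        while (!queue.isEmpty() && loads < maxLoads) {
            Request req = queue.poll();
            loads++;
            List<String> ids = new ArrayList<>(req.clusterIds);
            // The first cluster always stamps its ID; later clusters only
            // append theirs in the list variant.
            if (ids.isEmpty() || appendEachHop) ids.add(req.target);
            for (String peer : peers.getOrDefault(req.target, Collections.<String>emptyList())) {
                // Skip any peer whose ID is already carried by the request.
                if (!ids.contains(peer)) queue.add(new Request(peer, ids));
            }
        }
        return loads;
    }

    /** Three clusters in a circle: A -> B -> C -> A. */
    static Map<String, List<String>> ring() {
        Map<String, List<String>> peers = new HashMap<>();
        peers.put("A", Arrays.asList("B"));
        peers.put("B", Arrays.asList("C"));
        peers.put("C", Arrays.asList("A"));
        return peers;
    }

    /** A feeds B one way, while B and C replicate to each other. */
    static Map<String, List<String>> feederIntoPair() {
        Map<String, List<String>> peers = new HashMap<>();
        peers.put("A", Arrays.asList("B"));
        peers.put("B", Arrays.asList("C"));
        peers.put("C", Arrays.asList("B"));
        return peers;
    }

    public static void main(String[] args) {
        System.out.println("ring, first id only:   " + countLoads(ring(), "A", false, 20));
        System.out.println("ring, id list:         " + countLoads(ring(), "A", true, 20));
        System.out.println("feeder, first id only: " + countLoads(feederIntoPair(), "A", false, 20));
        System.out.println("feeder, id list:       " + countLoads(feederIntoPair(), "A", true, 20));
    }
}
```

In this model a simple three-cluster circle stops after three loads with either option. The case where the first cluster is not part of the circle (A feeding a B/C pair) only stops when each cluster appends its own ID; with the first ID alone it runs until the cap.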
{quote}There's a scenario we should consider about, when MOB feature enabled,
after compaction PartitionedMobCompactor#bulkloadRefFile will be called
{quote}
Let me review that scenario as well.
> break circle replication when doing bulkload
> --------------------------------------------
>
> Key: HBASE-22380
> URL: https://issues.apache.org/jira/browse/HBASE-22380
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Affects Versions: 3.0.0, 1.5.0, 2.2.0, 1.4.10, 2.0.5, 2.3.0, 2.1.5, 1.3.5
> Reporter: chenxu
> Assignee: Wellington Chevreuil
> Priority: Critical
> Labels: bulkload
> Fix For: 3.0.0, 1.5.0, 2.3.0, 1.4.11, 2.1.7, 2.2.2
>
>
> when enabled master-master bulkload replication, HFiles will be replicated
> circularly between two clusters