[
https://issues.apache.org/jira/browse/HBASE-23270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17076037#comment-17076037
]
Diptam Ghosh commented on HBASE-23270:
--------------------------------------
For this issue, in my view, "*HbaseInterClusterReplication*" is handling too
many responsibilities. Separating them out will give us more flexibility
before we proceed to the implementation of RSGroup-aware replication.
# We can extract "batch creation", "sink management" and "triggering
replication of one batch" out of *this* class, leaving it with "init config"
and "manage shipping logic with failure handling". The extracted tasks can be
abstracted out with the help of a Processor class. Sink selection can likewise
be abstracted out with the help of a strategy implementation.
# With this design, we can implement custom sink selectors and batching
logic, and configure them from outside the class as well.
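
To make the proposed split concrete, here is a minimal sketch, not the actual
HBase code. All names (SinkSelector, AnySinkSelector, GroupAwareSinkSelector,
BatchProcessor) are hypothetical, sinks and entries are plain strings, and the
server-to-RSGroup map is assumed to be supplied by the caller. Round-robin over
the eligible sinks is only a placeholder policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SinkSelectionSketch {

    /** Hypothetical strategy interface: decides which sinks are eligible. */
    interface SinkSelector {
        List<String> select(List<String> allSinks, String targetGroup);
    }

    /** Group-unaware behaviour: every sink in the peer cluster is eligible. */
    static class AnySinkSelector implements SinkSelector {
        @Override
        public List<String> select(List<String> allSinks, String targetGroup) {
            return new ArrayList<>(allSinks);
        }
    }

    /** RSGroup-aware behaviour: only sinks that belong to the target group. */
    static class GroupAwareSinkSelector implements SinkSelector {
        private final Map<String, String> groupOfServer; // assumed input: server -> RSGroup

        GroupAwareSinkSelector(Map<String, String> groupOfServer) {
            this.groupOfServer = groupOfServer;
        }

        @Override
        public List<String> select(List<String> allSinks, String targetGroup) {
            return allSinks.stream()
                    .filter(sink -> targetGroup.equals(groupOfServer.get(sink)))
                    .collect(Collectors.toList());
        }
    }

    /** Hypothetical processor: owns batch creation and per-batch sink choice. */
    static class BatchProcessor {
        private final SinkSelector selector;
        private final int batchSize;

        BatchProcessor(SinkSelector selector, int batchSize) {
            if (batchSize <= 0) {
                throw new IllegalArgumentException("batchSize must be positive");
            }
            this.selector = selector;
            this.batchSize = batchSize;
        }

        /** Splits entries into consecutive batches of at most batchSize. */
        List<List<String>> createBatches(List<String> entries) {
            List<List<String>> batches = new ArrayList<>();
            for (int i = 0; i < entries.size(); i += batchSize) {
                int end = Math.min(i + batchSize, entries.size());
                batches.add(new ArrayList<>(entries.subList(i, end)));
            }
            return batches;
        }

        /** Chooses a sink for a batch, round-robin over the eligible sinks. */
        String sinkFor(int batchIndex, List<String> allSinks, String targetGroup) {
            List<String> eligible = selector.select(allSinks, targetGroup);
            if (eligible.isEmpty()) {
                throw new IllegalStateException("no eligible sink for group " + targetGroup);
            }
            return eligible.get(batchIndex % eligible.size());
        }
    }

    public static void main(String[] args) {
        Map<String, String> groups = new HashMap<>();
        groups.put("rs1", "tenantA");
        groups.put("rs2", "tenantB");
        groups.put("rs3", "tenantA");
        List<String> sinks = Arrays.asList("rs1", "rs2", "rs3");

        BatchProcessor processor = new BatchProcessor(new GroupAwareSinkSelector(groups), 2);
        List<List<String>> batches = processor.createBatches(Arrays.asList("e1", "e2", "e3"));
        for (int i = 0; i < batches.size(); i++) {
            System.out.println(batches.get(i) + " -> " + processor.sinkFor(i, sinks, "tenantA"));
        }
    }
}
```

Because the selector is injected into the processor, switching between
group-unaware and RSGroup-aware behaviour would be a configuration choice
rather than a change to the shipping logic.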
> Inter-cluster replication is unaware destination peer cluster's RSGroup to
> push the WALEdits
> --------------------------------------------------------------------------------------------
>
> Key: HBASE-23270
> URL: https://issues.apache.org/jira/browse/HBASE-23270
> Project: HBase
> Issue Type: Bug
> Reporter: Pradeep
> Assignee: Pradeep
> Priority: Major
>
> In an RSGroup-enabled source HBase cluster that replicates to an
> RSGroup-enabled destination cluster, the replication stream of
> List<WALEdit.Entry> goes to any node in the destination cluster without
> RSGroup awareness and is then routed to the node where the region is
> hosted. The node that receives and routes the data in this extra hop can be
> any node in the cluster; nothing restricts the selection to a node
> within the same RSGroup.
> Implications: an RSGroup owner in a multi-tenant HBase cluster can see
> performance and throughput deviations because of the unpredictability caused
> by replication.
> Potential fix options:
> a) Select a destination node with RSGroup awareness.
> b) Group the WAL.Edit list by region and then by the region servers to
> which the regions are assigned in the destination. Pass each list of WALEdit
> directly to its region server to avoid the extra intermediate hop in the
> destination cluster during the replication process.