[ https://issues.apache.org/jira/browse/HBASE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14714485#comment-14714485 ]
Yu Li commented on HBASE-6617:
------------------------------
Hi [~zjushch],
Thanks for the review.
I've considered your point carefully, but I still think one replication source
per WAL group is the better approach, for the reasons below:
1. Regarding the semantics of ReplicationSource, I believe the relationship
between source and peer is many-to-one rather than one-to-one. One replication
source stands for one kind of source, and no matter how many kinds of source
there are, we need to replicate them all to the specified peer. Before multi-WAL
we simply had the special case of only one kind of source. Consider the
heterogeneous storage implementation in HDFS: after support for different kinds
of disks was added, the block report granularity changed from node level to
disk level. I think multiple WALs are quite similar to that.
2. From a business point of view, one WAL group may stand for one business. In
our scenario we created a namespace-based grouping strategy, so that regions of
the same business write into the same log group. In this case, one source per
group lets us know the replication latency of each business at the
regionserver or cluster level.
3. Regarding deletion of ReplicationSource instances, you can find the logic in
ReplicationSourceManager#removePeer, where the source is terminated first
and then removed from the source list.
4. Regarding source metrics, we will use "peerId@groupId" as the id, so a
reported metric name would look like
"source.<peerId@groupId>.ageOfLastShippedOp"; you can find the whole logic in
the constructor of MetricsSource. If you'd still prefer a metrics collection
that tracks something like "per-regionserver latency to one peer", for example
when using a strategy like randomly bounded region group, we could add a
"MetricsReplicationPeerSourceSource" similar to
MetricsReplicationGlobalSourceSource.
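To make points 3 and 4 a bit more concrete, here is a minimal Java sketch of
the idea. It is not the code in the patch: the class and method names
(GroupSourceSketch, SourceManagerSketch, addSource, sourceCount and so on) are
hypothetical stand-ins, and the real logic lives in
ReplicationSourceManager#removePeer and the MetricsSource constructor. It only
illustrates one source per (peer, WAL group) pair, the "peerId@groupId" id, the
resulting metric name, and the terminate-then-remove order.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a replication source bound to one (peer, WAL group) pair. */
final class GroupSourceSketch {
    final String peerId;
    final String walGroupId;
    private boolean terminated = false;

    GroupSourceSketch(String peerId, String walGroupId) {
        this.peerId = peerId;
        this.walGroupId = walGroupId;
    }

    /** Source id in the "peerId@groupId" form described in point 4. */
    String sourceId() {
        return peerId + "@" + walGroupId;
    }

    /** Metric name in the "source.<peerId@groupId>.ageOfLastShippedOp" form. */
    String ageOfLastShippedOpMetric() {
        return "source." + sourceId() + ".ageOfLastShippedOp";
    }

    void terminate() {
        terminated = true;
    }

    boolean isTerminated() {
        return terminated;
    }
}

/** Hypothetical manager holding many sources per peer, one per WAL group. */
final class SourceManagerSketch {
    private final List<GroupSourceSketch> sources = new ArrayList<>();

    GroupSourceSketch addSource(String peerId, String walGroupId) {
        GroupSourceSketch source = new GroupSourceSketch(peerId, walGroupId);
        sources.add(source);
        return source;
    }

    /** Terminates each source of the peer first, then removes it from the source list. */
    List<GroupSourceSketch> removePeer(String peerId) {
        List<GroupSourceSketch> removed = new ArrayList<>();
        for (Iterator<GroupSourceSketch> it = sources.iterator(); it.hasNext();) {
            GroupSourceSketch source = it.next();
            if (source.peerId.equals(peerId)) {
                source.terminate(); // step 1: stop the source
                it.remove();        // step 2: drop it from the list
                removed.add(source);
            }
        }
        return removed;
    }

    int sourceCount() {
        return sources.size();
    }
}
```

With a namespace-based grouping, each business would then get its own source
(and its own ageOfLastShippedOp metric) per peer, and removing a peer cleans up
all of that peer's per-group sources in one pass.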
Feel free to let me know your thoughts.
> ReplicationSourceManager should be able to track multiple WAL paths
> -------------------------------------------------------------------
>
> Key: HBASE-6617
> URL: https://issues.apache.org/jira/browse/HBASE-6617
> Project: HBase
> Issue Type: Improvement
> Components: Replication
> Reporter: Ted Yu
> Assignee: Yu Li
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-6617.patch, HBASE-6617_v2.patch,
> HBASE-6617_v3.patch
>
>
> Currently ReplicationSourceManager uses logRolled() to receive notification
> about a new HLog and remembers it in latestPath.
> When the region server has multiple WAL support, we need to keep track of
> multiple Paths in ReplicationSourceManager.