[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

Ian Varley (JIRA) Tue, 29 Jan 2013 13:45:14 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565847#comment-13565847
 ]


Ian Varley commented on HBASE-7709:
-----------------------------------

Re: (B -> C -> B) -> A, that's fine; no participants are detecting a cycle 
they're not part of (A isn't adding any peers; it's the slave). B detects a 
cycle it's part of (B -> C -> B), and C does as well.

The API would be simple, and would let the caller walk the graph of clusters: 
ask the peer you're trying to add for all of its peers, then ask each of them 
in turn, and build up a graph structure that you can interrogate. The only 
call needed is "Tell me your current peers".
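
A minimal sketch of that walk, in Python for brevity. Everything named here 
is an assumption for illustration: get_peers is a hypothetical stand-in for 
the "Tell me your current peers" call, and clusters are plain strings. It is 
not an existing HBase API.

```python
from collections import deque


def walk_peer_graph(start, get_peers):
    """Breadth-first walk that asks every reachable cluster for its peers.

    get_peers is a hypothetical callable standing in for the single
    "Tell me your current peers" call; it returns an iterable of cluster names.
    """
    graph = {}
    queue = deque([start])
    while queue:
        cluster = queue.popleft()
        if cluster in graph:
            continue  # already asked this cluster
        peers = list(get_peers(cluster))
        graph[cluster] = peers
        queue.extend(p for p in peers if p not in graph)
    return graph


def would_create_cycle(local, new_peer, get_peers):
    """True if adding local -> new_peer would close a cycle through local."""
    return local in walk_peer_graph(new_peer, get_peers)


# Topology from the example: B -> C and B -> A exist; C is about to add B.
peers = {"A": [], "B": ["C", "A"], "C": []}
print(would_create_cycle("C", "B", peers.__getitem__))  # True: B -> C -> B
```

The walk only reports cycles that pass through the cluster doing the adding, 
which matches the point above: a cluster that is purely a slave never runs 
the check at all.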

I suppose this could cause problems if not all clusters can communicate: say, 
if B is visible to A and C is visible to B, but C is not visible to A. I 
guess there might also be race conditions if you try to add peers on multiple 
clusters simultaneously; there's not really a way to avoid that.
                
> Infinite loop possible in Master/Master replication
> ---------------------------------------------------
>
>                 Key: HBASE-7709
>                 URL: https://issues.apache.org/jira/browse/HBASE-7709
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>            Reporter: Lars Hofhansl
>             Fix For: 0.96.0, 0.94.6
>
>
> We just discovered the following scenario:
> # Cluster A and B are set up in master/master replication.
> # By accident we had Cluster C replicate to Cluster A.
> Now all edits originating from C will bounce between A and B. Forever!
> The reason is that when an edit comes in from C, the cluster ID is already set 
> and won't be reset.
> We have a couple of options here:
> # Optionally only support master/master (not cycles of more than two 
> clusters). In that case we can always reset the cluster ID in the 
> ReplicationSource. That means that now cycles > 2 will have the data cycle 
> forever. This is the only option that requires no changes in the HLog format.
> # Instead of a single cluster ID per edit, maintain an (unordered) set of 
> cluster IDs that have seen this edit. Then in ReplicationSource we drop any 
> edit that the sink has already seen. This is the cleanest approach, but it 
> might need a lot of data stored per edit if there are many clusters involved.
> # Maintain a configurable counter of the maximum cycle size we want to 
> support. Could default to 10 (even maybe even just). Store a hop count in the 
> WAL and have the ReplicationSource increase that hop count on each hop. If 
> we're over the max, just drop the edit.
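
To compare the second and third options from the description, here is a toy 
simulation of the reported topology (A <-> B, with C -> A added by accident). 
It is only a sketch under assumptions: the PEERS map, the function names, and 
the in-memory queue are invented for illustration, and the hop-count variant 
deliberately models nothing but the counter, since the existing cluster ID 
check is the part that fails in this scenario.

```python
from collections import deque

# Reported topology: A and B replicate to each other; C replicates to A.
PEERS = {"A": ["B"], "B": ["A"], "C": ["A"]}


def replicate_with_seen_set(origin, peers):
    """Option 2: each edit carries the set of cluster IDs that have seen it."""
    deliveries = []
    queue = deque([(origin, frozenset([origin]))])
    while queue:
        source, seen = queue.popleft()
        for sink in peers[source]:
            if sink in seen:
                continue  # the sink already has this edit, so drop it
            deliveries.append((source, sink))
            queue.append((sink, seen | {sink}))
    return deliveries


def replicate_with_hop_count(origin, peers, max_hops=10):
    """Option 3: the source bumps a hop count and drops edits over the max."""
    deliveries = []
    queue = deque([(origin, 0)])
    while queue:
        source, hops = queue.popleft()
        hops += 1  # increased on each hop by the shipping cluster
        if hops > max_hops:
            continue  # over the configured maximum, so drop the edit
        for sink in peers[source]:
            deliveries.append((source, sink))
            queue.append((sink, hops))
    return deliveries


print(replicate_with_seen_set("C", PEERS))           # [('C', 'A'), ('A', 'B')]
print(len(replicate_with_hop_count("C", PEERS)))     # 10
```

In this toy model the set-based variant stops as soon as every cluster has 
the edit, while the hop-count variant keeps bouncing the edit between A and B 
until the limit is reached. That is the trade-off described above: more data 
per edit versus bounded but redundant shipping.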
