[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-28 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752757#comment-13752757
 ] 

Jean-Daniel Cryans commented on HBASE-7709:
---

+1

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 0.95-trunk-rev2.patch, 0.95-trunk-rev3.patch, 0.95-trunk-rev4.patch, 
 HBASE-7709.patch, HBASE-7709-rev1.patch, HBASE-7709-rev2.patch, 
 HBASE-7709-rev3.patch, HBASE-7709-rev4.patch, HBASE-7709-rev5.patch


  We just discovered the following scenario:
 # Cluster A and B are set up in master/master replication
 # By accident we had Cluster C replicate to Cluster A.
 Now all edits originating from C will be bouncing between A and B. Forever!
 The reason is that when the edits come in from C the cluster ID is already set 
 and won't be reset.
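 The bouncing can be reproduced with a toy model. This is only a sketch, not HBase code: the class name, the `PEER` map, and `countHops` are hypothetical, and it assumes each cluster has a single peer and that the only loop check is "do not ship an edit back to the cluster ID stored in it".

```java
import java.util.Map;

public class SingleIdLoopDemo {
    // Hypothetical topology from the scenario: A <-> B, plus the accidental C -> A.
    static final Map<String, String> PEER = Map.of("A", "B", "B", "A", "C", "A");

    /**
     * Counts how many times one edit is shipped when it carries only the
     * cluster ID of its origin. Gives up after maxHops so the demo terminates.
     */
    static int countHops(String origin, int maxHops) {
        String current = origin;
        int hops = 0;
        while (hops < maxHops) {
            String sink = PEER.get(current);
            // The only available check: never ship an edit back to its origin.
            if (sink.equals(origin)) {
                break;
            }
            current = sink;
            hops++;
        }
        return hops;
    }

    public static void main(String[] args) {
        // An edit from A goes to B once and stops there.
        System.out.println("edit from A: " + countHops("A", 1000) + " hops");
        // An edit from C never matches A or B, so it only stops at the cap.
        System.out.println("edit from C: " + countHops("C", 1000) + " hops");
    }
}
```

 In this model an edit from A stops after one hop, while an edit from C keeps moving until the artificial cap is reached, because neither A nor B ever equals the stored origin ID.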
 We have a couple of options here:
 # Optionally only support master/master (not cycles of more than two 
 clusters). In that case we can always reset the cluster ID in the 
 ReplicationSource. That means that now cycles > 2 will have the data cycle 
 forever. This is the only option that requires no changes in the HLog format.
 # Instead of a single cluster ID per edit, maintain an (unordered) set of 
 cluster IDs that have seen this edit. Then in ReplicationSource we drop any 
 edit that the sink has seen already. This is the cleanest approach, but it 
 might need a lot of data stored per edit if there are many clusters involved.
 # Maintain a configurable counter of the maximum cycle size we want to 
 support. Could default to 10 (even maybe even just). Store a hop-count in the 
 WAL and the ReplicationSource increases that hop-count on each hop. If we're 
 over the max, just drop the edit.
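
 Options 2 and 3 above can be compared in a small model. This is a sketch under stated assumptions, not the HBase implementation: the class, the `PEER` map, and both method names are hypothetical, each cluster has a single peer, and the hop-limit variant deliberately ignores the origin check so the cap is the only thing that stops the edit.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class LoopGuardSketch {
    // Hypothetical topology from the scenario: A <-> B, plus the accidental C -> A.
    static final Map<String, String> PEER = Map.of("A", "B", "B", "A", "C", "A");

    /** Option 2: the edit carries the ID of every cluster that has seen it. */
    static int hopsWithSeenSet(String origin) {
        Set<String> seen = new HashSet<>();
        seen.add(origin);
        String current = origin;
        int hops = 0;
        // Ship only while the sink is not yet in the set; otherwise drop the edit.
        while (!seen.contains(PEER.get(current))) {
            current = PEER.get(current);
            seen.add(current);
            hops++;
        }
        return hops;
    }

    /** Option 3: the edit carries a hop count and is dropped once it reaches maxHops. */
    static int hopsWithHopLimit(String origin, int maxHops) {
        String current = origin;
        int hopCount = 0;
        while (hopCount < maxHops) {
            current = PEER.get(current);
            hopCount++;
        }
        return hopCount;
    }

    public static void main(String[] args) {
        System.out.println("seen-set, edit from C: " + hopsWithSeenSet("C") + " hops");
        System.out.println("hop-limit, edit from C: " + hopsWithHopLimit("C", 10) + " hops");
    }
}
```

 With the seen-set, the edit from C reaches A and then B and is dropped before it returns to A, at the cost of storing one ID per cluster visited. With the hop limit, the edit carries only a counter but still bounces until the cap is hit, so the clusters do redundant work up to that limit.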

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752762#comment-13752762
 ] 

stack commented on HBASE-7709:
--

Applied to 0.95 and to trunk.  Want this in 0.94 [~lhofhansl]?  
[~vasu.mariy...@gmail.com] Thanks boss.  Any chance of a release note on this 
issue?



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752818#comment-13752818
 ] 

Lars Hofhansl commented on HBASE-7709:
--

Yeah, will sync up with Vasu offline and probably commit to 0.94 soon.



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-28 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752841#comment-13752841
 ] 

Lars Hofhansl commented on HBASE-7709:
--

Will commit to 0.94 later today if there are no objections.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 0.95-trunk-rev2.patch, 0.95-trunk-rev3.patch, 0.95-trunk-rev4.patch, 
 7709-0.94-rev6.txt, HBASE-7709.patch, HBASE-7709-rev1.patch, 
 HBASE-7709-rev2.patch, HBASE-7709-rev3.patch, HBASE-7709-rev4.patch, 
 HBASE-7709-rev5.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752858#comment-13752858
 ] 

Hadoop QA commented on HBASE-7709:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12600457/7709-0.94-rev6.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6953//console

This message is automatically generated.



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752911#comment-13752911
 ] 

Hudson commented on HBASE-7709:
---

SUCCESS: Integrated in hbase-0.95 #500 (See 
[https://builds.apache.org/job/hbase-0.95/500/])
HBASE-7709 Infinite loop possible in Master/Master replication (stack: rev 1518334)
* /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
* /hbase/branches/0.95/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java
* /hbase/branches/0.95/hbase-protocol/src/main/protobuf/WAL.proto
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/ReplicationProtbufUtil.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/BaseRowProcessor.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RowProcessor.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotLogSplitter.java
* /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogPerformanceEvaluation.java
* /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752920#comment-13752920
 ] 

Hudson commented on HBASE-7709:
---

SUCCESS: Integrated in HBase-TRUNK #4441 (See 
[https://builds.apache.org/job/HBase-TRUNK/4441/])
HBASE-7709 Infinite loop possible in Master/Master replication (stack: rev 1518335)
* /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
* /hbase/trunk/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java
* /hbase/trunk/hbase-protocol/src/main/protobuf/WAL.proto
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/ReplicationProtbufUtil.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/BaseRowProcessor.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RowProcessor.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotLogSplitter.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogPerformanceEvaluation.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753119#comment-13753119
 ] 

Hudson commented on HBASE-7709:
---

SUCCESS: Integrated in hbase-0.95-on-hadoop2 #276 (See 
[https://builds.apache.org/job/hbase-0.95-on-hadoop2/276/])
HBASE-7709 Infinite loop possible in Master/Master replication (stack: rev 1518334)
* /hbase/branches/0.95/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
* /hbase/branches/0.95/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java
* /hbase/branches/0.95/hbase-protocol/src/main/protobuf/WAL.proto
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/ReplicationProtbufUtil.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/BaseRowProcessor.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RowProcessor.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* /hbase/branches/0.95/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotLogSplitter.java
* /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogPerformanceEvaluation.java
* /hbase/branches/0.95/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753140#comment-13753140
 ] 

Hudson commented on HBASE-7709:
---

SUCCESS: Integrated in HBase-0.94-security #274 (See 
[https://builds.apache.org/job/HBase-0.94-security/274/])
HBASE-7709 Infinite loop possible in Master/Master replication (Vasu Mariyala) (larsh: rev 1518410)
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753164#comment-13753164
 ] 

Hudson commented on HBASE-7709:
---

SUCCESS: Integrated in HBase-0.94 #1128 (See 
[https://builds.apache.org/job/HBase-0.94/1128/])
HBASE-7709 Infinite loop possible in Master/Master replication (Vasu Mariyala) (larsh: rev 1518410)
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALEdit.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java


 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 0.95-trunk-rev2.patch, 0.95-trunk-rev3.patch, 0.95-trunk-rev4.patch, 
 7709-0.94-rev6.txt, HBASE-7709.patch, HBASE-7709-rev1.patch, 
 HBASE-7709-rev2.patch, HBASE-7709-rev3.patch, HBASE-7709-rev4.patch, 
 HBASE-7709-rev5.patch


  We just discovered the following scenario:
 # Clusters A and B are set up in master/master replication.
 # By accident we had Cluster C replicate to Cluster A.
 Now all edits originating from C will be bouncing between A and B. Forever!
 The reason is that when an edit comes in from C the cluster ID is already set 
 and won't be reset.
 We have a couple of options here:
 # Optionally only support master/master (not cycles of more than two 
 clusters). In that case we can always reset the cluster ID in the 
 ReplicationSource. That means that now cycles > 2 will have the data cycle 
 forever. This is the only option that requires no changes in the HLog format.
 # Instead of a single cluster ID per edit, maintain an (unordered) set of 
 cluster IDs that have seen this edit. Then in ReplicationSource we drop any 
 edit that the sink has seen already. This is the cleanest approach, but it 
 might need a lot of data stored per edit if there are many clusters involved.
 # Maintain a configurable counter of the maximum cycle size we want to 
 support. Could default to 10 (even maybe even just). Store a hop-count in the 
 WAL and the ReplicationSource increases that hop-count on each hop. If we're 
 over the max, just drop the edit.
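
A minimal Java sketch of options 2 and 3 above, replaying the A/B/C scenario. The names ReplicationLoopSketch, shipWithClusterIds and shipWithHopCount are hypothetical and are not HBase's ReplicationSource API. It also assumes that each source adds its own cluster ID to the edit before deciding whether to ship it, which the description does not spell out.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical illustration only; not the code from any attached patch.
final class ReplicationLoopSketch {

    // Option 2: the edit carries the set of cluster IDs that have seen it.
    // The source records itself, then ships only if the sink is not in the set.
    static boolean shipWithClusterIds(Set<UUID> seenClusterIds, UUID sourceId, UUID sinkId) {
        seenClusterIds.add(sourceId);
        return !seenClusterIds.contains(sinkId);
    }

    // Option 3: the source increments a hop count and drops the edit
    // once the incremented count is over the configured maximum.
    static boolean shipWithHopCount(int hopsSoFar, int maxHops) {
        return hopsSoFar + 1 <= maxHops;
    }

    public static void main(String[] args) {
        UUID a = UUID.randomUUID();
        UUID b = UUID.randomUUID();
        UUID c = UUID.randomUUID();

        // One edit originating on C, with A and B in master/master replication.
        Set<UUID> seen = new HashSet<>();
        System.out.println("C->A shipped: " + shipWithClusterIds(seen, c, a)); // true
        System.out.println("A->B shipped: " + shipWithClusterIds(seen, a, b)); // true
        System.out.println("B->A shipped: " + shipWithClusterIds(seen, b, a)); // false, A already saw it
    }
}
```

With the set, the edit stops as soon as it would return to a cluster that has already seen it, at the cost of one ID per cluster stored with the edit. The hop count stores a single integer but only bounds the number of bounces.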

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13753204#comment-13753204
 ] 

Hudson commented on HBASE-7709:
---

FAILURE: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #700 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/700/])
HBASE-7709 Infinite loop possible in Master/Master replication (stack: rev 1518335)
* /hbase/trunk/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Mutation.java
* /hbase/trunk/hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java
* /hbase/trunk/hbase-protocol/src/main/protobuf/WAL.proto
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Import.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/ReplicationProtbufUtil.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/BaseRowProcessor.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RowProcessor.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/SnapshotLogSplitter.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/HLogPerformanceEvaluation.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestMasterReplication.java



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-27 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13751447#comment-13751447
 ] 

Ted Yu commented on HBASE-7709:
---

+1


[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750192#comment-13750192
 ] 

Hadoop QA commented on HBASE-7709:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12599751/HBASE-7709-rev5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6896//console

This message is automatically generated.


[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-26 Thread Vasu Mariyala (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750289#comment-13750289
 ] 

Vasu Mariyala commented on HBASE-7709:
--

The patch HBASE-7709-rev5.patch is based on 0.94, so Hadoop QA will always 
fail when it tries to apply this patch to trunk. Can anyone please run the 
Hadoop QA build for 0.95-trunk-rev4.patch (the patch for trunk and 0.95)?


[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-26 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750297#comment-13750297
 ] 

Ted Yu commented on HBASE-7709:
---

Please attach 0.95-trunk-rev4.patch one more time - Hadoop QA picks up the 
latest attachment


[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750559#comment-13750559
 ] 

Hadoop QA commented on HBASE-7709:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/1250/0.95-trunk-rev4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 2 release 
audit warnings (more than the trunk's current 0 warnings).

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6902//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6902//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6902//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6902//console

This message is automatically generated.


[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-26 Thread Vasu Mariyala (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750617#comment-13750617
 ] 

Vasu Mariyala commented on HBASE-7709:
--

The release audit warnings are not related to the patch. They are caused by 
missing license headers in the files listed below. After correcting the 
license info in these files, the release audit succeeds.

{code}
***

Unapproved licenses:

  
/home/vmariyala/bigdata-dev/testhbase/hbase-server/src/main/resources/hbase-webapps/static/css/bootstrap-theme.min.css
  
/home/vmariyala/bigdata-dev/testhbase/hbase-server/src/main/resources/hbase-webapps/static/css/bootstrap-theme.css

***
{code}


[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-26 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13750727#comment-13750727
 ] 

Jeffrey Zhong commented on HBASE-7709:
--

I reviewed the 0.94 and trunk patches. They both look good to me! +1 from me. 
Thanks. 
In trunk, we currently carry all clusterIds in the replication path; we 
could optimize this later if there is a need.


[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13749550#comment-13749550
 ] 

Hadoop QA commented on HBASE-7709:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12599751/HBASE-7709-rev5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6866//console

This message is automatically generated.


[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13749520#comment-13749520
 ] 

Hadoop QA commented on HBASE-7709:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12599751/HBASE-7709-rev5.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6865//console

This message is automatically generated.


[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13748685#comment-13748685
 ] 

Ted Yu commented on HBASE-7709:
---

{code}
+  public void setClusters(Set<UUID> clusterIds) {
{code}
Would setClusterIds() be a better name for the above method?
{code}
+   * @return the set of clusters that have consumed the mutation
{code}
'set of clusters' -> 'set of cluster Ids'
{code}
+  public Set<UUID> getClusters() {
{code}
getClusters -> getClusterIds
{code}
-private UUID clusterId;
+private Set<UUID> clusters;
{code}
clusters -> clusterIds

If you agree with the above comments, please modify the names in other places 
as well.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 0.95-trunk-rev2.patch, HBASE-7709.patch, HBASE-7709-rev1.patch, 
 HBASE-7709-rev2.patch, HBASE-7709-rev3.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-23 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13748982#comment-13748982
 ] 

Jeffrey Zhong commented on HBASE-7709:
--

I reviewed the trunk patch. One thing I noticed is that the trunk patch
deprecates clusterId and related code. I think we should still keep it around.
The reason is that one of the semantics of clusterId is the original cluster
ID, i.e. the cluster where the changes were generated. This information will be
very useful when we build a monitoring dashboard to show how many edits come
from each source cluster. Similarly, we could combine the original cluster ID
and the write time to determine the replication latency from the source to the
current cluster.
The rest looks good. Thanks.
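
To make the monitoring idea concrete, here is a minimal sketch of how an originating cluster ID and a write time could feed such metrics. ReplicationStatsSketch, record and editsFrom are hypothetical names, not HBase APIs, and the latency figure is only meaningful if the clocks of the two clusters are reasonably synchronized.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical metrics holder; not part of HBase. */
final class ReplicationStatsSketch {
  // Number of edits received so far, keyed by originating cluster.
  private final Map<UUID, Long> editsPerSource = new HashMap<>();

  /** Counts one edit for its originating cluster and returns its latency in ms. */
  long record(UUID originClusterId, long writeTimeMs, long appliedTimeMs) {
    editsPerSource.merge(originClusterId, 1L, Long::sum);
    return appliedTimeMs - writeTimeMs;
  }

  /** Returns how many edits were recorded for the given originating cluster. */
  long editsFrom(UUID originClusterId) {
    return editsPerSource.getOrDefault(originClusterId, 0L);
  }
}
```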

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 0.95-trunk-rev2.patch, 0.95-trunk-rev3.patch, HBASE-7709.patch, 
 HBASE-7709-rev1.patch, HBASE-7709-rev2.patch, HBASE-7709-rev3.patch, 
 HBASE-7709-rev4.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-23 Thread Vasu Mariyala (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13749081#comment-13749081
 ] 

Vasu Mariyala commented on HBASE-7709:
--

[~jeffreyz] This cluster information is only stored as part of the HLog, and
the HLog gets rolled. So do you think the HLog is the right place to read the
originating cluster from when building such metrics?

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 0.95-trunk-rev2.patch, 0.95-trunk-rev3.patch, HBASE-7709.patch, 
 HBASE-7709-rev1.patch, HBASE-7709-rev2.patch, HBASE-7709-rev3.patch, 
 HBASE-7709-rev4.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-23 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13749166#comment-13749166
 ] 

Lars Hofhansl commented on HBASE-7709:
--

This does raise a good point. Maybe we should store the cluster ids in order of 
traversal. That would later allow us to reconstruct the replication path 
between clusters and display it in the shell.
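
One way to keep the IDs in order of traversal while still rejecting duplicates is an insertion-ordered set. The sketch below uses java.util.LinkedHashSet for that; the class ReplicationPathSketch and its methods are hypothetical and only illustrate the data structure, not how the patch stores the IDs.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;

/** Hypothetical holder for the clusters an edit has passed through. */
final class ReplicationPathSketch {
  // LinkedHashSet iterates in insertion order, so the set doubles as the path.
  private final Set<UUID> clusterIds = new LinkedHashSet<>();

  /** Appends a cluster to the path; returns false if it was already on it. */
  boolean visit(UUID clusterId) {
    return clusterIds.add(clusterId);
  }

  /** The first cluster on the path, i.e. where the edit originated, or null. */
  UUID origin() {
    return clusterIds.isEmpty() ? null : clusterIds.iterator().next();
  }

  /** The replication path in order of traversal. */
  List<UUID> path() {
    return new ArrayList<>(clusterIds);
  }
}
```

Because the first element is the originating cluster, an ordered collection would also preserve the original-cluster information asked for earlier in the thread.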

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 0.95-trunk-rev2.patch, 0.95-trunk-rev3.patch, HBASE-7709.patch, 
 HBASE-7709-rev1.patch, HBASE-7709-rev2.patch, HBASE-7709-rev3.patch, 
 HBASE-7709-rev4.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-23 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13749171#comment-13749171
 ] 

Ted Yu commented on HBASE-7709:
---

bq. we should store the cluster ids in order of traversal
+1

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 0.95-trunk-rev2.patch, 0.95-trunk-rev3.patch, HBASE-7709.patch, 
 HBASE-7709-rev1.patch, HBASE-7709-rev2.patch, HBASE-7709-rev3.patch, 
 HBASE-7709-rev4.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-22 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13748263#comment-13748263
 ] 

Hadoop QA commented on HBASE-7709:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12599256/0.95-trunk-rev2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6850//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6850//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6850//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6850//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6850//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6850//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6850//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6850//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6850//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6850//console

This message is automatically generated.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 0.95-trunk-rev2.patch, HBASE-7709.patch, HBASE-7709-rev1.patch, 
 HBASE-7709-rev2.patch, HBASE-7709-rev3.patch



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746586#comment-13746586
 ] 

Lars Hofhansl commented on HBASE-7709:
--

The 0.94 patch looks good. It is a bit large, but then again this is a bad bug
to have (when it hits you, you'll have useless load on your cluster forever,
throwing your versions off, etc.).
Nice refactoring of the replication test.

Few nits:
* PREFIX_CLUSTER_KEY in WALEdit could just be '_', right? No need to store that 
longer prefix everywhere.
* Similarly maybe make PREFIX_CONSUMED_CLUSTER_IDS in Mutation just _cs.id
* The comment for scopes in WALEdit could be a bit more explicit that we're 
overloading scopes with the cluster id for backwards compatibility.

+1 otherwise (assuming the full 0.94 test suite passes)

Looking at trunk patch now.
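
For readers unfamiliar with the trick of overloading scopes, the following sketch shows the general shape: cluster IDs ride along as extra keys in the scopes map, marked by a prefix, and are separated out again on read. Everything here is an assumption made for illustration (String keys, the class name ScopesOverloadSketch, the helper names, and that no real column family name starts with the prefix); it is not the code from the 0.94 patch.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

/** Hypothetical illustration of storing cluster IDs inside a scopes map. */
final class ScopesOverloadSketch {
  // Assumed prefix; it must never be the start of a real column family name.
  static final String PREFIX_CLUSTER_KEY = "_";

  /** Adds one marker entry per cluster ID to the scopes map. */
  static void addClusterIds(Map<String, Integer> scopes, Set<UUID> clusterIds) {
    for (UUID id : clusterIds) {
      scopes.put(PREFIX_CLUSTER_KEY + id, 1);
    }
  }

  /** Removes the marker entries from the map and returns the decoded IDs. */
  static Set<UUID> extractClusterIds(Map<String, Integer> scopes) {
    Set<UUID> ids = new HashSet<>();
    Iterator<String> it = scopes.keySet().iterator();
    while (it.hasNext()) {
      String key = it.next();
      if (key.startsWith(PREFIX_CLUSTER_KEY)) {
        ids.add(UUID.fromString(key.substring(PREFIX_CLUSTER_KEY.length())));
        it.remove(); // leave only the real family scopes behind
      }
    }
    return ids;
  }
}
```

A shorter prefix saves bytes on every entry, which is the motivation behind the first nit; the cost is that the prefix has to be chosen so it cannot collide with a legal family name.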

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 HBASE-7709.patch, HBASE-7709-rev1.patch, HBASE-7709-rev2.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-21 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746611#comment-13746611
 ] 

Lars Hofhansl commented on HBASE-7709:
--

In trunk:
* Should {{repeated UUID clusters = 8}} be optional in WAL.proto? Otherwise we
can't read old log entries. But maybe that's not a problem...?
* In Import:
{code}
+clusters = new HashSet<UUID>();
+clusters.add(ZKClusterId.getUUIDForCluster(zkw));
{code}
can be written as {{clusters =
Collections.singleton(ZKClusterId.getUUIDForCluster(zkw))}}
* Is this right?
{code}
+  for(UUID clusterId : key.getClusters()) {
 uuidBuilder.setLeastSigBits(clusterId.getLeastSignificantBits());
 uuidBuilder.setMostSigBits(clusterId.getMostSignificantBits());
+keyBuilder.addClusters(uuidBuilder.build());
{code}
addClusters expects a Set.
* Where is HLogKey.PREFIX_CLUSTER_KEY used? Just to read old versions of
WALEdits? We need to discuss whether that is necessary. [~stack]? This has to
do with upgrading WALEdits from pre-0.95.

Otherwise looks great.
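
Two of the points above use only standard JDK calls and can be shown without HBase or protobuf on the classpath: building a one-element set with Collections.singleton, and carrying a UUID as its most and least significant 64 bits, which is what the builder loop in the patch excerpt does. The class UuidBitsSketch and its method names are hypothetical.

```java
import java.util.Collections;
import java.util.Set;
import java.util.UUID;

/** Hypothetical helpers; only java.util APIs are used. */
final class UuidBitsSketch {
  /** Splits a UUID into {mostSigBits, leastSigBits}. */
  static long[] toBits(UUID id) {
    return new long[] {id.getMostSignificantBits(), id.getLeastSignificantBits()};
  }

  /** Rebuilds the UUID from the two halves produced by toBits. */
  static UUID fromBits(long[] bits) {
    return new UUID(bits[0], bits[1]);
  }

  /** Immutable one-element set, replacing "new HashSet" followed by "add". */
  static Set<UUID> singleCluster(UUID localClusterId) {
    return Collections.singleton(localClusterId);
  }
}
```

Note that the set returned by Collections.singleton is immutable, so this substitution only works where nothing adds further cluster IDs to the same set object later.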

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 HBASE-7709.patch, HBASE-7709-rev1.patch, HBASE-7709-rev2.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-21 Thread Vasu Mariyala (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746781#comment-13746781
 ] 

Vasu Mariyala commented on HBASE-7709:
--

Attached the patches for 0.94 (HBASE-7709-rev3.patch) and for 0.95/trunk
(0.95-trunk-rev2.patch), which address the nits mentioned by Lars.

0.94

   a) Changed PREFIX_CLUSTER_KEY to '.' (period as the column family names 
can't start with it)

   b) PREFIX_CONSUMED_CLUSTER_IDS changed to _cs.id

   c) A comment has been added in WALEdit mentioning that it is done for 
backwards compatibility and has been removed in 0.95.2+ releases

trunk/0.95
 
  a) From the protobuf documentation:

 repeated: this field can be repeated any number of times (including zero)
in a well-formed message. The order of the repeated values will be preserved.
 optional: a well-formed message can have zero or one of this field (but
not more than one).

  So does repeated imply it is optional? Also, in WALProtos.java the
clusters list is initialized to an empty list in the initFields() method, so we
would not get a NullPointerException. Maybe I should do more research on
this.

  b) clusters in Import has been changed to use singleton

  c) The builder has a method public Builder
addClusters(org.apache.hadoop.hbase.protobuf.generated.HBaseProtos.UUID value),
which takes a single UUID as the parameter.

  d) Yes, this is used only to read the older log entries when migrating from 
0.94 to 0.95.2.


 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 0.95-trunk-rev2.patch, HBASE-7709.patch, HBASE-7709-rev1.patch, 
 HBASE-7709-rev2.patch, HBASE-7709-rev3.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-21 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746818#comment-13746818
 ] 

Himanshu Vashishtha commented on HBASE-7709:


bq. +  repeated UUID clusters = 8;
 /*
-  optional CustomEntryType custom_entry_type = 8;
-
+  optional CustomEntryType custom_entry_type = 9;

Is this re-ordering OK because 0.96.0 is not released yet?

I think we should have the flexibility to read older edits, since requiring a
clean shutdown is a stringent requirement (especially for larger clusters).
Also, when replication is enabled, there may be some old logs left to replicate.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 0.95-trunk-rev2.patch, HBASE-7709.patch, HBASE-7709-rev1.patch, 
 HBASE-7709-rev2.patch, HBASE-7709-rev3.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-21 Thread Vasu Mariyala (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13746838#comment-13746838
 ] 

Vasu Mariyala commented on HBASE-7709:
--

[~v.himanshu] There is no re-ordering done with the patch. The entry
custom_entry_type is and was commented out. I changed the number to 9 just
in case someone uncomments it in the future. Please let me know if I missed
anything.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 0.95-trunk-rev2.patch, HBASE-7709.patch, HBASE-7709-rev1.patch, 
 HBASE-7709-rev2.patch, HBASE-7709-rev3.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-20 Thread Vasu Mariyala (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13745573#comment-13745573
 ] 

Vasu Mariyala commented on HBASE-7709:
--

0.95-trunk-rev1.patch contains the javadoc fix and is the latest patch for the 
trunk and 0.95 branches. HBASE-7709-rev2.patch is the updated patch on top 
of 0.94, which addresses the comments made by [~lhofhansl], [~jdcryans] and 
[~jeffreyz].

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
Assignee: Vasu Mariyala
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 HBASE-7709.patch, HBASE-7709-rev1.patch, HBASE-7709-rev2.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-16 Thread Vasu Mariyala (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742367#comment-13742367
 ] 

Vasu Mariyala commented on HBASE-7709:
--

I ran all the test cases on my local machine with the trunk patch and they 
pass. But every time the patch is run on Jenkins, the build throws:

FATAL: Unable to delete script file /tmp/hudson5964600500647866956.sh
hudson.util.IOException2: remote file operation failed: /tmp/hudson5964600500647866956.sh at hudson.remoting.Channel@5ce45886:hadoop1
at hudson.FilePath.act(FilePath.java:902)
at hudson.FilePath.act(FilePath.java:879)
at hudson.FilePath.delete(FilePath.java:1288)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:101)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:804)
at hudson.model.Build$BuildExecution.build(Build.java:199)
at hudson.model.Build$BuildExecution.doRun(Build.java:160)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:586)
at hudson.model.Run.execute(Run.java:1597)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:247)
Caused by: hudson.remoting.ChannelClosedException: channel is already closed
at hudson.remoting.Channel.send(Channel.java:516)
at hudson.remoting.Request.call(Request.java:129)
at hudson.remoting.Channel.call(Channel.java:714)
at hudson.FilePath.act(FilePath.java:895)
... 13 more
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1316)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at hudson.remoting.Command.readFrom(Command.java:92)
at hudson.remoting.ClassicCommandTransport.read(ClassicCommandTransport.java:72)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)
FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
hudson.remoting.RequestAbortedException: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
at hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
at hudson.remoting.Request.call(Request.java:174)
at hudson.remoting.Channel.call(Channel.java:714)
at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:167)
at com.sun.proxy.$Proxy40.join(Unknown Source)
at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:925)
at hudson.Launcher$ProcStarter.join(Launcher.java:360)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:91)
at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:60)
at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:804)
at hudson.model.Build$BuildExecution.build(Build.java:199)
at hudson.model.Build$BuildExecution.doRun(Build.java:160)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:586)
at hudson.model.Run.execute(Run.java:1597)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:247)
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:299)
at hudson.remoting.Channel.terminate(Channel.java:774)
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
Caused by: java.io.IOException: Unexpected termination of the channel
at hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1316)
 

[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742560#comment-13742560
 ] 

Hadoop QA commented on HBASE-7709:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12598500/095-trunk.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestAdmin

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6788//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6788//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6788//console

This message is automatically generated.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, HBASE-7709.patch, 
 HBASE-7709-rev1.patch, HBASE-7709-rev2.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13742640#comment-13742640
 ] 

Hadoop QA commented on HBASE-7709:
--

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12598525/0.95-trunk-rev1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6792//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6792//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6792//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6792//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6792//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6792//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6792//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6792//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6792//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6792//console

This message is automatically generated.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: 095-trunk.patch, 0.95-trunk-rev1.patch, 
 HBASE-7709.patch, HBASE-7709-rev1.patch, HBASE-7709-rev2.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13741880#comment-13741880
 ] 

Hadoop QA commented on HBASE-7709:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12598350/HBASE-7709-rev2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6775//console

This message is automatically generated.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: HBASE-7709.patch, HBASE-7709-rev1.patch, 
 HBASE-7709-rev2.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-15 Thread Vasu Mariyala (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13741909#comment-13741909
 ] 

Vasu Mariyala commented on HBASE-7709:
--

The 0.95 patch works for trunk as well.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: HBASE-7709-095.patch, HBASE-7709.patch, 
 HBASE-7709-rev1.patch, HBASE-7709-rev2.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13741951#comment-13741951
 ] 

Hadoop QA commented on HBASE-7709:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12598360/HBASE-7709-095.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 9 new 
or modified tests.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.wal.TestHLog
  org.apache.hadoop.hbase.migration.TestNamespaceUpgrade

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6777//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6777//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6777//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6777//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6777//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6777//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6777//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6777//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6777//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6777//console

This message is automatically generated.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.94.12, 0.96.0

 Attachments: HBASE-7709-095.patch, HBASE-7709.patch, 
 HBASE-7709-rev1.patch, HBASE-7709-rev2.patch



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-13 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13738867#comment-13738867
 ] 

Jean-Daniel Cryans commented on HBASE-7709:
---

The logic here is getting big enough that it should be encapsulated, but I 
can't think right now of a nice way to do it.

WALEdit will need more javadoc.

The reason WALEdit.scopes wasn't instantiated is that you wouldn't need to keep 
an extra TreeMap around if replication wasn't enabled. It seems this patch 
changes that assumption. If it makes more sense to always instantiate it, then 
it should be final; there are a bunch of if (scopes == null) checks that aren't 
needed anymore, and if (scopes != null) would always be true.

bq. A -> B -> C -> A replication

I think you meant the last cluster to be B? It seems we should refactor 
TestMasterReplication a bit because with this patch it would just look like the 
same code is running 3 times (to the untrained eye).
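
The trade-off described above can be illustrated with a minimal sketch. The two classes below are hypothetical and use String keys for brevity; they are not the real WALEdit, and they only show the difference between a lazily allocated map and a final, always-present one.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical: the map is allocated only when a scope is actually set. */
final class LazyScopesEdit {
    private NavigableMap<String, Integer> scopes; // stays null if replication is unused

    void setScope(String family, int scope) {
        if (scopes == null) {
            scopes = new TreeMap<>();
        }
        scopes.put(family, scope);
    }

    boolean hasScopes() {
        // The null check is required on every access.
        return scopes != null && !scopes.isEmpty();
    }
}

/** Hypothetical: the map always exists, so it can be final and null checks go away. */
final class EagerScopesEdit {
    private final NavigableMap<String, Integer> scopes = new TreeMap<>();

    void setScope(String family, int scope) {
        scopes.put(family, scope);
    }

    boolean hasScopes() {
        return !scopes.isEmpty();
    }
}
```

The lazy variant saves one TreeMap per edit when replication is disabled; the eager variant pays for that allocation but removes the null checks.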

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBASE-7709.patch, HBASE-7709-rev1.patch




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-13 Thread Dave Latham (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13738999#comment-13738999
 ] 

Dave Latham commented on HBASE-7709:


Thanks all for the great work on this.

We currently have a pair of clusters in two datacenters in a master/master 
setup and want to migrate one of them to a new datacenter.  I'm trying to 
determine whether this patch will be required for us and would love it if 
someone would be willing to double-check my thinking.

Currently we have
A -> B, B -> A

1. Set up C, create presplit tables with replication_scope enabled on them.
2. Add peer B -> C  (New state A -> B, B -> A, B -> C)
3. Run copy table on each table from B -> C
4. Stop applications in A
5. Wait for queues from A -> B to clear
6. Remove peer A -> B (New state B -> A, B -> C)
7. Remove peer B -> A (New state B -> C)
8. Add peer C -> B (New state B -> C, C -> B)
9. Start applications in C

Given that we can live with applications only running in a single datacenter 
for a period of time, we never need to have writes from one cluster 
replicate into a downstream loop.  Therefore I don't think this patch is 
required for this migration.  Does that sound correct?  So does the state of 
(A <-> B) -> C still trigger the problem?

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12

 Attachments: HBASE-7709.patch, HBASE-7709-rev1.patch


  We just discovered the following scenario:
 # Cluster A and B are set up in master/master replication
 # By accident we had Cluster C replicate to Cluster A.
 Now all edits originating from C will be bouncing between A and B. Forever!
 The reason is that when the edit comes in from C the cluster ID is already set
 and won't be reset.
 We have a couple of options here:
 # Optionally only support master/master (not cycles of more than two
 clusters). In that case we can always reset the cluster ID in the
 ReplicationSource. That means that now cycles > 2 will have the data cycle
 forever. This is the only option that requires no changes in the HLog format.
 # Instead of a single cluster id per edit maintain an (unordered) set of
 cluster ids that have seen this edit. Then in ReplicationSource we drop any
 edit that the sink has seen already. This is the cleanest approach, but it
 might need a lot of data stored per edit if there are many clusters involved.
 # Maintain a configurable counter of the maximum cycle size we want to
 support. Could default to 10 (even maybe even just). Store a hop-count in the
 WAL and the ReplicationSource increases that hop-count on each hop. If we're
 over the max, just drop the edit.
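
To make the scenario and the hop-count option concrete, here is a minimal, self-contained simulation. It is only a sketch: the class and method names are hypothetical and none of this is HBase code. It models the existing check (never ship an edit to the cluster it originated on) plus a hop limit, and counts how often one edit is shipped.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation, not HBase code.
class HopCountSketch {
    /** One in-flight copy of an edit: where it sits, where it was written, hops so far. */
    static final class InFlight {
        final String at;
        final String origin;
        final int hops;

        InFlight(String at, String origin, int hops) {
            this.at = at;
            this.origin = origin;
            this.hops = hops;
        }
    }

    /**
     * Counts how many times an edit written on origin is shipped to a peer.
     * peers maps each cluster to the clusters it replicates to.
     */
    static int countShipments(Map<String, List<String>> peers, String origin, int maxHops) {
        Deque<InFlight> queue = new ArrayDeque<>();
        queue.add(new InFlight(origin, origin, 0));
        int shipments = 0;
        while (!queue.isEmpty()) {
            InFlight e = queue.poll();
            for (String peer : peers.getOrDefault(e.at, Collections.<String>emptyList())) {
                if (peer.equals(e.origin)) {
                    continue; // existing check: never send an edit back to its origin
                }
                if (e.hops + 1 > maxHops) {
                    continue; // hop-count option: over the max, drop the edit
                }
                shipments++;
                queue.add(new InFlight(peer, e.origin, e.hops + 1));
            }
        }
        return shipments;
    }

    public static void main(String[] args) {
        Map<String, List<String>> peers = new HashMap<>();
        peers.put("A", Arrays.asList("B"));
        peers.put("B", Arrays.asList("A"));
        peers.put("C", Arrays.asList("A")); // the accidental peer
        System.out.println("edit from A shipped " + countShipments(peers, "A", 10) + " time(s)");
        System.out.println("edit from C shipped " + countShipments(peers, "C", 10) + " time(s)");
    }
}
```

In this model an edit written on A is shipped once, while an edit written on C keeps bouncing between A and B until the hop limit stops it. Without the limit the loop would not terminate.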

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-13 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13739015#comment-13739015
 ] 

Lars Hofhansl commented on HBASE-7709:
--

You are correct, you do not need this patch as long as you do step #6 before 
step #8. (A <-> B) -> C is fine. C -> (A <-> B) is not.
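
The rule above can be checked mechanically for any peer topology. The following is a hedged sketch with hypothetical names (it is not part of HBase): it models the single-cluster-ID check, where a source ships an edit to every peer except the edit's origin, and reports whether an edit from some origin can reach a cycle that the origin is not part of.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for reasoning about peer topologies, not HBase code.
class PeerTopologySketch {
    private final Map<String, Set<String>> peers = new HashMap<>();

    void addPeer(String from, String to) {
        peers.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    void removePeer(String from, String to) {
        Set<String> targets = peers.get(from);
        if (targets != null) {
            targets.remove(to);
        }
    }

    /** True if an edit written on origin can circulate forever under the single-ID check. */
    boolean editsCanLoop(String origin) {
        Set<String> onStack = new HashSet<>();
        Set<String> done = new HashSet<>();
        for (String next : peers.getOrDefault(origin, Collections.<String>emptySet())) {
            if (!next.equals(origin) && reachesCycle(next, origin, onStack, done)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search that ignores every edge pointing back at the origin,
    // because the single-ID check drops the edit on exactly those edges.
    private boolean reachesCycle(String node, String origin, Set<String> onStack, Set<String> done) {
        if (onStack.contains(node)) {
            return true;
        }
        if (done.contains(node)) {
            return false;
        }
        onStack.add(node);
        for (String next : peers.getOrDefault(node, Collections.<String>emptySet())) {
            if (!next.equals(origin) && reachesCycle(next, origin, onStack, done)) {
                return true;
            }
        }
        onStack.remove(node);
        done.add(node);
        return false;
    }

    /** True if edits from any cluster in the topology can loop. */
    boolean anyOriginLoops() {
        Set<String> clusters = new HashSet<>(peers.keySet());
        for (Set<String> targets : peers.values()) {
            clusters.addAll(targets);
        }
        for (String cluster : clusters) {
            if (editsCanLoop(cluster)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        PeerTopologySketch t = new PeerTopologySketch();
        t.addPeer("A", "B");
        t.addPeer("B", "A");
        t.addPeer("B", "C"); // the pair feeds C: fine
        System.out.println("pair feeding C loops: " + t.anyOriginLoops());
        t.addPeer("C", "B"); // C feeds the pair while it still exists: not fine
        System.out.println("C feeding the pair loops: " + t.anyOriginLoops());
    }
}
```

Replaying the migration steps in order never produces a looping state in this model. Adding the peer from C to B while A and B still replicate to each other does.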




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-12 Thread Vasu Mariyala (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13737365#comment-13737365
 ] 

Vasu Mariyala commented on HBASE-7709:
--

Thanks [~lhofhansl] and [~jeffreyz] for the review comments. I have attached 
the patch HBASE-7709-rev1.patch with the fixes based on your comments. 
I also added test cases for A -> B -> C -> A replication. Please review this 
patch.

For the 0.95+ and trunk fixes, I would like to either remove or deprecate the 
setClusterId and getClusterId methods of Mutation and also HLogKey, as these are 
primarily maintained to avoid cyclic replication. Please let me know your 
opinion on this so that I can work on providing the patches for the same.




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13737363#comment-13737363
 ] 

Hadoop QA commented on HBASE-7709:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12597557/HBASE-7709-rev1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6706//console

This message is automatically generated.



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13737390#comment-13737390
 ] 

Hadoop QA commented on HBASE-7709:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12597567/HBASE-7709-rev1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6707//console

This message is automatically generated.



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-11 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736398#comment-13736398
 ] 

Jeffrey Zhong commented on HBASE-7709:
--

[~vasu.mariy...@gmail.com] I reviewed your patch. Regarding the following lines:
{code}
-  if (!logKey.getClusterId().equals(peerClusterId)) {
+  // don't replicate if the log entries has already been consumed by the peer cluster
+  if (!edit.hasClusterConsumed(peerClusterId)) {
{code}
I think we still need to keep the old check around. The above change may cause 
issues during upgrade. For example, say we have an (A -> B, B -> A) replication 
setup. If we upgrade only cluster A, the above change may cause an infinite loop 
until cluster B is upgraded as well.

The rest of the code looks good to me.
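
One way to picture keeping both checks is sketched below. This is an illustration only: shouldShip and its parameter names are hypothetical and are not taken from the patch. The old check compares the peer with the single origin cluster id, and the new check consults the set of clusters that have already consumed the edit.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical sketch of combining the old and new checks, not the actual patch.
class ShipDecisionSketch {
    static boolean shouldShip(UUID originClusterId, Set<UUID> consumedClusterIds, UUID peerClusterId) {
        // Old check: never send an edit back to the cluster it originated on.
        // Edits from a not-yet-upgraded peer carry only this id.
        if (peerClusterId.equals(originClusterId)) {
            return false;
        }
        // New check: skip peers that have already consumed the edit.
        return !consumedClusterIds.contains(peerClusterId);
    }

    public static void main(String[] args) {
        UUID a = UUID.randomUUID();
        UUID b = UUID.randomUUID();
        // Edit that arrived at upgraded A from non-upgraded B: no consumed set available.
        System.out.println("send back to B: " + shouldShip(b, Collections.<UUID>emptySet(), b));
        // Edit written on A itself and marked as consumed by A.
        Set<UUID> consumed = new HashSet<>();
        consumed.add(a);
        System.out.println("send A's own edit to B: " + shouldShip(a, consumed, b));
    }
}
```

With only the new check, the first case would return true, which is the mixed-version loop described above.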




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-11 Thread Vasu Mariyala (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736423#comment-13736423
 ] 

Vasu Mariyala commented on HBASE-7709:
--

[~jeffreyz]

The cluster ids in the WALEdit contain all the cluster ids, including the id of 
the cluster on which the entry was first created. When ReplicationSource runs on 
A, it marks cluster A itself as consumed in the WALEdit. This entry is then 
sent on to B. ReplicationSource running on B sees that cluster A has already 
consumed the entry and therefore does not send it back to A. If there are any 
scenarios where this would not work, please do let me know.
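
The mechanism described above can be simulated in a few lines. This is a minimal sketch with hypothetical names, not the code from the patch: each source marks its own cluster as consumed before shipping and skips any peer that is already in the consumed set.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical simulation of the consumed-cluster-set idea, not HBase code.
class ConsumedSetSketch {
    /** One in-flight copy of an edit and the clusters that have consumed it so far. */
    static final class InFlight {
        final String at;
        final Set<String> consumed;

        InFlight(String at, Set<String> consumed) {
            this.at = at;
            this.consumed = consumed;
        }
    }

    /** Returns every shipment ("X->Y") made for one edit written on origin. */
    static List<String> shipments(Map<String, List<String>> peers, String origin) {
        List<String> log = new ArrayList<>();
        Deque<InFlight> queue = new ArrayDeque<>();
        queue.add(new InFlight(origin, new HashSet<String>()));
        while (!queue.isEmpty()) {
            InFlight e = queue.poll();
            Set<String> consumed = new HashSet<>(e.consumed);
            consumed.add(e.at); // the source marks its own cluster as consumed
            for (String peer : peers.getOrDefault(e.at, Collections.<String>emptyList())) {
                if (consumed.contains(peer)) {
                    continue; // the peer has already seen this edit
                }
                log.add(e.at + "->" + peer);
                queue.add(new InFlight(peer, consumed));
            }
        }
        return log;
    }

    public static void main(String[] args) {
        Map<String, List<String>> peers = new HashMap<>();
        peers.put("A", Arrays.asList("B"));
        peers.put("B", Arrays.asList("A"));
        peers.put("C", Arrays.asList("A"));
        System.out.println(shipments(peers, "C"));
    }
}
```

The consumed set grows by one cluster on every hop, so each copy of the edit can travel at most as many hops as there are clusters and the simulation always terminates.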



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-11 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736472#comment-13736472
 ] 

Jeffrey Zhong commented on HBASE-7709:
--

[~vasu.mariy...@gmail.com] What you describe will work if both clusters A and B 
are upgraded to the latest bits. Let's say cluster B is NOT upgraded yet while 
A is upgraded. When B sends edits to A, the scope of a WALEdit won't have 
cluster ids, so A's WALEdits won't have cluster id B in their scope. Later, the 
check edit.hasClusterConsumed(peerClusterId) in cluster A will evaluate to 
false and the edits from B will be sent back to cluster B.



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13736200#comment-13736200
 ] 

Lars Hofhansl commented on HBASE-7709:
--

In any event, we should probably have separate issues for the real 0.95+ fix 
and the backwards compatible 0.94 patch. Maybe we can try to keep the API the 
same, or at least similar.



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13735635#comment-13735635
 ] 

Hadoop QA commented on HBASE-7709:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12597230/HBASE-7709.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/6689//console

This message is automatically generated.



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-09 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13735742#comment-13735742
 ] 

Lars Hofhansl commented on HBASE-7709:
--

HadoopQA only works against trunk. But since the trunk patch would presumably 
be quite a bit different it wouldn't help with 0.94.
Will have a look at the patch over the weekend.



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-09 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13735748#comment-13735748
 ] 

Lars Hofhansl commented on HBASE-7709:
--

Patch looks good. Smaller than expected :)
Two comments:
# Maybe change the new methods to addClusterId and getAllClusterIds?
# In ReplicationSink we need to group by unique path, I think, (just like I do 
now by ClusterId) so that the path is maintained during intermediary 
replication.
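
As an illustration of the second comment, grouping a batch by path could look roughly like this. It is a sketch under assumptions: Edit, clusterPath and groupByPath are hypothetical names, and the path is modelled as an ordered list of cluster ids, which may differ from how the patch stores them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of grouping replicated edits by path, not ReplicationSink code.
class GroupByPathSketch {
    static final class Edit {
        final String row;
        final List<String> clusterPath; // clusters this edit has passed through, in order

        Edit(String row, List<String> clusterPath) {
            this.row = row;
            this.clusterPath = clusterPath;
        }
    }

    /** Groups a batch by path, keeping first-seen group order and edit order within a group. */
    static Map<List<String>, List<Edit>> groupByPath(List<Edit> batch) {
        Map<List<String>, List<Edit>> groups = new LinkedHashMap<>();
        for (Edit e : batch) {
            groups.computeIfAbsent(e.clusterPath, k -> new ArrayList<>()).add(e);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Edit> batch = Arrays.asList(
            new Edit("r1", Arrays.asList("A")),
            new Edit("r2", Arrays.asList("A", "B")),
            new Edit("r3", Arrays.asList("A")));
        for (Map.Entry<List<String>, List<Edit>> g : groupByPath(batch).entrySet()) {
            // Each group would be applied on its own, carrying its path along.
            System.out.println(g.getKey() + ": " + g.getValue().size() + " edit(s)");
        }
    }
}
```

Applying each group separately lets the sink attach that group's path to the resulting writes, so the next hop still sees where the edits have been.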




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-08 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733611#comment-13733611
 ] 

Lars Hofhansl commented on HBASE-7709:
--

We also need to keep HBASE-9158 in mind (a bug I just discovered). Here we need 
to group the edits by path and apply them strictly in these groups.



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-07 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732124#comment-13732124
 ] 

Jean-Daniel Cryans commented on HBASE-7709:
---

I'm +1 on [~lhofhansl]'s proposition, let's do it in a different jira?



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-07 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732750#comment-13732750
 ] 

Jeffrey Zhong commented on HBASE-7709:
--

I think for 0.95 and onwards, we should store the relay cluster ids along the
replication path in HLogKey to solve the issue. Since the list of relay
cluster ids is added into each WALEdit, the storage and network traffic
overhead isn't trivial when we have a long replication path.

We can use an optimization (mentioned above as adaptive #2). We introduce a
4-byte path checksum field into HLogKey; a cluster only adds its cluster id to
the relay cluster id list when it finds that multiple paths exist from a
single cluster id. In most cases, such as a simple replication loop or an
acyclic replication path, the relay cluster id list is empty and the overhead
is just the 4-byte path checksum.

For 0.94, we can either use Lars's approach (the configuration option
hbase.enable.cyclic.replication) or introduce a new configuration option
hbase.replication.reset.clusterid=<cluster id that a user specifies> | *.
Only the cluster specified here resets the cluster id to itself. When
hbase.replication.reset.clusterid=*, it is equivalent to Lars's approach.
In addition, we can leverage the existing field HLogKey.writeTime to detect a
loop in 0.94 if a WALEdit has been in replication for too long (say, a
configurable 30 minutes). We can pass the writeTime as an attribute, the same
way the cluster id is passed during replication, so that we can check the
original writeTime to see if we have a possible infinite loop situation.

Thanks.
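
A minimal sketch of the writeTime check described above. The class and method
names are hypothetical and not part of HBase; it assumes the original
writeTime travels with the edit and that cluster clocks are roughly in sync.

```java
/** Hypothetical helper for illustration only; not an HBase class. */
final class StaleEditDetector {
    private final long maxAgeMillis;

    /** maxAgeMinutes is the assumed configurable threshold (e.g. 30). */
    StaleEditDetector(long maxAgeMinutes) {
        this.maxAgeMillis = maxAgeMinutes * 60_000L;
    }

    /**
     * Returns true when the edit has been in replication longer than the
     * threshold, which hints at (but does not prove) a replication loop.
     *
     * @param originalWriteTime write time (ms) set by the cluster that first accepted the edit
     * @param now               current time (ms) on the cluster doing the check
     */
    boolean looksLikeLoop(long originalWriteTime, long now) {
        return now - originalWriteTime > maxAgeMillis;
    }
}
```

A check like this can only flag a suspicious edit for the operator; a long
replication backlog would trip it as well, so it is a detection aid rather
than a safe reason to drop edits.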

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732869#comment-13732869
 ] 

Lars Hofhansl commented on HBASE-7709:
--

Reading back through the comments here. How about we follow Ted's approach?
Instead of writing a boolean we write the number of hops into the HLogKey. 0
will still be interpreted as false in the old code, and > 0 as true. Thus we
can store the number of hops and still not break the existing code (although
that code would not be able to stop the bouncing).
In our Salesforce scenario we would limit the hop count to 3 and would be able
to support our setup that way.

Yet another option is to make this configurable. At this point we're still able
to fully bounce our clusters. So we can do the hop count and optionally (per a
config option) store the full path. This might even be applicable to trunk, as
the user now has the choice between limiting loops to some limit with little
extra storage or being precise at the expense of more storage.
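
The hop-count idea above could be sketched roughly as follows. All names are
hypothetical (this is not HBase code), and the sketch simply takes the
statement above at face value that old code reads the stored value as a
boolean.

```java
/** Hypothetical sketch of a hop counter stored where a boolean used to be. */
final class HopCount {
    private HopCount() {}

    /** How old code would read the value: 0 is false, anything above 0 is true. */
    static boolean legacyFlag(int storedHops) {
        return storedHops != 0;
    }

    /** Value a source would write when shipping the edit to the next cluster. */
    static int nextHop(int storedHops) {
        return storedHops + 1;
    }

    /** True when the edit is over the configured maximum and should be dropped. */
    static boolean shouldDrop(int hops, int maxHops) {
        return hops > maxHops;
    }
}
```

With a limit of 3, an edit that has made three hops is still shipped, and a
fourth hop is dropped.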

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-07 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732880#comment-13732880
 ] 

Jean-Daniel Cryans commented on HBASE-7709:
---

My impression regarding configuring a path or hop count is that if you start
changing the clusters, the upkeep becomes very expensive, and it's not clear
what happens while it's being changed (or if a cluster just goes down).

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-07 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732914#comment-13732914
 ] 

Jeffrey Zhong commented on HBASE-7709:
--

Due to the 0.94 upgrade complications, the configuration approach is a
practical one. Basically we shift the duty to users to break an infinite loop
situation with the newly introduced configuration.

In the meantime we have to provide a way to detect a possible infinite loop
situation in 0.94 so that a user can act upon it. A max hop counter is better
used only for detection, not as the way to break an infinite loop, because
it's error prone when the replication path changes, as JD pointed out above.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732928#comment-13732928
 ] 

Lars Hofhansl commented on HBASE-7709:
--

Are you referring to switching via config option from storing just the hop 
count to storing the path?
Yep, an admin would need to make the call and bounce the cluster (no rolling 
restart when that option is enabled the first time). Not ideal.
I'm just looking for ways to avoid local Salesforce-only patches, but maybe
backporting a trunk patch that stores the path would not be so bad (it sets a
bad precedent here, though :) )

The hop count we can always do safely (I think). In our case we'd enable the 
path option and bounce the cluster(s).

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-07 Thread Vasu Mariyala (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732933#comment-13732933
 ] 

Vasu Mariyala commented on HBASE-7709:
--

Can we add the ids of the clusters that already contain the change to the
scopes variable of the WALEdit? Scopes, being a navigable map of byte array to
integer, could map the byte array of each cluster id to 1 (indicating that the
cluster has already received the change).

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732934#comment-13732934
 ] 

Lars Hofhansl commented on HBASE-7709:
--

[~jeffreyz] The hop count is always safe to do, no? We'd default it to a 
reasonably large value (say 1000). This should be immune to topology changes. 
Without it edits would bounce within the replication ring *forever*, with no 
way to stop it (other than disabling replication or deleting the WAL files), 
that is almost worse than downtime.


 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732947#comment-13732947
 ] 

Lars Hofhansl commented on HBASE-7709:
--

The scope idea might just work. It's only read and matched against column
families, so it should be fine as long as we prefix it with something that
cannot be contained in a column family name.
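
A rough sketch of the prefixed-scope idea, using a plain TreeMap in place of
the WALEdit scopes map. The class, the method names and the 0x00 marker byte
are assumptions made for illustration; the real prefix would have to be a
byte that column family names cannot contain.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.UUID;

/** Hypothetical sketch; not HBase code. */
final class ClusterScopes {
    /** Assumed marker byte that no column family name would start with. */
    static final byte PREFIX = 0x00;

    /** Unsigned lexicographic order, since byte[] has no natural ordering. */
    static final Comparator<byte[]> UNSIGNED = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (c != 0) return c;
        }
        return Integer.compare(a.length, b.length);
    };

    private ClusterScopes() {}

    static NavigableMap<byte[], Integer> newScopes() {
        return new TreeMap<>(UNSIGNED);
    }

    /** Builds the scope key: marker byte followed by the cluster id text. */
    static byte[] key(UUID clusterId) {
        byte[] id = clusterId.toString().getBytes(StandardCharsets.UTF_8);
        byte[] key = new byte[id.length + 1];
        key[0] = PREFIX;
        System.arraycopy(id, 0, key, 1, id.length);
        return key;
    }

    /** Records that the given cluster has already received the edit. */
    static void markSeen(NavigableMap<byte[], Integer> scopes, UUID clusterId) {
        scopes.put(key(clusterId), 1);
    }

    /** A source would skip shipping the edit to a sink for which this is true. */
    static boolean hasSeen(NavigableMap<byte[], Integer> scopes, UUID clusterId) {
        return scopes.containsKey(key(clusterId));
    }
}
```

Because the cluster entries share one prefix, they sort together and stay
separate from ordinary column family entries in the same map.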

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-07 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732984#comment-13732984
 ] 

Jeffrey Zhong commented on HBASE-7709:
--

[~lhofhansl] The hop count is only safe when using a big value. Let's say you
use 1000. For a 3-cluster situation, it means the same data will be re-written
300+ times before we stop. This affects a cluster's performance and slows down
regular replication as well.

Yeah, I think the scope idea can fly, as column family names only allow
printable characters, so it's possible to come up with a special prefix
character to store the cluster id.
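
The "300+ times" figure above can be checked with a small calculation. This
is only an illustration with hypothetical names; it assumes a simple ring in
which every hop is one write on one cluster.

```java
/** Hypothetical helper for the arithmetic only; not HBase code. */
final class RingBounce {
    private RingBounce() {}

    /** Times each cluster in the ring rewrites one edit before the hop cap stops it. */
    static int rewritesPerCluster(int maxHops, int ringSize) {
        if (ringSize <= 0) {
            throw new IllegalArgumentException("ringSize must be positive");
        }
        return maxHops / ringSize;
    }
}
```

For a cap of 1000 hops and 3 clusters this gives 1000 / 3 = 333 rewrites per
cluster, which matches the estimate above.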

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-07 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13732996#comment-13732996
 ] 

Lars Hofhansl commented on HBASE-7709:
--

[~vasu.mariy...@gmail.com], wanna work out a patch? We can work on that 
together if you like.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-07 Thread Vasu Mariyala (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13733131#comment-13733131
 ] 

Vasu Mariyala commented on HBASE-7709:
--

[~lhofhansl], working on it. Sure, will take your help.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-05 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729707#comment-13729707
 ] 

Jean-Daniel Cryans commented on HBASE-7709:
---

Ah ok that's a fancy setup you got there. Sounds ok to me.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730116#comment-13730116
 ] 

stack commented on HBASE-7709:
--

What is to be done on this for 0.95.2?

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.12




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-02 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728275#comment-13728275
 ] 

Jean-Daniel Cryans commented on HBASE-7709:
---

{quote}
So I'd like to introduce a config option: hbase.enable.cyclic.replication. The 
default is true to maintain the current functionality.
If set to false we'd reset the cluster id at each source and hence would only 
support master-master replication (cycles involving more than 2 nodes would 
lead to infinite loops).
{quote}

This seems like a lose-lose. The current functionality has the problem that 
7709 is about and setting the config to false would just make it worse?

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.11




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-02 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728280#comment-13728280
 ] 

Lars Hofhansl commented on HBASE-7709:
--

It would allow an A <-> B <- C scenario, which is currently not possible.
At the same time it would break setups like A -> B -> C -> A.
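
The trade-off can be checked with a small simulation. This is a sketch under
assumptions, not HBase code: ReplicationSim and its methods are made-up names,
an edit carries a single cluster ID, a source never ships an edit to the
cluster named in that ID, and resetAtSource corresponds to setting the
proposed option to false.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ReplicationSim {
    /** Builds a peer map from directed edges written as "X>Y" (X replicates to Y). */
    static Map<String, List<String>> topology(String... edges) {
        Map<String, List<String>> peers = new HashMap<>();
        for (String edge : edges) {
            String[] ends = edge.split(">");
            peers.computeIfAbsent(ends[0], k -> new ArrayList<>()).add(ends[1]);
        }
        return peers;
    }

    /** Counts deliveries of one edit; reaching cap means the edit is looping. */
    static int deliveries(Map<String, List<String>> peers, String origin,
                          boolean resetAtSource, int cap) {
        Deque<String[]> queue = new ArrayDeque<>(); // {cluster holding the edit, stored cluster ID}
        queue.add(new String[] {origin, origin});
        int count = 0;
        while (!queue.isEmpty()) {
            String[] held = queue.poll();
            List<String> targets = peers.get(held[0]);
            if (targets == null) {
                continue; // this cluster replicates to nobody
            }
            for (String peer : targets) {
                if (peer.equals(held[1])) {
                    continue; // never ship back to the cluster named in the edit
                }
                if (++count >= cap) {
                    return count;
                }
                // Either stamp the shipping cluster's ID or keep the stored one.
                queue.add(new String[] {peer, resetAtSource ? held[0] : held[1]});
            }
        }
        return count;
    }
}
```

In this model an edit from C in the pair-plus-C setup stops after two
deliveries when the ID is reset and loops when it is kept. The three-cluster
ring behaves the opposite way.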


 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.11




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-08-02 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728432#comment-13728432
 ] 

Lars Hofhansl commented on HBASE-7709:
--

In fact we will have the following setup:

A <-> B, C <-> D, E <-> F, ... where these are all pairs of DR clusters. We 
keep both clusters of a pair as masters so that a failover for other reasons, 
even just as an exercise, does not need further configuration.
We sometimes migrate an entire cluster, say A. In that case we'd also replicate 
A -> C. Currently we can't do that, because the data from A would bounce 
between C and D forever.


 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.11




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-07-17 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13711868#comment-13711868
 ] 

Lars Hofhansl commented on HBASE-7709:
--

That would be more flexible, but at the same time more tedious to manage.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.10




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-07-16 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13710575#comment-13710575
 ] 

Lars Hofhansl commented on HBASE-7709:
--

Any opinions?

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.10




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-07-16 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13710629#comment-13710629
 ] 

Jeffrey Zhong commented on HBASE-7709:
--

For 0.94, I think it would be better to introduce a configuration setting, 
hbase.replication.reset.clusterid=clusterId, which the user specifies. Only the 
cluster specified there resets the clusterId to itself, so we can still support 
master-master replication involving more than 2 nodes without bumping up the 
logkey version.

We could possibly bump up the HLogKey version using one upgrade configuration 
setting, such as upgrade.logkey, plus two rounds of rolling restarts. Initially 
the setting is false. The first rolling restart upgrades the RS bits (the new 
RSs still write the HLogKey in the old version). After all RSs are upgraded, we 
set the configuration to true and do the second rolling restart.

This complicates the upgrade scenario a little and requires that all clusters 
involved in the replication be upgraded. 
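
A minimal sketch of why two rounds are needed, assuming the old key version is
2 and the new one is 3. LogKeyCodec and both methods are hypothetical names,
not part of HBase.

```java
/** Hypothetical log-key version gate; the version numbers are illustrative only. */
final class LogKeyCodec {
    static final int OLD_VERSION = 2; // assumed current HLogKey version
    static final int NEW_VERSION = 3; // assumed bumped version

    /** Upgraded code keeps writing the old format until the flag is turned on. */
    static int versionToWrite(boolean upgradeLogKey) {
        return upgradeLogKey ? NEW_VERSION : OLD_VERSION;
    }

    /** Upgraded bits read both formats; bits that predate the upgrade read only the old one. */
    static boolean canRead(boolean upgradedBits, int version) {
        return version == OLD_VERSION || (upgradedBits && version == NEW_VERSION);
    }
}
```

During the first round a not-yet-upgraded server can still read everything
that is written, because the flag is false. The flag may only be flipped once
every reader has the new bits.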


 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.10




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-07-10 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13704388#comment-13704388
 ] 

Lars Hofhansl commented on HBASE-7709:
--

It seems for 0.94 we can either do option #1 or nothing at all.

So I'd like to introduce a config option: hbase.enable.cyclic.replication. The 
default is true to maintain the current functionality.
If set to false we'd reset the cluster id at each source and hence would only 
support master-master replication (cycles involving more than 2 nodes would 
lead to infinite loops).


 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.95.2, 0.94.10




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-05-01 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13646413#comment-13646413
 ] 

Jeffrey Zhong commented on HBASE-7709:
--

[~lhofhansl] Sure. Is it all right to implement option #6 for 0.94 and 
adaptive option #2 for trunk? 

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.94.8, 0.95.1




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-04-30 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13646400#comment-13646400
 ] 

Lars Hofhansl commented on HBASE-7709:
--

The proposal sounds good. Are you still planning to work on this [~jeffreyz]?

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.94.6, 0.95.1
Reporter: Lars Hofhansl
 Fix For: 0.98.0, 0.94.8, 0.95.1




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-04-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13623119#comment-13623119
 ] 

Lars Hofhansl commented on HBASE-7709:
--

Sorry, I missed this. I need to read through and digest it. :)
In any event, moving to 0.94.8.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.95.1, 0.94.6
Reporter: Lars Hofhansl
 Fix For: 0.95.1, 0.94.7




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-03-11 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13598579#comment-13598579
 ] 

Jeffrey Zhong commented on HBASE-7709:
--

Continue with more proposals...

The disadvantages of option #2 are as obvious as its advantages. Even in cases 
(maybe the majority of replication use cases) where there is no loop at all, 
just a long replication queue, the downstream RSs still need to replay and 
store a long list of clusterIds for each WALEdit. Encoding may help compress 
the clusterId list when sending, but not when storing.

Let me first try to show that we can do better than option #2, and then present 
an alternative that is good in most cases without needing more storage. Both 
options are good IMHO.

As we know, a loop is caused by a back-edge in the graph. We can roughly 
identify one by checking whether a region server sees more than one path from 
the same source. If so, a loop is likely. Only then do we need to append the 
current cluster ID to the source cluster ID of a WAL edit for later loop 
detection. Therefore, in most cases, where there is no loop or just a simple 
master-master-master... cycle, we don't need to store a long clusterId list.

I call this updated option #2 "adaptive option #2": it only needs more storage 
when there is a need. We can implement it as follows:

1) Maintain a hash string PathChecksum (= Hash(receivedPathChecksum + current 
clusterId)) for a WAL edit.
2) Each replaying and receiving region server maintains an in-memory 
ClusterDistanceMap<clusterId, Set<PathChecksum>> of the paths seen so far.
  2.a Every time it sees a new PathChecksum (one that isn't in 
Set<PathChecksum>), it adds the new PathChecksum to the set. It drops a stale 
one from the set when it expires, i.e. when the region server hasn't seen any 
data coming in from that path for a configurable time period.
3) When the size of Set<PathChecksum> > 1, append the current cluster ID to the 
WAL edit for later replication loop detection.

We can use the top 8 bytes of clusterId to store the PathChecksum and the 
remaining 8 bytes for the hash of the original clusterId value. With this 
update, we only pay the cost when there is a need. 
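
The bookkeeping in steps 1 to 3 could look roughly like the sketch below. It
is only an illustration: PathTracker is a hypothetical name, the multiply-and-add
checksum is a stand-in for whatever hash would really be used, and time
is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Map;

final class PathTracker {
    // source cluster ID -> (path checksum -> last time, in millis, that path delivered data)
    private final Map<String, Map<Long, Long>> pathsBySource = new HashMap<>();
    private final long expiryMillis;

    PathTracker(long expiryMillis) {
        this.expiryMillis = expiryMillis;
    }

    /** Step 1: fold the current cluster ID into the checksum received with the edit. */
    static long nextChecksum(long receivedChecksum, String currentClusterId) {
        return 31L * receivedChecksum + currentClusterId.hashCode(); // placeholder hash
    }

    /** Steps 2 and 3: record the path, expire stale paths, report whether more than one is live. */
    boolean multiplePathsSeen(String sourceClusterId, long pathChecksum, long nowMillis) {
        Map<Long, Long> paths =
            pathsBySource.computeIfAbsent(sourceClusterId, k -> new HashMap<>());
        paths.put(pathChecksum, nowMillis);
        paths.values().removeIf(lastSeen -> nowMillis - lastSeen > expiryMillis);
        return paths.size() > 1;
    }
}
```

A true result is the trigger for the extra work in step 3. As long as every
source arrives over a single path, nothing is appended to the edit.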


As you can imagine, real-life replication setups normally don't involve any 
complicated graph, so option #2 uses extra storage to deal with situations that 
most likely won't happen. Therefore, in the following, I want to propose a 
solution that doesn't change the current WAL format and is good for most cases, 
including the situation that triggered this JIRA. In extreme cases, it reports 
errors for an infinite loop. 

The new proposal (option #6) is as follows:
1) Maintain a hash string PathChecksum (= Hash(receivedPathChecksum + current 
clusterId)) for a WAL edit.
2) Each replaying and receiving region server maintains an in-memory 
ClusterDistanceMap<clusterId, Set<PathChecksum>> of the paths seen so far.
  2.a Every time it sees a new PathChecksum (one that isn't in 
Set<PathChecksum>), it adds the new PathChecksum to the set. It drops a stale 
one from the set when it expires, i.e. when the region server hasn't seen any 
data coming in from that path for a configurable time period.
3) When the size of Set<PathChecksum> > 1, reset the WAL edit's clusterId to the 
current clusterId and increment a counter (ResetCounter) that records how many 
times the WAL edit's clusterId has been reset.
4) When ResetCounter > 64, report an error. (We could drop the WAL edits as 
well, because ResetCounter > 64 means we have at least 64 back-edges or 
duplicated sources. I think a setup that complicated is very unlikely.)
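
Steps 3 and 4 amount to a small decision on each received edit. A minimal
sketch, assuming the multiple-paths check from step 2 is computed elsewhere.
ResetDecision and Key are hypothetical names, not the real HLogKey.

```java
final class ResetDecision {
    static final int MAX_RESETS = 64; // limit taken from step 4

    /** Mutable stand-in for the relevant fields of one WAL edit's key. */
    static final class Key {
        String clusterId;
        int resetCounter;

        Key(String clusterId) {
            this.clusterId = clusterId;
        }
    }

    /** Returns false when the edit should be reported (or dropped) as a likely infinite loop. */
    static boolean onReceive(Key key, boolean multiplePathsSeen, String currentClusterId) {
        if (multiplePathsSeen) {
            key.clusterId = currentClusterId; // step 3: reset to the current cluster
            key.resetCounter++;
        }
        return key.resetCounter <= MAX_RESETS; // step 4: over 64 resets is treated as an error
    }
}
```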

The advantage of the above option is that it can possibly use the existing HLog 
format to prevent the loop situations that occur in real-life cases.

To implement:
1) We can introduce a new version (3) in HLogKey.
2) Use the top 7 bytes of the UUID to store the PathChecksum, the following 1 
byte to store RD, and the remaining 8 bytes for a hash of the original 16-byte 
UUID value. This does not compromise uniqueness because in most cases we have 
tens of clusters involved in replication and the collision probability is less 
than 10^(-18).
3) We can introduce a configuration setting that defaults to false (suggested by 
Lars). After we roll out the feature, we can turn it on, and turn it off again 
in a revert scenario.
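
The 7 + 1 + 8 byte layout in item 2 can be sketched with java.util.UUID as the
16-byte container. PackedClusterId is a hypothetical name, and the XOR/rotate
mix is only a placeholder for a real 64-bit hash, so it says nothing about the
collision probability quoted above.

```java
import java.util.UUID;

final class PackedClusterId {
    /** Packs a 56-bit path checksum, a one-byte counter, and a 64-bit hash of the original UUID. */
    static UUID pack(long pathChecksum, int counter, UUID original) {
        // High 8 bytes: 7 bytes of checksum (its top byte is discarded) plus 1 counter byte.
        long high = (pathChecksum << 8) | (counter & 0xFFL);
        // Low 8 bytes: placeholder hash of the original 16-byte value.
        long low = original.getMostSignificantBits()
            ^ Long.rotateLeft(original.getLeastSignificantBits(), 32);
        return new UUID(high, low);
    }

    static long pathChecksum(UUID packed) {
        return packed.getMostSignificantBits() >>> 8;
    }

    static int counter(UUID packed) {
        return (int) (packed.getMostSignificantBits() & 0xFF);
    }
}
```

Only the low 56 bits of the checksum survive the round trip, so a real
implementation would have to truncate its hash to 7 bytes before packing.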

Thanks,
-Jeffrey


 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.95.0, 0.94.6
Reporter: Lars Hofhansl
 Fix For: 0.95.0, 0.94.7



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-03-08 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13597418#comment-13597418
 ] 

Jeffrey Zhong commented on HBASE-7709:
--

I have another idea which IMHO is better. The basic idea is as follows:

1) We maintain a counter called RD (replication distance), which represents how 
far a WAL edit has traveled from its source cluster to the current cluster, 
like the hop count mentioned in option 3.
2) Each replaying and receiving region server maintains an in-memory 
ClusterDistanceMap<clusterId, MIN(RD)>. Every time it sees a WAL edit with an RD 
smaller than the one it has recorded, it updates the map with the smaller RD 
value.
3) Drop all WAL edits from a cluster with RD > the one the current region 
server has in the ClusterDistanceMap.

Initially we could duplicate data for the first several WAL edits, but that 
will be corrected soon, so we don't need to persist any data for the failover 
scenario. 

The above idea is similar to option 3, but it avoids always double-replicating 
data on some clusters. Maintaining the max hop count is also prone to human 
error: we may forget to bump up the value when more clusters join the 
replication cycle.

Why does it work? It is like loop detection: the quick walker catches up with 
the slow walker but travels farther.
When we have an infinite replication loop as described in this JIRA, the data 
from a source must reach the destination by multiple paths with different RDs. 
Because loops are involved, the RDs won't be the same; otherwise there would be 
no loop. Since the RDs differ, we just need to keep the data from the source 
with the minimum distance.  

You may ask about the diamond situation, like the following:

a->b->d
a->c->d

where the data from a will be replicated to d twice. That happens because we 
configured d to receive a's data twice. If a loop is involved, the looped-back 
data will be dropped as described above.
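
The three steps and the diamond case can be sketched as follows. This is an
illustration only: the class and method names are made up, and the map lives
purely in memory as proposed above.

```java
import java.util.HashMap;
import java.util.Map;

final class ClusterDistanceMap {
    // source cluster ID -> smallest replication distance (RD) seen so far
    private final Map<String, Integer> minDistance = new HashMap<>();

    /** Returns true if the edit should be applied, false if it arrived over a longer path. */
    boolean accept(String sourceClusterId, int rd) {
        Integer known = minDistance.get(sourceClusterId);
        if (known == null || rd < known) {
            minDistance.put(sourceClusterId, rd); // step 2: remember the shorter distance
            return true;
        }
        return rd == known; // step 3: drop anything with a larger RD
    }
}
```

Edits that arrive over two paths of equal length, as in the diamond, are both
accepted. An edit that has gone around a loop carries a larger RD and is
dropped.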


This is a general loop detection strategy, so we can implement it in 0.96 or 
above. For 0.94:

1) We can introduce a new version (3) in HLogKey.
2) Use the top two bytes of the UUID to store the RD value and the remaining 14 
bytes for a hash of the original 16-byte UUID value. This does not compromise 
uniqueness because in most cases we have tens of clusters involved in 
replication and the collision probability is less than 10^(-18).
OR
use Ted's suggestion to overload the boolean byte.
3) We can introduce a configuration setting that defaults to true. When we want 
to revert the new behavior, we can turn it off. 

Please let me know what you think. Assign the ticket to me first in case we 
agree to implement it the way I'm proposing.

Thanks,
-Jeffrey


  
 

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.95.0, 0.94.6
Reporter: Lars Hofhansl
 Fix For: 0.95.0, 0.94.7




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-03-08 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13597532#comment-13597532
 ] 

Lars Hofhansl commented on HBASE-7709:
--

Hey Jeffrey, that is my option #3 in the description, right? :)

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.95.0, 0.94.6
Reporter: Lars Hofhansl
Assignee: Jeffrey Zhong
 Fix For: 0.95.0, 0.94.7




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-03-08 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13597611#comment-13597611
 ] 

Jeffrey Zhong commented on HBASE-7709:
--

[~lhofhansl] My proposal is similar to option 3 because both use a hop counter 
(called replication distance in my proposal). However, as I mentioned in the 
proposal: 
{quote}
The above idea is similar to option 3, but it avoids always double-replicating 
data on some clusters. Maintaining a max hop-count is also prone to human error: 
we may forget to bump the max hop-count value when more clusters join the 
replication cycle.
{quote}

In the new proposal, region servers dynamically discover and maintain the 
MIN(RD) from a cluster and drop all edits with higher RDs from the same cluster.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.95.0, 0.94.6
Reporter: Lars Hofhansl
Assignee: Jeffrey Zhong
 Fix For: 0.95.0, 0.94.7




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-03-08 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13597636#comment-13597636
 ] 

Lars Hofhansl commented on HBASE-7709:
--

Ah yes. Cool. I should read the entire text before replying. Yes, that should 
work. I like it. As you say, the distance data does not have to be persisted; 
upon restart an RS would just relearn it.

Generally, do you like this better than option #2? Would #2 store too much data?

As for 0.94: I like the config option, but it needs to default to off, so that 
we can do rolling restarts by default.


 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.95.0, 0.94.6
Reporter: Lars Hofhansl
Assignee: Jeffrey Zhong
 Fix For: 0.95.0, 0.94.7




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-03-08 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13597711#comment-13597711
 ] 

Enis Soztutar commented on HBASE-7709:
--

I like option #2 better than this. It is simpler. Jeff's idea is good, but 
it has the problem of dealing with topology changes. If the topology changes 
in a way that makes the normal route to a cluster longer, then all the updates 
afterwards will be dropped unless we somehow clear the cached mappings. This 
brings in the operational burden of cleaning the caches of downstream clusters 
once the admin changes the topology upstream. 
{code}
A -> B -> C is changed to A -> B -> D -> C -> B 
{code}

Orthogonal to this, we should also be dropping the edits at the replication 
source, not the sink. We are doubling the network cost in cyclic cases. #2 also 
helps with this, because we can detect the sink cluster's id and filter those 
edits out. 

We can do a similar dynamic dictionary encoding for storing the set of cluster 
ids. We can do it as a follow-up optimization.
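
A rough model of option #2 with the source-side filtering suggested above. The names are hypothetical and this is not the HBase implementation; it assumes each edit carries the set of cluster ids that have already applied it.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/** Hypothetical model of option #2; not the HBase implementation. */
class SeenClusters {
  /** Source side: ship only to peers that have not seen the edit yet. */
  static boolean shouldShip(Set<UUID> seenBy, UUID peerClusterId) {
    return !seenBy.contains(peerClusterId);
  }

  /** Returns a copy of the set with the given cluster recorded as having seen the edit. */
  static Set<UUID> markSeen(Set<UUID> seenBy, UUID clusterId) {
    Set<UUID> copy = new HashSet<>(seenBy);
    copy.add(clusterId);
    return copy;
  }
}
```

For the scenario in the description, an edit from C reaches A and then B carrying {C, A, B}, so B's source never ships it back to A and nothing crosses the network a second time.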



 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.95.0, 0.94.6
Reporter: Lars Hofhansl
Assignee: Jeffrey Zhong
 Fix For: 0.95.0, 0.94.7




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-03-08 Thread Jeffrey Zhong (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13597734#comment-13597734
 ] 

Jeffrey Zhong commented on HBASE-7709:
--

I agree #2 is simpler, but it comes at the cost of replaying more data and 
storing more data.

As Enis mentioned, my proposal will need special handling when a new cluster 
joins: either dynamically encoding a special token to let downstream RSs 
reset their internal caches, or asking operators to reset replication. That 
makes my proposal less appealing. 


 

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 0.95.0, 0.94.6
Reporter: Lars Hofhansl
Assignee: Jeffrey Zhong
 Fix For: 0.95.0, 0.94.7




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-02-07 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13574115#comment-13574115
 ] 

Ted Yu commented on HBASE-7709:
---

TestRowProcessorEndpoint shows another potential implementation for option #3:
{code}
// We can also inject some metadata into the walEdit.
// "hops" is assumed to be the hop count tracked by the caller.
KeyValue metaKv = new KeyValue(
    row, HLog.METAFAMILY,
    Bytes.toBytes("num of hops"),
    Bytes.toBytes(hops));
walEdit.add(metaKv);
{code}

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Lars Hofhansl
 Fix For: 0.96.0, 0.94.6




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-02-02 Thread Ian Varley (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13569611#comment-13569611
 ] 

Ian Varley commented on HBASE-7709:
---

At a minimum, this should be called out in the reference guide and replication 
page. Replication is still a pretty advanced feature, and replication for > 2 
clusters even more so; if a patch doesn't go into 0.94.5, it's not the end of 
the world.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Lars Hofhansl
 Fix For: 0.96.0, 0.94.6




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-02-02 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13569623#comment-13569623
 ] 

Ted Yu commented on HBASE-7709:
---

Looking at HLogKey#readFields():
{code}
if (version.atLeast(Version.INITIAL)) {
  if (in.readBoolean()) {
{code}
From the javadoc of readBoolean():

Reads one input byte and returns true if that byte is nonzero, false if that 
byte is zero.

I think there is room to implement option #3 in the description. We can 
introduce a new version (two, considering compression) where we write, instead 
of true, the number of hops that the HLog.Entry has gone through, starting with 
1. A byte should suffice for this purpose.
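
A small sketch of why a hop count can reuse the boolean's byte: readBoolean() treats any nonzero byte as true, so a count starting at 1 still reads as true. The helper below is hypothetical and does not reproduce the real HLogKey layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper; it does not reproduce the real HLogKey layout. */
class HopCountByte {
  /** Writes the hop count (1..255) in the slot that used to hold the boolean. */
  static byte[] write(int hops) {
    if (hops < 1 || hops > 255) {
      throw new IllegalArgumentException("hops out of range: " + hops);
    }
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeByte(hops);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** New-style read: interprets the byte as an unsigned hop count. */
  static int readHops(byte[] data) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      return in.readUnsignedByte();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Old-style read: the same byte seen through readBoolean(). */
  static boolean readLegacyFlag(byte[] data) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      return in.readBoolean();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

A source could then compare the value from readHops against a configured maximum and drop the edit once it is exceeded, as option #3 describes.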

+1 on documenting this intricacy for 0.94.x in the refguide.

I think we should create several subtasks for this JIRA.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Lars Hofhansl
 Fix For: 0.96.0, 0.94.6




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-02-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13569466#comment-13569466
 ] 

Lars Hofhansl commented on HBASE-7709:
--

Any good ideas for 0.94? We cannot change the HLog format in a 
non-backwards-compatible way there.
Maybe in 0.94 we can do something simple along Ian's line of thinking. I don't 
care if it blows up in this case; even the RSs just aborting is better than an 
infinite back and forth of replication data (which will fill up the memstore 
with useless versions, forever).


 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Lars Hofhansl
 Fix For: 0.96.0, 0.94.6




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-01-29 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565699#comment-13565699
 ] 

Lars Hofhansl commented on HBASE-7709:
--

Thanks to [~cody.mar...@gmail.com], [~ivarley], and [~jesse_yates] for finding 
the issue.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Lars Hofhansl
 Fix For: 0.96.0, 0.94.6




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-01-29 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565706#comment-13565706
 ] 

Ted Yu commented on HBASE-7709:
---

Option #2 seems the best.
I think the number of clusters exceeding 10 in master/master replication would 
be rare.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Lars Hofhansl
 Fix For: 0.96.0, 0.94.6




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-01-29 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565714#comment-13565714
 ] 

Lars Hofhansl commented on HBASE-7709:
--

I'd agree. Need to check if this is possible in 0.94 while keeping the HLog 
backwards compatible. If that is tricky for 0.94 we might need option #1.
Also, I cannot promise that I will get to this any time soon.

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Lars Hofhansl
 Fix For: 0.96.0, 0.94.6




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-01-29 Thread Ian Varley (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565752#comment-13565752
 ] 

Ian Varley commented on HBASE-7709:
---

Would another option be to do some kind of checking at add_peer time, to make 
sure no pernicious cycles are detected? I.e. when I add a peer, first walk the 
graph of current master/peer relationships and refuse to add if I detect a 
cycle I'm not part of? Would require an API to ask that question, but that's 
probably a good thing anyway.
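
One way such an add_peer check could look, as a sketch only. It assumes the replication graph is already available as a map from cluster name to its peers; the API for collecting that map is exactly the part that does not exist yet, and all names here are hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical add_peer check; the peers map is assumed to be collected elsewhere. */
class PeerCycleCheck {
  /** True if a cycle that does not include self is reachable from newPeer. */
  static boolean hasForeignCycle(Map<String, List<String>> peers, String self, String newPeer) {
    return visit(newPeer, self, peers, new HashSet<>(), new HashSet<>());
  }

  private static boolean visit(String node, String self, Map<String, List<String>> peers,
      Set<String> onPath, Set<String> finished) {
    if (node.equals(self) || finished.contains(node)) {
      return false; // cycles through self are not the problem case; finished nodes are clean
    }
    if (!onPath.add(node)) {
      return true; // node is already on the current path: a cycle without self
    }
    for (String next : peers.getOrDefault(node, Collections.emptyList())) {
      if (visit(next, self, peers, onPath, finished)) {
        return true;
      }
    }
    onPath.remove(node);
    finished.add(node);
    return false;
  }
}
```

With A and B replicating to each other, cluster C asking to add A as a peer would be refused, while closing a ring in which C already takes part would be allowed.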

 Infinite loop possible in Master/Master replication
 ---

 Key: HBASE-7709
 URL: https://issues.apache.org/jira/browse/HBASE-7709
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Lars Hofhansl
 Fix For: 0.96.0, 0.94.6




[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-01-29 Thread Ian Varley (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565760#comment-13565760
 ] 

Ian Varley commented on HBASE-7709:
---

(Because cycles > 2 are still fine, they just have to include all nodes. It can 
go A -> B -> C -> A; when an edit from A gets to C, it won't re-send to A, and 
the cycle will stop. The problem is just when it's a cycle from A -> (B -> C -> 
B).)
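
To make the difference concrete, here is a toy model of the single-cluster-ID rule as described in this thread: a cluster forwards an edit to each peer unless that peer is the edit's origin. The class and method names are made up for illustration and the model ignores everything else ReplicationSource does.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Toy model of the origin-cluster-ID rule; not HBase code.
final class OriginIdSimulation {

    /**
     * Follows one edit written on origin. Returns the number of shipments,
     * or -1 if the edit is still travelling after maxShipments.
     */
    static int shipments(Map<String, List<String>> peersOf, String origin, int maxShipments) {
        Deque<String> holders = new ArrayDeque<>();
        holders.add(origin);
        int shipped = 0;
        while (!holders.isEmpty()) {
            String current = holders.poll();
            for (String peer : peersOf.getOrDefault(current, Collections.<String>emptyList())) {
                if (peer.equals(origin)) {
                    continue; // peer's cluster ID matches the edit's origin: do not send it back
                }
                if (++shipped > maxShipments) {
                    return -1; // give up, the edit keeps bouncing
                }
                holders.add(peer);
            }
        }
        return shipped;
    }
}
```

For A -> B -> C -> A and an edit written on A, the walk ends after two shipments. For A -> (B -> C -> B) the same edit never stops, and the method returns -1 once the cap is hit.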



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-01-29 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565794#comment-13565794
 ] 

Ted Yu commented on HBASE-7709:
---

The new API would allow specification of more than one cluster, right?
What about (B -> C -> B) -> A where B replicates to A unidirectionally?

I think option #2 is the general solution.
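
As a sketch of why option #2 from the issue description is general: if each edit carries the set of cluster IDs that have already seen it, and a source drops the edit for any sink in that set, every path an edit takes visits each cluster at most once. The model below is illustrative only; `SeenSetSimulation` and its members are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of option #2 (a set of cluster IDs per edit); not HBase code.
final class SeenSetSimulation {

    private static final class InFlight {
        final String holder;     // cluster currently holding this copy of the edit
        final Set<String> seen;  // clusters that have already seen it
        InFlight(String holder, Set<String> seen) {
            this.holder = holder;
            this.seen = seen;
        }
    }

    /** Returns the number of shipments, or -1 if maxShipments is exceeded. */
    static int shipments(Map<String, List<String>> peersOf, String origin, int maxShipments) {
        Deque<InFlight> queue = new ArrayDeque<>();
        Set<String> first = new HashSet<>();
        first.add(origin);
        queue.add(new InFlight(origin, first));
        int shipped = 0;
        while (!queue.isEmpty()) {
            InFlight edit = queue.poll();
            for (String peer : peersOf.getOrDefault(edit.holder, Collections.<String>emptyList())) {
                if (edit.seen.contains(peer)) {
                    continue; // the sink has seen this edit already: drop it
                }
                if (++shipped > maxShipments) {
                    return -1;
                }
                Set<String> seen = new HashSet<>(edit.seen);
                seen.add(peer);
                queue.add(new InFlight(peer, seen));
            }
        }
        return shipped;
    }
}
```

In the scenario from the description (C -> A, with A and B in master/master), an edit from C is shipped twice and then dropped. The cost is the per-edit set, which grows with the number of clusters involved.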



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-01-29 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565835#comment-13565835
 ] 

Ted Yu commented on HBASE-7709:
---

For HLog.Entry:
{code}
public void write(DataOutput dataOutput) throws IOException {
  this.key.write(dataOutput);
  this.edit.write(dataOutput);
}
{code}
where the first integer for WALEdit is versionOrLength. If it is > 0, it is the 
length. Otherwise it should be -1.
We can introduce a marker (e.g. WALEdit.VERSION_3 == -2) after which 
additional cluster IDs can be serialized.

This is an incompatible change, which should be acceptable for the singularity 
release.
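
A minimal sketch of how such a marker could be written and recognized. Only the -1 and -2 values come from the comment above; the layout after the marker (a count followed by each ID as two longs), the use of `UUID` for cluster IDs, and all names in `WalEditHeaderSketch` are assumptions for illustration, not the actual WALEdit format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Illustrative only; the real WALEdit serialization differs.
final class WalEditHeaderSketch {
    static final int VERSION_2 = -1; // existing marker mentioned in the comment
    static final int VERSION_3 = -2; // proposed marker; what follows it is assumed here

    static void writeHeader(DataOutput out, List<UUID> clusterIds) throws IOException {
        out.writeInt(VERSION_3);
        out.writeInt(clusterIds.size());
        for (UUID id : clusterIds) {
            out.writeLong(id.getMostSignificantBits());
            out.writeLong(id.getLeastSignificantBits());
        }
    }

    static List<UUID> readClusterIds(DataInput in) throws IOException {
        int versionOrLength = in.readInt();
        List<UUID> ids = new ArrayList<>();
        if (versionOrLength != VERSION_3) {
            // Older layouts (-1, or a positive length) carry no extra cluster
            // IDs; a real reader would continue with the existing parsing.
            return ids;
        }
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            long most = in.readLong();
            long least = in.readLong();
            ids.add(new UUID(most, least));
        }
        return ids;
    }

    /** Writes and re-reads a header in memory, to show the two sides agree. */
    static List<UUID> roundTrip(List<UUID> clusterIds) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            writeHeader(new DataOutputStream(bytes), clusterIds);
            return readClusterIds(
                new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A reader that predates the new marker would not know what -2 means, which is the incompatibility mentioned above.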



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-01-29 Thread Ian Varley (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565847#comment-13565847
 ] 

Ian Varley commented on HBASE-7709:
---

Re: (B -> C -> B) -> A, that's fine; no participants are detecting a cycle 
they're not part of (A isn't adding any peers, it's the slave). B detects a 
cycle it's part of (B -> C -> B) and C does as well.

The API would be simple, and would let the caller walk the graph of clusters: 
ask the peer you're trying to add for all of its peers, then ask each of them 
in turn, and build up a graph structure that you can interrogate. Only call is 
"Tell me your current peers."
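
The walk itself could be sketched like this, with a `Function` standing in for the "tell me your current peers" call. That function, `PeerGraphWalk`, and `discover` are all hypothetical; no such HBase API is implied.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch; currentPeers stands in for a remote call.
final class PeerGraphWalk {

    /**
     * Starting from the peer about to be added, asks each cluster for its
     * current peers and returns the discovered edges (cluster -> its peers).
     */
    static Map<String, Set<String>> discover(String start,
                                             Function<String, Set<String>> currentPeers) {
        Map<String, Set<String>> graph = new LinkedHashMap<>();
        Deque<String> toAsk = new ArrayDeque<>();
        toAsk.add(start);
        while (!toAsk.isEmpty()) {
            String cluster = toAsk.poll();
            if (graph.containsKey(cluster)) {
                continue; // already asked this cluster
            }
            Set<String> peers = new LinkedHashSet<>(currentPeers.apply(cluster));
            graph.put(cluster, peers);
            toAsk.addAll(peers);
        }
        return graph;
    }
}
```

The result only contains clusters reachable by following peer links forward from the starting point, and each of them has to be reachable over the network from the caller.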

I suppose this could cause problems if not all clusters can communicate; say, 
if B is visible to A, and C is visible to B, but C is not visible to A. And I 
guess there might be race conditions if you try to add peers on multiple 
clusters simultaneously; there's not really a way to avoid that.



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-01-29 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565851#comment-13565851
 ] 

Ted Yu commented on HBASE-7709:
---

What if C -> B -> A is established, and after some time, a potential cycle is 
formed with B -> C?

Along the lines of my comment above, we can PB (protobuf) the metadata in 
WALEdit and HLogKey, where the cluster Id is declared as repeated in the 
.proto. A tool would be provided to convert pre-0.96 WAL files into the new 
format.



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-01-29 Thread Ian Varley (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565869#comment-13565869
 ] 

Ian Varley commented on HBASE-7709:
---

Ah, touche. That particular arrangement is still fine (the resulting graph, (B 
-> C -> B) -> A, doesn't have any bad cycles). However, you raise a good point; 
if you start with:

A -> B -> C

and then later add C -> B, you'd get:

A -> (B -> C -> B)

which is a bad cycle. And C has no way of knowing about A -> B; as a peer, you 
only know who you replicate to, not who replicates to you.

A cluster could keep track of who is replicating TO it; in ReplicationSink, we 
could track all the cluster IDs that have ever sent data in, and report that 
through the "who do you replicate with" API. So then it would let you build a 
full graph, because you get the backwards edges.
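
The bookkeeping for that could be as small as the sketch below. `ReplicationNeighbors` and its methods are invented names; the assumption is that the sink calls `recordSender` with the source cluster ID for each batch it receives.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical bookkeeping; not part of ReplicationSink.
final class ReplicationNeighbors {
    private final Set<String> outbound = ConcurrentHashMap.newKeySet();
    private final Set<String> inbound = ConcurrentHashMap.newKeySet();

    /** Called when this cluster adds a peer (forward edge). */
    void addPeer(String peerClusterId) {
        outbound.add(peerClusterId);
    }

    /** Called by the sink for each batch it receives (backward edge). */
    void recordSender(String sourceClusterId) {
        inbound.add(sourceClusterId);
    }

    /** Snapshot of the clusters this cluster replicates to. */
    Set<String> replicatesTo() {
        return new TreeSet<>(outbound);
    }

    /** Snapshot of the clusters that have sent edits to this cluster. */
    Set<String> replicatedFrom() {
        return new TreeSet<>(inbound);
    }
}
```

In the A -> B -> C example, B would report A under `replicatedFrom()` once A has shipped at least one edit, which is what lets C learn about A -> B.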

Of course, there are still plenty of catches: the race conditions, plus the 
possibility that someone is set up to replicate to you, but they just haven't 
sent any edits yet.

Meh. With this level of complication, a solution in the direction you're 
talking about (adding info to the WAL) might be safer.



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-01-29 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565874#comment-13565874
 ] 

Ted Yu commented on HBASE-7709:
---

Thanks Ian for correcting my example given @ 29/Jan/13 21:53

My point was that replication topology can grow quite complex. If we cannot 
enumerate all the intricacies, we'd better design something that suits future 
development.



[jira] [Commented] (HBASE-7709) Infinite loop possible in Master/Master replication

2013-01-29 Thread Ian Varley (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13565876#comment-13565876
 ] 

Ian Varley commented on HBASE-7709:
---

Yes, I agree. While it's possible (in theory) to interrogate the actual 
topology at runtime, a solution that makes such problems impossible is much 
better.
