[ https://issues.apache.org/jira/browse/HBASE-20305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16430073#comment-16430073 ]
Hudson commented on HBASE-20305: -------------------------------- Results for branch HBASE-19064 [build #90 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/90/]: (x) *{color:red}-1 overall{color}* ---- details (if available): (x) {color:red}-1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/90//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/90//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-19064/90//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details. > Add option to SyncTable that skip deletes on target cluster > ----------------------------------------------------------- > > Key: HBASE-20305 > URL: https://issues.apache.org/jira/browse/HBASE-20305 > Project: HBase > Issue Type: Improvement > Components: mapreduce > Affects Versions: 2.0.0-alpha-4 > Reporter: Wellington Chevreuil > Assignee: Wellington Chevreuil > Priority: Minor > Fix For: 3.0.0 > > Attachments: 0001-HBASE-20305.master.001.patch, > HBASE-20305.master.002.patch > > > We had a situation where two clusters with active-active replication got out > of sync, but both had data that should be kept. The tables in question never > have data deleted, but ingestion had happened on the two different clusters, > some rows had been even updated. > In this scenario, a cell that is present in one of the table clusters should > not be deleted, but replayed on the other. Also, for cells with same > identifier but different values, the most recent value should be kept. > Current version of SyncTable would not be applicable here, because it would > simply copy the whole state from source to target, then losing any additional > rows that might be only in target, as well as cell values that got most > recent update. This could be solved by adding an option to skip deletes for > SyncTable. This way, the additional cells not present on source would still > be kept. For cells with same identifier but different values, it would just > perform a Put for the cell version from source, but client scans would still > fetch the most recent timestamp. > I'm attaching a patch with this additional option shortly. Please share your > thoughts. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)