[ https://issues.apache.org/jira/browse/HBASE-22839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907692#comment-16907692 ]
Andrew Purtell edited comment on HBASE-22839 at 8/14/19 11:29 PM:
------------------------------------------------------------------

branch-1 is 1.5. Please use that as the base branch. There will be no more patch release branches for HBase 1. After branch-1 work is committed, we can consider backports to 1.4 (branch-1.4) and 1.3 (branch-1.3).

was (Author: apurtell):
branch-1 is 1.5. Use that. There will be no more patch release branches for HBase 1. After branch-1 work is committed, we can consider backports to 1.4 (branch-1.4) and 1.3 (branch-1.3).

> Make sure the batches within one region are shipped to the sink clusters in
> order (branch-1)
> --------------------------------------------------------------------------------------------
>
>                 Key: HBASE-22839
>                 URL: https://issues.apache.org/jira/browse/HBASE-22839
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication
>    Affects Versions: 1.3.4, 1.3.5
>            Reporter: Bin Shi
>            Assignee: Bin Shi
>            Priority: Major
>             Fix For: 1.5.0
>
>
> Problem Statement:
> In the cross-cluster replication validation, we found that some cells in the
> source and sink clusters can have the same row key and the same timestamp but
> different values. This happens when mutations with the same row key are
> submitted in a batch without specifying the timestamp, and the same
> millisecond-resolution timestamp is assigned to them when they are committed
> to the WAL.
> When this happens, if a major compaction hasn't happened yet and you scan
> the table, you can find cells that have the same row key and the same
> timestamp but different values, like the first three rows in the following
> table.
> |Row Key 1|CF0::Column 1|Timestamp 1|Value 1|
> |Row Key 1|CF0::Column 1|Timestamp 1|Value 2|
> |Row Key 1|CF0::Column 1|Timestamp 1|Value 3|
> |Row Key 2|CF0::Column 1|Timestamp 2|Value 4|
> |Row Key 3|CF0::Column 1|Timestamp 4|Value 5|
> The ordering of the first three rows is indeterminate in the presence of
> cross-cluster replication, so after compaction, in the master cluster you
> will see "Row Key 1, CF0::Column 1, Timestamp 1" having the value 3, but in
> the slave cluster you might see the cell having any one of the three possible
> values 1, 2, or 3, which results in data inconsistency between the master and
> slave clusters.
> Root Cause Analysis:
> In HBaseInterClusterReplicationEndpoint.createBatches() of branch-1.3, the
> WAL entries from the same region can be split into different batches
> according to the replication RPC limit, and these batches are shipped by
> ReplicationSource concurrently. The batches for the same region can therefore
> arrive at the sink in the slave clusters, and then be applied to the region
> synchronously, in an indeterminate order.
> Solution:
> In HBase 3.0.0 and 2.1.0, [~Apache9], [~openinx], and [~fenghh] provided
> Serial Replication (HBASE-20046), which guarantees that the order of pushing
> logs to slave clusters is the same as the order of requests from the client
> in the master cluster.
> It contains mainly two changes:
> # Recording the replication "barriers" in ZooKeeper to synchronize the
> replication across old/failed RS and new RS, to provide strict ordering
> semantics even in the presence of a region move or RS failure.
> # Making sure the batches within one region are shipped to the slave clusters
> in order.
> The second change is exactly what we need, and it is the minimal change that
> fixes the issue in this JIRA.
> To fix the issue in this JIRA, we have two options:
> # Cherry-pick HBASE-20046 to branch-1.3.
> Pros: It also fixes the data inconsistency issue when there is a region move
> or RS failure, and helps reduce the noise in our cross-cluster
> replication/backup validation, which is our ultimate goal.
> Cons: The change is big, and I'm not sure for now whether it is
> self-contained or has other dependencies that would also need to be ported to
> branch-1.3; we would also need more time to validate and stabilize it.
> # Port the minimal change, or make a change equivalent to the second part of
> HBASE-20046, to make sure the batches within one region are shipped to the
> slave clusters in order.
> With limited knowledge about the HBase release schedule and process, I prefer
> option 2 because of the cons of option 1, but I'm open to option 1 and other
> options. Thoughts?

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
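
For readers who want to see the ordering problem from the description in isolation, here is a minimal toy model in Python. It is not HBase code: `Entry`, `create_batches`, and `apply_batches` are hypothetical names, and the sink is simplified to "for cells with an identical row, column, and timestamp, the one applied last wins", which is an assumption made only for illustration. It shows that once one region's entries are split into size-limited batches, the final value at the sink depends on the order in which those batches are applied.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Entry(NamedTuple):
    """Hypothetical stand-in for one WAL entry carrying a single cell."""
    region: str
    row: str
    column: str
    timestamp: int
    value: str


def create_batches(entries: List[Entry], max_batch_size: int) -> Dict[str, List[List[Entry]]]:
    """Group entries by region, then split each region's entries into
    size-limited batches, preserving WAL order inside each region."""
    per_region: Dict[str, List[Entry]] = defaultdict(list)
    for entry in entries:
        per_region[entry.region].append(entry)
    return {
        region: [items[i:i + max_batch_size] for i in range(0, len(items), max_batch_size)]
        for region, items in per_region.items()
    }


def apply_batches(batches: List[List[Entry]]) -> Dict[Tuple[str, str, int], str]:
    """Toy sink (assumption): among cells with the same row, column, and
    timestamp, whichever is applied last is the one that remains visible."""
    store: Dict[Tuple[str, str, int], str] = {}
    for batch in batches:
        for entry in batch:
            store[(entry.row, entry.column, entry.timestamp)] = entry.value
    return store


# Three mutations for the same cell that were assigned the same timestamp.
wal = [
    Entry("region-a", "row1", "cf0:c1", 1, "value1"),
    Entry("region-a", "row1", "cf0:c1", 1, "value2"),
    Entry("region-a", "row1", "cf0:c1", 1, "value3"),
]

batches = create_batches(wal, max_batch_size=1)["region-a"]

in_order = apply_batches(batches)                   # batches applied in WAL order
reordered = apply_batches(list(reversed(batches)))  # one possible concurrent arrival order

print(in_order[("row1", "cf0:c1", 1)])   # value3, same as the source cluster
print(reordered[("row1", "cf0:c1", 1)])  # value1, diverges from the source cluster
```

In this model, shipping the batches of one region strictly one after another (option 2) always reproduces the `in_order` result, while batches of different regions could still be shipped concurrently.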