Jeevan Prakash created HBASE-28829:
--------------------------------------

             Summary: Increment Inconsistency in Replication
                 Key: HBASE-28829
                 URL: https://issues.apache.org/jira/browse/HBASE-28829
             Project: HBase
          Issue Type: Bug
          Components: Replication
         Environment: OS: macOS Sonoma 14.6.1
            Reporter: Jeevan Prakash


*Issue:*
Increment operations do not produce a consistent result under replication.

*Setup:*
Take two HBase clusters, 'cluster1' and 'cluster2', each added as a 
replication peer of the other, with replication enabled on both. A table holds 
a counter cell with an initial value of '2'. There is a replication delay 
from 'cluster1' to 'cluster2'.

*Actions:*
1. Increment the counter by 1 in 'cluster1'.
2. Increment the counter by 2 in 'cluster2'.

*Expected Behaviour:*
The value in the counter cell should be 5.

*Actual Behaviour:*
The value in the counter cell is 4.

*Analysis:*
1. After the increment in 'cluster1', the value becomes 3 in 'cluster1'.
2. Replication from 'cluster1' to 'cluster2' is initiated.
3. Because of the replication delay from 'cluster1' to 'cluster2', the 
increment in 'cluster2' is performed before the replicated edit arrives.
4. The value is now 4 in 'cluster2', and it is replicated to 'cluster1'.
5. Since replication is cell-based rather than operation-based, and the 
'cluster2' increment is the latest, value 4 from 'cluster2' overrides value 3 
in 'cluster1'.
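
The timeline above can be reproduced with a small model. This is only a sketch of the described behaviour, not HBase code: the 'Cluster' class, its method names and the timestamps 10 and 20 are hypothetical, and each cluster is reduced to a single cell whose newest timestamp wins.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical single-cell model: maps timestamp -> stored value."""
    name: str
    versions: dict = field(default_factory=dict)

    def read(self):
        # The version with the highest timestamp is the visible one.
        return self.versions[max(self.versions)]

    def increment(self, amount, ts):
        # An increment reads the local value and writes a full new cell.
        new_value = self.read() + amount
        self.versions[ts] = new_value
        return (ts, new_value)  # the cell that replication will ship

    def apply_replicated(self, cell):
        # Cell-based replication: store the shipped cell as-is.
        ts, value = cell
        self.versions[ts] = value


cluster1 = Cluster("cluster1", {0: 2})
cluster2 = Cluster("cluster2", {0: 2})

cell_from_1 = cluster1.increment(1, ts=10)  # cluster1 now reads 3
# Replication of cell_from_1 is delayed, so cluster2 still sees 2 here.
cell_from_2 = cluster2.increment(2, ts=20)  # cluster2 now reads 4

cluster2.apply_replicated(cell_from_1)  # older timestamp, stays hidden
cluster1.apply_replicated(cell_from_2)  # newer timestamp, overrides 3

final1 = cluster1.read()
final2 = cluster2.read()
print(final1, final2)
```

Both clusters end at 4 in this model, because the cell written by 'cluster2' carries the newest timestamp and never included the increment from 'cluster1'.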

*Steps to reproduce:*
Add a coprocessor to 'cluster2' that calls 'Thread.sleep' in 'preWALAppend' to 
simulate the replication delay, then perform the actions above.

*Inference:*
Debugging showed that replication is cell-based: the entire cell is 
replicated rather than the operation that produced it. This works for 
operations such as put and delete, where conflicts are resolved using the 
timestamp and version of the cell. Increment operations, however, depend on 
the cell's previous value, so this approach does not work for them.
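
The difference between shipping cells and shipping operations can be illustrated with the numbers from this report. The two functions below are a hypothetical sketch for comparison only; neither is HBase's replication code, and the operation-based variant is not something HBase provides as far as this report establishes.

```python
def replicate_cells(initial, inc1, inc2):
    """Cell-based: each cluster ships (timestamp, full value)."""
    cell1 = (1, initial + inc1)  # computed on cluster1 from the old value
    cell2 = (2, initial + inc2)  # computed on cluster2 from the old value
    # After exchange, both clusters hold both cells; the newest wins.
    return max(cell1, cell2)[1]


def replicate_operations(initial, inc1, inc2):
    """Operation-based: each cluster ships only the delta it applied."""
    on_cluster1 = initial + inc1 + inc2  # local delta, then peer delta
    on_cluster2 = initial + inc2 + inc1  # local delta, then peer delta
    # Addition is commutative, so the arrival order does not matter.
    assert on_cluster1 == on_cluster2
    return on_cluster1


cell_result = replicate_cells(2, 1, 2)
op_result = replicate_operations(2, 1, 2)
print(cell_result, op_result)
```

With an initial value of 2 and increments of 1 and 2, the cell-based model yields 4 and the operation-based model yields 5, matching the actual and expected behaviour described above.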



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
