Kezhu Wang created ZOOKEEPER-4925:
-------------------------------------

             Summary: Diff sync introduce hole in stale follower's committedLog 
which cause data loss in leading
                 Key: ZOOKEEPER-4925
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4925
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.9.3
            Reporter: Kezhu Wang


There are two variants of {{ZooKeeperServer::processTxn}}, and their behavior 
has diverged since ZOOKEEPER-3484. In addition to what 
{{processTxn(TxnHeader hdr, Record txn)}} does, {{processTxn(Request request)}} 
pops the outstanding change from {{outstandingChanges}} and adds the txn to 
{{committedLog}} so that followers can sync from it. Since ZOOKEEPER-4394, the 
{{Learner}} uses {{processTxn(TxnHeader hdr, Record txn)}} to commit txns to 
memory, which means it leaves {{committedLog}} untouched during the 
{{SYNCHRONIZATION}} phase.
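
The divergence described above can be illustrated with a toy model. This is 
only a sketch, not ZooKeeper code: {{ToyServer}}, {{processTxnHeaderOnly}} and 
{{processTxnRequest}} are hypothetical names standing in for the two 
{{processTxn}} variants, and txns are reduced to a zxid plus a string payload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class CommittedLogHoleDemo {
    /** Toy stand-in for a server; hypothetical, not the real ZooKeeperServer. */
    static class ToyServer {
        // Stand-in for the in-memory data tree: zxid -> payload.
        final TreeMap<Long, String> dataTree = new TreeMap<>();
        // Stand-in for committedLog: zxids kept in memory for syncing peers.
        final List<Long> committedLog = new ArrayList<>();

        // Models processTxn(TxnHeader hdr, Record txn): applies to memory only.
        void processTxnHeaderOnly(long zxid, String payload) {
            dataTree.put(zxid, payload);
        }

        // Models processTxn(Request request): same as above, plus committedLog.
        void processTxnRequest(long zxid, String payload) {
            processTxnHeaderOnly(zxid, payload);
            committedLog.add(zxid);
        }
    }

    public static void main(String[] args) {
        ToyServer follower = new ToyServer();
        follower.processTxnRequest(1, "a");    // committed before falling behind
        follower.processTxnHeaderOnly(2, "b"); // applied during synchronization
        follower.processTxnHeaderOnly(3, "c"); // applied during synchronization
        follower.processTxnRequest(4, "d");    // committed after rejoining

        System.out.println("dataTree zxids:     " + follower.dataTree.keySet());
        System.out.println("committedLog zxids: " + follower.committedLog);
        // prints [1, 2, 3, 4] for dataTree and [1, 4] for committedLog
    }
}
```

In this model the in-memory state is complete, but zxids 2 and 3 never reach 
the log that peers would later sync from.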

In this case, a stale follower will have a hole in its {{committedLog}} after 
joining the cluster: txns applied during sync are in memory but absent from 
{{committedLog}}. If that follower later becomes leader, it propagates the 
in-memory hole to other stale nodes through diff sync, which causes data loss.
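
A rough sketch of how such a hole could turn into data loss, under the 
simplifying assumption that a diff sync replays exactly the {{committedLog}} 
entries after the peer's last zxid. {{diffFor}} is a hypothetical helper, not 
a ZooKeeper API, and txns are reduced to bare zxids.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class DiffSyncHoleDemo {
    /**
     * Hypothetical diff sync: returns the zxids a leader would replay to a peer.
     * Assumes peerLastZxid falls inside the range covered by the leader's
     * committedLog, so a diff is chosen instead of a full snapshot.
     */
    static List<Long> diffFor(TreeSet<Long> leaderCommittedLog, long peerLastZxid) {
        return new ArrayList<>(leaderCommittedLog.tailSet(peerLastZxid, false));
    }

    public static void main(String[] args) {
        // The new leader applied zxids 1..4, but its committedLog lacks 2 and 3.
        TreeSet<Long> leaderData = new TreeSet<>(Arrays.asList(1L, 2L, 3L, 4L));
        TreeSet<Long> leaderCommittedLog = new TreeSet<>(Arrays.asList(1L, 4L));

        // Another stale node that has only seen zxid 1.
        TreeSet<Long> peerData = new TreeSet<>(Arrays.asList(1L));
        peerData.addAll(diffFor(leaderCommittedLog, peerData.last()));

        // Whatever the leader holds but the peer never received.
        TreeSet<Long> lost = new TreeSet<>(leaderData);
        lost.removeAll(peerData);

        System.out.println("peer after diff sync: " + peerData); // [1, 4]
        System.out.println("txns never delivered: " + lost);     // [2, 3]
    }
}
```

The peer ends at the leader's latest zxid and looks caught up, yet the txns 
inside the hole were never sent to it.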




