Kezhu Wang created ZOOKEEPER-4925:
-------------------------------------

             Summary: Diff sync introduce hole in stale follower's committedLog which cause data loss in leading
                 Key: ZOOKEEPER-4925
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4925
             Project: ZooKeeper
          Issue Type: Bug
          Components: server
    Affects Versions: 3.9.3
            Reporter: Kezhu Wang

There are two variants of {{ZooKeeperServer::processTxn}}, and their behavior has diverged since ZOOKEEPER-3484. In addition to what {{processTxn(TxnHeader hdr, Record txn)}} does, {{processTxn(Request request)}} pops the outstanding change from {{outstandingChanges}} and adds the txn to {{committedLog}} so that followers can sync from it.

Since ZOOKEEPER-4394, the {{Learner}} uses {{processTxn(TxnHeader hdr, Record txn)}} to commit txns to memory, which means it leaves {{committedLog}} untouched during the {{SYNCHRONIZATION}} phase.

As a result, a stale follower that catches up through a diff sync will have a hole in its {{committedLog}} after joining the cluster. If that follower later becomes leader, it will propagate the in-memory hole to other stale nodes. This causes data loss.
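
To make the failure mode concrete, here is a minimal toy model in Java. It is a sketch under simplifying assumptions, not ZooKeeper's actual code: {{ToyServer}}, {{applyAndLog}}, {{applyOnly}} and {{diffFor}} are hypothetical names, zxids are treated as small consecutive integers, and the data tree is reduced to a sorted map. {{applyAndLog}} stands in for {{processTxn(Request request)}} and {{applyOnly}} stands in for {{processTxn(TxnHeader hdr, Record txn)}}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class CommittedLogHoleSketch {

    /** Toy stand-in for a server; NOT ZooKeeper's real classes. */
    static class ToyServer {
        // Hypothetical simplification: in-memory state keyed by zxid.
        final TreeMap<Long, String> dataTree = new TreeMap<>();
        // Hypothetical simplification: committedLog holds only zxids.
        final List<Long> committedLog = new ArrayList<>();

        // Models processTxn(Request): apply the txn AND record it in committedLog.
        void applyAndLog(long zxid, String data) {
            dataTree.put(zxid, data);
            committedLog.add(zxid);
        }

        // Models processTxn(TxnHeader, Record): apply the txn only.
        void applyOnly(long zxid, String data) {
            dataTree.put(zxid, data);
        }

        // Zxids a peer would receive if a diff were served purely from committedLog.
        List<Long> diffFor(long peerLastZxid) {
            List<Long> out = new ArrayList<>();
            for (long zxid : committedLog) {
                if (zxid > peerLastZxid) {
                    out.add(zxid);
                }
            }
            return out;
        }
    }

    /** Replays the scenario described in this issue on the toy model. */
    static ToyServer staleFollowerScenario() {
        ToyServer follower = new ToyServer();
        // Normal broadcast: txns 1 and 2 are applied and logged.
        follower.applyAndLog(1, "a");
        follower.applyAndLog(2, "b");
        // Follower falls behind, rejoins, and receives 3 and 4 during sync.
        // They are applied to memory but never reach committedLog.
        follower.applyOnly(3, "c");
        follower.applyOnly(4, "d");
        // Back in normal broadcast: txn 5 is applied and logged.
        follower.applyAndLog(5, "e");
        return follower;
    }

    public static void main(String[] args) {
        ToyServer s = staleFollowerScenario();
        System.out.println("applied zxids:  " + s.dataTree.keySet()); // [1, 2, 3, 4, 5]
        System.out.println("committedLog:   " + s.committedLog);      // [1, 2, 5]
        // If this server now leads and syncs a peer whose last zxid is 2,
        // the peer never sees 3 and 4.
        System.out.println("diff for peer at zxid 2: " + s.diffFor(2)); // [5]
    }
}
```

In this model the server's in-memory state is complete, but its {{committedLog}} skips zxids 3 and 4. A peer whose last zxid is 2 falls inside the log's range, so a diff served from the log hands it only zxid 5.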