Race condition while taking snapshot can lead to not restoring data tree
correctly
----------------------------------------------------------------------------------
Key: ZOOKEEPER-1090
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1090
Project: ZooKeeper
Issue Type: Bug
Components: server
Affects Versions: 3.3.3
Reporter: Vishal K
Priority: Critical
Fix For: 3.4.0
I think I have found a bug in the snapshot mechanism.
The problem occurs because dt.lastProcessedZxid is not synchronized; more
precisely, it is set before the data tree is modified:
FileTxnSnapLog:
{code}
public void save(DataTree dataTree,
        ConcurrentHashMap<Long, Integer> sessionsWithTimeouts)
        throws IOException {
    long lastZxid = dataTree.lastProcessedZxid;
    LOG.info("Snapshotting: " + Long.toHexString(lastZxid));
    File snapshot = new File(
            snapDir, Util.makeSnapshotName(lastZxid));
    // <=== the DataTree may not yet contain the modification for lastProcessedZxid
    snapLog.serialize(dataTree, sessionsWithTimeouts, snapshot);
}
{code}
DataTree:
{code}
public ProcessTxnResult processTxn(TxnHeader header, Record txn) {
    ProcessTxnResult rc = new ProcessTxnResult();
    String debug = "";
    try {
        rc.clientId = header.getClientId();
        rc.cxid = header.getCxid();
        rc.zxid = header.getZxid();
        rc.type = header.getType();
        rc.err = 0;
        // <=== lastProcessedZxid is advanced here, before the tree is modified
        if (rc.zxid > lastProcessedZxid) {
            lastProcessedZxid = rc.zxid;
        }
        [...modify data tree...]
}
{code}
The lastProcessedZxid must be set after the modification is done.
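The ordering proposed above can be sketched as follows. This is only an illustration, not the actual patch: ToyDataTree and its fields are hypothetical stand-ins for DataTree, and the sketch assumes that replaying a transaction already present in a snapshot is harmless.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for DataTree; not ZooKeeper's actual class.
class ToyDataTree {
    final Map<String, String> nodes = new ConcurrentHashMap<>();
    volatile long lastProcessedZxid = 0;

    void processTxn(long zxid, String path, String data) {
        // 1. Apply the modification to the tree first.
        nodes.put(path, data);
        // 2. Only then advertise the zxid, so a snapshot that reads
        //    lastProcessedZxid never claims a change it does not contain.
        if (zxid > lastProcessedZxid) {
            lastProcessedZxid = zxid;
        }
    }
}
```

With this order a concurrent snapshot can still contain a change newer than the zxid it is named after, but under the assumption above that transaction is simply replayed from the log on restore.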
As a result, if the server crashes after taking the snapshot (and the snapshot
does not contain the change corresponding to lastProcessedZxid), restore will not
restore the data tree correctly, because replay starts at lastProcessedZxid+1 and
skips that transaction:
{code}
public long restore(DataTree dt, Map<Long, Integer> sessions,
        PlayBackListener listener) throws IOException {
    snapLog.deserialize(dt, sessions);
    FileTxnLog txnLog = new FileTxnLog(dataDir);
    // <=== assumes the change for lastProcessedZxid is already in the deserialized tree
    TxnIterator itr = txnLog.read(dt.lastProcessedZxid+1);
}
{code}
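To make the failure concrete, here is a small single-threaded simulation of the interleaving described above. SnapshotRaceDemo, Txn, and crashAndRestore are hypothetical names used only for illustration; the thread interleaving is replayed by hand rather than with real threads, and plain maps stand in for the data tree and the snapshot file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation; none of these types exist in ZooKeeper.
class SnapshotRaceDemo {
    // One logged transaction: set `path` to `data` at `zxid`.
    static final class Txn {
        final long zxid;
        final String path;
        final String data;
        Txn(long zxid, String path, String data) {
            this.zxid = zxid;
            this.path = path;
            this.data = data;
        }
    }

    static Map<String, String> crashAndRestore() {
        List<Txn> txnLog = new ArrayList<>();
        Map<String, String> tree = new HashMap<>();
        long lastProcessedZxid = 0;

        // Transaction 1 is logged and fully applied.
        Txn t1 = new Txn(1, "/a", "one");
        txnLog.add(t1);
        lastProcessedZxid = t1.zxid;
        tree.put(t1.path, t1.data);

        // Transaction 2: the zxid is advanced first (the buggy order) ...
        Txn t2 = new Txn(2, "/b", "two");
        txnLog.add(t2);
        lastProcessedZxid = t2.zxid;

        // ... and the snapshot is taken before the tree is modified.
        long snapshotZxid = lastProcessedZxid;
        Map<String, String> snapshot = new HashMap<>(tree);

        // The modification lands only after the snapshot was serialized.
        tree.put(t2.path, t2.data);

        // Crash and restore: load the snapshot, replay from snapshotZxid + 1.
        Map<String, String> restored = new HashMap<>(snapshot);
        for (Txn t : txnLog) {
            if (t.zxid >= snapshotZxid + 1) {
                restored.put(t.path, t.data);
            }
        }
        return restored;
    }

    public static void main(String[] args) {
        // "/b" is absent: transaction 2 is in neither the snapshot nor the replay.
        System.out.println(crashAndRestore());
    }
}
```

The snapshot is labelled with zxid 2 but holds only the effect of transaction 1, and replay starts at zxid 3, so the write to /b is lost after restore.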
I have had an offline discussion with Ben and Camille about this. I will be
posting the discussion shortly.