Hi, Folks:

ZOOKEEPER-3104 is a critical data inconsistency issue. The same risk also 
exists in branch-3.4.

In one of our 3.4.13 clusters, data inconsistency has occurred many times.

After digging through some transaction logs and snapshots, we believe that 
ZOOKEEPER-3104<https://issues.apache.org/jira/browse/ZOOKEEPER-3104> is the 
main contributor to our data inconsistency.

The probability of hitting this issue may be higher than one would expect in a 
real production environment. Serializing a large DataTree can open a long risk 
window under heavy write traffic. Any failure during that window would cause 
data inconsistency.

Data inconsistency is almost unacceptable under ZooKeeper's semantics.

This issue is already fixed in 3.6. But I think it is necessary to backport 
ZOOKEEPER-3104<https://issues.apache.org/jira/browse/ZOOKEEPER-3104> to 
branch-3.4, especially since migrating from 3.4 to 3.5 takes more effort to 
evaluate the compatibility risk in a real production environment.

I have already opened an issue 
[ZOOKEEPER-3589](https://issues.apache.org/jira/browse/ZOOKEEPER-3589) and 
submitted a GitHub pull request https://github.com/apache/zookeeper/pull/1123 
to fix it.

So I suggest accepting the pull request and releasing it in 3.4.16. This fix 
would make branch-3.4 more robust and complete.


Thanks!


Best regards

Pierre Yin

