Hi Kezhu,

Thanks for fixing the data loss bug. I will review the PR.

Best,
Li

On Tue, May 13, 2025 at 7:27 AM Kezhu Wang <kez...@apache.org> wrote:
> Hi devs,
>
> ZOOKEEPER-4925[1] reports a data loss bug that could happen in the
> following condition:
> 1. A follower which stalls for a while will introduce a hole in its
> `committedLog` after sync with the leader. This was introduced in
> pr-2152[2] to solve NPE in `syncWithLeader` reported in
> ZOOKEEPER-4394[3] and shipped in 3.9.3.
> 2. The hole introduced above could be propagated to other nodes if the
> above follower becomes leader. We never forbid discontinuous txns in
> all cases.
>
> I have opened pr-2254[4] to fix this. I would like to solve it before
> the next release since it could be easily introduced in certain
> conditions. I have expressed this in the voting thread for
> 3.9.4-rc0[5].
>
> Look forward to your reviews!
>
> Best,
> Kezhu Wang
>
> [1]: https://issues.apache.org/jira/browse/ZOOKEEPER-4925
> [2]: https://github.com/apache/zookeeper/pull/2152
> [3]: https://issues.apache.org/jira/browse/ZOOKEEPER-4394
> [4]: https://github.com/apache/zookeeper/pull/2254
> [5]: https://lists.apache.org/thread/sq5djdm5ttbscbtdw5ykp5vl4dfb8p56