ztzg commented on PR #1997: URL: https://github.com/apache/zookeeper/pull/1997#issuecomment-1690121267
Hi @adamyi, @kezhuw,

For the record, we have also hit this in production. This is indeed a very critical issue, as the corruption can spread from member to member!

I initially preferred solution 2 from the ticket description (the one which was tentatively implemented above), but given the difficulties encountered, and @kezhuw's suggestion of never removing the ACL that `aclIndex` is pointing to, I am also reconsidering. Are we missing something?

We would also like to add some kind of (optional) "fsck" pass which sanity-checks the tree before the service starts, to prevent this and other kinds of corruption from spreading, but that can be implemented in a follow-up ticket.

Cc: @eolivelli, @symat, @anmolnar, in case you haven't seen this.
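To make the "fsck" idea a bit more concrete, here is a minimal sketch of one check such a pass might perform: reporting every node whose ACL index does not resolve to an entry in the ACL table. The class name `AclFsckSketch`, the method `findDanglingAclRefs`, and the plain `Map`/`Set` inputs are all hypothetical stand-ins chosen for illustration. They are not ZooKeeper's actual data structures or API, and a real pass would presumably cover more invariants than this one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: these types do NOT mirror ZooKeeper's internals.
final class AclFsckSketch {

    /**
     * Returns the paths of nodes whose ACL index has no entry in the ACL table.
     *
     * @param nodeAclIndex    assumed view of the tree: node path -> ACL index stored on the node
     * @param knownAclIndexes assumed view of the ACL table: the indexes that currently exist
     */
    static List<String> findDanglingAclRefs(Map<String, Long> nodeAclIndex,
                                            Set<Long> knownAclIndexes) {
        List<String> dangling = new ArrayList<>();
        for (Map.Entry<String, Long> entry : nodeAclIndex.entrySet()) {
            // A node pointing at a missing ACL entry is the kind of inconsistency to report.
            if (!knownAclIndexes.contains(entry.getValue())) {
                dangling.add(entry.getKey());
            }
        }
        return dangling;
    }
}
```

Under these assumptions, a startup hook could refuse to serve (or only log a warning, if the pass is optional) whenever the returned list is non-empty, so that an inconsistent tree is caught before it is handed to other members.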