Maybe we could fix the gap somehow...? Start new txn log strictly on snap zxid + 1.
In which case we'll have createTxn on parent with Acl information in txn log with which we can fix the missing Acl reference. I need to take a closer look. Andor On Thu, 2025-02-06 at 14:13 -0600, Andor Molnar wrote: > There's a txn gap between snapshot and the next txn log: > > -rw-r--r-- 1 andor andor 67108880 Feb 4 13:01 log.674c2 > -rw-r--r-- 1 andor andor 2630669 Feb 1 11:11 snapshot.674c0 > > Txn zxid 674c1 (create of the parent znode) has never been persisted, > so the ACL information is completely lost. The only thing we can do > is > to restore consistency by fixing the ACL ref to something which is > present in the cache. > > > The only other solution I can think of is to persist a full > > representation of the ACLs per every node. > > I agree, but it requires DB format change and it's not backward > compatible. > > Andor > > > > On Thu, 2025-02-06 at 13:36 -0600, Andor Molnar wrote: > > That's true, but we'll lose that ACL either way, right? > > > > I mean before that recent patch mentioned by Damien, the situation > > was > > the same: we emitted a warning message about a znode which had a > > non- > > existing ACL reference and went on. > > > > "Ignoring acl XYZ as it does not exist in the cache" at INFO level. > > > > We're talking about an ACL which never gets persisted and now with > > the > > patch, ZK is not willing to startup in the situation anymore. > > > > Andor > > > > > > > > > > On Thu, 2025-02-06 at 11:03 -0800, Chris Nauroth wrote: > > > In the scenario described in the pull request, it sounds like we > > > ultimately > > > are not serializing the ACL for inode /foo. IIUC, this proposal > > > would > > > result in setting /foo to world readable, not the original > > > intended > > > ACL > > > that was dropped from serialization. It would fix the failure, > > > but > > > it > > > could > > > open a security risk. (Let me know if I misunderstood.) > > > > > > The only other solution I can think of is to persist a full > > > representation > > > of the ACLs per every node. The ReferenceCountedACLCache would > > > not > > > be > > > serialized, and instead be reconstructed from the deserialized > > > nodes, > > > becoming purely a runtime memory optimization. The HDFS > > > implementation for > > > ACLs is similar to this. Of course, this would be a much bigger > > > change. > > > > > > Chris Nauroth > > > > > > > > > On Wed, Feb 5, 2025 at 4:03 PM Andor Molnar <an...@apache.org> > > > wrote: > > > > > > > I've a crazy idea for this which is super quick: > > > > > > > > Here we add usage of ACL to the cache: > > > > > > > > https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java#L1358 > > > > > > > > What if we do this...? > > > > > > > > In the cache, when we realize that ACL is missing, return > > > > false: > > > > > > > > https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/ReferenceCountedACLCache.java#L175 > > > > > > > > In DataTree we'll modify the Znode ACL reference to "-1" which > > > > is > > > > "world readable", essentially removing the ACL from the znode > > > > and > > > > continue: > > > > > > > > synchronized (node) { > > > > if (!aclCache.addUsage(node.acl)) { > > > > // Fix missing ACL > > > > node.acl = OPEN_UNSAFE_ACL_ID; > > > > LOG.warn("Missing ACL has been removed from znode, > > > > proceeding."); > > > > } > > > > } > > > > > > > > Txn's processing will be fine, next snapshot will be "fixed". > > > > > > > > Andor > > > > > > > > > > > > > > > > On Wed, 2025-02-05 at 15:45 -0600, Andor Molnar wrote: > > > > > Hi ZK folks, > > > > > > > > > > Let me draw your attention to this ticket. We've seen this > > > > > happening > > > > > in > > > > > production and I would like to work on a fix. > > > > > > > > > > Damien already created a draft PR here: > > > > > https://github.com/apache/zookeeper/pull/2183 > > > > > > > > > > Let's take a closer look and work on a strategic solution. > > > > > > > > > > Thanks, > > > > > Andor > > > > > > > > > > > > > > > > > > > > >