There's a txn gap between snapshot and the next txn log:

-rw-r--r-- 1 andor andor 67108880 Feb  4 13:01 log.674c2
-rw-r--r-- 1 andor andor  2630669 Feb  1 11:11 snapshot.674c0

Txn zxid 674c1 (create of the parent znode) has never been persisted,
so the ACL information is completely lost. The only thing we can do is
to restore consistency by fixing the ACL ref to something which is
present in the cache.

> The only other solution I can think of is to persist a full
> representation of the ACLs per every node.

I agree, but it requires DB format change and it's not backward
compatible.

Andor



On Thu, 2025-02-06 at 13:36 -0600, Andor Molnar wrote:
> That's true, but we'll lose that ACL either way, right?
> 
> I mean before that recent patch mentioned by Damien, the situation
> was
> the same: we emitted a warning message about a znode which had a non-
> existing ACL reference and went on.
> 
> "Ignoring acl XYZ as it does not exist in the cache" at INFO level.
> 
> We're talking about an ACL which never gets persisted and now with
> the
> patch, ZK is not willing to startup in the situation anymore.
> 
> Andor
> 
> 
> 
> 
> On Thu, 2025-02-06 at 11:03 -0800, Chris Nauroth wrote:
> > In the scenario described in the pull request, it sounds like we
> > ultimately
> > are not serializing the ACL for inode /foo. IIUC, this proposal
> > would
> > result in setting /foo to world readable, not the original intended
> > ACL
> > that was dropped from serialization. It would fix the failure, but
> > it
> > could
> > open a security risk. (Let me know if I misunderstood.)
> > 
> > The only other solution I can think of is to persist a full
> > representation
> > of the ACLs per every node. The ReferenceCountedACLCache would not
> > be
> > serialized, and instead be reconstructed from the deserialized
> > nodes,
> > becoming purely a runtime memory optimization. The HDFS
> > implementation for
> > ACLs is similar to this. Of course, this would be a much bigger
> > change.
> > 
> > Chris Nauroth
> > 
> > 
> > On Wed, Feb 5, 2025 at 4:03 PM Andor Molnar <an...@apache.org>
> > wrote:
> > 
> > > I've a crazy idea for this which is super quick:
> > > 
> > > Here we add usage of ACL to the cache:
> > > 
> > > https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java#L1358
> > > 
> > > What if we do this...?
> > > 
> > > In the cache, when we realize that ACL is missing, return false:
> > > 
> > > https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/ReferenceCountedACLCache.java#L175
> > > 
> > > In DataTree we'll modify the Znode ACL reference to "-1" which is
> > > "world readable", essentially removing the ACL from the znode and
> > > continue:
> > > 
> > > synchronized (node) {
> > >     if (!aclCache.addUsage(node.acl)) {
> > >         // Fix missing ACL
> > >         node.acl = OPEN_UNSAFE_ACL_ID;
> > >         LOG.warn("Missing ACL has been removed from znode,
> > > proceeding.");
> > >     }
> > > }
> > > 
> > > Txn's processing will be fine, next snapshot will be "fixed".
> > > 
> > > Andor
> > > 
> > > 
> > > 
> > > On Wed, 2025-02-05 at 15:45 -0600, Andor Molnar wrote:
> > > > Hi ZK folks,
> > > > 
> > > > Let me draw your attention to this ticket. We've seen this
> > > > happening
> > > > in
> > > > production and I would like to work on a fix.
> > > > 
> > > > Damien already created a draft PR here:
> > > > https://github.com/apache/zookeeper/pull/2183
> > > > 
> > > > Let's take a closer look and work on a strategic solution.
> > > > 
> > > > Thanks,
> > > > Andor
> > > > 
> > > > 
> > > 
> > > 
> 

Reply via email to