I agree that being able to start up is important.

Ultimately, there is still a correctness problem from dropping the original
ACL. I can't think of a way to fully fix that without changing the
serialization format, which we could only do on a major version boundary.

Chris Nauroth


On Thu, Feb 6, 2025 at 12:13 PM Andor Molnar <an...@apache.org> wrote:

> There's a txn gap between snapshot and the next txn log:
>
> -rw-r--r-- 1 andor andor 67108880 Feb  4 13:01 log.674c2
> -rw-r--r-- 1 andor andor  2630669 Feb  1 11:11 snapshot.674c0
>
> Txn zxid 674c1 (create of the parent znode) has never been persisted,
> so the ACL information is completely lost. The only thing we can do is
> to restore consistency by fixing the ACL ref to something which is
> present in the cache.
>
> > The only other solution I can think of is to persist a full
> > representation of the ACLs per every node.
>
> I agree, but it requires DB format change and it's not backward
> compatible.
>
> Andor
>
>
>
> On Thu, 2025-02-06 at 13:36 -0600, Andor Molnar wrote:
> > That's true, but we'll lose that ACL either way, right?
> >
> > I mean before that recent patch mentioned by Damien, the situation
> > was
> > the same: we emitted a warning message about a znode which had a non-
> > existing ACL reference and went on.
> >
> > "Ignoring acl XYZ as it does not exist in the cache" at INFO level.
> >
> > We're talking about an ACL which never gets persisted and now with
> > the
> > patch, ZK is not willing to startup in the situation anymore.
> >
> > Andor
> >
> >
> >
> >
> > On Thu, 2025-02-06 at 11:03 -0800, Chris Nauroth wrote:
> > > In the scenario described in the pull request, it sounds like we
> > > ultimately
> > > are not serializing the ACL for inode /foo. IIUC, this proposal
> > > would
> > > result in setting /foo to world readable, not the original intended
> > > ACL
> > > that was dropped from serialization. It would fix the failure, but
> > > it
> > > could
> > > open a security risk. (Let me know if I misunderstood.)
> > >
> > > The only other solution I can think of is to persist a full
> > > representation
> > > of the ACLs per every node. The ReferenceCountedACLCache would not
> > > be
> > > serialized, and instead be reconstructed from the deserialized
> > > nodes,
> > > becoming purely a runtime memory optimization. The HDFS
> > > implementation for
> > > ACLs is similar to this. Of course, this would be a much bigger
> > > change.
> > >
> > > Chris Nauroth
> > >
> > >
> > > On Wed, Feb 5, 2025 at 4:03 PM Andor Molnar <an...@apache.org>
> > > wrote:
> > >
> > > > I've a crazy idea for this which is super quick:
> > > >
> > > > Here we add usage of ACL to the cache:
> > > >
> > > >
> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java#L1358
> > > >
> > > > What if we do this...?
> > > >
> > > > In the cache, when we realize that ACL is missing, return false:
> > > >
> > > >
> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/ReferenceCountedACLCache.java#L175
> > > >
> > > > In DataTree we'll modify the Znode ACL reference to "-1" which is
> > > > "world readable", essentially removing the ACL from the znode and
> > > > continue:
> > > >
> > > > synchronized (node) {
> > > >     if (!aclCache.addUsage(node.acl)) {
> > > >         // Fix missing ACL
> > > >         node.acl = OPEN_UNSAFE_ACL_ID;
> > > >         LOG.warn("Missing ACL has been removed from znode,
> > > > proceeding.");
> > > >     }
> > > > }
> > > >
> > > > Txn's processing will be fine, next snapshot will be "fixed".
> > > >
> > > > Andor
> > > >
> > > >
> > > >
> > > > On Wed, 2025-02-05 at 15:45 -0600, Andor Molnar wrote:
> > > > > Hi ZK folks,
> > > > >
> > > > > Let me draw your attention to this ticket. We've seen this
> > > > > happening
> > > > > in
> > > > > production and I would like to work on a fix.
> > > > >
> > > > > Damien already created a draft PR here:
> > > > > https://github.com/apache/zookeeper/pull/2183
> > > > >
> > > > > Let's take a closer look and work on a strategic solution.
> > > > >
> > > > > Thanks,
> > > > > Andor
> > > > >
> > > > >
> > > >
> > > >
> >
>
>

Reply via email to