Hi Jordan,

I like this feature and always thought it would be useful to have something
like this for Apache Helix as well. We do have a clean up thread that
deletes the znodes. But I felt it was tied to Helix.

Here are some of the questions that made me think it's better to have the
user of ZooKeeper handle deleting the parent node according to the use case.

How would one go about using this feature? Perhaps a pseudo-API and some
client code would help me understand.

How can we guarantee that the last delete is actually the last delete? What
if there is a race between the delete and the creation of a new node under
the same parent? What kind of exception will we throw when a participant
tries to create a node under a container node but the parent node has been
deleted? How should the client handle such an exception?

What about libraries (such as Curator and ZkClient) that provide a mkdir -p
kind of API, where they go ahead and create parent nodes automatically if
they don't exist?
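
To make the last two questions concrete, here is a rough, self-contained
Java sketch. It is an in-memory model, not ZooKeeper code: the class
ContainerTreeSketch, its methods, and the retry limit of 3 are all made up
for illustration, and NoNodeException here is an unchecked stand-in for
ZooKeeper's KeeperException.NoNodeException. It assumes the deletion rule
from the proposal (an empty CONTAINER parent is removed when its last child
is deleted) and shows a client rebuilding the parent chain, mkdir -p style,
when its create fails because the parent was removed.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// In-memory model for discussion only; none of these names are ZooKeeper APIs.
final class ContainerTreeSketch {
    enum Type { PERSISTENT, CONTAINER }

    // Unchecked stand-in for the error a client sees when a parent is missing.
    static final class NoNodeException extends RuntimeException {
        NoNodeException(String path) { super("no node: " + path); }
    }

    private final Map<String, Type> types = new HashMap<>();
    private final Map<String, Set<String>> children = new HashMap<>();

    ContainerTreeSketch() {
        put("/", Type.PERSISTENT);
    }

    private void put(String path, Type type) {
        types.put(path, type);
        children.put(path, new HashSet<>());
    }

    private static String parentOf(String path) {
        int i = path.lastIndexOf('/');
        return i == 0 ? "/" : path.substring(0, i);
    }

    synchronized boolean exists(String path) {
        return types.containsKey(path);
    }

    // Creates a node; fails if the parent is gone. Simplification: creating
    // an existing path is a no-op instead of a "node exists" error.
    synchronized void create(String path, Type type) {
        String parent = parentOf(path);
        if (!types.containsKey(parent)) throw new NoNodeException(parent);
        if (types.containsKey(path)) return;
        put(path, type);
        children.get(parent).add(path);
    }

    // Deletes a childless node, then removes each CONTAINER ancestor that
    // the deletion left empty (the rule described in the proposal).
    synchronized void delete(String path) {
        if (path.equals("/")) throw new IllegalArgumentException("cannot delete root");
        if (!types.containsKey(path)) throw new NoNodeException(path);
        if (!children.get(path).isEmpty()) throw new IllegalStateException("not empty: " + path);
        String current = path;
        while (true) {
            String parent = parentOf(current);
            types.remove(current);
            children.remove(current);
            children.get(parent).remove(current);
            boolean reapParent = types.get(parent) == Type.CONTAINER
                    && children.get(parent).isEmpty();
            if (!reapParent) return;
            current = parent;
        }
    }

    // mkdir -p style: creates every missing ancestor of path as a CONTAINER.
    synchronized void mkdirs(String path) {
        StringBuilder prefix = new StringBuilder();
        for (String part : path.substring(1).split("/")) {
            prefix.append('/').append(part);
            create(prefix.toString(), Type.CONTAINER);
        }
    }

    // Client-side handling of the race: if the parent was reaped between
    // attempts, rebuild it and retry a bounded number of times.
    void createWithParents(String parent, String child) {
        for (int attempt = 0; attempt < 3; attempt++) {
            try {
                create(parent + "/" + child, Type.PERSISTENT);
                return;
            } catch (NoNodeException e) {
                mkdirs(parent);
            }
        }
        throw new NoNodeException(parent);
    }
}
```

In this model, deleting the last child of /locks/a removes both /locks/a and
/locks, a plain create under /locks/a then fails, and createWithParents
succeeds by rebuilding the chain. Whether silently recreating the parent is
the right behaviour for every recipe is exactly what I am unsure about.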

At a very high level, the big question is: are we tying this feature to a
specific recipe and the way it's implemented?

Does this make sense?

thanks,
Kishore G

On Tue, Apr 14, 2015 at 10:49 AM, Camille Fournier <[email protected]>
wrote:

> Look at the session managers, they track what sessions are alive and clean
> up when they aren't.
>
> On Tue, Apr 14, 2015 at 1:49 PM, Camille Fournier <[email protected]>
> wrote:
>
> > Look at the session managers, they track what sessions are alive and clean
> > up when they aren't.
> >
> > C
> >
> > On Tue, Apr 14, 2015 at 1:36 PM, Jordan Zimmerman <
> > [email protected]> wrote:
> >
> >> Another question…
> >>
> >> So, my two current questions are:
> >>
> >> * noting that a ZNode is a container, would you suggest the hack of a
> >> special ephemeralOwner value or would you add a new field to Stat?
> >>
> >> * is there a current mechanism in the ZK server code (for the leader in
> >> particular) to handle periodic housecleaning tasks? If so, where is that
> >> code?
> >>
> >> -Jordan
> >>
> >>
> >>
> >> On April 13, 2015 at 2:48:27 PM, Jordan Zimmerman (
> >> [email protected]) wrote:
> >>
> >> As for noting that a ZNode is a container, would you suggest the hack of
> >> a special ephemeralOwner value or would you add a new field to Stat?
> >>
> >> -Jordan
> >>
> >>
> >>
> >> On April 10, 2015 at 6:40:23 PM, Patrick Hunt ([email protected]) wrote:
> >>
> >> Adding is typically good from a b/w compat perspective. If you use the new
> >> feature (at runtime) it generally precludes rollback though.
> >>
> >> See CreateTxn and CreateTxnV0
> >>
> >> A bit of background on convenience vs availability: Originally in ZK's life
> >> we explicitly stayed away from such operations at the API level (another
> >> example being "rm -r"). We wanted to have high availability, in the sense
> >> that a single operation performed a single discrete operation on the
> >> service. We didn't want to allow "unbounded" sets of changes that might
> >> affect availability - say a single operation that triggered a thousand
> >> discrete operations on the service, blocking out clients from doing other
> >> work. This seems pretty bounded to me though - at worst deleting the entire
> >> parent chain, which in general should be relatively small.
> >>
> >> Patrick
> >>
> >> On Thu, Apr 9, 2015 at 12:41 PM, Jordan Zimmerman <
> >> [email protected]> wrote:
> >>
> >> > You don’t even need to look at cversion. If the parent node is a container
> >> > and has no children (i.e. the node being deleted is the last child), it
> >> > gets deleted.
> >> >
> >> > The trouble I’m currently having, though, is that I don’t want to modify
> >> > the CreateTxn record. I can’t find a place to mark that the node should be
> >> > a container. I guess I’ll have to add a new record type. What are the
> >> > ramifications of that?
> >> >
> >> > -JZ
> >> >
> >> > On April 9, 2015 at 2:24:16 PM, Michi Mutsuzaki ([email protected]) wrote:
> >> >
> >> > I see, so the container znode is a znode that gets deleted if it's
> >> > empty and it ever had a child (cversion is greater than zero). It
> >> > sounds good to me. Let's see what other people say.
> >> >
> >> > Thanks Jordan!
> >> >
> >> > On Thu, Apr 9, 2015 at 10:20 AM, Jordan Zimmerman
> >> > <[email protected]> wrote:
> >> > > This sounds great to me, but it sounds a lot like ZOOKEEPER-723.
> >> > >
> >> > > The problem with both ZOOKEEPER-723 and ZOOKEEPER-834 is that it overloads
> >> > > the concept of EPHEMERAL. EPHEMERALs are tied to sessions. In the use cases
> >> > > that I see, the parent node is always PERSISTENT - i.e. not tied to a
> >> > > session.
> >> > >
> >> > > I haven't looked at the patch yet, but how do you handle the "first
> >> > > child" problem?
> >> > >
> >> > > My solution applies only when a node is deleted. So, there is no need for a
> >> > > first child check. When a node is deleted, iff its parent has zero children
> >> > > and is of type CONTAINER, then the parent is deleted and recursively up the
> >> > > tree.
> >> > >
> >> > > -Jordan
> >> > >
> >> > > On April 9, 2015 at 12:15:33 PM, Michi Mutsuzaki ([email protected]) wrote:
> >> > >
> >> > > Hi Jordan.
> >> > >
> >> > > This sounds great to me, but it sounds a lot like ZOOKEEPER-723.
> >> > > Different people had different ideas there, but the original
> >> > > description was:
> >> > >
> >> > > "rather than changing the semantics of ephemeral nodes, i propose
> >> > > ephemeral parents: znodes that disappear when they have no more
> >> > > children. this cleanup would happen automatically when the last child
> >> > > is removed. an ephemeral parent is not tied to any particular session,
> >> > > so even if the creator goes away, the ephemeral parent will remain as
> >> > > long as there are children."
> >> > >
> >> > > I haven't looked at the patch yet, but how do you handle the "first
> >> > > child" problem? Is the container znode created with a first child to
> >> > > prevent getting deleted, or does the client rely on multi to create a
> >> > > container and its children, or something else?
> >> > >
> >> > >
> >> > > On Thu, Apr 9, 2015 at 8:00 AM, Jordan Zimmerman
> >> > > <[email protected]> wrote:
> >> > >> BACKGROUND
> >> > >> ============
> >> > >> A recurring problem for ZooKeeper users is garbage collection of parent
> >> > >> nodes. Many recipes (e.g. locks, leaders, etc.) call for the creation of a
> >> > >> parent node under which participants create sequential nodes. When the
> >> > >> participant is done, it deletes its node. In practice, the ZooKeeper tree
> >> > >> begins to fill up with orphaned parent nodes that are no longer needed. The
> >> > >> ZooKeeper APIs don't provide a way to clean these. Over time, ZooKeeper can
> >> > >> become unstable due to the number of these nodes.
> >> > >>
> >> > >> CURRENT SOLUTIONS
> >> > >> ===================
> >> > >> Apache Curator has a workaround solution for this by providing the Reaper
> >> > >> class which runs in the background looking for orphaned parent nodes and
> >> > >> deleting them. This isn't ideal and it would be better if ZooKeeper
> >> > >> supported this directly.
> >> > >>
> >> > >> PROPOSAL
> >> > >> =========
> >> > >> ZOOKEEPER-723 and ZOOKEEPER-834 have been proposed to allow EPHEMERAL
> >> > >> nodes to contain child nodes. This is not optimum as EPHEMERALs are tied to
> >> > >> a session and the general use case of parent nodes is for PERSISTENT nodes.
> >> > >> This proposal adds a new node type, CONTAINER. A CONTAINER node is the same
> >> > >> as a PERSISTENT node with the additional property that when its last child
> >> > >> is deleted, it is deleted (and CONTAINER nodes recursively up the tree are
> >> > >> deleted if empty).
> >> > >>
> >> > >> I have a first pass (untested) straw man proposal open for comment
> >> here:
> >> > >>
> >> > >> https://github.com/apache/zookeeper/pull/28
> >> > >>
> >> > >> In order to have minimum impact on existing implementations, a container
> >> > >> node is denoted by having an ephemeralOwner id of Long.MIN_VALUE. This is
> >> > >> pretty hackish, but I think it's the most supportable without causing
> >> > >> disruption. Also, a container behaves a "little bit" like an EPHEMERAL node
> >> > >> so it isn't totally illogical. Alternatively, a new field could be added to
> >> > >> STAT.
> >> > >>
> >> > >> I look forward to feedback on this. If people think it's worthwhile I'll
> >> > >> open a Jira and work on a Production quality solution. If it's rejected, I'd
> >> > >> appreciate discussion of an alternate as this is a real need in the ZK user
> >> > >> community.
> >> > >>
> >> > >> -Jordan
> >> > >>
> >> > >>
> >> >
> >>
> >
> >
>
