FYI https://issues.apache.org/jira/browse/ZOOKEEPER-2163
On Fri, Apr 10, 2015 at 6:39 PM, Patrick Hunt <[email protected]> wrote:

> Adding is typically good from a b/w compat perspective. If you use the new
> feature (at runtime) it generally precludes rollback though.
>
> See CreateTxn and CreateTxnV0
>
> A bit of background on convenience vs availability: Originally in ZK's life
> we explicitly stayed away from such operations at the API level (another
> example being "rm -r"). We wanted to have high availability, in the sense
> that a single operation performed a single discrete operation on the
> service. We didn't want to allow "unbounded" sets of changes that might
> affect availability - say a single operation that triggered a thousand
> discrete operations on the service, blocking out clients from doing other
> work. This seems pretty bounded to me though - at worst deleting the entire
> parent chain, which in general should be relatively small.
>
> Patrick
>
> On Thu, Apr 9, 2015 at 12:41 PM, Jordan Zimmerman <[email protected]> wrote:
>
>> You don’t even need to look at cversion. If the parent node is a container
>> and has no children (i.e. the node being deleted is the last child), it
>> gets deleted.
>>
>> The trouble I’m currently having, though, is that I don’t want to modify
>> the CreateTxn record. I can’t find a place to mark that the node should be
>> a container. I guess I’ll have to add a new record type. What are the
>> ramifications of that?
>>
>> -JZ
>>
>> On April 9, 2015 at 2:24:16 PM, Michi Mutsuzaki ([email protected]) wrote:
>>
>> I see, so the container znode is a znode that gets deleted if it's
>> empty and it ever had a child (cversion is greater than zero). It
>> sounds good to me. Let's see what other people say.
>>
>> Thanks Jordan!
>>
>> On Thu, Apr 9, 2015 at 10:20 AM, Jordan Zimmerman <[email protected]> wrote:
>> > This sounds great to me, but it sounds a lot like ZOOKEEPER-723.
>> >
>> > The problem with both ZOOKEEPER-723 and ZOOKEEPER-834 is that they
>> > overload the concept of EPHEMERAL. EPHEMERALs are tied to sessions. In
>> > the use cases that I see, the parent node is always PERSISTENT - i.e.
>> > not tied to a session.
>> >
>> > I haven't looked at the patch yet, but how do you handle the "first
>> > child" problem?
>> >
>> > My solution applies only when a node is deleted. So, there is no need
>> > for a first child check. When a node is deleted, iff its parent has
>> > zero children and is of type CONTAINER, then the parent is deleted,
>> > and so on recursively up the tree.
>> >
>> > -Jordan
>> >
>> > On April 9, 2015 at 12:15:33 PM, Michi Mutsuzaki ([email protected]) wrote:
>> >
>> > Hi Jordan.
>> >
>> > This sounds great to me, but it sounds a lot like ZOOKEEPER-723.
>> > Different people had different ideas there, but the original
>> > description was:
>> >
>> > "rather than changing the semantics of ephemeral nodes, i propose
>> > ephemeral parents: znodes that disappear when they have no more
>> > children. this cleanup would happen automatically when the last child
>> > is removed. an ephemeral parent is not tied to any particular session,
>> > so even if the creator goes away, the ephemeral parent will remain as
>> > long as there are children."
>> >
>> > I haven't looked at the patch yet, but how do you handle the "first
>> > child" problem? Is the container znode created with a first child to
>> > prevent getting deleted, or does the client rely on multi to create a
>> > container and its children, or something else?
>> >
>> > On Thu, Apr 9, 2015 at 8:00 AM, Jordan Zimmerman <[email protected]> wrote:
>> >> BACKGROUND
>> >> ============
>> >> A recurring problem for ZooKeeper users is garbage collection of
>> >> parent nodes. Many recipes (e.g. locks, leaders, etc.) call for the
>> >> creation of a parent node under which participants create sequential
>> >> nodes. When the participant is done, it deletes its node. In
>> >> practice, the ZooKeeper tree begins to fill up with orphaned parent
>> >> nodes that are no longer needed. The ZooKeeper APIs don't provide a
>> >> way to clean these. Over time, ZooKeeper can become unstable due to
>> >> the number of these nodes.
>> >>
>> >> CURRENT SOLUTIONS
>> >> ===================
>> >> Apache Curator has a workaround solution for this by providing the
>> >> Reaper class which runs in the background looking for orphaned parent
>> >> nodes and deleting them. This isn't ideal and it would be better if
>> >> ZooKeeper supported this directly.
>> >>
>> >> PROPOSAL
>> >> =========
>> >> ZOOKEEPER-723 and ZOOKEEPER-834 have been proposed to allow EPHEMERAL
>> >> nodes to contain child nodes. This is not optimal as EPHEMERALs are
>> >> tied to a session and the general use case of parent nodes is for
>> >> PERSISTENT nodes. This proposal adds a new node type, CONTAINER. A
>> >> CONTAINER node is the same as a PERSISTENT node with the additional
>> >> property that when its last child is deleted, it is deleted (and
>> >> CONTAINER nodes recursively up the tree are deleted if empty).
>> >>
>> >> I have a first pass (untested) straw man proposal open for comment
>> >> here:
>> >>
>> >> https://github.com/apache/zookeeper/pull/28
>> >>
>> >> In order to have minimum impact on existing implementations, a
>> >> container node is denoted by having an ephemeralOwner id of
>> >> Long.MIN_VALUE. This is pretty hackish, but I think it's the most
>> >> supportable without causing disruption. Also, a container behaves a
>> >> "little bit" like an EPHEMERAL node so it isn't totally illogical.
>> >> Alternatively, a new field could be added to STAT.
>> >>
>> >> I look forward to feedback on this. If people think it's worthwhile
>> >> I'll open a Jira and work on a production-quality solution. If it's
>> >> rejected, I'd appreciate discussion of an alternative as this is a
>> >> real need in the ZK user community.
>> >>
>> >> -Jordan
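
For anyone following the thread, the delete-time rule discussed above (when a node is deleted, its parent is removed too iff it is a CONTAINER with zero children, repeating up the tree) can be sketched roughly as follows. This is only an illustrative in-memory model, not the code in the pull request and not ZooKeeper's server-side tree: the class ContainerTreeSketch, its NodeType enum, and its methods are hypothetical names, and the Long.MIN_VALUE ephemeralOwner marker is not modeled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory model of the proposed CONTAINER semantics.
// It is NOT ZooKeeper's server code; names here are illustrative only.
class ContainerTreeSketch {
    enum NodeType { PERSISTENT, CONTAINER }

    // path -> node type, and path -> child paths
    private final Map<String, NodeType> types = new HashMap<>();
    private final Map<String, Set<String>> children = new HashMap<>();

    ContainerTreeSketch() {
        // The root is PERSISTENT, so the upward cleanup always stops before it.
        types.put("/", NodeType.PERSISTENT);
        children.put("/", new TreeSet<>());
    }

    static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    boolean exists(String path) {
        return types.containsKey(path);
    }

    void create(String path, NodeType type) {
        String parent = parentOf(path);
        if (!types.containsKey(parent)) {
            throw new IllegalStateException("no parent: " + parent);
        }
        if (types.containsKey(path)) {
            throw new IllegalStateException("already exists: " + path);
        }
        types.put(path, type);
        children.put(path, new TreeSet<>());
        children.get(parent).add(path);
    }

    // The check runs only on delete, so a freshly created, still-empty
    // container is never removed (no "first child" check is needed).
    void delete(String path) {
        if ("/".equals(path) || !types.containsKey(path)) {
            throw new IllegalStateException("cannot delete: " + path);
        }
        if (!children.get(path).isEmpty()) {
            throw new IllegalStateException("not empty: " + path);
        }
        String current = path;
        while (true) {
            String parent = parentOf(current);
            types.remove(current);
            children.remove(current);
            children.get(parent).remove(current);
            // Continue upward only while the parent is an empty CONTAINER.
            boolean parentIsEmptyContainer =
                types.get(parent) == NodeType.CONTAINER
                    && children.get(parent).isEmpty();
            if (!parentIsEmptyContainer) {
                break;
            }
            current = parent;
        }
    }

    public static void main(String[] args) {
        ContainerTreeSketch tree = new ContainerTreeSketch();
        tree.create("/locks", NodeType.CONTAINER);
        tree.create("/locks/member-1", NodeType.PERSISTENT);
        tree.delete("/locks/member-1");
        System.out.println("/locks exists: " + tree.exists("/locks"));
    }
}
```

In this model the work done by a single delete is bounded by the depth of the deleted node's parent chain, which matches the "at worst deleting the entire parent chain" observation earlier in the thread.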
