Re: Parent nodes multi-step transactions
Hi Gustavo,

I have a very strong feeling against more complex operations in the ZK server. These are things that should be provided by a ZK client helper library. The zkclient library from 101tec, for example, gives you exactly that.

If you're planning to write another layer on top of the ZK API, please have a look at https://issues.apache.org/jira/browse/ZOOKEEPER-835

I'm planning to provide an alternative Java client API for 3.4.0 and would then propose to deprecate the current one in the long run. You can preview the new API at http://github.com/thkoch2001/zookeeper/tree/operation_classes

However, we need to redo it on top of ZOOKEEPER-823 once it is applied to trunk.

Best regards,
Thomas Koch, http://www.koch.ro
Re: Parent nodes multi-step transactions
> My own opinion is that lots of these structure sorts of problems are solved by putting the structure into a single znode. Atomic creation and update come for free at that point and we can even make the node ephemeral which we can't really do if there are children.

Sure, it makes sense that using a single znode gets rid of some of the problems; after all, we'd be effectively getting an atomic operation. It also gets rid of many of the advantages of using ZooKeeper, though. Independent changes become conflicts, watches fire more frequently than they should, clients have to parse the whole blob to know what has changed and filter locally, etc.

> The natural representation is to have the nodes signal that they are handling a particular node by creating an ephemeral file under a per shard directory. This is nice because node failures cause automagical update of the data. The dual is also natural ... we can create shard files under node directories. That dual is a serious mistake, however, and it is much better to put all the dual information in a single node file that the node itself creates. This allows ephemerality to maintain a correct view for us.

Interesting indeed.

(...)

> This doesn't eliminate all desire for transactions, but it gets rid of LOTs of them.

Thanks for these ideas.

--
Gustavo Niemeyer
http://niemeyer.net
http://niemeyer.net/blog
http://niemeyer.net/twitter
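The "independent changes become conflicts" point can be illustrated with a toy model. This is only a sketch in Python, not ZooKeeper client code: FakeZnode, BadVersionError and the JSON layout are assumptions made up for illustration. The one behaviour borrowed from ZooKeeper is that a write may carry an expected version and is rejected if the znode has changed since it was read.

```python
import json


class BadVersionError(Exception):
    """Raised when a write is based on a stale version of the data."""


class FakeZnode:
    """Toy stand-in for one znode holding the whole structure as a blob."""

    def __init__(self, data):
        self.data = data
        self.version = 0

    def get(self):
        return self.data, self.version

    def set(self, data, expected_version):
        # Reject the write if someone else updated the blob in between.
        if expected_version != self.version:
            raise BadVersionError(expected_version)
        self.data = data
        self.version += 1


znode = FakeZnode(json.dumps({"shards": [], "status": "idle"}))

# Two clients read the same version of the blob.
blob_a, version_a = znode.get()
blob_b, version_b = znode.get()

# Client A changes one field and writes first.
state_a = json.loads(blob_a)
state_a["shards"].append("shard-7")
znode.set(json.dumps(state_a), version_a)

# Client B changes an unrelated field, but its write is now stale.
state_b = json.loads(blob_b)
state_b["status"] = "busy"
try:
    znode.set(json.dumps(state_b), version_b)
    conflict = False
except BadVersionError:
    conflict = True
```

In this model client B has to re-read, re-parse and retry even though it touched a different field, which is the cost of packing the structure into one node.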
Re: Parent nodes multi-step transactions
Hi Thomas,

> I have a very strong feeling against more complex operations in the ZK server.

Can you please describe a little better what that feeling is about?

> These are things that should be provided by a ZK client helper library. The

Which things should be provided by client helper libraries? Client libraries cannot provide atomic operations, which means that the reasoning and logic needed on top of ZK to avoid half-initialized states, and to avoid clients observing structure setup and teardown, must still be taken into account. It basically means that, to avoid having a relatively simple batch operation, the reasoning which must happen around ZK gets significantly more complex, or has to be avoided entirely.

> zkclient library from 101tec for example gives you exactly that.

It's not clear to me what exactly "that" is in this context. I've looked at the code and couldn't find an answer or alternative to the issues discussed in this thread.

> If you're planning to write another layer on top of the ZK API please have a look at https://issues.apache.org/jira/browse/ZOOKEEPER-835

Looked there as well. Also can't find anything relevant to this discussion.

> I'm planning to provide an alternative java client API for 3.4.0 and would then propose to deprecate the current one in the long run. You can preview the new API at http://github.com/thkoch2001/zookeeper/tree/operation_classes

And this is a full branch of ZK. I tried checking the commit messages to get an idea of what you mean, but I'm also unable to find answers to these problems there.

If you actually have or know of solutions for these problems which were not yet covered here, I'm very interested in knowing about them, but will need slightly more precise information.

--
Gustavo Niemeyer
http://niemeyer.net
http://niemeyer.net/blog
http://niemeyer.net/twitter
Re: Parent nodes multi-step transactions
Gustavo Niemeyer:
> Hi Thomas,
>
> > I have a very strong feeling against more complex operations in the ZK server.
>
> Can you please describe a little better what that feeling is about?

Every functionality added to ZK will make it harder to maintain. The use case you're asking for is IMHO easily solvable in a client-side helper library. So there's no reason to let ZK solve your problems.

> > These are things that should be provided by a ZK client helper library. The
>
> Which things should be provided by client helper libraries?
> [...]
> > zkclient library from 101tec for example gives you exactly that.
>
> It's not clear to me what exactly that is in this context. I've looked for the code and couldn't find an answer/alternative to the issues discussed in this thread.

recursiveDelete, recursiveCreate: If you want to create /A/C/D-1, just use recursiveCreate and you will end up with /A/C/D-1, even if the full parent path did not exist before.

> > If you're planning to write another layer on top of the ZK API please have a look at https://issues.apache.org/jira/browse/ZOOKEEPER-835
>
> Looked there as well. Also can't find anything relevant to this discussion.
>
> > I'm planning to provide an alternative java client API for 3.4.0 and would then propose to deprecate the current one in the long run. You can preview the new API at http://github.com/thkoch2001/zookeeper/tree/operation_classes
>
> And this is a full branch of ZK. Tried checking out the commit messages or something to get an idea of what you mean, but also am unable to find answers to these problems.

The idea is to provide operation classes that can be handed around. So you can create a list of create operations and hand the full list to a specific executor. If the executor ignores NodeExists exceptions, then you already have an implementation of recursiveCreate:

    List<Create> creates = Arrays.asList(
        new Create("/A"), new Create("/A/C"), new Create("/A/C/D-1"));
    myExecutor.execute(creates);

> If you actually have/know of solutions for the suggested problems which were not yet covered here, I'm very interested in knowing about them, but will need slightly more precise information.

An alternative would be to have a special znode in /A that signals that the full structure has been set up correctly.

Best regards,
Thomas Koch, http://www.koch.ro
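The operation-list idea can be sketched as follows. This is a toy model in Python, not the proposed Java API and not ZooKeeper client code: FakeTree, Create, IgnoreExistsExecutor and both exception classes are hypothetical names invented for this illustration, and the tree lives in memory.

```python
class NodeExistsError(Exception):
    """Raised when creating a path that is already present."""


class NoNodeError(Exception):
    """Raised when the parent of a new path is missing."""


class FakeTree:
    """Toy in-memory stand-in for a znode tree."""

    def __init__(self):
        self.nodes = {"/"}

    def create(self, path):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise NoNodeError(parent)
        if path in self.nodes:
            raise NodeExistsError(path)
        self.nodes.add(path)


class Create:
    """An operation object that can be built now and executed later."""

    def __init__(self, path):
        self.path = path

    def run(self, tree):
        tree.create(self.path)


class IgnoreExistsExecutor:
    """Runs operations in order, skipping nodes that already exist."""

    def __init__(self, tree):
        self.tree = tree

    def execute(self, operations):
        for op in operations:
            try:
                op.run(self.tree)
            except NodeExistsError:
                pass  # created earlier, possibly by another client


tree = FakeTree()
tree.create("/A")  # pretend /A was already created by someone else
executor = IgnoreExistsExecutor(tree)
executor.execute([Create("/A"), Create("/A/C"), Create("/A/C/D-1")])
```

Note that the executor applies the operations one at a time, so between two steps another client could still observe /A without /A/C. That gap is the atomicity concern raised earlier in the thread.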
Re: Parent nodes multi-step transactions
Hi Gustavo,

Usually the paradigm I like to suggest is to have something like

  /A/init

Every client watches for the existence of this node, and this node is only created after /A has been initialized with the creation of /A/C or other stuff.

Would that work for you?

Thanks
mahadev

On 8/23/10 7:34 AM, Gustavo Niemeyer gust...@niemeyer.net wrote:

> Greetings,
>
> We (a development team at Canonical) are stumbling into a situation here, and I'd be curious to understand what the general practice is, since I'm sure this is somewhat of a common issue.
>
> It's quite easy to describe it: say there's a parent node A somewhere in the tree. That node was created dynamically over the course of running the system, because it's associated with some resource which has its own life-span. Now, under this node we put some control nodes for different reasons (say, A/B), and we also want to track some information which is related to a sequence of nodes (say, A/C/D-0, A/C/D-1, etc). So, we end up with something like this:
>
>   A/B
>   A/C/D-0
>   A/C/D-1
>
> The question here is about best practices for taking care of nodes like A/C. It'd be fantastic to be able to create A's structure together with A itself; otherwise we risk getting into a situation where a client can see the node A before its initialization has finished (A/C doesn't exist yet). In fact, A/C may never exist, since it is possible for a client to die between the creation of A and C.
>
> Anyway, I'm sure you all understand the problem. The question here is: this is pretty common, and quite boring to deal with properly on every single client. Is there any feature in the roadmap to deal with this, and any common practice besides the obvious check for half-initialization and wait for A/C to be created, or deal with timeouts and whatnot on every client?
>
> I'm about to start writing another layer on top of ZooKeeper's API, so it'd be great to have some additional insight into this issue.
>
> --
> Gustavo Niemeyer
> http://niemeyer.net
> http://niemeyer.net/blog
> http://niemeyer.net/twitter
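The /A/init convention can be sketched with a toy model. This is a Python illustration only, not ZooKeeper client code: FakeTree, initialize, is_ready and the fail_after parameter are hypothetical names invented here, and a real client would set an existence watch on the marker rather than call a check function.

```python
class FakeTree:
    """Toy in-memory stand-in for a znode tree."""

    def __init__(self):
        self.nodes = set()

    def create(self, path):
        self.nodes.add(path)

    def exists(self, path):
        return path in self.nodes


def initialize(tree, root, children, fail_after=None):
    """Create root, then its children, and the root/init marker last.

    fail_after simulates the creating client dying after that many
    create calls, leaving a half-initialized structure behind.
    """
    paths = [root] + [f"{root}/{child}" for child in children]
    paths.append(f"{root}/init")  # the marker is always the final step
    for count, path in enumerate(paths):
        if fail_after is not None and count >= fail_after:
            return False
        tree.create(path)
    return True


def is_ready(tree, root):
    """Readers treat root as usable only once the marker exists."""
    return tree.exists(f"{root}/init")


tree = FakeTree()
initialize(tree, "/A", ["B", "C"], fail_after=2)  # dies after /A and /A/B
half_ready = is_ready(tree, "/A")

initialize(tree, "/A", ["B", "C"])  # a later run completes the setup
ready = is_ready(tree, "/A")
```

Because the marker is written last, a reader that sees it can assume the rest of the structure is in place; a crashed setup simply never becomes ready and still has to be cleaned up or completed by someone.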
Re: Parent nodes multi-step transactions
Hi Mahadev,

> Usually the paradigm I like to suggest is to have something like
>
>   /A/init
>
> Every client watches for the existence of this node and this node is only created after /A has been initialized with the creation of /A/C or other stuff.
>
> Would that work for you?

Yeah, this is what I referred to as liveness nodes in my prior ramblings, but I'm a bit sad about the amount of boilerplate work that will have to be done to put something like this to use. It feels like, as the size of the problem increases, it might become a bit hard to keep the whole picture in mind.

Here is a slightly more realistic example (still significantly reduced), to give you an idea of the problem size:

  /services/wordpress/settings
  /services/wordpress/units/wordpress-0/agent-connected
  /services/wordpress/units/wordpress-1
  /machines/machine-0/agent-connected
  /machines/machine-0/units/wordpress-1
  /machines/machine-1/units/wordpress-0

There are quite a few dynamic nodes here which are created and initialized on demand. If we use these liveness nodes, we'll have to not only set watches in several places, but also have some kind of recovery daemon to heal a half-created state, and also filter user-oriented feedback to avoid showing nodes which may be dead.

All of that would be avoided if there was a way to have multi-step atomic actions. I'm almost pondering a journal-like system on top of the basic API, to avoid having to deal with this manually.

--
Gustavo Niemeyer
http://niemeyer.net
http://niemeyer.net/blog
http://niemeyer.net/twitter
Re: Parent nodes multi-step transactions
For my $0.02, I really think it would be nice if ZK supported "lightweight transactions". By that, I simply mean that a batch of create/update/delete requests could be submitted in a single request and be processed atomically (if any of the requests would fail, none are applied).

I know transactions have been discussed before and discarded as adding too much complexity, but I think a simple version of transactions would be extremely helpful. A significant portion of our code is cleanup and workarounds for the inability to make several updates atomically. If time allowed me to work on any single feature, that's probably the one I would pick, although I'm concerned that there would be resistance to accepting it upstream.

-Dave Wright

On Mon, Aug 23, 2010 at 6:51 PM, Gustavo Niemeyer gust...@niemeyer.net wrote:
> [...]
> All of that would be avoided if there was a way to have multi-step atomic actions. [...]
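The all-or-nothing batch described above can be sketched with a toy model. This is a Python illustration of the semantics only, not ZooKeeper code or a proposed wire format: FakeTree, apply_batch, BatchError and the (kind, path, value) tuples are hypothetical names invented here.

```python
class BatchError(Exception):
    """Raised when any operation in a batch cannot be applied."""


class FakeTree:
    """Toy in-memory znode tree mapping each path to its data."""

    def __init__(self):
        self.data = {"/": b""}

    def apply_batch(self, operations):
        """Apply every (kind, path, value) operation, or none of them."""
        staged = dict(self.data)  # work on a copy until all checks pass
        for kind, path, value in operations:
            parent = path.rsplit("/", 1)[0] or "/"
            if kind == "create":
                if path in staged or parent not in staged:
                    raise BatchError(f"cannot create {path}")
                staged[path] = value
            elif kind == "update":
                if path not in staged:
                    raise BatchError(f"cannot update {path}")
                staged[path] = value
            elif kind == "delete":
                has_children = any(p.startswith(path + "/") for p in staged)
                if path not in staged or has_children:
                    raise BatchError(f"cannot delete {path}")
                del staged[path]
            else:
                raise BatchError(f"unknown operation {kind}")
        self.data = staged  # single commit point


tree = FakeTree()
tree.apply_batch([
    ("create", "/A", b""),
    ("create", "/A/B", b""),
    ("create", "/A/C", b""),
])

# The second operation fails, so the first one must not become visible.
try:
    tree.apply_batch([
        ("create", "/A/C/D-0", b"payload"),
        ("delete", "/missing", None),
    ])
    failed = False
except BatchError:
    failed = True
```

Staging on a copy and swapping it in at the end means readers of this model never see /A without /A/B and /A/C, and a failed batch leaves no partial state to clean up.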