Another option would be to start the znode's version at the version of the parent znode, which changes with each deletion and creation of child nodes. One problem with this, though (apart from being limited to 2^31 values), is that the API doesn't have any way to return the initial znode version on creation. Fixing this in a backward-compatible, non-ugly way would be hard, I think.
-Ivan

On 15 November 2014 03:48, Robin <[email protected]> wrote:
> Hi zookeepers,
>
> While digging into ZooKeeper's internals, I learned of the following flaw
> in the znode version: a znode's version is reset when the znode is
> deleted and re-created. This is a trap for operations that make updates
> based on the znode version.
>
> Let's see an example: a client gets the data of a znode (e.g., /test) and
> its version (e.g., 1), changes the data, and writes it back on the
> condition that the version has not changed (is still 1). If another client
> deletes and re-creates this znode while the first client is updating the
> data, the version can still match, but the znode now contains the wrong data.
>
> The problem, as I see it, is that the znode version is designed to be a
> monotonically increasing integer. If we include the birth date (timestamp)
> of the znode, or the zxid of its creation, as part of the znode's
> version, and increase only the integer part of the version each time
> the znode is updated, while keeping the birth-date or zxid part
> unchanged, we can avoid the problem.
>
> Of course, the new design has a cost: the version field needs to be
> bigger.
>
> Thanks,
> - Robin
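
To make the scenario in the quoted message concrete, here is a minimal in-memory sketch. It is not the ZooKeeper client API: `ToyStore`, `Znode`, `BadVersion` and `run_scenario` are hypothetical names made up for illustration, and the optional `czxid` argument to `set_data` is an assumption standing in for the proposed "creation zxid as part of the version" check. The real setData call is conditional on the integer version only.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


class BadVersion(Exception):
    """Raised when a conditional update's expectation does not match."""


@dataclass
class Znode:
    data: bytes
    version: int  # starts at 0 again on every (re-)creation
    czxid: int    # id of the transaction that created this incarnation


class ToyStore:
    """Hypothetical in-memory stand-in for a znode tree (not ZooKeeper itself)."""

    def __init__(self) -> None:
        self._nodes: Dict[str, Znode] = {}
        self._zxid = 0

    def _next_zxid(self) -> int:
        self._zxid += 1
        return self._zxid

    def create(self, path: str, data: bytes) -> None:
        self._nodes[path] = Znode(data, version=0, czxid=self._next_zxid())

    def delete(self, path: str) -> None:
        self._next_zxid()
        del self._nodes[path]

    def get(self, path: str) -> Tuple[bytes, int, int]:
        node = self._nodes[path]
        return node.data, node.version, node.czxid

    def set_data(self, path: str, data: bytes, version: int,
                 czxid: Optional[int] = None) -> None:
        node = self._nodes[path]
        if node.version != version:
            raise BadVersion("version mismatch")
        # Assumed extension: also compare the creation zxid when one is given.
        if czxid is not None and node.czxid != czxid:
            raise BadVersion("znode was re-created")
        self._next_zxid()
        node.data = data
        node.version += 1


def run_scenario(check_czxid: bool) -> bytes:
    """Replay the delete/re-create race and return the final data of /test."""
    store = ToyStore()
    store.create("/test", b"v0")
    store.set_data("/test", b"v1", version=0)    # version becomes 1

    # Client A reads the data at version 1.
    data, version, czxid = store.get("/test")

    # Client B deletes, re-creates and updates once: version is 1 again.
    store.delete("/test")
    store.create("/test", b"other")
    store.set_data("/test", b"other-1", version=0)

    # Client A writes back, conditional on what it read earlier.
    try:
        store.set_data("/test", data + b"+A", version,
                       czxid if check_czxid else None)
    except BadVersion:
        pass  # the stale write was rejected
    return store.get("/test")[0]
```

With the version-only check, client A's stale write is accepted because the re-created znode has reached version 1 again. With the creation zxid compared as well, the write is rejected and client B's data survives.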
