[ https://issues.apache.org/jira/browse/ZOOKEEPER-423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Henry Robinson updated ZOOKEEPER-423: ------------------------------------- Attachment: ZOOKEEPER-423.patch Draft patch > Add getFirstChild API > --------------------- > > Key: ZOOKEEPER-423 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-423 > Project: ZooKeeper > Issue Type: New Feature > Components: contrib-bindings, documentation, java client, server > Reporter: Henry Robinson > Attachments: ZOOKEEPER-423.patch > > > When building the distributed queue for my tutorial blog post, it was pointed > out to me that there's a serious inefficiency here. > Informally, the items in the queue are created as sequential nodes. For a > 'dequeue' call, all items are retrieved and sorted by name by the client in > order to find the name of the next item to try and take. This costs O( n ) > bandwidth and O(n.log n) sorting time - per dequeue call! Clearly this > doesn't scale very well. > If the servers were able to maintain a data structure that allowed them to > efficiently retrieve the children of a node in order of the zxid that created > them this would make successful dequeue operations O( 1 ) at the cost of O( n > ) memory on the server (to maintain, e.g. a singly-linked list as a queue). > This is a win if it is generally true that clients only want the first child > in creation order, rather than the whole set. > We could expose this to the client via this API: getFirstChild(handle, path, > name_buffer, watcher) which would have much the same semantics as > getChildren, but only return one znode name. > Sequential nodes would still allow the ordering of znodes to be made > explicitly available to the client in one RPC should it need it. Although: > since this ordering would now be available cheaply for every set of children, > it's not completely clear that there would be that many use cases left for > sequential nodes if this API was augmented with a getChildrenInCreationOrder > call. However, that's for a different discussion. > A halfway-house alternative with more flexibility is to add an 'order' > parameter to getFirstChild and have the server compute the first child > according to the requested order (creation time, update time, lexicographical > order). This saves bandwidth at the expense of increased server load, > although servers can be implemented to spend memory on pre-computing commonly > requested orders. I am only in favour of this approach if servers maintain a > data-structure for every possible order, and then the memory implications > need careful consideration. > [edit - JIRA interprets ( n ) without the spaces as a thumbs-down. cute.] -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira