[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henry Robinson updated ZOOKEEPER-423:
-------------------------------------

    Attachment: ZOOKEEPER-423.patch

[Non-garbage patch this time]

This is a draft patch for this issue. I'd appreciate feedback on the approach. 

I've added two API methods - getFirstChild and getLastChild. I've changed 
DataNode to keep its child list in a TreeSet, not a HashSet. This may be 
controversial because operations are now logarithmic in the number of children, 
not O(1), but the datatree's hashmap is still used for most path-based lookups.

Both API methods throw an Empty exception if there are no children. 

I've decided not to address the issue of *what* to sort by here, and just 
defaulted to sorting by node name. The API leaves the door open to specifying a 
sort order at the time of parent node creation (i.e. another argument to 
create(...)).

This patch obviously lacks tests, comments and general cleanup. You should be 
able to use a shell to call lsfirst and lslast. 

Any comments on the APIs would be great - would we rather one getOrderedChild 
API which took a flag describing whether to get the first or last? Or a 
getNthChild API?

> Add getFirstChild API
> ---------------------
>
>                 Key: ZOOKEEPER-423
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-423
>             Project: ZooKeeper
>          Issue Type: New Feature
>          Components: contrib-bindings, documentation, java client, server
>            Reporter: Henry Robinson
>         Attachments: ZOOKEEPER-423.patch
>
>
> When building the distributed queue for my tutorial blog post, it was pointed 
> out to me that there's a serious inefficiency here. 
> Informally, the items in the queue are created as sequential nodes. For a 
> 'dequeue' call, all items are retrieved and sorted by name by the client in 
> order to find the name of the next item to try and take. This costs O( n ) 
> bandwidth and O(n.log n) sorting time - per dequeue call! Clearly this 
> doesn't scale very well. 
> If the servers were able to maintain a data structure that allowed them to 
> efficiently retrieve the children of a node in order of the zxid that created 
> them this would make successful dequeue operations O( 1 ) at the cost of O( n 
> ) memory on the server (to maintain, e.g. a singly-linked list as a queue). 
> This is a win if it is generally true that clients only want the first child 
> in creation order, rather than the whole set. 
> We could expose this to the client via this API: getFirstChild(handle, path, 
> name_buffer, watcher) which would have much the same semantics as 
> getChildren, but only return one znode name. 
> Sequential nodes would still allow the ordering of znodes to be made 
> explicitly available to the client in one RPC should it need it. Although: 
> since this ordering would now be available cheaply for every set of children, 
> it's not completely clear that there would be that many use cases left for 
> sequential nodes if this API was augmented with a getChildrenInCreationOrder 
> call. However, that's for a different discussion. 
> A halfway-house alternative with more flexibility is to add an 'order' 
> parameter to getFirstChild and have the server compute the first child 
> according to the requested order (creation time, update time, lexicographical 
> order). This saves bandwidth at the expense of increased server load, 
> although servers can be implemented to spend memory on pre-computing commonly 
> requested orders. I am only in favour of this approach if servers maintain a 
> data-structure for every possible order, and then the memory implications 
> need careful consideration.
> [edit - JIRA interprets ( n ) without the spaces as a thumbs-down. cute.]

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to