Ted - are you interested in writing this on top of Curator? If not, I'll give it a whack.
-JZ On 1/5/12 12:50 AM, "Ted Dunning" <[email protected]> wrote: >Jordan, I don't think that leader election does what Josh wants. > >I don't think that consistent hashing is particularly good for that either >because the loss of one node causes the sequential state for lots of >entities to move even among nodes that did not fail. > >What I would recommend is a variant of micro-sharding. The key space is >divided into many micro-shards. Then nodes that are alive claim the >micro-shards using ephemerals and proceed as Josh described. On loss of a >node, the shards that node was handling should be claimed by the remaining >nodes. When a new node appears or new work appears, it is helpful to >direct nodes to effect a hand-off of traffic. > >In my experience, the best way to implement shard balancing is with and >external master instance much in the style of hbase or katta. This >external master can be exceedingly simple and only needs to wake up on >various events like loss of a node or change in the set of live shards. >It >can also wake up at intervals if desired to backstop the normal >notifications or to allow small changes for certain kinds of balancing. > Typically, this only requires a few hundred lines of code. > >This external master can, of course, be run on multiple nodes and which >master is in current control can be adjudicated with yet another leader >election. > >You can view this as a package of many leader elections. Or as >discretized >consistent hashing. The distinctions are a bit subtle but are very >important. These include, > >- there is a clean division of control between the master which determines >who serves what and the nodes that do the serving > >- there is no herd effect because the master drives the assignments > >- node loss causes the minimum amount of change of assignments since no >assignments to surviving nodes are disturbed. This is a major win. > >- balancing is pretty good because there are many shards compared to the >number of nodes. > >- the balancing strategy is highly pluggable. > >This pattern would make a nice addition to Curator, actually. It comes up >repeatedly in different contexts. > >On Thu, Jan 5, 2012 at 12:11 AM, Jordan Zimmerman ><[email protected]>wrote: > >> OK - so this is two options for doing the same thing. You use a Leader >> Election algorithm to make sure that only one node in the cluster is >> operating on a work unit. Curator has an implementation (it's really >>just >> a distributed lock with a slightly different API). >> >> -JZ >> >> On 1/5/12 12:04 AM, "Josh Stone" <[email protected]> wrote: >> >> >Thanks for the response. Comments below: >> > >> >On Wed, Jan 4, 2012 at 10:46 PM, Jordan Zimmerman >> ><[email protected]>wrote: >> > >> >> Hi Josh, >> >> >> >> >Second use case: Distributed locking >> >> This is one of the most common uses of ZooKeeper. There are many >> >> implementations - one included with the ZK distro. Also, there is >> >>Curator: >> >> https://github.com/Netflix/curator >> >> >> >> >First use case: Distributing work to a cluster of nodes >> >> This sounds feasible. If you give more details I and others on this >>list >> >> can help more. >> >> >> > >> >Sure. I basically want to handle race conditions where two commands >>that >> >operate on the same data are received by my cluster of znodes, >> >concurrently. One approach is to lock on the data that is effected by >>the >> >command (distributed lock). Another approach is make sure that all of >>the >> >commands that operate on any set of data are routed to the same node, >> >where >> >they can be processed serially using local synchronization. Consistent >> >hashing is an algorithm that can be used to select a node to handle a >> >message (where the inputs are the key to hash and the number of nodes >>in >> >the cluster). >> > >> >There are various implementations for this floating around. I'm just >> >interesting to know how this is working for anyone else. >> > >> >Josh >> > >> > >> >> >> >> -JZ >> >> >> >> ________________________________________ >> >> From: Josh Stone [[email protected]] >> >> Sent: Wednesday, January 04, 2012 8:09 PM >> >> To: [email protected] >> >> Subject: Use cases for ZooKeeper >> >> >> >> I have a few use cases that I'm wondering if ZooKeeper would be >>suitable >> >> for and would appreciate some feedback. >> >> >> >> First use case: Distributing work to a cluster of nodes using >>consistent >> >> hashing to ensure that messages of some type are consistently >>handled by >> >> the same node. I haven't been able to find any info about ZooKeeper + >> >> consistent hashing. Is anyone using it for this? A concern here >>would be >> >> how to redistribute work as nodes come and go from the cluster. >> >> >> >> Second use case: Distributed locking. I noticed that there's a recipe >> >>for >> >> this on the ZooKeeper wiki. Is anyone doing this? Any issues? One >> >>concern >> >> would be how to handle orphaned locks if a node that obtained a lock >> >>goes >> >> down. >> >> >> >> Third use case: Fault tolerance. If we utilized ZooKeeper to >>distribute >> >> messages to workers, can it be made to handle a node going down by >> >> re-distributing the work to another node (perhaps messages that are >>not >> >> ack'ed within a timeout are resent)? >> >> >> >> Cheers, >> >> Josh >> >> >> >>
