Sure - give me what you have and I'll port it to Curator.

On 1/12/12 6:18 PM, "Ted Dunning" <[email protected]> wrote:
>I think I have a bit of it written already.
>
>It doesn't use Curator and I think you could simplify it substantially if
>you were to use it. Would that help?
>
>On Thu, Jan 12, 2012 at 12:52 PM, Jordan Zimmerman
><[email protected]> wrote:
>
>> Ted - are you interested in writing this on top of Curator? If not, I'll
>> give it a whack.
>>
>> -JZ
>>
>> On 1/5/12 12:50 AM, "Ted Dunning" <[email protected]> wrote:
>>
>> >Jordan, I don't think that leader election does what Josh wants.
>> >
>> >I don't think that consistent hashing is particularly good for that
>> >either because the loss of one node causes the sequential state for
>> >lots of entities to move even among nodes that did not fail.
>> >
>> >What I would recommend is a variant of micro-sharding. The key space
>> >is divided into many micro-shards. Then nodes that are alive claim the
>> >micro-shards using ephemerals and proceed as Josh described. On loss
>> >of a node, the shards that node was handling should be claimed by the
>> >remaining nodes. When a new node appears or new work appears, it is
>> >helpful to direct nodes to effect a hand-off of traffic.
>> >
>> >In my experience, the best way to implement shard balancing is with an
>> >external master instance much in the style of hbase or katta. This
>> >external master can be exceedingly simple and only needs to wake up on
>> >various events like loss of a node or change in the set of live shards.
>> >It can also wake up at intervals if desired to backstop the normal
>> >notifications or to allow small changes for certain kinds of balancing.
>> >Typically, this only requires a few hundred lines of code.
>> >
>> >This external master can, of course, be run on multiple nodes and which
>> >master is in current control can be adjudicated with yet another leader
>> >election.
>> >
>> >You can view this as a package of many leader elections. Or as
>> >discretized consistent hashing.
>> >The distinctions are a bit subtle but are very important. These
>> >include,
>> >
>> >- there is a clean division of control between the master which
>> >determines who serves what and the nodes that do the serving
>> >
>> >- there is no herd effect because the master drives the assignments
>> >
>> >- node loss causes the minimum amount of change of assignments since no
>> >assignments to surviving nodes are disturbed. This is a major win.
>> >
>> >- balancing is pretty good because there are many shards compared to
>> >the number of nodes.
>> >
>> >- the balancing strategy is highly pluggable.
>> >
>> >This pattern would make a nice addition to Curator, actually. It comes
>> >up repeatedly in different contexts.
>> >
>> >On Thu, Jan 5, 2012 at 12:11 AM, Jordan Zimmerman
>> ><[email protected]> wrote:
>> >
>> >> OK - so this is two options for doing the same thing. You use a
>> >> Leader Election algorithm to make sure that only one node in the
>> >> cluster is operating on a work unit. Curator has an implementation
>> >> (it's really just a distributed lock with a slightly different API).
>> >>
>> >> -JZ
>> >>
>> >> On 1/5/12 12:04 AM, "Josh Stone" <[email protected]> wrote:
>> >>
>> >> >Thanks for the response. Comments below:
>> >> >
>> >> >On Wed, Jan 4, 2012 at 10:46 PM, Jordan Zimmerman
>> >> ><[email protected]> wrote:
>> >> >
>> >> >> Hi Josh,
>> >> >>
>> >> >> >Second use case: Distributed locking
>> >> >> This is one of the most common uses of ZooKeeper. There are many
>> >> >> implementations - one included with the ZK distro. Also, there is
>> >> >> Curator: https://github.com/Netflix/curator
>> >> >>
>> >> >> >First use case: Distributing work to a cluster of nodes
>> >> >> This sounds feasible. If you give more details I and others on this
>> >> >> list can help more.
>> >> >>
>> >> >
>> >> >Sure.
>> >> >I basically want to handle race conditions where two commands that
>> >> >operate on the same data are received by my cluster of znodes,
>> >> >concurrently. One approach is to lock on the data that is affected by
>> >> >the command (distributed lock). Another approach is to make sure that
>> >> >all of the commands that operate on any set of data are routed to the
>> >> >same node, where they can be processed serially using local
>> >> >synchronization. Consistent hashing is an algorithm that can be used
>> >> >to select a node to handle a message (where the inputs are the key to
>> >> >hash and the number of nodes in the cluster).
>> >> >
>> >> >There are various implementations for this floating around. I'm just
>> >> >interested to know how this is working for anyone else.
>> >> >
>> >> >Josh
>> >> >
>> >> >
>> >> >>
>> >> >> -JZ
>> >> >>
>> >> >> ________________________________________
>> >> >> From: Josh Stone [[email protected]]
>> >> >> Sent: Wednesday, January 04, 2012 8:09 PM
>> >> >> To: [email protected]
>> >> >> Subject: Use cases for ZooKeeper
>> >> >>
>> >> >> I have a few use cases that I'm wondering if ZooKeeper would be
>> >> >> suitable for and would appreciate some feedback.
>> >> >>
>> >> >> First use case: Distributing work to a cluster of nodes using
>> >> >> consistent hashing to ensure that messages of some type are
>> >> >> consistently handled by the same node. I haven't been able to find
>> >> >> any info about ZooKeeper + consistent hashing. Is anyone using it for
>> >> >> this? A concern here would be how to redistribute work as nodes come
>> >> >> and go from the cluster.
>> >> >>
>> >> >> Second use case: Distributed locking. I noticed that there's a recipe
>> >> >> for this on the ZooKeeper wiki. Is anyone doing this? Any issues? One
>> >> >> concern would be how to handle orphaned locks if a node that obtained
>> >> >> a lock goes down.
>> >> >>
>> >> >> Third use case: Fault tolerance. If we utilized ZooKeeper to
>> >> >> distribute messages to workers, can it be made to handle a node going
>> >> >> down by re-distributing the work to another node (perhaps messages
>> >> >> that are not ack'ed within a timeout are resent)?
>> >> >>
>> >> >> Cheers,
>> >> >> Josh
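
For reference, the master-driven micro-shard assignment that Ted describes in the thread can be sketched roughly as follows. This is only an in-memory illustration under stated assumptions, not code from the thread and not Curator's or ZooKeeper's API: the function name `rebalance` and its parameters are hypothetical, and a plain dict stands in for the ephemeral znodes that would record which node has claimed which shard. The balancing rule used here (least-loaded node first) is just one possible strategy.

```python
from collections import defaultdict


def rebalance(assignments, live_nodes, num_shards):
    """Return a new shard -> node map, moving as few shards as possible.

    assignments: dict of shard id -> node name (may mention dead nodes).
    live_nodes:  iterable of node names that are currently alive.
    num_shards:  total number of micro-shards (many more than nodes).
    All names here are hypothetical; a real master would read and write
    this state in ZooKeeper instead of a dict.
    """
    live = sorted(set(live_nodes))
    if not live:
        return {}

    # Keep every assignment whose owner is still alive, untouched.
    new = {s: n for s, n in assignments.items()
           if n in live and 0 <= s < num_shards}
    load = defaultdict(list)
    for shard, node in new.items():
        load[node].append(shard)

    # Orphans: shards never assigned, or whose owner has died.
    orphans = [s for s in range(num_shards) if s not in new]

    # Give each orphan to the least-loaded live node (ties by name).
    for shard in orphans:
        target = min(live, key=lambda n: (len(load[n]), n))
        new[shard] = target
        load[target].append(shard)

    # Hand-off step: shift shards from the busiest node to the idlest
    # until loads differ by at most one. This is what lets a newly
    # arrived node pick up work.
    while True:
        busiest = max(live, key=lambda n: (len(load[n]), n))
        idlest = min(live, key=lambda n: (len(load[n]), n))
        if len(load[busiest]) - len(load[idlest]) <= 1:
            break
        shard = load[busiest].pop()
        new[shard] = idlest
        load[idlest].append(shard)

    return new
```

A master would call something like `rebalance(current, live, 12)` whenever the set of live nodes changes. Starting from an evenly balanced map, losing a node only reassigns that node's shards and leaves the survivors' shards where they were, which is the property Ted calls out as the major win.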
