It really depends on the volume of tasks you're dealing with.

If you have a medium volume of background tasks (say, more than one task every 
5 seconds, with high growth potential), it might be a good idea to consider 
something that was designed from the start as a task queue (things like 
http://www.celeryproject.org/, http://gearman.org/, and 
https://github.com/resque/resque fit into this category).

If you have a much higher number of tasks (more than a few per second, possibly 
millions of tasks per second), something like https://kafka.apache.org/ (very 
high volume), or http://www.rabbitmq.com/ (medium-high volume) might be a good 
fit for the job.

If you have a very low volume of tasks (a few per minute), you might be able to 
get away with a quick queue implementation directly on top of ZooKeeper. Take a 
look at the queue recipes at 
https://zookeeper.apache.org/doc/r3.1.2/recipes.html#sc_recipes_Queues (also at 
http://blog.cloudera.com/blog/2009/05/building-a-distributed-concurrent-queue-with-apache-zookeeper/).
It shouldn't be too much effort to whip up something using 
https://kazoo.readthedocs.org/en/latest/.
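
To give a rough idea of what that could look like, here is a minimal sketch 
in Python, assuming kazoo's LockingQueue recipe (put/get/consume). The JSON 
task format, the /tasks/background path, the 127.0.0.1:2181 address, and the 
submit/work_once/main helper names are all made up for illustration, so treat 
this as a starting point rather than a finished design.

```python
import json


def submit(queue, name, **params):
    """Serialize a task as JSON bytes and put it on the queue."""
    queue.put(json.dumps({"name": name, "params": params}).encode("utf-8"))


def work_once(queue, handler, timeout=5):
    """Take one task, run handler on it, then acknowledge it.

    Returns True if a task was processed, False if nothing arrived in time.
    """
    raw = queue.get(timeout=timeout)  # None when the queue stayed empty
    if raw is None:
        return False
    task = json.loads(raw.decode("utf-8"))
    handler(task["name"], task["params"])
    queue.consume()  # remove the entry only after the handler returned
    return True


def main(hosts="127.0.0.1:2181"):  # assumed ZooKeeper address
    # Imported here so the helpers above work without kazoo installed.
    from kazoo.client import KazooClient
    from kazoo.recipe.queue import LockingQueue

    zk = KazooClient(hosts=hosts)
    zk.start()
    try:
        queue = LockingQueue(zk, "/tasks/background")  # hypothetical path
        submit(queue, "send_report", user_id=42)
        work_once(queue, lambda name, params: print(name, params))
    finally:
        zk.stop()
```

Calling main() needs the kazoo package and a reachable ZooKeeper server. The 
point of acknowledging with consume() only after the handler returns is that, 
as far as I understand the recipe, a task held by a worker that dies becomes 
available to the other nodes again once that worker's session goes away.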

There are a ton of systems out there that do something like this (I've even 
made one myself, https://github.com/hpc/libcircle), so there's a good chance 
I've missed the one that would be perfect for your use case. However, the links 
in this email should give you a decent starting point.

-Jon

On Aug 18, 2014, at 7:41 AM, Phil Burress <[email protected]> wrote:

Currently we have a cluster of machines running a single application. The
cluster performs various background tasks and we have a hacky, home-grown
solution for the nodes in the cluster to coordinate with each other to
perform these background tasks. It's very error-prone and we're looking to
replace it. Would Zookeeper be a good fit for coordinating something like
this? If so, are there any lightweight examples out there we could look at?

Thanks very much!

Phil
