I have heard pretty good words about Rabbit MQ as well. Message queues definitely are a great way to go. They do have limits with regard to double processing.
On Fri, Mar 18, 2011 at 10:05 PM, Michi Mutsuzaki <[email protected]>wrote: > Hi Otis, > > Would it be possible to change your app to use producer-consumer queue? > That way, > no document will go unprocessed when an instance goes down. > > > http://zookeeper.apache.org/doc/r3.3.2/zookeeperTutorial.html#sc_producerConsumerQueues > > --Michi > > On Mar 18, 2011, at 9:41 PM, Otis Gospodnetic wrote: > > > Hi, > > > > Thanks Ben. Let me describe the context a bit more - once you know what > I'm > > trying to do you may have suggestions for how to solve the problem with > or > > without ZK. > > > > I have a continuous "stream" of documents that I need to process. The > stream is > > pretty fast (don't know the exact number, but it's many docs a second). > Docs > > live only in-memory in that stream and cannot be saved to disk at any > point up > > the stream. > > > > My app listens to this stream. Because of the high document ingestion > rate I > > need N instances of my app to listen to this stream. So all N apps > listen and > > they all "get" the same documents, but only 1 app actually processes each > > document -- "if (docID mod N == appID) then process doc" -- the usual > consistent > > hashing stuff. I'd like to be able to add and remove apps dynamically > and have > > the remaining apps realize that "N" has changed. Similarly, when some > app > > instance dies and thus "N" changes, I'd like all the remaining instances > to know > > about it. > > > > If my apps don't know the correct "N" then 1/Nth of docs will go > unprocessed (if > > the app died or was removed) until the remaining apps adjust their local > value > > of "N". > > > >> to deal with this applications can use views, which allow clients to > >> reconcile differences. for example, if two processes communicate and > > > > Hm, this requires apps to communicate with each other. If each app was > aware of > > other apps, then I could get the membership count directly using that > mechanism, > > although I still wouldn't be able to immediately detect when some app > died, at > > least I'm not sure how I could do that. > > > >> one has a different list of members than the other then they can both > >> consult zookeeper to reconcile or use the membership list with the > >> highest zxid. the other option is to count on eventually everyone > >> converging. > > > > Right, if I could live with eventually having the right "N", then I could > use ZK > > as described on > > > http://eng.wealthfront.com/2010/01/actually-implementing-group-management.html > > > > But say I'm OK with "eventually everyone converging". Can I use ZK then? > And > > if so, how "eventually" is this "eventually"? That is, if an app dies, > how > > quickly can ZK notify all znode watchers that znode change? A few > milliseconds > > or more? > > > > In general, how does one deal with situations like the one I described > above, > > where each app is responsible for 1/Nth of work and where N can > uncontrollably > > and unexpectedly change? > > > > Thanks! > > Otis > > > > > > > > > > > > ----- Original Message ---- > >> From: Benjamin Reed <[email protected]> > >> To: [email protected] > >> Sent: Fri, March 18, 2011 5:59:43 PM > >> Subject: Re: Using ZK for real-time group membership notification > >> > >> in a distributed setting such an answer is impossible. especially > >> given the theory of relativity and the speed of light. a machine may > >> fail right after sending a heart beat or another may come online right > >> after sending a report. even if zookeeper could provide this you would > >> still have thread scheduling issues on a local machine that means that > >> you are operating on old information. > >> > >> to deal with this applications can use views, which allow clients to > >> reconcile differences. for example, if two processes communicate and > >> one has a different list of members than the other then they can both > >> consult zookeeper to reconcile or use the membership list with the > >> highest zxid. the other option is to count on eventually everyone > >> converging. > >> > >> i would not develop a distributed system with the assumption that "all > >> group members know *the exact number of members at all times*". > >> > >> ben > >> > >> On Fri, Mar 18, 2011 at 2:02 PM, Otis Gospodnetic > >> <[email protected]> wrote: > >>> Hi, > >>> > >>> Short version: > >>> How can ZK be used to make sure that all group members know *the exact > >> number of > >>> members at all times*? > >>> > >>> I have an app that can be run on 1 or more servers. New instances of > the > >> app > >>> come and go, may die, etc. -- the number of the app instances is > completely > >>> dynamic. At any one time, as these apps come and go, each live > instance of > >> the > >>> app needs to know how many instances are there total. If a new > instance of > >> the > >>> app is started, all instances need to know the new total number of > >> instances. > >>> If an app is stopped or if it dies, the remaining apps need to know > the new > >>> number of app instances. > >>> > >>> Also, and this is critical, they need to know about these > additions/removals > >> of > >>> apps right away and they all need to find out them at the same time. > >> Basically, > >>> all members of some group need to know *the exact number of members at > all > >>> times*. > >>> > >>> This sounds almost like we need to watch a "parent group znode" and > monitor > >> the > >>> number of its ephemeral children, which represent each app instance > that is > >>> watching the "parent groups znode". Is that right? If so, then all > I'd > >> need to > >>> know is the answer to "How many watchers are watching this znode?" of > "How > >> many > >>> kids does this znode have?". And I'd need ZK to notify all watchers > whenever > >> the > >>> answer to this question changes. Ideally it would send/push the > answer > > (the > >>> number of watchers) to all watchers, but if not, I assume any watcher > that > >> is > >>> notified about the change would go poll ZK to get the number of > ephemeral > >> kids. > >>> > >>> I think the above is essentially what's described on > >>> > >> > http://eng.wealthfront.com/2010/01/actually-implementing-group-management.html, > >>> but doesn't answer the part that's critical for me (the very first Q > up > >> above). > >>> > >>> Thanks, > >>> Otis > >>> ---- > >>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch > >>> Lucene ecosystem search :: http://search-lucene.com/ > >>> > >>> > >> > >
