You can use Kafka. If you partition the topic by key, multiple spouts can read from the same topic in parallel, with each spout consuming a different subset of the partitions, so the spouts do not duplicate each other's reads.
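
To illustrate the idea, here is a rough sketch in plain Python. It does not use the Kafka or Storm APIs; the partition count, spout count, hash function, and the helper names `partition_for` and `partitions_for_spout` are all assumptions made up for this example (Kafka's own partitioner uses a different hash). It only shows why key-based partitioning lets several readers share one topic without overlapping.

```python
import zlib

# Hypothetical sizes, chosen only for illustration.
NUM_PARTITIONS = 4
NUM_SPOUTS = 2


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition using a stable hash.

    CRC32 is a stand-in here; it is not the hash Kafka itself uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partitions_for_spout(spout_index: int,
                         num_spouts: int = NUM_SPOUTS,
                         num_partitions: int = NUM_PARTITIONS) -> list:
    """Give each spout a disjoint set of partitions (round-robin)."""
    return [p for p in range(num_partitions) if p % num_spouts == spout_index]


if __name__ == "__main__":
    for key in ["order-1", "order-2", "order-3", "order-4"]:
        partition = partition_for(key)
        # Exactly one spout owns this partition, so exactly one spout reads the key.
        owner = next(s for s in range(NUM_SPOUTS)
                     if partition in partitions_for_spout(s))
        print(f"key={key} partition={partition} spout={owner}")
```

Because every key always hashes to the same partition and every partition belongs to exactly one spout, no two spouts see the same message, and no filters are needed on the consumer side.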

Supun..

On Sat, May 9, 2015 at 4:57 PM, Javier Gonzalez <[email protected]> wrote:

> Hi,
>
> I'm currently approaching the design of an application that will have a
> single source of data from AMPS (high speed pub-sub system like Kafka). We
> are currently facing the issue that the spout is much faster than the
> bolts, and I believe the farming out of the processing to different nodes
> is hurting our performance. Before we used to have several consumers on a
> queue-like producer, so each spout would likely transfer to the "nearest"
> bolts, but now with the pub-sub model we can't just consume blindly off the
> source or we would face duplication.
>
> Any ideas on how to approach this? One idea we're toying with is using
> more than one consumer, but using filters so that we can assure there is no
> duplicate reads. Any others any of you could have, I would be grateful :)
>
> best regards,
>
> --
> Javier González Nicolini
>

--
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: [email protected]; Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com
