Rather than iterating through the entire queue of activity entries from all publishers (presumably since the beginning of the queue's existence), would it make more sense to keep a map from publisher URLs to per-publisher activity queues? That way a subscriber could quickly look up activity for only the publishers he/she has subscribed to, rather than sifting through activity from publishers he/she doesn't care about.
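
To make the idea concrete, here is a rough sketch of the map I have in mind. Everything in it is hypothetical: the class name `PublisherActivityIndex`, the method names, and the use of a plain `String` as a stand-in for an activity entry are assumptions for illustration, not anything taken from the Streams source.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical sketch: one activity queue per publisher URL. */
class PublisherActivityIndex {
    // publisher URL -> activities from that publisher, in arrival order
    private final Map<String, Queue<String>> queuesByPublisher =
            new ConcurrentHashMap<>();

    /** Append an activity (a String stands in for a real activity entry). */
    void publish(String publisherUrl, String activity) {
        queuesByPublisher
                .computeIfAbsent(publisherUrl, url -> new ConcurrentLinkedQueue<>())
                .add(activity);
    }

    /** Collect activity only from the publishers a subscriber follows. */
    List<String> activitiesFor(Collection<String> subscribedPublisherUrls) {
        List<String> result = new ArrayList<>();
        for (String url : subscribedPublisherUrls) {
            Queue<String> queue = queuesByPublisher.get(url);
            if (queue != null) {
                // Unknown publishers are skipped; known ones are read in FIFO order.
                result.addAll(queue);
            }
        }
        return result;
    }
}
```

The lookup cost here depends on the number of publishers a subscriber follows and the size of their queues, not on the total activity from all publishers. The sketch does not track which entries a subscriber has already consumed; that would need a per-subscriber offset or similar.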
> Date: Tue, 9 Jul 2013 20:01:48 -0400
> Subject: Re: Subscriber/ Publisher handling of activity
> From: [email protected]
> To: [email protected]
>
> A clarifying point on the iteration - the aggregator service knows about
> each subscriber and is responsible for pulling activities from the queue
> and offering them to each subscriber - this is a potential bottleneck and
> it would be good to get a discussion going on how to mitigate that
>
> On Tuesday, July 9, 2013, Jason Letourneau wrote:
>
> > The last discussion on this topic had subscribers applying a filter to
> > each published message on the queue - there should be some stub classes in
> > the source that show this thought direction - each subscriber would be
> > iterated over and asked to process each published activity on the queue -
> > they would apply a filter adhering to the filter interface - the
> > implementation of that filter could be anything - one thought was that a
> > DSL like Lucene syntax could be the default implementation - to answer your
> > foundational question - publishers should have no knowledge of who is
> > subscribed, and subscribers should be able to filter in the best way for
> > them (i.e. based on source of message, user, activity streams properties etc)
> >
> > Jason
> >
> >
> > On Tuesday, July 9, 2013, Danny Sullivan wrote:
> >
> >> Will publishers or subscribers be in charge of making sure that only
> >> specific activity stream entries make it to a certain queue?
> >> If publishers are in charge, I would imagine that there would exist a
> >> list of all subscribers for each publisher. Then each activity published
> >> would be added to all the subscribers in that publisher's subscriber list.
> >> If subscribers are in charge, each subscriber would have a list of
> >> publishers he/she is subscribed to. Then on some sort of timer, the list
> >> would be iterated through and all activity entries not already consumed by
> >> that subscriber would be output.
> >> Looking at the application architecture here:
> >> http://streams.incubator.apache.org/architecture.html it looks like all
> >> activity is passed through a single queue. If this is going to be the
> >> implementation going forward, I would think it would make more sense for
> >> subscribers to handle the filtering. That would make it so that all
> >> activity entries could be dumped in a single database by the publishers and
> >> activity could be extracted and filtered based on some list kept by each
> >> individual subscriber. Let me know if this sounds like it aligns with the
> >> direction of the project. I would like to have the functionality to allow
> >> subscribers to get only specific messages that are published.
> >>
> >
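
For comparison, the single-queue approach described in the quoted thread (the aggregator iterates over subscribers and offers each one every activity, and each subscriber applies its own filter) might look roughly like the sketch below. The names `FilteringAggregator`, `Subscriber`, and `dispatch` are made up for illustration, `java.util.function.Predicate` is only a stand-in for the filter interface mentioned in the thread, and a `String` again stands in for an activity entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of aggregator-driven, subscriber-side filtering. */
class FilteringAggregator {

    /** A subscriber keeps only the activities its own filter accepts. */
    static class Subscriber {
        private final Predicate<String> filter;
        private final List<String> received = new ArrayList<>();

        Subscriber(Predicate<String> filter) {
            this.filter = filter;
        }

        void offer(String activity) {
            if (filter.test(activity)) {
                received.add(activity);
            }
        }

        List<String> received() {
            return received;
        }
    }

    private final List<Subscriber> subscribers = new ArrayList<>();

    void register(Subscriber subscriber) {
        subscribers.add(subscriber);
    }

    /** Offer every queued activity to every subscriber: O(activities x subscribers). */
    void dispatch(List<String> queuedActivities) {
        for (String activity : queuedActivities) {
            for (Subscriber subscriber : subscribers) {
                subscriber.offer(activity);
            }
        }
    }
}
```

The nested loop in `dispatch` is where the bottleneck mentioned in the thread shows up: every subscriber's filter runs against every activity, whether or not the subscriber cares about that publisher.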
