One idea off the top of my head: write tweets to something like Lucene, and
then rely on its more sophisticated query engine to pull the tweets matching
each search.  You'll sacrifice some latency here of course, since tweets have
to be indexed before they can be queried.
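
To make the idea concrete, here is a rough sketch in plain Python. It is not
Lucene's API, just a toy inverted index standing in for it; the names
TweetIndex, search, and demultiplex, and the nested-tuple query format, are
all made up for illustration. Each saved search becomes a handful of set
operations on term postings, rather than a regex run against every tweet.

```python
import re
from collections import defaultdict

# Hypothetical tokenizer: lowercase words, hashtags, and @mentions.
TOKEN = re.compile(r"[a-z0-9_#@']+")


def tokenize(text):
    return set(TOKEN.findall(text.lower()))


class TweetIndex:
    """Toy inverted index (a stand-in for a real engine such as Lucene)."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of tweets containing it
        self.tweets = {}                  # tweet id -> original text

    def add(self, tweet_id, text):
        self.tweets[tweet_id] = text
        for term in tokenize(text):
            self.postings[term].add(tweet_id)

    def search(self, query):
        """Evaluate a query given as nested tuples, e.g.
        ("and", ("term", "lucene"), ("not", ("term", "twitter")))."""
        op = query[0]
        if op == "term":
            return set(self.postings.get(query[1].lower(), ()))
        if op == "and":
            return set.intersection(*[self.search(q) for q in query[1:]])
        if op == "or":
            return set().union(*[self.search(q) for q in query[1:]])
        if op == "not":
            return set(self.tweets) - self.search(query[1])
        raise ValueError("unknown operator: %r" % (op,))


def demultiplex(index, searches):
    """Map each saved search name to the sorted ids of matching tweets."""
    return {name: sorted(index.search(q)) for name, q in searches.items()}
```

In practice you would index tweets in batches as they arrive from the stream
and run the saved searches against each batch, which is where the extra
latency comes from.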


On Fri, Apr 16, 2010 at 3:47 PM, Jeffrey Greenberg wrote:

> So I'm looking at the streaming API (track), and I've got thousands of
> searches.  I mainly need it to deal with terms that are very high volume,
> and to deal with Search API rate limiting.
> The main difficulty I'm thinking about is the best way to de-multiplex
> the stream back into the individual searches I'm trying to accomplish.
> 1. How do you handle searches that are more complex than single
> terms, such as boolean expressions? Do you convert the boolean into
> something like a regex, and then run that regex on every tweet? If
> I have several thousand regexes and thousands of tweets, that's a huge
> amount of processing just to demultiplex. Is that the way to go?
> 2. And if the search is just a simple expression, do folks simply
> demultiplex by doing a string search for each word in the search for
> every received tweet, like above?
> I'm looking for recommended ways to demultiplex the search stream...
> Thanks,
> jeffrey greenberg