I know it's not Web 2.0-cool, but I'm writing to SQL Server 2008 (Standard, x64) and using its fulltext indexing/search from there. On production hardware I see hardly any real latency impact, even on busy predicates. I can't imagine the lighter-weight, more efficient Lucene would add any perceptible latency either.
∞ Andy Badera
∞ +1 518-641-1280 Google Voice
∞ This email is: [ ] bloggable [x] ask first [ ] private
∞ Google me: http://www.google.com/search?q=andrew%20badera

On Fri, Apr 16, 2010 at 6:59 PM, Mark McBride <mmcbr...@twitter.com> wrote:
> One idea off the top of my head: write tweets to something like Lucene, and
> then rely on its more sophisticated query engine to pull tweets. You'll
> sacrifice some latency here, of course.
>
> ---Mark
>
> http://twitter.com/mccv
>
> On Fri, Apr 16, 2010 at 3:47 PM, Jeffrey Greenberg
> <jeffreygreenb...@gmail.com> wrote:
>>
>> So I'm looking at the streaming api (track), and I've got thousands of
>> searches. (http://tweettronics.com) I mainly need it to deal with
>> terms that are very high volume, and to deal with search api rate limiting.
>>
>> The main difficulty I'm thinking about is the best way to demultiplex
>> the stream back into the individual searches I'm trying to run.
>>
>> 1. How do you handle it if a search is not a single term but a boolean
>> expression? Do you convert the boolean into something like a regex, and
>> then run that regex on every tweet? If I have several thousand regexes
>> and thousands of tweets, that's a huge amount of processing just to
>> demultiplex. But is that the way to go?
>> 2. And if the search is just a simple expression, do folks simply
>> demultiplex by doing a string search for each word of the search in
>> every received tweet, like above?
>>
>> I'm looking for recommended ways to demultiplex the search stream...
>>
>> Thanks,
>> jeffrey greenberg
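For what it's worth, one common way to avoid running every search against every tweet is to invert the problem: index the searches by their terms, then use the tweet's tokens to look up only the candidate searches that could possibly match. Below is a minimal sketch of that idea (AND-of-terms searches only, all names hypothetical, not code from anyone in this thread):

```python
from collections import defaultdict
import re

class Demultiplexer:
    """Hypothetical sketch: route incoming tweets to subscribed searches.

    Each search is modeled as a set of required terms (AND semantics).
    An inverted index (term -> search ids) limits the full boolean check
    to searches that share at least one term with the tweet, instead of
    evaluating every search/regex against every tweet.
    """

    def __init__(self):
        self.searches = {}             # search_id -> frozenset of required terms
        self.index = defaultdict(set)  # term -> set of search_ids

    def add_search(self, search_id, terms):
        required = frozenset(t.lower() for t in terms)
        self.searches[search_id] = required
        for term in required:
            self.index[term].add(search_id)

    def match(self, tweet_text):
        tokens = set(re.findall(r"\w+", tweet_text.lower()))
        # Candidate generation: only searches sharing a term are considered.
        candidates = set()
        for token in tokens:
            candidates |= self.index.get(token, set())
        # Full verification: all required terms must be present.
        return [sid for sid in candidates if self.searches[sid] <= tokens]

demux = Demultiplexer()
demux.add_search("s1", ["coffee", "espresso"])
demux.add_search("s2", ["coffee"])
sorted(demux.match("I love coffee and espresso"))  # → ['s1', 's2']
demux.match("espresso shots only")                 # → [] (s1 also needs "coffee")
```

OR-expressions can be handled the same way by indexing each OR-branch as its own entry, and the candidate step stays cheap even with thousands of searches, since cost scales with tokens per tweet rather than with the number of searches.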