[twitter-dev] Streaming API vs. Search API: no API returns >95% of intended tweets

Karussell Tue, 15 Feb 2011 06:50:51 -0800

Hi,

This problem was already posted to the twitter4j mailing list [1]. I'm
not sure whether it is an issue with my code, with twitter4j, or with
the API itself... other users have reported similar problems in the
past [2].

First:

I'm running a 100-tweet search (without paging) every 5 minutes, e.g.
for 'twitter search'. This gives me a set of tweets A (excluding the
duplicates, of course). I get approx. 5 new tweets every 5 minutes,
so a pageSize of 100 should be more than sufficient to get all
tweets.
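
To make the polling step concrete, here is a minimal sketch of how set A
could be accumulated across polls. It is not my actual code: the helper
name merge_poll and the sample IDs are made up for illustration, and each
page is assumed to be just a list of tweet IDs.

```python
def merge_poll(seen_ids, page):
    """Add the tweet IDs of one search page to seen_ids.

    Returns the IDs that were not seen in any earlier poll.
    merge_poll is a hypothetical helper, not part of twitter4j.
    """
    new_ids = []
    for tweet_id in page:
        if tweet_id not in seen_ids:
            seen_ids.add(tweet_id)
            new_ids.append(tweet_id)
    return new_ids


# Two overlapping polls with made-up IDs: only unseen IDs count as new.
seen = set()
first = merge_poll(seen, [101, 102, 103])
second = merge_poll(seen, [102, 103, 104, 105])
print(len(first), len(second), len(seen))
```

With roughly 5 new tweets per poll and a page of 100, consecutive pages
overlap heavily, so this kind of de-duplication should not lose anything
on its own.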

Second:
When I run a streaming filter request for the same terms, 'twitter
search', I get a set of tweets B.

The problem is: combining A and B ('C = A union B') gives me a set C
whose size is more than 10% larger than that of A or B, which means
that with neither the search nor the streaming API can I catch a
nearly complete set of tweets.

E.g. doing this for 3 hours, I get 254 tweets (A) from the search
and 257 tweets (B) from the streaming, but the combined set C has 337
tweets!
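
The numbers above can be turned into coverage figures with a bit of set
arithmetic. This is only a sketch: the function name coverage is made up,
and it assumes that C really is the union of A and B, so that the overlap
is |A| + |B| - |C|.

```python
def coverage(size_a, size_b, size_union):
    """Derive overlap and per-source coverage from the three set sizes.

    Assumes size_union is the size of the union of A and B.
    """
    overlap = size_a + size_b - size_union
    return {
        "overlap": overlap,                    # tweets seen by both APIs
        "only_a": size_a - overlap,            # seen by search only
        "only_b": size_b - overlap,            # seen by streaming only
        "coverage_a": size_a / size_union,     # share of C found by search
        "coverage_b": size_b / size_union,     # share of C found by streaming
    }


stats = coverage(254, 257, 337)
print(stats)
```

For the 3-hour run this gives an overlap of 174 tweets, 80 tweets seen
only by the search and 83 seen only by the streaming, i.e. each API
returned only about 75-76% of C. Since C itself may be incomplete, these
percentages are upper bounds on the real coverage.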

Is this a bug in my code or could this be an API issue?

BTW: I don't expect 100% completeness, I only want something above
90% :) especially for such relatively infrequent terms, where users
can notice missing tweets and already have.

Regards,
Peter.

[1]
http://groups.google.com/group/twitter4j/msg/d959e6257ceb452f

[2]
http://groups.google.com/group/twitter-development-talk/browse_thread/thread/71ab5cc666113c9e

http://blog.tweetsmarter.com/twitter-downtime/twitters-dirty-secret-they-dont-show-you-all-tweets/

http://jetwick.com Twitter Search without Noise

