On Fri, 11 Feb 2011 13:25:07 -0500, Adam Green <140...@gmail.com> wrote:
Be aware that the streaming API does not deliver everything you are
tracking. In theory it delivers everything up to 1% of the total flow
of tweets. In practice, I find that it delivers about 95% of the
tweets that match your keywords or users. This is fine when sampling,
which is what I generally use it for, but will cause much anguish if
you assume you will get everything sent by people you are following. I
have to admit that I have only found this issue with the streaming
API, but I'm betting that the user streams are based on the same
underlying code.

My solution to the missing values from the streaming API is to collect
everything I can from streaming, then use the REST API to backfill
data I might not have received. If you run the backfill every hour,
you only have to go back to the last set of good tweets, adding
anything you missed.
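
A minimal sketch of that hourly backfill step, under some assumptions: tweets are dicts with an integer "id", `store` holds what streaming already delivered, and `fetch_since` is a hypothetical callable wrapping whatever REST call you use to get tweets newer than a given ID. None of these names come from the Twitter API itself.

```python
from typing import Callable, Dict, Iterable, List

Tweet = Dict[str, object]  # assumed shape: at least an integer "id" key


def backfill(store: Dict[int, Tweet],
             fetch_since: Callable[[int], Iterable[Tweet]],
             last_good_id: int) -> List[int]:
    """Merge tweets newer than last_good_id into store.

    fetch_since is a hypothetical wrapper around a REST request; it is
    injected so this function makes no network calls itself.
    Returns the sorted IDs that streaming had missed.
    """
    added = []
    for tweet in fetch_since(last_good_id):
        tweet_id = int(tweet["id"])
        # Skip anything at or before the last good point, and anything
        # streaming already delivered.
        if tweet_id > last_good_id and tweet_id not in store:
            store[tweet_id] = tweet
            added.append(tweet_id)
    return sorted(added)
```

Keying the store on tweet ID means that running the backfill more often than needed only costs extra requests, not duplicate rows.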

Backfill for keywords is easy - just use Search. But how do you
determine what you *haven't* received from accounts that you're
following? Do you need to grab the most recent 200 tweets from
everyone you're following using REST, or do you do a Search with
"from:xxxx OR from:yyyy ..." as many times as it takes?
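
To make the second option concrete, here is a rough sketch of packing "from:" terms into as few Search queries as possible. The function name and the `max_len` default are assumptions for illustration only; check the Search documentation for the real query length limit before relying on any number.

```python
from typing import Iterable, List


def build_from_queries(screen_names: Iterable[str],
                       max_len: int = 140) -> List[str]:
    """Pack 'from:name' terms, joined with OR, into queries of at most
    max_len characters. max_len is an assumed limit, not a documented one.
    A single term longer than max_len still gets a query of its own.
    """
    queries = []
    current = ""
    for name in screen_names:
        term = f"from:{name}"
        candidate = term if not current else f"{current} OR {term}"
        if current and len(candidate) > max_len:
            # The current query is full: emit it and start a new one.
            queries.append(current)
            current = term
        else:
            current = candidate
    if current:
        queries.append(current)
    return queries
```

The number of queries grows with the total length of the screen names, so for a large following list this is where "as many times as it takes" gets expensive.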

http://twitter.com/znmeb http://borasky-research.net

"A mathematician is a device for turning coffee into theorems." -- Paul Erdős
