So, it turns out that I was an order of magnitude off in the numbers I gave earlier in this thread: we receive 500,000 tweets/day, not 50,000.
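
For reference, here is a rough sketch of the arithmetic from the quoted messages, set against the corrected figure. The variable and function names are mine, and the 50% US share is only the guess made in the thread, not a measured value.

```python
# Back-of-the-envelope check of the figures quoted in this thread.
# All inputs are numbers mentioned in the thread; the US share is a guess
# made there, not a measurement.

FIREHOSE_PER_DAY = 140_000_000   # "140M average per day now"
GEO_SHARE_RANGE = (0.03, 0.04)   # "3-4% of the total tweets are geo-tagged"
US_SHARE_GUESS = 0.5             # assumption taken from the thread
STREAM_CAP_SHARE = 0.01          # "around 1% of firehose"
OBSERVED_PER_DAY = 500_000       # corrected figure for the US bounding box


def per_day(total: int, share: float) -> int:
    """Return the tweets/day that a given share of `total` represents."""
    return round(total * share)


# Geo-tagged tweets/day at the low and high end of the quoted range.
geo_low, geo_high = (per_day(FIREHOSE_PER_DAY, s) for s in GEO_SHARE_RANGE)
# Portion of those assumed to come from the US.
us_low, us_high = (per_day(g, US_SHARE_GUESS) for g in (geo_low, geo_high))
# Approximate volume at which the streaming API starts sending 'limit' notices.
stream_cap = per_day(FIREHOSE_PER_DAY, STREAM_CAP_SHARE)

print(f"geo-tagged tweets/day: {geo_low:,} to {geo_high:,}")
print(f"estimated US geo-tagged tweets/day: {us_low:,} to {us_high:,}")
print(f"approximate streaming cap (1% of firehose): {stream_cap:,}")
print(f"observed: {OBSERVED_PER_DAY:,} "
      f"({OBSERVED_PER_DAY / us_high:.0%} to {OBSERVED_PER_DAY / us_low:.0%} "
      f"of the estimate)")
```

Under these assumptions, 500,000 tweets/day is still well below the estimate for US geo-tagged tweets, and also below 1% of the firehose.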

On Apr 1, 3:49 pm, Colin Surprenant <[email protected]> wrote:
> Well, first, In the Gnip Power Track documentation
> http://docs.gnip.com/w/page/35663947/Power-Track at the "has:geo"
> section they say <<Currently, 'has:geo' is about 2-4% of the full
> firehose>>.
>
> Also, I ran some tests a few weeks ago to see the difference in
> content between the search api and the streaming api for equivalent
> geolocalized searches. See this thread
> http://groups.google.com/group/twitter-development-talk/browse_thread...
>
> My results showed that the streaming API returns a very small fraction
> (3% in my tests) of what the search API returns. This is because the
> streaming API only uses the geotagging API to locate tweets, but the
> search API uses both the geotagging API and the user location field.
>
> For example, I can get around 250 000 tweets/day for San Francisco
> using the search api but the streaming api will return around 7000
> tweets/day.
>
> At 7000 tweets/day for San Francisco, 50 000 for the whole US seems
> small.
>
> Colin
>
> On Apr 1, 2:40 pm, Augusto Santos <[email protected]> wrote:
> > Sorry Colin, but where did you get this information? Doesn't match with the
> > reality. Not at all.
> >
> > On Fri, Apr 1, 2011 at 12:35 PM, Colin Surprenant <[email protected]> wrote:
> > > As a side note, currently only 3-4% of the total tweets (firehose) are
> > > geo-tagged and are eligible to be selected in a stream location
> > > bounding box. If the current firehose rate is about 140M tweets/day,
> > > that makes ~5M eligible tweets/day.
> > >
> > > I do not know what the proportion of tweets from the US is but I would
> > > think 50% seem reasonable and would result in ~2.5M tweets/day. Even
> > > if we lower that proportion, your 50 000 tweets/day seems way off.
> > >
> > > There are 3 possibilities, 1) you are being rate limited more than you
> > > think, 2) your bounding box is wrong or 3) your bounding box is too
> > > large and Twitter has reduced it somehow. I remember I read somewhere
> > > in the api doc that each bounding box could not be more than 1 degree
> > > square "enough to cover most metropolitan areas" - but I cannot find
> > > that back.
> > >
> > > Colin
> > >
> > > On Mar 31, 4:08 pm, Data Gatherer <[email protected]> wrote:
> > > > We have a bounding box set for the United States. Even though it's a
> > > > large box, we only receive about 50,000 tweets a day. However, I see
> > > > that we get rate limited at least once a week already. The box is
> > > > large, but the number of matching results is fairly low. Knowing how
> > > > the rate limiting works more specifically would be important when
> > > > trying to gather data for other projects (more bounding boxes, other
> > > > keywords).
> > > >
> > > > On Mar 31, 3:50 pm, Jeremy Dunck <[email protected]> wrote:
> > > > > On Thu, Mar 31, 2011 at 2:48 PM, Augusto Santos <[email protected]> wrote:
> > > > > > No it won't. Streaming has rate limit with around 1% of firehose, if your
> > > > > > search term is too much generic.
> > > > > > If your search term or bounding box get too many tweets, you will start
> > > > > > receive 'limit' status message as doc said.
> > > > > > http://dev.twitter.com/pages/streaming_api_concepts#parsing-responses
> > > > >
> > > > > Sure, I understand that, I just meant to say that 1% of all tweets is
> > > > > a lot (140M average per day now).
> > > > >
> > > > > If your terms are not very general, you have a lot of head room.
> > >
> > > --
> > > Twitter developer documentation and resources: http://dev.twitter.com/doc
> > > API updates via Twitter: http://twitter.com/twitterapi
> > > Issues/Enhancements Tracker:
> > > http://code.google.com/p/twitter-api/issues/list
> > > Change your membership to this group:
> > > http://groups.google.com/group/twitter-development-talk
> >
> > --
> > 氣
