Well, first, the "has:geo" section of the Gnip Power Track documentation (http://docs.gnip.com/w/page/35663947/Power-Track) says: "Currently, 'has:geo' is about 2-4% of the full firehose".
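As a rough sanity check on what that share means in absolute numbers, here is a small Python sketch of the arithmetic. It assumes the ~140M tweets/day firehose rate quoted further down in this thread and the 2-4% range from the Gnip docs. The 50% US share is only my guess from my earlier message, and the function name is made up for illustration.

```python
# Back-of-the-envelope estimate, not measured data.
FIREHOSE_PER_DAY = 140_000_000  # ~140M tweets/day, figure quoted in this thread
US_SHARE = 0.5                  # assumption: a guess, not a published number


def eligible_range(total, low_share, high_share):
    """Return (low, high) daily tweet counts for a share range of the total."""
    return round(total * low_share), round(total * high_share)


# Geo-tagged tweets per day, using the 2-4% "has:geo" range from the Gnip docs.
low, high = eligible_range(FIREHOSE_PER_DAY, 0.02, 0.04)
print(f"geo-tagged tweets/day: {low:,} to {high:,}")

# Portion that would fall inside a US bounding box under the assumed US share.
us_low, us_high = round(low * US_SHARE), round(high * US_SHARE)
print(f"estimated US geo-tagged tweets/day: {us_low:,} to {us_high:,}")
```

Even the low end of that estimate is far above the 50 000 tweets/day reported for the US bounding box.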
Also, I ran some tests a few weeks ago to compare the content returned by the Search API and the Streaming API for equivalent geolocalized searches. See this thread:
http://groups.google.com/group/twitter-development-talk/browse_thread/thread/a4bf3b7c6373657b#

My results showed that the Streaming API returns a very small fraction (about 3% in my tests) of what the Search API returns. This is because the Streaming API only uses the geotagging API to locate tweets, whereas the Search API uses both the geotagging API and the user location field.

For example, I can get around 250 000 tweets/day for San Francisco using the Search API, but the Streaming API returns only around 7000 tweets/day.

At 7000 tweets/day for San Francisco alone, 50 000 for the whole US seems small.

Colin

On Apr 1, 2:40 pm, Augusto Santos <augu...@gemeos.org> wrote:
> Sorry Colin, but where did you get this information? Doesn't match with the
> reality. Not at all.
>
> On Fri, Apr 1, 2011 at 12:35 PM, Colin Surprenant <colin.surpren...@gmail.com> wrote:
> > As a side note, currently only 3-4% of the total tweets (firehose) are
> > geo-tagged and are eligible to be selected in a stream location
> > bounding box. If the current firehose rate is about 140M tweets/day,
> > that makes ~5M eligible tweets/day.
> >
> > I do not know what the proportion of tweets from the US is but I would
> > think 50% seem reasonable and would result in ~2.5M tweets/day. Even
> > if we lower that proportion, your 50 000 tweets/day seems way off.
> >
> > There are 3 possibilities, 1) you are being rate limited more than you
> > think, 2) your bounding box is wrong or 3) your bounding box is too
> > large and Twitter has reduced it somehow. I remember I read somewhere
> > in the api doc that each bounding box could not be more than 1 degree
> > square "enough to cover most metropolitan areas" - but I cannot find
> > that back.
> >
> > Colin
> >
> > On Mar 31, 4:08 pm, Data Gatherer <gatherer...@gmail.com> wrote:
> > > We have a bounding box set for the United States. Even though it's a
> > > large box, we only receive about 50,000 tweets a day. However, I see
> > > that we get rate limited at least once a week already. The box is
> > > large, but the number of matching results is fairly low. Knowing how
> > > the rate limiting works more specifically would be important when
> > > trying to gather data for other projects (more bounding boxes, other
> > > keywords).
> > >
> > > On Mar 31, 3:50 pm, Jeremy Dunck <jdu...@gmail.com> wrote:
> > > > On Thu, Mar 31, 2011 at 2:48 PM, Augusto Santos <augu...@gemeos.org> wrote:
> > > > > No it won't. Streaming has rate limit with around 1% of firehose, if your
> > > > > search term is too much generic.
> > > > > If your search term or bounding box get too many tweets, you will start
> > > > > receive 'limit' status message as doc said.
> > > > > http://dev.twitter.com/pages/streaming_api_concepts#parsing-responses
> > > >
> > > > Sure, I understand that, I just meant to say that 1% of all tweets is
> > > > a lot (140M average per day now).
> > > >
> > > > If your terms are not very general, you have a lot of head room.
> >
> > --
> > Twitter developer documentation and resources: http://dev.twitter.com/doc
> > API updates via Twitter: http://twitter.com/twitterapi
> > Issues/Enhancements Tracker:
> > http://code.google.com/p/twitter-api/issues/list
> > Change your membership to this group:
> > http://groups.google.com/group/twitter-development-talk
>
> --
> 氣