I am interested in gathering tweets from a particular geographic region, currently Nigeria. Initially I ran queries that used the coordinates of Abuja, the capital, and asked for tweets within 400 miles. This covers most of the country save the far northeastern corner. This gave me 5-6K tweets a day. Since Nigeria has just reached 1M Facebook users, and taking that as an indicator, I expected much more data.
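A rough way to sanity-check that coverage is to compute great-circle distances from Abuja to a few cities. This is only a sketch: the coordinates below are approximate values assumed for illustration, and `haversine_miles` and `within_radius` are helper names made up for this example, not part of any Twitter API.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def within_radius(center, point, radius_miles):
    """True if `point` lies within `radius_miles` of `center`."""
    return haversine_miles(center[0], center[1], point[0], point[1]) <= radius_miles

# Approximate coordinates (assumed for illustration).
ABUJA = (9.0765, 7.3986)
CITIES = {
    "Kano": (12.0022, 8.5920),
    "Maiduguri": (11.8333, 13.1500),  # far northeast
}

if __name__ == "__main__":
    for name, coords in CITIES.items():
        dist = haversine_miles(ABUJA[0], ABUJA[1], coords[0], coords[1])
        print(f"{name}: {dist:.0f} miles from Abuja, inside 400: {within_radius(ABUJA, coords, 400)}")
```

With these coordinates Kano falls well inside the 400-mile circle and Maiduguri falls outside it, which matches the gap in the northeast.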
Next I tried a query that asked for tweets within 50 miles of Lagos, the largest city in Nigeria, with a population of over 9 million, and I got 12-15K tweets a day. A query asking for tweets within 10 miles of Lagos gave me 5K tweets a day. Both these numbers still seem low, but they are an improvement nonetheless. Lagos was within the 400 mile radius around Abuja, so it's interesting that the much wider 400 mile query gave me less data than the 50 mile one, while going from 10 to 50 miles gave me more data. Currently I'm querying a number of the larger cities in Nigeria, in each case using a radius of 40-50 miles, and am getting 30K tweets a day. I'm assuming that I am still missing a lot of data.

My questions:

First, how does radius affect the query? 400 miles was clearly too wide a radius. 50 miles gave me more tweets than using 400 miles, but dropping to 10 miles gave me fewer. Any explanations for this behavior?

Secondly, what is the best way to get tweets from a region? I'm not convinced I am going about it in the best way.

Third, is there ground truth data for the number of Twitter users and "tweet-rate" by country? It would be great to know just how many tweets per day to expect.

My queries page through up to 15 pages at 100 tweets a page, and I stop paging if I get no new tweets. I then wait for a period of time, 10 minutes for Lagos and Abuja and an hour for more sparsely populated locations. I then start paging again with the since_id argument set to the id of the last tweet I got. There may be some tweaking I can do to the wait times, but I would expect that it would only provide marginal benefit.

Thanks,
Clay
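
P.S. In case it helps to see the polling logic spelled out, here is a minimal sketch of one round of the paging described above. `fetch_page` is a hypothetical stand-in for whatever function actually issues the search request and returns a list of tweets (each a dict with an integer `id`). It is not a real client library call. The sketch also takes "the last tweet I got" to mean the highest id seen so far, which is an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple

Tweet = Dict[str, int]  # minimal stand-in: a dict with an integer "id"

def poll_once(
    fetch_page: Callable[[int, Optional[int]], List[Tweet]],
    since_id: Optional[int] = None,
    max_pages: int = 15,
) -> Tuple[List[Tweet], Optional[int]]:
    """Run one round of paging; return (new tweets, updated since_id)."""
    collected: List[Tweet] = []
    seen = set()
    for page in range(1, max_pages + 1):
        tweets = fetch_page(page, since_id)  # hypothetical request helper
        fresh = [t for t in tweets if t["id"] not in seen]
        if not fresh:
            break  # stop paging once a page yields nothing new
        seen.update(t["id"] for t in fresh)
        collected.extend(fresh)
    if collected:
        # Assumption: use the highest id seen as the next since_id.
        since_id = max(t["id"] for t in collected)
    return collected, since_id
```

The caller would sleep between rounds (10 minutes or an hour in my setup) and pass the returned `since_id` into the next call. If a round comes back with the full 15 pages, it may have been cut off by the page limit, which would suggest the wait for that location is too long.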
