Hi All,

We have been noticing gaps in search results at times, particularly with geocoded searches. For example, take this search over south-eastern Australia:

http://search.twitter.com/search.json?rpp=100&lang=en&page=1&geocode=-35.2,144.0,1000km

It occasionally returns results with large gaps between the created_at values of the tweets. For example, I just got these created_at values for the tweets returned:

...
Tue, 17 Nov 2009 22:59:50 +0000
Tue, 17 Nov 2009 22:59:49 +0000
Tue, 17 Nov 2009 22:59:48 +0000
Tue, 17 Nov 2009 22:59:43 +0000
Tue, 17 Nov 2009 22:52:04 +0000
Tue, 17 Nov 2009 22:52:04 +0000
Tue, 17 Nov 2009 22:51:34 +0000
Tue, 17 Nov 2009 22:50:21 +0000
Tue, 17 Nov 2009 22:50:20 +0000
Tue, 17 Nov 2009 22:43:37 +0000
...

This area is producing multiple tweets per second, but there are gaps there that are many minutes long. A subsequent search, tens of seconds to a few minutes later, 'fills in these gaps' with many more tweets from the periods inside these minutes-long gaps, confirming that the initial search was in fact sparse.
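For anyone wanting to check their own result sets for this, here is a rough sketch of how the gaps could be flagged. It assumes the created_at strings are in the format shown above and that results are ordered newest first; the function name find_gaps and the 60 second threshold are just my own choices for illustration.

```python
from email.utils import parsedate_to_datetime


def find_gaps(created_at_values, threshold_seconds=60):
    """Return (older, newer, seconds) for each pair of adjacent
    tweets whose created_at values differ by more than the threshold.

    Assumes the list is ordered newest first, as in the sample above.
    """
    times = [parsedate_to_datetime(value) for value in created_at_values]
    gaps = []
    for newer, older in zip(times, times[1:]):
        seconds = (newer - older).total_seconds()
        if seconds > threshold_seconds:
            gaps.append((older, newer, seconds))
    return gaps


# The created_at values quoted in this post.
sample = [
    "Tue, 17 Nov 2009 22:59:50 +0000",
    "Tue, 17 Nov 2009 22:59:49 +0000",
    "Tue, 17 Nov 2009 22:59:48 +0000",
    "Tue, 17 Nov 2009 22:59:43 +0000",
    "Tue, 17 Nov 2009 22:52:04 +0000",
    "Tue, 17 Nov 2009 22:52:04 +0000",
    "Tue, 17 Nov 2009 22:51:34 +0000",
    "Tue, 17 Nov 2009 22:50:21 +0000",
    "Tue, 17 Nov 2009 22:50:20 +0000",
    "Tue, 17 Nov 2009 22:43:37 +0000",
]

for older, newer, seconds in find_gaps(sample):
    print(f"gap of {seconds:.0f}s between {older} and {newer}")
```

On the sample above this flags three gaps, of 459, 73 and 403 seconds, in an area that normally produces several tweets per second.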

The same effect also appears in the normal web search interface.

Previously it was possible to take the maximum id of the tweets from the previous result set and page through new results only until you saw that id again. With results appearing out of order, this method no longer works: you would have to search across all 15 pages x 100 rpp continuously to get anything approaching a complete result set. Even then it will not always be possible if ~1500 results 'appear at once'. This is not sustainable, either for the app doing the searching or in terms of the many additional requests that would hit Twitter's servers.
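To make the difference concrete, here is a small sketch comparing the two approaches on made-up data. It assumes each result is a dict with an "id" field and that results arrive newest first; the function names and the ids are hypothetical, and no requests are actually made.

```python
def new_by_max_id(results, max_id):
    """Old approach: walk the results until we reach an id we have
    already seen, and stop there. Relies on results being complete
    and in order."""
    fresh = []
    for tweet in results:
        if tweet["id"] <= max_id:
            break
        fresh.append(tweet)
    return fresh


def new_by_seen_ids(results, seen_ids):
    """Fallback: check every result against the set of ids seen so
    far. Catches late arrivals, but every page has to be fetched."""
    fresh = [tweet for tweet in results if tweet["id"] not in seen_ids]
    seen_ids.update(tweet["id"] for tweet in fresh)
    return fresh


# First search: ids 8 and 7 exist but are missing from the results.
first = [{"id": 10}, {"id": 9}, {"id": 5}]
# A later search: 11 is new, and 8 and 7 have now 'filled in'.
second = [{"id": 11}, {"id": 10}, {"id": 9}, {"id": 8}, {"id": 7}, {"id": 5}]

max_id = max(tweet["id"] for tweet in first)
seen = {tweet["id"] for tweet in first}

print([t["id"] for t in new_by_max_id(second, max_id)])
print([t["id"] for t in new_by_seen_ids(second, seen)])
```

The max id version only picks up id 11 and never sees 8 and 7, while the seen-ids version finds all three, at the cost of scanning the whole result set on every poll.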

Is this a problem that will be resolved, or, moving forward, are we going to continue to get search results appearing out of order relative to their creation date like this?

Thanks,

JB.
