Apologies for the multiple posts, but as the above links no longer show the problem, you can replicate as follows:
Go to http://search.twitter.com/search?rpp=100&page=1&geocode=-40.900557,174.885971,1000km

Note how long ago the last tweet on that page was posted. Click 'Older' at the bottom. The first tweets on that page are much newer than the last ones on page 1.

On Nov 3, 3:17 pm, TripleM <stephenmerri...@gmail.com> wrote:
> I've been trying to write a script to use the max_id parameter to loop
> through all 15 pages of results (with 100 results per page) without
> getting into trouble with grabbing the same tweet multiple times.
>
> Every time I do so, I find that not only are there a couple of
> duplicates on page 1 and 2, but also that the last tweet on page 1 is
> well further into the future, and has a lower ID, than a bunch of
> tweets on page 2.
>
> For example, consider these two, both with the same max_id but page =
> 1 and page = 2 respectively:
>
> http://search.twitter.com/search?rpp=100&page=1&geocode=-40.900557,17...
> http://search.twitter.com/search?rpp=100&page=2&geocode=-40.900557,17...
>
> (Or if you prefer json links which are what I am actually using, but I
> see the same thing on the above ones which are easier to
> describe: http://search.twitter.com/search.json?q=&rpp=100&geocode=-40.900557,1...
> Request: http://search.twitter.com/search.json?q=&rpp=100&geocode=-40.900557,1...)
>
> The first result on page 2 above was posted about 4 hours before the
> last tweet on page 1. There are also duplicates, e.g. AshleyGray00:
> Fireworks!
>
> I've been trying to figure this bug out for a while as I'm sure I'm
> missing something obvious but I'm completely stumped. Does anyone have
> any clue what is going on here? The only other threads I have found
> are about people trying to combine since_id and max_id which I know is
> not allowed, so I can't find anyone else having similar problems.
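
In case it helps anyone reproduce this from a script rather than by eye, here is a minimal Python sketch of the check described above: walk the pages in order, skip tweets whose ID has already been seen, and flag any tweet on a later page whose ID is higher than the lowest ID on the earlier pages. It is only an illustration under assumptions. `fetch_page` is a hypothetical callable you would supply yourself (for example, one that requests the search.json URL for a given page with a fixed max_id and returns the list of results), and the sketch assumes each result is a dict with an integer `"id"` field.

```python
def collect_pages(fetch_page, max_pages=15):
    """Walk pages 1..max_pages and report duplicates and ordering problems.

    fetch_page is a hypothetical callable: page number -> list of tweet
    dicts, each assumed to carry an integer "id" field.
    """
    seen = set()          # IDs already collected
    tweets = []           # de-duplicated tweets, in the order received
    duplicates = []       # (page, tweet_id) for repeated tweets
    out_of_order = []     # (page, tweet_id) newer than earlier pages' oldest
    lowest_so_far = None  # lowest ID seen on all previous pages

    for page in range(1, max_pages + 1):
        results = fetch_page(page)
        if not results:
            break  # no more results to page through

        page_ids = []
        for tweet in results:
            tweet_id = tweet["id"]
            page_ids.append(tweet_id)
            if tweet_id in seen:
                duplicates.append((page, tweet_id))
                continue
            # With strictly descending results, every ID on this page
            # should be below the lowest ID of the previous pages.
            if lowest_so_far is not None and tweet_id > lowest_so_far:
                out_of_order.append((page, tweet_id))
            seen.add(tweet_id)
            tweets.append(tweet)

        page_low = min(page_ids)
        if lowest_so_far is None or page_low < lowest_so_far:
            lowest_so_far = page_low

    return tweets, duplicates, out_of_order
```

Keeping the duplicates and out-of-order IDs in separate lists, instead of silently dropping them, makes it easier to post the exact offending IDs when reporting the problem.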