[twitter-dev] Re: Search API: max_id and page parameters giving very weird results
Except now that I look at it a day later, the results have completely changed and seem to be in order. Why would the results change over time when the same max_id is set and was valid at the time of the query? Are the IDs of tweets not generated in ascending order?

On Nov 3, 3:17 pm, TripleM stephenmerri...@gmail.com wrote:

I've been trying to write a script that uses the max_id parameter to loop through all 15 pages of results (with 100 results per page) without getting into trouble with grabbing the same tweet multiple times. Every time I do so, I find that not only are there a couple of duplicates on pages 1 and 2, but also that the last tweet on page 1 is well further into the future, and has a lower ID, than a bunch of tweets on page 2.

For example, consider these two requests, both with the same max_id but with page = 1 and page = 2 respectively:

http://search.twitter.com/search?rpp=100&page=1&geocode=-40.900557,17...
http://search.twitter.com/search?rpp=100&page=2&geocode=-40.900557,17...

(Or, if you prefer, the JSON links, which are what I am actually using, but I see the same thing on the above ones, which are easier to describe:

http://search.twitter.com/search.json?q=&rpp=100&geocode=-40.900557,1...

Request: http://search.twitter.com/search.json?q=&rpp=100&geocode=-40.900557,1...)

The first result on page 2 above was posted about 4 hours before the last tweet on page 1. There are also duplicates, e.g. AshleyGray00: Fireworks!

I've been trying to figure this bug out for a while; I'm sure I'm missing something obvious, but I'm completely stumped. Does anyone have any clue what is going on here? The only other threads I have found are about people trying to combine since_id and max_id, which I know is not allowed, so I can't find anyone else having similar problems.
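The symptoms described above (duplicates across pages, and a page starting with a tweet ID that is not lower than where the previous page ended) can be checked mechanically. This is a minimal diagnostic sketch, not anything from the Search API itself: `check_pages` is a hypothetical helper that takes the tweet IDs from consecutive result pages (e.g. the `id` fields from the JSON responses) and reports both problems.

```python
def check_pages(pages):
    """pages: a list of lists of tweet IDs, one inner list per result page.

    With a fixed max_id, results should be strictly descending by ID, and
    each page should start below where the previous page ended. Returns
    (duplicate_ids, page_numbers_that_start_out_of_order).
    """
    seen = set()
    duplicates = []
    out_of_order = []
    prev_last = None
    for page_num, ids in enumerate(pages, start=1):
        for tweet_id in ids:
            if tweet_id in seen:
                duplicates.append(tweet_id)
            seen.add(tweet_id)
        if prev_last is not None and ids and ids[0] >= prev_last:
            # First ID on this page is not older than the last ID on the
            # previous page -- the symptom reported in this thread.
            out_of_order.append(page_num)
        if ids:
            prev_last = ids[-1]
    return duplicates, out_of_order
```

For example, `check_pages([[10, 9, 8], [9, 5, 4]])` flags ID 9 as a duplicate and page 2 as starting out of order, which is the same shape of anomaly as the page-1/page-2 results linked above.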
[twitter-dev] Re: Search API: max_id and page parameters giving very weird results
Apologies for the multiple posts, but as the above links no longer show the problem, you can replicate it as follows:

Go to http://search.twitter.com/search?rpp=100&page=1&geocode=-40.900557,174.885971,1000km and note how long ago the last tweet on that page was posted. Then click 'Older' at the bottom. The first tweets on that page are much newer than the last ones on page 1.
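One way to sidestep the page parameter entirely is to walk max_id downward yourself: after each request, set max_id to one less than the lowest ID you received, so the next request can only return strictly older tweets and duplicates become impossible. This is a sketch under assumptions, not confirmed Search API behavior; `fetch` stands in for whatever function actually issues the search request and returns a newest-first list of tweet IDs.

```python
def paginate_by_max_id(fetch, count=100, max_pages=15):
    """Collect up to max_pages batches by walking max_id downward,
    avoiding the page parameter altogether."""
    results = []
    max_id = None
    for _ in range(max_pages):
        batch = fetch(max_id=max_id, count=count)
        if not batch:
            break
        results.extend(batch)
        # Next request: only tweets strictly older than the oldest seen.
        max_id = min(batch) - 1
    return results

# A fake, purely local stand-in for the Search API, for illustration only:
# 1000 tweet IDs, newest first.
data = list(range(1000, 0, -1))

def fake_fetch(max_id=None, count=100):
    eligible = [i for i in data if max_id is None or i <= max_id]
    return eligible[:count]
```

Against the fake data, `paginate_by_max_id(fake_fetch, count=100, max_pages=5)` returns 500 IDs with no duplicates and in strictly descending order, regardless of how the backing store shifts between "pages", since each request is anchored to an ID rather than a page offset.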