It would be helpful if you could give some example output/results where you are seeing duplicates across pages. I have spent a long, long time with the Search API and have never had this problem (or maybe I have and never noticed it).
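Something along these lines would produce the kind of output I mean. It is only a rough sketch: it assumes the two result pages have already been fetched and saved as Atom XML, and the function names `entry_ids` and `duplicated_ids` are made up for illustration, not part of any Twitter library.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace, in ElementTree's "{uri}tag" notation.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def entry_ids(atom_xml):
    """Return the <id> text of every <entry> in an Atom feed, in document order."""
    root = ET.fromstring(atom_xml)
    return [entry.findtext(ATOM_NS + "id") for entry in root.iter(ATOM_NS + "entry")]


def duplicated_ids(page1_xml, page2_xml):
    """Return the IDs on page 2 that already appeared on page 1, in page-2 order."""
    seen = set(entry_ids(page1_xml))
    return [entry_id for entry_id in entry_ids(page2_xml) if entry_id in seen]
```

Calling `duplicated_ids(page1, page2)` on the two saved responses and pasting the resulting list, together with the exact URLs requested, would make the overlap easy to look into.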
-Chad

On Wed, Apr 15, 2009 at 9:07 PM, steve <[email protected]> wrote:
>
> I've been using the Search API in a project and it's been working very
> reliably. So today I decided to add support for pagination so I could
> pull in more results and I think I've identified a couple of bugs with
> the pagination code.
>
> Bug 1)
>
> The first few results of Page 2 for a query are sometimes duplicates.
> To verify this do the following:
>
> 1. Execute the query:
> http://search.twitter.com/search.atom?lang=en&q=http&rpp=100
> 2. Grab the "next" link from the results and execute that.
> 3. Compare the IDs at the end of set one with the IDs at the
> beginning of set 2. They sometimes overlap.
>
>
> Bug 2)
>
> The second bug may be the cause of the 1st bug. The link you get for
> "next" in a result set is missing the "lang=en" query param. So you
> end up getting non-English items in your result set. You can manually
> add the "lang=en" param to your query and while you still get dupes
> you get fewer. If you do this though you then start getting a warning
> in the result set about an adjusted since_id.
>
> What's scarier though is that the result set seemed to get weird on me
> if I added the "lang" param and requested pages too fast. By that I
> mean I would sometimes get results for Page 2 that were (time wise)
> hours before my original Since ID so my code would just stop
> requesting pages since it assumed it had reached the end of the set.
> The scary part... Adding around a 2-second sleep between queries
> seemed to make this issue go away...
>
>
> In general the pagination stuff with the "next" link doesn't seem very
> reliable to me. You do seem to get fewer dupes than just calling
> search and incrementing the page number. But I'm still seeing dupes,
> results for the wrong language, and sometimes totally weird results.
>
> -steve
>
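
On Bug 2, the workaround described above (manually re-adding "lang=en" to the "next" link) could be sketched roughly as follows. The helper name `carry_over_params` is hypothetical, and the thread does not show what a real "next" link looks like, so any URL passed as `next_url` is an assumption. Per the report above, this reduces but does not eliminate the dupes, and it can trigger the adjusted since_id warning.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def carry_over_params(original_url, next_url, keys=("lang",)):
    """Copy the named query params from original_url into next_url if missing there."""
    original = dict(parse_qsl(urlsplit(original_url).query, keep_blank_values=True))
    parts = urlsplit(next_url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    present = {name for name, _ in params}
    for key in keys:
        # Only add a param that the original query had and the "next" link dropped.
        if key in original and key not in present:
            params.append((key, original[key]))
    return urlunsplit(parts._replace(query=urlencode(params)))
```

Params already present in the "next" link are left untouched, so the helper is safe to apply to every page request.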
