It would be helpful if you could give some example output/results
where you are seeing duplicates across pages.  I have spent a long
long time with the Search API and haven't ever had this problem (or
maybe I have and never noticed it).

-Chad

On Wed, Apr 15, 2009 at 9:07 PM, steve <[email protected]> wrote:
>
> I've been using the Search API in a project and it's been working very
> reliably.  So today I decided to add support for pagination so I could
> pull in more results, and I think I've identified a couple of bugs with
> the pagination code.
>
> Bug 1)
>
> The first few results of Page 2 for a query are sometimes duplicates.
> To verify this do the following:
>
>   1. Execute the query: 
> http://search.twitter.com/search.atom?lang=en&q=http&rpp=100
>   2. Grab the "next" link from the results and execute that.
>   3. Compare the IDs at the end of set 1 with the IDs at the
> beginning of set 2.  They sometimes overlap.
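
The comparison in steps 1-3 can be sketched roughly as below.  This is a
minimal sketch, not a confirmed reproduction: it assumes the results are
standard Atom with one <id> per <entry> and a feed-level <link rel="next">,
and the helper names (entry_ids, next_link, duplicated_ids) are made up
for illustration.  It works on feed text you have already downloaded.

```python
import xml.etree.ElementTree as ET

# Assumption: the feed uses the standard Atom namespace.
ATOM = "{http://www.w3.org/2005/Atom}"


def entry_ids(atom_xml):
    """Return the <id> text of every <entry> in the feed, in feed order."""
    root = ET.fromstring(atom_xml)
    return [entry.findtext(ATOM + "id") for entry in root.iter(ATOM + "entry")]


def next_link(atom_xml):
    """Return the href of the feed-level <link rel="next">, or None."""
    root = ET.fromstring(atom_xml)
    for link in root.findall(ATOM + "link"):
        if link.get("rel") == "next":
            return link.get("href")
    return None


def duplicated_ids(page1_xml, page2_xml):
    """Return the IDs that appear in both pages, in page-2 order."""
    seen = set(entry_ids(page1_xml))
    return [i for i in entry_ids(page2_xml) if i in seen]
```

An empty list from duplicated_ids(page1, page2) means the two pages do not
overlap; anything else is the set of IDs worth pasting into a bug report.
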
>
>
> Bug 2)
>
> The second bug may be the cause of the first.  The link you get for
> "next" in a result set is missing the "lang=en" query param, so you
> end up getting non-English items in your result set.  You can manually
> add the "lang=en" param to your query, and while you still get dupes,
> you get fewer.  If you do this, though, you start getting a warning
> in the result set about an adjusted since_id.
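
One way to re-add the missing param without hand-editing URLs is sketched
below.  It is only an illustration of the workaround described above: the
function name carry_over_params is hypothetical, and nothing here is
known about what else the "next" link contains.  It copies the named
params from the original query onto the next URL when they are absent.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def carry_over_params(original_url, next_url, names=("lang",)):
    """Copy the named query params from original_url onto next_url if missing."""
    original = dict(parse_qsl(urlsplit(original_url).query))
    parts = urlsplit(next_url)
    query = parse_qsl(parts.query)
    present = {key for key, _ in query}
    for name in names:
        # Only add a param the original had and the next link dropped.
        if name in original and name not in present:
            query.append((name, original[name]))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Note that this only restores the param; per the report above, the server
may still answer with a warning about an adjusted since_id.
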
>
> What's scarier, though, is that the result set seemed to get weird on me
> if I added the "lang" param and requested pages too fast.  By that I
> mean I would sometimes get results for Page 2 that were (time-wise)
> hours before my original since_id, so my code would just stop
> requesting pages since it assumed it had reached the end of the set.
> The scary part: adding a sleep of around 2 seconds between queries
> seemed to make this issue go away...
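
A paging loop that applies the roughly 2-second pause and also drops
repeated IDs might look like the sketch below.  Assumptions, all of them
mine: fetch is a caller-supplied function returning (ids, next_url) for a
URL, max_pages=15 is an arbitrary safety cap, and the sleep function is
injectable only so the loop can be exercised without actually waiting.

```python
import time


def fetch_all_pages(first_url, fetch, max_pages=15, delay=2.0, sleep=time.sleep):
    """Follow "next" links, pausing between requests and skipping repeated IDs.

    fetch(url) must return (ids, next_url), with next_url None on the last page.
    """
    seen, ordered = set(), []
    url, pages = first_url, 0
    while url and pages < max_pages:
        if pages:
            sleep(delay)  # pause before every request after the first
        ids, url = fetch(url)
        pages += 1
        for i in ids:
            if i not in seen:  # drop IDs already returned by an earlier page
                seen.add(i)
                ordered.append(i)
    return ordered
```

Deduplicating on the client hides the overlap rather than explaining it,
so it is worth logging the skipped IDs if the goal is a bug report.
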
>
>
> In general the pagination stuff with the "next" link doesn't seem very
> reliable to me.  You do seem to get fewer dupes than by just calling
> search and incrementing the page number.  But I'm still seeing dupes,
> results for the wrong language, and sometimes totally weird results.
>
> -steve
>
