Matt...  Another thought I just had...

As Chad points out, with my particular query being high volume it's
realistic to think that I'm always going to risk seeing duplicates if
I try to query for results in real time, due to replication lag between
your servers.  But I see how you're using max_id in the paging stuff, and
I don't really need real-time results, so it seems like I should be
able to use an ID that's 30 - 60 minutes old and do all of my queries
using max_id instead of since_id.  In theory this would have me
trailing the edge of new results coming into the index by 30 - 60
minutes, but it would give the servers more time to replicate, so it
seems like there'd be less of a chance I'd encounter dupes or missing
entries.
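
A rough sketch of that idea in Python, in case it helps make it concrete.
The endpoint and the max_id, page and q parameters are taken from the
example URL quoted below; everything else is an assumption for
illustration: the helper names are made up, the client is assumed to keep
its own list of (time first seen, ID) pairs, and since max_id is
undocumented its exact behaviour is assumed too. No network calls are
made here.

```python
from urllib.parse import urlencode

# Endpoint as it appears in the quoted example URL.
SEARCH_URL = "http://search.twitter.com/search.atom"


def pick_trailing_max_id(seen, now, lag_seconds=1800):
    """Return the largest ID first seen at least lag_seconds ago, or None.

    `seen` is a hypothetical client-side log of (seen_at, tweet_id) pairs,
    with seen_at and now both in seconds.
    """
    old_enough = [tweet_id for seen_at, tweet_id in seen
                  if now - seen_at >= lag_seconds]
    return max(old_enough) if old_enough else None


def build_search_url(query, max_id, page=1):
    """Build a paged search URL anchored on max_id (undocumented parameter)."""
    return SEARCH_URL + "?" + urlencode(
        {"max_id": max_id, "page": page, "q": query})


def merge_pages(pages):
    """Flatten pages of IDs, dropping any ID already seen on an earlier page."""
    seen_ids = set()
    merged = []
    for page in pages:
        for tweet_id in page:
            if tweet_id not in seen_ids:
                seen_ids.add(tweet_id)
                merged.append(tweet_id)
    return merged
```

The dedupe step is only a safety net on my side: if two pages still hit
hosts with different lag, overlapping IDs get dropped, but rows that a
lagging host never returned can't be recovered this way.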

If that approach would work (and you would know) I'd just want to make
sure you'd be ok with me using max_id instead of since_id given that
max_id isn't documented....

-steve

On Apr 16, 7:58 am, Matt Sanford <[email protected]> wrote:
> Hi all,
>
>     There was a problem yesterday with several of the search back-ends  
> falling behind. This meant that if your page=1 and page=2 queries hit  
> different hosts they could return results that don't line up. If your  
> page=2 query hit a host with more lag you would miss results, and if  
> it hit a host that was more up-to-date you would see duplicates. We're  
> working on fixing this issue and trying to find a way to prevent  
> incorrect pagination in the future. Sorry for the delay in replying  
> but I was focusing all of my attention on fixing the issue and had to  
> let email wait.
>
> Thanks;
>    — Matt Sanford / @mzsanford
>
> On Apr 15, 2009, at 09:29 PM, stevenic wrote:
>
> > Ok... So I think I know what's going on.  Well I don't know what's
> > causing the bug obviously but I think I've narrowed down where it
> > is...
>
> > I just issued the Page 1 or "previous" query for the above example and
> > the IDs don't match the IDs from the original query.  There are
> > extra rows that come back... 3 to be exact.  So the pagination queries
> > are working fine.  It's the initial query that's busted.  It looks
> > like when you do a pagination query you get back all rows
> > matching the filter, but a query without max_id sometimes drops rows.
> > Well, in my case it seems to drop rows every time... This should get
> > fixed...
>
> > *********
> > for:  http://search.twitter.com/search.atom?max_id=1530963910&page=1&q=http
>
> > <feed xmlns:google="http://base.google.com/ns/1.0" xml:lang="en-US"
> > xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/"
> > xmlns="http://www.w3.org/2005/Atom" xmlns:twitter="http://api.twitter.com/">
> >  <link type="application/atom+xml" rel="self"
> >  href="http://search.twitter.com/search.atom?max_id=1530963910&page=1&q=http" />
> >  <twitter:warning>adjusted since_id, it was older than allowed</twitter:warning>
> >  <updated>2009-04-16T03:25:30Z</updated>
> >  <openSearch:itemsPerPage>15</openSearch:itemsPerPage>
> >  <openSearch:language>en</openSearch:language>
> >  <link type="application/atom+xml" rel="next"
> >  href="http://search.twitter.com/search.atom?max_id=1530963910&page=2&q=http" />
>
> >   ...Removed...
>
> > <entry>
> >  <id>tag:search.twitter.com,2005:1530963910</id>
> >  <published>2009-04-16T03:25:30Z</published>
> > </entry>
> > <entry>
> >  <id>tag:search.twitter.com,2005:1530963908</id>
> >  <published>2009-04-16T03:25:32Z</published>
>
> >  ...Where Did This Come From?...
>
> > </entry>
> > <entry>
> >  <id>tag:search.twitter.com,2005:1530963898</id>
> >  <published>2009-04-16T03:25:30Z</published>
>
> >  ...And This?...
>
> > </entry>
> >  <id>tag:search.twitter.com,2005:1530963896</id>
> >  <id>tag:search.twitter.com,2005:1530963895</id>
> >  <id>tag:search.twitter.com,2005:1530963894</id>
> > <entry>
> >  <id>tag:search.twitter.com,2005:1530963892</id>
> >  <published>2009-04-16T03:25:32Z</published>
>
> >  ...And This?...
>
> > </entry>
> >  <id>tag:search.twitter.com,2005:1530963881</id>
> >  <id>tag:search.twitter.com,2005:1530963865</id>
> >  <id>tag:search.twitter.com,2005:1530963860</id>
> >  <id>tag:search.twitter.com,2005:1530963834</id>
> >  <id>tag:search.twitter.com,2005:1530963833</id>
> >  <id>tag:search.twitter.com,2005:1530963829</id>
> >  <id>tag:search.twitter.com,2005:1530963827</id>
> >  <id>tag:search.twitter.com,2005:1530963812</id>
> > </feed>