Matt... Another thought I just had... As Chad points out, with my particular query being high volume, it's realistic to think that I'm always going to risk seeing duplicates if I query for results in real time, because of replication lag between your servers. But I see how you're using max_id in the paging stuff, and I don't really need real-time results, so it seems like I should be able to take an ID that's 30-60 minutes old and do all of my queries using max_id instead of since_id. In theory this would have me trailing the edge of new results coming into the index by 30-60 minutes, but it would give the servers more time to replicate, so there should be less of a chance I'd encounter dupes or missing entries.
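Here's roughly what I have in mind, as a sketch only. The helper names (pick_trailing_max_id, build_page_url, drop_already_seen) are made up for illustration, the 30-minute lag is just my guess at a safe margin, and I'm assuming max_id keeps behaving the way it does in the paging links, which isn't documented. The idea is to remember (id, published) pairs from earlier results, pick the newest ID that's already old enough, page backwards from it, and still dedupe on my side just in case.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Endpoint taken from the paging links in the quoted feed.
SEARCH_URL = "http://search.twitter.com/search.atom"


def pick_trailing_max_id(seen, now, lag_minutes=30):
    """Return the newest ID published at least lag_minutes before now.

    seen is an iterable of (tweet_id, published_datetime) pairs collected
    from earlier results. Returns None if nothing is old enough yet.
    """
    cutoff = now - timedelta(minutes=lag_minutes)
    old_enough = [tweet_id for tweet_id, published in seen if published <= cutoff]
    return max(old_enough) if old_enough else None


def build_page_url(query, max_id, page):
    """Build a search URL pinned to max_id (assumed, undocumented parameter)."""
    return SEARCH_URL + "?" + urlencode({"max_id": max_id, "page": page, "q": query})


def drop_already_seen(stored_ids, page_ids):
    """Return the IDs in page_ids that are not stored yet, keeping page order."""
    seen = set(stored_ids)
    fresh = []
    for tweet_id in page_ids:
        if tweet_id not in seen:
            seen.add(tweet_id)
            fresh.append(tweet_id)
    return fresh


if __name__ == "__main__":
    utc = timezone.utc
    history = [
        (1530963910, datetime(2009, 4, 16, 3, 25, 30, tzinfo=utc)),
        (1530970000, datetime(2009, 4, 16, 3, 50, 0, tzinfo=utc)),
    ]
    now = datetime(2009, 4, 16, 4, 0, 0, tzinfo=utc)
    max_id = pick_trailing_max_id(history, now)
    if max_id is not None:
        print(build_page_url("http", max_id, 1))
```

Pinning every page of a pass to the same max_id should mean page 1 and page 2 describe the same slice of the index even if they land on different hosts, as long as both hosts have caught up to that ID.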
If that approach would work (and you would know), I'd just want to make sure you'd be OK with me using max_id instead of since_id, given that max_id isn't documented...

-steve

On Apr 16, 7:58 am, Matt Sanford <[email protected]> wrote:
> Hi all,
>
> There was a problem yesterday with several of the search back-ends falling behind. This meant that if your page=1 and page=2 queries hit different hosts they could return results that don't line up. If your page=2 query hit a host with more lag you would miss results, and if it hit a host that was more up-to-date you would see duplicates. We're working on fixing this issue and trying to find a way to prevent incorrect pagination in the future. Sorry for the delay in replying but I was focusing all of my attention on fixing the issue and had to let email wait.
>
> Thanks;
> -- Matt Sanford / @mzsanford
>
> On Apr 15, 2009, at 09:29 PM, stevenic wrote:
>
> > Ok... So I think I know what's going on. Well, I don't know what's causing the bug obviously, but I think I've narrowed down where it is...
> >
> > I just issued the Page 1 or "previous" query for the above example and the IDs don't match the IDs from the original query. There are extra rows that come back... 3 to be exact. So the pagination queries are working fine. It's the initial query that's busted. It looks like when you do a pagination query you get back all rows matching the filter, but a query without max_id sometimes drops rows. Well, in my case it seems to drop rows every time... This should get fixed...
> >
> > *********
> > for: http://search.twitter.com/search.atom?max_id=1530963910&page=1&q=http
> >
> > <feed xmlns:google="http://base.google.com/ns/1.0" xml:lang="en-US" xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/" xmlns="http://www.w3.org/2005/Atom" xmlns:twitter="http://api.twitter.com/">
> >   <link type="application/atom+xml" rel="self" href="http://search.twitter.com/search.atom?max_id=1530963910&page=1&q=http" />
> >   <twitter:warning>adjusted since_id, it was older than allowed</twitter:warning>
> >   <updated>2009-04-16T03:25:30Z</updated>
> >   <openSearch:itemsPerPage>15</openSearch:itemsPerPage>
> >   <openSearch:language>en</openSearch:language>
> >   <link type="application/atom+xml" rel="next" href="http://search.twitter.com/search.atom?max_id=1530963910&page=2&q=http" />
> >
> >   ...Removed...
> >
> >   <entry>
> >     <id>tag:search.twitter.com,2005:1530963910</id>
> >     <published>2009-04-16T03:25:30Z</published>
> >   </entry>
> >   <entry>
> >     <id>tag:search.twitter.com,2005:1530963908</id>
> >     <published>2009-04-16T03:25:32Z</published>
> >
> >     ...Where Did This Come From?...
> >
> >   </entry>
> >   <entry>
> >     <id>tag:search.twitter.com,2005:1530963898</id>
> >     <published>2009-04-16T03:25:30Z</published>
> >
> >     ...And This?...
> >
> >   </entry>
> >   <id>tag:search.twitter.com,2005:1530963896</id>
> >   <id>tag:search.twitter.com,2005:1530963895</id>
> >   <id>tag:search.twitter.com,2005:1530963894</id>
> >   <entry>
> >     <id>tag:search.twitter.com,2005:1530963892</id>
> >     <published>2009-04-16T03:25:32Z</published>
> >
> >     ...And This?...
> >
> >   </entry>
> >   <id>tag:search.twitter.com,2005:1530963881</id>
> >   <id>tag:search.twitter.com,2005:1530963865</id>
> >   <id>tag:search.twitter.com,2005:1530963860</id>
> >   <id>tag:search.twitter.com,2005:1530963834</id>
> >   <id>tag:search.twitter.com,2005:1530963833</id>
> >   <id>tag:search.twitter.com,2005:1530963829</id>
> >   <id>tag:search.twitter.com,2005:1530963827</id>
> >   <id>tag:search.twitter.com,2005:1530963812</id>
> > </feed>
