I can't speak for Twitter on the "permission to do that" side, but that technique will work just fine, so you should be good to go technically.

-chad
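For anyone following along, here is a rough sketch of the trailing max_id idea steve describes in the quoted message below. It only picks an ID that is at least 30 minutes old and builds the query URL; it does not make any requests. The function names and the `seen` list of (id, published) pairs are made up for illustration, max_id is undocumented as steve notes, and the endpoint is just the one quoted in this thread, so treat all of it as an assumption rather than a supported API.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Endpoint as quoted in this thread; assumed here, not verified.
SEARCH_URL = "http://search.twitter.com/search.atom"


def pick_trailing_max_id(seen, now, lag=timedelta(minutes=30)):
    """Return the largest ID published at least `lag` ago, or None.

    `seen` is a hypothetical list of (tweet_id, published_datetime) pairs
    collected from earlier responses.
    """
    cutoff = now - lag
    old_enough = [tweet_id for tweet_id, published in seen if published <= cutoff]
    return max(old_enough) if old_enough else None


def build_query(q, max_id, page=1):
    """Build a search URL that pages backwards from `max_id`."""
    params = {"max_id": max_id, "page": page, "q": q}
    return SEARCH_URL + "?" + urlencode(params)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical history: one ID seen 45 minutes ago.
    seen = [(1530963910, now - timedelta(minutes=45))]
    max_id = pick_trailing_max_id(seen, now)
    if max_id is not None:
        print(build_query("http", max_id))
```

Keeping the lag as a parameter makes it easy to move between the 30 and 60 minute ends of the range steve mentions, depending on how much replication lag you see in practice.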
On Thu, Apr 16, 2009 at 9:34 PM, stevenic <[email protected]> wrote:
>
> Matt... Another thought I just had...
>
> As Chad points out, with my particular query being high volume it's
> realistic to think that I'm always going to risk seeing duplicates if
> I try to query for results in real time due to replication lag between
> your servers. But I see how you're using max_id in the paging stuff and
> I don't really need real time results so it seems like I should be
> able to use an ID that's 30 - 60 minutes old and do all of my queries
> using max_id instead of since_id. In theory this would have me
> trailing the edge of new results coming into the index by 30 - 60
> minutes but it would give the servers more time to replicate so it
> seems like there'd be less of a chance I'd encounter dupes or missing
> entries.
>
> If that approach would work (and you would know) I'd just want to make
> sure you'd be ok with me using max_id instead of since_id given that
> max_id isn't documented....
>
> -steve
>
> On Apr 16, 7:58 am, Matt Sanford <[email protected]> wrote:
>> Hi all,
>>
>> There was a problem yesterday with several of the search back-ends
>> falling behind. This meant that if your page=1 and page=2 queries hit
>> different hosts they could return results that don't line up. If your
>> page=2 query hit a host with more lag you would miss results, and if
>> it hit a host that was more up-to-date you would see duplicates. We're
>> working on fixing this issue and trying to find a way to prevent
>> incorrect pagination in the future. Sorry for the delay in replying
>> but I was focusing all of my attention on fixing the issue and had to
>> let email wait.
>>
>> Thanks;
>> — Matt Sanford / @mzsanford
>>
>> On Apr 15, 2009, at 09:29 PM, stevenic wrote:
>>
>> > Ok... So I think I know what's going on. Well I don't know what's
>> > causing the bug obviously but I think I've narrowed down where it
>> > is...
>>
>> > I just issued the Page 1 or "previous" query for the above example and
>> > the IDs don't match the IDs from the original query. There are
>> > extra rows that come back... 3 to be exact. So the pagination queries
>> > are working fine. It's the initial query that's busted. It looks
>> > like when you do a pagination query you get back all rows
>> > matching the filter but a query without max_id sometimes drops rows.
>> > Well in my case it seems to drop rows every time... This should get
>> > fixed...
>>
>> > *********
>> > for: http://search.twitter.com/search.atom?max_id=1530963910&page=1&q=http
>>
>> > <feed xmlns:google="http://base.google.com/ns/1.0" xml:lang="en-US"
>> > xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/"
>> > xmlns="http://www.w3.org/2005/Atom" xmlns:twitter="http://api.twitter.com/">
>> > <link type="application/atom+xml" rel="self"
>> > href="http://search.twitter.com/search.atom?max_id=1530963910&page=1&q=http" />
>> > <twitter:warning>adjusted since_id, it was older than allowed</twitter:warning>
>> > <updated>2009-04-16T03:25:30Z</updated>
>> > <openSearch:itemsPerPage>15</openSearch:itemsPerPage>
>> > <openSearch:language>en</openSearch:language>
>> > <link type="application/atom+xml" rel="next"
>> > href="http://search.twitter.com/search.atom?max_id=1530963910&page=2&q=http" />
>>
>> > ...Removed...
>>
>> > <entry>
>> > <id>tag:search.twitter.com,2005:1530963910</id>
>> > <published>2009-04-16T03:25:30Z</published>
>> > </entry>
>> > <entry>
>> > <id>tag:search.twitter.com,2005:1530963908</id>
>> > <published>2009-04-16T03:25:32Z</published>
>>
>> > ...Where Did This Come From?...
>>
>> > </entry>
>> > <entry>
>> > <id>tag:search.twitter.com,2005:1530963898</id>
>> > <published>2009-04-16T03:25:30Z</published>
>>
>> > ...And This?...
>>
>> > </entry>
>> > <id>tag:search.twitter.com,2005:1530963896</id>
>> > <id>tag:search.twitter.com,2005:1530963895</id>
>> > <id>tag:search.twitter.com,2005:1530963894</id>
>> > <entry>
>> > <id>tag:search.twitter.com,2005:1530963892</id>
>> > <published>2009-04-16T03:25:32Z</published>
>>
>> > ...And This?...
>>
>> > </entry>
>> > <id>tag:search.twitter.com,2005:1530963881</id>
>> > <id>tag:search.twitter.com,2005:1530963865</id>
>> > <id>tag:search.twitter.com,2005:1530963860</id>
>> > <id>tag:search.twitter.com,2005:1530963834</id>
>> > <id>tag:search.twitter.com,2005:1530963833</id>
>> > <id>tag:search.twitter.com,2005:1530963829</id>
>> > <id>tag:search.twitter.com,2005:1530963827</id>
>> > <id>tag:search.twitter.com,2005:1530963812</id>
>> > </feed>
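The comparison steve did by hand above (pulling the entry IDs out of two Atom responses and seeing which ones show up in only one of them) could be scripted roughly like this. It is a sketch that assumes each response is a complete Atom document with one `<id>` per `<entry>` in the `tag:search.twitter.com,2005:NNN` form shown in the thread; `entry_ids` and `diff_ids` are names invented for this example.

```python
import xml.etree.ElementTree as ET

# Atom namespace, as declared on the <feed> element quoted in the thread.
ATOM = "{http://www.w3.org/2005/Atom}"


def entry_ids(atom_xml):
    """Return the numeric status IDs of all <entry> elements, in feed order."""
    root = ET.fromstring(atom_xml)
    ids = []
    for entry in root.iter(ATOM + "entry"):
        tag = entry.findtext(ATOM + "id")
        if tag is None:
            continue  # skip entries without an <id>
        # "tag:search.twitter.com,2005:1530963910" -> 1530963910
        ids.append(int(tag.rsplit(":", 1)[1]))
    return ids


def diff_ids(initial_ids, paged_ids):
    """Return (only_in_paged, only_in_initial), each sorted newest first."""
    initial, paged = set(initial_ids), set(paged_ids)
    only_in_paged = sorted(paged - initial, reverse=True)
    only_in_initial = sorted(initial - paged, reverse=True)
    return only_in_paged, only_in_initial
```

Feeding it the response from the initial query and the response from the max_id page=1 query would list the "extra" rows as the first element of the result, and any rows the paged query missed as the second.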
