Over the last couple of months, we've seen some weird behavior in the responses to search queries. First, I understand the rules about search being non-covering, and that we are at the mercy of the index. That said, what I've noticed lately goes beyond that. As background, we run many searches (and we're white-listed by IP and OAuth account), but the two I want to reference are the Mentions and the Location searches.

The Mentions search seems pretty stable and uses this typical search (and then we exclude a bunch of things like Bay St. Louis, etc.):

http://search.twitter.com/search.atom?rpp=100&q=stl+OR+%23stl+OR+stlouis+OR+%22St+Louis%22+OR+%22St.+Louis%22+OR+%22Saint+Louis%22+OR+SaintLouis&since_id=26682507745

The Location search has been VERY unstable, and uses this typical search:

http://search.twitter.com/search.atom?rpp=100&geocode=38.627522%2C-90.19841%2C30mi&since_id=26679538876

As the day progresses, we move up the high-water mark in the since_id to track what we've already received, so we should be getting minimal gaps. We almost never see two 100-entry polls in a row, so I think we're keeping up with whatever coverage the search index is offering.

I've posted a graph of the tweet counts we've seen since 7/1/2010 in a Google Spreadsheet so you can see the trends: http://bit.ly/9wnnFM (sheet two is the graph).

Some interesting things to note:

1) The Mentions search is very consistent.

2) The Location search likes to bounce around a bit.

3) In mid August, we started to have issues with more 403s and errors about since_id being too old. We were also getting rate-limited in our calls to get the tweep details (since the Atom feed is so meager). Due to a bug, I wasn't committing all the tweets when this happened.

4) On or about Sept 1st, you guys did something that broke our ability to stay caught up... we started getting almost no tweets and lots of errors about since_id being too old. I thought this was due to your "new tweet id" assignment being rolled out.

5) On Sept 5th, I got back from vacation and added logic to understand and use the "no new tweets, roll the tweet id forward to this" hint, driven by parsing the <link rel="refresh"> node in the Atom feed.

6) Around this time, I also added better logic to the tweep-detail lookup, only asking you for tweeps I don't have at least a minimal row on. This reduced the number of rate-limiting issues.

7) We were very stable until 9/23, when volume fell off a lot, and it has never really recovered. I think this is the "new search" engine rollout.

To research a little more, I tried the Twitter advanced search page. Asking it for the RSS (Atom, really) feed now gives me this URL:

http://search.twitter.com/search.atom?geocode=38.627522,-90.19841,30.0mi&lang=en&q=+near:38.627522,-90.19841+within:30mi

This starts off like ours, but adds the (seemingly redundant) human-readable search criteria "&q=+near:38.627522,-90.19841+within:30mi". Oddly, if we remove that and do the same search at nearly the same instant, I DO get vastly different tweet sets... probably due to volume, possibly just sorting, but I would hope that with the same since_id value I would get the same tweets... and I don't.

So, I'm asking... what's going on? Why are we seeing so much volume fall-off? What can we do about it? Should I be running both searches (my current one and one with the human-readable query) to get better coverage? Is there any hope/expectation of the volume returning to normal? Doesn't anyone else care about tweep-location searches?

Now, before you tell me that I should be using Site Streams (which I want to do), realize that I _NEED_ tweets from people whose profile location says they are in St. Louis (and similar), like the old Summize search honored. I can't just get by with the _tweet_ location being STL.

Marc Brooks
Chief guy getting yelled at,
http://stltweets.com
http://taste.stltweets.com
http://loufest.stltweets.com
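
The since_id high-water-mark handling described above, including the roll-forward driven by the <link rel="refresh"> node, could be sketched roughly as follows in Python. This is an illustration only, not the code actually running behind these searches. The function name `next_since_id`, the sample feeds, and the sample ids are hypothetical, and the sketch assumes that entry ids end in ":<tweet id>" and that the refresh link's href carries a since_id query parameter.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse, parse_qs

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def next_since_id(feed_xml, current_since_id):
    """Return the since_id to send on the next poll (never moves backwards)."""
    root = ET.fromstring(feed_xml)
    candidates = [current_since_id]

    # Tweet ids carried by the entries themselves.
    # Assumption: each <id> ends in ":<numeric tweet id>".
    for entry_id in root.findall("atom:entry/atom:id", ATOM_NS):
        tail = (entry_id.text or "").rsplit(":", 1)[-1]
        if tail.isdigit():
            candidates.append(int(tail))

    # The refresh link can carry a newer since_id even when no entries came back.
    # Assumption: it appears as a since_id query parameter in the href.
    for link in root.findall("atom:link", ATOM_NS):
        if link.get("rel") == "refresh":
            query = parse_qs(urlparse(link.get("href", "")).query)
            for value in query.get("since_id", []):
                if value.isdigit():
                    candidates.append(int(value))

    return max(candidates)


# Made-up sample feeds for illustration; real responses carry many more elements.
SAMPLE_EMPTY_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <link rel="refresh" type="application/atom+xml"
        href="http://search.twitter.com/search.atom?geocode=38.627522%2C-90.19841%2C30mi&amp;since_id=26679539000"/>
</feed>"""

SAMPLE_FEED_WITH_ENTRY = """<feed xmlns="http://www.w3.org/2005/Atom">
  <link rel="refresh" type="application/atom+xml"
        href="http://search.twitter.com/search.atom?geocode=38.627522%2C-90.19841%2C30mi&amp;since_id=26679539000"/>
  <entry>
    <id>tag:search.twitter.com,2005:26679539500</id>
    <title>sample tweet text</title>
  </entry>
</feed>"""

if __name__ == "__main__":
    print(next_since_id(SAMPLE_EMPTY_FEED, 26679538876))
    print(next_since_id(SAMPLE_FEED_WITH_ENTRY, 26679538876))
```

Taking the maximum over the current mark, the entry ids, and the refresh hint keeps the mark from moving backwards if a poll returns older results, and lets it advance on an empty poll so the next request is less likely to hit a "since_id too old" error.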
