On Thu, Apr 08, 2010 at 05:03:29PM -0700, Naveen wrote:
> However, I wanted to be clear and feel it should be made obvious that
> with this change, there is a possibility that a tweet may not be
> delivered to client if the implementation of how since_id is currently
> used is not updated to cover the case. I still envision the situation
> as more likely than you seem to believe and figure as tweet velocity
> increases, the likelihood will also increase; But I am assuming have
> better data to support your viewpoint than I and shall defer.

Maybe I'm just missing something here, but it seems trivial to fix on
Twitter's side (enough so that I assume it's what they've been planning
to do from the start): Only return tweets from closed buckets.

We are guaranteed that the buckets will be properly ordered. The order
will only be randomized within a bucket. Therefore, by only returning
tweets from buckets which are no longer receiving new tweets, since_id
works and will never miss a tweet.

And, yes, this does mean a slight delay in getting the tweets out,
because they have to wait a few milliseconds for their bucket to close
before being exposed to calls which can use since_id, plus maybe a
little longer for the contents of that bucket to be distributed to
multiple servers. That will still take only about as long as the HTTP
round trip needed to fetch the data for display to a user, and far, far
less than the average refresh delay imposed on clients which fall under
the API rate limit.

I submit, therefore, that any such delay caused by waiting for buckets
to close will be inconsequential.

-- 
Dave Sherohman
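
The closed-bucket idea above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not Twitter's actual ID scheme or API: the 12-bit split, the names make_id, bucket_of and Timeline, and the way the caller passes in the currently open bucket are all hypothetical.

```python
from dataclasses import dataclass, field

SEQ_BITS = 12  # hypothetical: the low bits carry no ordering within a bucket


def make_id(bucket: int, low_bits: int) -> int:
    """Compose an ID whose high bits are the (ordered) bucket number."""
    return (bucket << SEQ_BITS) | low_bits


def bucket_of(tweet_id: int) -> int:
    """Recover the bucket number from an ID."""
    return tweet_id >> SEQ_BITS


@dataclass
class Timeline:
    ids: set = field(default_factory=set)

    def add(self, tweet_id: int) -> None:
        self.ids.add(tweet_id)

    def fetch_since(self, since_id: int, open_bucket: int) -> list:
        """Return IDs above since_id, but only from buckets that have closed."""
        return sorted(
            i for i in self.ids
            if i > since_id and bucket_of(i) < open_bucket
        )


tl = Timeline()
tl.add(make_id(5, 900))                     # arrives first, higher ID
first = tl.fetch_since(0, open_bucket=5)    # bucket 5 still open: nothing exposed
tl.add(make_id(5, 100))                     # arrives later, lower ID, same bucket
second = tl.fetch_since(0, open_bucket=6)   # bucket 5 closed: both returned
```

Had the first call exposed the tweet with low bits 900, a client using it as since_id would have skipped the later tweet with low bits 100. Holding the bucket back until it closes is what avoids that.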
