Larry, also: I assume you have already checked I/O and CPU load. But even if that's not the issue, you should absolutely check whether you have simplejson installed with its C extension. The version bundled with Python is 1.9, which is decidedly slower than the newer 2.x branch. You might see JSON decoding load drop by 50% or more.
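Something like the following could be used to check which JSON module you end up with and whether its C-accelerated decoder is active. This is only a rough sketch: the helper names (has_c_speedups, time_decode) and the sample payload are made up for illustration, and it assumes the module exposes decoder.c_scanstring, which is set to None when the C extension failed to build or import.

```python
import timeit

try:
    import simplejson as json_impl  # third-party package, if installed
except ImportError:
    import json as json_impl  # fall back to the module bundled with Python


def has_c_speedups(mod):
    """Return True if the module's C-accelerated string scanner is loaded.

    Assumption: the module has a `decoder` submodule whose `c_scanstring`
    attribute is None when only the pure-Python code path is available.
    """
    decoder = getattr(mod, "decoder", None)
    return getattr(decoder, "c_scanstring", None) is not None


def time_decode(mod, payload, number=10000):
    """Return the seconds taken to decode `payload` `number` times."""
    return timeit.timeit(lambda: mod.loads(payload), number=number)


if __name__ == "__main__":
    # Hypothetical tweet-like payload, only for a rough timing comparison.
    sample = '{"id": 1, "text": "hello", "user": {"screen_name": "x"}}'
    print("module: %s" % json_impl.__name__)
    print("version: %s" % getattr(json_impl, "__version__", "unknown"))
    print("C speedups: %s" % has_c_speedups(json_impl))
    print("decode time: %.3f s" % time_decode(json_impl, sample))
```

If "C speedups" comes back False, reinstalling simplejson with a compiler and the Python headers available is the first thing to try.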
Pascal

On Jul 8, 2010, at 17:31, Larry Zhang wrote:

> Hi everyone,
>
> I have a program calling the statuses/sample method of a garden hose
> of the Streaming API, and I am experiencing the following problem: the
> timestamps of the tweets that I downloaded constantly drift behind
> real-time, the time drift keeps increasing until it reaches around 25
> minutes, and then I get a timeout from the request, sleep for 5
> seconds and reset the connection. The time drift is also reset to 0
> when the connection is reset.
>
> One solution for this I have now is to proactively reset the
> connection more frequently, e.g., if I reconnect every 1 minute, the
> time drift I get will be at most 1 minute. But I am not sure whether
> this is allow by the API.
>
> So could anyone tell me if you have the same problem as mine or I am
> using the API in the wrong way. And is it OK to reset connection every
> minute?
>
> I am using Tweepy (http://github.com/joshthecoder/tweepy) as the
> library for accessing the Streaming API.
>
> Thanks a lot!
> -Larry