Thanks, everyone, for the quick replies. I have implemented a downloading program that uses curl, and it is fast enough to avoid the time drift.

-Larry

On Jul 8, 5:00 pm, Pascal Jürgens <lists.pascal.juerg...@googlemail.com> wrote:
> Larry,
>
> Moreover, I assume you checked I/O and CPU load. But even if that's not the
> issue, you should absolutely check whether you have simplejson with the C
> extension installed. The version bundled with Python is 1.9, which is
> decidedly slower than the new 2.x branch. You might see JSON decoding load
> drop by 50% or more.
>
> Pascal
>
> On Jul 8, 2010, at 17:31, Larry Zhang wrote:
>
> > Hi everyone,
> >
> > I have a program calling the statuses/sample method of a garden hose
> > of the Streaming API, and I am experiencing the following problem: the
> > timestamps of the tweets that I download constantly drift behind
> > real time. The time drift keeps increasing until it reaches around 25
> > minutes, and then I get a timeout from the request, sleep for 5
> > seconds, and reset the connection. The time drift is also reset to 0
> > when the connection is reset.
> >
> > One solution I have for now is to proactively reset the
> > connection more frequently, e.g., if I reconnect every 1 minute, the
> > time drift I get will be at most 1 minute. But I am not sure whether
> > this is allowed by the API.
> >
> > So could anyone tell me whether you have the same problem, or whether
> > I am using the API in the wrong way? And is it OK to reset the
> > connection every minute?
> >
> > I am using Tweepy (http://github.com/joshthecoder/tweepy) as the
> > library for accessing the Streaming API.
> >
> > Thanks a lot!
> >
> > -Larry
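
For anyone finding this thread later: Pascal's suggestion to check for the simplejson C extension could be sketched roughly as below. This is only a minimal sketch, not an official check. It assumes that both simplejson and the standard-library json module expose a `c_scanstring` attribute in their `decoder` submodule that is `None` when the C speedups failed to load. That attribute is an implementation detail and may differ between versions. The helper name `c_scanner_available` is made up for this example.

```python
import importlib


def c_scanner_available(module_name):
    """Report whether a JSON module appears to have loaded its C scanner.

    Returns True or False when the module is installed, and None when it
    cannot be imported at all. Relies on the `decoder.c_scanstring`
    attribute, which is an implementation detail (assumption).
    """
    try:
        decoder = importlib.import_module(module_name + ".decoder")
    except ImportError:
        return None  # module (or its decoder submodule) is not installed
    return getattr(decoder, "c_scanstring", None) is not None


if __name__ == "__main__":
    for name in ("simplejson", "json"):
        status = c_scanner_available(name)
        if status is None:
            print(name + ": not installed")
        elif status:
            print(name + ": C scanner loaded")
        else:
            print(name + ": pure-Python fallback in use")
```

If simplejson reports the pure-Python fallback, reinstalling it with a working C compiler available is the usual way to get the speedups built.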