[twitter-dev] Re: Stream API Count Parameter
The Twitter Streaming API presents statuses with Best Effort ordering with the possibility of duplicates. It does not, and cannot practically, order statuses. The since_id notion (a greater-than predicate) cannot reasonably be applied to an unordered column. Thus the count parameter's weaker semantics. Your code must deduplicate statuses to provide At Most Once semantics.

Given all the above, over-requesting with the count parameter to paper over data loss upon reconnection presents little hardship other than some minor latency upon restart.

You could argue that the service already provides a k-sorted stream and therefore could provide some sort of k-aware predicate filtering based on a recent estimation of k. That wouldn't quite be since_id though. In any case, you are left with some uncertainty and some over-delivery and client-side deduplication.

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.

On Sep 1, 3:33 pm, Joel Strellner wrote:
> That would be very nice, but at this time it looks like count is the only
> way to go.
>
> Twitter: +1 for a since_id
>
> -Joel
>
> On Tue, Sep 1, 2009 at 2:47 PM, Sameer wrote:
>
> > Hello, my question is in regards to the Stream API and how to deal
> > with getting statuses when the connection was lost. I was wondering
> > if there was another way to retrieve older status messages other than
> > using the count parameter? It seems using the count parameter may be
> > inaccurate since estimating an average number of statuses per second
> > can be flawed due to spikes or other abnormal circumstances. Is it
> > possible to pass a status_id of the last status and get all new
> > statuses from that point on? Or perhaps pass a time stamp?
>
> > If you could help it would be greatly appreciated, it is important
> > that I maintain a complete stream of status messages.
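
For readers following the thread, here is a rough sketch of the client-side deduplication described in the message above. It is not Twitter code. The class name RecentIdFilter, the deduplicate helper, the capacity default, and the assumption that each status is a dict with an "id" key are all made up for illustration. Because the stream is not ordered, the sketch remembers a bounded set of recently seen ids instead of comparing against a single "last id".

```python
from collections import OrderedDict


class RecentIdFilter:
    """Remember the most recent `capacity` status ids and reject repeats.

    Hypothetical helper: the capacity should comfortably exceed the number
    of statuses you over-request with count, otherwise old duplicates can
    slip through after their ids have been evicted.
    """

    def __init__(self, capacity=100_000):
        self.capacity = capacity
        self._seen = OrderedDict()  # insertion-ordered set of ids

    def is_new(self, status_id):
        if status_id in self._seen:
            return False
        self._seen[status_id] = None
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # evict the oldest remembered id
        return True


def deduplicate(statuses, id_filter):
    """Yield each status once, in arrival order (no reordering is attempted)."""
    for status in statuses:
        if id_filter.is_new(status["id"]):
            yield status
```

Keeping one filter instance alive across reconnects is what lets the over-delivered backlog from count be dropped when it overlaps statuses you already processed.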
[twitter-dev] Re: Stream API Count Parameter
Sameer,

The count parameter is the only option for historical queries. You can keep track of the number of statuses received over the last few minutes and then over-request by some significant factor. The only real cost is additional latency before your stream catches up to real time.

Note that not all roles, resources and parameter combinations support the count parameter.

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.

On Sep 1, 2:47 pm, Sameer wrote:
> Hello, my question is in regards to the Stream API and how to deal
> with getting statuses when the connection was lost. I was wondering
> if there was another way to retrieve older status messages other than
> using the count parameter? It seems using the count parameter may be
> inaccurate since estimating an average number of statuses per second
> can be flawed due to spikes or other abnormal circumstances. Is it
> possible to pass a status_id of the last status and get all new
> statuses from that point on? Or perhaps pass a time stamp?
>
> If you could help it would be greatly appreciated, it is important
> that I maintain a complete stream of status messages.
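
One possible way to do the bookkeeping suggested in the message above, sketched under stated assumptions: the class name BackfillEstimator, the five-minute window, and the safety factor of 3 are invented here, and nothing below talks to the API. It only computes a number you might pass as count when reconnecting. The rate is measured as of the last status received, so the outage itself does not dilute the estimate.

```python
import math
import time
from collections import deque


class BackfillEstimator:
    """Estimate a count value from the recent arrival rate (hypothetical helper)."""

    def __init__(self, window_seconds=300.0, safety_factor=3.0):
        self.window_seconds = window_seconds
        self.safety_factor = safety_factor  # the "significant factor" is a guess
        self._arrivals = deque()  # arrival timestamps inside the window

    def record(self, now=None):
        """Call once per status received."""
        now = time.monotonic() if now is None else now
        self._arrivals.append(now)
        cutoff = now - self.window_seconds
        while self._arrivals and self._arrivals[0] <= cutoff:
            self._arrivals.popleft()

    def rate(self):
        """Statuses per second over the window ending at the last arrival.

        Underestimates until the stream has been running for a full window.
        """
        return len(self._arrivals) / self.window_seconds

    def count_for_outage(self, outage_seconds):
        """Over-requested number of statuses to ask for after an outage."""
        return math.ceil(self.rate() * outage_seconds * self.safety_factor)
```

A spike during the outage can still exceed the estimate, which is the uncertainty already acknowledged in this thread; a larger safety factor trades extra catch-up latency for a smaller chance of a gap.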
[twitter-dev] Re: Stream API Count Parameter
That would be very nice, but at this time it looks like count is the only way to go.

Twitter: +1 for a since_id

-Joel

On Tue, Sep 1, 2009 at 2:47 PM, Sameer wrote:
>
> Hello, my question is in regards to the Stream API and how to deal
> with getting statuses when the connection was lost. I was wondering
> if there was another way to retrieve older status messages other than
> using the count parameter? It seems using the count parameter may be
> inaccurate since estimating an average number of statuses per second
> can be flawed due to spikes or other abnormal circumstances. Is it
> possible to pass a status_id of the last status and get all new
> statuses from that point on? Or perhaps pass a time stamp?
>
> If you could help it would be greatly appreciated, it is important
> that I maintain a complete stream of status messages.
>