The Twitter Streaming API delivers statuses with best-effort ordering
and with the possibility of duplicates. It does not, and cannot
practically, order statuses. The since_id notion (a greater-than
predicate) cannot reasonably be applied to an unordered column. Hence
the count parameter and its weaker semantics.

Your code must deduplicate statuses to provide At Most Once semantics.
Given all of the above, over-requesting with the count parameter to
paper over data loss upon reconnection presents little hardship other
than some minor latency upon restart.
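
A minimal sketch of the client-side deduplication described above,
assuming each status is a dict with a numeric "id" field. The class
name, the deduplicate helper, and the window size are illustrative
assumptions, not part of any Twitter library. The window must be large
enough to cover the overlap you request with count on reconnect.

```python
from collections import OrderedDict


class RecentIdFilter:
    """Remembers the most recently seen status IDs, up to max_size."""

    def __init__(self, max_size=100_000):
        # max_size is an assumed tuning knob: it should exceed the largest
        # overlap you expect to re-request after a reconnect.
        self.max_size = max_size
        self._seen = OrderedDict()

    def is_new(self, status_id):
        """Return True the first time an ID is seen, False for a duplicate."""
        if status_id in self._seen:
            return False
        self._seen[status_id] = None
        if len(self._seen) > self.max_size:
            # Evict the oldest remembered ID to bound memory use.
            self._seen.popitem(last=False)
        return True


def deduplicate(statuses, seen):
    """Yield only statuses whose ID has not been seen recently."""
    for status in statuses:
        if seen.is_new(status["id"]):
            yield status
```

Keep one filter instance alive across reconnects so that the statuses
re-delivered by the over-sized count request are dropped rather than
processed twice.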

You could argue that the service already provides a k-sorted stream
and therefore could provide some sort of k-aware predicate filtering
based on a recent estimation of k. That wouldn't quite be since_id
though. In any case, you are left with some uncertainty, some
over-delivery, and client-side deduplication.
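
To illustrate the k-sorted idea on the client side only: if no status
arrives more than k positions away from its sorted position, a small
min-heap can restore order. This is a sketch under assumptions, not
something the service provides. It assumes status IDs increase roughly
with creation time and that you have a trustworthy estimate of k, which
is exactly the uncertainty mentioned above. The function name is
hypothetical.

```python
import heapq


def reorder_k_sorted(ids, k):
    """Yield IDs in ascending order, given input that is at most k
    positions out of place. If the real disorder exceeds k, the output
    is only approximately sorted."""
    heap = []
    for status_id in ids:
        heapq.heappush(heap, status_id)
        # Once the heap holds k + 1 items, its minimum can no longer be
        # preceded by anything still to arrive (under the k assumption).
        if len(heap) > k:
            yield heapq.heappop(heap)
    # Drain whatever is left when the input ends.
    while heap:
        yield heapq.heappop(heap)
```

With ordered output, a greater-than comparison against the last
processed ID becomes meaningful, at the cost of holding back k statuses
and of trusting the estimate of k.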

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.


On Sep 1, 3:33 pm, Joel Strellner <j...@twitturly.com> wrote:
> That would be very nice, but at this time it looks like count is the only
> way to go.
>
> Twitter: +1 for a since_id
>
> -Joel
>
> On Tue, Sep 1, 2009 at 2:47 PM, Sameer <sameer.kha...@gmail.com> wrote:
>
> > Hello, my question is in regards to the Stream API and how to deal
> > with getting statuses when the connection was lost.  I was wondering
> > if there was another way to retrieve older status messages other than
> > using the count parameter? It seems using the count parameter may be
> > inaccurate since estimating an average number of statuses per second
> > can be flawed due to spikes or other abnormal circumstances.  Is it
> > possible to pass a status_id of the last status and get all new
> > statuses from that point on?  Or perhaps pass a time stamp?
>
> > If you could help it would be greatly appreciated, it is important
> > that I maintain a complete stream of status messages.
