[twitter-dev] Re: Stream API Count Parameter

2009-09-01 Thread John Kalucki

The Twitter Streaming API presents statuses with Best Effort ordering
and with the possibility of duplicates. It does not, and practically
cannot, order statuses strictly. The since_id notion (a greater-than
predicate) cannot reasonably be applied to an unordered column. This
is why the count parameter has weaker semantics.

Your code must deduplicate statuses to provide At Most Once semantics.
Given all of the above, over-requesting with the count parameter to
paper over data loss upon reconnection presents little hardship beyond
some minor latency upon restart.
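
As a rough illustration of the client-side deduplication described
above, here is a minimal Python sketch. It assumes each status has
already been parsed into a dict with an "id" key. The class name
RecentIdDeduper, the deduplicate helper, and the default capacity are
hypothetical choices for this example, not part of the Streaming API.
It keeps a bounded memory of recently seen ids, so a duplicate that
arrives after its id has been evicted would slip through; size the
capacity to comfortably exceed whatever you over-request.

```python
from collections import OrderedDict


class RecentIdDeduper:
    """Remember the most recent `capacity` status ids and reject repeats."""

    def __init__(self, capacity=100_000):
        self.capacity = capacity      # assumed bound; tune to your volume
        self._seen = OrderedDict()    # status id -> True, oldest first

    def is_new(self, status_id):
        """Return True the first time an id is seen, False on a repeat."""
        if status_id in self._seen:
            return False
        self._seen[status_id] = True
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)  # evict the oldest remembered id
        return True


def deduplicate(statuses, deduper):
    """Yield each status at most once, in arrival order (no reordering)."""
    for status in statuses:
        if deduper.is_new(status["id"]):
            yield status
```

Keep one deduper alive across reconnections, so that statuses
re-delivered because of an over-sized count are dropped rather than
processed twice.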

You could argue that the service already provides a k-sorted stream
and therefore could provide some sort of k-aware predicate filtering
based on a recent estimation of k. That wouldn't quite be since_id
though. In any case, you are left with some uncertainty and some over-
delivery and client-side deduplication.

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.


On Sep 1, 3:33 pm, Joel Strellner wrote:
> That would be very nice, but at this time it looks like count is the only
> way to go.
>
> Twitter: +1 for a since_id
>
> -Joel
>
> On Tue, Sep 1, 2009 at 2:47 PM, Sameer wrote:
>
> > Hello, my question is regarding the Stream API and how to deal
> > with getting statuses when the connection was lost.  I was wondering
> > if there was another way to retrieve older status messages other than
> > using the count parameter? It seems using the count parameter may be
> > inaccurate since estimating an average number of statuses per second
> > can be flawed due to spikes or other abnormal circumstances.  Is it
> > possible to pass a status_id of the last status and get all new
> > statuses from that point on?  Or perhaps pass a timestamp?
>
> > If you could help it would be greatly appreciated; it is important
> > that I maintain a complete stream of status messages.


[twitter-dev] Re: Stream API Count Parameter

2009-09-01 Thread John Kalucki

Sameer,

The count parameter is the only option for historical queries. You can
keep track of the number of statuses received over the last few
minutes and then over-request by some significant factor. The only
real cost is additional latency before your stream catches up to real
time.
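
One way the bookkeeping above might look, as a hedged Python sketch.
The class name CountEstimator, the five-minute window, and the factor
of 3 are assumptions chosen for illustration. The thread does not
specify a recommended factor or a maximum value for count, so check
the value you compute against whatever limits apply to your account.
The rate is taken from the arrivals seen before the disconnect, so
the silent outage itself does not dilute the estimate.

```python
import math
from collections import deque


class CountEstimator:
    """Estimate a value for the count parameter from the recent arrival rate."""

    def __init__(self, window_seconds=300.0, factor=3.0):
        self.window_seconds = window_seconds  # "last few minutes" (assumed: 5)
        self.factor = factor                  # over-request multiplier (assumed)
        self._arrivals = deque()              # arrival times in seconds, oldest first

    def record(self, now):
        """Note that one status arrived at time `now` (in seconds)."""
        self._arrivals.append(now)
        cutoff = now - self.window_seconds
        # Drop arrivals that have fallen out of the sliding window.
        while self._arrivals and self._arrivals[0] < cutoff:
            self._arrivals.popleft()

    def rate(self):
        """Statuses per second across the window, or 0.0 with too little data."""
        if len(self._arrivals) < 2:
            return 0.0
        span = self._arrivals[-1] - self._arrivals[0]
        if span <= 0:
            return 0.0
        return (len(self._arrivals) - 1) / span

    def count_for(self, outage_seconds):
        """Statuses to request after an outage, padded by the factor."""
        return math.ceil(self.rate() * outage_seconds * self.factor)
```

In use, call record(time.monotonic()) for every status received, and
on reconnect pass count_for(seconds_disconnected) as the count value.
A traffic spike during the outage can still exceed the estimate, which
is the reason for the generous factor.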

Note that not all roles, resources and parameter combinations support
the count parameter.

-John Kalucki
http://twitter.com/jkalucki
Services, Twitter Inc.


On Sep 1, 2:47 pm, Sameer wrote:
> Hello, my question is regarding the Stream API and how to deal
> with getting statuses when the connection was lost.  I was wondering
> if there was another way to retrieve older status messages other than
> using the count parameter? It seems using the count parameter may be
> inaccurate since estimating an average number of statuses per second
> can be flawed due to spikes or other abnormal circumstances.  Is it
> possible to pass a status_id of the last status and get all new
> statuses from that point on?  Or perhaps pass a timestamp?
>
> If you could help it would be greatly appreciated; it is important
> that I maintain a complete stream of status messages.


[twitter-dev] Re: Stream API Count Parameter

2009-09-01 Thread Joel Strellner
That would be very nice, but at this time it looks like count is the only
way to go.

Twitter: +1 for a since_id

-Joel

On Tue, Sep 1, 2009 at 2:47 PM, Sameer wrote:

> Hello, my question is regarding the Stream API and how to deal
> with getting statuses when the connection was lost.  I was wondering
> if there was another way to retrieve older status messages other than
> using the count parameter? It seems using the count parameter may be
> inaccurate since estimating an average number of statuses per second
> can be flawed due to spikes or other abnormal circumstances.  Is it
> possible to pass a status_id of the last status and get all new
> statuses from that point on?  Or perhaps pass a timestamp?
>
> If you could help it would be greatly appreciated; it is important
> that I maintain a complete stream of status messages.
>