RE: [twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

Brian Smith Thu, 08 Apr 2010 16:01:42 -0700

What does “within the caveats given above” mean? Either since_id will work or 
it won’t. It seems to me that if IDs are only in a “rough” order, since_id 
won’t work—in particular, there is a possibility that paging through tweets 
using since_id will completely skip over some tweets.

My concern is that, since tweets will not be serialized at the time they are 
written, there will be a race condition between me making a request and users 
posting new statuses. That is, I could get a response with the largest id in 
the response being X that gets evaluated just before a tweet (X-1) has been 
saved in the database. If so, when I issue a request with since_id=X, my 
program will never see the newer tweet (X-1).
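
To make the race concrete, here is a minimal simulation of the scenario described above. It is only a sketch: the Timeline class and its save and since methods are hypothetical stand-ins for the real service, not part of the Twitter API, and the IDs 100, 101, and 102 are made up.

```python
# Hypothetical in-memory stand-in for a timeline; this is NOT the Twitter API.
class Timeline:
    def __init__(self):
        self.visible = []  # IDs that have been saved and can be read

    def save(self, tweet_id):
        self.visible.append(tweet_id)

    def since(self, since_id):
        # Mimics since_id: return only IDs strictly greater than since_id.
        return sorted(i for i in self.visible if i > since_id)


timeline = Timeline()
timeline.save(100)
timeline.save(102)               # ID X = 102 becomes visible first...

first = timeline.since(99)       # the client polls and sees 100 and 102
latest = max(first)              # the client remembers X = 102

timeline.save(101)               # ...and ID X-1 = 101 is saved afterwards

second = timeline.since(latest)  # since_id=102 returns nothing
missed = [i for i in timeline.visible if i not in first + second]
print(missed)                    # IDs the client never receives
```

In this simulation the late-saved ID 101 is never returned to a client that only ever asks for IDs greater than the largest one it has seen.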

Are you going to change the implementation of the timeline methods so that they 
never return a tweet with ID X until all nodes in the cluster guarantee that 
they won’t create a new tweet with an ID less than X?

I implement the following logic:

1. Let LATEST start out as the earliest tweet available in the user’s 
timeline.

2. Make a request with since_id={LATEST}, which returns a set of tweets T.

3. If T is empty then stop.

4. Let LATEST = max({id(t) for all t in T}).

5. Goto 2.
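
The steps above can be sketched roughly as follows. This is an illustration only: fetch_since is a hypothetical callable standing in for a timeline request with since_id, and tweets are reduced to bare integer IDs for simplicity.

```python
from typing import Callable, List


def fetch_all(fetch_since: Callable[[int], List[int]], earliest_id: int) -> List[int]:
    """Poll with since_id until an empty response comes back.

    fetch_since is a hypothetical stand-in for a timeline request; it is
    assumed to return the IDs strictly greater than the since_id passed in.
    """
    latest = earliest_id             # step 1
    seen: List[int] = []
    while True:
        batch = fetch_since(latest)  # step 2
        if not batch:                # step 3
            return seen
        seen.extend(batch)
        latest = max(batch)          # step 4, then loop back to step 2


# Usage against a made-up, fully ordered store of IDs.
store = [1, 2, 3, 4, 5]


def fake_fetch(since_id: int) -> List[int]:
    return [i for i in store if i > since_id]


collected = fetch_all(fake_fetch, 1)
print(collected)
```

Against a store where every ID is already visible, the loop collects everything after the starting ID. It gives no protection if an ID smaller than LATEST becomes visible after LATEST has advanced.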

Will I be guaranteed not to skip over any tweets in the timeline using this 
logic? If not, what do I need to do to ensure I get them all?

Thanks,

Brian

From: [email protected] 
[mailto:[email protected]] On Behalf Of Mark McBride
Sent: Thursday, April 08, 2010 5:10 PM
To: [email protected]
Subject: Re: [twitter-dev] Re: Upcoming changes to the way status IDs are 
sequenced

Thank you for the feedback.  It's great to hear about the variety of use cases 
people have for the API, and in particular all the different ways people are 
using IDs. To alleviate some of the concerns raised in this thread we thought 
it would be useful to give more details about how we plan to generate IDs.

1) IDs are still 64-bit integers.  This should minimize any migration pains.

2) You can still sort on ID.  Within a few milliseconds you may get out of order 
results, but for most use cases this shouldn't be an issue.  

3) since_id will still work (within the caveats given above).  

4) We will provide a way to backfill from the streaming API.

5) You cannot use the generated ID to reverse engineer tweet velocity.  Note 
that you can still use the streaming API to determine the rate of public 
statuses.

Additional items of interest

1) At some point we will likely start using this as an ID for direct messages 
too.

2) We will almost certainly open source the ID generation code, probably before 
we actually cut over to using it.

3) We STRONGLY suggest that you treat IDs as roughly sorted (roughly being 
within a few ms buckets), opaque 64-bit integers.  We may need to change the 
scheme again at some point in the future, and want to minimize migration pains 
should we need to do this.

Hopefully this puts you more at ease with the changes we're making.  If it 
raises new concerns, please let us know!

  ---Mark

http://twitter.com/mccv

On Mon, Apr 5, 2010 at 4:18 PM, M. Edward (Ed) Borasky <[email protected]> 
wrote:

On 04/05/2010 12:55 AM, Tim Haines wrote:
> This made me laugh.  Hard.
>
> On Fri, Apr 2, 2010 at 6:47 AM, Dewald Pretorius <[email protected]> wrote:
>
>> Mark,
>>
>> It's extremely important where you have two bots that reply to each
>> others' tweets. With incorrectly sorted tweets, you get conversations
>> that look completely unnatural.
>>
>> On Apr 1, 1:39 pm, Mark McBride <[email protected]> wrote:
>>> Just out of curiosity, what applications are you building that require
>>> sub-second sorting resolution for tweets?

Yeah - my bot laughed too ;-)

--
M. Edward (Ed) Borasky
borasky-research.net/m-edward-ed-borasky

"A mathematician is a device for turning coffee into theorems." ~ Paul Erdős

