[twitter-dev] Re: Upcoming changes to the way status IDs are sequenced

Benedek Fri, 26 Mar 2010 15:52:22 -0700

Hi,

From a practical developer's point of view, "growing" IDs are very
helpful.

They greatly simplify many common DB operations. (Most applications
with local storage or a cache need one key fewer, and complex queries
need fewer values in a temporary table.) This leads to simpler code
and faster applications.
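
To illustrate the "one key fewer" point, here is a minimal sketch of a local cache that uses the largest stored ID as its only cursor. The table name, column names, and helper functions are made up for the example, and it only works if newer statuses really do get bigger IDs:

```python
import sqlite3

def newest_cached_id(conn):
    """Return the largest status ID in the local cache, or 0 if it is empty."""
    row = conn.execute("SELECT MAX(id) FROM statuses").fetchone()
    return row[0] if row[0] is not None else 0

def store_new(conn, fetched):
    """Insert only statuses newer than anything cached, judged by ID alone."""
    since_id = newest_cached_id(conn)
    fresh = [(sid, text) for sid, text in fetched if sid > since_id]
    conn.executemany("INSERT INTO statuses (id, text) VALUES (?, ?)", fresh)
    return len(fresh)

# Hypothetical schema: the ID is the primary key and the only ordering column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE statuses (id INTEGER PRIMARY KEY, text TEXT)")

store_new(conn, [(101, "first"), (102, "second")])
added = store_new(conn, [(102, "second"), (103, "third")])
```

Without ordered IDs the cache would need a second column (for example a creation time) to decide what counts as "new".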

I believe that your approach, "the most significant bits being sourced
from a timestamp", solves most problems and requires no code changes
for most developers IF newer status messages always have bigger
IDs.

By "newer" I mean newer at second precision. So I see no practical
reason to require perfectly increasing IDs.
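
A rough sketch of why a timestamp in the most significant bits gives this second-precision ordering. The bit width and the function name are assumptions for illustration only; the announcement does not describe the actual layout:

```python
import random

RANDOM_BITS = 22  # assumed width; the real format is not specified

def make_id(unix_seconds, rng=random):
    """Timestamp in the high bits, random filler in the low bits."""
    return (unix_seconds << RANDOM_BITS) | rng.getrandbits(RANDOM_BITS)

# Two statuses created in different seconds: the later one is always
# bigger, whatever the random low bits turn out to be.
earlier = make_id(1269643942)
later = make_id(1269643943)
```

The largest possible ID for one second is still smaller than the smallest possible ID for the next second, so the comparison `earlier < later` holds regardless of the random part.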

But there is one exception I worry about, even with second precision:
when a user posts two status updates within the same second. In this
case the second one should always have a bigger ID. This seems a bit
"theoretical", but:

 - updates can come from a "buffer", e.g. an offline Twitter client
 - updates longer than 140 characters can be split into two (or more)
updates by an application.

So newer updates from the same user should always have bigger IDs.
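
One hypothetical way to meet this requirement, sketched only to make the same-second case concrete: count upward in the low bits within a second instead of filling them randomly. The class name and bit width are invented here, and nothing in the announcement says Twitter will do this:

```python
import itertools

LOW_BITS = 22  # assumed width, not taken from the announcement

class SameSecondSequencer:
    """Hypothetical generator: low bits count up within one second."""

    def __init__(self):
        self.last_second = None
        self.counter = itertools.count()

    def next_id(self, unix_seconds):
        # Restart the counter whenever the clock moves to a new second.
        if unix_seconds != self.last_second:
            self.last_second = unix_seconds
            self.counter = itertools.count()
        # Counter overflow past LOW_BITS and a clock that runs backwards
        # are both ignored in this sketch.
        return (unix_seconds << LOW_BITS) | next(self.counter)

# A long update split into two parts and posted within the same second.
seq = SameSecondSequencer()
part_one = seq.next_id(1269643942)
part_two = seq.next_id(1269643942)
```

With purely random low bits, `part_two` could come out smaller than `part_one`, which is exactly the case that would break ordering for split or buffered updates.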

If these two conditions are guaranteed, no application (except
"twitter statistics" sites) should have any problem with a change
like this.

Benedek

On Mar 26, 21:41, Taylor Singletary <[email protected]>
wrote:
> Hi Developers,
>
> It's no secret that Twitter is growing exponentially. The tweets keep coming
> with ever increasing velocity, thanks in large part to your great
> applications.
>
> Twitter has adapted to the increasing number of tweets in ways that have
> affected you in the past: We moved from 32 bit unsigned integers to 64-bit
> unsigned integers for status IDs some time ago. You all weathered that storm
> with ease. The tweetapoclypse was averted, and the tweets kept flowing.
>
> Now we're reaching the scalability limit of our current tweet ID generation
> scheme. Unlike the previous tweet ID migrations, the solution to the current
> issue is significantly different. However, in most cases the new approach we
> will take will not result in any noticeable differences to you the developer
> or your users.
>
> We are planning to replace our current sequential tweet ID generation
> routine with a simple, more scalable solution. IDs will still be 64-bit
> unsigned integers. However, this new solution is no longer guaranteed to
> generate sequential IDs.  Instead IDs will be derived based on time: the
> most significant bits being sourced from a timestamp and the least
> significant bits will be effectively random.
>
> Please don't depend on the exact format of the ID. As our infrastructure
> needs evolve, we might need to tweak the generation algorithm again.
>
> If you've been trying to divine meaning from status IDs aside from their
> role as a primary key, you won't be able to anymore. Likewise for usage of
> IDs in mathematical operations -- for instance, subtracting two status IDs
> to determine the number of tweets in between will no longer be possible.
>
> For the majority of applications we think this scheme switch will be a
> non-event. Before implementing these changes, we'd like to know if your
> applications currently depend on the sequential nature of IDs. Do you depend
> on the density of the tweet sequence being constant?  Are you trying to
> analyze the IDs as anything other than opaque, ordered identifiers? Aside
> from guaranteed sequential tweet ID ordering, what APIs can we provide you to
> accomplish your goals?
>
> Taylor Singletary
> Developer Advocate, Twitter
> http://twitter.com/episod
