Quoting John Kalucki <j...@twitter.com>:
K-sorted means roughly sorted, where no item is more than K positions
from its totally ordered position. A sequence is k-sorted IFF, for all i, r,
1 <= i <= r <= n, i <= r-k implies that a(i) <= a(r).
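To make the definition concrete, here is a minimal sketch of a checker in Python. The function name is_k_sorted and the 0-based indexing are illustrative assumptions, not part of the original message. It uses the fact that the condition is equivalent to requiring every a(r) to be at least as large as the maximum of all elements k or more positions before it.

```python
def is_k_sorted(a, k):
    """Return True if a[i] <= a[r] whenever i <= r - k (0-based indices)."""
    if k < 0:
        raise ValueError("k must be non-negative")
    prefix_max = None  # running maximum of a[0 .. r-k]
    for r in range(k, len(a)):
        candidate = a[r - k]
        if prefix_max is None or candidate > prefix_max:
            prefix_max = candidate
        # Some element at least k positions earlier exceeds a[r].
        if prefix_max > a[r]:
            return False
    return True


if __name__ == "__main__":
    roughly_sorted = [2, 1, 4, 3, 6, 5]
    print(is_k_sorted(roughly_sorted, 2))
    print(is_k_sorted(roughly_sorted, 1))
```

Under this definition k = 1 (or k = 0) reduces to an ordinary non-decreasing sort, and larger k tolerates adjacent items being swapped within a window of k positions.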
The generation scheme has to allow sufficient IDs to be generated in a
non-coordinated way to cover expected TPS well into the future. If you
remove bits from the timestamp, you'll need to add those bits back in to the
other fields, or you might starve for IDs. This scheme allows for 2^24 IDs
to be generated per millisecond, but that 24-bit space must be sparse to
allow for uncoordinated tweet generation.
-John
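The arithmetic behind "2^24 IDs per millisecond" can be sketched as follows. Only the 24-bit figure comes from the message above; the split of those 24 bits into a worker field and a per-worker sequence field, and the names make_id, WORKER_BITS and SEQ_BITS, are hypothetical and chosen purely for illustration. The point is that each generator draws only from its own slice of the 24-bit space, so no coordination is needed, at the cost of leaving most of the space unused.

```python
SPACE_BITS = 24                       # from the message: 2^24 IDs per millisecond
WORKER_BITS = 10                      # hypothetical: 1024 independent generators
SEQ_BITS = SPACE_BITS - WORKER_BITS   # hypothetical: 16384 IDs per generator per ms


def make_id(ms, worker, seq):
    """Pack a millisecond timestamp, a worker number and a sequence number."""
    if not 0 <= worker < (1 << WORKER_BITS):
        raise ValueError("worker out of range")
    if not 0 <= seq < (1 << SEQ_BITS):
        raise ValueError("seq out of range: this worker is starved for IDs")
    return (ms << SPACE_BITS) | (worker << SEQ_BITS) | seq


if __name__ == "__main__":
    print(1 << SPACE_BITS)        # total IDs available per millisecond
    print(make_id(5, 3, 7))       # worker 3's eighth ID in millisecond 5
```

In this sketch, IDs from a later millisecond always compare greater than IDs from an earlier one, while IDs minted within the same millisecond are ordered by worker rather than by arrival, which is one way a roughly sorted (k-sorted) stream can arise.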
If the tweets are being generated by an abstract stochastic or
deterministic process and *must* have IDs assigned within a certain
window after they arrive, yes. But if the tweets are being generated
by a *finite* number, say, 2^30, of *human* users, a small fraction of
whom are actually posting a tweet at any given time and who have been
given only a guarantee of *eventual* consistency and who are paying
Twitter nothing to cache, deliver and archive their tweets, I'm not so
sure.
This whole discussion reminds me of the situation a few years back
when the world as we knew it was going to come to a sorry end because
someone measured Internet traffic and discovered it was "fractal", for
some definition of that overused term. "OMG - how can we do capacity
planning when the distributions don't have all the moments our
textbooks said they should?"
It turned out that they *could* do Internet capacity planning,
although it is a bit harder than processor capacity planning (but not
as hard as the Linux kernel's memory manager.) ;-) And it turned out
that Internet traffic wasn't "really" fractal after all, just the
traffic they had measured when they published the papers.
Now ... when every houseplant has its own IPv6 address and sends a
tweet to all 7562 of its followers when it needs watering ... ;-)
--
M. Edward (Ed) Borasky
http://borasky-research.net http://twitter.com/znmeb
Too old to be a futurist and too young to be an historian