On Thu, Mar 25, 2010 at 9:20 AM, Benjamin Black <b...@b3k.us> wrote:
> Cassandra is not being used to generate the Twitter identifiers.
> Twitter, like most places using Cassandra, has more than one database
> system in production.
>
> UUIDs are not at risk of conflicts with billions of rows.

Exactly: UUIDs were _designed_ not to. The biggest theoretical limit is
that with the time+location based variant, you can generate "only" 10
million UUIDs per second due to timer resolution. And that's per physical
address (ethernet/MAC), which you can add more of.

The 128-bit space for random UUIDs (an alternative) can be shown to be
more than adequate, although it does obviously assume a good random
number generator.

As to the question of how sequences are created: no company uses only
Cassandra. Considering that the rate of user account creation is rather
low (in the grand scheme of things), it can be done in any number of ways.
Simplest: use an SQL database. :-)
(and that'd be my guess as well -- right tool for the job etc)

Nonetheless, it may just be an implementation detail; that is, whether
it's a contiguous, monotonically increasing sequence may not be an
important invariant. It just needs to be unique.

-+ Tatu +-
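
P.S. For anyone who wants to check the numbers above, here is a rough sketch of the arithmetic. It assumes the RFC 4122 layouts: time-based (version 1) UUIDs count 100-nanosecond ticks, and random (version 4) UUIDs carry 122 random bits once the version and variant bits are set aside. The names (TICK_SECONDS, collision_probability) are made up for illustration, and the collision figure is the usual birthday-bound approximation, which only holds with a good random number generator.

```python
import math

# Assumption: RFC 4122 version 1 timestamps count 100-nanosecond intervals.
TICK_SECONDS = 100e-9

# One distinct timestamp per tick gives the per-node ceiling on time-based UUIDs.
max_per_second_per_node = round(1 / TICK_SECONDS)

# Assumption: RFC 4122 version 4 UUIDs fix 6 of the 128 bits (version + variant),
# leaving 122 bits of randomness.
RANDOM_BITS = 122


def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Birthday-bound approximation of at least one collision among n random IDs.

    p is approximately 1 - exp(-n * (n - 1) / 2**(bits + 1)).
    expm1 keeps precision when the result is extremely small.
    """
    return -math.expm1(-n * (n - 1) / 2 ** (bits + 1))


if __name__ == "__main__":
    print("time-based UUIDs per second per node:", max_per_second_per_node)
    for rows in (10**9, 10**12):
        print(f"{rows:,} random UUIDs -> collision probability ~ {collision_probability(rows):.3e}")
```

Adding a second MAC address doubles the first number; the second stays negligible well past billions of rows.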