On Jan 11, 2010, at 11:21 AM, Paul Davis wrote:

>> My experience is that from time to time someone has a support request
>> where the symptom is "CouchDB is so slow as to be unusable" and the
>> answer is "set sequential uuids" and they are happy and CouchDB
>> "works" again.
>> 
>> Support requests are like cockroaches: for every one you see there are 100
>> others you don't. This math means the default random uuids are one of
>> the bigger bugs CouchDB ships with, and the switch to sequential is
>> one of the smallest patches with the biggest positive impacts we could
>> make.
> 
> Well I wouldn't characterize random UUIDs as a bug, but yes they
> happen to exacerbate the worst side of the b-tree performance. Though
> I don't think that speed alone is reason enough to change the default.
> 
>> The downsides to sequential uuids are these (unless I've missed one).
>> 
>> Info leakage - the sequential uuids could give big brother an idea who
>> created a given document.
>> 
>> Gives the wrong idea - people will do stupid things like use the _id
>> in lieu of a timestamp or the local_seq for ordering.
>> 
>> Could be better - there's maybe an even better uuid algorithm we could 
>> discover.
>> 
>> I think the first case is important, but the others aren't that
>> compelling. Is there anything I'm missing?
> 
> My biggest concern is that it gives a relative ordering and proximity
> information to documents created on a given node (and can spread
> between DBs). And it's a non-obvious leakage so that people may not
> realize that they're leaking such information. It may seem like an
> abstract concern but I think it's real enough to force users to make
> that decision.

I was the one who asked Chris to make the change. The current ids are the worst 
case for btree insert performance, slowing and bloating both doc inserts and 
view indexing.
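
A rough way to see the effect is the Python sketch below. It uses a sorted list as a stand-in for the b-tree's key order and counts how many inserts land at the tail (cheap appends) rather than somewhere in the middle. The helper name and both id formats are illustrative assumptions, not CouchDB's actual generators or storage code.

```python
import bisect
import random


def count_tail_inserts(keys):
    """Insert keys one by one into a sorted list and count how many
    of them land at the very end (i.e. behave like an append)."""
    sorted_keys = []
    tail = 0
    for key in keys:
        pos = bisect.bisect_left(sorted_keys, key)
        if pos == len(sorted_keys):
            tail += 1
        sorted_keys.insert(pos, key)
    return tail


n = 1000
rng = random.Random(0)  # fixed seed so runs are repeatable

# Hypothetical id formats: 32 hex chars, either random or monotonically increasing.
random_ids = ["%032x" % rng.getrandbits(128) for _ in range(n)]
sequential_ids = ["%032x" % i for i in range(n)]

seq_tail = count_tail_inserts(sequential_ids)
rand_tail = count_tail_inserts(random_ids)

print("sequential ids, tail inserts:", seq_tail)
print("random ids, tail inserts:", rand_tail)
```

With increasing ids every insert is an append, while random ids mostly land in the middle of the key space, which in a real b-tree means touching and rewriting many more interior and leaf nodes.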

I don't see leakage as a problem. I don't think we've ever claimed as a feature 
that our generated ids are somehow secure against someone figuring out when and 
where something might have been created, and I don't know of anyone relying on 
it.

But I agree we should add to the documentation how ids are generated and the 
implications. If someone wants crypto random ids, they can configure it.

-Damien


> 
> The sequential algorithm isn't time based, so its misuse doesn't
> really play into effect nearly as much as if we were going to try the
> utc_random algorithm.
> 
> HTH,
> Paul Davis
