Re: uuid, auto-increment, et al.

Antony Blakey Sun, 19 Oct 2008 22:35:44 -0700

If you want to ensure that the username is unique at the time the user enters it, then you need a central synchronous service. Using the username/password as a pair isn't a good idea because it only takes two naive/lazy users to use a similar password (based, say, on their username? :) for a collision to subsequently occur.

I've been considering this in a production environment, and I saw four solutions:

1. Append some form of unique id from the server you are currently talking to, i.e. checksum then machine uuid + process. Any checksum is going to have some chance of global collision, but it could be made vanishingly small. Not great for the user because they have a complicated username.

2. Define your user interaction such that it can deal with subsequently needing to add some suffix to the username, e.g. when you get a replication conflict (which could involve a number of conflicts equal to the number of writable replicas), you amend some/all of the names to include a serial number and then email the user. This complicates things for the user, and they end up with a username they haven't chosen, or they may not see the email and end up abusing tech support etc. etc.

3. Use couchdb in a single-writer multiple-reader scenario. If you only do that for those activities that require uniqueness then you have consistency issues to deal with because replication is asynchronous. One way to do that is to switch a session to the writable server as soon as you need uniqueness. The single writer becomes a bottleneck, but this is what I'm doing because it matches my information architecture.

4. Use a central specialized server to check uniqueness and generate an opaque userid token that you would subsequently use as a key (you shouldn't use the username as a key). An ldap server or something like it. Equivalent to the option above, but the single server only needs to deal with the particular operations requiring uniqueness. It's still a single point of failure, but I don't think you can get around that if you want synchronous global uniqueness testing.
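
To make option 4 concrete, here is a minimal sketch in which a single in-process object stands in for the central server. The names `UsernameRegistry` and `claim` are hypothetical, as is the case-insensitive comparison; none of this is a CouchDB or LDAP API.

```python
import threading
import uuid


class UsernameRegistry:
    """Hypothetical stand-in for the single central uniqueness service."""

    def __init__(self):
        self._lock = threading.Lock()  # serialises claims: the "synchronous" part
        self._ids = {}                 # normalised username -> opaque user id

    def claim(self, username):
        """Return a new opaque user id, or None if the name is already taken."""
        key = username.strip().lower()  # assumption: names compare case-insensitively
        with self._lock:
            if key in self._ids:
                return None
            user_id = uuid.uuid4().hex  # opaque token to use as the document key
            self._ids[key] = user_id
            return user_id


registry = UsernameRegistry()
first = registry.claim("foo")    # succeeds and yields an opaque id
second = registry.claim("Foo")   # rejected: same name after normalisation
print(first is not None, second)
```

Only the claim has to go through the single server; the documents keyed by the returned token can then be written on any replica without colliding.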


On 20/10/2008, at 3:17 PM, ara howard wrote:

i know counting objects, aka, distributed auto-increment in couch is considered bad form. but let me propose a scenario and feel out people's thoughts on a specific topic, in the interest of solving what i think *must* be a solvable problem when using couch for an actual, real, live distributed system..
so let's say we want to store something

 login: foo
 password: bar
in a couchdb system, to authenticate users. clearly, when given a login, we want to look up the account by said login and validate a password.
so consider this a bit - we could store docs using "account-#{ login }" or some other permutation of the login name - the md5.. whatever...
this obviously isn't great - two users signing up on two different nodes will cause a collision at replication time, but not at sign up time, meaning it'd be nearly impossible to actually create a system with multi-master nodes that would allow something as simple as user signup without crazy after-the-fact email resolution requiring a user to re-signup iff their login was a dup.
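
The failure mode described here can be shown with a toy model. This is only a sketch: plain dicts stand in for two nodes, and `signup` and `replicate` are hypothetical helpers, not CouchDB's replicator or its revision handling.

```python
def signup(node, login, password):
    """Create an account doc keyed by login; fails only if THIS node has it."""
    doc_id = "account-" + login
    if doc_id in node:
        raise ValueError("login taken on this node")
    node[doc_id] = {"login": login, "password": password}


def replicate(source, target):
    """Copy docs from source to target; return ids whose contents disagree."""
    conflicts = []
    for doc_id, doc in source.items():
        if doc_id in target and target[doc_id] != doc:
            conflicts.append(doc_id)  # same id, different account
        else:
            target[doc_id] = doc
    return conflicts


node_a, node_b = {}, {}
signup(node_a, "foo", "bar")  # succeeds: node_a has never seen "foo"
signup(node_b, "foo", "baz")  # also succeeds: node_b has never seen "foo"
conflicts = replicate(node_a, node_b)
print(conflicts)  # → ['account-foo']
```

Both signups pass their local check, and the clash only shows up once the nodes exchange documents.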
okay, take two, let couch generate the uuid, and replication proceeds as planned. all is well. that is, until you want to authenticate a user... doing a search based on
 emit( doc.login, doc )
returns 14 results. two of them have the same password. which user *is* this client logging in?
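
The same toy model, again with dicts standing in for nodes, shows why generated ids only move the problem. `by_login` is a hypothetical helper that imitates what a view built on emit( doc.login, doc ) would return; it is not a CouchDB call.

```python
import uuid
from collections import defaultdict


def signup(node, login, password):
    """Store the account under a generated id, so replication never collides."""
    doc_id = uuid.uuid4().hex
    node[doc_id] = {"login": login, "password": password}
    return doc_id


def by_login(node):
    """Imitate the view: group every doc under its login, like emit(doc.login, doc)."""
    index = defaultdict(list)
    for doc in node.values():
        index[doc["login"]].append(doc)
    return index


node_a, node_b = {}, {}
signup(node_a, "foo", "bar")
signup(node_b, "foo", "bar")     # second user, same login and same password
merged = {**node_a, **node_b}    # replication succeeds: the ids differ
matches = by_login(merged)["foo"]
print(len(matches))  # → 2
```

Replication is now conflict-free, but the lookup by login is ambiguous, and the password does not disambiguate it either.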
so this seems like a real wart: replication is *useless* without a better mechanism for generating uuids. clearly we cannot expect a user to login via uuid, and clearly we cannot use the login, nor login:password combined as the uuid since that would create retro-active signup failures...
so, in a situation like this, requiring a unique set of data across all replicating systems, what would the 'couch way' be?
i think i'm stuck thinking inside a box and would love some insight to get out of it but, for now, i feel like the distributed and replicated nature of couch, while solving a host of issues, seems to open up vastly more complicated ones in the process.
kind regards.

a @ http://codeforpeople.com/
--
we can deny everything, except that we have the possibility of being better. simply reflect on that.
h.h. the 14th dalai lama


Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.

  -- C. A. R. Hoare
