If you want to ensure that the username is unique at the time the user enters it, then you need a central synchronous service. Using the username/password as a pair isn't a good idea, because it only takes two naive/lazy users picking the same username and the same password (based, say, on their username? :) for a collision to occur later.

I've been considering this in a production environment, and I saw four solutions:

1. Append some form of unique id from the server you are currently talking to, e.g. a checksum followed by machine uuid + process. Any checksum is going to have some chance of global collision, but it could be made vanishingly small. Not great for the user, because they end up with a complicated username.

2. Define your user interaction such that it can deal with subsequently needing to add some suffix to the username, e.g. when you get a replication conflict (which could involve a number of conflicts equal to the number of writable replicas), you amend some/all of the names to include a serial number and then email the user. This complicates things for the user: they end up with a username they haven't chosen, or they may not see the email and end up abusing tech support, etc.

3. Use CouchDB in a single-writer multiple-reader scenario. If you only do that for those activities that require uniqueness, then you have consistency issues to deal with, because replication is asynchronous. One way to deal with that is to switch a session to the writable server as soon as you need uniqueness. The single writer becomes a bottleneck, but this is what I'm doing because it matches my information architecture.

4. Use a central specialized server to check uniqueness and generate an opaque userid token that you would subsequently use as a key (you shouldn't use the username as a key). An LDAP server or something like it would do. This is equivalent to the option above, but the single server only needs to deal with the particular operations requiring uniqueness. It's still a single point of failure, but I don't think you can get around that if you want synchronous global uniqueness testing.
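
To make option 4 concrete, here is a minimal in-memory sketch in Python of what such a central service might do. Everything in it is an assumption for illustration: the class name UsernameRegistry, the claim/lookup methods, and the choice to treat usernames as case-insensitive are all hypothetical, and a real service would need persistence and a network interface in front of it. The point is only that the check-and-insert happens atomically in one place, and that what comes back is an opaque id rather than the username.

```python
import threading
import uuid


class UsernameRegistry:
    """Hypothetical single-writer registry: the one place usernames are claimed."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ids_by_name = {}  # normalized username -> opaque user id

    @staticmethod
    def _normalize(username):
        # Assumption: usernames are compared case-insensitively.
        return username.strip().lower()

    def claim(self, username):
        """Return a new opaque user id, or None if the name is already taken."""
        key = self._normalize(username)
        with self._lock:  # the check and the insert must happen atomically
            if key in self._ids_by_name:
                return None
            user_id = uuid.uuid4().hex  # opaque token, unrelated to the name
            self._ids_by_name[key] = user_id
            return user_id

    def lookup(self, username):
        """Return the opaque user id for a username, or None if unknown."""
        with self._lock:
            return self._ids_by_name.get(self._normalize(username))


registry = UsernameRegistry()
first = registry.claim("foo")   # succeeds, returns an opaque id
second = registry.claim("Foo")  # rejected at sign-up time, returns None
```

The replicated documents would then be keyed by the opaque id, so replication never sees two documents competing for the same username.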

On 20/10/2008, at 3:17 PM, ara howard wrote:


i know counting objects, aka, distributed auto-increment in couch is considered bad form. but let me propose a scenario and feel out people's thoughts on a specific topic, in the interest of solving what i think *must* be a solvable problem when using couch for an actual, real, live distributed system..

so let's say we want to store something

 login: foo
 password: bar

in a couchdb system, to authenticate users. clearly, when given a login, we want to lookup a given login by said login and validate a password.

so consider this a bit - we could store docs using "account-#{ login }" or some other permutation of the login name - the md5.. whatever...

this obviously isn't great - two users signing up with the same login on two different nodes will cause a collision at replication time, but not at sign up time, meaning it'd be nearly impossible to actually create a system with multi-master nodes that would allow something as simple as user signup without crazy after the fact email resolution requiring a user to re-signup iff their login was a dup.

okay, take two, let couch generate the uuid, and replication proceeds as planned. all is well. that is, until you want to authenticate a user... doing a search based on

 emit( doc.login, doc )

returns 14 results. two of them have the same password. which user *is* this client logging in?

so this seems like a real wart: replication is *useless* without a better mechanism for generating uuids. clearly we cannot expect a user to login via uuid, and clearly we cannot use the login, nor login:password combined as the uuid since that would create retroactive signup failures...

so, in a situation like this, requiring a unique set of data across all replicating systems, what would the 'couch way' be?

i think i'm stuck thinking inside a box and would love some insight to get out of it but, for now, i feel like the distributed and replicated nature of couch, while solving a host of issues, seems to open up vastly more complicated ones in the process.

kind regards.

a @ http://codeforpeople.com/
--
we can deny everything, except that we have the possibility of being better. simply reflect on that.
h.h. the 14th dalai lama




Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.
  -- C. A. R. Hoare

