If you want to ensure that the username is unique at the time the user
enters it, then you need a central synchronous service. Using the
username/password as a pair isn't a good idea because it only takes
two naive/lazy users to use a similar password (based say on their
username? :) for a collision to subsequently occur.
I've been considering this in a production environment, and I saw four
solutions:
1. Append some form of unique id from the server you are currently
talking to, e.g. a checksum of the machine uuid + process id. Any
checksum is going to have some chance of global collision, but it
could be made vanishingly small. Not great for the user because they
end up with a complicated username.
2. Define your user interaction such that it can deal with
subsequently needing to add some suffix to the username e.g. when you
get a replication conflict (which could involve a number of conflicts
equal to the number of writable replicas), you amend some/all of the
names to include a serial number and then email the user. This
complicates things for the user, and they end up with a username they
haven't chosen, or they may not see the email and end up abusing tech
support etc etc.
3. Use couchdb in a single-writer multiple-reader scenario. If you
only do that for those activities that require uniqueness then you
have consistency issues to deal with because replication is
asynchronous. One way to do that is to switch a session to the
writable server as soon as you need uniqueness. The single writer
becomes a bottleneck, but this is what I'm doing because it matches my
information architecture.
4. Use a central specialized server to check uniqueness and generate
an opaque userid token that you would subsequently use as a key (you
shouldn't use the username as a key). An ldap server or something like
it. Equivalent to the option above, but the single server only needs
to deal with the particular operations requiring uniqueness. It's
still a single point of failure, but I don't think you can get around
that if you want synchronous global uniqueness testing.
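As a rough illustration of option 4, here is a minimal Python sketch
of such a central service. It is only an in-memory stand-in under
stated assumptions: the names UsernameRegistry, UsernameTaken, claim
and lookup are made up for this example and are not part of CouchDB
or LDAP, and a real service would need persistent storage.

```python
import threading
import uuid


class UsernameTaken(Exception):
    """Raised when a username has already been claimed."""


class UsernameRegistry:
    """Hypothetical single-server uniqueness check (not a CouchDB API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ids_by_name = {}

    def claim(self, username):
        # The check and the insert happen under one lock, so two
        # concurrent signups for the same name cannot both succeed.
        with self._lock:
            if username in self._ids_by_name:
                raise UsernameTaken(username)
            # Opaque token, unrelated to the username itself.
            user_id = uuid.uuid4().hex
            self._ids_by_name[username] = user_id
            return user_id

    def lookup(self, username):
        # Returns the opaque userid, or None if the name is unclaimed.
        with self._lock:
            return self._ids_by_name.get(username)
```

A signup handler would call claim() once and use the returned token
as the document key, so the username becomes an ordinary field that
replicates along with the rest of the document.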
On 20/10/2008, at 3:17 PM, ara howard wrote:
i know counting objects, aka, distributed auto-increment in couch is
considered bad form. but let me propose a scenario and feel out
people's thoughts on a specific topic, in the interest of solving what
i think *must* be a solvable problem when using couch for an actual,
real, live distributed system..
so let's say we want to store something
login: foo
password: bar
in a couchdb system, to authenticate users. clearly, when given a
login, we want to lookup a given login by said login and validate a
password.
so consider this a bit - we could store docs using
"account-#{ login }" or some other permutation of the login name - the
md5.. whatever...
this obviously isn't great - two users signing up on two different
nodes will cause a collision at replication time, but not at sign up
time, meaning it'd be nearly impossible to actually create a system
with multi-master nodes that would allow something as simple as user
signup without crazy after the fact email resolution requiring a
user to re-signup iff their login was a dup.
okay, take two, let couch generate the uuid, and replication
proceeds as planned. all is well. that is, until you want to
authenticate a user... doing a search based on
emit( doc.login, doc )
returns 14 results. two of them have the same password. which user
*is* this client logging in?
so this seems like a real wart: replication is *useless* without a
better mechanism for generating uuids. clearly we cannot expect a
user to login via uuid, and clearly we cannot use the login, nor
login:password combined as the uuid since that would create
retroactive signup failures...
so, in a situation like this, requiring a unique set of data across
all replicating systems, what would the 'couch way' be?
i think i'm stuck thinking inside a box and would love some insight
to get out of it but, for now, i feel like the distributed and
replicated nature of couch, while solving a host of issues, seems to
open up vastly more complicated ones in the process.
kind regards.
a @ http://codeforpeople.com/
--
we can deny everything, except that we have the possibility of being
better. simply reflect on that.
h.h. the 14th dalai lama
Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787
There are two ways of constructing a software design: One way is to
make it so simple that there are obviously no deficiencies, and the
other way is to make it so complicated that there are no obvious
deficiencies.
-- C. A. R. Hoare