On May 26, 2009, at 9:39 AM, Brian Candler wrote:
On Mon, May 25, 2009 at 12:36:11PM -0700, Chris Anderson wrote:
a database in couchdb is the place where work comes together, in our
case this is the location where a group of people shares. combining
information from different databases will be necessary. and i really
have no clue yet how to approach this problem. so anyone?
The easiest thing is to merge the databases with replication.
In some ways this is the easiest - and in some ways it is the hardest.
It will be easy if your various sources are creating distinct
documents. It will be hard if the various sources are editing the same
documents, because you will start having to deal with replication
conflicts.
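As a rough sketch of the "merge with replication" idea: each source
database is replicated into one combined database by POSTing a JSON body
with "source" and "target" to the server's _replicate resource. The
server URL and the database names below are made up for illustration,
and the code only builds the requests, it does not send them.

```python
import json
import urllib.request


def build_replication_request(server, source_db, target_db):
    """Build (but do not send) a POST to /_replicate for one source database."""
    body = json.dumps({"source": source_db, "target": target_db}).encode("utf-8")
    return urllib.request.Request(
        server.rstrip("/") + "/_replicate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def merge_plan(server, source_dbs, target_db):
    """One replication request per source database, all into the same target."""
    return [build_replication_request(server, s, target_db) for s in source_dbs]


# Hypothetical names: two group databases merged into "combined".
requests = merge_plan("http://localhost:5984", ["group_a", "group_b"], "combined")
```

Each request could then be sent with urllib.request.urlopen. If the
sources only ever create distinct documents, this is the easy case
described above; if they edit the same documents, the combined database
is where the conflicts will show up.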
i think replication is not the solution for the specific problem i
tried to sketch. i am talking about simple aggregate information (10
most recent documents per user, for example) over potentially
thousands of different databases. if i have to replicate all my
databases into one big database i would start with a big one and
replicate out to handle load. that feels like 'missing the point'.
(though i am still struggling which point exactly :) )
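One way to picture the aggregate without replicating everything into one
big database is to merge on the client: ask each database for its own
recent documents and combine the results. The sketch below assumes each
database can already hand back such rows (for example from a view keyed
on user and timestamp), and that documents carry "user" and "created_at"
fields with ISO 8601 timestamps in one consistent format. Both field
names are assumptions, not something from this thread.

```python
import heapq
from collections import defaultdict


def most_recent_per_user(per_db_rows, n=10):
    """Merge per-database result lists into the n newest documents per user.

    per_db_rows: iterable of lists of documents, one list per database.
    Timestamps are compared as strings, which works for ISO 8601 values
    written in the same format and time zone.
    """
    by_user = defaultdict(list)
    for rows in per_db_rows:
        for doc in rows:
            by_user[doc["user"]].append(doc)
    return {
        user: heapq.nlargest(n, docs, key=lambda d: d["created_at"])
        for user, docs in by_user.items()
    }


# Illustrative data standing in for the results from two databases.
db_a = [
    {"_id": "a1", "user": "jurg", "created_at": "2009-05-25T10:00:00Z"},
    {"_id": "a2", "user": "brian", "created_at": "2009-05-26T09:39:00Z"},
]
db_b = [{"_id": "b1", "user": "jurg", "created_at": "2009-05-26T08:00:00Z"}]
recent = most_recent_per_user([db_a, db_b], n=10)
```

This still costs one query per database, so with thousands of databases
the queries themselves become the expensive part. The merge step stays
cheap as long as each database returns only its own top n per user.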
Whilst CouchDB's model for this is logically self-consistent, I
personally still believe that it's difficult to use for real-world
applications. For example, if you GET a document, you will get one
arbitrary version. You will get no indication that conflicting versions
of this document may or may not exist. If you want to ensure that your
user always sees the latest, resolved version of the document, then you
need to explicitly ask for all the conflicting revisions, and then you
need to fetch them individually (AFAICS, even a regular "bulk fetch"
using _all_docs and keys can't do this), and then merge them in an
application-specific way, and then put back the merged version and
delete all the conflicting revs.
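The last two steps of that procedure can be sketched as a pure function.
It assumes the document was fetched with ?conflicts=true (so it carries a
_conflicts list) and that each conflicting revision was then fetched
individually with ?rev=. It builds a single _bulk_docs payload that
writes the merged version and marks the losing revisions as deleted. The
merge function and the "tags" field are purely illustrative; the real
merge is application-specific, as noted above.

```python
def resolve_conflicts(winner, conflicting_docs, merge):
    """Build a _bulk_docs payload that resolves the conflicts on one document.

    winner: the document as returned by GET with ?conflicts=true.
    conflicting_docs: the revisions named in winner["_conflicts"],
        each fetched separately with ?rev=<rev>.
    merge: application-specific function (doc, doc) -> doc.
    """
    merged = dict(winner)
    merged.pop("_conflicts", None)  # a read-only hint, not part of the document
    for other in conflicting_docs:
        merged = merge(merged, other)
    # The merged version is written as a new revision on top of the winner.
    merged["_id"] = winner["_id"]
    merged["_rev"] = winner["_rev"]
    # Every losing revision is deleted so the conflict goes away.
    tombstones = [
        {"_id": winner["_id"], "_rev": doc["_rev"], "_deleted": True}
        for doc in conflicting_docs
    ]
    return {"docs": [merged] + tombstones}


def union_tags(a, b):
    """Example merge (hypothetical): keep a's fields, union the 'tags' lists."""
    out = dict(a)
    out["tags"] = sorted(set(a.get("tags", [])) | set(b.get("tags", [])))
    return out


winner = {"_id": "doc1", "_rev": "2-aaa", "tags": ["x"], "_conflicts": ["2-bbb"]}
loser = {"_id": "doc1", "_rev": "2-bbb", "tags": ["y"]}
payload = resolve_conflicts(winner, [loser], union_tags)
```

The payload would be POSTed to the database's _bulk_docs resource. The
number of round trips is the point being made above: one GET to discover
the conflicts, one GET per conflicting revision, and one write to resolve
them.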
yes and no, it all depends on how you regard your users. i think in an
environment where many people create something together the conflicts
have meaning. i choose to expose the conflict, meaningfully, and
'help' the user resolve it herself.
but this is different from what we have been taught. we are all a bit
afraid of difficult and critical users, because we might have to
explain why things break :P i do feel it fits very well with couchdb
though.
in an environment where the conflicts are not results of user actions
there is no problem as well, because the conflicts are logical. (not
necessarily the same as meaningful, though :P )
groet,
jurg.
Regards,
Brian.