On 30/01/2009, at 6:30 PM, Chris Anderson wrote:
Ahh, I didn't consider the validation function as being replicated as
well. I suppose I'm imagining that validation functions will define
the borders of applications, and thinking of these data flows as
within a particular application.
This is a straightforward consequence of the fact that design docs are
documents just like any other, which of course has so many good
effects that it's hard to find fault with the system because of edge
cases like this.
No fault with design docs being normal docs, and in fact I can't see
how making it otherwise could solve this problem.
Another edge case from validation functions (which can happen even on
a single node) is that documents which have been added to a db can be
invalidated by the addition of a validation function after they have
been saved. Having a view of all newly-invalid docs will definitely be
useful.
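
As a rough illustration of that "newly-invalid docs" idea, here is a toy model in Python. It is only a sketch and does not use CouchDB's API: documents are plain dicts, and `validate_v2` and `newly_invalid` are hypothetical names standing in for a validation rule added after some docs were already saved and for the view that reports the docs it would now reject.

```python
# Toy model only: not CouchDB's API. Docs are dicts, and a validation
# function returns an error string, or None when the doc is acceptable.

def validate_v2(doc):
    """Hypothetical rule introduced after some docs were already stored."""
    if "author" not in doc:
        return "missing author"
    return None


def newly_invalid(docs, validate):
    """Map each stored doc id to the reason the rule would now reject it."""
    result = {}
    for doc in docs:
        reason = validate(doc)
        if reason is not None:
            result[doc["_id"]] = reason
    return result


# Docs that were saved before the rule existed (made-up sample data).
stored = [
    {"_id": "a", "title": "first"},
    {"_id": "b", "title": "second", "author": "chris"},
]

invalid = newly_invalid(stored, validate_v2)
```

The point of the sketch is that the stored docs are left untouched; invalidity is reported as derived data, in the same way a view is derived from the docs.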
It seems like if you want to ensure that your system follows some of
the stricter principles you outlined, you'll have to avoid use of
(changing) validation functions in your applications. Or at least be
very thoughtful about code roll-outs.
That's assuming that there is such a thing as a roll-out. If your code
is replicated with your data, which it must be with design doc
functions because they are applicable to the db in which they reside,
then I can't see an effective way to use those features with mesh
deployments.
An alternative might be to regard these non-functional features as
being a layer above the canonical store maintained by replication,
something like a view. So replication would never block, and the
underlying model would always guarantee that a global steady state is
reachable regardless of ordering.
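
To make the ordering concern concrete, here is a small simulation in Python. It is a sketch under simplifying assumptions, not a description of how CouchDB replicates: a "rule" change stands in for a design doc carrying a validation function, the rule simply requires a field to be present, and `apply_blocking` and `apply_layered` are hypothetical names for the two approaches being compared.

```python
# Toy simulation only: not CouchDB's replication protocol.

def apply_blocking(changes):
    """Validation gates writes: a doc is dropped if a rule already present rejects it."""
    store, required = {}, []
    for change in changes:
        if change["kind"] == "rule":
            required.append(change["required_field"])
            store[change["_id"]] = change
        elif all(field in change for field in required):
            store[change["_id"]] = change
    return store


def apply_layered(changes):
    """Store accepts everything; validity is derived afterwards, like a view."""
    store = {change["_id"]: change for change in changes}
    required = [c["required_field"] for c in store.values() if c["kind"] == "rule"]
    invalid = sorted(
        doc_id
        for doc_id, c in store.items()
        if c["kind"] == "doc" and not all(field in c for field in required)
    )
    return store, invalid


# Made-up sample changes: one validation rule and one doc that fails it.
rule = {"_id": "_design/app", "kind": "rule", "required_field": "author"}
doc = {"_id": "d1", "kind": "doc", "title": "x"}

# Two nodes receive the same changes in opposite orders.
blocking_doc_first = apply_blocking([doc, rule])
blocking_rule_first = apply_blocking([rule, doc])
layered_doc_first = apply_layered([doc, rule])
layered_rule_first = apply_layered([rule, doc])
```

In this model the blocking nodes end up with different stores depending on arrival order, while the layered nodes hold identical stores and derive the same set of invalid docs.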
And thinking of your 'newly-invalid' docs view, that feels like the
same kind of thing I'm suggesting.
I haven't thought more than that about what a solution would look
like, and I'm not sure at this point if partial replication is an
identical problem or not. My gut feel is that intermingling the
CouchDB-as-application features with CouchDB-as-replicated-document-
store functionality is problematic and requires enormous care, and a
more layered and partitioned approach might be prudent.
Regardless of the issue for meshes, it seems that using validation, or
any other non-functional feature that impacts the canonical data, as
opposed to derived data such as views, opens a real can of worms for
developers. Given your IRC comment about unfortunate memes being
generated by naive developers (cf. single-node transactions), it
seems to me that if the underlying model presented by CouchDB becomes
more difficult to use or requires a more subtle understanding of the
very very hairy problem of distribution/global state etc., then the
meme will be 'CouchDB is impossible to get right'.
Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787
The trouble with the world is that the stupid are cocksure and the
intelligent are full of doubt.
-- Bertrand Russell