On Feb 8, 2009, at 11:27 PM, Antony Blakey wrote:
On 09/02/2009, at 2:35 PM, Paul Davis wrote:
There is no concept of an "MVCC boundary" anywhere in the code that
I'm aware of.
Database updates create an MVCC commit, and reads are all wrt an MVCC
commit. MVCC boundaries, i.e. commit points, are a fundamental part
of the Couch low-level architecture. When _bulk_docs was ACID, they
were exposed in the user-level API.
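
To make the commit-point idea concrete, here's a toy sketch in Python
(illustrative only, not CouchDB's actual storage code): every update
produces a new immutable commit, and every read is served from exactly
one commit, never a partial update.

class ToyMVCCStore:
    def __init__(self):
        self._commits = [{}]          # commit 0: the empty database

    def update(self, docs):
        # Apply a batch of writes as one new commit (one MVCC boundary).
        new_state = dict(self._commits[-1])
        new_state.update(docs)
        self._commits.append(new_state)
        return len(self._commits) - 1  # the new commit's sequence number

    def snapshot(self, seq=None):
        # Readers always see exactly one commit.
        if seq is None:
            seq = len(self._commits) - 1
        return self._commits[seq]

store = ToyMVCCStore()
seq = store.update({"a": 1, "b": 2})   # both docs land in one commit
reader = store.snapshot(seq)           # read pinned at that commit point
store.update({"a": 99})                # later commits don't disturb it
assert reader["a"] == 1
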
I think the bigger point here is that what you're asking for violates
a huge swath of assumptions baked into the core of CouchDB. Asking
CouchDB to do consistent inter-document writes is going to require
you
to either change a large amount of internal code or write some very
specific app code to get what you want.
But it already did consistent inter-document writes - the removal of
that is what this discussion is about.
You may be able to get atomic
interdocument updates on a single node, but this is violated if you
do
so much as try and replicate.
And 'so much as try and replicate' is the issue, because the
replication model varies for different use cases. In my previous
posts you'll see that I'm promoting the idea that the local,
exclusive-replication use case is significant and useful. There are
useful models where replication is a fundamentally different
operation than local use.
IMO, it would be better to not support _bulk_docs for exactly this
reason. People that use _bulk_docs will end up assuming that the
atomic properties carry over into places where they don't actually
hold.
But it can for local operations, and replication conflicts can be
dealt with separately from normal operation.
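
For reference, the call under discussion looks roughly like this on
the wire (the URL, database name and documents are placeholders; the
endpoint itself is real CouchDB API). Under the pre-0.9 semantics this
thread is about, the whole batch lands as a single commit locally, but
nothing carries that atomicity across replication.

import json
import urllib.request

batch = {"docs": [
    {"_id": "account-a", "balance": 90},
    {"_id": "account-b", "balance": 110},
]}
req = urllib.request.Request(
    "http://localhost:5984/mydb/_bulk_docs",   # placeholder URL
    data=json.dumps(batch).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))   # one result entry per document
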
It occurs to me that once you get to the point of writing source and
target database locking, you no longer need _bulk_docs. You'd have
enough code to do all the atomic interdoc writes you need.
Only by giving up all local concurrency. Locking is only wrt
replication vs. local operation. And I think the most recent emails
are showing that source locking is not as black-and-white as you
think - it's only wrt compaction, and even then I think it's
restricted to a requirement not to compact past the MVCC state being
used by the replication process, which IMO is a trivial issue
because compaction cannot invalidate the head MVCC state, and a
replication request will always use the head state in effect at
request-time.
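
To make that argument concrete (again a toy model, not CouchDB
internals): compaction drops history but by definition keeps the head
commit, so a replication pass that pinned the head at request time
still has a valid snapshot afterwards.

class ToyStore:
    def __init__(self):
        self.commits = [{}]          # commit history, oldest first

    def update(self, docs):
        state = dict(self.commits[-1])
        state.update(docs)
        self.commits.append(state)

    def head(self):
        return self.commits[-1]      # what a replication request pins

    def compact(self):
        # Compaction discards history but keeps the head commit intact.
        self.commits = [self.commits[-1]]

store = ToyStore()
store.update({"doc1": "v1"})
store.update({"doc1": "v2"})
replication_snapshot = store.head()  # pinned at request time
store.compact()                      # cannot invalidate the head
assert replication_snapshot["doc1"] == "v2"
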
Though it'd
be rather un-couchy.
CouchDB has wide applicability, and what you regard as un-couchy is
only relative to a certain use-case. I'm trying to promote a more
generous interpretation of what CouchDB is, and can be.
I see the critical problem as being consistent updates during
replication. Unless you do it in one big transaction, the intermediate
replication states of the database are inconsistent, so the target
database is unusable during replication. A bulk transaction is limited
in how many docs it can handle, so it only works for smallish
databases. That alone means MVCC replication isn't useful in the
general case.
But for your purposes, it's maybe possible. You'll need to write a
special replicator and create a single HTTP request to give you
everything from the source in one go. Then you'll need to write-lock
(or disable) the target database during replication, unless they are
always small databases in which case the special replicator can use a
single bulk transaction.
You'll also need to serialize the updates to the database in the
application layer and add the conflict checking there. That will give
you the desired transaction semantics.
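
A minimal sketch of that special replicator, assuming an empty target,
databases small enough for one request each, and a target that is
write-locked or otherwise idle for the duration - the endpoint names
are real CouchDB API, everything else is placeholder:

import json
import urllib.request

SOURCE = "http://localhost:5984/source_db"   # placeholder URLs
TARGET = "http://localhost:5984/target_db"

def fetch_json(url):
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def one_shot_replicate():
    # 1. Everything from the source in a single HTTP request.
    rows = fetch_json(SOURCE + "/_all_docs?include_docs=true")["rows"]
    docs = []
    for row in rows:
        doc = dict(row["doc"])
        # Initial load into an empty target; an incremental replicator
        # would need real revision handling instead of stripping _rev.
        doc.pop("_rev", None)
        docs.append(doc)

    # 2. One bulk write to the target; with the atomic _bulk_docs this
    #    thread discusses, the target never shows an intermediate state.
    req = urllib.request.Request(
        TARGET + "/_bulk_docs",
        data=json.dumps({"docs": docs}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)       # per-doc results, conflicts included

if __name__ == "__main__":
    for result in one_shot_replicate():
        if "error" in result:
            # This is the app-layer conflict checking described above.
            print("conflict to resolve:", result)
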
If you build what I've described, and assuming you can live with the
limitations, you will have an always-consistent, one-way-replication
document distribution platform.
-Damien
Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787
Human beings, who are almost unique in having the ability to learn
from the experience of others, are also remarkable for their
apparent disinclination to do so.
-- Douglas Adams