On 09/02/2009, at 5:45 PM, Paul Davis wrote:

A write takes the most recent status of the database. It performs the
write using the append only semantics of editing btrees. When the
write completes it uses an atomic write to the db header. This means
that no matter what, new readers get a consistent view of the entire
database.

The atomic write of the root is the commit. A reader, by virtue of an atomic read of the header, sees a commit point.
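
To make the commit-point idea concrete, here is a minimal in-memory sketch. It is a toy model, not CouchDB's storage code: `ToyStore`, `snapshot` and `write` are hypothetical names, an immutable mapping stands in for the btree root, and a single reference assignment stands in for the atomic header write.

```python
from types import MappingProxyType


class ToyStore:
    """Toy model of an append-only store with a single commit point."""

    def __init__(self):
        # The "header": a reference to the current, immutable root.
        self._root = MappingProxyType({})

    def snapshot(self):
        # One atomic read of the header. The caller keeps this root for as
        # long as it likes; later writes never modify it.
        return self._root

    def write(self, updates):
        # Build a new root from the most recent state; nothing is edited
        # in place, and the new root is invisible to readers so far.
        new_root = dict(self._root)
        new_root.update(updates)
        # Swapping the header reference is the commit.
        self._root = MappingProxyType(new_root)


store = ToyStore()
store.write({"a": 1, "b": 1})
reader_view = store.snapshot()      # reader pins a commit point
store.write({"a": 2, "b": 2})       # a later commit
print(dict(reader_view))            # the old reader still sees a and b together at 1
print(dict(store.snapshot()))       # a new reader sees the latest commit
```

Older roots survive here only while some reader still holds a reference to them, so the store itself keeps no history of commits.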

As I read your emails you seem to be assuming that CouchDB could walk
back through the valid database commits. As far as I understand, this
is not possible given the current database format. Furthermore, making
it possible would require a large amount of engineering to accomplish.

No, you don't have to walk the commits. There is no record of commits, except inasmuch as you might have a number of different roots in use by concurrent processes at any given time, i.e. multiple commit points are an ephemeral thing. There is only ever *one* durable commit point.

AFAIK, we supported inter-document consistency on a single node. Now
that we're more seriously contemplating multi-node setups it's becoming
apparent that the single-node atomicity was a special case, given that
it can be violated by something as simple as a replication.

Well, I believe I've shown that a simple change can make replication (optionally) respect MVCC commit points, involves very little change to the source algorithm, doesn't impact the current semantics at all unless you wish it to, and works on a per-replication request basis.
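
As a rough illustration of what I mean by an optional, per-request switch, consider the sketch below. It is a toy model under stated assumptions, not CouchDB's replicator: `ToyStore`, `replicate`, `pin_commit_point` and `after_each` are hypothetical names, and the `after_each` hook exists only to simulate a writer committing in the middle of a replication.

```python
from types import MappingProxyType


class ToyStore:
    """Toy store: an immutable root plus a single reference swap as commit."""

    def __init__(self):
        self._root = MappingProxyType({})

    def snapshot(self):
        return self._root

    def write(self, updates):
        new_root = dict(self._root)
        new_root.update(updates)
        self._root = MappingProxyType(new_root)


def replicate(source, target, doc_ids, pin_commit_point=False, after_each=None):
    # With pin_commit_point=True the whole replication reads from one root.
    # Otherwise each document is read from whatever root is current then.
    pinned = source.snapshot() if pin_commit_point else None
    for doc_id in doc_ids:
        view = pinned if pinned is not None else source.snapshot()
        target.write({doc_id: view[doc_id]})
        if after_each is not None:
            after_each()  # hypothetical hook: simulates concurrent activity


def run(pin):
    src, tgt = ToyStore(), ToyStore()
    src.write({"a": 1, "b": 1})
    # A writer commits a=2, b=2 together while replication is in flight.
    replicate(src, tgt, ["a", "b"], pin_commit_point=pin,
              after_each=lambda: src.write({"a": 2, "b": 2}))
    return dict(tgt.snapshot())


print(run(pin=False))  # target can mix documents from two source commits
print(run(pin=True))   # target matches a single source commit point
```

The per-document loop is the same in both modes; the only difference is whether the root is read once per request or once per document.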

This is orthogonal to the problem of cluster-ACID, which is also doable, but I'm trying to work through this replication issue right now.

I'm uncertain what you mean by 'replication model'.

According to my use-case list, e.g. whether replication is exclusive with normal operation, and whether it can result in conflict (it cannot in single-master deployments).

My current understanding of replication is that it violates the
promises of _bulk_docs. As Damien mentions further down, to support
what you're asking for, you more or less need to repeat all _bulk_docs
calls to your central server in app code. This is quite possible. If
enough other people chimed in and voiced an opinion that this is
something they are interested in, I can see it as a valid reason for
supporting _bulk_docs-like functionality in the future.

I don't want to replicate reified transactions. The current state of the source wrt. an MVCC commit point is all that is required iff your MVCC commit point is exposed to the user. You can build local transactions and a useful (NOT generic) form of distributed transactional consistency on top of that.

If it's trivial, then post a patch to JIRA.

We're discussing a proposed patch, against which this idea would be a patch :)

I'm just addressing the idea that you can't compact the source database while replication occurs if replication is made MVCC aware.

The thing is, your interpretation is asking CouchDB to prove the CAP
theorem incorrect.

Not at all. I'm saying that there are application/deployment models and use-cases that

a) distinguish between replication and normal operation e.g. the system moves from normal, conflict-free operation, to replication, to conflict-resolution, back to normal operation; and/or

b) have a model that doesn't generate replication conflicts e.g. single-master replication doesn't fall under CAP.

Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

One should respect public opinion insofar as is necessary to avoid starvation and keep out of prison, but anything that goes beyond this is voluntary submission to an unnecessary tyranny.
  -- Bertrand Russell

