On Mon, Feb 9, 2009 at 2:40 AM, Antony Blakey <antony.bla...@gmail.com> wrote:
>
> On 09/02/2009, at 5:45 PM, Paul Davis wrote:
>
>> A write takes the most recent status of the database. It performs the
>> write using the append-only semantics of editing btrees. When the
>> write completes it uses an atomic write to the db header. This means
>> that no matter what, new readers get a consistent view of the entire
>> database.
>
> The atomic write of the root is the commit. A reader, by virtue of an atomic
> read of the header, sees a commit point.
>
>> As I read your emails you seem to be assuming that CouchDB could walk
>> back through the valid database commits. As far as I understand, this
>> is not possible given the current database format. Furthermore, making
>> it possible would require a large amount of engineering to accomplish.
>
> No, you don't have to walk the commits. There is no record of commits,
> except in as much as you might have a number of different roots in use by
> concurrent processes at any given time, i.e. multiple commit points are an
> ephemeral thing. There is only ever *one* durable commit point.
>
>> AFAIK, we supported inter-document consistency to a single node. Now
>> that we're more seriously contemplating multi-node setups it's becoming
>> apparent that the single-node atomicity was a special case when it can
>> be violated by something as simple as a replication.
>
> Well, I believe I've shown that a simple change can make replication
> (optionally) respect MVCC commit points, involves very little change to the
> source algorithm, doesn't impact the current semantics at all unless you
> wish it to, and works on a per-replication request basis.
>
> This is orthogonal to the problem of cluster-ACID, which is also do-able,
> but I'm trying to work through this replication issue right now.
>
>> I'm uncertain what you mean by 'replication model'.
>
> According to my use-case list, e.g. whether replication is exclusive with
> normal operation, and whether it can result in conflict (i.e. single-master
> deployments).
>
>> My current
>> understanding of replication is that it violates the promises of
>> _bulk_docs. As Damien mentions further down, to support what you're
>> asking for, you more or less need to repeat all _bulk_docs calls to
>> your central server in app code. This is quite possible. If enough
>> other people chimed in and voiced an opinion that this is something
>> they are interested in, I can see it as a valid reason for supporting
>> _bulk_docs-like functionality in the future.
>
> I don't want to replicate reified transactions. The current state of the
> source wrt. an MVCC commit point is all that is required iff your MVCC
> commit point is exposed to the user. You can build local transactions and a
> useful (NOT generic) form of distributed transactional consistency on top of
> that.
>
>> If it's trivial, then post a patch to JIRA.
>
> We're discussing a proposed patch, against which this idea would be a patch
> :)
>
> I'm just addressing the idea that you can't compact the source database
> while replication occurs if replication is made MVCC aware.
>
>> The thing is, your interpretation is asking CouchDB to prove the CAP
>> theorem incorrect.
>
> Not at all. I'm saying that there are application/deployment models and
> use-cases that
>
> a) distinguish between replication and normal operation, e.g. the system moves
> from normal, conflict-free operation, to replication, to
> conflict-resolution, back to normal operation; and/or
>
> b) have a model that doesn't generate replication conflicts, e.g.
> single-master replication doesn't fall under CAP.
>
> Antony Blakey
> -------------
> CTO, Linkuistics Pty Ltd
> Ph: 0438 840 787
>
> One should respect public opinion insofar as is necessary to avoid
> starvation and keep out of prison, but anything that goes beyond this is
> voluntary submission to an unnecessary tyranny.
>   -- Bertrand Russell
>
I would be convinced by an implementation. Until then I'll remain skeptical.

HTH,
Paul Davis
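
For readers following the thread, the commit model described in the quoted text (append-only writes, then one atomic header update that acts as the commit) can be sketched roughly as below. This is a toy in-memory model in Python, not CouchDB's code or file format. Every name in it (AppendOnlyStore, commit, snapshot, replicate_snapshot) is hypothetical, and the "replication" shown is only an assumption about what copying from a single commit point might look like, not the patch under discussion.

```python
# Toy model of the commit scheme described in the thread; NOT CouchDB's code.
# All class and function names here are hypothetical and for illustration only.
from types import MappingProxyType


class AppendOnlyStore:
    def __init__(self):
        self._log = []      # append-only list of read-only roots
        self._header = -1   # index of the current commit point; -1 means empty

    def commit(self, changes):
        """Build a new root from the latest one, append it, then flip the header."""
        base = dict(self._log[self._header]) if self._header >= 0 else {}
        base.update(changes)
        self._log.append(MappingProxyType(base))  # older roots are never modified
        self._header = len(self._log) - 1         # this single assignment is the commit

    def snapshot(self):
        """One read of the header yields a consistent view of the whole database."""
        if self._header < 0:
            return MappingProxyType({})
        return self._log[self._header]


def replicate_snapshot(source, target):
    """Copy all documents visible at ONE commit point of source into target."""
    view = source.snapshot()   # pin a single commit point for the whole copy
    target.commit(dict(view))  # applied to the target as a single commit
    return view


if __name__ == "__main__":
    src, dst = AppendOnlyStore(), AppendOnlyStore()
    src.commit({"a": 1, "b": 1})
    pinned = src.snapshot()
    src.commit({"a": 2, "b": 2})   # a later commit does not disturb the pinned view
    print(dict(pinned), dict(src.snapshot()))
    replicate_snapshot(src, dst)
    print(dict(dst.snapshot()))
```

The point of the sketch is only that a reader (or a replicator) that reads the header once keeps seeing that commit point, however many commits follow. It says nothing about compaction, conflicts, or multi-node behaviour, which are the open questions in this thread.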