As a user, when I chose couchdb for my most recent project, I chose it
because I didn't care about transactions. I would've used an RDBMS if
that were important. I chose it because couch solved the problems I
needed solved very well.
I don't think transactions should be a big dev focus.

On Fri, May 22, 2009 at 4:30 PM, Chris Anderson <jch...@apache.org> wrote:
> On Thu, May 21, 2009 at 8:30 PM, Yuval Kogman <nothingm...@woobling.org> wrote:
> > 2009/5/21 Adam Kocoloski <kocol...@apache.org>:
> >> Hi Yuval, thanks for this well-written proposal. I don't really want
> >> to rehash all the discussion from back in February (see the thread
> >> beginning at
> >> http://mail-archives.apache.org/mod_mbox/couchdb-dev/200902.mbox/%3c84f66023-030a-4669-b75c-3dcc92d71...@yahoo.com%3e
> >> for a particularly detailed discussion), but I do want to comment on
> >> one aspect.
> >>
> >> Updating the replicator to be smart about atomic bulk transactions is
> >> doable (although a major undertaking), but when you throw DB
> >> compaction and revision stemming into the mix things get really
> >> hairy. Recall that CouchDB revisions are used for concurrency
> >> control, not for maintaining history. Consider the following sequence
> >> of events:
> >>
> >> 1) Generate foo/1 and bar/1 in an atomic _bulk_docs operation
> >> 2) Update foo -> foo/2
> >> 3) Compact the DB (foo/1 is deleted)
> >> 4) Start replicating to a mirror
> >> 5) Replication crashes before it reaches foo/2
> >
> > By crash you mean an error due to a conflict between foo/2 and foo/1'
> > (the mirror's version of foo), right?
> >
> >> In your proposal, we should expect foo/1 to exist on the mirror,
> >> right? I think this means we'd need to modify the compaction
> >> algorithm to keep revisions of documents if a) the revision was part
> >> of an atomic _bulk_docs, and b) any of the documents in that
> >> transaction are still at the revision generated by the transaction.
> >> Same thing goes for revision stemming -- we can never drop revisions
> >> if they were part of an atomic upload and at least one of the
> >> document revs in the upload is still current.
> >
> > Yep. Personally I see this as a tradeoff, not a limitation per se. If
> > you specify 'atomic' then you must pay more in terms of data size,
> > performance, etc.

The problem as I see it is that someone else's bulk transaction will
have to sit around in my database until I edit all the docs in it.
Hopefully I won't get any distributed conflicts on other old versions
of docs in the group, because that would somehow make edits that I've
done locally to other documents in the bulk group less valid.

Distributed bulk transactions would make for chaotic behavior, as
someone's mostly unrelated change on a remote node could eventually
replicate to me (months later) and knock an entire line of work that
I've done into a conflict state.

If you want atomicity, put it in a single document.

Chris

> > In 0.8 you would have theoretically had to pay by default, but
> > didn't, because replication broke transactions.
> >
> > The basic algorithm is still the same, but the garbage-collected unit
> > changes: instead of garbage collecting document revisions it garbage
> > collects revision sets, with the current case being a set with one
> > member. The rules still apply (if this object is wholly shadowed by
> > non-conflicting changes then it can be disposed of). IIRC the
> > algorithm is a copying garbage collector, so this is pretty easy to
> > do (you walk a DAG instead of a linked list).
> >
> > Under the proposed model you'd choose which operations are
> > transactional and will have to pay for those.
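To make the scenario concrete, here is a sketch of what the atomic bulk
update under discussion might look like on the wire, in Python with the
`requests` library. The _bulk_docs endpoint is real; the "atomic" flag
is the hypothetical per-request option from the proposal, not a shipped
CouchDB feature, and the database name is invented:

    import requests

    payload = {
        "atomic": True,  # hypothetical flag: commit and replicate as a unit
        "docs": [
            {"_id": "foo", "value": 1},
            {"_id": "bar", "value": 1},
        ],
    }
    resp = requests.post("http://localhost:5984/mydb/_bulk_docs",
                         json=payload)
    resp.raise_for_status()
    # Under the proposal, foo/1 and bar/1 now form a revision set that
    # compaction and revision stemming would have to preserve as a whole
    # until neither rev is current, and replication would ship as a unit.
    print(resp.json())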
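And a toy sketch (nothing like CouchDB's actual compactor) of the
retention rule this implies: a revision set written by an atomic
_bulk_docs can only be collected once every document in the set has
moved past the revision the transaction wrote:

    def can_collect(rev_set, current_revs):
        """rev_set: {doc_id: rev written by the transaction};
        current_revs: {doc_id: currently winning rev in the db}."""
        # "Wholly shadowed by non-conflicting changes": every member of
        # the set has been superseded, so no doc still points into it.
        return all(current_revs.get(doc_id) != rev
                   for doc_id, rev in rev_set.items())

    tx = {"foo": "1-abc", "bar": "1-def"}
    # foo moved on to rev 2, but bar still sits at the transaction's rev,
    # so the whole set (including foo/1) must be retained:
    print(can_collect(tx, {"foo": "2-ghi", "bar": "1-def"}))  # False
    # Once bar has been updated too, the set can be compacted away:
    print(can_collect(tx, {"foo": "2-ghi", "bar": "2-jkl"}))  # True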
> > Anyway, thanks for your link as well. I was reading through a rather
> > boring thread and didn't see this one, so I guess I did miss out. It
> > seemed to imply the discussion was done only on IRC.
> >
> > Anyway, here goes...
> >
> > The fundamental problem is that any consistent data model needs, at
> > the very least, atomic primitives and ordered message passing (with
> > transactional message handlers) at the per-partition level, or
> > atomicity and consistency are restricted to a single document.
> >
> > What concerns me is Damien's post
> > (http://mail-archives.apache.org/mod_mbox/couchdb-dev/200902.mbox/%3c451872b8-152c-42a6-9324-dd52534d9...@apache.org%3e):
> >
> >> No, CouchDB replication doesn't support replicating the transactions.
> >> Never has, never will. That's more like the transaction log
> >> replication that's in traditional dbs, a different beast.
> >>
> >> For the new bulk transaction model, I'm only proposing supporting
> >> eventual consistency. All changes are safe to disk, but the db may
> >> not be in a consistent state right away.
> >
> > From what I know this assumption is wrong. Eventual consistency still
> > needs atomic primitives; it's not about whether or not you have
> > transactions, it's about what data they affect (eventual consistency
> > involves breaking them down).
> >
> > Anyway, "never will" sounds pretty binding, but for the sake of
> > argument:
> >
> > By using only insertions and idempotent updates for the bulk of the
> > data changes, and a message queue whose handlers use atomic updates
> > to integrate this data, one can implement a truly atomic distributed
> > model, or an eventually consistent one; without this, updates need to
> > be restricted to exactly one document.
> >
> > Eventual consistency is still possible using either locks or by
> > breaking down what would have been large distributed transactions
> > into smaller ones, but the key is that the code that will make things
> > actually consistent must still have ACID guarantees (and be
> > dispatched in order).
> >
> > The CouchDB 0.9 model is effectively MyISAM without data loss, but
> > just because the data is around doesn't mean it's possible to know
> > what to do with it (loss of context), or even to fix it safely (the
> > conflict resolution code is susceptible to conflicts too).
> >
> > Unfortunately, for eventual consistency to actually work, the
> > breaking down of operations must be done at the application level;
> > the database can't decide which data can be deferred and which data
> > cannot.
> >
> > All immutable data and all new data can obviously be added to the
> > database outside of a transaction, but eventually a transaction
> > linking this data must be part of an atomic mutation.
> >
> > The only way to support this without atomic operations on a unit
> > larger than a document is to have a "master" document for every
> > transitive closure of the graph structure requiring consistency,
> > which in effect only really works for immutable snapshot documents
> > (e.g. where the ID is a hash of the data). If these closures overlap
> > then a single "master" for the whole graph will be needed.
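A sketch of this "master document" pattern, assuming content-addressed
snapshots; all names here are invented for illustration. The snapshots
are immutable and never conflict, so the single master document is the
only thing that ever needs concurrency control:

    import hashlib
    import json

    def snapshot_id(body):
        # Content-addressed _id: identical content maps to the same _id,
        # so snapshot writes are idempotent and can never conflict.
        blob = json.dumps(body, sort_keys=True).encode()
        return "snap:" + hashlib.sha1(blob).hexdigest()

    profile = {"name": "alice", "friends": ["bob"]}
    snapshot = dict(profile, _id=snapshot_id(profile))
    # Step 1: write the immutable snapshot -- safe outside any transaction.
    # Step 2: atomically repoint the one mutable document at it; this
    # single-document update (guarded by _rev) is the only step needing
    # concurrency control, so it covers the whole closure.
    master = {"_id": "master:alice", "current": snapshot["_id"]}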
> >
> > To illustrate, let's make up a social networking example. Let's say
> > you are adding a friend on this social network, and that this
> > operation involves 3 updates: one to add a link from your profile to
> > your friend's ID, another for the inverse, and a third to send a
> > "hello" message to the friend, updating their inbox. The first update
> > lives in one partition, and the second and third updates are on a
> > second one.
> >
> > The back pointers in your new friend's document must be updated. In a
> > fully transactional model this would lock the friend's document and
> > yours at the same time; in an eventual consistency model this would
> > queue a message for the friend's partition, and a message handler on
> > the friend's partition would update this atomically, "eventually".
> > It's fine for the link to be out of date for a while, but eventually
> > it needs to be fixed (e.g. if you want to remove the friend, message
> > them, etc.).
> >
> > In couchdb 0.9 one of the writes will get a "conflict" error back,
> > and they could refetch the updated version and try the edit again.
> > The problem is that if they wrote the third update to another
> > document on the same node, making assumptions about the same data,
> > that write may have succeeded, leaving the data inconsistent. Under
> > an eventual consistency model you still use transactions to do these
> > updates; you just must design your model to break them down into
> > smaller units.
> >
> > The reason a graph structure is more susceptible to inconsistency is
> > that while in a relational model many data linkage operations can be
> > done with a single insert/update (e.g. `insert into edges (node1_id,
> > node2_id)`), in a document-based database this type of operation
> > involves modifying all the affected documents. The chance of
> > inconsistency is increased because contention is higher and there is
> > more data that must be synchronized.
> >
> > However, in another post Damien said:
> >
> >> Which is why in general you want to avoid inter-document
> >> dependencies, or be relaxed in how you deal with them.
> >
> > So I think I'd best shut up after this unless some decision maker
> > tells me not to; if my use case is not covered by the intended design
> > then that's that. But I do think this thread sort of covers this:
> >
> >> As far as distributed transactions go, I'd be thrilled if we could
> >> implement it and also support the rest of couchdb, like views and
> >> bi-directional replication. Please start up a discussion here in
> >> dev@ about it and see if you can work out a design.
> >
> > Without going too pie-in-the-sky.
> >
> > Cheers,
> > Yuval
>
>
> --
> Chris Anderson
> http://jchrisa.net
> http://couch.io
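To make the conflict scenario from the friend example above concrete, a
minimal sketch in Python with the `requests` library of the
refetch-and-retry loop that single-document updates imply. The database
name, document IDs, and the update_doc helper are all invented; the 409
status is CouchDB's actual edit-conflict response:

    import requests

    DB = "http://localhost:5984/social"  # invented local dev database

    def update_doc(doc_id, mutate):
        """Optimistic single-document update: refetch and retry on
        conflict."""
        while True:
            doc = requests.get(f"{DB}/{doc_id}").json()
            mutate(doc)
            resp = requests.put(f"{DB}/{doc_id}", json=doc)
            if resp.status_code != 409:  # 409 = edit conflict
                resp.raise_for_status()
                return resp.json()
            # Conflict: someone else won the race. Retrying repairs
            # *this* document, but it cannot undo a companion write to
            # another document that already succeeded -- exactly the
            # inconsistency described above.

    update_doc("profile:alice",
               lambda d: d.setdefault("friends", []).append("bob"))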
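And a sketch of the eventual-consistency shape of the same friend
example, reusing the hypothetical update_doc helper above: every step
touches exactly one document, so each is atomic on its own, and the
cross-partition work travels as an insert-only message document for a
handler on the friend's partition to apply later, in order:

    def add_friend(me, friend_id):
        # Local and atomic: add the forward link in my own profile.
        update_doc(f"profile:{me}",
                   lambda d: d.setdefault("friends", []).append(friend_id))
        # The remote work is enqueued, not performed: message documents
        # are insert-only, so they can never conflict.
        requests.post(DB, json={
            "type": "message",
            "to": friend_id,
            "from": me,
            "action": "add_backlink_and_hello",
        })

    # A worker on the friend's partition drains message documents in
    # order and applies each one with the same single-document retry
    # loop: add the back pointer, then write the "hello" to the inbox.
    # The link may be stale for a while, but it is eventually fixed by
    # code that itself runs on atomic primitives.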