On Thu, Dec 22, 2011 at 20:46, Randall Leeds <[email protected]> wrote:
> On Thu, Dec 22, 2011 at 20:18, Alexander Uvarov
> <[email protected]> wrote:
>>
>> On Dec 23, 2011, at 1:49 AM, Paul Davis wrote:
>>
>>> On Thu, Dec 22, 2011 at 11:31 AM, Robert Newson <[email protected]> wrote:
>>>> In my opinion, and I believe the majority opinion of the group, the
>>>> CouchDB API should be the same everywhere. This specifically includes
>>>> not doing things on a single box that will not work in a
>>>> clustered/sharded situation. It's why our transactions are scoped to a
>>>> single document, for example.
>>>>
>>>> I will also note that all_or_nothing does not provide multi-document
>>>> ACID transactions. The batches used in bulk_docs are not recorded, so
>>>> those items will be replicated individually (and in parallel, so not
>>>> even in a predictable order), which would break the C and I
>>>> characteristics on the receiving server. The old semantic would abort
>>>> the whole update if any one of the documents couldn't be updated but
>>>> the new semantic simply introduces a conflict in that case.
>>>>
>>>
>>> Slight nit pick, but new behavior just returns the error that the
>>> update would *cause* the conflict. (Assuming default non-replicator
>>> _bulk_docs calls.)
>>>
>>
>> Am I missing something? Current bulk_docs implementation will introduce a
>> conflict in case of conflict, not just reject and return the error.
>>
>>>> B.
>>>>
>>>> On 22 December 2011 16:48, Alexander Uvarov <[email protected]>
>>>> wrote:
>>>>> And can become much easier with multi-document transactions as an option.
>>>>>
>>>>> On Thu, Dec 22, 2011 at 10:43 PM, Pepijn de Vos <[email protected]>
>>>>> wrote:
>>>>>> But not everyone needs a cluster. I like CouchDB because it's easy, not
>>>>>> because "it scales", and in some situations, all_or_nothing is easy.
>>>>>>
>>>
>>> Robert mentions it in passing, but the biggest reason that we dropped
>>> the original _bulk_docs behavior doesn't have anything to do with
>>> clustering. It was because the semantics are violated as soon as you
>>> try and replicate. Since there's no tracking of the group of docs
>>> posted to _bulk_docs then as soon as your mobile client tried to move
>>> data in or out you'd lose all three of ACI in ACID.
>>
>> Ain't every system with multi-master architecture will cause problems as
>> soon as you try to replicate? Should this force people to design for
>> replication even them don't need it? In my first message I mentioned that
>> not every application need to be replicated. There are a thousands of such
>> apps in the world. Even it's possible to design some app for replication, it
>> can be very hard to do and developer and probably future users will spend a
>> lot of time for superfluous.
>
> It's possible, but expensive, to have multi-master architecture and
> transaction isolation, but it involves distributed commit protocols.
>
> The wiki documentation is maybe slightly misleading in that the
> guarantees provided by the current Apache CouchDB around
> all_or_nothing have nothing to do with database crashes. All
> _bulk_docs requests are written as a single group commit with a single
> database header write, so either all valid, non-conflicting writes are
> durably stored or none are. all_or_nothing lets validation functions
> reject the whole bulk rather than just the failing write, and then
> during the commit phase create conflicts rather than returning an
> error.
>
> Here's the key: if your documents are known to be valid (or you don't
> have a validate_doc_update function in your database), then the
> difference is only whether or not conflicts are created or rejected,
> not whether all writes hit disk durably or not, as the wiki might seem
> to suggest.
>
> The replicator uses a flag on the query parameter to create conflicts
> rather than rejecting them: ?new_edits=false. If you can tolerate
> conflicts please feel free to create your own revision ids (bump the
> leading number, create a random id, and slap them together with a
> dash) and use ?new_edits=false. You'll get the same semantics with
> respect to conflicts as all_or_nothing. You lose little by generating
> your own revision ids since deterministic revisions is an optimization
> for replication. Maybe that lets you move forward with your use case.
>
> More to the point though... I find replication is one of CouchDB's
> killer features and that's why some devs (like me and Paul) would
> rather see all_or_nothing vanish completely. If you need relational
> consistency but not replication you might be better served elsewhere.
> I won't tell you to go away (I love our users, and so I'm offering a
> lesser-known workaround with ?new_edits) but I won't mislead you about
> the goals of the project either.
>
> -Randall

I didn't realize when I wrote this that new_edits is actually documented [1].
I hope that helps!

Cheers,
Randall

[1] https://wiki.apache.org/couchdb/HTTP_Bulk_Document_API#Posting_Existing_Revisions
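
For anyone following along, here is a rough Python sketch of the workaround quoted above: bump the leading number of the current revision, generate a random id, and join the two with a dash. The helper names (next_rev, with_new_rev, bulk_docs_body) are made up for illustration. Two things in it are my assumptions rather than anything stated in this thread, so please check them against [1]: that the flag can be sent as "new_edits": false in the _bulk_docs request body (the mail above describes it as a query parameter), and that a _revisions entry naming the parent is how the new revision gets attached to the existing history.

```python
import json
import uuid


def next_rev(current_rev=None):
    """Bump the leading number and join it to a random id with a dash."""
    # A document with no current revision starts at generation 1.
    generation = int(current_rev.split("-", 1)[0]) + 1 if current_rev else 1
    return "%d-%s" % (generation, uuid.uuid4().hex)


def with_new_rev(doc, current_rev=None):
    """Return a copy of doc carrying a self-generated revision id."""
    new_rev = next_rev(current_rev)
    start, new_id = new_rev.split("-", 1)
    ids = [new_id]
    if current_rev:
        # Assumption: list the parent's id so the server can link the history.
        ids.append(current_rev.split("-", 1)[1])
    out = dict(doc)
    out["_rev"] = new_rev
    out["_revisions"] = {"start": int(start), "ids": ids}
    return out


def bulk_docs_body(pairs):
    """Build a JSON body from (doc, current_rev) pairs.

    Assumption: new_edits is accepted in the body; see [1] for your version.
    """
    docs = [with_new_rev(doc, rev) for doc, rev in pairs]
    return json.dumps({"docs": docs, "new_edits": False})


if __name__ == "__main__":
    # Hypothetical documents; the existing revision id is a placeholder.
    print(bulk_docs_body([
        ({"_id": "a", "n": 1}, "3-917fa2381192822767f010b95b45325b"),
        ({"_id": "b", "n": 2}, None),
    ]))
```

The resulting body would be POSTed to the database's _bulk_docs resource. Since the revision ids are random, two clients updating the same document will produce different ids and the server will keep both as conflicts instead of rejecting one, which is the behaviour described above.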
