On Thu, Dec 22, 2011 at 20:46, Randall Leeds <[email protected]> wrote:
> On Thu, Dec 22, 2011 at 20:18, Alexander Uvarov
> <[email protected]> wrote:
>>
>> On Dec 23, 2011, at 1:49 AM, Paul Davis wrote:
>>
>>> On Thu, Dec 22, 2011 at 11:31 AM, Robert Newson <[email protected]> wrote:
>>>> In my opinion, and I believe the majority opinion of the group, the
>>>> CouchDB API should be the same everywhere. This specifically includes
>>>> not doing things on a single box that will not work in a
>>>> clustered/sharded situation. It's why our transactions are scoped to a
>>>> single document, for example.
>>>>
>>>> I will also note that all_or_nothing does not provide multi-document
>>>> ACID transactions. The batches used in bulk_docs are not recorded, so
>>>> those items will be replicated individually (and in parallel, so not
>>>> even in a predictable order), which would break the C and I
>>>> characteristics on the receiving server. The old semantic would abort
>>>> the whole update if any one of the documents couldn't be updated but
>>>> the new semantic simply introduces a conflict in that case.
>>>>
>>>
>>> Slight nitpick, but the new behavior just returns the error that the
>>> update would *cause* the conflict. (Assuming default non-replicator
>>> _bulk_docs calls.)
>>>
>>
>> Am I missing something? The current bulk_docs implementation will introduce
>> a conflict in case of a conflict, not just reject the update and return an
>> error.
>>
>>>> B.
>>>>
>>>> On 22 December 2011 16:48, Alexander Uvarov <[email protected]> wrote:
>>>>> And can become much easier with multi-document transactions as an option.
>>>>>
>>>>> On Thu, Dec 22, 2011 at 10:43 PM, Pepijn de Vos <[email protected]> wrote:
>>>>>> But not everyone needs a cluster. I like CouchDB because it's easy, not 
>>>>>> because "it scales", and in some situations, all_or_nothing is easy.
>>>>>>
>>>
>>> Robert mentions it in passing, but the biggest reason that we dropped
>>> the original _bulk_docs behavior doesn't have anything to do with
>>> clustering. It was because the semantics are violated as soon as you
>>> try and replicate. Since there's no tracking of the group of docs
>>> posted to _bulk_docs then as soon as your mobile client tried to move
>>> data in or out you'd lose all three of ACI in ACID.
>>
>> Won't every system with a multi-master architecture cause problems as
>> soon as you try to replicate? Should this force people to design for
>> replication even if they don't need it? In my first message I mentioned
>> that not every application needs to be replicated. There are thousands of
>> such apps in the world. Even if it's possible to design an app for
>> replication, it can be very hard to do, and the developer and probably
>> future users will spend a lot of time on something superfluous.
>
> It's possible, but expensive, to have a multi-master architecture and
> transaction isolation, but it involves distributed commit protocols.
>
> The wiki documentation is maybe slightly misleading in that the
> guarantees provided by the current Apache CouchDB around
> all_or_nothing have nothing to do with database crashes. All
> _bulk_docs requests are written as a single group commit with a single
> database header write, so either all valid, non-conflicting writes are
> durably stored or none are. all_or_nothing lets validation functions
> reject the whole bulk rather than just the failing write, and then
> during the commit phase create conflicts rather than returning an
> error.
>
> Here's the key: if your documents are known to be valid (or you don't
> have a validate_doc_update function in your database), then the
> difference is only whether conflicts are created or rejected,
> not whether all writes hit disk durably or not, as the wiki might seem
> to suggest.
>
> The replicator uses a query parameter flag to create conflicts
> rather than rejecting them: ?new_edits=false. If you can tolerate
> conflicts please feel free to create your own revision ids (bump the
> leading number, create a random id, and slap them together with a
> dash) and use ?new_edits=false. You'll get the same semantics with
> respect to conflicts as all_or_nothing. You lose little by generating
> your own revision ids since deterministic revisions are an optimization
> for replication. Maybe that lets you move forward with your use case.
>
> More to the point though... I find replication is one of CouchDB's
> killer features and that's why some devs (like me and Paul) would
> rather see all_or_nothing vanish completely. If you need relational
> consistency but not replication you might be better served elsewhere.
> I won't tell you to go away (I love our users, and so I'm offering a
> lesser-known workaround with ?new_edits) but I won't mislead you about
> the goals of the project either.
>
> -Randall

I didn't realize when I wrote this that new_edits is actually
documented [1]. I hope that helps!
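
For illustration, here is a rough sketch of the workaround quoted above:
bump the leading number of the revision, attach a random id with a dash,
and mark the batch so the server stores the revisions as given. The helper
names (next_rev, bulk_body) are made up for this example, and it assumes
the flag travels as a "new_edits" field in the _bulk_docs JSON body; if
your server expects it on the query string instead, adjust accordingly.
Nothing is sent over the network here, it only builds the request body.

```python
import json
import uuid


def next_rev(prev_rev=None):
    """Bump the leading number and attach a random id after a dash."""
    # A document with no previous revision starts at generation 1.
    generation = int(prev_rev.split("-", 1)[0]) + 1 if prev_rev else 1
    return f"{generation}-{uuid.uuid4().hex}"


def bulk_body(docs):
    """Build a _bulk_docs body whose revisions are chosen by the client."""
    # Copy each doc and replace its _rev with a freshly generated one.
    prepared = [dict(doc, _rev=next_rev(doc.get("_rev"))) for doc in docs]
    # Assumption: new_edits is passed in the body alongside the docs.
    return json.dumps({"new_edits": False, "docs": prepared})


if __name__ == "__main__":
    print(bulk_body([{"_id": "a", "_rev": "1-abc", "n": 1}, {"_id": "b"}]))
```

The resulting string would be POSTed to the database's _bulk_docs
endpoint. Because the ids are random rather than deterministic, two
clients making the same edit end up with different revisions, which is
the replication optimization you give up.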

Cheers,
Randall

[1] https://wiki.apache.org/couchdb/HTTP_Bulk_Document_API#Posting_Existing_Revisions
