Re: Bulk Docs

Dean Landolt Thu, 12 Mar 2009 16:42:40 -0700

On Thu, Mar 12, 2009 at 6:38 PM, Antony Blakey <[email protected]> wrote:


> On 13/03/2009, at 1:46 AM, Damien Katz wrote:
>
>> Atomic bulk docs is in the patch, it just doesn't do conflict checking. If
>> any docs are conflicts, they are saved anyway as conflicts. This means it's
>> really for message queue functionality, not database consistency: your data
>> is safe and committed but might not be immediately available or consistent
>> between docs. The reason we are removing all-or-nothing with conflict
>> checking is that it doesn't work with replication (both offline and
>> clustering), as docs are not replicated in a single transaction or even in
>> update order. And getting it to work with partitioning would cause
>> unacceptable write performance. If we leave it, people will rely on the
>> behavior without understanding that it doesn't really work with the rest of
>> CouchDB.
>>
>> So if you are currently using bulk docs to guarantee inter-document
>> consistency, it already doesn't work with replication. It only works on a
>> single machine, so no master-slave and no hot stand-by setup would work as
>> neither is guaranteed to be in a consistent state at any point.
>>
>
> The current bulk docs IS useful in a particular scenario.
>
> It allows me, on a single node, to do transactional updates in response to
> e.g. a web submit/AJAX call, without having to expose the conflict model to
> the user and deal with conflicts in my single-node code.
>
> I then have two distinct phases of operation for peers:
>
> 1. Replication is triggered by the user and they do nothing else until
> replication completes, after which they have to resolve the conflicts
> generated by replication. This code deals with conflicts and a resolution UI
> and nothing else.
>
> 2. Normal operation - concurrent access by multiple applications, multiple
> users. The code never sees a conflict, and hence the user interaction and
> programming model is considerably simpler.
>
> There are a few additional features useful in this model, the principal
> ones being either 1) the ability to roll back a partial replication to deal
> with network failures; or 2) the ability to maintain monotonic source writes
> which ensures that each replication step is consistent. To date neither of
> these features has gained sufficient community support to be considered.
>
> I've presented this model before, and it has been rejected as being
> incompatible with the initial couchdb intentions, but in response to Tim
> Parkin, this is the reason for my fork. There are more details to my effort
> - pure binary bodies rather than JSON, unification of attachments with
> documents, strict metadata/content separation, map/reduce over arbitrary
> data, generalised derivation, an immutable model of fully reified state,
> replication of operations rather than data - but maybe anyone interested can
> contact me offlist - it's no longer CouchDB and I'm sure everyone's sick of
> saying/reading "forget it, it's not going to happen" :)


Will this code still be Apache? Meaning, will some of these features be able
to meander their way back into couch? I can totally understand the need for
a fork (differing goals sometimes cannot be reconciled), but if it's a
friendly fork, so to speak, everybody benefits -- especially if some of
these features get rolled back in, which would also make it easier for you
to keep up with trunk.
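
For anyone skimming the thread, the two bulk-update behaviours being
contrasted above can be sketched with a toy in-memory model. This is only an
illustration under stated assumptions, not CouchDB's API or implementation:
ToyStore, bulk_checked, bulk_save_conflicts and the integer revision numbers
are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Hypothetical single-node store; NOT CouchDB's API or storage format."""

    # doc id -> list of (rev, body) leaves; more than one leaf means a conflict
    docs: dict = field(default_factory=dict)

    def _is_stale(self, doc_id, base_rev):
        """True if the writer's base revision is not a current leaf."""
        leaves = self.docs.get(doc_id, [])
        if not leaves:
            return base_rev != 0  # assumption: new docs are written with base_rev 0
        return all(rev != base_rev for rev, _ in leaves)

    def _apply(self, doc_id, base_rev, body):
        """Replace the leaf the write was based on, or add a sibling leaf."""
        leaves = self.docs.setdefault(doc_id, [])
        leaves[:] = [leaf for leaf in leaves if leaf[0] != base_rev]
        leaves.append((base_rev + 1, body))

    def bulk_checked(self, updates):
        """All-or-nothing with conflict checking: one stale doc rejects the batch."""
        if any(self._is_stale(doc_id, rev) for doc_id, rev, _ in updates):
            return False  # nothing is written
        for doc_id, rev, body in updates:
            self._apply(doc_id, rev, body)
        return True

    def bulk_save_conflicts(self, updates):
        """No conflict checking: every doc is written, stale ones become conflicts."""
        for doc_id, rev, body in updates:
            self._apply(doc_id, rev, body)
        # Report which docs in this batch now have more than one leaf.
        return [doc_id for doc_id, _, _ in updates if len(self.docs[doc_id]) > 1]
```

With a batch such as [("a", 1, ...), ("b", 1, ...)] where "a" has already
moved on to revision 2, bulk_checked returns False and leaves both docs
untouched, while bulk_save_conflicts writes both and leaves "a" with two
leaves to resolve later. The first is the single-node transactional behaviour
Antony relies on; the second is the always-committed, possibly-inconsistent
behaviour Damien describes.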
