On Thu, Mar 26, 2009 at 05:00:22PM +0000, Tim Parkin wrote:
> > In what way is that not atomicity?
>
> Well the difference is I'm more interested the ability to rollback than
> the atomicity.

But won't you need atomicity to guarantee the ability to rollback?

Consider the following sequence where I want to apply changes to A and B,
but B fails.

1. Update A to A'
2. Try to update B to B' but it fails
3. Revert A' to A

Now, what happens if someone else updates A in the middle?

1. Update A to A'
2. Another user updates A' to A"
3. Try to update B to B' but it fails
4. Err, what do I do now?

I can't revert A" to A because that would also undo someone else's changes.

At best, it could be handled like a replication conflict: both A and A"
exist in the database simultaneously. However, the person making the update
in (2) saw it as a successful, non-conflicting update. The next person to
read A will (if they ask) see two different versions. A' will have vanished
if the database has been compacted, making it hard to resolve them back into
one version.

You also need to consider what happens when either the database or your
application crashes in the middle of such a sequence. Unless your
application maintains a separate transaction log, you would require the
partial update to be rolled back by the database itself.

In order to make sense of this, can I just step back a bit and work out
exactly under what circumstances you need to roll back the transaction.

Is your primary concern concurrency failure? That is, you tried to update B
to B', but someone else had changed B to B" in the mean time?

That's fine, but remember that concurrency control only works in the context
of a single node anyway. A PUT will guarantee that my update from B to B' is
not stomped on by someone else on the same node trying to update B to B".
However, as soon as you introduce replication into the mix, all bets are
off; you *will* get multiple conflicting versions in the database anyway.

In essence I agree with you though: operations which are only atomic on a
single node are useful (and that includes PUT concurrency control). Not
everyone has a cluster or plans to go there.
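
To make the race in the two sequences above concrete, here is a toy sketch
in Python. It is not CouchDB code: Store, Conflict, update_both and the
interloper hook are all names I have made up, and the integer revision check
is only a stand-in for the sort of concurrency control a PUT gives you on a
single node. B is made to fail simply by passing a stale revision.

```python
class Conflict(Exception):
    """Raised when a write names a revision that is no longer current."""


class Store:
    """Toy single-node document store with revision-checked writes."""

    def __init__(self):
        self.docs = {}  # doc id -> (revision number, value)

    def get(self, doc_id):
        return self.docs[doc_id]

    def put(self, doc_id, value, rev=0):
        # Accept the write only if the caller saw the current revision.
        current_rev = self.docs.get(doc_id, (0, None))[0]
        if rev != current_rev:
            raise Conflict(f"{doc_id}: expected rev {current_rev}, got {rev}")
        self.docs[doc_id] = (current_rev + 1, value)
        return current_rev + 1


def update_both(store, new_a, new_b, b_rev, interloper=None):
    """Update A then B, reverting A by hand if the write to B fails."""
    old_rev, old_a = store.get("A")
    a_rev = store.put("A", new_a, rev=old_rev)  # update A to A'
    if interloper is not None:
        interloper(store)                       # another user writes in the gap
    try:
        store.put("B", new_b, rev=b_rev)        # try to update B to B'
    except Conflict:
        try:
            store.put("A", old_a, rev=a_rev)    # revert A' to A
            return "rolled back"
        except Conflict:
            return "stuck"                      # A has moved on; cannot revert
    return "committed"


def other_user(store):
    """Hypothetical second client that updates A' to A" mid-sequence."""
    rev, _ = store.get("A")
    store.put("A", 'A"', rev=rev)


def fresh_store():
    store = Store()
    store.put("A", "A")
    store.put("B", "B")
    return store


if __name__ == "__main__":
    quiet = fresh_store()
    print(update_both(quiet, "A'", "B'", b_rev=0), quiet.get("A"))

    raced = fresh_store()
    print(update_both(raced, "A'", "B'", b_rev=0, interloper=other_user),
          raced.get("A"))
```

With nobody else writing, the manual revert goes through. With other_user
in the gap, the revert is refused and A is left holding A", which is the
"what do I do now?" case. Note that even the successful revert is a new
revision of A rather than a true undo, and nothing here survives a crash
between the two writes.
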
I also think the reason given on the wiki for dropping _bulk_docs atomic
operations doesn't make sense. The old behaviour won't work on a sharded
cluster without two-phase commit. But as far as I can see, the replacement
"all_or_nothing" mode won't work on a sharded cluster without two-phase
commit either.

Of course, what's written on the wiki doesn't necessarily represent the
views of the authors, who probably have a better reason for including this
behaviour.

Regards,

Brian.
