On Fri, Oct 23, 2009 at 01:44:50PM -0500, Alex P wrote:
> We're building an infrastructure that we will eventually need to scale out
> significantly, but in the short term needs to be able to handle several
> thousand updates per day, and 10s of thousands of docs. we'd like to use
> continuous active-active rep on .10. are there any special considerations
> for this? most of the documentation that i've read (including the book on
> the apache site) don't go into too much detail on active-active nodes, and
> talk about master-slave configurations.

With active-active, if you have concurrent updates to the same documents,
you're going to end up with replication conflicts. Your application will
need to be written with this in mind. There are several approaches you can
take:

1. Have a periodic sweep process which runs every few minutes, looks in a
   view for conflicts, reads the conflicting documents and resolves them.

   The trouble with this is that changes may appear to "disappear" until
   the next sweep. e.g. user 1 saves document X1, user 2 saves document X2.
   When user 2 reads the database she may see only X1 until the next sweep,
   and believe her changes have been lost.

2. Resolve conflicts on read.

   This is what I believe is the "right" approach, but CouchDB doesn't make
   it easy to implement. Every time you read document X, you check for
   conflicts, then send additional read requests to read the other
   versions, then resolve them, then save the updated version back, *then*
   show the result to the client.

   In CouchDB this is a non-atomic, multi-step operation, so it's
   inherently race-prone, e.g. if some other process is performing the same
   resolution at the same time.

   In addition: if two clients concurrently resolve a set of conflicting
   documents to the exact same result, these will still be written as two
   separate versions and then appear to be a new "conflict" until
   subsequently resolved again.

All the other K-V stores I've looked at take a different approach. When you
ask for a particular document, you receive back *all* the conflicting
versions as equal peers, and some sort of context tag which allows you to
supersede all those versions in one write. This makes resolve-on-read
straightforward.

So IMO this is currently CouchDB's weak spot. OTOH, the awesomeness of
incremental map-reduce is CouchDB's unique selling point. I rely on this,
and just stay away from active-active updates.

Regards,

Brian.

P.S.
If you want to exercise your conflict resolution strategy more easily, then
instead of doing a regular PUT of a document you should do a POST to
_bulk_docs with "all_or_nothing": true in the JSON request body. Then if a
conflict arises, the update will be stored as a conflicting version instead
of failing with a 409. This gives you the same semantics as an
active-active cluster when your app is running on a single node.
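
For illustration, the _bulk_docs trick from the P.S. might look roughly like this in Python. It is only a sketch: the server URL, the database name "mydb", the document contents and the helper name build_bulk_update are all made up for the example, and the request is built but not sent. It assumes the 0.10-era behaviour described above; later CouchDB releases may not honour the all_or_nothing flag.

```python
import json
import urllib.request


def build_bulk_update(base_url, db, docs):
    """Build (but do not send) a POST to /<db>/_bulk_docs.

    Hypothetical helper: with "all_or_nothing" set, a clashing update is
    meant to be stored as a conflicting revision rather than rejected
    with a 409.
    """
    body = json.dumps({"all_or_nothing": True, "docs": docs}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{db}/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example values only: URL, database name and revision id are assumptions.
req = build_bulk_update(
    "http://localhost:5984",
    "mydb",
    [{"_id": "X", "_rev": "1-abc", "title": "edited by user 2"}],
)
# To send it against a real server: urllib.request.urlopen(req)
```

Running two such updates against the same _rev on one node should leave the document with a conflict to resolve, which is what you want for testing.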

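
The resolve-on-read steps in approach 2 can be sketched along the same lines. This is an assumption-laden outline, not a recommended implementation: fetch and merge are hypothetical callables supplied by the application (fetch would wrap GET /db/doc with ?conflicts=true or ?rev=..., merge is whatever resolution policy fits your data), and the in-memory store below stands in for a real database. The function only computes what to write back; it does nothing to close the race window described above.

```python
def resolve_on_read(doc_id, fetch, merge):
    """Return (document to show the client, list of docs to write back).

    fetch(doc_id, rev=None, conflicts=None) -> dict   # hypothetical GET wrapper
    merge(list_of_versions) -> dict                   # application policy
    """
    current = fetch(doc_id, conflicts="true")
    conflict_revs = current.pop("_conflicts", [])
    if not conflict_revs:
        return current, []  # no conflict, nothing to write

    # Extra reads: one per losing revision.
    versions = [current] + [fetch(doc_id, rev=rev) for rev in conflict_revs]

    merged = merge(versions)
    merged["_id"] = doc_id
    merged["_rev"] = current["_rev"]  # written as an update of the winner

    # Deleting the losing revisions stops them being reported as conflicts.
    tombstones = [
        {"_id": doc_id, "_rev": rev, "_deleted": True} for rev in conflict_revs
    ]
    return merged, [merged] + tombstones


# Fake data standing in for a database with one conflicted document.
store = {
    None: {"_id": "X", "_rev": "2-b", "tags": ["red"], "_conflicts": ["2-a"]},
    "2-a": {"_id": "X", "_rev": "2-a", "tags": ["blue"]},
}


def fake_fetch(doc_id, rev=None, conflicts=None):
    return dict(store[rev])


def union_tags(versions):
    return {"tags": sorted({t for v in versions for t in v["tags"]})}


doc, writes = resolve_on_read("X", fake_fetch, union_tags)
```

The writes list would then go to the server in a single POST to _bulk_docs. If another process resolves the same conflict in between the reads and that write, you are back to the race mentioned above.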