Adam Kocoloski wrote:
Simon Metson reminded me that I wrote down something like this for him a few 
months back.  Here it is.  It describes the replication workflow using inline 
document attachments, rather than the more efficient multipart requests which 
are supported in 0.11.  Hope it helps.  Regards,

A good starting point. Thanks! Probably worth putting somewhere on the wiki for future reference.

And...

In terms of broader architectural overviews, you may find Ricky Ho's set of 
articles useful:

http://horicky.blogspot.com/2008/10/couchdb-implementation.html

Exactly what I was looking for. Thanks again! (And now that I know what to look for, I found the link to it on the couch wiki).

Miles



So, the sequence of calls depends on whether you're pulling updates from this 
remote server or pushing updates to it.  Let's consider the two cases 
separately:

## Pull Replication (remote source, local target)

### HEAD /db
Respond with a 200 status code and you're good.

### GET /db/_local/<rep id>
The replicator checkpoints its progress in these _local documents.  You can 
respond with a 404 if you like, otherwise the response should be JSON that 
looks very much like a replication response, e.g. the one described here:

http://books.couchdb.org/relax/reference/replication#Replication%20in%20Detail

Basically, if the _local doc exists on both the source and target DBs, and the 
documents agree on the value of "source_last_seq", the replicator will start 
from that update sequence on the source.
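
To make the checkpoint comparison concrete, here is a rough Python sketch of the decision. The function name is made up for illustration, and falling back to 0 (a full replication) when a doc is missing or the two sides disagree is my assumption about the simplest safe behaviour, not a description of the real replicator code:

```python
def starting_seq(source_doc, target_doc):
    """Return the update sequence to resume from, or 0 to start over.

    Each argument is the parsed _local/<rep id> document from one side,
    or None if that side answered 404.
    """
    if source_doc is None or target_doc is None:
        return 0  # assumption: no checkpoint on one side means start over
    src_seq = source_doc.get("source_last_seq")
    tgt_seq = target_doc.get("source_last_seq")
    if src_seq is not None and src_seq == tgt_seq:
        return src_seq  # both sides agree, resume from here
    return 0  # assumption: disagreement also means start over
```

So a server that always answers 404 for the _local doc should simply see since=0 on the next request.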

### GET /db/_changes?style=all_docs&heartbeat=10000&since=N[&feed=continuous]

This is the hard part.  The replicator makes this request on a separate 
connection to your server, asking for a list of changes since N (the 
source_last_seq from the previous step).  If the replication is meant to be 
permanent, the feed=continuous parameter will be supplied.  The best reference 
for the response format is definitely the O'Reilly book:

http://books.couchdb.org/relax/reference/change-notifications
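
For testing your server, it may help to build the same request URL the replicator sends. A small Python sketch follows; the helper name and the localhost URL in the usage line are placeholders, and only the query parameters come from the request shown above:

```python
from urllib.parse import urlencode


def changes_url(db_url, since, continuous=False):
    """Build the _changes URL for a given starting sequence."""
    params = [("style", "all_docs"), ("heartbeat", 10000), ("since", since)]
    if continuous:
        # only sent when the replication is meant to be permanent
        params.append(("feed", "continuous"))
    return db_url.rstrip("/") + "/_changes?" + urlencode(params)


# Example (placeholder host and database name):
print(changes_url("http://localhost:5984/db", 42, continuous=True))
```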

### GET /db/docid?revs=true&latest=true&open_revs=["1-23420432",...]

You'll see one of these for each updated document if the update does not 
already exist on the target. I believe the response is a JSON array:

[{"ok":{"_id":"docid","_rev":"1-23420432", ..rest of doc}}, 
{"missing":"some-bad-rev"}]

The "missing" case is very rare and is usually the result of somebody racing 
the replicator.
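
On the consuming side, an array of that shape could be separated into found documents and missing revisions roughly like this. This is a Python sketch with a made-up function name, and it assumes the response really is the JSON array described above:

```python
import json


def split_open_revs(body):
    """Split an open_revs response body into (docs, missing_revs)."""
    docs, missing = [], []
    for entry in json.loads(body):
        if "ok" in entry:
            docs.append(entry["ok"])  # the full document at that revision
        elif "missing" in entry:
            missing.append(entry["missing"])  # just the revision id
    return docs, missing
```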

### GET /db/docid/attachment?rev=1-234923042

Attachments are downloaded separately during pull replication.  The correct 
response is the binary data.

### PUT /db/_local/<rep id>

Periodically the replicator will try to save an updated _local doc with the new replication 
history. The response is {"ok":true, "rev":NewRevId}

That's it for pull replication.

## Push Replication (local source, remote target)

The _local doc calls are still there, but now we have two new POSTs:

POST /db/_missing_revs -d '{"docid1":["1-24323423"], "docid2":["2-23434534"]}'

This is the replicator asking the target if these document revisions are 
already saved there.  The response is a list of the ones that are missing:

{"missing_revs":{"docid2":["2-23434534"]}}
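
If you are implementing the target side, the lookup might be sketched in Python as below. The in-memory `stored` dict (doc id mapped to a set of known revision ids) is purely an assumption for illustration; a real server would consult whatever storage it has:

```python
def missing_revs(stored, requested):
    """Build the _missing_revs response for a request body.

    stored:    dict of doc id -> set of revision ids already saved
    requested: parsed request body, doc id -> list of revision ids
    """
    missing = {}
    for doc_id, revs in requested.items():
        known = stored.get(doc_id, set())
        absent = [rev for rev in revs if rev not in known]
        if absent:
            missing[doc_id] = absent  # only list docs with something missing
    return {"missing_revs": missing}
```

With docid1's revision already stored, the request above yields exactly the response shown.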

POST /db/_bulk_docs -d '{"new_edits":false, "docs":[... array of documents ...]}'

This one is exactly like the regular _bulk_docs call.  The new_edits:false 
parameter tells the target not to throw conflict errors, but instead to save all 
these updates, as conflict revisions if necessary. Currently attachments are inlined, 
although in 0.11 we'll be doing special multipart PUTs for documents with 
attachments instead of using _bulk_docs (so we don't need to Base64 encode 
them). Best,

Adam


--
In theory, there is no difference between theory and practice.
In<fnord>  practice, there is.   .... Yogi Berra

