On Mar 26, 2013, at 7:00 PM, Ethan <[email protected]> wrote:

> Hi! I'd like to build an "offline replication" system for CouchDB.
> Basically, I'd like to build a system that can synchronize over an
> intermediary that isn't trusted (like a typical web host). Ideally, I'd
> grab whatever CouchDB sends over the wire to do replication, compress,
> encrypt, and sign it, and then upload it (say via SSH).

The replication protocol is interactive, so you can’t do things the same way. 
For instance, a push replication first sends a _revs_diff request to tell the 
remote server which new revisions it has that the remote _might_ be interested 
in; then the remote responds by listing the subset of those revisions that it 
doesn’t have yet, and what prior revisions of those documents it does have. 
Then the source PUTs the revisions one by one.
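
To make the exchange concrete, here is a minimal Python sketch of what the
remote does when it answers a _revs_diff request. It is only a simulation over
in-memory dictionaries, not CouchDB's code: the function names and the
`target_revs` structure are made up for illustration, and the
"possible_ancestors" rule is simplified to "any known revision with a lower
generation number".

```python
from typing import Dict, List


def generation(rev: str) -> int:
    """Return the numeric prefix of a revision ID such as '2-xyz'."""
    return int(rev.split("-", 1)[0])


def revs_diff(target_revs: Dict[str, List[str]],
              offered: Dict[str, List[str]]) -> Dict[str, dict]:
    """Simulate the target's reply to a _revs_diff request.

    target_revs: revisions the target already holds, keyed by doc ID
                 (hypothetical structure for this sketch).
    offered:     revisions the source says it has, keyed by doc ID.
    """
    response = {}
    for doc_id, rev_ids in offered.items():
        known = target_revs.get(doc_id, [])
        missing = [r for r in rev_ids if r not in known]
        if not missing:
            continue  # the target already has everything offered for this doc
        entry = {"missing": missing}
        # Simplified: report known revisions older than the newest missing one.
        newest = max(generation(r) for r in missing)
        ancestors = [r for r in known if generation(r) < newest]
        if ancestors:
            entry["possible_ancestors"] = ancestors
        response[doc_id] = entry
    return response
```

The source would then send only the revisions listed under "missing" for each
document.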

You could write something that does this in a less efficient, non-interactive
way. It would remember the sequence number of its last sync to the target,
gather up all the revisions added since that sequence, and bundle them into a
file. That works fairly well if the target server only replicates with the
source server, i.e. it has no way to get these revisions from somewhere
else. With a more complex set of replications, the source can end up sending
far more than necessary, because the target may already have gotten those
same revisions from somewhere else.
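
A rough Python sketch of that bundling step, under stated assumptions: the
changes are already in memory as a list of {"seq", "doc"} dictionaries (in
practice they would come from the _changes feed), sequence numbers are plain
integers, and `build_bundle` / `read_bundle` are hypothetical names. Encryption
and signing are left out; they would wrap the bytes this returns.

```python
import gzip
import json
from typing import List, Tuple


def build_bundle(changes: List[dict], since: int) -> Tuple[bytes, int]:
    """Pack every change with seq > since into a gzip-compressed JSON blob.

    Returns the blob and the new checkpoint to remember for the next sync.
    """
    pending = [c for c in changes if c["seq"] > since]
    # If nothing is new, the checkpoint stays where it was.
    last_seq = max((c["seq"] for c in pending), default=since)
    payload = {
        "since": since,
        "last_seq": last_seq,
        "docs": [c["doc"] for c in pending],
    }
    return gzip.compress(json.dumps(payload).encode("utf-8")), last_seq


def read_bundle(blob: bytes) -> dict:
    """Inverse of build_bundle: decompress and parse the bundle."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))
```

The receiving side would call `read_bundle` and write the documents into its
own database. Because nothing like _revs_diff is asked first, the bundle
contains every revision since the checkpoint whether or not the target
already has it.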

> I'm looking for
> more information on how to build something that does that. So far all I've
> found is http://comments.gmane.org/gmane.comp.db.couchdb.user/164.

I’ve documented the replication protocol here:
        https://github.com/couchbaselabs/TouchDB-iOS/wiki/Replication-Algorithm
The APIs that the replicator calls are all documented in the CouchDB wiki 
(partly because I made sure to add documentation for the ones that weren’t 
documented as I ran into them):
        http://wiki.apache.org/couchdb/Complete_HTTP_API_Reference

> First, although _replicate can take "source" or "target" as URLs, anything
> that isn't an HTTP URL gets "invalid database". So much for replicating
> file:// :)

Well yeah; the destination has to be an HTTP server that handles at least the 
passive side of the replication protocol.

> I thought I
> would try to do the same thing against CouchDB. When I do, the database
> hangs.
> 
> r$ curl -vX HEAD http://127.0.0.1:5984/mydb/

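I don’t think curl likes -X HEAD. It most likely isn’t the database hanging:
-X only changes the method name, so curl still waits for a response body that
a HEAD response never has. You should use -I or --head to send a HEAD request.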

—Jens
