Re: Incremental replication over unreliable link -- how granular is replication restart

Damien Katz Thu, 14 May 2009 14:53:27 -0700


On May 14, 2009, at 10:36 AM, Matt Goodall wrote:

2009/5/14 Adam Kocoloski <[email protected]>:
Hi Ben, welcome! At the moment, CouchDB does not have any capacity for intra-document replication checkpointing. And you're right, in the specific
situation you describe Couch would have a difficult time making any
replication progress.
Given that replication over slow, unreliable links is absolutely a CouchDB design goal, I think we might eventually conjure up some more magic to make some sort of intra-document (or at least intra-attachment) checkpointing
possible.  I think it will be post-1.0, though.  Best,

Adam

On May 14, 2009, at 7:12 AM, Ben Cohen wrote:
Hi all --
This is my first message to the list. I've been watching it for a little while now and so far everything I read about the design of couchdb I like a
lot!  Thanks so much for all the cool work!
One of the uses I'm planning for couchdb involves replicating a database across a slow, unreliable link which will never become anything other than slow and unreliable. I understand the replication is incremental and
designed to 'pick up where it left off' in the case of replication
interruption.  From the technical overview on the website:
The replication process is incremental. At the database level,
replication only examines documents updated since the last replication. Then for each updated document, only fields and blobs that have changed are replicated across the network. If replication fails at any step, due to network problems or crash for example, the next replication restarts at the
same document where it left off.
Is this actually accurate? It suggests that documents are replicated
one-by-one and that replication can be interrupted at any point and
will continue from wherever it got to before the interruption.

Firstly, I believe the whole replication has to complete before any
updates are visible in the target database.


No, each update is seen on the target as it's written by the replicator.

If I restart the server in
charge of replication and then restart the replication it always seems
to start from the beginning. i.e. the Futon's "Processed source update
#xxx" status starts from 0 (when replicating an empty database).

It can start scanning from the beginning, but it will not copy again documents it has already replicated.

The checkpointing work prevents it from scanning back from 0, but there are failure scenarios where it might start from 0 anyway. Adam has some ideas for a simple fix that can make this far less likely to happen.
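
To make the distinction between scanning and copying concrete, here is a toy in-memory model. It is only a sketch and not CouchDB's replicator code: the function name `replicate`, the `(seq, doc_id, rev)` tuples, the `checkpoint` dict, and the `drop_link_at_seq` switch are all assumptions made up for illustration, and the model checkpoints after every update, which a real replicator need not do.

```python
def replicate(source_updates, target, checkpoint, drop_link_at_seq=None):
    """Toy model: copy updates newer than the checkpoint into target.

    source_updates: list of (seq, doc_id, rev) tuples, ordered by seq.
    target: dict mapping doc_id -> rev, mutated in place.
    checkpoint: dict holding the last processed 'seq', mutated in place.
    drop_link_at_seq: hypothetical switch that simulates the link failing
        when that sequence number is reached.
    Returns (scanned, copied).
    """
    scanned = copied = 0
    for seq, doc_id, rev in source_updates:
        if seq <= checkpoint["seq"]:
            continue  # already processed according to the checkpoint
        if seq == drop_link_at_seq:
            raise ConnectionError("link dropped at seq %d" % seq)
        scanned += 1
        if target.get(doc_id) != rev:
            target[doc_id] = rev  # visible on the target as soon as written
            copied += 1
        checkpoint["seq"] = seq  # assumption: checkpoint after every update
    return scanned, copied
```

In this model, restarting with the surviving checkpoint scans only the remaining updates. Restarting with a lost checkpoint (`{"seq": 0}`) scans everything again but copies nothing the target already holds.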


Secondly, if the network connection fails in the middle of replication
(closing an ssh tunnel is a good way to test this ;-)) then it seems
to retry a few (10) times before the replicator process terminates. If
the network connection becomes available again (restart the ssh
tunnel) the replicator doesn't seem to notice. Also, I just noticed
that Futon still lists the replication on its status page.

There is a lot of work we can do here; right now replication is strictly a batch operation. Eventually we will have permanent replications, where replicators are always working in near realtime and indefinitely retrying when network connections are failing.
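
Until something like that exists, the retrying can be done from outside by re-running the batch replication whenever it fails. The following is a minimal sketch of that idea, assuming a caller-supplied `run_once` callable that raises `ConnectionError` when the link is down. The name `retry_until_success` and the backoff parameters are hypothetical and not part of CouchDB.

```python
import time


def retry_until_success(run_once, sleep=time.sleep,
                        base_delay=1.0, max_delay=60.0):
    """Call run_once() until it succeeds; return the number of attempts.

    run_once: hypothetical callable that triggers one batch replication
        and raises ConnectionError if the link fails.
    sleep: injectable so the wait can be replaced in tests.
    """
    delay = base_delay
    attempts = 0
    while True:
        attempts += 1
        try:
            run_once()
            return attempts
        except ConnectionError:
            sleep(delay)                       # wait before the next try
            delay = min(delay * 2, max_delay)  # capped exponential backoff
```

The cap on the delay keeps the loop from waiting ever longer on a link that is only up for short windows, which matters if the retry has to land inside one of those windows.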


-Damien

If I'm correct, and I really hope I'm missing something, then
couchdb's replication is probably not currently suitable for
replicating anything but very small database differences over an
unstable connection. Does anyone have any real experience in this sort
of scenario?

- Matt
I've got a question about this process. Say you have a document to be replicated with a 1 megabyte attachment. A replication process starts, half the doc is transferred successfully and then the connection dies. Assuming
no changes to the source doc, when the replication restarts will the
transfer start from the beginning of the document or will it pick up
somewhere within the doc?
For my use case I have a slow link that will periodically come online for
a certain fixed amount of time and initiate a replication.  If the
replication isn't incremental 'within' a single document, then a document in the database above a certain size will, for me, never make it across and I would imagine cause the replication to never make forward progress ...
Does couchdb's replication magic avoid the issue for me and eventually
transfer the document across my link?

Thanks much,
Ben Cohen
