Hi Blair, all good questions. I'll try to answer inline:
On Aug 25, 2009, at 5:10 PM, Blair Zajac wrote:
Hello,
We're looking at using CouchDB's replication to allow us to easily
have multi-master replicating databases across multiple facilities,
(e.g. Los Angeles, Albuquerque, Bristol, England, etc). It looks
like it'll be the perfect tool for the job.
I have some questions about the current implementation and about the
work that I've read is planned for forthcoming releases.
1) What's the most robust automatic replication mechanism? While
continuous replication looks nice, I see there are some tickets open
against it and that it has issues with four nodes. Would a more
robust solution, if a little slower and heavier, be to have an
update_notification script that manually POSTs to _replicate?
We're committed to making continuous replication as robust and
performant as possible. The entire replication codebase went through
a significant refactoring after 0.9, and what you're seeing is us
ironing out a few of the kinks before 0.10 gets out the door. I'd
encourage you to give "continuous":true a shot, provided my answer to
2) isn't a deal-breaker.
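For reference, here is a minimal sketch of how a continuous replication might be requested from Python by POSTing to _replicate. The server URL, database names, and the helper name build_replicate_request are placeholders for illustration, not anything from your setup. The sketch only builds the request; sending it is left as a comment.

```python
import json
import urllib.request


def build_replicate_request(server, source, target, continuous=True):
    """Build (but do not send) a POST to the server's _replicate endpoint.

    `server`, `source` and `target` are illustrative placeholders.
    """
    body = {"source": source, "target": target}
    if continuous:
        # Ask for a continuous replication instead of a one-shot run.
        body["continuous"] = True
    return urllib.request.Request(
        url=server.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_replicate_request(
        "http://localhost:5984",
        "mydb",
        "http://bristol.example.com:5984/mydb",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually start the replication: urllib.request.urlopen(req)
```

Passing continuous=False gives the one-shot request body that an update_notification script would POST on each change.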
2) With the persistent continuous replication feature, is there a
way to stop continuous replication without restarting couchdb?
At the moment, no. We just didn't have time to add that feature to
0.10. It's coming soon, though.
Will there be a way to manage the list of replicant databases when
the persistent continuous replication feature is complete?
Absolutely yes. It will probably be a special DB called _replication
where you can PUT and DELETE documents that configure continuous
replications.
3) How does continuous replication deal with network outages, say if
a link goes down between the Los Angeles and Bristol data centers?
Does CouchDB deal with a hanging TCP connection ok?
CouchDB retries requests using a timeout that doubles with every
failure. It does this for about 5 minutes, then gives up.
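To make the retry behaviour concrete, here is a rough Python sketch of a doubling-timeout schedule. This is not the actual CouchDB code (which is Erlang). The 1 second starting timeout and the exact 300 second cutoff are assumptions chosen for illustration, and backoff_schedule is a made-up name.

```python
def backoff_schedule(initial=1.0, give_up_after=300.0):
    """Yield retry timeouts (seconds) that double after each failure.

    Stops once waiting for the next timeout would push the total time
    waited past `give_up_after`. Both defaults are assumed values.
    """
    timeout = initial
    waited = 0.0
    while waited + timeout <= give_up_after:
        yield timeout
        waited += timeout
        timeout *= 2


if __name__ == "__main__":
    delays = list(backoff_schedule())
    print(delays)
    print("total seconds waited:", sum(delays))
```

With these assumed values the retries run for a little over four minutes in total before the next doubling would overshoot the five minute budget.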
4) It would be nice for CouchDB to have in it a list of replicant
databases that it will automatically push changes to, so this list
could also be maintained in CouchDB, instead of with an external
script. Is there any work on a feature like this? This could be
easily done with an external update_notification script.
Yep, definitely planned for the near future (certainly by 0.11).
5) I wrote the following Bourne shell script and after running it
for an hour, it consumes 100% of a CPU. This is even after stopping
the shell script and compacting both databases. What would explain
this behavior?
I couldn't quite get that script to work ($HOST2 was undefined, and
then something else failed), but can you try it again with a fresh
checkout? I fixed a bug last night that could very well have caused
this.
Best,
Adam