FWIW: I use a cron job to establish continuous replications precisely because they are not persistent. POSTing to _replicate with the same source and target is idempotent, so a cron job that mindlessly resubmits all your replication tasks is harmless.
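A minimal sketch of such a resubmit script, assuming Python's standard library and a CouchDB node reachable over plain HTTP. The host names, the database name, and the helper names `replication_request` and `resubmit_all` are made up for illustration; only the `_replicate` endpoint and its `source`, `target` and `continuous` fields come from CouchDB itself.

```python
import json
import urllib.request

# Hypothetical values for illustration; substitute your own server and databases.
LOCAL_COUCH = "http://127.0.0.1:5984"
TASKS = [
    # (source, target) pairs
    ("http://other-host:5984/mydb", "mydb"),
]


def replication_request(couch_url, source, target):
    """Build a POST to /_replicate asking for a continuous replication."""
    body = json.dumps({"source": source, "target": target, "continuous": True})
    return urllib.request.Request(
        couch_url.rstrip("/") + "/_replicate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def resubmit_all(couch_url, tasks, send=urllib.request.urlopen):
    """Submit every (source, target) pair; return how many were submitted.

    `send` is injectable so the function can be exercised without a server.
    """
    count = 0
    for source, target in tasks:
        response = send(replication_request(couch_url, source, target))
        response.close()
        count += 1
    return count
```

To use it, add a call to `resubmit_all(LOCAL_COUCH, TASKS)` at the bottom of the script and schedule the script from cron. There is no attempt here to check whether a task is already running; the sketch relies on resubmission being harmless, as described above.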
I go further: since I use pairs of servers, each server reads _all_dbs from its partner and kicks off a continuous pull replication task for each database. This runs every 5 minutes.

B.

On Fri, Mar 5, 2010 at 12:29 PM, Peter Bengtson <pe...@peterbengtson.com> wrote:
> After conferring with our sysadmins, I found out that there indeed was a
> backup task running nightly at approximately the time of the crashes. They
> have turned it off now. I'll let you know after the weekend how this affects
> the replication setup. Keeping my fingers crossed until then. Thanks!
>
> / Peter
>
>
> On 5 Mar 2010, at 18:24, Adam Kocoloski wrote:
>
>> That would be my guess, too.
>>
>> On Mar 5, 2010, at 12:22 PM, Randall Leeds wrote:
>>
>>> Could there be a cron job that's causing a lot of disk contention at the
>>> same time every night?
>>>
>>> On Mar 5, 2010 7:24 AM, "Peter Bengtson" <pe...@peterbengtson.com> wrote:
>>>
>>> Adam, that's interesting. These crashes occur every night with alarming
>>> regularity, but the staging system on which this runs is under no load to
>>> speak about. And there are only two DBs in the system at this point, both of
>>> which were opened at least 12 hours earlier. I'll ask our sysadmins to
>>> double-check the load, but I'd like to know one thing:
>>>
>>> Why do these crashes occur system-wide? On three nodes and six servers? And
>>> at the same time? Somehow, we didn't quite expect that CouchDB should go
>>> quite so far as to replicate the crashes... ;-)
>>>
>>> / Peter
>>>
>>>
>>> On 5 Mar 2010, at 15:57, Adam Kocoloski wrote:
>>>
>>>
>>>> From that log we can tell that CouchDB crashed completely on node0-couch2
>>> (because of the "Apache...
>>
>
>