Re: Entire CouchDB cluster crashes simultaneously

Robert Newson Fri, 05 Mar 2010 10:14:18 -0800

FWIW: I use a cron job to establish continuous replications precisely
because they are not persistent. POSTing to _replicate with the same
source and target is idempotent, so a cron job that mindlessly
resubmits all your replication tasks is harmless.
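
A minimal sketch of what such a resubmission script might look like,
assuming Python 3 and only the standard library. The server URL and
database names are hypothetical placeholders, and the function names are
mine rather than anything CouchDB provides:

```python
import json
import urllib.request


def replication_request(couch_url, source, target):
    """Build a POST to /_replicate asking for a continuous replication."""
    body = json.dumps({"source": source, "target": target, "continuous": True})
    return urllib.request.Request(
        couch_url.rstrip("/") + "/_replicate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def resubmit(couch_url, tasks):
    """POST every (source, target) pair and collect the JSON replies.

    Repeating this for a pair that is already replicating is harmless,
    which is what makes it suitable for a cron job.
    """
    replies = []
    for source, target in tasks:
        req = replication_request(couch_url, source, target)
        with urllib.request.urlopen(req, timeout=30) as resp:
            replies.append(json.load(resp))
    return replies


if __name__ == "__main__":
    # Hypothetical task list: pull "orders" from a peer into the local copy.
    resubmit("http://localhost:5984",
             [("http://peer.example.com:5984/orders", "orders")])
```

Run from cron at whatever interval suits you; after a restart the
replications come back on the next run.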


I go further: since I use pairs of servers, each one reads _all_dbs from
the other side and kicks off a continuous pull replication task for every
database listed, and this runs every 5 minutes.
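
Roughly, the paired-server variant could be sketched like this (Python 3,
standard library only). The URLs are hypothetical, skipping names that
start with an underscore is my assumption for illustration, and the
sketch assumes each target database already exists locally:

```python
import json
import urllib.request
from urllib.parse import quote


def pull_tasks(peer_url, db_names):
    """Pair each database on the peer with the same-named local database."""
    peer = peer_url.rstrip("/")
    return [
        {
            # Remote source, local target: this is a pull replication.
            "source": peer + "/" + quote(name, safe=""),
            "target": name,
            "continuous": True,
        }
        for name in db_names
        if not name.startswith("_")  # assumption: leave system databases alone
    ]


def sync_from_peer(local_url, peer_url):
    """Read _all_dbs from the peer and start a pull replication for each name."""
    with urllib.request.urlopen(peer_url.rstrip("/") + "/_all_dbs", timeout=30) as resp:
        names = json.load(resp)
    for task in pull_tasks(peer_url, names):
        req = urllib.request.Request(
            local_url.rstrip("/") + "/_replicate",
            data=json.dumps(task).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        urllib.request.urlopen(req, timeout=30).close()


if __name__ == "__main__":
    # Hypothetical pair; run the mirror-image command on the other server.
    sync_from_peer("http://localhost:5984", "http://peer.example.com:5984")
```

A crontab schedule of `*/5 * * * *` gives the five-minute interval; since
each server pulls from the other, both sides run the same script with the
URLs swapped.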

B.

On Fri, Mar 5, 2010 at 12:29 PM, Peter Bengtson <pe...@peterbengtson.com> wrote:
> After conferring with our sysadmins, I found out that there indeed was a 
> backup task running nightly at approximately the time of the crashes. They 
> have turned it off now. I'll let you know after the weekend how this affects 
> the replication setup. Keeping my fingers crossed until then. Thanks!
>
>        / Peter
>
>
> On 5 Mar 2010, at 18:24, Adam Kocoloski wrote:
>
>> That would be my guess, too.
>>
>> On Mar 5, 2010, at 12:22 PM, Randall Leeds wrote:
>>
>>> Could there be a cron job that's causing a lot of disk contention at the
>>> same time every night?
>>>
>>> On Mar 5, 2010 7:24 AM, "Peter Bengtson" <pe...@peterbengtson.com> wrote:
>>>
>>> Adam, that's interesting. These crashes occur every night with alarming
>>> regularity, but the staging system on which this runs is under no load to
>>> speak of. And there are only two DBs in the system at this point, both of
>>> which were opened at least 12 hours earlier. I'll ask our sysadmins to
>>> double-check the load, but I'd like to know one thing:
>>>
>>> Why do these crashes occur system-wide? On three nodes and six servers? And
>>> at the same time? Somehow, we didn't quite expect that CouchDB should go
>>> quite so far as to replicate the crashes... ;-)
>>>
>>>      / Peter
>>>
>>>
>>> On 5 Mar 2010, at 15:57, Adam Kocoloski wrote:
>>>
>>>
>>>> From that log we can tell that CouchDB crashed completely on node0-couch2
>>> (because of the "Apache...
>>
>
>
