Dear all,

Following a recent conversation with Paul, I'm sending you the log of a CouchDB instance that starts using 100% of the CPU and stops responding.
Scenario: We have a single CouchDB master aggregating data from two CouchDB 'slave' DBs. We use a single DB per day, named e.g. quarantine_day_month_year, and the average DB holds ~1500 docs. So far, DB size has stayed below 50M.

As I said before, the instance that stopped working is the one receiving replication requests from the slaves. It had been working just fine until yesterday, which happens to be the day we added the second slave. Strangely enough, we added it in the morning and everything worked fine for the rest of the day. I came in this morning to find CouchDB using 100% of the CPU (though Futon was still usable) and the whole server thrashing badly, which made our system unusable, as it depends on a MySQL instance hosted on the same server.

I can't really begin to understand what's going on from the logs, sorry. I did notice that something fishy is going on with the DB named quarantine_17_12_2008 (yesterday's DB): mochiweb is timing out, apparently because that DB can't be opened. That coincides with the new slave we added yesterday. Could it be a concurrent replication issue? I don't know, hence my cry for help.

Needless to say, restarting CouchDB solved the issue.

Best regards,
Ulises
