Sorry for cross-posting, but I'm wondering if CouchDB devs have any opinion on this issue?
Cheers,
Cliff.

---------- Forwarded message ----------
From: Cliffano Subagio <[email protected]>
Date: Tue, May 1, 2012 at 11:36 AM
Subject: COUCHDB-901 COUCHDB-1429 and stray/zombie couchjs processes
To: [email protected]

Hi,

Searching through the mailing list, I found some threads [1] [2] about stray/zombie couchjs processes, with no obvious resolution yet.

Similar to those threads, I'm also getting stray/zombie couchjs processes when the server (couchdb 1.1.1) is under heavy load: couchjs starts timing out until the total (ps ax | grep couchjs | wc -l) is greater than os_process_limit, after which couchdb consistently logs timeout errors. At that point, the client code can no longer read or write any doc to couchdb until couchdb is restarted or the stray couchjs processes are killed.

Adam mentioned in the first thread [1] that there's a branch [3] which might fix the issue. The associated JIRA issue COUCHDB-901 [4] has since had its target release moved from 1.1 to 1.2 to 1.3.

Is COUCHDB-901 strictly related to BigCouch, or is it also applicable to CouchDB?
Could 'losing track of OS processes' possibly contribute to stray couchjs processes?
Is there any chance of including this fix in 1.3?

There is also COUCHDB-1429 [5], which describes a similar situation with zombie couchjs processes. I checked with Nate, who raised COUCHDB-1429, and he confirmed that it is not related to a reduce function, so we can rule out COUCHDB-1246 [6] as the culprit.

Is any couchdb dev willing to have a look at COUCHDB-1429?

Thanks in advance.

[1] http://mail-archives.apache.org/mod_mbox/couchdb-dev/201104.mbox/%[email protected]%3E
[2] http://mail-archives.apache.org/mod_mbox/couchdb-dev/201203.mbox/%[email protected]%3E
[3] https://github.com/kocolosk/couchdb/tree/COUCHDB-901
[4] https://issues.apache.org/jira/browse/COUCHDB-901
[5] https://issues.apache.org/jira/browse/COUCHDB-1429
[6] https://issues.apache.org/jira/browse/COUCHDB-1246

Cheers,
Cliff.
