Eli Stevens created COUCHDB-1817:
------------------------------------
Summary: OS Process Error <0.21247.103> :: {os_process_error,
{exit_status,0}}
Key: COUCHDB-1817
URL: https://issues.apache.org/jira/browse/COUCHDB-1817
Project: CouchDB
Issue Type: Bug
Components: JavaScript View Server
Reporter: Eli Stevens
We have started seeing errors crop up in our application that we have not seen
before, and we're at a loss for how to start debugging them.
[~dch] said that we might look into system resource limits, so we started
collecting all of the output from _stats into RRD (along with memory, load,
etc. that we were already collecting), but nothing is jumping out at us as
obviously problematic.
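
For reference, a minimal sketch of how the _stats output can be flattened into
per-metric numeric values suitable for an RRD-style store. This is an
illustration only, not the reporter's actual collector: the address
http://127.0.0.1:5984 and the function names flatten_stats and fetch_stats are
assumptions, and the code makes no assumption about which metrics the server
reports.

```python
import json
from urllib.request import urlopen


def flatten_stats(stats, prefix=""):
    """Flatten nested stats JSON into {"dotted.key": number} pairs.

    Non-numeric leaves (descriptions, nulls) are skipped, since only
    numeric values can be stored as time-series samples.
    """
    flat = {}
    for key, value in stats.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_stats(value, name))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            flat[name] = value
    return flat


def fetch_stats(base_url="http://127.0.0.1:5984"):
    """Fetch and decode the _stats document (base_url is an assumed address)."""
    with urlopen(base_url + "/_stats") as resp:
        return json.load(resp)
```

Calling flatten_stats(fetch_stats()) at a fixed interval and logging the result
next to memory and load samples makes it easier to line up any metric that
changes shortly before the os_process_error appears.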
We can semi-reliably reproduce the problem, but it's far from a minimal test
case (basically, we load up several large chunks of data, and then halfway
through the processing run, we get the error). The error doesn't seem to
happen if we load up each chunk by itself.
The DB in question has about 100 docs in it, none particularly large (nothing
over a couple KB would be my guess), with a couple hundred MB in attachments.
About 10 design docs, written in CoffeeScript. In general, there isn't anything that seems
obviously resource intensive.
We have seen this issue on 1.2.0, 1.2.1, and we're working on getting a machine
with 1.3.0 set up (the PPA we'd been using hasn't been updated yet). Ubuntu
12.04, spinning disk, etc. The system is under load when it happens, but the
load isn't more than 1.5x the number of cores. I don't have disk IO numbers at
hand, but I'd be surprised if the disk were being strained.
Error as it appears in couch.log:
https://gist.github.com/wickedgrey/e7fd3fc14b6d43e95564
The design doc in question:
https://gist.github.com/wickedgrey/db41b0c3c75a590e2109
An example document: https://gist.github.com/wickedgrey/a8422aab261ddd2ce4fe
We have some preliminary evidence that the problem persists after the system
goes quiet, but we're not certain.
Either CouchDB isn't handling things correctly, in which case this bug is
"please fix", or we're doing something wrong (hitting a resource limit, for
example), in which case this bug is "please make the error message more
informative".
Thanks!