I managed to reproduce the error:
[Sat, 18 Aug 2012 00:57:38 GMT] [debug] [<0.170.0>] OAuth Params: []
[Sat, 18 Aug 2012 00:58:37 GMT] [debug] [<0.114.0>] Include Doc:
<<"_design/_replicator">> {1,
<<91,250,44,153,
238,254,43,46,
180,150,45,181,
10,163,207,212>>}
[Sat, 18 Aug 2012 00:58:37 GMT] [info] [<0.32.0>] Apache CouchDB has
started on http://0.0.0.0:5984/
...and I think I have also identified the problem: the JSON body is too large.
Here is how to reproduce the error:
1. Set the CouchDB log level to debug.
2. Create an extra-huge JSON file:
echo -n "{\"docs\":[{\"key\":\"1\"}" > my_json.json && \
for var in $(seq 2 2000000) ; do echo -n ",{\"key\":\"${var}\"}" >> my_json.json ; done && \
echo -n "]}" >> my_json.json
3. Send it with curl (the database "test" must already exist and should
preferably be empty):
curl -X POST http://127.0.0.7:5984/test/_bulk_docs -H 'Content-Type: application/json' -d @my_json.json > /dev/null
  % Total    % Received % Xferd  Average Speed   Time     Time     Time  Current
                                 Dload  Upload   Total    Spent    Left  Speed
100 33.2M    0     0  100 33.2M      0   856k  0:00:39  0:00:39 --:--:--     0
curl: (52) Empty reply from server
Erlang shell report for the same problem:
=INFO REPORT==== 18-Aug-2012::03:12:57 ===
alarm_handler: {set,{system_memory_high_watermark,[]}}
=INFO REPORT==== 18-Aug-2012::03:12:57 ===
alarm_handler: {set,{process_memory_high_watermark,<0.149.0>}}
/usr/local/lib/erlang/lib/os_mon-2.2.9/priv/bin/memsup: Erlang has closed.
Erlang has closed
Tim, try to split your JSON into smaller pieces. Bulk operations tend to use
a lot of memory.
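As a rough sketch of what splitting could look like for the test data above:
generate and post the documents in batches rather than as one 33 MB body. The
helper names (make_batch, post_in_batches), the batch.json file name and the
batch size are assumptions made for illustration, not anything CouchDB
prescribes, and the right batch size depends on the memory available.

```shell
#!/bin/sh
# Sketch only: helper names and the batch.json file name are hypothetical.

# Write a _bulk_docs payload holding keys START..END to file OUT.
make_batch() {
    mb_start=$1; mb_end=$2; mb_out=$3
    {
        printf '{"docs":['
        mb_sep=''
        for mb_i in $(seq "$mb_start" "$mb_end"); do
            printf '%s{"key":"%s"}' "$mb_sep" "$mb_i"
            mb_sep=','
        done
        printf ']}'
    } > "$mb_out"
}

# POST keys 1..TOTAL to the database at URL, at most BATCH documents per request.
post_in_batches() {
    pb_url=$1; pb_total=$2; pb_batch=$3
    pb_start=1
    while [ "$pb_start" -le "$pb_total" ]; do
        pb_end=$((pb_start + pb_batch - 1))
        if [ "$pb_end" -gt "$pb_total" ]; then pb_end=$pb_total; fi
        make_batch "$pb_start" "$pb_end" batch.json
        # Stop at the first failed request: curl exits non-zero on an empty
        # reply, and -f makes it exit non-zero on an HTTP error as well.
        curl -sS -f -X POST "$pb_url/_bulk_docs" \
             -H 'Content-Type: application/json' \
             -d @batch.json > /dev/null || return 1
        pb_start=$((pb_end + 1))
    done
}
```

For example, with an assumed batch size of 10000 documents:
post_in_batches http://127.0.0.1:5984/test 2000000 10000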
The _design/_replicator error shows up with the multipart transfer that cURL
uses by default in such cases. The crash is registered once the second piece
is sent to the server. The log entry for the first piece looks like:
[Sat, 18 Aug 2012 00:57:38 GMT] [debug] [<0.170.0>] 'POST' /test/_bulk_docs {1,1} from "127.0.0.1"
I hope this info may help.
CGS
On Fri, Aug 17, 2012 at 7:30 PM, Tim Tisdall wrote:
> Okay, so it always states that _replicator line any time I manually
> restart the server. I think it's just a standard logging message when
> the level is set to "debug".
>
> On Fri, Aug 17, 2012 at 1:13 PM, Tim Tisdall wrote:
> > No. All my ids (except for design documents) are strings containing
> > integers. Also, none of my design documents are called anything like
> > "_replicator". The only thing with that name is in the _replicator
> > database which I'm not doing anything with.
> >
> > Why does it say "Include Doc"? And what's that series of numbers
> > afterwards? That log message seems to consistently occur just before
> > the log message about the server starting. Is that just a normal
> > message you get when the server restarts and you have logging set to
> > "debug"?
> >
> >
> > On Fri, Aug 17, 2012 at 1:03 PM, Robert Newson wrote:
> >>
> >> Does app_stats_test contain a document called _design/_replicator or is
> >> a document with that id in the body of your bulk post?
> >>
> >> B.
> >>
> >> On 17 Aug 2012, at 17:52, Tim Tisdall wrote:
> >>
> >>> I do have UTF8 characters in the JSON, but isn't that acceptable? I
> >>> have no problem retrieving UTF8 encoded content from the server and I
> >>> have a bunch of it saved in there already too.
> >>>
> >>> On Fri, Aug 17, 2012 at 10:35 AM, CGS wrote:
> >>>> Hi,
> >>>>
> >>>> Do you have somehow special characters (non-latin1 ones) in your JSON?
> >>>> That error looks strangely close to trying to transform a list of unicode
> >>>> characters into a binary. I might be wrong though.
> >>>>
> >>>> CGS
> >>>>
> >>>>
> >>>>
> >>>> On Fri, Aug 17, 2012 at 4:09 PM, Tim Tisdall wrote:
> >>>>
> >>>>> I thought I added that to the init script before when you mentioned
> >>>>> it, but I checked and it was gone. I added a "cd ~couchdb" in there
> >>>>> and now I no longer get eaccess errors, but the process still crashes
> >>>>> with very little information:
> >>>>>
> >>>>> [Fri, 17 Aug 2012 14:01:44 GMT] [debug] [<0.1372.0>] 'POST'
> >>>>> /app_stats_test/_bulk_docs {1,0} from "127.0.0.1"
> >>>>> Headers: [{'Accept',"*/*"},
> >>>>> {'Content-Length',"3902444"},
> >>>>> {'Content-Type',"application/json"},
> >>>>> {'Host',"localhost:5984"}]
> >>>>> [Fri, 17 Aug 2012 14:01:44 GMT] [debug] [<0.1372.0>] OAuth Params: []
> >>>>> [Fri, 17 Aug 2012 14:02:16 GMT] [debug] [<0.115.0>] Include Doc:
> >>>>> <<"_design/_replicator">> {1,
> >>>>>     <<91,250,44,153,
> >>>>>       238,254,43,46,
> >>>>>       180,150,45,181,
> >>>>>       10,163,207,212>>}
> >>>>> [Fri, 17 Aug 2012 14:02:17 GMT] [info] [<0.32.0>] Apache CouchDB has
> >>>>> started on http://127.0.0.1:5984/
> >>>>>
> >>>>>
> >>>>> Someone mentioned seeing the JSON that I'm submitting... Wouldn't
> >>>>> mal-formed JSON throw an error?
> >>>>>
> >>>>> -Tim
> >>>>>
> >>>>>
> >>>>> On Fri, Aug 17, 2012 at 4:33 AM, Robert Newson wrote:
> >>>>>>
> >>>>>> I've seen couchdb start despite the eacces errors before and tracked it
> >>>>>> down to the current working directory setting. It seems that the cwd is
> >>>>>> searched first, and then erlang looks elsewhere. So, if our startup script
> >>>>>> doesn't change it to somewhere that the couchdb user can read, you get
> >>>>>> spurious eacces errors.
> >>>>>>
> >>>>>> Don't ask me how I know this.
> >>>>>>
> >>>>>> B.
> >>>>>>
> >>>>>> On 16 Aug 2012, at 20:19, Tim Tisdall wrote:
> >>>>>>
> >>>>>>> Paul, did you ever solve the eaccess problem you had described here:
> >>>>>>>
> >>>>>>> http://mail-archives.apache.org/mod_mbox/couchdb-user/201106.mbox/%[email protected]%3E
> >>>>>>> I found that post from doing Google searches for my issue.
> >>>>>>>
> >>>>>>> On Tue, Aug 14, 2012 at 11:41 PM, Paul Davis wrote:
> >>>>>>>> On Tue, Aug 14, 2012 at 9:38 PM, Tim Tisdall wrote:
> >>>>>>>>> I'm still having problems with couchdb, but I'm trying out different
> >>>>>>>>> things to see if I can narrow down what the problem is...
> >>>>>>>>>
> >>>>>>>>> I stopped using fsockopen() in PHP and am using curl now to hopefully
> >>>>>>>>> be able to see more debugging info.
> >>>>>>>>>
> >>>>>>>>> I get an empty response when sending a POST to _bulk_docs. From the
> >>>>>>>>> couch logs it seems like the server restarts in the middle of
> >>>>>>>>> processing the request. Here's what I have in my logs: (I have no
> >>>>>>>>> idea what the _replicator portion is about there, I'm currently not
> >>>>>>>>> using it)
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> [Wed, 15 Aug 2012 02:27:30 GMT] [debug] [<0.1255.0>] 'POST'
> >>>>>>>>> /app_stats_test/_bulk_docs {1,0} from "127.0.0.1"
> >>>>>>>>> Headers: [{'Accept',"*/*"},
> >>>>>>>>> {'Content-Length',"2802300"},
> >>>>>>>>> {'Content-Type',"application/json"},
> >>>>>>>>> {'Host',"localhost:5984"}]
> >>>>>>>>> [Wed, 15 Aug 2012 02:27:30 GMT] [debug] [<0.1255.0>] OAuth Params: []
> >>>>>>>>> [Wed, 15 Aug 2012 02:27:45 GMT] [debug] [<0.115.0>] Include Doc:
> >>>>>>>>> <<"_design/_replicator">> {1,
> >>>>>>>>>     <<91,250,44,153,
> >>>>>>>>>       238,254,43,46,
> >>>>>>>>>       180,150,45,181,
> >>>>>>>>>       10,163,207,212>>}
> >>>>>>>>> [Wed, 15 Aug 2012 02:27:45 GMT] [info] [<0.32.0>] Apache CouchDB has
> >>>>>>>>> started on http://127.0.0.1:5984/
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> In my code logs I have the following by running curl in verbose mode:
> >>>>>>>>>
> >>>>>>>>> * About to connect() to localhost port 5984 (#0)
> >>>>>>>>> * Trying 127.0.0.1... * connected
> >>>>>>>>> * Connected to localhost (127.0.0.1) port 5984 (#0)
> >>>>>>>>> > POST /app_stats_test/_bulk_docs HTTP/1.0
> >>>>>>>>> Host: localhost:5984
> >>>>>>>>> Accept: */*
> >>>>>>>>> Content-Type: application/json
> >>>>>>>>> Content-Length: 2802300
> >>>>>>>>>
> >>>>>>>>> * Empty reply from server
> >>>>>>>>> * Connection #0 to host localhost left intact
> >>>>>>>>> curl error: 52 : Empty reply from server
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I also tried using HTTP/1.1 and I get an empty response after
> >>>>>>>>> receiving only a "100 Continue", but the end result appears the same.
> >>>>>>>>>
> >>>>>>>>> -Tim
> >>>>>>>>
> >>>>>>>> If you have a request that triggers this, a good way to catch it is
> >>>>>>>> like such:
> >>>>>>>>
> >>>>>>>> $ /usr/local/bin/couchdb # or however you start it
> >>>>>>>> $ ps ax | grep beam.smp # Get the pid of couchdb
> >>>>>>>> $ gdb
> >>>>>>>> (gdb) attach $pid # Where $pid was just found with ps. Might
> >>>>>>>>                   # throw up an access prompt
> >>>>>>>> (gdb) continue
> >>>>>>>> # At this point, run the command that makes couchdb reboot in a
> >>>>>>>> # different console. If it happens you should see Gdb notice the
> >>>>>>>> # error. Then the following:
> >>>>>>>> (gdb) t a a bt
> >>>>>>>>
> >>>>>>>> And that should spew out a bunch of stack traces. If you can get that
> >>>>>>>> we should be able to fairly specifically narrow down the issue.
> >>>>>>
> >>>>>
> >>
>