I've now got a huge rash of crashes on another (slightly less critical)
production server. Some are similar to this. I did find one thread that seems
about as inconclusive, but it does roughly match and at least has a few
somewhat cryptic suggestions:
http://mail-archives.apache.org/mod_mbox/couchdb-user/201408.mbox/%3C3B310764-6F57-4208-ADEC-381D23CC170B%40apache.org%3E
The "nice" thing about the current situation is that it stays up for fifteen
minutes tops before crashing. So I flipped debug logging on. Didn't get much
more (most crashes are signalled by nothing other than the "CouchDB has
started" log line), but here's one with a bit more info. I'll try to figure out
which vhosts setting is being referred to in the linked thread and see if it
has any effect.
-nvw
[Wed, 29 Oct 2014 19:54:30 GMT] [debug] [<0.444.0>] OAuth Params: []
[Wed, 29 Oct 2014 19:54:30 GMT] [debug] [<0.102.0>] DDocProc found for DDocKey:
{<<"_design/ipcalf">>,
<<"46-bb2d975f3e712e0077c884153c48ad09">>}
[Wed, 29 Oct 2014 19:54:31 GMT] [debug] [<0.253.0>] OS Process #Port<0.2532>
Input :: ["reset",{"reduce_limit":true,"timeout":5000}]
[Wed, 29 Oct 2014 19:54:32 GMT] [debug] [<0.253.0>] OS Process #Port<0.2532>
Output :: true
[Wed, 29 Oct 2014 19:54:32 GMT] [debug] [<0.253.0>] OS Process #Port<0.2532>
Input ::
["ddoc","_design/ipcalf",["shows","address"],[null,{"info":{"db_name":"public","doc_count":60,"doc_del_count":2,"update_seq":184,"purge_seq":0,"compact_running":false,"disk_size":16244847,"data_size":16042203,"instance_start_time":"1414612231302700","disk_format_version":6,"committed_update_seq":184},"id":null,"uuid":"b9fd1dce73cc155b922bda0c230091de","method":"GET","requested_path":[],"path":["public","_design","ipcalf","_show","address"],"raw_path":"/public/_design/ipcalf/_show/address/","query":{},"headers":{"Connection":"close","Host":"ipcalf.com","User-Agent":"Mozilla/5.0
(compatible; monitis - premium monitoring service;
http://www.monitis.com)","x-couchdb-vhost-path":"/","X-Forwarded-For":"174.36.220.194"},"body":"undefined","peer":"174.36.220.194","form":{},"cookie":{},"userCtx":{"db":"public","name":null,"roles":[]},"secObj":{}}]]
[Wed, 29 Oct 2014 19:54:33 GMT] [debug] [<0.253.0>] OS Process #Port<0.2532>
Output ::
["resp",{"headers":{"Access-Control-Allow-Origin":"*","Content-Type":"text/html;
charset=utf-8"},"body":"Your IP address is: <h1>174.36.220.194</h1> Have a
nice day.\n"}]
[Wed, 29 Oct 2014 19:54:33 GMT] [info] [<0.444.0>] 174.36.220.194 - - GET
/public/_design/ipcalf/_show/address/ 200
[Wed, 29 Oct 2014 19:55:13 GMT] [error] [<0.107.0>] {error_report,<0.31.0>,
{<0.107.0>,crash_report,
[[{initial_call,
{mochiweb_acceptor,init,
['Argument__1','Argument__2','Argument__3']}},
{pid,<0.107.0>},
{registered_name,[]},
{error_info,
{exit,
{noproc,
{gen_server,call,[couch_httpd_vhost,get_state]}},
[{gen_server,call,2,
[{file,"gen_server.erl"},{line,180}]},
{couch_httpd_vhost,dispatch_host,1,
[{file,
"/home/ubuntu/bc3/dependencies/couchdb/src/couchdb/couch_httpd_vhost.erl"},
{line,96}]},
{couch_httpd,handle_request,5,
[{file,
"/home/ubuntu/bc3/dependencies/couchdb/src/couchdb/couch_httpd.erl"},
{line,232}]},
{mochiweb_http,headers,5,
[{file,
"/home/ubuntu/bc3/dependencies/couchdb/src/mochiweb/mochiweb_http.erl"},
{line,94}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,239}]}]}},
{ancestors,
[couch_httpd,couch_secondary_services,
couch_server_sup,<0.32.0>]},
{messages,[]},
{links,[<0.106.0>,#Port<0.2074>]},
{dictionary,[{couch_rewrite_count,0}]},
{trap_exit,false},
{status,running},
{heap_size,1598},
{stack_size,27},
{reductions,1011}],
[]]}}
[Wed, 29 Oct 2014 19:55:14 GMT] [info] [<0.32.0>] Apache CouchDB has started on
http://127.0.0.1:5984/
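
In case it helps anyone compare notes, here is a minimal Python sketch for measuring how long the server stays up between restarts, using the "Apache CouchDB has started" lines in a log like the one above. It assumes the CouchDB 1.x timestamp format shown in these logs and that each log entry starts on its own line. The function names are made up for illustration and are not part of CouchDB.

```python
import re
from datetime import datetime

# Matches the start-up line as it appears in the log above, e.g.
# [Wed, 29 Oct 2014 19:55:14 GMT] [info] [<0.32.0>] Apache CouchDB has started on ...
START_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[info\] \[[^\]]+\] Apache CouchDB has started"
)
# Assumed timestamp layout, taken from the log excerpt (always GMT there).
TS_FORMAT = "%a, %d %b %Y %H:%M:%S GMT"


def restart_times(lines):
    """Return the timestamp of every 'Apache CouchDB has started' line."""
    times = []
    for line in lines:
        match = START_RE.match(line)
        if match:
            times.append(datetime.strptime(match.group("ts"), TS_FORMAT))
    return times


def uptimes_seconds(times):
    """Return the seconds elapsed between consecutive start-up lines."""
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
```

To use it, pass an open log file (for example `open(path)`, where `path` is wherever your couch.log lives) to `restart_times`, then feed the result to `uptimes_seconds`. Short, regular gaps would suggest something periodic rather than load-related.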
On Oct 9, 2014, at 10:33 AM, Nathan Vander Wilt wrote:
> Any idea what might have caused the second crash, at bottom of this email?
> Yesterday the same CouchDB server went down like this and didn't come back up:
>
> -- first crash
> heart: Wed Oct 8 10:31:25 2014: Erlang has closed.
> Segmentation fault (core dumped)
> sh: echo: I/O error
> heart: Wed Oct 8 10:31:26 2014: Executed
> "/home/natevw/bc16/build/bin/couchdb -k" -> 256. Terminating.
>
> …which may have been because I was just starting it from crontab and hoping the
> `-b -r 5` options would actually work. By today I've got the daemonization
> more properly set up, using upstart and its respawn option.
>
> No big outage today, however I did notice another crash in the logs — I'd
> like to avoid the daemon restarting at all in routine use if possible. I
> don't see anything particularly useful/interesting as to the cause of the
> crash…does the backtrace below imply anything in particular?
>
> The main difference the last two days is that this system is now back under
> some load (maybe 50 users, up from maybe one or two in preceding weeks).
> Right now (under "higher" load) the server is showing "0.00, 0.01, 0.05" load
> average and 2.6 of 3.7GB memory free, so it doesn't seem offhand we're
> pushing the system too hard. Besides basic reads/writes/view stuff, we still
> haven't migrated off use of per-user filtered changes, which is the only
> thing I can think of that might lead to a load-related problem.
>
> thanks,
> -natevw
>
>
>
> -- second crash
>
> [Thu, 09 Oct 2014 15:23:24 GMT] [info] [<0.21979.2>] 127.0.0.1 - - GET
> /production-db/org.couchdb.user%3Au123456 200
> [Thu, 09 Oct 2014 15:23:26 GMT] [error] [<0.108.0>] {error_report,<0.31.0>,
> {<0.108.0>,crash_report,
> [[{initial_call,
> {mochiweb_acceptor,init,
> ['Argument__1','Argument__2','Argument__3']}},
> {pid,<0.108.0>},
> {registered_name,[]},
> {error_info,
> {exit,
> {noproc,
> {gen_server,call,[couch_httpd_vhost,get_state]}},
> [{gen_server,call,2,
> [{file,"gen_server.erl"},{line,180}]},
> {couch_httpd_vhost,dispatch_host,1,
> [{file,
>
> "/home/natevw/bc16/dependencies/couchdb/src/couchdb/couch_httpd_vhost.erl"},
> {line,96}]},
> {couch_httpd,handle_request,5,
> [{file,
>
> "/home/natevw/bc16/dependencies/couchdb/src/couchdb/couch_httpd.erl"},
> {line,217}]},
> {mochiweb_http,headers,5,
> [{file,
>
> "/home/natevw/bc16/dependencies/couchdb/src/mochiweb/mochiweb_http.erl"},
> {line,94}]},
> {proc_lib,init_p_do_apply,3,
> [{file,"proc_lib.erl"},{line,239}]}]}},
> {ancestors,
> [couch_httpd,couch_secondary_services,
> couch_server_sup,<0.32.0>]},
> {messages,[]},
> {links,[<0.107.0>,#Port<0.2017>]},
> {dictionary,[{couch_rewrite_count,0}]},
> {trap_exit,false},
> {status,running},
> {heap_size,2586},
> {stack_size,27},
> {reductions,1173}],
> []]}}
> [Thu, 09 Oct 2014 15:23:26 GMT] [info] [<0.32.0>] Apache CouchDB has started
> on http://127.0.0.1:55984/
>