Thanks Adam,

We just tried that and it seems to hold up. Is there some kind of formula for what to set ERL_FLAGS to?

Herman

On 2014-05-01, at 10:51 AM, Adam Kocoloski <[email protected]> wrote:

> Hi Herman, I think those are just the view groups shutting down after the
> parent DB crashed because you ran out of processes.
>
> You can increase the maximum number of processes via the ERL_FLAGS
> environment variable, e.g.
>
>> $ ERL_FLAGS="+P 512000" erl
>> Erlang R14B01 (erts-5.8.2) [source] [64-bit] [smp:4:4] [rq:4]
>> [async-threads:0] [hipe] [kernel-poll:false]
>>
>> Eshell V5.8.2 (abort with ^G)
>> 1> erlang:system_info(process_limit).
>> 512000
>
> The default is 256k, assuming you've got enough RAM you can bump that up to
> 1M with impunity. Regards,
>
> Adam
>
> On May 1, 2014, at 10:43 AM, Herman Chan <[email protected]> wrote:
>
>> We do have 1000+ connection to the db, which we are trying to dial down.
>> However, even with lower connection, we hit the crash again, this time I was
>> able to get a better log. You are right that we are hitting some limit,
>>
>> before the crash, the log shows that couch is still trying to open up index
>> from a reboot that we did. Once it crash, the log start print out with
>> "Index shutdown by monitor". Is there any limit parameter that we can
>> increase?
>>
>> [Thu, 01 May 2014 14:28:04 GMT] [error] [emulator] Too many processes
>> [Thu, 01 May 2014 14:28:04 GMT] [error] [emulator] Error in process
>> <0.3672.477> with exit value:
>> {system_limit,[{erlang,spawn_opt,[proc_lib,init_p,[<0.3672.477>,[],gen,init_it,[
>> gen_server,<0.3672.477>,<0.3672.477>,couch_db,{<<42
>> bytes>>,"/usr/local/var/lib/couchdb/group_370c0635-e593-45ed-ac96-75e6b318cb35.couch",<0.21556.480>,[{user_ctx,{user_ctx,null,
>> [<<6 bytes>>],undefined...
>> >> >> [Thu, 01 May 2014 14:28:04 GMT] [error] [<0.21556.480>] ** Generic server >> <0.21556.480> terminating >> ** Last message in was {'EXIT',<0.3672.477>, >> {system_limit, >> [{erlang,spawn_opt, >> [proc_lib,init_p, >> [<0.3672.477>,[],gen,init_it, >> [gen_server,<0.3672.477>,<0.3672.477>,couch_db, >> >> {<<"group_370c0635-e593-45ed-ac96-75e6b318cb35">>, >> >> "/usr/local/var/lib/couchdb/group_370c0635-e593-45ed-ac96-75e6b318cb35.couch", >> <0.21556.480>, >> [{user_ctx, >> {user_ctx,null,[<<"_admin">>],undefined}}]}, >> []]], >> [link]]}, >> {proc_lib,start_link,5}, >> {couch_db,start_link,3}, >> {couch_server,'-open_async/5-fun-0-',4}]}} >> ** When Server state == {file, >> {file_descriptor,prim_file, >> {#Port<0.898531>,307709}}, >> 1261681} >> ** Reason for termination == >> ** {system_limit, >> [{erlang,spawn_opt, >> [proc_lib,init_p, >> [<0.3672.477>,[],gen,init_it, >> [gen_server,<0.3672.477>,<0.3672.477>,couch_db, >> {<<"group_370c0635-e593-45ed-ac96-75e6b318cb35">>, >> >> "/usr/local/var/lib/couchdb/group_370c0635-e593-45ed-ac96-75e6b318cb35.couch", >> <0.21556.480>, >> [{user_ctx,{user_ctx,null,[<<"_admin">>],undefined}}]}, >> []]], >> [link]]}, >> {proc_lib,start_link,5}, >> {couch_db,start_link,3}, >> {couch_server,'-open_async/5-fun-0-',4}]} >> >> [Thu, 01 May 2014 14:28:04 GMT] [error] [<0.21556.480>] >> {error_report,<0.31.0>, >> {<0.21556.480>,crash_report, >> [[{initial_call,{couch_file,init,['Argument__1']}}, >> {pid,<0.21556.480>}, >> {registered_name,[]}, >> {error_info, >> {exit, >> {system_limit, >> [{erlang,spawn_opt, >> [proc_lib,init_p, >> [<0.3672.477>,[],gen,init_it, >> [gen_server,<0.3672.477>,<0.3672.477>, >> couch_db, >> >> {<<"group_370c0635-e593-45ed-ac96-75e6b318cb35">>, >> >> "/usr/local/var/lib/couchdb/group_370c0635-e593-45ed-ac96-75e6b318cb35.couch", >> <0.21556.480>, >> [{user_ctx, >> {user_ctx,null, >> [<<"_admin">>], >> undefined}}]}, >> []]], >> [link]]}, >> {proc_lib,start_link,5}, >> {couch_db,start_link,3}, >> 
{couch_server,'-open_async/5-fun-0-',4}]}, >> [{gen_server,terminate,6}, >> {proc_lib,init_p_do_apply,3}]}}, >> {ancestors,[<0.3672.477>]}, >> {messages,[]}, >> {links,[]}, >> {dictionary,[]}, >> {trap_exit,true}, >> {status,running}, >> {heap_size,610}, >> {stack_size,24}, >> {reductions,973}], >> []]}} >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.20971.87>] Index shutdown by >> monitor notice for db: group_5747d16f-4b3b-4522-af10-1dc7d0d644aa idx: >> _design/filters >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4883.35>] Index shutdown by >> monitor notice for db: group_15ccf331-257d-4b54-b457-997d342816b9 idx: >> _design/hub >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4892.35>] Index shutdown by >> monitor notice for db: group_15ccf331-257d-4b54-b457-997d342816b9 idx: >> _design/filters >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.12040.33>] Index shutdown by >> monitor notice for db: group_d006a71d-b0de-4d71-b2f7-06abeeb34e00 idx: >> _design/filters >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.20971.87>] Closing index for db: >> group_5747d16f-4b3b-4522-af10-1dc7d0d644aa idx: _design/filters sig: >> "3e823c2a4383ac0c18d4e574135a5b08" >> reason: normal >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.12032.33>] Index shutdown by >> monitor notice for db: group_d006a71d-b0de-4d71-b2f7-06abeeb34e00 idx: >> _design/hub >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4892.35>] Closing index for db: >> group_15ccf331-257d-4b54-b457-997d342816b9 idx: _design/filters sig: >> "3e823c2a4383ac0c18d4e574135a5b08" >> reason: normal >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4292.4>] Index shutdown by >> monitor notice for db: group_ae50933f-de22-4879-9624-b760106060b3 idx: >> _design/filters >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4285.4>] Index shutdown by >> monitor notice for db: group_ae50933f-de22-4879-9624-b760106060b3 idx: >> _design/hub >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.20956.87>] Index shutdown by >> monitor notice for db: 
group_5747d16f-4b3b-4522-af10-1dc7d0d644aa idx: >> _design/hub >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4883.35>] Closing index for db: >> group_15ccf331-257d-4b54-b457-997d342816b9 idx: _design/hub sig: >> "4f6edcabc4b7a6357b714e1391ed93ac" >> reason: normal >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.12040.33>] Closing index for db: >> group_d006a71d-b0de-4d71-b2f7-06abeeb34e00 idx: _design/filters sig: >> "3e823c2a4383ac0c18d4e574135a5b08" >> reason: normal >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.18850.44>] Index shutdown by >> monitor notice for db: group_721d99a3-2257-48d0-8a1e-89294874d06e idx: >> _design/filters >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.18842.44>] Index shutdown by >> monitor notice for db: group_721d99a3-2257-48d0-8a1e-89294874d06e idx: >> _design/hub >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.12032.33>] Closing index for db: >> group_d006a71d-b0de-4d71-b2f7-06abeeb34e00 idx: _design/hub sig: >> "4f6edcabc4b7a6357b714e1391ed93ac" >> reason: normal >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.27768.43>] Index shutdown by >> monitor notice for db: group_56a0df90-c79e-4863-ae71-2bde3cb0d801 idx: >> _design/hub >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.27775.43>] Index shutdown by >> monitor notice for db: group_56a0df90-c79e-4863-ae71-2bde3cb0d801 idx: >> _design/filters >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4292.4>] Closing index for db: >> group_ae50933f-de22-4879-9624-b760106060b3 idx: _design/filters sig: >> "3e823c2a4383ac0c18d4e574135a5b08" >> reason: normal >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4285.4>] Closing index for db: >> group_ae50933f-de22-4879-9624-b760106060b3 idx: _design/hub sig: >> "4f6edcabc4b7a6357b714e1391ed93ac" >> reason: normal >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.6010.43>] Index shutdown by >> monitor notice for db: group_7f082ae6-f41d-4a14-a836-2360303b2e9a idx: >> _design/filters >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.6003.43>] Index shutdown by >> monitor 
notice for db: group_7f082ae6-f41d-4a14-a836-2360303b2e9a idx: >> _design/hub >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.20956.87>] Closing index for db: >> group_5747d16f-4b3b-4522-af10-1dc7d0d644aa idx: _design/hub sig: >> "4f6edcabc4b7a6357b714e1391ed93ac" >> reason: normal >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.5933.42>] Index shutdown by >> monitor notice for db: group_8c49d7e8-b61e-41e5-a220-11df59b9cce4 idx: >> _design/hub >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.5940.42>] Index shutdown by >> monitor notice for db: group_8c49d7e8-b61e-41e5-a220-11df59b9cce4 idx: >> _design/filters >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.18842.44>] Closing index for db: >> group_721d99a3-2257-48d0-8a1e-89294874d06e idx: _design/hub sig: >> "4f6edcabc4b7a6357b714e1391ed93ac" >> reason: normal >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.17529.33>] Index shutdown by >> monitor notice for db: group_98ff493c-63e8-4714-9940-ccea514d4b1d idx: >> _design/hub >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.18850.44>] Closing index for db: >> group_721d99a3-2257-48d0-8a1e-89294874d06e idx: _design/filters sig: >> "3e823c2a4383ac0c18d4e574135a5b08" >> reason: normal >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.17536.33>] Index shutdown by >> monitor notice for db: group_98ff493c-63e8-4714-9940-ccea514d4b1d idx: >> _design/filters >> [Thu, 01 May 2014 14:28:04 GMT] [info] [<0.27768.43>] Closing index for db: >> group_56a0df90-c79e-4863-ae71-2bde3cb0d801 idx: _design/hub sig: >> "4f6edcabc4b7a6357b714e1391ed93ac" >> >> On 2014-05-01, at 9:18 AM, Adam Kocoloski <[email protected]> wrote: >> >>> On May 1, 2014, at 8:47 AM, Interactive Blueprints >>> <[email protected]> wrote: >>> >>>> 2014-05-01 13:14 GMT+02:00 Herman Chan <[email protected]>: >>>>> Thanks Adam, >>>>> >>>>> It seems like it is happening again, with more info this time. It looks >>>>> like I am hitting some sort of system limit, can anyone point out where >>>>> to look next? 
>>>>
>>>> Just guessing here..
>>>> It could be that you hit the max open file limit of your system.
>>>> With "ulimit -a" you can see the limits on your system.
>>>> Usually the max open file limit is somewhere around 1024.
>>>> I noticed that couchdb loves to have a lot of files open simultaneously.
>>>>
>>>> In the same shell you start couchdb, right before you start couchdb,
>>>> you can do a "ulimit -n 4096" (or another large value), this should
>>>> give couchdb the ability to open more files.
>>>>
>>>> Hope this helps.
>>>>
>>>> Pieter van der Eems
>>>> Interactive Blueprints
>>>
>>> That's a good thought Pieter, though typically in that case you'll see an
>>> 'emfile' error in the logs. This particular system_limit error (with
>>> {erlang, spawn_link, ...} following it) occurs when the Erlang VM has
>>> reached the maximum number of processes it's allowed to spawn. Judging from
>>> the *long* list of processes linked to couch_httpd in this stacktrace I'd
>>> say Herman's client is improperly leaving connections open. Herman, did you
>>> intend to have 1000s of open TCP connections on this server? Regards,
>>>
>>> Adam
>>
>
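
A quick way to tell apart the two failure modes discussed in this thread is to scan the log for their markers. The sketch below is illustrative only: the function name classify_limit_errors and the marker lists are assumptions based on the messages quoted above ('emfile' for the open-file limit; "Too many processes" and system_limit for the Erlang process limit). Other CouchDB or Erlang versions may word these errors differently, and system_limit can also be raised for other VM limits, so treat the counts as a hint rather than a diagnosis.

```python
# Markers taken from the messages quoted in this thread (assumption: other
# CouchDB / Erlang releases may phrase these errors differently).
PROCESS_LIMIT_MARKERS = ("Too many processes", "system_limit")
FILE_LIMIT_MARKERS = ("emfile",)


def classify_limit_errors(log_text):
    """Count log lines hinting at the process limit vs. the open-file limit."""
    counts = {"process_limit": 0, "file_limit": 0}
    for line in log_text.splitlines():
        if any(marker in line for marker in PROCESS_LIMIT_MARKERS):
            counts["process_limit"] += 1
        elif any(marker in line for marker in FILE_LIMIT_MARKERS):
            counts["file_limit"] += 1
    return counts


if __name__ == "__main__":
    # Two lines copied from the log excerpt in this thread.
    sample = "\n".join([
        "[Thu, 01 May 2014 14:28:04 GMT] [error] [emulator] Too many processes",
        "[Thu, 01 May 2014 14:28:04 GMT] [info] [<0.4883.35>] Index shutdown by "
        "monitor notice for db: group_15ccf331-257d-4b54-b457-997d342816b9 idx: "
        "_design/hub",
    ])
    print(classify_limit_errors(sample))
```

If process_limit dominates, the +P setting passed through ERL_FLAGS is the relevant knob; if file_limit dominates, the open-file ulimit is the place to look.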
