Thanks for the explanation :) All of this is making a little more sense now. With Paul's tip on what sort of open-file requirements I should expect, it's clear the time has come to refactor my code so that we open all of the databases less frequently. For now, I can change the init script to set the limits properly, and that will be a band-aid until I can properly test that refactoring :)
Thanks for your help, everyone! I went off to lunch, and then thought to myself, "I wonder if I have any responses." Lo and behold, I did :)

-Jon

On Wed, Dec 1, 2010 at 2:00 PM, Robert Newson <[email protected]> wrote:
> You aren't launching couchdb with anything that supports PAM.
>
> look in /etc/pam.d for a list of services that will honor limits.conf.
>
> On my system (Debian), /etc/pam.d/su does not honor limits.conf by
> default. Even if you enable it, the couchdb startup script doesn't use
> su anyway, so it still doesn't help.
>
> shorter version: PAM and limits.conf is for interactive users, not daemons.
>
> B.
>
> On Wed, Dec 1, 2010 at 7:55 PM, Jonathan Johnson <[email protected]> wrote:
>> Ah, you're absolutely right -- it didn't work. I'm still at 1024
>> files. Well, that answers part of the question. If all else fails, I
>> could use your method by updating the init.d script a little.
>>
>> Does anyone have any ideas as to why limits.conf doesn't work? I
>> know my way around setting up a system, but this level of
>> configuration is currently a little above my head :)
>>
>> -Jon
>>
>> On Wed, Dec 1, 2010 at 12:29 PM, Robert Newson <[email protected]> wrote:
>>> look in /proc/<pid>/limits to see if your tweak to limits.conf works. I
>>> doubt it does.
>>>
>>> The way I increase fd limits from the miserly Linux default of 1024 is
>>> with this run script, where couchdb is launched by runit:
>>>
>>> #!/bin/bash
>>> exec 2>&1
>>> export HOME=<dir>
>>> ulimit -n 10000
>>> exec chpst -u <user> couchdb -f
>>>
>>> B.
>>>
>>> On Wed, Dec 1, 2010 at 6:21 PM, Jonathan Johnson <[email protected]> wrote:
>>>> Our couch setup has around 100 databases with a significant number of
>>>> views in each database. Every once in a while, couch takes a dive. I
>>>> happened to be around this time, and saw this in the logs:
>>>>
>>>> [Wed, 01 Dec 2010 18:09:19 GMT] [error] [<0.102.0>] {error_report,<0.31.0>,
>>>>     {<0.102.0>,std_error,
>>>>         {mochiweb_socket_server,225,{acceptor_error,{error,accept_failed}}}}}
>>>>
>>>> [Wed, 01 Dec 2010 18:09:19 GMT] [error] [<0.10711.1125>] {error_report,<0.31.0>,
>>>>     {<0.10711.1125>,std_error,
>>>>         [{application,mochiweb},
>>>>          "Accept failed error","{error,emfile}"]}}
>>>>
>>>> [Wed, 01 Dec 2010 18:09:19 GMT] [error] [<0.10711.1125>] {error_report,<0.31.0>,
>>>>     {<0.10711.1125>,crash_report,
>>>>         [[{initial_call,{mochiweb_socket_server,acceptor_loop,['Argument__1']}},
>>>>           {pid,<0.10711.1125>},
>>>>           {registered_name,[]},
>>>>           {error_info,
>>>>               {exit,
>>>>                   {error,accept_failed},
>>>>                   [{mochiweb_socket_server,acceptor_loop,1},
>>>>                    {proc_lib,init_p_do_apply,3}]}},
>>>>           {ancestors,
>>>>               [couch_httpd,couch_secondary_services,couch_server_sup,<0.32.0>]},
>>>>           {messages,[]},
>>>>           {links,[<0.102.0>]},
>>>>           {dictionary,[]},
>>>>           {trap_exit,false},
>>>>           {status,running},
>>>>           {heap_size,233},
>>>>           {stack_size,24},
>>>>           {reductions,202}],
>>>>          []]}}
>>>>
>>>> [Wed, 01 Dec 2010 18:09:19 GMT] [error] [<0.102.0>] {error_report,<0.31.0>,
>>>>     {<0.102.0>,std_error,
>>>>         {mochiweb_socket_server,225,{acceptor_error,{error,accept_failed}}}}}
>>>>
>>>> I had run into an open files limit before, and had adjusted a few
>>>> settings. Here are some of the config values I think are relevant:
>>>>
>>>> max_dbs_open = 100
>>>> max_connections = 2048
>>>>
>>>> From /etc/security/limits.conf:
>>>> couchdb hard nofile 4096
>>>> couchdb soft nofile 4096
>>>>
>>>> The installed version is 1.0.1.
>>>>
>>>> I'm not sure how to debug this issue further. It only happens after
>>>> several days of usage, and once it happens, I can't even ask for the
>>>> stats page to see what the current numbers are :)
>>>>
>>>> Thanks in advance for any help!
>>>> -Jon
