Thanks Robert. That makes sense.

On Thu, Aug 23, 2012 at 11:34 AM, Robert Newson <[email protected]> wrote:
>
> CouchDB is already being monitored by "heart". Monit should monitor heart.
>
> B.
>
> On 23 Aug 2012, at 16:07, Nestor Urquiza wrote:
>
> > Hello folks,
> >
> > We are using monit to manage our processes including couchdb. We have
> > noticed alerts about couchdb pid changing and after some inspection we
> > determined couchdb main process pid is not the one saved. For example:
> >
> > sampleadmin@serverint2:~$ ps -ef|grep couch
> > couchdb 16344 1 0 10:34 ? 00:00:00 /bin/sh -e
> >   /usr/local/bin/couchdb -a /usr/local/etc/couchdb/default.ini -a
> >   /usr/local/etc/couchdb/local.ini -b -r 5 -p
> >   /usr/local/var/run/couchdb/couchdb.pid -o /dev/null -e /dev/null -R
> > couchdb 16351 16344 0 10:34 ? 00:00:00 /bin/sh -e
> >   /usr/local/bin/couchdb -a /usr/local/etc/couchdb/default.ini -a
> >   /usr/local/etc/couchdb/local.ini -b -r 5 -p
> >   /usr/local/var/run/couchdb/couchdb.pid -o /dev/null -e /dev/null -R
> > couchdb 16352 16351 34 10:34 ? 00:00:00
> >   /usr/local/lib/erlang/erts-5.8.5/bin/beam.smp -Bd -K true -A 4 -- -root
> >   /usr/local/lib/erlang -progname erl -- -home /usr/local/var/lib/couchdb --
> >   -noshell -noinput -os_mon start_memsup false start_cpu_sup false
> >   disk_space_check_interval 1 disk_almost_full_threshold 1 -sasl errlog_type
> >   error -couch_ini /usr/local/etc/couchdb/default.ini
> >   /usr/local/etc/couchdb/local.ini /usr/local/etc/couchdb/default.ini
> >   /usr/local/etc/couchdb/local.ini -s couch -pidfile
> >   /usr/local/var/run/couchdb/couchdb.pid -heart
> > couchdb 16368 16352 0 10:34 ? 00:00:00 heart -pid 16352 -ht 11
> > couchdb 16371 16352 0 10:34 ? 00:00:00 sh -s disksup
> > 1000 16374 15324 0 10:34 pts/0 00:00:00 grep --color=auto couch
> > sampleadmin@serverint2:~$ cat /usr/local/var/run/couchdb/couchdb.pid
> > 16352
> >
> > Our option is just to check if couchdb is responding and if not then to
> > restart it instead of checking the pid. That works but I am wondering if
> > there is something we are missing and perhaps we should be looking at a
> > different pid file to monit the process just as we do the rest.
> >
> > Thanks in advanced for any help,
> >
> > -Nestor
> > PS: Is there a way to get the history of this list in a searchable form
> > like using MARC or Nabble?
> >
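
For reference, the "check if couchdb is responding" approach mentioned in the quoted message could look roughly like the sketch below. It is only an illustration, not something taken from this thread: it assumes CouchDB is listening on its default address and port (127.0.0.1:5984) and that the server root returns its usual JSON welcome document containing a "couchdb" key. The function names are made up for the example.

```python
import http.client
import json
from urllib.error import URLError
from urllib.request import urlopen

# Assumption: default CouchDB bind address and port; adjust for your install.
COUCHDB_URL = "http://127.0.0.1:5984/"


def looks_like_couchdb(body: bytes) -> bool:
    """Return True if body parses as a JSON object with a "couchdb" key."""
    try:
        doc = json.loads(body.decode("utf-8"))
    except ValueError:
        # Covers both invalid UTF-8 and invalid JSON.
        return False
    return isinstance(doc, dict) and "couchdb" in doc


def couchdb_is_responding(url: str = COUCHDB_URL, timeout: float = 5.0) -> bool:
    """Probe the server root; any error or timeout is treated as 'down'."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200 and looks_like_couchdb(response.read())
    except (URLError, OSError, http.client.HTTPException):
        return False
```

A supervisor script could call couchdb_is_responding() on a schedule and trigger a restart only after several consecutive failures, which avoids depending on which pid ends up in couchdb.pid.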
