This looks and sounds like the same problem I'm having. A restart of a process causes two copies to run; when they are both talking to the same serial port, this is not good.
Here is my verbose monit (4.10) output:

[MST Feb 4 08:27:38] debug : 'bs1' cpu usage check passed [current cpu usage=0.0%]
[MST Feb 4 08:27:38] debug : 'bs1' total mem amount check passed [current total mem amount=776kB]
[MST Feb 4 08:28:00] debug : 'bs1' zombie check passed [status_flag=0000]
[MST Feb 4 08:28:00] debug : 'bs1' PID has not changed since last cycle
[MST Feb 4 08:28:00] debug : 'bs1' PPID has not changed since last cycle
[MST Feb 4 08:28:00] debug : 'bs1' cpu usage check passed [current cpu usage=0.0%]
[MST Feb 4 08:28:00] debug : 'bs1' total mem amount check passed [current total mem amount=776kB]
Mon Feb 4 08:28:07 MST 2008 restart bs1
[MST Feb 4 08:28:07] info : restart service 'bs1' on user request
[MST Feb 4 08:28:07] info : 'bs1' trying to restart
[MST Feb 4 08:28:07] debug : Monitoring disabled -- service bs1
[MST Feb 4 08:28:07] info : 'bs1' stop: /opt/unb/bin/bs.sh
[MST Feb 4 08:28:08] debug : 'bs1' Error testing process id [10793] -- No such process
[MST Feb 4 08:28:08] debug : 'bs1' Error testing process id [10793] -- No such process
[MST Feb 4 08:28:08] debug : 'bs1' Error testing process id [10793] -- No such process
[MST Feb 4 08:28:08] info : 'bs1' start: /opt/unb/bin/bs.sh
[MST Feb 4 08:28:08] debug : 'bs1' Error testing process id [10793] -- No such process
[MST Feb 4 08:28:08] debug : Monitoring enabled -- service bs1
[MST Feb 4 08:28:08] debug : 'bs1' check skipped -- service already handled in a dependency chain
[MST Feb 4 08:28:08] debug : 'bs1' Error testing process id [10793] -- No such process
[MST Feb 4 08:28:09] debug : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb 4 08:28:10] debug : monit: pidfile '/var/run/bs1.pid' does not exist

which continues until

[MST Feb 4 08:30:06] debug : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb 4 08:30:07] debug : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb 4 08:30:08] debug : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb 4 08:30:08] error : 'bs1' process is not running
[MST Feb 4 08:30:08] info : 'bs1' trying to restart
[MST Feb 4 08:30:08] debug : Monitoring disabled -- service bs1
[MST Feb 4 08:30:08] debug : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb 4 08:30:08] debug : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb 4 08:30:08] info : 'bs1' start: /opt/unb/bin/bs.sh
[MST Feb 4 08:30:08] debug : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb 4 08:30:08] debug : Monitoring enabled -- service bs1
[MST Feb 4 08:30:08] debug : monit: pidfile '/var/run/bs1.pid' does not exist
[MST Feb 4 08:30:08] error : 'bs1' failed to start
[MST Feb 4 08:30:09] info : 'bs1' started
[MST Feb 4 08:32:08] info : 'bs1' process is running with pid 1370
[MST Feb 4 08:32:08] debug : 'bs1' zombie check passed [status_flag=0000]
[MST Feb 4 08:32:08] debug : 'bs1' cpu usage check passed [current cpu usage=0.0%]
[MST Feb 4 08:32:08] debug : 'bs1' total mem amount check passed [current total mem amount=556kB]
[MST Feb 4 08:34:08] debug : 'bs1' zombie check passed [status_flag=0000]
[MST Feb 4 08:34:08] debug : 'bs1' PID has not changed since last cycle
[MST Feb 4 08:34:08] debug : 'bs1' PPID has not changed since last cycle
[MST Feb 4 08:34:08] debug : 'bs1' cpu usage check passed [current cpu usage=0.0%]
[MST Feb 4 08:34:08] debug : 'bs1' total mem amount check passed [current total mem amount=556kB]

But there are now 2 copies of the bs1 process running.

This is on FC5. The start/stop script uses the standard start/stop routines for FC5, which include removing the pid file.

On 04/02/2008, Martin Pala <[EMAIL PROTECTED]> wrote:
> Hi,
>
> there was similar problem which was fixed in monit 4.9:
>
> --8<--
> * Fix the extra restart action which was called by monit
>   in addition to user requested start action of stopped
>   process. This didn't occured in the case that the 'every'
>   statement was used on the service definition as well. Thanks
>   to Aaron Scamehorn for help.
> --8<--
>
> It seems however that some applications still has this or similar
> problem (reported by several users).
>
> I'll look on it ...
>
>
> Martin
>
>
>
> Navaneethakrishnan Goapl wrote:
> >
> > Hi,
> >
> > Monit Version : 4.9
> > OS Version : CentOS release 4.4
> >
> > I am facing the following issue more often. Monit is working fine for
> > some time. But at some point of time, if I restart the process, Monit
> > span multiple instance of that process. I see that this is the problem
> > with earlier releases of MONIT. Is this issue still persist in the
> > latest version? Could some one reply to this?
> >
> > root 8580 1 0 05:31 ? 00:00:00 /bin/sh
> > /opt/CSCOacsvw/resources/monit/monit_script.sh jobmanager start
> > root 8591 1 0 05:31 ? 00:00:00 /bin/sh
> > /opt/CSCOacsvw/resources/monit/monit_script.sh jobmanager start
> >
> >
> > Monitrc
> > -------
> >
> > check process Jobmanager with pidfile
> > "/opt/CSCOacsvw/resources/monit/jobmanager.pid"
> > start program = "/opt/CSCOacsvw/resources/monit/monit_script.sh
> > jobmanager start"
> > stop program = "/opt/CSCOacsvw/resources/monit/monit_script.sh
> > jobmanager stop"
> >
> > monit -vc ./monitrc start all
> >
> > Runtime constants:
> > Control file = ./monitrc
> > Log file = /opt/CSCOacsvw/log/monit_errors.log
> > Pid file = /var/run/monit.pid
> > Debug = True
> > Log = True
> > Use syslog = False
> > Is Daemon = True
> > Use process engine = True
> > Poll time = 60 seconds
> > Mail server(s) = localhost
> > Mail from = (not defined)
> > Mail subject = (not defined)
> > Mail message = (not defined)
> > Start monit httpd = True
> > httpd bind address = Any/All
> > httpd portnumber = 2812
> > httpd signature = True
> > Use ssl encryption = False
> > httpd auth. style = Host/Net allow list
> >
> >
> > Process Name = Jobmanager
> > Pid file = /opt/CSCOacsvw/resources/monit/jobmanager.pid
> > Monitoring mode = active
> > Start program = '/opt/CSCOacsvw/resources/monit/monit_script.sh
> > jobmanager start' timeout 1 cycle(s)
> > Stop program = '/opt/CSCOacsvw/resources/monit/monit_script.sh
> > jobmanager stop' timeout 1 cycle(s)
> > Pid = if changed 1 times within 1 cycle(s) then alert
> > Ppid = if changed 1 times within 1 cycle(s) then alert
> >
> >
> > Regards,
> > navanee
> >
> >
>
> --
> To unsubscribe:
> http://lists.nongnu.org/mailman/listinfo/monit-general
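
As a possible stopgap on the script side, the start action could refuse to launch a second copy when the pidfile already names a live process. The sketch below is only an illustration under assumptions: `is_running` and `guarded_start` are hypothetical names, the default pidfile path is taken from the bs1 log above, and this is not the content of bs.sh or monit_script.sh. It does not fix monit itself, and it only helps if the start script writes the pidfile promptly; in the bs1 log the pidfile was still missing when monit issued the second start at 08:30:08.

```shell
#!/bin/sh
# Hypothetical guard for a monit start script (names and paths are illustrative).
PIDFILE="${PIDFILE:-/var/run/bs1.pid}"

# Return 0 if the pidfile exists and names a live process, 1 otherwise.
is_running() {
    [ -f "$PIDFILE" ] || return 1
    pid=$(cat "$PIDFILE")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Launch the command given as arguments in the background and record its pid,
# unless the pidfile already points at a live process.
guarded_start() {
    if is_running; then
        echo "already running"
        return 0
    fi
    # Detach the daemon's output so callers are not blocked on its stdout.
    "$@" >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
    echo "started"
}
```

A start action would then call something like `guarded_start /path/to/daemon`, so that a second start request arriving while the first copy is alive becomes a no-op instead of a duplicate process.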
