I see this happen often as well, and I am sometimes forced to restart monitd entirely.

On Thu, Sep 27, 2012 at 10:53 AM, Brano Zarnovican <[email protected]> wrote:
> Hi,
>
> when I restart a service manually, via its init script (service foo restart),
> it works every time.
> When you try the same with monit (monit restart foo), it will end up
> in "Execution failed" most of the time.
>
> Root cause:
> On a restart action, monit will fork and execute the start program as soon
> as the monitored process disappears, irrespective of whether the stop
> program has finished or is still running, leading to a partial overlap of
> the end of stop execution and the beginning of start.
>
> Typical init script
>
> start() {
>     start service &
>     echo $! > /var/run/foo.pid
> }
> stop() {
>     kill `cat /var/run/foo.pid`
>     rm -f /var/run/foo.pid
> }
>
>
> State #1: process 'foo' is running with pid 100, pid file exists
> monit restart foo
>
> stop:  kill `cat /var/run/foo.pid`
> start: start service &
> start: echo $! > /var/run/foo.pid
> stop:  rm -f /var/run/foo.pid
>
> State #2: process 'foo' is running with pid 200, pid file is missing
>
> (later, monit attempts to start a process which it considers to be down)
> start: start service &
> start: echo $! > /var/run/foo.pid
>
> Depending on how good your scripts are, you end up with either
> State #3a: process 'foo' is running with pid 200, pid file contains
> 300 (failed second process)
> or
> State #3b: process 'foo' is running with pid 200, pid file is still missing
>
> The workaround is to insert a few sleeps here and there (the best place is
> pre-startup). Or save the timestamp of the pid file before killing and
> check whether it has changed just before the 'rm'. Or don't delete the pid
> file at all.
>
> The root of the problem is that there might be code which is
> executed after the process is stopped and which simply cannot overlap with
> start. The pid file is just one example. Imagine that you delete
> some tmp or persistent state file during stop which is also created
> during startup.
>
> Suggested solution:
> Introduce an option that would make monit wait for the end of the stop
> program instead of process termination. More precisely, for the later of
> the two events. Only then would it call the start program.
>
> Regards,
>
> BranoZ
>
> --
> To unsubscribe:
> https://lists.nongnu.org/mailman/listinfo/monit-general
>
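
For what it's worth, here is a rough sketch of the "check before rm" workaround from the quoted message. It compares the pid stored in the file rather than the file's timestamp, which is my own substitution. The helper name remove_pidfile_if_owned, the /tmp/foo.pid default, and "sleep 300" as a stand-in for the real service are all made up for illustration. It only narrows the window (a start could still rewrite the file between the check and the rm), so treat it as a mitigation, not a fix.

```shell
#!/bin/sh
# Hypothetical pid file location; the quoted script uses /var/run/foo.pid.
PIDFILE="${PIDFILE:-/tmp/foo.pid}"

start() {
    # "sleep 300" is only a placeholder for the real service command.
    sleep 300 &
    echo $! > "$PIDFILE"
}

# Remove the pid file only if it still names the pid passed as $1.
# If an overlapping start() has already written a new pid, leave it alone.
remove_pidfile_if_owned() {
    if [ "$(cat "$PIDFILE" 2>/dev/null)" = "$1" ]; then
        rm -f "$PIDFILE"
    fi
}

stop() {
    # Nothing to do if there is no pid file.
    [ -f "$PIDFILE" ] || return 0
    pid="$(cat "$PIDFILE")"
    kill "$pid" 2>/dev/null || true
    remove_pidfile_if_owned "$pid"
}
```

With this, the interleaving from State #1 ends with the pid file still containing 200 instead of being deleted, because stop only removes a file that still says 100.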
