Hi,

The refactoring of the test scheduler mentioned in the manual has already begun, and it includes a fix for program execution.

Regards,
Martin

> On 22 Jun 2015, at 20:05, Struan Bartlett <[email protected]> wrote:
>
> Hi
>
> I'd like to query the rationale for a behaviour I've been experiencing in
> monit. I'm testing with the following config:
>
> # Test config start
> set daemon 10
>
> check program MyProgram with path "/bin/dash -c 'echo OK!; exit 1'"
>   every "06 * * * *"
>   if status != 0 then alert
> # Test config end
>
> As expected, monit runs the dash test program at 6 minutes past the hour.
> The dash script finishes immediately. However, Monit doesn't pick up, report
> or alert on the exit code in a timely manner. Until the next time Monit is
> scheduled to run the test script, the dash script remains as a zombie. But
> that is an hour later, which is a long time to wait to be alerted to the
> script failing.
>
> If the 'every' schedule was "06 0 * * *" then it would seem one should
> expect to wait 24 hours before being alerted to the script failing!
>
> I realise the Monit manual explains:
>
> "The asynchronous nature of the program check [...] comes with a
> side-effect: when the program has finished executing and is waiting for
> Monit to collect the result, it becomes a so-called "zombie" process [...]
> the zombie process is removed from the system as soon as Monit collects the
> exit status. This means that every "check program" will be associated with
> either a running process or a temporary zombie. This unwanted zombie
> side-effect will be removed in a later release of Monit."
>
> That may be so, however why doesn't Monit reap the child and collect the
> exit code at the *next poll cycle after the child exits* (i.e. within 10
> seconds of the test script finishing given the 'set daemon 10' line in the
> test config above) rather than when the program is next scheduled to be run?
> Maybe I'm missing something, but the current behaviour seems to undermine
> the entire purpose of providing alerts on program failure (when used in
> conjunction with cron-style scheduling). That is the behaviour I'd like to
> query the rationale for.
>
> Thanks in advance.
>
> Kind regards
>
> Struan
>
> --
> Struan Bartlett
> NewsNow Publishing Limited
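
For illustration only, the per-cycle reaping asked about in the quoted message could look roughly like the sketch below. It is written in Python rather than Monit's C, it is not Monit's actual implementation, and the names `start_check`, `poll_cycle` and `run_until_reaped` are hypothetical. The idea is simply that each poll cycle makes a non-blocking check on the child, so the exit status is collected within one cycle of the program finishing, independent of when the program is next scheduled to start.

```python
import subprocess
import time


def start_check(argv):
    """Start the check program asynchronously and return its handle."""
    return subprocess.Popen(argv)


def poll_cycle(proc):
    """Run once per poll cycle: collect the exit status if the child is done.

    Popen.poll() never blocks. It returns None while the child is still
    running; once the child has exited it reaps it (so no zombie is left
    behind) and returns the exit status.
    """
    return proc.poll()


def run_until_reaped(argv, interval=10.0, max_cycles=100):
    """Start argv, then poll every `interval` seconds until it is reaped.

    Returns (exit_status, cycles_waited). `interval` plays the role of the
    'set daemon 10' poll interval in the quoted config.
    """
    proc = start_check(argv)
    for cycle in range(1, max_cycles + 1):
        time.sleep(interval)
        status = poll_cycle(proc)
        if status is not None:
            return status, cycle
    # Give up: stop the child and reap it so nothing is left behind.
    proc.kill()
    proc.wait()
    raise TimeoutError("check program did not finish in time")


if __name__ == "__main__":
    # /bin/sh stands in for /bin/dash here (an assumption for portability).
    status, cycles = run_until_reaped(
        ["/bin/sh", "-c", "echo OK!; exit 1"], interval=1.0
    )
    if status != 0:
        print(f"alert: exit status {status} seen after {cycles} poll cycle(s)")
```

With this shape, a non-zero exit status is available for the "if status != 0 then alert" test at the first poll cycle after the program exits, rather than at its next scheduled run.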
--
To unsubscribe:
https://lists.nongnu.org/mailman/listinfo/monit-general
