Re: [monit] Re: monit race conditions on Mac OS X 10.5 Leopard?

Sergio Trejo Sun, 27 Jan 2008 17:36:42 -0800

Hello Martin,

Just dropping in to check on my mailing list mail. Sorry for the delay in
responding -- I am trying to finish another project and once I do, I can
check things out in more detail per your request!


Thanks,

Serg

On 1/24/08, Martin Pala <[EMAIL PROTECTED]> wrote:
>
> This seems strange. Monit alerts are generated on each action
> according to the configuration.
>
> Can you run monit in verbose mode (-v option) and send the log?
>
> Is it possible that the mailserver rejected the messages, or that you
> have set an alert filter in the monit configuration to suppress
> particular alerts?
>
> By default monit will drop the email notification on mailserver error.
> There is also support for an event queue, which allows monit to retry
> the message delivery on the next cycle. To enable it, use:
>
> --8<--
>   set eventqueue
>       basedir /var/monit  # base directory where events will be stored
>       slots 100           # optionally limit the queue size
> --8<--
>
> Anyway, the verbose mode will reveal what happens with the alert
> messages and whether the event queue is needed because of mailserver
> problems.
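>
> For illustration only, here is a minimal sketch of how the settings
> mentioned in this thread could sit together in one control file. The
> alert address and the log file path are hypothetical placeholders and
> are not taken from the reporter's setup:
>
> --8<--
>   set daemon 60                    # poll cycle in seconds (from the report)
>   set logfile /var/log/monit.log   # hypothetical path
>   set mailserver 127.0.0.1         # local Postfix (from the report)
>   set alert admin@example.com      # hypothetical recipient, no event filter
>   set eventqueue
>       basedir /var/monit           # where undelivered events are stored
>       slots 100                    # optional limit on the queue size
> --8<--
>
> With no filter on the alert line, a missing alert would more likely
> point to a delivery problem than to a suppressed event type.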
>
>
> Thanks,
> Martin
>
>
>
> On Jan 19, 2008, at 12:06 PM, Sergio Trejo wrote:
>
> > This is an update to my previous message posted herein. Version
> > 4.10.1 of monit most definitely has a bug in it, and it's not related
> > to Mac OS X 10.5, because version 4.9 of monit runs perfectly on
> > Mac OS X 10.5.
> >
> > The bug is that monit 4.10.1 does not send out multiple email
> > messages when, every cycle, it encounters multiple daemons not
> > running (whether the daemons have crashed or have been torn down
> > intentionally by a sys admin).
> >
> > Regards,
> >
> > Sergio
> >
> > On 1/19/08, Sergio Trejo <[EMAIL PROTECTED]> wrote:
> >
> > Hello,
> >
> > I have monit (version 4.10.1) running on an Apple machine which is
> > Mac OS X Server (Leopard, 10.5.1). My installation of monit monitors
> > six separate daemons for these programs: Apache, Postfix,
> > PostgreSQL, Tomcat, OpenLDAP, and MySQL. My monit configuration file
> > has entries that look like this for all of the six aforementioned
> > programs (taking Apache for example):
> >
> > check process apache with pidfile "/opt/local/apache2/logs/httpd.pid"
> >     every 10 cycles
> >     start = "/opt/local/apache2/bin/apachectl start"
> >     stop = "/opt/local/apache2/bin/apachectl stop"
> >     if failed port 80 and protocol http then restart
> >     if 5 restarts within 5 cycles then timeout
> >
> > Where my daemon frequency is set to 60 seconds as in:
> >
> > set daemon 60
> >
> > What is interesting is that I had all six of my daemons running as a
> > starting point and monit confirmed this (using the little http
> > server built into monit on port 2812). I then, very intentionally
> > (as sort of an auditing process) killed five out of my six daemons
> > (the only daemon I left running was the Postfix daemon because I
> > still wanted to have monit be capable of sending email alerts since
> > I use the internal mail server running on the same machine as
> > Postfix, as in "set mailserver 127.0.0.1"). So, with five of the six
> > daemons intentionally killed, monit did later catch up and
> > successfully restarted all five daemons. However, monit only
> > generated two mail message alerts:
> >
> > 1. A message stating that the apache daemon did not exist
> >
> > 2. A message stating that the postgres daemon did exist (seemed to
> > have sent this message after re-starting PostgreSQL)
> >
> > But why didn't I receive ten messages: five of them, one for each
> > daemon that I intentionally killed, stating that it did not exist,
> > and then later on five more messages stating that the five daemons
> > (after being restarted) did indeed exist again?
> >
> > Also, why did the first message say that apache didn't exist, while
> > the second message told me that the postgres daemon existed?
> > Shouldn't the second message have stated that the apache daemon
> > existed again?
> >
> > It doesn't make sense. Is it possible that monit was "overwhelmed"
> > or overloaded in some way and became "confused"? I know that doesn't
> > sound appropriate for a binary system but there is nothing in the
> > monit log file to give me any hints. Perhaps, did monit experience a
> > race condition?
> >
> > The log file shows that all five daemons which I had manually killed
> > were restarted successfully (and indeed they were -- I ssh'ed into
> > my server and saw them all running again as processes and monit also
> > reported their successful running again on its http server on port
> > 2812).
> >
> > If this was a race condition, could there be an issue with
> > threading? Mac OS X 10.5 (Leopard and Leopard Server) might be
> > different enough compared to previous versions of Mac OS X with
> > regard to a change to how threading works (but I am writing this
> > very vaguely without much information at the moment other than some
> > fuzzy recollection that something related to threading on Leopard
> > might have changed).
> >
> > Thanks for any suggestions,
> >
> > Serg
> >
> > --
> > To unsubscribe:
> > http://lists.nongnu.org/mailman/listinfo/monit-general
>
