On 1/19/08, Jan-Henrik Haukeland <[EMAIL PROTECTED]> wrote:
>
> Ok, thanks for the report, we'll look into it. That is, Martin will look
> into it when he comes back from the US, I hope. For my part m/monit has
> priority one right now. I'm in the flow :)
>
Thank you Jan-Henrik, for flowing with the go and going with the flow ;-)

I hope Martin is having a nice time in the U.S.

-Serg

On 19. jan. 2008, at 21.06, Sergio Trejo wrote:
>
> This is an update to my previous message posted herein. The version 4.10.1 of
> monit most definitely has a bug in it and it's not related to Mac OS X
> 10.5 because version 4.9 of monit runs just perfectly on Mac OS X 10.5.
>
> The bug is that monit 4.10.1 does not send out multiple email messages
> when, every cycle, it encounters multiple daemons not running (whether the
> daemons have crashed or have been torn down intentionally by a sys admin).
>
> Regards,
>
> Sergio
>
> On 1/19/08, Sergio Trejo <[EMAIL PROTECTED]> wrote:
> >
> > Hello,
> >
> > I have monit (version 4.10.1) running on an Apple machine which is Mac
> > OS X Server (Leopard, 10.5.1). My installation of monit monitors six
> > separate daemons for these programs: Apache, Postfix, PostgreSQL, Tomcat,
> > OpenLDAP, and MySQL. My monit configuration file has entries that look like
> > this for all of the six aforementioned programs (taking Apache for example):
> >
> > > check process apache with pidfile "/opt/local/apache2/logs/httpd.pid"
> > > every 10 cycles
> > > start = "/opt/local/apache2/bin/apachectl start"
> > > stop = "/opt/local/apache2/bin/apachectl stop"
> > > if failed port 80 and protocol http then restart
> > > if 5 restarts within 5 cycles then timeout
> >
> > Where my daemon frequency is set to 60 seconds as in:
> >
> > > set daemon 60
> >
> > What is interesting is that I had all six of my daemons running as a
> > starting point and monit confirmed this (using the little http server built
> > into monit on port 2812). I then, very intentionally (as sort of an
> > auditing process) killed five out of my six daemons (the only daemon I left
> > running was the Postfix daemon because I still wanted to have monit be
> > capable of sending email alerts since I use the internal mail server running
> > on the same machine as Postfix, as in "set mailserver 127.0.0.1"). So,
> > with five of the six daemons intentionally killed, monit did successfully
> > later catch up and successfully re-started all five daemons. However, monit
> > only generated two mail message alerts:
> >
> > 1. A message stating that the apache daemon did not exist
> >
> > 2. A message stating that the postgres daemon did exist (seemed to have
> > sent this message after re-starting PostgreSQL)
> >
> > But, why didn't I receive ten messages, five of them for each daemon
> > that I intentionally killed stating that they did not exist, and then later
> > on five more messages stating that the five daemons (after being restarted)
> > did indeed exist again?
> >
> > Also, why did I get the first message for apache saying it didn't exist
> > whereas the second message, should it also have stated that the apache
> > daemon existed again (instead of telling me that the postgres daemon
> > existed)?
> >
> > It doesn't make sense. Is it possible that monit was "overwhelmed" or
> > overloaded in some way and became "confused"? I know that doesn't sound
> > appropriate for a binary system but there is nothing in the monit log file
> > to give me any hints. Perhaps, did monit experience a race condition?
> >
> > The log file shows that all five daemons which I had manually killed
> > were restarted successfully (and indeed they were -- I ssh'ed into my server
> > and saw them all running again as processes and monit also reported their
> > successful running again on its http server on port 2812).
> >
> > If this was a race condition, could there be an issue with threading?
> > Mac OS X 10.5 (Leopard and Leopard Server) might be different enough
> > compared to previous versions of Mac OS X with regard to a change to how
> > threading works (but I am writing this very vaguely without much information
> > at the moment other than some fuzzy recollection that something related to
> > threading on Leopard might have changed).
> >
> > Thanks for any suggestions,
> >
> > Serg
--
To unsubscribe:
http://lists.nongnu.org/mailman/listinfo/monit-general
