Having multiple checks for the same resource (like the mentioned "file1" and "file1recover") isn't a problem.
Martin

On Feb 28, 2011, at 2:39 AM, John (yt) Hogenmiller wrote:

> Thanks for the clarification.
>
> So in this case, instead of telling the dependent (file2) about the
> parent (file1), I would have the parent (file1) start and stop
> monitoring on its dependents (file2).
>
> I could see that working. In my case though, I would probably have to
> define groups and start and stop monitoring on a group. I have one
> device that connects almost everything together. I actually have a
> sort of cascade going on. One access point has three subscriber units
> connected to it, and then each subscriber unit has its own access
> point attached. One of the subscriber units has a router and a server
> behind it.
>
> Here is my network layout if the formatting holds up
>
>                                   / -> fv3su -> fv3ap
> monit server -> fvinside -> fv1ap -> fv4su -> fv4ap
>                                   \-> fv2su -> fv2ap
>                                             \-> fvoffice -> fvofficeserv
>
> Again, these are all physically discrete devices with no way to
> automatically restart them. The biggest one is if FV1 or FVINSIDE
> goes down, we'll get 8-9 other devices also showing down. If FV2SU
> goes down, only 3 other devices show down.
>
> I could perhaps create some monitoring like so (going back to the file
> example):
>
>
> check file1 with path /tmp/file1
>   if failed permission 555 then exec "/usr/sbin/monit start file1recover"
>   if failed permission 555 then stop
>
> check file1recover with path /tmp/file1
>   if succeeded permission 555 for 2 cycles then exec "/usr/bin/monit start file1"
>   if succeeded permission 555 for 2 cycles then exec "/usr/bin/monit -g subfiles start"
>   if succeeded permission 555 for 2 cycles then exec "/usr/bin/monit stop file1recover"
>
> check file2 with path /tmp/file2
>   if failed permission 555 then alert
>   group "subfiles"
>   depends "file1"
>
> check file3 with path /tmp/file3
>   if failed permission 555 then alert
>   group "subfiles"
>   depends "file1"
>
> I haven't had a chance to test this yet, does monit have any issues
> with multiple checks being the same? Any other suggestions would be
> appreciated. I've been working with nagios and mrtg on this network
> already. Nagios even has a really nice network map built in.
> However, I like the straightforward configuration presented with
> monit, and I even like the list of status up/downs monit provides on
> the web interface. With nagios, it might show all services as
> up/green on the network map, but it's not until you click on a
> specific service that you see that 1 service (like ssh) is timing out.
> Also, I'm running the monitoring on a system with 128MB of memory,
> so lean and fast is good.
>
>
> -John
>
>
>
> On Sun, Feb 27, 2011 at 12:38 PM, Martin Pala <[email protected]> wrote:
>> Hello,
>>
>> The action "monitor" really doesn't exist - i have fixed the documentation.
>> The "monitor" action wouldn't make sense, as the service is monitored
>> already.
>>
>> The "stop" action stops the service and disables monitoring => monit doesn't
>> check the service anymore until the monitoring is enabled again (using
>> "monit monitor ... or "monit start ...").
>>
>> The setup which should work in your case:
>>
>> --8<--
>> check file file1 with path "/tmp/file1"
>>   if failed permission 555 then exec "/usr/bin/monit stop file2"
>>   else if succeeded then exec "/usr/bin/monit start file2"
>>
>> check file file2 path "/tmp/file2"
>>   if failed permission 555 then alert
>> --8<--
>>
>> => if the permissions fail, the "file2" service is stopped, but the
>> monitoring of "file1" service continues. If "file1" recovers, the "file2" is
>> started again.
>>
>> Regards,
>> Martin
>>
>>
>> On Feb 27, 2011, at 1:58 AM, John (yt) Hogenmiller wrote:
>>
>>> Hello list,
>>>
>>> I've been playing with monit in hopes of using it to monitor a
>>> wireless installation. At first, it looked like it was doing ok,
>>> but then I noticed the "depends on" wasn't working as I had hoped.
>>> If deviceA is unreachable, deviceB and deviceC will also be
>>> unreachable, so I set up my depends on accordingly, but I still got
>>> alerts for all three services.
>>>
>>> After looking further into the documentation, it seems "depends on"
>>> requires monitoring to be stopped on a service for the depends
>>> on service to stop monitoring. That's fine, but I'm looking for a way
>>> to restart monitoring automatically. In our scenario, if a device
>>> goes unpingable, someone would have to physically power cycle it to
>>> bring it back online (or potentially replace the device).
>>>
>>> The documentation wasn't too clear (at least to me) on a way to
>>> configure monit this way, so I set up an instance that polled every
>>> 10 seconds and monitored two files. All the steps I took are below.
>>> If anyone can look at my testing and offer advice, I'd appreciate it.
>>> Perhaps I'm reading the documentation wrong, or perhaps there's just
>>> no way to do what I'm trying (perhaps M/Monit has such capabilities).
>>>
>>> I originally tested under 5.0.3 (latest with Ubuntu/apt-get), but then
>>> upgraded to 5.2.4 hoping for different results.
>>>
>>> First, my checks:
>>>
>>>
>>> check file file1 with path "/tmp/file1"
>>>   if failed permission 555 then unmonitor
>>>   # manual implies that I can do "else if succeeded then monitor", but this fails syntax
>>>   else if succeeded then alert
>>>
>>> check file file2 path "/tmp/file2"
>>>   if failed permission 555 then alert
>>>   depends on file1
>>>
>>>
>>> changing /tmp/file1 to 500 does indeed stop monitoring on file1 and file2
>>>
>>> [EST Feb 26 13:30:47] debug : monitor service 'file1' on user request
>>> [EST Feb 26 13:30:47] info : Awakened by User defined signal 1
>>> [EST Feb 26 13:30:47] info : monit daemon at 31932 awakened
>>> [EST Feb 26 13:30:47] info : 'file1' monitor action done
>>>
>>>
>>> On a lark, I updated my config like so:
>>>
>>> check file file1 with path "/tmp/file1"
>>>   if failed permission 555 then stop
>>>   else if succeeded then start
>>>
>>> check file file2 path "/tmp/file2"
>>>   if failed permission 555 then alert
>>>   depends on file1
>>>
>>>
>>> Upon changing file1 to 500, both services went into not monitored
>>>
>>> Upon changing file1 back to 555, services did not resume. If I
>>> manually tell it to start monitoring file1, file2 does not
>>> automatically begin monitoring again.
>>>
>>>
>>>
>>> Other notes:
>>> I had a whole bug report showing that you can't restart monitoring a
>>> service from the command line, but I realised that was a bug
>>> in 5.0.3, which is the latest Ubuntu provides, but this was fixed once
>>> I downloaded 5.2.4. I only mention this for anyone else using monit
>>> from the Ubuntu repositories.
>>>
>>> --
>>> To unsubscribe:
>>> http://lists.nongnu.org/mailman/listinfo/monit-general
>
> --
> John Hogenmiller - [email protected]
> Used for mailing lists - sporadic response
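
The cascade John describes (one upstream device taking 8-9 others down with it, FV2SU taking only 3) can be checked with a small sketch. This is not monit configuration and not something from the thread: it is a plain Python model of the network layout, and the `TOPOLOGY` table and the `downstream` helper are hypothetical names made up for illustration. It also assumes that fvoffice hangs off fv2su, which is how the diagram reads and which matches the "only 3 other devices" figure.

```python
from collections import deque

# Parent -> direct children, transcribed from the layout in the thread.
# Assumption: fvoffice is attached behind fv2su (the diagram's
# indentation was ambiguous, but this matches "only 3 other devices").
TOPOLOGY = {
    "monit server": ["fvinside"],
    "fvinside": ["fv1ap"],
    "fv1ap": ["fv3su", "fv4su", "fv2su"],
    "fv3su": ["fv3ap"],
    "fv4su": ["fv4ap"],
    "fv2su": ["fv2ap", "fvoffice"],
    "fvoffice": ["fvofficeserv"],
}


def downstream(device, topology=TOPOLOGY):
    """Return every device that becomes unreachable when `device` is down."""
    affected = []
    queue = deque(topology.get(device, []))
    while queue:
        node = queue.popleft()
        affected.append(node)
        # Anything behind an unreachable device is unreachable as well.
        queue.extend(topology.get(node, []))
    return affected


if __name__ == "__main__":
    for failed in ("fv2su", "fv1ap", "fvinside"):
        hit = downstream(failed)
        print(f"{failed} down -> {len(hit)} others: {', '.join(hit)}")
```

Under that assumption the model gives 3 dependents for fv2su, 8 for fv1ap and 9 for fvinside. Those per-device lists are what each parent check would have to stop and start (or what a group per parent would need to contain) in the approach Martin suggests.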
