Re: [Nagios-users] Notifications on passive service checks
In my opinion it's useless in this scenario; the official doc says:

  Volatile services differ from "normal" services in three important ways. *Each time they are checked* when they are in a hard non-OK state, and the check returns a non-OK state (i.e. no state change has occurred)...

But in my setup passive services are never checked (they are trap collectors, and my devices send traps on state change).
Let's suppose this scenario:

1. interface Gi0/1 on a Catalyst switch goes down
2. the switch sends a linkDown trap to the manager
3. the manager decodes the event and submits the alert via nagios.cmd
4. the service associated with that switch changes its state to critical
5. the contacts are notified (the first time)
6. that's all... contacts will never be notified again until a new linkDown trap is processed

Martin Melin wrote:
> Just use the built in feature for this: is_volatile.
> See http://nagios.sourceforge.net/docs/2_0/volatileservices.html

-- 
TAI S.r.l.
Alberto Menichetti
Area Mercato - Ingegneria dei Sistemi
System Engineer
e-mail: alb.meniche...@tai.it
http://www.tai.it

___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] Notifications on passive service checks
Just use the built-in feature for this: is_volatile.
See http://nagios.sourceforge.net/docs/2_0/volatileservices.html

Regards,
Martin Melin

On Tue, Nov 16, 2010 at 10:41 PM, Alberto Menichetti wrote:
> Hi all,
>
> I noticed the same strange behavior, but I don't think it's the right
> behavior.
> Operating in this way, a linkDown trap will be notified only once (in fact
> the sender device will generate a single trap in response to a state change).
> Is it possible to modify this behavior?
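For reference, a passive-only service with the volatile flag set might look roughly like the sketch below. It is not taken from anyone's configuration in this thread: the host name, service description, the generic-service template and the check_dummy command are all assumed, hypothetical names.

```
# Sketch only: every name below is hypothetical
define service {
    use                     generic-service   ; assumed template supplying contacts, periods, etc.
    host_name               catalyst-sw1      ; hypothetical switch
    service_description     Link Gi0/1        ; hypothetical service name
    check_command           check_dummy       ; assumed placeholder, not scheduled while active checks are off
    active_checks_enabled   0                 ; Nagios does not schedule checks for this service
    passive_checks_enabled  1                 ; results arrive through the external command file
    max_check_attempts      1                 ; the first non-OK result is already a hard state
    is_volatile             1                 ; each non-OK result received in a hard non-OK state is logged and notified again
}
```

Note that is_volatile only takes effect when a check result is actually processed, so it helps only if the device or trap handler submits a result more than once.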
Re: [Nagios-users] Notifications on passive service checks
Escalations can resolve this: if you set an escalation to trigger at notification 1 and set a re-notify interval in the escalation, you will be re-notified at the set interval while the service remains in the configured alarm state.

- Max

On Tue, Nov 16, 2010 at 4:41 PM, Alberto Menichetti wrote:
> Hi all,
>
> I noticed the same strange behavior, but I don't think it's the right
> behavior.
> Operating in this way, a linkDown trap will be notified only once (in fact
> the sender device will generate a single trap in response to a state change).
> Is it possible to modify this behavior?
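A service escalation of the kind suggested above could be sketched as follows. The host, service and contact group names are hypothetical, and whether this really produces repeat notifications for a passive service that receives no new results is the poster's claim, not something verified here.

```
# Sketch only: host, service and contact group names are hypothetical
define serviceescalation {
    host_name              catalyst-sw1
    service_description    Link Gi0/1
    first_notification     1                  ; escalation applies from the very first notification
    last_notification      0                  ; 0 keeps the escalation valid for all later notifications
    notification_interval  30                 ; re-notify every 30 time units while the escalation is valid
    contact_groups         network-admins
}
```

The notification_interval in an escalation overrides the one in the service definition for as long as the escalation is valid.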
Re: [Nagios-users] Next possible notification time bug
Not using exclude works perfectly.

-----Original Message-----
From: Andreas Ericsson [mailto:a...@op5.se]
Sent: Tuesday, November 16, 2010 4:56 PM
To: Nagios Users List
Cc: Chung, Jeff
Subject: Re: [Nagios-users] Next possible notification time bug

On a side-note though; Does it work properly if you create your 'test' timeperiod like so:

    define timeperiod {
        use              24x7
        timeperiod_name  test
        alias            Test timeperiod
        tuesday          00:00-13:30,14:00-24:00
    }

If it does, we'll know for sure that it's a bug with the 'exclude' directive.

Thanks.

-- 
Andreas Ericsson andreas.erics...@op5.se
OP5 AB www.op5.se
Re: [Nagios-users] Notifications on passive service checks
Hi all,

I noticed the same strange behavior, but I don't think it's the right behavior.
Operating in this way, a linkDown trap will be notified only once (in fact the sender device will generate a single trap in response to a state change).
Is it possible to modify this behavior?

Hall, JC wrote:
> After some testing, it looks like it will only re-notify after receiving
> another passive check result. It won't simply re-notify because it's still
> in a non-ok state after the notification_interval has expired. So to combat
> this I just used the check freshness attribute to re-execute my external
> script and feed the passive check result into nagios and thus re-sending a
> non-ok notification at what would have been the interval for notifications.
>
> So technically my external scripts are running at every interval to check
> the freshness, not only when it's called for by my event_handler from another
> active service check... which I'm ok with.

-- 
TAI S.r.l.
Alberto Menichetti
Area Mercato - Ingegneria dei Sistemi
System Engineer
e-mail: alb.meniche...@tai.it
http://www.tai.it
Re: [Nagios-users] Next possible notification time bug
On 11/16/2010 10:43 PM, Andreas Ericsson wrote:
> exclude is a fairly new feature, which surprisingly few people use.
> I have no doubts there are bugs in it. Thanks for reporting this
> though. I should probably write up a test-case for it, but that'll
> have to wait til next time I'm fiddling with the Nagios sources.

On a side-note though; Does it work properly if you create your 'test' timeperiod like so:

    define timeperiod {
        use              24x7
        timeperiod_name  test
        alias            Test timeperiod
        tuesday          00:00-13:30,14:00-24:00
    }

If it does, we'll know for sure that it's a bug with the 'exclude' directive.

Thanks.

-- 
Andreas Ericsson andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace.
Re: [Nagios-users] Nagios Core 3.2.3 host check retry interval
On 11/16/2010 09:59 PM, Chris Beattie wrote:
> I noticed something curious. It looks like Nagios 3.2.3 is making
> on-demand host checks faster than the retry_interval should allow. The
> interval_length is set to 60 and the retry_interval is set to 1. Nagios
> and the plugins were compiled from source on CentOS 5.5 x64.

Very curious indeed. The only thing I can see that might trigger something like this is the following patch:

http://git.op5.org/git/?p=nagios.git;a=commitdiff;h=1149d275011d7c4d8631b44dbba30ebdb4d7e83f

That one was in 3.2.2 too though. Could you try un-commenting the lines mentioned there and see if that helps? I won't revert that patch, but it would give me a pretty good idea of where to start the bug-hunt.

Thanks.

-- 
Andreas Ericsson andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace.
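To make the settings under discussion concrete, the two values mentioned (interval_length 60, retry_interval 1) would sit in the configuration roughly as sketched below. The host name and address are taken from the log excerpts in the original report, and max_check_attempts 3 matches the "HARD;3" entries there; the template, check_command and check_interval are assumptions, not the reporter's actual values.

```
# nagios.cfg (main configuration file)
# One "time unit" is 60 seconds.
interval_length=60

# Object configuration (sketch; template, command and check_interval are assumed)
define host {
    use                 generic-host
    host_name           wwwhost
    address             10.3.1.11
    check_command       check-host-alive
    check_interval      5                     ; assumed value
    retry_interval      1                     ; 1 time unit = 60 seconds between retries
    max_check_attempts  3
}
```

With these values, scheduled retries of a host in a SOFT DOWN state would be expected about 60 seconds apart, which is why 20-second gaps in the log look wrong.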
Re: [Nagios-users] Next possible notification time bug
Please set your MUA to wrap long lines at something sensible (72 chars is the standard, I think).

On 11/16/2010 09:44 PM, Chung, Jeff wrote:
> Hi, Here is the problem I'm trying to solve. We have services that
> have a set maintenance window, for example every Tuesday from 13:30
> to 14:00. So to stop Nagios from sending notifications during this
> maintenance window I have created a time period that excludes
> "tuesday 13:30-14:00" and use it as the notification_period for the
> service. When testing this it seems like Nagios isn't correctly
> picking the next available time to send notifications out. I have
> configured a service called "TEST_SERVICE2" to return CRITICAL status
> starting at 13:57:28 (which is during the maintenance window). In
> Nagios' debug log it says "Next possible notification time: Wed Nov
> 17 00:00:00 2010", but I think the next possible time should be Nov
> 16 14:00 or soon after. Has anyone else come across this issue?

exclude is a fairly new feature, which surprisingly few people use. I have no doubts there are bugs in it. Thanks for reporting this though. I should probably write up a test-case for it, but that'll have to wait til next time I'm fiddling with the Nagios sources.

-- 
Andreas Ericsson andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace.
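The reporter's actual timeperiod definitions are not shown in the thread. A setup matching the description (a 24x7 period that excludes a Tuesday 13:30-14:00 window) might look roughly like this sketch; the name "maintenance-window" and both aliases are assumptions.

```
# Sketch only: reconstructed from the description, not the reporter's real config
define timeperiod {
    timeperiod_name  maintenance-window
    alias            Tuesday maintenance window
    tuesday          13:30-14:00
}

define timeperiod {
    timeperiod_name  test
    alias            24x7 minus the maintenance window
    sunday           00:00-24:00
    monday           00:00-24:00
    tuesday          00:00-24:00
    wednesday        00:00-24:00
    thursday         00:00-24:00
    friday           00:00-24:00
    saturday         00:00-24:00
    exclude          maintenance-window       ; the directive suspected of miscalculating the next valid time
}
```

The service would then reference "test" as its notification_period.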
Re: [Nagios-users] Macros in notes?
>> toner part number etc; routers nearest service center, circuit identifier,
>> etc. Works great, hard to maintain.

> Agreed. IMHO information like that shouldn't be kept in the Nagios config. A
> trick we've used a few times is to have a wiki installed, then have notes_url
> be http://wiki/$HOSTNAME$

Generally, I agree. I have the information in great detail on my intranet, including escalation contacts, methods, and detailed troubleshooting guides. But with me being the only IT person, if I get a problem phone call I can generally walk somebody through checking something in Nagios for me, and having them read the info to me right there is easier than sending them through more hoops. Thankfully, my environment does not change that often though.

> I'm pretty sure but haven't confirmed that all custom macro names are
> converted to uppercase. If that's done when defining custom
> macros but not when referring to macros, $_HOSTprnMake$ should instead be
> $_HOSTPRNMAKE$. Let me know if that works.

This seems to be the problem. I did not, however, try it with lowercase definitions and uppercase usage; I just did everything uppercase, problem solved. The simple things, right!

Thank you for your help
--Mark

Mark A. Lappin, CCNA, MCITP: Enterprise Administrator | Lee Michaels Fine Jewelry
Director of Information Technology
11314 Cloverland Ave | Baton Rouge, LA 70809
Ph: 225.291.9094 ext 245 | Fax: 225.368.3675 | Mobile: 225-362-2770
www.lmfj.com
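As an illustration of the uppercase fix described above, a host with a custom variable and a note that refers to it might be written like this. The host name, address, template and the value "ExampleMake" are hypothetical; only the variable name comes from the thread. The sketch also assumes, as the thread implies, that macros are expanded in the notes field.

```
# Sketch only: all names and values except the variable name are hypothetical
define host {
    use        generic-printer
    host_name  printer01
    address    192.0.2.10
    _PRNMAKE   ExampleMake                    ; custom variable, written in uppercase
    notes      Printer make: $_HOSTPRNMAKE$   ; reference it as $_HOST<NAME>$, also in uppercase
}
```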
[Nagios-users] Nagios Core 3.2.3 host check retry interval
I noticed something curious. It looks like Nagios 3.2.3 is making on-demand host checks faster than the retry_interval should allow. The interval_length is set to 60 and the retry_interval is set to 1. Nagios and the plugins were compiled from source on CentOS 5.5 x64.

I'm not sure if this is related to Yu Watanabe's problem (http://www.mail-archive.com/nagios-users@lists.sourceforge.net/msg34042.html) because I didn't start having it until after I upgraded to 3.2.3.

Here are some alerts from October when I was running Nagios 3.2.1. There were service alerts too, but the host checks do not occur less than one minute from each other:

--
[10-10-2010 06:41:29] HOST ALERT: wwwhost;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 50.10 ms
[10-10-2010 06:28:40] HOST ALERT: wwwhost;DOWN;HARD;3;PING CRITICAL - Packet loss = 100%
[10-10-2010 06:27:29] HOST ALERT: wwwhost;DOWN;SOFT;2;PING CRITICAL - Packet loss = 100%
[10-10-2010 06:26:19] HOST ALERT: wwwhost;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%
--

Here's some from earlier this month, after I'd switched from check_ping to check_icmp. Again, there were service alerts, but the host checks are still about a minute apart:

--
[11-07-2010 21:55:53] HOST ALERT: wwwhost;UP;SOFT;2;OK - 10.3.1.11: rta 4.480ms, lost 0%
[11-07-2010 21:54:43] HOST ALERT: wwwhost;DOWN;SOFT;1;CRITICAL - 10.3.1.11: rta nan, lost 100%
--
[11-09-2010 23:40:15] HOST ALERT: wwwhost;UP;SOFT;2;OK - 10.3.1.11: rta 1.018ms, lost 0%
[11-09-2010 23:39:15] HOST ALERT: wwwhost;DOWN;SOFT;1;CRITICAL - 10.3.1.11: rta 650.987ms, lost 80%
--

On November 12th, I upgraded to Nagios 3.2.3 and the 1.4.15 plugins, and got this later that evening. The host checks were only about 20 seconds apart:

--
[11-12-2010 23:46:43] SERVICE ALERT: wwwhost;Counter: IIS Web Connections;OK;SOFT;2;Web Sessions: 2
[11-12-2010 23:45:14] HOST ALERT: wwwhost;UP;SOFT;2;OK - 10.3.1.11: rta 0.985ms, lost 0%
[11-12-2010 23:44:53] HOST ALERT: wwwhost;DOWN;SOFT;1;CRITICAL - 10.3.1.11: rta 355.633ms, lost 80%
[11-12-2010 23:44:44] SERVICE ALERT: wwwhost;Counter: IIS Web Connections;WARNING;SOFT;1;No data was received from host!
--

Two days later, it looked like it was behaving properly:

--
[11-14-2010 23:44:57] HOST ALERT: wwwhost;UP;SOFT;2;OK - 10.3.1.11: rta 1.338ms, lost 0%
[11-14-2010 23:44:27] SERVICE ALERT: wwwhost;Service: Snare;CRITICAL;HARD;1;CRITICAL - Socket timeout after 10 seconds
[11-14-2010 23:44:27] SERVICE ALERT: wwwhost;Service: RServer3;CRITICAL;HARD;1;CRITICAL - Socket timeout after 10 seconds
[11-14-2010 23:43:34] HOST ALERT: wwwhost;DOWN;SOFT;1;CRITICAL - 10.3.1.11: rta 860.577ms, lost 80%
[11-14-2010 23:43:22] SERVICE ALERT: wwwhost;Service: Epilog;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
--
[11-14-2010 08:56:55] HOST ALERT: wwwhost;UP;SOFT;2;OK - 10.3.1.11: rta 2.633ms, lost 0%
[11-14-2010 08:55:45] HOST ALERT: wwwhost;DOWN;SOFT;1;CRITICAL - 10.3.1.11: rta 518.822ms, lost 80%
[11-14-2010 08:55:36] SERVICE ALERT: wwwhost;Counter: IIS Web Connections;WARNING;SOFT;1;No data was received from host!
--

Last night, however, the host got rechecked at short intervals:

--
[11-15-2010 23:56:09] HOST ALERT: wwwhost;UP;SOFT;3;WARNING - 10.3.1.11: rta 89.448ms, lost 40%
[11-15-2010 23:55:39] HOST ALERT: wwwhost;DOWN;SOFT;2;CRITICAL - 10.3.1.11: rta 984.594ms, lost 80%
[11-15-2010 23:55:21] HOST ALERT: wwwhost;DOWN;SOFT;1;CRITICAL - 10.3.1.11: rta 738.100ms, lost 80%
[11-15-2010 23:55:09] SERVICE ALERT: wwwhost;CPU;WARNING;SOFT;1;No data was received from host!
[11-15-2010 23:54:00] HOST FLAPPING ALERT: wwwhost;STARTED; Host appears to have started flapping (23.0% change > 20.0% threshold)
[11-15-2010 23:53:59] HOST ALERT: wwwhost;UP;HARD;1;WARNING - 10.3.1.11: rta 183.851ms, lost 60%
[11-15-2010 23:53:29] HOST ALERT: wwwhost;DOWN;HARD;3;CRITICAL - 10.3.1.11: rta nan, lost 100%
[11-15-2010 23:53:29] SERVICE ALERT
Re: [Nagios-users] Notifications on passive service checks
After some testing, it looks like Nagios will only re-notify after receiving another passive check result. It won't re-notify simply because the service is still in a non-OK state after the notification_interval has expired.

To work around this, I used the check_freshness attribute to re-execute my external script and feed the passive check result into Nagios, which re-sends a non-OK notification at what would have been the notification interval. So technically my external scripts run at every freshness interval, not only when called by my event_handler from another active service check... which I'm OK with.

-----Original Message-----
From: Andreas Ericsson [mailto:a...@op5.se]
Sent: Friday, November 12, 2010 5:00 AM
To: Nagios Users List
Cc: Hall, JC
Subject: Re: [Nagios-users] Notifications on passive service checks

On 11/11/2010 11:27 PM, Hall, JC wrote:
> Is it accurate that Nagios will only send 1 notification for a
> passive service check?
>
> IE, the notification_interval definition for a passively checked
> service won't instruct Nagios to re-send a notification such as with
> actively checked services?
>

To be honest, I haven't got the faintest idea. An educated guess is that it will re-send the notification if it receives another passive check-result and enough time has passed though, or that it simply re-sends the notification when enough time has passed.

If you try and find out, let me know either way and I'll amend the docs.

--
Andreas Ericsson andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace.

-- Beautiful is writing same markup. Internet Explorer 9 supports standards for HTML5, CSS3, SVG 1.1, ECMAScript5, and DOM L2 & L3. Spend less time writing and rewriting code and more time creating great experiences on the web.
Be a part of the beta today http://p.sf.net/sfu/msIE9-sfdev2dev ___ Nagios-users mailing list Nagios-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null
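For anyone following the thread, here is a minimal sketch of the resubmission step JC describes: an external script builds a PROCESS_SERVICE_CHECK_RESULT line and writes it to the Nagios external command file. The function names and the host and service names in the usage note are assumptions for illustration; they do not come from JC's setup.

```python
import time


def format_passive_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code follows the plugin convention:
    0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    """
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0, 1, 2 or 3")
    timestamp = int(time.time() if now is None else now)
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, output)


def submit_passive_result(command_file, line):
    """Write the command line to the external command file (a named pipe)."""
    with open(command_file, "w") as pipe:
        pipe.write(line)
```

A script run on each freshness interval could call format_passive_result("switch01", "linkDown trap", 2, "Gi0/1 is down") and pass the result to submit_passive_result() together with the path set by command_file in nagios.cfg. Each resubmitted non-OK result then gives Nagios another chance to notify, which matches the behaviour observed above.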
[Nagios-users] Next possible notification time bug
Hi,

Here is the problem I'm trying to solve. We have services with a fixed maintenance window, for example every Tuesday from 13:30 to 14:00. To stop Nagios from sending notifications during this window, I created a time period that excludes "tuesday 13:30-14:00" and use it as the notification_period for the service.

When testing this, it seems that Nagios isn't correctly picking the next available time to send notifications. I configured a service called "TEST_SERVICE2" to return CRITICAL status starting at 13:57:28 (which is during the maintenance window). Nagios' debug log says "Next possible notification time: Wed Nov 17 00:00:00 2010", but I think the next possible time should be Nov 16 14:00 or soon after.

Has anyone else come across this issue? Thanks in advance!

Here are my configurations.

define timeperiod{
    name                            24x7
    timeperiod_name                 24x7
    alias                           24 Hours A Day, 7 Days A Week
    sunday                          00:00-24:00
    monday                          00:00-24:00
    tuesday                         00:00-24:00
    wednesday                       00:00-24:00
    thursday                        00:00-24:00
    friday                          00:00-24:00
    saturday                        00:00-24:00
}

# 'test-downtime' timeperiod definition
define timeperiod{
    name                            test-downtime
    timeperiod_name                 test-downtime
    alias                           test downtime
    tuesday                         13:30-14:00
}

# 'test' timeperiod definition
define timeperiod{
    name                            test
    timeperiod_name                 test
    alias                           test
    use                             24x7
    exclude                         test-downtime
}

define service{
    host_name                       dsmgtbal800
    service_description             TEST_SERVICE2
    check_command                   check_test
    contacts                        chung
    contact_groups                  null
    active_checks_enabled           1
    passive_checks_enabled          1
    parallelize_check               1
    obsess_over_service             1
    check_freshness                 0
    notifications_enabled           1
    event_handler_enabled           1
    flap_detection_enabled          0
    failure_prediction_enabled      1
    process_perf_data               1
    retain_status_information       1
    retain_nonstatus_information    1
    is_volatile                     0
    check_period                    24x7
    max_check_attempts              1
    normal_check_interval           1
    retry_check_interval            1
    notification_options            u,c,r,s
    notification_interval           15
    notification_period             test
}

Here is the output from Nagios' debug log.
Tue Nov 16 13:57:28 2010 [1289933848.065697] [032.0] [pid=4359] ** Service Notification Attempt ** Host: 'dsmgtbal800', Service: 'TEST_SERVICE2', Type: 0, Options: 0, Current State: 2, Last Notification: Wed Dec 31 19:00:00 1969
Tue Nov 16 13:57:28 2010 [1289933848.065732] [032.1] [pid=4359] This service shouldn't have notifications sent out at this time.
Tue Nov 16 13:57:28 2010 [1289933848.065747] [032.1] [pid=4359] Next possible notification time: Wed Nov 17 00:00:00 2010
Tue Nov 16 13:57:28 2010 [1289933848.065775] [032.0] [pid=4359] Notification viability test failed. No notification will be sent out.
Tue Nov 16 13:58:28 2010 [1289933908.196211] [032.0] [pid=4359] ** Service Notification Attempt ** Host: 'dsmgtbal800', Service: 'TEST_SERVICE2', Type: 0, Options: 0, Current State: 2, Last Notification: Wed Dec 31 19:00:00 1969
Tue Nov 16 13:58:28 2010 [1289933908.196248] [032.1] [pid=4359] This service shouldn't have notifications sent out at this time.
Tue Nov 16 13:58:28 2010 [1289933908.196262] [032.1] [pid=4359] Next possible notification time: Wed Nov 17 00:00:00 2010
Tue Nov 16 13:58:28 2010 [1289933908.196269] [032.0] [pid=4359] Notification viability test failed. No notification will be sent out.
Tue Nov 16 13:59:28 2010 [1289933968.114925] [032.0] [pid=4359] ** Service Notification Attempt ** Host: 'dsmgtbal800', Service: 'TEST_SERVICE2', Type: 0, Options: 0, Current State: 2, Last Notification: Wed Dec 31 19:00:00 1969
Tue Nov 16 13:59:28 2010 [1289933968.114970] [032.1] [pid=4359] This service shouldn't have notifications sent out at this time.
Tue Nov 16 13:59:28 2010 [1289933968.114985] [032.1] [pid=4359] Next possible notification time: Wed Nov 17 00:00:00 2010
Tue Nov 16 13:59:28 2010 [1289933968.114991] [032.0] [pid=4359] Notification viability test failed. No notification will be sent out.
Tue Nov 16 14:00:28 2010 [1289934028.020298] [032.0] [pid=4359] ** Service Notification Attempt ** Host: 'dsmgtbal800', Service: 'TEST_SERVICE2', Type: 0, Options: 0, Current State: 2, Last Notification: Wed Dec 31 19:00:00 1969
Tue Nov 16 14:00:28 2010 [1289934028.020343] [032.1] [pid=435
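To make the expectation in this report concrete, here is a brute-force sketch that walks forward minute by minute through a 24x7 period with the Tuesday 13:30-14:00 exclusion. It is not how Nagios computes the next valid time; the names INCLUDED, EXCLUDED, is_valid and next_valid_time are made up for this illustration.

```python
from datetime import datetime, timedelta

# Minutes since midnight, keyed by datetime.weekday() (Monday = 0).
ALL_DAY = [(0, 24 * 60)]
INCLUDED = {day: ALL_DAY for day in range(7)}   # the 24x7 period
EXCLUDED = {1: [(13 * 60 + 30, 14 * 60)]}       # tuesday 13:30-14:00


def in_ranges(ranges_by_day, moment):
    """True if moment falls inside one of that weekday's ranges."""
    minute = moment.hour * 60 + moment.minute
    ranges = ranges_by_day.get(moment.weekday(), [])
    return any(start <= minute < end for start, end in ranges)


def is_valid(moment, included=INCLUDED, excluded=EXCLUDED):
    """A time is valid when it is included and not excluded."""
    return in_ranges(included, moment) and not in_ranges(excluded, moment)


def next_valid_time(moment, included=INCLUDED, excluded=EXCLUDED):
    """Return moment if valid, else the first valid minute within 8 days."""
    if is_valid(moment, included, excluded):
        return moment
    candidate = moment.replace(second=0, microsecond=0) + timedelta(minutes=1)
    limit = candidate + timedelta(days=8)
    while candidate < limit:
        if is_valid(candidate, included, excluded):
            return candidate
        candidate += timedelta(minutes=1)
    return None  # no valid time in the search window
```

Under these assumptions, next_valid_time(datetime(2010, 11, 16, 13, 57, 28)) returns 14:00:00 on Nov 16, the time the poster expected, rather than midnight on Nov 17 as shown in the debug log.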
Re: [Nagios-users] Macros in notes?
Mark A. Lappin wrote:
> What I would like to do, for my network printers, switches, routers, and
> some other devices, is add more information to the extended info page. I have
> been playing around with notes and to get decently readable output, I end up
> with a bunch of ugly looking HTML which I have been duplicating on every host
> definition. Trying to include printer make, model, print queue, location,
> primary users, toner part number etc; routers nearest service center, circuit
> identifier, etc. Works great, hard to maintain.
>
> So I was/have been trying (unsuccessfully) to use macros in my host
> definition and on the template put in the more complex HTML that would fill in
> from the macros
>
> The below configs show what I was attempting. I do not get any
> configuration warnings, I don't however get the value that I have set in the
> host, I get the literal output: $_HOSTprnMake$. So I'm thinking (1) Nagios
> doesn't support what I'm trying to do and I can't use macros in notes or
> (2) I have a syntax error that I'm not seeing. I'm hoping somebody here can
> give me some insight into which case it might be - especially for #1 before
> I really start beating my head against the wall.

It's #1. Nagios only supports macro expansion for command objects (and maybe others I don't know of). Macros will work in the arguments (if any) that you pass to the check_command, because they're expanded for the command object.

Being able to do what you are trying to do here would be nice. I would like to use macros for constructing host and service names.
> define host{
>     use                     generic-printer
>     host_name               11314-AR
>     alias                   11314-AR-4200N
>     address                 192.168.98.31
>     action_url              http://192.168.98.31
>     hostgroups              network-printers
>     _prnMake                HP
>     _prnModel               Laserjet 2300n
>     _prnMainQueue           "lmfj-print\\11314-AR"
> }
>
> define host{
>     name                    generic-printer   ; The name of this host template
>     use                     generic-host      ; Inherit default values from the generic-host template
>     check_period            24x7              ; By default, printers are monitored round the clock
>     check_interval          5                 ; Actively check the printer every 5 minutes
>     retry_interval          1                 ; Schedule host check retries at 1 minute intervals
>     max_check_attempts      10                ; Check each printer 10 times (max)
>     check_command           check-host-alive  ; Default command to check if printers are "alive"
>     notification_period     workhours         ; Printers are only used during the workday
>     notification_interval   30                ; Resend notifications every 30 minutes
>     notification_options    d,r               ; Only send notifications for specific host states
>     contact_groups          admins            ; Notifications get sent to the admins by default
>     register                0                 ; DONT REGISTER THIS - ITS JUST A TEMPLATE
>     notes                   bgcolor="#FF" style="border-collapse: collapse" bordercolor="#00">\
>         Make\
>         $_HOSTprnMake$\
> }
>
> Any advice/input is very much appreciated.
>
> --Mark
>
> Mark A. Lappin, CCNA, MCITP: Enterprise Administrator | Lee Michaels Fine Jewelry
> Director of Information Technology
> 11314 Cloverland Ave | Baton Rouge, LA 70809
> Ph: 225.291.9094 ext 245 | Fax: 225.368.3675 | Mobile: 225-362-2770
> www.lmfj.com
Re: [Nagios-users] Macros in notes?
On Tue, Nov 16, 2010 at 4:33 PM, Mark A. Lappin wrote:
> What I would like to do, for my network printers, switches, routers, and some
> other devices, is add more information to the extended info page. I have
> been playing around with notes and to get decently readable output, I end up
> with a bunch of ugly looking HTML which I have been duplicating on every host
> definition. Trying to include printer make, model, print queue, location,
> primary users, toner part number etc; routers nearest service center,
> circuit identifier, etc. Works great, hard to maintain.

Agreed. IMHO information like that shouldn't be kept in the Nagios config. A trick we've used a few times is to have a wiki installed, then have notes_url be http://wiki/$HOSTNAME$

This also means you can let more people update information on hosts, printers etc. without having to give them access to Nagios' configs and reloading after each change.

> So I was/have been trying (unsuccessfully) to use macros in my host
> definition and on the template put in the more complex HTML that would fill
> in from the macros
>
> The below configs show what I was attempting. I do not get any configuration
> warnings, I don't however get the value that I have set in the host, I get
> the literal output: $_HOSTprnMake$. So I'm thinking (1) Nagios doesn't
> support what I'm trying to do and I can't use macros in notes or (2) I
> have a syntax error that I'm not seeing. I'm hoping somebody here can give
> me some insight into which case it might be - especially for #1 before I
> really start beating my head against the wall.

I'm pretty sure, but haven't confirmed, that all custom macro names are converted to uppercase. If that's done when defining custom macros but not when referring to them, $_HOSTprnMake$ should instead be $_HOSTPRNMAKE$.

Let me know if that works.

Cheers,
Martin Melin
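A small sketch of the naming rule Martin is guessing at, assuming custom variable names are uppercased when the macro is built. The helper name custom_macro_name is invented for this illustration and is not part of Nagios.

```python
def custom_macro_name(object_type, variable_name):
    """Return the macro that refers to a custom object variable.

    object_type is "HOST", "SERVICE" or "CONTACT"; variable_name is the
    directive as written in the object definition, including the leading
    underscore (for example "_prnMake").
    """
    if object_type not in ("HOST", "SERVICE", "CONTACT"):
        raise ValueError("object_type must be HOST, SERVICE or CONTACT")
    if not variable_name.startswith("_") or len(variable_name) < 2:
        raise ValueError("custom variable names start with an underscore")
    # Drop the leading underscore and uppercase the rest of the name.
    return "$_%s%s$" % (object_type, variable_name[1:].upper())
```

For the host definition in this thread, custom_macro_name("HOST", "_prnMake") gives $_HOSTPRNMAKE$, which is the spelling suggested above in place of $_HOSTprnMake$. Whether the notes directive expands macros at all is a separate question that this sketch does not answer.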
[Nagios-users] Macros in notes?
What I would like to do, for my network printers, switches, routers, and some other devices, is add more information to the extended info page. I have been playing around with notes and to get decently readable output, I end up with a bunch of ugly looking HTML which I have been duplicating on every host definition. Trying to include printer make, model, print queue, location, primary users, toner part number etc; routers nearest service center, circuit identifier, etc. Works great, hard to maintain.

So I have been trying (unsuccessfully) to use macros in my host definition, and to put the more complex HTML on the template so that it is filled in from the macros.

The configs below show what I was attempting. I do not get any configuration warnings; however, I don't get the value that I have set in the host, I get the literal output: $_HOSTprnMake$. So I'm thinking (1) Nagios doesn't support what I'm trying to do and I can't use macros in notes or (2) I have a syntax error that I'm not seeing. I'm hoping somebody here can give me some insight into which case it might be - especially for #1 before I really start beating my head against the wall.

define host{
    use                     generic-printer
    host_name               11314-AR
    alias                   11314-AR-4200N
    address                 192.168.98.31
    action_url              http://192.168.98.31
    hostgroups              network-printers
    _prnMake                HP
    _prnModel               Laserjet 2300n
    _prnMainQueue           "lmfj-print\\11314-AR"
}

define host{
    name                    generic-printer   ; The name of this host template
    use                     generic-host      ; Inherit default values from the generic-host template
    check_period            24x7              ; By default, printers are monitored round the clock
    check_interval          5                 ; Actively check the printer every 5 minutes
    retry_interval          1                 ; Schedule host check retries at 1 minute intervals
    max_check_attempts      10                ; Check each printer 10 times (max)
    check_command           check-host-alive  ; Default command to check if printers are "alive"
    notification_period     workhours         ; Printers are only used during the workday
    notification_interval   30                ; Resend notifications every 30 minutes
    notification_options    d,r               ; Only send notifications for specific host states
    contact_groups          admins            ; Notifications get sent to the admins by default
    register                0                 ; DONT REGISTER THIS - ITS JUST A TEMPLATE
    notes                   \
        Make\
        $_HOSTprnMake$\
}

Any advice/input is very much appreciated.

--Mark

Mark A. Lappin, CCNA, MCITP: Enterprise Administrator | Lee Michaels Fine Jewelry
Director of Information Technology
11314 Cloverland Ave | Baton Rouge, LA 70809
Ph: 225.291.9094 ext 245 | Fax: 225.368.3675 | Mobile: 225-362-2770
www.lmfj.com