Re: [Nagios-users] Notification after Acknowledgment

2011-05-13 Thread Jim Avery
On 13 May 2011 08:34, Andre Kruger andre.kru...@trw.com wrote:
 Hi

 Can you guys please give me your input on how you handle the following
 situation.

 Let's take monitoring a disk as an example. For argument's sake, let's say when
 the disk reaches 80% capacity I send out a warning and at 90% I send out a
 critical. There is also a Service Escalation configured to send out
 notifications when this service reaches critical.

 So at 80 percent I get my notification and all is well. I then go ahead and
 acknowledge the event, and in doing so Nagios will not send out any further
 notifications, which according to the Nagios logic is correct.

 The problem is that if the disk reaches critical (90% capacity) in the meantime,
 I won't get another notification. Not even the Service Escalation helps
 here, because the event has already been acknowledged.

 Do you guys have any suggestions on how this problem can be solved?

 Regards
 Andre


The approach I sometimes use for prolonged issues like this is to
acknowledge the alert and then raise the warning and critical
thresholds in Nagios.  The problem with this approach is that Nagios
then reports the status as OK, which might give a false impression to
other users.  It is also important to remember to reduce the
thresholds back to their usual levels once the issue is resolved.

For issues which might be fast-moving, I would suggest that it is not
appropriate to acknowledge the issue unless you are in a position
to actively manage it until resolution.

--
Achieve unprecedented app performance and reliability
What every C/C++ and Fortran developer should know.
Learn how Intel has extended the reach of its next-generation tools
to help boost performance applications - inlcuding clusters.
http://p.sf.net/sfu/intel-dev2devmay
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


Re: [Nagios-users] Notification after Acknowledgment

2011-05-13 Thread Andre Kruger
Hi
 
Thanks for that.
 
I just read how non-sticky acknowledgments work as of 3.2.3. I think this solves 
my problem.
 
http://wiki.nagios.org/index.php/Acknowledgementlogic
 

Assuming you have a service with notifications enabled for all states with a 
max retry attempts of 1, these are the notifications you should get based on 
the following transitions: 
service in OK 
service goes into WARNING - notification sent 
non-sticky acknowledgement applied 
service goes into CRITICAL. Acknowledgement removed. Notification sent 
non-sticky acknowledgement applied 
service goes into WARNING. Acknowledgement removed. Notification sent 
non-sticky acknowledgement applied 
service goes into CRITICAL. Acknowledgement removed. Notification sent 
service goes into OK. Recovery notification sent 
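
The transitions above can be replayed with a small toy model. This is only a
sketch of the behaviour described on the wiki page, not Nagios code: the
Service class, its fields and the replay() helper are made up for
illustration, and it assumes max retry attempts of 1 so every state change
counts immediately.

```python
from dataclasses import dataclass, field

OK, WARNING, CRITICAL = "OK", "WARNING", "CRITICAL"


@dataclass
class Service:
    """Toy model of a single service (hypothetical, not a Nagios API)."""
    state: str = OK
    acknowledged: bool = False
    sticky: bool = False
    log: list = field(default_factory=list)

    def acknowledge(self, sticky: bool) -> None:
        # Only a problem (non-OK) state can be acknowledged.
        if self.state != OK:
            self.acknowledged = True
            self.sticky = sticky

    def set_state(self, new_state: str) -> None:
        if new_state == self.state:
            return
        self.state = new_state
        if new_state == OK:
            # Recovery clears any acknowledgement, sticky or not.
            self.acknowledged = False
            self.log.append("RECOVERY")
            return
        if self.acknowledged and not self.sticky:
            # Non-sticky: a change of problem state removes the acknowledgement.
            self.acknowledged = False
        if not self.acknowledged:
            # One log entry stands in for one notification being sent.
            self.log.append(new_state)


def replay(sticky: bool) -> list:
    """Replay the sequence from the list above and return the notifications."""
    svc = Service()
    for state in (WARNING, CRITICAL, WARNING):
        svc.set_state(state)
        svc.acknowledge(sticky)
    svc.set_state(CRITICAL)
    svc.set_state(OK)
    return svc.log


if __name__ == "__main__":
    print("non-sticky:", replay(sticky=False))
    print("sticky:    ", replay(sticky=True))
```

With non-sticky acknowledgements the model notifies on every change of
problem state, as in the list; with sticky ones only the first WARNING and
the final recovery get through, which is the situation from my original
question.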



Re: [Nagios-users] Notification after Acknowledgment

2011-05-13 Thread Yueh-Hung Liu
What's the purpose of acknowledging the service problems?
Just to suppress the notifications, or something else?




Re: [Nagios-users] Notification after Acknowledgment

2011-05-13 Thread Deborah Martin
Andre,

I wouldn't acknowledge it unless you plan to actually do something about it.

I use escalations which instigate callouts to engineers. When the on-call 
engineer acks an alert, it means they are investigating. It would surely be 
pointless to ack something which you aren't going to do anything about.

Also, think about why you'd ack at 80% if it's just a warning. We have 
thresholds of 85% for disk usage warnings but in all honesty it's there as 
exactly that - a warning. We don't send notifications for warnings on disk 
usage. We just monitor via the web interface. Notifications are sent on 
critical alerts because that is the time when action needs to be taken and 
users need to pay attention.

But this is all based on our requirements rather than yours so this is just my 
tuppence worth!

Regards,
Deborah





Re: [Nagios-users] Notification after Acknowledgment

2011-05-13 Thread Andre Kruger
Yes, it was to stop the notifications, but I would then like to receive 
notifications again when the service that was acknowledged goes into a critical 
state. But non-sticky acknowledgments have solved this problem for me.
 
I think I am going to change my default to non-sticky.
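
For anyone who acknowledges via the external command file rather than the
web interface, here is a rough sketch of building the command line. It
assumes the ACKNOWLEDGE_SVC_PROBLEM field order
host;service;sticky;notify;persistent;author;comment, and that a sticky value
of 2 means sticky while anything else means non-sticky. Both are from my
reading of the external command documentation, so please check them against
your Nagios version. The function name and the host/service values are made
up.

```python
import time


def ack_service_command(host, service, author, comment,
                        sticky=False, notify=True, persistent=False,
                        now=None):
    """Build one ACKNOWLEDGE_SVC_PROBLEM line (hypothetical helper).

    Assumption: sticky=2 keeps the acknowledgement until recovery, while
    any other value lets it be removed on the next state change.
    """
    timestamp = int(time.time()) if now is None else int(now)
    sticky_flag = 2 if sticky else 1
    fields = [
        "ACKNOWLEDGE_SVC_PROBLEM",
        host,
        service,
        str(sticky_flag),
        "1" if notify else "0",
        "1" if persistent else "0",
        author,
        comment,
    ]
    # External commands are a bracketed Unix timestamp followed by
    # semicolon-separated fields, one command per line.
    return "[{}] {}\n".format(timestamp, ";".join(fields))


if __name__ == "__main__":
    # Hypothetical host and service names, non-sticky by default.
    print(ack_service_command("fileserver1", "Disk /", "andre",
                              "Looking into it"), end="")
```

The resulting line would be written to the Nagios command file, whose
location depends on your installation.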


Re: [Nagios-users] Notification after Acknowledgment

2011-05-13 Thread Jim Avery
On 13 May 2011 09:01, Andre Kruger andre.kru...@trw.com wrote:

 I just read how non-sticky acknowledgments work from 3.2.3. I think this
 solves my problem.

 http://wiki.nagios.org/index.php/Acknowledgementlogic


Neat!  Thanks, I hadn't noticed that.
