Re: [Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-31 Thread Andreas Ericsson
Jarrod Moore wrote:
> On Mon, Mar 30, 2009 at 10:13 PM, Andreas Ericsson  wrote:
>> Jarrod Moore wrote:
>>> On Thu, Mar 26, 2009 at 7:57 PM, Andreas Ericsson  wrote:
>>>> Jarrod Moore wrote:
>>>>> Hello everyone,
>>>>>
>>>>> I have a couple of related questions regarding service dependencies in
>>>>> Nagios and their limitations. I have two service checks (let's call
>>>>> them A and B) and service A depends on service B to function
>>>>> correctly. I want to set Nagios up so that if service B crashes then
>>>>> both services A and B are put into the critical state in Nagios. I've
>>>>> tried using service dependencies in Nagios to represent this behaviour
>>>>> but have yet to be successful. I can only get it to suppress
>>>>> notifications of service A if both services go down.
>>>>>
>>>> This is expected behaviour. If A is truly dependent on B, then A will
>>>> turn into a non-ok state of its own volition rather than as a result
>>>> of any dependency magic. Dependencies are designed as a means of
>>>> suppressing notifications. Otherwise, you would *always* get a
>>>> notification for B first, and a minute or so later from A (actually,
>>>> without the dependency you could get from A first).
>>>>
>>>>> Is there a way to do what I'm trying to do here? I'd have thought it
>>>>> would be logical that if a service depends on another service and the
>>>>> service depended on dies then all services depending on it would fail
>>>>> their checks as well, but there's probably some scenario where it
>>>>> doesn't work so well. I've had a look through the mailing list
>>>>> archives and found someone had asked a similar question to the
>>>>> nagios-devel list about 2.5 years ago and didn't end up getting an
>>>>> answer, so I thought I might ask whether solutions to this type of
>>>>> problem had been developed since then.
>>>>>
>>>> They haven't. You're using dependencies the wrong way, really. If
>>>> A is truly dependent on B and doesn't go into a non-ok state after
>>>> B has crashed, then your check isn't doing what it's supposed to do,
>>>> or you've misunderstood the relationship somehow.
>>>>
>>>> If you were to explain what the two services actually are, it would
>>>> be easier to point you to a solution that works.
>>>>
>>>> --
>>>> Andreas Ericsson   andreas.erics...@op5.se
>>>> OP5 AB www.op5.se
>>>> Tel: +46 8-230225  Fax: +46 8-230231
>>>>
>>>> Considering the successes of the wars on alcohol, poverty, drugs and
>>>> terror, I think we should give some serious thought to declaring war
>>>> on peace.

>>> Well basically I have a map (similar to Google Maps) embedded in a
>>> website, which hits a URL to retrieve maps. So I have one check using
>>> check_http to check that the website itself is up and another check on
>>> that URL to make sure that the map service is available. Now if the
>>> map service goes down, the website is still up but the maps won't
>>> appear, which means the website's functionality is significantly
>>> affected. However, it is still up and viewable so doing a check on the
>>> website URL still passes.
>>>
>> It sounds to me like you'd want to make the map-check dependent on
>> the webserver-check. That would suppress notifications from the
>> map-check when it's the webserver that's bombing out. Do you really
>> need two notifications when the map-service goes offline?
> 
> Sorry, I didn't explain that very well. I have a website check that I
> want to have depend on the result of a map service check. The thing is
> that I would like two notifications to be sent to my email - one for
> the service check that is failing and one for each site that is
> affected by the crashed service. That way I would know what is
> affected and what needs fixing. Now I should mention at this point (if
> it wasn't already blindingly obvious) that I'm by no means a Nagios
> master. However, my idea was to have a chain of service dependencies
> and then not send notifications for service dependencies in between
> that I don't want emails about. There's probably a better way of doing
> what I want and in that case, I'm all ... eyes.
> 
>>> Now of course I could just write a script or something to check both
>>> URLs and set that as the check command. There is a problem for me with
>>> this approach, however, because I have some other instances where a
>>> web service depends on other web services.
>> Define "depend". As I understand the definition, coal-based lifeforms
>> on our fine planet depend on water and sunlight; Life cannot function
>> properly without them.
>> It sounds like you want to make sunlight depend on coal-based lifeforms,
>> because without the life, the sun is rather pointless.
>>
>> Instead of trying to coerce dependencies to work backwards, I'd sit
>> down and think what you want your Nagios installation to do for you,
>> and why you would want two services to go critical when one of them
>> does. Isn't one noti

Re: [Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-30 Thread Matthias Flacke


Jarrod Moore wrote:
> On Fri, Mar 27, 2009 at 5:43 PM, Matthias Flacke  wrote:
>> Jarrod Moore wrote:
>>> On Thu, Mar 26, 2009 at 7:57 PM, Andreas Ericsson  wrote:
>>>> Jarrod Moore wrote:
>>>>> Hello everyone,
>>>>>
>>>>> I have a couple of related questions regarding service dependencies in
>>>>> Nagios and their limitations. I have two service checks (let's call
>>>>> them A and B) and service A depends on service B to function
>>>>> correctly. I want to set Nagios up so that if service B crashes then
>>>>> both services A and B are put into the critical state in Nagios. I've
>>>>> tried using service dependencies in Nagios to represent this behaviour
>>>>> but have yet to be successful. I can only get it to suppress
>>>>> notifications of service A if both services go down.
>>>>>
>>>> This is expected behaviour. If A is truly dependent on B, then A will
>>>> turn into a non-ok state of its own volition rather than as a result
>>>> of any dependency magic. Dependencies are designed as a means of
>>>> suppressing notifications. Otherwise, you would *always* get a
>>>> notification for B first, and a minute or so later from A (actually,
>>>> without the dependency you could get from A first).
>>>>
>>>>> Is there a way to do what I'm trying to do here? I'd have thought it
>>>>> would be logical that if a service depends on another service and the
>>>>> service depended on dies then all services depending on it would fail
>>>>> their checks as well, but there's probably some scenario where it
>>>>> doesn't work so well. I've had a look through the mailing list
>>>>> archives and found someone had asked a similar question to the
>>>>> nagios-devel list about 2.5 years ago and didn't end up getting an
>>>>> answer, so I thought I might ask whether solutions to this type of
>>>>> problem had been developed since then.
>>>>>
>>>> They haven't. You're using dependencies the wrong way, really. If
>>>> A is truly dependent on B and doesn't go into a non-ok state after
>>>> B has crashed, then your check isn't doing what it's supposed to do,
>>>> or you've misunderstood the relationship somehow.
>>>>
>>>> If you were to explain what the two services actually are, it would
>>>> be easier to point you to a solution that works.
>>>>
>>>> --
>>>> Andreas Ericsson   andreas.erics...@op5.se
>>>> OP5 AB www.op5.se
>>>> Tel: +46 8-230225  Fax: +46 8-230231
>>>>
>>>> Considering the successes of the wars on alcohol, poverty, drugs and
>>>> terror, I think we should give some serious thought to declaring war
>>>> on peace.

>>> Well basically I have a map (similar to Google Maps) embedded in a
>>> website, which hits a URL to retrieve maps. So I have one check using
>>> check_http to check that the website itself is up and another check on
>>> that URL to make sure that the map service is available. Now if the
>>> map service goes down, the website is still up but the maps won't
>>> appear, which means the website's functionality is significantly
>>> affected. However, it is still up and viewable so doing a check on the
>>> website URL still passes.
>>>
>>> Now of course I could just write a script or something to check both
>>> URLs and set that as the check command. There is a problem for me with
>>> this approach, however, because I have some other instances where a
>>> web service depends on other web services. When I want to use these
>>> services in websites, I'd then have to write a check for each script,
>>> each containing every service in the chain that is needed to display
>>> the website correctly. This way of doing things just seems a bit
>>> repetitive to me, especially when I have a check for these web
>>> services already.
>> You can give check_multi a try (http://my-plugin.de/check_multi).
>>
>> It lets you combine multiple checks at the plugin level and has
>> built-in state logic to evaluate the results of these checks.
>> You can reuse the command files by implementing macros.
>>
>> If I understood your setup correctly, the whole result should return
>> CRITICAL if either the main website or the map is not accessible.
>> This is the standard behaviour of check_multi and could be
>> implemented like this:
>>
>> # foo.cmd
>> # call: check_multi -f  -s URLWEB= -s URLMAP=
>> command [ website ] = check_http ... -u $URLWEB$ ...
>> command [ map ] = check_http ... -u $URLMAP$ ...
>>
>> These two statements should already work the way you would expect
>> from plain check_http, only combined. If one of the child checks
>> fails, the whole construct returns WARNING or CRITICAL.
>>
>> If you need more sophisticated return code (RC) logic, you can
>> define it in Perl syntax like this:
>> state [ WARNING ] = website != OK || $website$=~/some evil output/
>> state [ CRITICAL] = website >= WARNING && map != OK
>>
>> Cheers,
>> -Matthias
>>
> 
> Hi Matthias,
> 
> Thanks for the link. I've been checking (no pun intended) out
> check_multi over th

Re: [Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-30 Thread Jarrod Moore
On Mon, Mar 30, 2009 at 10:13 PM, Andreas Ericsson  wrote:
> Jarrod Moore wrote:
>>
>> On Thu, Mar 26, 2009 at 7:57 PM, Andreas Ericsson  wrote:
>>>
>>> Jarrod Moore wrote:

>>>> Hello everyone,
>>>>
>>>> I have a couple of related questions regarding service dependencies in
>>>> Nagios and their limitations. I have two service checks (let's call
>>>> them A and B) and service A depends on service B to function
>>>> correctly. I want to set Nagios up so that if service B crashes then
>>>> both services A and B are put into the critical state in Nagios. I've
>>>> tried using service dependencies in Nagios to represent this behaviour
>>>> but have yet to be successful. I can only get it to suppress
>>>> notifications of service A if both services go down.

>>> This is expected behaviour. If A is truly dependent on B, then A will
>>> turn into a non-ok state of its own volition rather than as a result
>>> of any dependency magic. Dependencies are designed as a means of
>>> suppressing notifications. Otherwise, you would *always* get a
>>> notification for B first, and a minute or so later from A (actually,
>>> without the dependency you could get from A first).
>>>
>>>> Is there a way to do what I'm trying to do here? I'd have thought it
>>>> would be logical that if a service depends on another service and the
>>>> service depended on dies then all services depending on it would fail
>>>> their checks as well, but there's probably some scenario where it
>>>> doesn't work so well. I've had a look through the mailing list
>>>> archives and found someone had asked a similar question to the
>>>> nagios-devel list about 2.5 years ago and didn't end up getting an
>>>> answer, so I thought I might ask whether solutions to this type of
>>>> problem had been developed since then.

>>> They haven't. You're using dependencies the wrong way, really. If
>>> A is truly dependent on B and doesn't go into a non-ok state after
>>> B has crashed, then your check isn't doing what it's supposed to do,
>>> or you've misunderstood the relationship somehow.
>>>
>>> If you were to explain what the two services actually are, it would
>>> be easier to point you to a solution that works.
>>>
>>> --
>>> Andreas Ericsson                   andreas.erics...@op5.se
>>> OP5 AB                             www.op5.se
>>> Tel: +46 8-230225                  Fax: +46 8-230231
>>>
>>> Considering the successes of the wars on alcohol, poverty, drugs and
>>> terror, I think we should give some serious thought to declaring war
>>> on peace.
>>>
>>
>> Well basically I have a map (similar to Google Maps) embedded in a
>> website, which hits a URL to retrieve maps. So I have one check using
>> check_http to check that the website itself is up and another check on
>> that URL to make sure that the map service is available. Now if the
>> map service goes down, the website is still up but the maps won't
>> appear, which means the website's functionality is significantly
>> affected. However, it is still up and viewable so doing a check on the
>> website URL still passes.
>>
>
> It sounds to me like you'd want to make the map-check dependent on
> the webserver-check. That would suppress notifications from the
> map-check when it's the webserver that's bombing out. Do you really
> need two notifications when the map-service goes offline?

Sorry, I didn't explain that very well. I have a website check that I
want to have depend on the result of a map service check. The thing is
that I would like two notifications to be sent to my email - one for
the service check that is failing and one for each site that is
affected by the crashed service. That way I would know what is
affected and what needs fixing. Now I should mention at this point (if
it wasn't already blindingly obvious) that I'm by no means a Nagios
master. However, my idea was to have a chain of service dependencies
and then not send notifications for service dependencies in between
that I don't want emails about. There's probably a better way of doing
what I want and in that case, I'm all ... eyes.

>> Now of course I could just write a script or something to check both
>> URLs and set that as the check command. There is a problem for me with
>> this approach, however, because I have some other instances where a
>> web service depends on other web services.
>
> Define "depend". As I understand the definition, coal-based lifeforms
> on our fine planet depend on water and sunlight; Life cannot function
> properly without them.
> It sounds like you want to make sunlight depend on coal-based lifeforms,
> because without the life, the sun is rather pointless.
>
> Instead of trying to coerce dependencies to work backwards, I'd sit
> down and think what you want your Nagios installation to do for you,
> and why you would want two services to go critical when one of them
> does. Isn't one notification and one red blob in the UI enough? If
> it isn't, what do you hope to gain from having two noti

Re: [Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-30 Thread Jarrod Moore
On Fri, Mar 27, 2009 at 5:43 PM, Matthias Flacke  wrote:
>
> Jarrod Moore wrote:
>> On Thu, Mar 26, 2009 at 7:57 PM, Andreas Ericsson  wrote:
>>> Jarrod Moore wrote:
>>>> Hello everyone,
>>>>
>>>> I have a couple of related questions regarding service dependencies in
>>>> Nagios and their limitations. I have two service checks (let's call
>>>> them A and B) and service A depends on service B to function
>>>> correctly. I want to set Nagios up so that if service B crashes then
>>>> both services A and B are put into the critical state in Nagios. I've
>>>> tried using service dependencies in Nagios to represent this behaviour
>>>> but have yet to be successful. I can only get it to suppress
>>>> notifications of service A if both services go down.

>>> This is expected behaviour. If A is truly dependent on B, then A will
>>> turn into a non-ok state of its own volition rather than as a result
>>> of any dependency magic. Dependencies are designed as a means of
>>> suppressing notifications. Otherwise, you would *always* get a
>>> notification for B first, and a minute or so later from A (actually,
>>> without the dependency you could get from A first).
>>>
>>>> Is there a way to do what I'm trying to do here? I'd have thought it
>>>> would be logical that if a service depends on another service and the
>>>> service depended on dies then all services depending on it would fail
>>>> their checks as well, but there's probably some scenario where it
>>>> doesn't work so well. I've had a look through the mailing list
>>>> archives and found someone had asked a similar question to the
>>>> nagios-devel list about 2.5 years ago and didn't end up getting an
>>>> answer, so I thought I might ask whether solutions to this type of
>>>> problem had been developed since then.

>>> They haven't. You're using dependencies the wrong way, really. If
>>> A is truly dependent on B and doesn't go into a non-ok state after
>>> B has crashed, then your check isn't doing what it's supposed to do,
>>> or you've misunderstood the relationship somehow.
>>>
>>> If you were to explain what the two services actually are, it would
>>> be easier to point you to a solution that works.
>>>
>>> --
>>> Andreas Ericsson                   andreas.erics...@op5.se
>>> OP5 AB                             www.op5.se
>>> Tel: +46 8-230225                  Fax: +46 8-230231
>>>
>>> Considering the successes of the wars on alcohol, poverty, drugs and
>>> terror, I think we should give some serious thought to declaring war
>>> on peace.
>>>
>>
>> Well basically I have a map (similar to Google Maps) embedded in a
>> website, which hits a URL to retrieve maps. So I have one check using
>> check_http to check that the website itself is up and another check on
>> that URL to make sure that the map service is available. Now if the
>> map service goes down, the website is still up but the maps won't
>> appear, which means the website's functionality is significantly
>> affected. However, it is still up and viewable so doing a check on the
>> website URL still passes.
>>
>> Now of course I could just write a script or something to check both
>> URLs and set that as the check command. There is a problem for me with
>> this approach, however, because I have some other instances where a
>> web service depends on other web services. When I want to use these
>> services in websites, I'd then have to write a check for each script,
>> each containing every service in the chain that is needed to display
>> the website correctly. This way of doing things just seems a bit
>> repetitive to me, especially when I have a check for these web
>> services already.
>
> You can give check_multi a try (http://my-plugin.de/check_multi).
>
> It lets you combine multiple checks at the plugin level and has
> built-in state logic to evaluate the results of these checks.
> You can reuse the command files by implementing macros.
>
> If I understood your setup correctly, the whole result should return
> CRITICAL if either the main website or the map is not accessible.
> This is the standard behaviour of check_multi and could be
> implemented like this:
>
> # foo.cmd
> # call: check_multi -f  -s URLWEB= -s URLMAP=
> command [ website ] = check_http ... -u $URLWEB$ ...
> command [ map     ] = check_http ... -u $URLMAP$ ...
>
> These two statements should already work the way you would expect
> from plain check_http, only combined. If one of the child checks
> fails, the whole construct returns WARNING or CRITICAL.
>
> If you need more sophisticated return code (RC) logic, you can
> define it in Perl syntax like this:
> state [ WARNING ] = website != OK || $website$=~/some evil output/
> state [ CRITICAL] = website >= WARNING && map != OK
>
> Cheers,
> -Matthias
>

Hi Matthias,

Thanks for the link. I've been checking (no pun intended) out
check_multi over the last day or two and I like it. My main concern
with this, though, is that if I had 10 websites that were dependent on
the m

Re: [Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-30 Thread Andreas Ericsson
Jarrod Moore wrote:
> On Thu, Mar 26, 2009 at 7:57 PM, Andreas Ericsson  wrote:
>> Jarrod Moore wrote:
>>> Hello everyone,
>>>
>>> I have a couple of related questions regarding service dependencies in
>>> Nagios and their limitations. I have two service checks (let's call
>>> them A and B) and service A depends on service B to function
>>> correctly. I want to set Nagios up so that if service B crashes then
>>> both services A and B are put into the critical state in Nagios. I've
>>> tried using service dependencies in Nagios to represent this behaviour
>>> but have yet to be successful. I can only get it to suppress
>>> notifications of service A if both services go down.
>>>
>> This is expected behaviour. If A is truly dependent on B, then A will
>> turn into a non-ok state of its own volition rather than as a result
>> of any dependency magic. Dependencies are designed as a means of
>> suppressing notifications. Otherwise, you would *always* get a
>> notification for B first, and a minute or so later from A (actually,
>> without the dependency you could get from A first).
>>
>>> Is there a way to do what I'm trying to do here? I'd have thought it
>>> would be logical that if a service depends on another service and the
>>> service depended on dies then all services depending on it would fail
>>> their checks as well, but there's probably some scenario where it
>>> doesn't work so well. I've had a look through the mailing list
>>> archives and found someone had asked a similar question to the
>>> nagios-devel list about 2.5 years ago and didn't end up getting an
>>> answer, so I thought I might ask whether solutions to this type of
>>> problem had been developed since then.
>>>
>> They haven't. You're using dependencies the wrong way, really. If
>> A is truly dependent on B and doesn't go into a non-ok state after
>> B has crashed, then your check isn't doing what it's supposed to do,
>> or you've misunderstood the relationship somehow.
>>
>> If you were to explain what the two services actually are, it would
>> be easier to point you to a solution that works.
>>
>> --
>> Andreas Ericsson   andreas.erics...@op5.se
>> OP5 AB www.op5.se
>> Tel: +46 8-230225  Fax: +46 8-230231
>>
>> Considering the successes of the wars on alcohol, poverty, drugs and
>> terror, I think we should give some serious thought to declaring war
>> on peace.
>>
> 
> Well basically I have a map (similar to Google Maps) embedded in a
> website, which hits a URL to retrieve maps. So I have one check using
> check_http to check that the website itself is up and another check on
> that URL to make sure that the map service is available. Now if the
> map service goes down, the website is still up but the maps won't
> appear, which means the website's functionality is significantly
> affected. However, it is still up and viewable so doing a check on the
> website URL still passes.
> 

It sounds to me like you'd want to make the map-check dependent on
the webserver-check. That would suppress notifications from the
map-check when it's the webserver that's bombing out. Do you really
need two notifications when the map-service goes offline?
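
A toy model of that suppression, in Python and purely for illustration.
This is not Nagios code: the function name should_notify and the
failure_criteria parameter are hypothetical stand-ins for the states
listed in a service dependency's notification failure criteria. It only
shows that the dependency mutes the dependent service's alert and never
changes its state.

```python
# Toy model of notification suppression by a service dependency.
# Not Nagios source code; all names here are hypothetical.

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard plugin return codes


def should_notify(dependent_state, master_state,
                  failure_criteria=(CRITICAL, UNKNOWN)):
    """Return True if an alert for the dependent service would go out.

    dependent_state:  state of the dependent service (e.g. the map-check)
    master_state:     state of the service it depends on (webserver-check)
    failure_criteria: master states that mute the dependent's alerts
    """
    if dependent_state == OK:
        return False  # nothing to report
    # The dependency never alters dependent_state; it only mutes the alert.
    return master_state not in failure_criteria


if __name__ == "__main__":
    # Webserver down, map check failing: the map alert is suppressed.
    print(should_notify(CRITICAL, CRITICAL))  # False
    # Webserver fine, map service down: the map alert is sent.
    print(should_notify(CRITICAL, OK))  # True
```
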

> Now of course I could just write a script or something to check both
> URLs and set that as the check command. There is a problem for me with
> this approach, however, because I have some other instances where a
> web service depends on other web services.

Define "depend". As I understand the definition, coal-based lifeforms
on our fine planet depend on water and sunlight; Life cannot function
properly without them.
It sounds like you want to make sunlight depend on coal-based lifeforms,
because without the life, the sun is rather pointless.

Instead of trying to coerce dependencies to work backwards, I'd sit
down and think what you want your Nagios installation to do for you,
and why you would want two services to go critical when one of them
does. Isn't one notification and one red blob in the UI enough? If
it isn't, what do you hope to gain from having two notifications and
two red blobs?

> When I want to use these
> services in websites, I'd then have to write a check for each script,
> each containing every service in the chain that is needed to display
> the website correctly. This way of doing things just seems a bit
> repetitive to me, especially when I have a check for these web
> services already.

I'm sorry, but I still fail to see the point. Perhaps you'd be better
off defining each website as a servicegroup, with all of the services
that make up the entire visitor experience as members of that
servicegroup. That would make it possible for you to get some sort of
visualization of what (Nagios-)services affect which customer-services,
while at the same time keeping configuration work to a minimum.
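
The grouping idea can be sketched in Python as a plain mapping from
site to member services. This is only an illustration of the lookup a
servicegroup view gives you, not Nagios configuration; the site and
service names and the affected_sites helper are all made up.

```python
# Hypothetical grouping of Nagios services by the website they support.
# A shared service (here "map-service") is a member of several groups.
SITE_GROUPS = {
    "site-a": {"site-a-http", "map-service"},
    "site-b": {"site-b-http", "map-service"},
    "site-c": {"site-c-http"},
}


def affected_sites(failed_service, groups=SITE_GROUPS):
    """Return the sorted names of all sites whose group contains the service."""
    return sorted(
        site for site, members in groups.items() if failed_service in members
    )


if __name__ == "__main__":
    print(affected_sites("map-service"))  # ['site-a', 'site-b']
```

With this shape, one failing service still produces one alert, and the
group membership answers the question of which sites it affects.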

-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  

Re: [Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-27 Thread Matthias Flacke

Jarrod Moore wrote:
> On Thu, Mar 26, 2009 at 7:57 PM, Andreas Ericsson  wrote:
>> Jarrod Moore wrote:
>>> Hello everyone,
>>>
>>> I have a couple of related questions regarding service dependencies in
>>> Nagios and their limitations. I have two service checks (let's call
>>> them A and B) and service A depends on service B to function
>>> correctly. I want to set Nagios up so that if service B crashes then
>>> both services A and B are put into the critical state in Nagios. I've
>>> tried using service dependencies in Nagios to represent this behaviour
>>> but have yet to be successful. I can only get it to suppress
>>> notifications of service A if both services go down.
>>>
>> This is expected behaviour. If A is truly dependent on B, then A will
>> turn into a non-ok state of its own volition rather than as a result
>> of any dependency magic. Dependencies are designed as a means of
>> suppressing notifications. Otherwise, you would *always* get a
>> notification for B first, and a minute or so later from A (actually,
>> without the dependency you could get from A first).
>>
>>> Is there a way to do what I'm trying to do here? I'd have thought it
>>> would be logical that if a service depends on another service and the
>>> service depended on dies then all services depending on it would fail
>>> their checks as well, but there's probably some scenario where it
>>> doesn't work so well. I've had a look through the mailing list
>>> archives and found someone had asked a similar question to the
>>> nagios-devel list about 2.5 years ago and didn't end up getting an
>>> answer, so I thought I might ask whether solutions to this type of
>>> problem had been developed since then.
>>>
>> They haven't. You're using dependencies the wrong way, really. If
>> A is truly dependent on B and doesn't go into a non-ok state after
>> B has crashed, then your check isn't doing what it's supposed to do,
>> or you've misunderstood the relationship somehow.
>>
>> If you were to explain what the two services actually are, it would
>> be easier to point you to a solution that works.
>>
>> --
>> Andreas Ericsson   andreas.erics...@op5.se
>> OP5 AB www.op5.se
>> Tel: +46 8-230225  Fax: +46 8-230231
>>
>> Considering the successes of the wars on alcohol, poverty, drugs and
>> terror, I think we should give some serious thought to declaring war
>> on peace.
>>
> 
> Well basically I have a map (similar to Google Maps) embedded in a
> website, which hits a URL to retrieve maps. So I have one check using
> check_http to check that the website itself is up and another check on
> that URL to make sure that the map service is available. Now if the
> map service goes down, the website is still up but the maps won't
> appear, which means the website's functionality is significantly
> affected. However, it is still up and viewable so doing a check on the
> website URL still passes.
> 
> Now of course I could just write a script or something to check both
> URLs and set that as the check command. There is a problem for me with
> this approach, however, because I have some other instances where a
> web service depends on other web services. When I want to use these
> services in websites, I'd then have to write a check for each script,
> each containing every service in the chain that is needed to display
> the website correctly. This way of doing things just seems a bit
> repetitive to me, especially when I have a check for these web
> services already.

You can give check_multi a try (http://my-plugin.de/check_multi).

It lets you combine multiple checks at the plugin level and has
built-in state logic to evaluate the results of these checks.
You can reuse the command files by implementing macros.

If I understood your setup correctly, the whole result should return
CRITICAL if either the main website or the map is not accessible.
This is the standard behaviour of check_multi and could be
implemented like this:

# foo.cmd
# call: check_multi -f  -s URLWEB= -s URLMAP=
command [ website ] = check_http ... -u $URLWEB$ ...
command [ map ] = check_http ... -u $URLMAP$ ...

These two statements should already work the way you would expect
from plain check_http, only combined. If one of the child checks
fails, the whole construct returns WARNING or CRITICAL.
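
As a rough Python sketch of that "worst child wins" idea (not
check_multi's actual code): the combine function and the severity
ranking below are assumptions made for this example, in particular
where UNKNOWN sits relative to WARNING.

```python
# Sketch of combining child check results into one overall state.
# Return codes follow the usual plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed severity order for this sketch: OK < UNKNOWN < WARNING < CRITICAL.
SEVERITY = {OK: 0, UNKNOWN: 1, WARNING: 2, CRITICAL: 3}


def combine(child_states):
    """Return the most severe return code among the child checks.

    child_states maps a child check name to its return code.
    An empty mapping is treated as OK.
    """
    return max(child_states.values(), key=SEVERITY.__getitem__, default=OK)
```

For example, combine({"website": OK, "map": CRITICAL}) gives CRITICAL,
so a single service built this way turns red when either URL fails.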

If you need more sophisticated return code (RC) logic, you can
define it in Perl syntax like this:
state [ WARNING ] = website != OK || $website$=~/some evil output/
state [ CRITICAL] = website >= WARNING && map != OK
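
One way to read those two rules, sketched in Python. The evaluate
function is hypothetical, and two things are assumptions of this
sketch rather than documented check_multi behaviour: that a matching
CRITICAL rule overrides a matching WARNING rule, and that the result
is OK when neither rule matches.

```python
import re

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard plugin return codes


def evaluate(website_rc, website_output, map_rc):
    """Apply the two example rules to the child check results.

    website_rc / map_rc: return codes of the two child checks
    website_output:      text output of the website check
    """
    state = OK
    # state [ WARNING ] = website != OK || $website$=~/some evil output/
    if website_rc != OK or re.search(r"some evil output", website_output):
        state = WARNING
    # state [ CRITICAL] = website >= WARNING && map != OK
    if website_rc >= WARNING and map_rc != OK:
        state = CRITICAL
    return state
```

Under this reading, a failing map check on its own does not raise
CRITICAL, because the CRITICAL rule also requires the website check to
be at least WARNING.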

Cheers,
-Matthias

--
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to 

Re: [Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-26 Thread Jarrod Moore
On Thu, Mar 26, 2009 at 7:57 PM, Andreas Ericsson  wrote:
> Jarrod Moore wrote:
>>
>> Hello everyone,
>>
>> I have a couple of related questions regarding service dependencies in
>> Nagios and their limitations. I have two service checks (let's call
>> them A and B) and service A depends on service B to function
>> correctly. I want to set Nagios up so that if service B crashes then
>> both services A and B are put into the critical state in Nagios. I've
>> tried using service dependencies in Nagios to represent this behaviour
>> but have yet to be successful. I can only get it to suppress
>> notifications of service A if both services go down.
>>
>
> This is expected behaviour. If A is truly dependent on B, then A will
> turn into a non-ok state of its own volition rather than as a result
> of any dependency magic. Dependencies are designed as a means of
> suppressing notifications. Otherwise, you would *always* get a
> notification for B first, and a minute or so later from A (actually,
> without the dependency you could get from A first).
>
>> Is there a way to do what I'm trying to do here? I'd have thought it
>> would be logical that if a service depends on another service and the
>> service depended on dies then all services depending on it would fail
>> their checks as well, but there's probably some scenario where it
>> doesn't work so well. I've had a look through the mailing list
>> archives and found someone had asked a similar question to the
>> nagios-devel list about 2.5 years ago and didn't end up getting an
>> answer, so I thought I might ask whether solutions to this type of
>> problem had been developed since then.
>>
>
> They haven't. You're using dependencies the wrong way, really. If
> A is truly dependent on B and doesn't go into a non-ok state after
> B has crashed, then your check isn't doing what it's supposed to do,
> or you've misunderstood the relationship somehow.
>
> If you were to explain what the two services actually are, it would
> be easier to point you to a solution that works.
>
> --
> Andreas Ericsson                   andreas.erics...@op5.se
> OP5 AB                             www.op5.se
> Tel: +46 8-230225                  Fax: +46 8-230231
>
> Considering the successes of the wars on alcohol, poverty, drugs and
> terror, I think we should give some serious thought to declaring war
> on peace.
>

Well, basically I have a map (similar to Google Maps) embedded in a
website, which hits a URL to retrieve maps. So I have one check using
check_http to check that the website itself is up and another check on
that URL to make sure that the map service is available. Now if the
map service goes down, the website is still up but the maps won't
appear, which means the website's functionality is significantly
affected. However, it is still up and viewable so doing a check on the
website URL still passes.

Now of course I could just write a script or something to check both
URLs and set that as the check command. There is a problem for me with
this approach, however, because I have some other instances where a
web service depends on other web services. When I want to use these
services in websites, I'd then have to write a script for each check,
each containing every service in the chain that is needed to display
the website correctly. This way of doing things just seems a bit
repetitive to me, especially when I have a check for these web
services already.
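
For what it's worth, the repetition could probably be avoided with one
generic script that takes the whole chain as arguments, so each website
check only lists its own URLs. A minimal sketch, assuming Python 3 is
available on the Nagios host; the names check_chain, url_is_up and main
are made up for illustration, and the fetch parameter exists only so the
logic can be exercised without a network:

```python
import http.client
import urllib.request

OK, CRITICAL, UNKNOWN = 0, 2, 3  # standard Nagios plugin exit codes


def url_is_up(url, timeout=10):
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # URLError and HTTPError are OSError subclasses; ValueError
        # covers malformed URLs.
        return False


def check_chain(urls, fetch=url_is_up):
    """Return (exit_code, message) for a chain of URLs.

    The result is CRITICAL as soon as any URL in the chain fails.
    """
    if not urls:
        return UNKNOWN, "UNKNOWN - no URLs given"
    failed = [u for u in urls if not fetch(u)]
    if failed:
        return CRITICAL, "CRITICAL - unreachable: " + ", ".join(failed)
    return OK, "OK - all %d URLs reachable" % len(urls)


def main(argv):
    """Print the one-line plugin output and return the exit code."""
    code, message = check_chain(argv)
    print(message)
    return code

# As a plugin, the script would end with:
#     import sys; sys.exit(main(sys.argv[1:]))
```

The website check would then pass the site URL followed by the map URL,
and a site with a longer chain would just pass more arguments to the
same script.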

--
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


Re: [Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-26 Thread Andreas Ericsson
Jarrod Moore wrote:
> Hello everyone,
> 
> I have a couple of related questions regarding service dependencies in
> Nagios and their limitations. I have two service checks (let's call
> them A and B) and service A depends on service B to function
> correctly. I want to set Nagios up so that if service B crashes then
> both services A and B are put into the critical state in Nagios. I've
> tried using service dependencies in Nagios to represent this behaviour
> but have yet to be successful. I can only get it to suppress
> notifications of service A if both services go down.
> 

This is expected behaviour. If A is truly dependent on B, then A will
turn into a non-ok state of its own volition rather than as a result
of any dependency magic. Dependencies are designed as a means of
suppressing notifications. Otherwise, you would *always* get a
notification for B first, and a minute or so later from A (actually,
without the dependency you could get from A first).
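
To illustrate the point, here is a toy model of what a notification
dependency does. It is a sketch in Python, not Nagios's actual code, and
every name in it (should_notify, depends_on, fail_on) is made up for
illustration. The dependency never changes A's state; it only decides
whether a notification for A goes out.

```python
OK, WARNING, CRITICAL = "OK", "WARNING", "CRITICAL"


def should_notify(service, states, depends_on, fail_on=(WARNING, CRITICAL)):
    """Decide whether a problem notification for `service` is sent.

    states:     dict mapping service name -> state from its own check
    depends_on: dict mapping service name -> services it depends on
    fail_on:    master states that suppress the dependent's notifications
    """
    if states[service] == OK:
        return False  # the service's own check passed, nothing to report
    for master in depends_on.get(service, []):
        if states[master] in fail_on:
            return False  # suppressed: the master is already failing
    return True


deps = {"A": ["B"]}
both_down = {"A": CRITICAL, "B": CRITICAL}
print(should_notify("B", both_down, deps))  # B itself still notifies
print(should_notify("A", both_down, deps))  # A is suppressed by B
```

Note that if B is CRITICAL while A's own check still returns OK, A simply
stays OK in this model: nothing in the dependency forces it into a
failed state.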

> Is there a way to do what I'm trying to do here? I'd have thought it
> would be logical that if a service depends on another service and the
> service depended on dies then all services depending on it would fail
> their checks as well, but there's probably some scenario where it
> doesn't work so well. I've had a look through the mailing list
> archives and found someone had asked a similar question to the
> nagios-devel list about 2.5 years ago and didn't end up getting an
> answer, so I thought I might ask whether solutions to this type of
> problem had been developed since then.
> 

They haven't. You're using dependencies the wrong way, really. If
A is truly dependent on B and doesn't go into a non-ok state after
B has crashed, then your check isn't doing what it's supposed to do,
or you've misunderstood the relationship somehow.

If you were to explain what the two services actually are, it would
be easier to point you to a solution that works.

-- 
Andreas Ericsson                   andreas.erics...@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



[Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-25 Thread Jarrod Moore
Hello everyone,

I have a couple of related questions regarding service dependencies in
Nagios and their limitations. I have two service checks (let's call
them A and B) and service A depends on service B to function
correctly. I want to set Nagios up so that if service B crashes then
both services A and B are put into the critical state in Nagios. I've
tried using service dependencies in Nagios to represent this behaviour
but have yet to be successful. I can only get it to suppress
notifications of service A if both services go down.

Is there a way to do what I'm trying to do here? I'd have thought it
would be logical that if a service depends on another service and the
service depended on dies then all services depending on it would fail
their checks as well, but there's probably some scenario where it
doesn't work so well. I've had a look through the mailing list
archives and found someone had asked a similar question to the
nagios-devel list about 2.5 years ago and didn't end up getting an
answer, so I thought I might ask whether solutions to this type of
problem had been developed since then.

Cheers,

Jarrod
