On Mon, Mar 30, 2009 at 10:13 PM, Andreas Ericsson <a...@op5.se> wrote:
> Jarrod Moore wrote:
>>
>> On Thu, Mar 26, 2009 at 7:57 PM, Andreas Ericsson <a...@op5.se> wrote:
>>>
>>> Jarrod Moore wrote:
>>>>
>>>> Hello everyone,
>>>>
>>>> I have a couple of related questions regarding service dependencies in
>>>> Nagios and their limitations. I have two service checks (let's call
>>>> them A and B) and service A depends on service B to function
>>>> correctly. I want to set Nagios up so that if service B crashes then
>>>> both services A and B are put into the critical state in Nagios. I've
>>>> tried using service dependencies in Nagios to represent this behaviour
>>>> but have yet to be successful. I can only get it to suppress
>>>> notifications of service A if both services go down.
>>>>
>>> This is expected behaviour. If A is truly dependent on B, then A will
>>> turn into a non-ok state of its own volition rather than as a result
>>> of any dependency magic. Dependencies are designed as a means of
>>> suppressing notifications. Otherwise, you would *always* get a
>>> notification for B first, and a minute or so later from A (actually,
>>> without the dependency you could get from A first).
>>>
>>>> Is there a way to do what I'm trying to do here? I'd have thought it
>>>> would be logical that if a service depends on another service and the
>>>> service depended on dies then all services depending on it would fail
>>>> their checks as well, but there's probably some scenario where it
>>>> doesn't work so well. I've had a look through the mailing list
>>>> archives and found someone had asked a similar question to the
>>>> nagios-devel list about 2.5 years ago and didn't end up getting an
>>>> answer, so I thought I might ask whether solutions to this type of
>>>> problem had been developed since then.
>>>>
>>> They haven't. You're using dependencies the wrong way, really. If
>>> A is truly dependent on B and doesn't go into a non-ok state after
>>> B has crashed, then your check isn't doing what it's supposed to do,
>>> or you've misunderstood the relationship somehow.
>>>
>>> If you were to explain what the two services actually are, it would
>>> be easier to point you to a solution that works.
>>>
>>> --
>>> Andreas Ericsson andreas.erics...@op5.se
>>> OP5 AB www.op5.se
>>> Tel: +46 8-230225 Fax: +46 8-230231
>>>
>>> Considering the successes of the wars on alcohol, poverty, drugs and
>>> terror, I think we should give some serious thought to declaring war
>>> on peace.
>>>
>>
>> Well basically I have a map (similar to Google Maps) embedded in a
>> website, which hits a URL to retrieve maps. So I have one check using
>> check_http to check that the website itself is up and another check on
>> that URL to make sure that the map service is available. Now if the
>> map service goes down, the website is still up but the maps won't
>> appear, which means the website's functionality is significantly
>> affected. However, it is still up and viewable so doing a check on the
>> website URL still passes.
>>
>
> It sounds to me like you'd want to make the map-check dependent on
> the webserver-check. That would suppress notifications from the
> map-check when it's the webserver that's bombing out. Do you really
> need two notifications when the map-service goes offline?

Sorry, I didn't explain that very well. I have a website check that I
want to depend on the result of a map service check. The thing is that
I would like separate notifications sent to my email: one for the
service check that is failing and one for each site that is affected by
the crashed service. That way I would know what is affected and what
needs fixing.

Now I should mention at this point (if it wasn't already blindingly
obvious) that I'm by no means a Nagios master. However, my idea was to
have a chain of service dependencies and then suppress notifications
for the services in the middle of the chain that I don't want emails
about. There's probably a better way of doing what I want and in that
case, I'm all ... eyes.

>> Now of course I could just write a script or something to check both
>> URLs and set that as the check command. There is a problem for me with
>> this approach, however, because I have some other instances where a
>> web service depends on other web services.
>
> Define "depend". As I understand the definition, coal-based lifeforms
> on our fine planet depend on water and sunlight; Life cannot function
> properly without them.
> It sounds like you want to make sunlight depend on coal-based lifeforms,
> because without the life, the sun is rather pointless.
>
> Instead of trying to coerce dependencies to work backwards, I'd sit
> down and think what you want your Nagios installation to do for you,
> and why you would want two services to go critical when one of them
> does. Isn't one notification and one red blob in the UI enough? If
> it isn't, what do you hope to gain from having two notifications add
> two red blobs?

I'd say that a service "depends" on another when it requires the other
service in order to provide 100% of its functionality. What I'm trying
to say is that the two services that I'm providing are merely a subset
of the entire dependency chain. The map service depends on data being
in a PostgreSQL database.
If the data isn't there, I want two emails - one saying that the
website doesn't work and one saying that the data is missing. That data
check in turn depends on the database server being available. If it
isn't, I want two emails - one saying that the website is affected and
one saying that the database is down.

>> When I want to use these
>> services in websites, I'd then have to write a check for each script,
>> each containing every service in the chain that is needed to display
>> the website correctly. This way of doing things just seems a bit
>> repetitive to me, especially when I have a check for these web
>> services already.
>
> I'm sorry, but I still fail to see the point. Perhaps you'd be better
> off defining each website as a servicegroup with all of the services
> that make up the entire visitor-experience parts of a particular
> servicegroup. That would make it possible for you to get some sort of
> visualization of what (Nagios-)services affect which customer-services,
> while at the same time keeping configuration work to a minimum.
>

Service groups would be enough if I were primarily using the Nagios web
UI but, unfortunately, I'm after email notifications and (as far as I am
aware) you can't define contacts for service groups. I could settle for
a notification saying "Service <name> from the <name> group is down" or
something similar.

_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
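
For reference, the "write a script that checks every URL in the chain"
approach mentioned earlier in the thread could be sketched roughly as
below. This is only an illustration under stated assumptions, not
something taken from the thread or from Nagios itself: the script name
check_chain.py, the name=url argument convention and the helper names
are all hypothetical. The only Nagios-specific facts it relies on are
the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
It reports the first broken link and lists everything downstream of it
in a single notification, so it does not produce the two separate
emails asked for above.

```python
"""Hypothetical Nagios plugin sketch: check every URL in a dependency chain."""
import urllib.error
import urllib.request

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def probe(url, timeout=10):
    """Return True if the URL answers with an HTTP 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Covers HTTP 4xx/5xx, connection failures, timeouts and bad URLs.
        return False


def evaluate_chain(results):
    """Turn probe results into a Nagios exit code and status line.

    results is an ordered list of (name, is_up) pairs, deepest
    dependency first (e.g. database, map service, website).
    """
    for index, (name, is_up) in enumerate(results):
        if not is_up:
            # Everything later in the chain relies on the failed link.
            affected = [n for n, _ in results[index + 1:]]
            message = f"CRITICAL - {name} is down"
            if affected:
                message += "; affected: " + ", ".join(affected)
            return CRITICAL, message
    return OK, f"OK - all {len(results)} links in the chain respond"


def main(argv):
    """Parse name=url arguments (assumed convention) and run the checks."""
    pairs = [arg.split("=", 1) for arg in argv[1:]]
    if not pairs or any(len(pair) != 2 for pair in pairs):
        print("UNKNOWN - usage: check_chain.py name=url [name=url ...]")
        return UNKNOWN
    results = [(name, probe(url)) for name, url in pairs]
    code, message = evaluate_chain(results)
    print(message)
    return code

# To use this as a plugin, finish the file with:
#     import sys; sys.exit(main(sys.argv))
```

Because the chain is passed in as arguments, one script could serve
several websites without a separate script per site; each site would
only need its own check command line listing the links it relies on.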