On 2015-05-10 00:55, Lee Wilson wrote:
Evening All, I was wondering if any more work is being done on
dynamic service creation and possibly more advanced alerting?

Yes. The dev-2.0 branch contains the start of it, in splitting out
creation and registration of objects to separate functions.

The main headache is that modules need to be adapted to handle the
object configuration changing after they're loaded. Currently,
livestatus and merlin will both crash when objects are either added
or removed.

The areas of auto-configuration and gathering of status from compound
services have always been a weakness of Nagios and certainly prevented
me getting it adopted by less willing employers. I'm very keen to see
the core of Naemon kept as minimal as possible, but to provide some
of these features, perhaps a cooperating addon would help.

I've been thinking about having something like an inventory service to
solve the problem of interfaces or other dynamic services. This starts
off as a parent service that Naemon is aware of. When it's run, it runs
as a normal service check that collects all the data and reports back
success/failure. In addition, it talks to an additional daemon that
processes the inventory (maybe in the form of XML/JSON data) and
creates new services, which would then be checked by Naemon as
normal. I doubt all the info could be contained in the perfdata
output, and it probably shouldn't be either; that can be left for more
summary/basic info (such as time to run check, number of interfaces
found, etc). These dynamic services wouldn't be deleted
automatically by default, as that is up to the administrator.

The biggest issue I see with this is having to get the plugins
rewritten to handle it, and also needing to have a server-side element
(even if just some kind of parser script) to the plugin to process the
data. Being a network engineer, I do tend to focus on the likes of
interfaces (especially on switches, as they are rather long-winded to
add), but it could equally apply to enumerating Windows services, load
balancer pools, etc. The service checks could also be configured with
filtering capabilities (such as exclude 'this', only include starting
with 'x', etc).
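
To make the inventory idea above concrete, here is a minimal Python sketch of the server-side parser step: it takes a JSON inventory, applies include/exclude patterns, and renders Naemon service definitions. The JSON layout (`host`, `interfaces`), the `check_interface` command, and the `generic-service` template are assumptions made for illustration; nothing in the thread specifies them.

```python
import fnmatch
import json


def filter_items(names, include=("*",), exclude=()):
    """Keep names that match any include pattern and no exclude pattern."""
    kept = []
    for name in names:
        included = any(fnmatch.fnmatchcase(name, pat) for pat in include)
        excluded = any(fnmatch.fnmatchcase(name, pat) for pat in exclude)
        if included and not excluded:
            kept.append(name)
    return kept


def render_service(host, description, check_command, template="generic-service"):
    """Render one service object in the standard object-config syntax."""
    return (
        "define service {\n"
        f"    use                  {template}\n"
        f"    host_name            {host}\n"
        f"    service_description  {description}\n"
        f"    check_command        {check_command}\n"
        "}\n"
    )


def services_from_inventory(inventory_json, include=("*",), exclude=()):
    """Turn a (hypothetical) JSON inventory into service definitions."""
    inventory = json.loads(inventory_json)
    host = inventory["host"]
    names = filter_items(inventory["interfaces"], include, exclude)
    # check_interface is a hypothetical command name, not a shipped plugin.
    return [
        render_service(host, f"Interface {name}", f"check_interface!{name}")
        for name in names
    ]
```

Because nothing is ever removed here, an interface that disappears from the inventory simply stops being regenerated; deleting its service stays a manual decision, as suggested above.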



For dealing with compound services (such as 3 out of
5 HTTP services have failed in the last hour), this would probably
need something that can process the recent service check output and
notice patterns.  In traditional Nagios this could use NDO and have
the addon read the DB data every x minutes but I guess MKLiveStatus
could do the same thing.
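
As a rough sketch of the polling approach described above, the following Python reads current service states over the Livestatus UNIX socket and counts how many are not OK. The socket path and the description pattern are whatever your setup uses; note that this only sees the current state, so the "in the last hour" part would need the add-on to keep its own history, which is not shown.

```python
import json
import socket


def build_query(description_pattern):
    """Build a Livestatus query for services whose description matches a regex."""
    return (
        "GET services\n"
        "Columns: host_name description state\n"
        f"Filter: description ~ {description_pattern}\n"
        "OutputFormat: json\n"
        "\n"
    )


def query_livestatus(socket_path, query):
    """Send one query to the Livestatus socket and return the decoded rows."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(query.encode("utf-8"))
        # Closing the write side tells Livestatus the request is complete.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return json.loads(b"".join(chunks).decode("utf-8"))


def count_failed(rows):
    """Count rows whose state is not 0 (OK); rows are [host, description, state]."""
    return sum(1 for _host, _description, state in rows if state != 0)
```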

Better solution: a module adds a pseudo-service for the cluster and
updates the cluster health based on the number of elements in a good
or bad state in the cluster. It could also mark the cluster services
with custom variables and a clever UI could then use that information
to display it in a way that makes sense. A tree structure would make
a lot of sense, but UIs aren't really my strong side.
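
The health calculation for such a pseudo-service could look roughly like the sketch below. A real module would be written in C against the module API; Python is used here only to show the logic, and the `warn_at`/`crit_at` thresholds are hypothetical parameters, not existing Naemon options.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def cluster_state(member_states, warn_at=1, crit_at=3):
    """Derive a cluster state from the states of its member services.

    warn_at and crit_at are the number of non-OK members at which the
    cluster turns WARNING or CRITICAL (illustrative thresholds).
    """
    bad = sum(1 for state in member_states if state != OK)
    if bad >= crit_at:
        state = CRITICAL
    elif bad >= warn_at:
        state = WARNING
    else:
        state = OK
    summary = f"{bad} of {len(member_states)} members not OK"
    return state, summary
```

With the defaults, five HTTP services of which three have failed give a CRITICAL cluster, which matches the "3 out of 5" example from the original question.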

For each compound alert that is created, a service check could
potentially be created, allowing it to be run on a schedule, but I'd be
concerned this would put unnecessary load on the main Naemon process.
A separate addon could even be run on an entirely different server if
MkLiveStatus is available over the network.

My last 2 employers have wanted features like this (especially the
last one), so unfortunately I've never been able to get them to adopt
Nagios/Naemon, but I keep trying. Am I correct in saying that it's not
possible to alert based on the status of a host/service group and
that they are mainly just for display purposes?

This is just my early brain dump on the idea without needing to change
any of the core Naemon functionality. Would be interested in any
feedback.

Lee


It's entirely possible to write a module that tracks the state of the
elements in a host- or servicegroup and alerts based on that, and it
wouldn't be very difficult. Currently, it should update the status of a
host or service that isn't being actively checked, but automagically
adding one seems a lot cleaner to me.
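
For the "update a service that isn't being actively checked" variant, a module would set the state in-process. An external add-on could get a similar effect by submitting a passive result through the external command file, sketched below in Python. The host and service names in the usage note are made up, and the command file path depends on your installation.

```python
import time


def passive_result_line(host, service, return_code, output, now=None):
    """Format a PROCESS_SERVICE_CHECK_RESULT external command."""
    timestamp = int(time.time() if now is None else now)
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}\n"
    )


def submit(command_file, line):
    """Write one command line to the external command file (a named pipe)."""
    with open(command_file, "w") as handle:
        handle.write(line)
```

For example, `passive_result_line("web-cluster", "HTTP cluster", 2, "3 of 5 members not OK")` produces a line that marks the hypothetical "HTTP cluster" service on host "web-cluster" as CRITICAL, provided that service exists and accepts passive checks.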

/Andreas
