Hello, we're running mon to monitor a network of (currently) 1,100 nodes. To gather long-term statistics on failures, we'd like to store the results from fping in a database. Has anyone done something like this before? I had a look at dtquery, but I must admit that adapting it is far beyond my current Perl know-how. bugzilla-alert would be fine if there were an upalert to close the bugs, but that is not possible because mon doesn't know the bug ID.
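To make the question more concrete, here is a minimal sketch of the kind of storage meant, written in Python rather than Perl. It assumes fping's default output format ("host is alive" / "host is unreachable") and uses SQLite only as a stand-in database; the table name `ping_result`, its columns, and both function names are hypothetical, not part of mon or fping.

```python
import sqlite3
import time


def parse_fping(output):
    """Map each host to True (alive) or False (anything else)."""
    status = {}
    for line in output.splitlines():
        parts = line.split()
        # Expect lines such as "host1 is alive" or "host2 is unreachable".
        if len(parts) >= 3 and parts[1] == "is":
            status[parts[0]] = (parts[2] == "alive")
    return status


def store_results(conn, hostgroup, status, checked_at=None):
    """Insert one row per host for a single polling run."""
    if checked_at is None:
        checked_at = int(time.time())
    # Hypothetical schema: one row per host per check.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ping_result ("
        "checked_at INTEGER, hostgroup TEXT, host TEXT, alive INTEGER)"
    )
    conn.executemany(
        "INSERT INTO ping_result VALUES (?, ?, ?, ?)",
        [(checked_at, hostgroup, host, int(up)) for host, up in status.items()],
    )
    conn.commit()
```

With one row per host per check, long-term statistics such as downtime per host or per hostgroup would come from ordinary SQL queries over that table. How the fping output gets from mon to such a script is left open here.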
Furthermore, I'm looking for a way to classify the failures reported by fping. We've grouped the hosts 50 per hostgroup (will be about 300 when everything is installed). Generally, 3 to 5 hosts are not reachable (turned off or whatever), which is our normal condition, so it only needs to be recorded. In case of a major network failure, the share of unreachable hosts goes up to 70 percent, or even the whole hostgroup is down. Is there a way to generate alerts only in these cases? E.g., if more than 30 percent are down, generate alert 1; otherwise just log the downtime (or whatever is appropriate). I think (though I haven't tried it) this is different functionality from the severity patch just posted to the list...?
Uwe
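The threshold rule described above could be sketched roughly as follows. This is only an illustration in Python, not an existing mon feature: the function name `classify` and its return values are hypothetical, and the 30 percent default is simply the figure from the example.

```python
def classify(total_hosts, down_hosts, threshold_percent=30):
    """Return "alert" if more than threshold_percent of hosts are down, else "log"."""
    if total_hosts <= 0:
        raise ValueError("total_hosts must be positive")
    if not 0 <= down_hosts <= total_hosts:
        raise ValueError("down_hosts must be between 0 and total_hosts")
    # Integer comparison avoids floating-point rounding at the boundary.
    if down_hosts * 100 > threshold_percent * total_hosts:
        return "alert"
    return "log"
```

For a hostgroup of 50, the usual 3 to 5 unreachable hosts would fall under "log", while 35 unreachable hosts (70 percent) would fall under "alert". A custom monitor script might use such a check to decide its exit status, though whether that fits mon's alerting model is exactly the open question.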
