Hi, folks. We’re going to be running mon on over 1,000 servers (each
one monitors things at a remote site). Each of these servers reports in
(via the “redistribute” command) to a corporate/main monitoring server so
we can be aware of a failure out at the remote site. The corporate server
expects alerts from each server and monitor check (via the “traptimeout”
command). All of this is currently working correctly.

The problem is that we’re going to need to turn the check frequency for
several of the remote site monitors in each location way up, to something
like every 10 seconds (i.e., “interval 10s”). That means we’re going to
see a huge increase in the number of traps arriving at the corporate site.
Is there some way to only redistribute alerts from the remote servers
every 60 seconds, or perhaps another approach to the problem,
like not using “redistribute”?

Remote site configuration example:

watch BRANCH_SERVER
    service DRBD
        interval 1m
        monitor DRBDCheck.monitor -s me
        description Is my DRBD working?
        redistribute trap.alert mainmonitor
        period wd {Mon-Sun}
            alert trap.alert mainmonitor
            upalert trap.alert mainmonitor
    service My_HB
        interval 1m
        monitor HACheck.monitor -s me
        description Is my heartbeat active?
        redistribute trap.alert mainmonitor
        period wd {Mon-Sun}
            alert trap.alert mainmonitor
            upalert trap.alert mainmonitor

Corporate site configuration example:

    service DRBD
        description Is my DRBD working?
        traptimeout 2m
    service My_HB
        description Is my heartbeat active?
        traptimeout 2m

Thanks,
Tim
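
P.S. For a rough sense of scale, here is a back-of-the-envelope sketch in Python. It assumes that every check on a fast service results in one redistributed trap, and the figure of 3 fast services per site is purely illustrative (the real number is just "several"). The function and constant names are hypothetical, not part of mon.

```python
# Rough trap-rate estimate; all inputs are illustrative assumptions.

def traps_per_second(sites: int, services_per_site: int, interval_s: float) -> float:
    """Traps per second reaching the corporate server, assuming each
    check on each service is redistributed as exactly one trap."""
    return sites * services_per_site / interval_s

SITES = 1000        # "over 1,000 servers" from the post
FAST_SERVICES = 3   # assumption: stands in for "several" monitors per site

current = traps_per_second(SITES, FAST_SERVICES, 60)  # interval 1m
planned = traps_per_second(SITES, FAST_SERVICES, 10)  # interval 10s

print(f"interval 1m : {current:.0f} traps/s")
print(f"interval 10s: {planned:.0f} traps/s")
print(f"increase    : {planned / current:.0f}x")
```

Under these assumptions the corporate server would go from about 50 to about 300 traps per second, which is why throttling the redistribution back to once per 60 seconds is attractive.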
_______________________________________________
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon