On 4 Nov 2009, at 15:42, Henry wrote:
Opsview also sets an additional amount to the freshness threshold.
This is 30 minutes by default. The reason is that when we set
freshness values on the Opsview master, we don't want to get lots of
alerts if a slave is down until this additional amount has passed.
Will this be a problem?
Depends:
To illustrate it with my simplest use cases:
- A backup job that runs once a day has a threshold of 26 hours (to
avoid race conditions). An additional 30 minutes won't matter there.
- A batch job that has to run every 30 minutes (60 minute SLA). We
send a passive result to Opsview just before the job exits
(successfully or not). ATM the threshold is 45 minutes, so we have
enough time to troubleshoot/restart the job if it doesn't send a
result. Here an additional 30 minutes does hurt.
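
To put numbers on the two cases above, here is a rough sketch. It assumes the effective deadline is simply the configured threshold plus the extra 30 minutes; the function and variable names are made up for illustration and are not part of Opsview or Nagios.

```python
from datetime import timedelta

# Assumption: effective deadline = configured freshness threshold
# plus the additional amount Opsview adds (30 minutes by default).
OPSVIEW_EXTRA_LATENCY = timedelta(minutes=30)


def effective_threshold(threshold, extra=OPSVIEW_EXTRA_LATENCY):
    """Time after the last result at which the check counts as stale."""
    return threshold + extra


def relative_increase(threshold, extra=OPSVIEW_EXTRA_LATENCY):
    """Extra latency as a fraction of the configured threshold."""
    return extra / threshold


backup_threshold = timedelta(hours=26)   # daily backup job
batch_threshold = timedelta(minutes=45)  # batch job, runs every 30 minutes
batch_sla = timedelta(minutes=60)

for name, threshold in [("backup", backup_threshold), ("batch", batch_threshold)]:
    print(name, effective_threshold(threshold),
          f"{relative_increase(threshold):.0%} longer")

# The batch job would only be flagged after its SLA has already passed.
print("batch alert arrives after SLA:",
      effective_threshold(batch_threshold) > batch_sla)
```

For the backup job the extra time is about 2% of the threshold. For the batch job the alert would move from 45 to 75 minutes, which is past the 60 minute SLA.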
Hmm. We'll have to think about this.
How critical is the 30 minute latency?
If it is vital, we'd have to work out a way of telling Nagios that
this is a freshness check for a distributed environment (add the 30
minutes) versus a freshness check for a passive result (use the old
default of 15 seconds). This implies a code change to Nagios somewhere.
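
As a rough illustration of that distinction (this is not Nagios code, and the names CheckKind and is_stale are hypothetical), the logic would amount to picking the extra latency per kind of check:

```python
from enum import Enum


class CheckKind(Enum):
    # Hypothetical labels for the two cases discussed above.
    DISTRIBUTED = "distributed"  # result relayed from a slave
    PASSIVE = "passive"          # result submitted directly by a job


# Extra seconds added on top of the freshness threshold, per kind.
# 1800 s is the 30 minute Opsview default; 15 s is the old default.
EXTRA_LATENCY_SECONDS = {
    CheckKind.DISTRIBUTED: 30 * 60,
    CheckKind.PASSIVE: 15,
}


def is_stale(kind, last_result_at, threshold_seconds, now):
    """Return True if no result arrived within threshold plus extra latency.

    All times are plain epoch seconds to keep the sketch self-contained.
    """
    deadline = last_result_at + threshold_seconds + EXTRA_LATENCY_SECONDS[kind]
    return now > deadline


# Batch job with a 45 minute threshold, last result at t=0, now at t=2800 s.
print(is_stale(CheckKind.PASSIVE, 0, 45 * 60, 2800))
print(is_stale(CheckKind.DISTRIBUTED, 0, 45 * 60, 2800))
```

How Nagios would know which kind a given service is remains the open question; the sketch only shows the decision once that is known.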
Ton
_______________________________________________
Opsview-users mailing list
Opsview-users@lists.opsview.org
http://lists.opsview.org/lists/listinfo/opsview-users