I'm running into some "usability" and "understanding" problems related
to how exclude_period works.

Whatever state a service has when it enters an exclude period, it keeps
that state until the end of the period, when mon starts running the
monitor again. I suppose the logic is simple: no monitors are run, so
there is no input for changing or updating the state.

That is all well and good once you understand how it works, but I'm
getting quite a bit of feedback related to this.

Has anyone ever looked deeply at the exclude_period mechanism? Modified
it? Thought about other more intuitive/"practical" ways it could work?

I was thinking of something along these lines:

* Monitor should still be run.
* Alerts should never be fired/triggered.
* State should only change from BAD -> OK. Never from OK -> BAD.
* Should be able to perform "test service" action.
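
As a rough sketch of the first three rules above (Python, not actual mon
code): `State`, `next_state` and its arguments are made-up names for
illustration only. The behaviour outside the exclude period is also an
assumption, simplified to "alert on any state change"; mon's real alert
handling has more to it. The "test service" action is not modelled here.

```python
from enum import Enum


class State(Enum):
    OK = "OK"
    BAD = "BAD"


def next_state(current, monitor_ok, in_exclude_period):
    """Return (new_state, fire_alert) after one monitor run.

    Hypothetical helper illustrating the proposed exclude_period rules.
    """
    observed = State.OK if monitor_ok else State.BAD

    if in_exclude_period:
        # The monitor still runs, but only a recovery (BAD -> OK) is
        # recorded. A failure leaves the state as it was, and no alert
        # is ever fired inside the exclude period.
        new_state = State.OK if observed is State.OK else current
        return new_state, False

    # Assumed, simplified normal behaviour: the state follows the
    # monitor result and an alert fires whenever the state changes.
    return observed, observed is not current
```

With this, a service that enters the exclude period as BAD and then
recovers would show OK right away, while a service that fails during
the period would keep showing OK until the period ends.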

Anyway, just some quick ramblings. Any ideas/thoughts?

Anders Synstad
Basefarm AS

mon mailing list
