Hi Bertrand,
sounds great, I like your suggestions.
Do we need logic to make sure only the worst sticky result is kept? It
might not be convenient to keep all sticky results, that would make
the above log very long.
I would keep one last result for WARN/CRITICAL/HEALTH_CHECK_ERROR per HC
instance and log them if they are within hc.warningsStickForMinutes...
I'm happy to implement this later this evening if you have not started
yet...
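
The per-instance bookkeeping described above could be sketched roughly as
follows. This is an illustration only: StickyResultStore, Status and
StickyEntry are hypothetical names, not part of the Sling Health Check API,
and the stick duration is passed in as a plain number of minutes on the
assumption that it was read from hc.warningsStickForMinutes.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: one store per HC instance, not part of the Sling HC API.
final class StickyResultStore {

    // Hypothetical status enum; declaration order doubles as severity order.
    public enum Status { OK, WARN, CRITICAL, HEALTH_CHECK_ERROR }

    // The last result seen for one non-OK status, with its log lines.
    public static final class StickyEntry {
        public final Status status;
        public final Instant finishedAt;
        public final List<String> logLines;

        StickyEntry(Status status, Instant finishedAt, List<String> logLines) {
            this.status = status;
            this.finishedAt = finishedAt;
            this.logLines = new ArrayList<>(logLines);
        }
    }

    // At most one entry per status, so the log cannot grow without bound.
    private final Map<Status, StickyEntry> lastNonOk = new EnumMap<>(Status.class);

    // Remember the result if it is WARN or higher; a newer result with the
    // same status replaces the older one. OK results are not stored.
    public void record(Status status, Instant finishedAt, List<String> logLines) {
        if (status.compareTo(Status.WARN) >= 0) {
            lastNonOk.put(status, new StickyEntry(status, finishedAt, logLines));
        }
    }

    // Return the stored entries that are not older than stickForMinutes,
    // ordered from WARN up to HEALTH_CHECK_ERROR.
    public List<StickyEntry> stickyResults(Instant now, long stickForMinutes) {
        Instant cutoff = now.minus(Duration.ofMinutes(stickForMinutes));
        List<StickyEntry> result = new ArrayList<>();
        for (StickyEntry entry : lastNonOk.values()) {
            if (!entry.finishedAt.isBefore(cutoff)) {
                result.add(entry);
            }
        }
        return result;
    }
}
```

With this shape, each HC instance holds at most three sticky entries, and
the entries returned by stickyResults(...) could be appended to the current
result log after a "Sticky result from ..." marker line.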
Regards
Georg
On 2017-06-07 11:26, Bertrand Delacretaz wrote:
Hi Georg,
On Wed, Jun 7, 2017 at 1:01 AM, Georg Henzler <[email protected]> wrote:
...We could introduce a HC property "hc.keepWarnStickyForMin" (and
"hc.keepCriticalStickyForMin") - this can be entirely implemented in
the impl package and would not require a new API....
Ok, so you'd have to implement an HC (or configure one) for each
sticky alarm that you want to declare. If Clint still needs a generic
HC as in his current patch, that can also be implemented without API
changes.
That sounds ok to me, and with the out-of-the-box HCs that we have
(including taking JMX data as input) that should cover all cases.
I'd use a single property however, hc.warningsStickForMinutes maybe,
and define that it applies to warn and higher levels - I don't think
we need to be more granular than that.
...
INFO Checking Event Queue...
INFO Event Queue is currently fine.
WARN --- Sticky result from 2017-06-07 11:49 ---
INFO Checking Event Queue...
WARN Event Queue overloaded!...
So the lines following WARN are historic sticky results? That would
work for me, I'd just say "sticky result...follows" to make that
clearer.
Do we need logic to make sure only the worst sticky result is kept? It
might not be convenient to keep all sticky results, that would make
the above log very long.
-Bertrand