[
https://issues.apache.org/jira/browse/SLING-6855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16045440#comment-16045440
]
Georg Henzler commented on SLING-6855:
--------------------------------------
[~bdelacretaz] Thanks for fixing this. I was careless and only tested with
synchronous tests. I think the fix is good, except that I would probably use
{{@Reference HealthCheckResultCache cache}} in {{AsyncHealthCheckExecutor}}
instead of passing it in as a parameter.
> The cache keeps one result of each type, by design
Yes, this is intended. If someone wants to keep the full history, that is
better done in a monitoring tool that can store historic results. Here it is
"only" about changing the result status if something went wrong in the past,
so one result per type is sufficient.
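
To illustrate the "one result per type" behaviour, here is a minimal sketch in plain Java. It is not the actual {{HealthCheckResultCache}} code: the class name {{StickyResultSketch}}, its methods, and the severity order of the enum are assumptions made for this example only.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch; not the Sling HealthCheckResultCache implementation. */
class StickyResultSketch {

    /** Status levels named in the issue; the severity order here is an assumption. */
    public enum Status { OK, WARN, CRITICAL, HEALTH_CHECK_ERROR }

    /** Time of the latest non-OK result per status: one entry per type, no history. */
    private final Map<Status, Long> lastSeenMillis = new EnumMap<>(Status.class);
    private final long stickForMillis;

    public StickyResultSketch(long stickForMinutes) {
        this.stickForMillis = stickForMinutes * 60_000L;
    }

    /** Records the current status and returns the status to report at nowMillis. */
    public Status record(Status current, long nowMillis) {
        if (current != Status.OK) {
            // Overwrite any older entry of the same type; only the latest is kept.
            lastSeenMillis.put(current, nowMillis);
        }
        Status effective = current;
        for (Map.Entry<Status, Long> e : lastSeenMillis.entrySet()) {
            boolean stillSticky = nowMillis - e.getValue() <= stickForMillis;
            if (stillSticky && e.getKey().compareTo(effective) > 0) {
                effective = e.getKey(); // an older, worse result still applies
            }
        }
        return effective;
    }
}
```

With a five-minute window, a CRITICAL result recorded at minute 0 would still be reported for an OK result at minute 1, and would no longer apply at minute 6. Because the map holds at most one timestamp per status, it stays bounded no matter how often the check runs.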
> If we agree on how this feature works we should document it
I agree, will do so on Monday! (I will also look into how to create a release,
as I have not done that yet.)
> Sticky Results Support
> ----------------------
>
> Key: SLING-6855
> URL: https://issues.apache.org/jira/browse/SLING-6855
> Project: Sling
> Issue Type: New Feature
> Components: Health Check
> Reporter: Clinton H Goudie-Nice
> Assignee: Georg Henzler
> Fix For: Health Check Annotations 1.0.6, Health Check Core
> 1.2.10, Health Check API 1.0.2
>
>
> Introduce HC service property {{hc.warningsStickForMinutes}} to allow old
> WARN/CRITICAL/HEALTH_CHECK_ERROR results to be sticky (see also
> http://sling.markmail.org/thread/tawikgt7bqxvnlj5#query:+page:1+mid:57hhg55hekr7ib33+state:results)
> --- Original Request ----
> *Create ResultRegistry to provide health check behavior for executing code
> that does not want a HealthCheck*
> I want to provide a Registry service that can be leveraged to provide health
> check results.
> These results can be for a period of time through an expiration, until the
> JVM is restarted, or added and later removed.
> This can be useful when code observes a specific (possibly bad) state, and
> wants to alert through the health check API that this state has taken place.
> Some examples:
> An event pool has filled, and some events will be thrown away.
> This is a failure case that requires a restart of the instance.
> It would be appropriate to trigger a permanent failure.
>
> A quota has been tripped. This quota may immediately recover, but it is
> sensible to alert for 30 minutes that the quota has been tripped.
> If you expect the failure will clear itself within a certain window, setting
> the expiration to that window can be ideal.
> GHPR to follow
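
The three lifetimes described in the original request (expiring after a period, lasting until the JVM restarts, or added and later removed) could be sketched roughly as follows. This is only an illustration of the idea: {{ResultRegistrySketch}}, its method names, and the {{NEVER}} sentinel are hypothetical and do not correspond to any existing Sling API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the registry idea; no existing Sling API is implied. */
class ResultRegistrySketch {

    /** Sentinel expiry meaning "keep until removed or the JVM restarts". */
    public static final long NEVER = Long.MAX_VALUE;

    /** One registered failure: a message plus the time at which it expires. */
    private static final class Entry {
        final String message;
        final long expiresAtMillis;

        Entry(String message, long expiresAtMillis) {
            this.message = message;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    /** In-memory only, so every entry disappears on JVM restart. */
    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    /** Registers (or replaces) a failure under the given id. */
    public void add(String id, String message, long expiresAtMillis) {
        entries.put(id, new Entry(message, expiresAtMillis));
    }

    /** Removes a failure that was added earlier, if present. */
    public void remove(String id) {
        entries.remove(id);
    }

    /** Drops expired entries, then reports whether any failure is still active. */
    public boolean hasActiveFailure(long nowMillis) {
        entries.values().removeIf(e -> e.expiresAtMillis <= nowMillis);
        return !entries.isEmpty();
    }
}
```

In this sketch the tripped quota from the example would be added with an expiry 30 minutes in the future, while the filled event pool would be added with {{NEVER}} so that it is reported until the instance is restarted.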
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)