Hi Bertrand,

I created https://issues.apache.org/jira/browse/SLING-3278 for the prototype health check executor service. Feel free to assign the task to me; I couldn't do that myself, as I don't seem to have permission to assign issues.

Georg

On 09.12.2013 16:39, Bertrand Delacretaz wrote:
Hi Georg,

On Thu, Dec 5, 2013 at 7:51 AM, Georg Henzler
<sl...@cq-eclipse-plugin.net> wrote:
...I just had a closer look at the Sling code
and I like some of the concepts but believe some other things could maybe be
improved...

Thanks for your review - I agree that we need better control on the
execution time and asynchronous execution of our health checks.

We discussed this recently [1] and what's suggested there is fairly
similar to what you suggest in terms of health checks execution, with
timeouts and caching of previously computed values.

...There is an emphasis on getting the overall status of the system: There is a Web Console Plugin and a whiteboard servlet (not dependent on Sling) to retrieve an aggregated result of all
health checks registered as services...

You can aggregate Sling health checks with the CompositeHealthCheck
that's briefly described at [3] and used in the health check samples,
would that cover your use cases?

As a first step, I would like to propose the following:
* Introduce HealthCheckRunner to hc-core with the following signature: List<Result> HealthCheckRunner.runAllForTags(String... tags) // the
list is sorted to put failed ones always on top...

I don't think I would sort here, that's a presentation concern - I
prefer having a stable order in the output of the execution service
itself.

* The HealthCheckRunner would use the existing class HealthCheckFilter to
retrieve the service references

Sounds good

* The Web Console would be adjusted to use HealthCheckRunner

Ok

* I would add getExecutionTimeInMs() to org.apache.sling.hc.api.Result

If we're caching the Results I'd add creation timestamp, an expiration
time that can be set when creating the Result and the execution
duration as you suggest.
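
A minimal sketch of what such result metadata could look like, for illustration only. TimedResult and all of its members are hypothetical names (only getExecutionTimeInMs() comes from the proposal above); this is not the actual org.apache.sling.hc.api.Result class.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical value class: NOT part of org.apache.sling.hc.api.
// It only illustrates the three fields discussed in this thread.
final class TimedResult {
    private final boolean ok;
    private final Instant createdAt;       // creation timestamp
    private final Duration timeToLive;     // expiration, set when the result is created
    private final Duration executionTime;  // how long the check took to run

    TimedResult(boolean ok, Instant createdAt, Duration timeToLive, Duration executionTime) {
        this.ok = ok;
        this.createdAt = createdAt;
        this.timeToLive = timeToLive;
        this.executionTime = executionTime;
    }

    boolean isOk() {
        return ok;
    }

    Instant getCreatedAt() {
        return createdAt;
    }

    long getExecutionTimeInMs() {
        return executionTime.toMillis();
    }

    // A cached result is stale once createdAt + timeToLive has been reached.
    boolean isExpired(Instant now) {
        return !now.isBefore(createdAt.plus(timeToLive));
    }
}
```

Passing the current time into isExpired() rather than reading the clock inside it keeps the class easy to test; a cache would call it with Instant.now() and re-run the check when it returns true.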

...* Add parameter format=json to /system/console/healthcheck to provide the result in JSON format (to avoid an extra servlet, I think it is possible for
console urls to return JSON but I would have to check)...

Maybe we don't need that as we have the SLING-2999 JMX resource
provider, but in general this makes sense.

If you want to provide a prototype health check executor service that
would be cool. Note that we have a Sling thread pools service [2]
that's probably useful for that.
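
As a rough sketch of the kind of executor discussed here: checks run in parallel, each wait is bounded by a timeout, and results come back in the order the checks were supplied. It uses a plain java.util.concurrent ExecutorService as a stand-in for the Sling thread pools service, and TimeoutRunner, Status and runAll are hypothetical names, not existing Sling APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical prototype: NOT an existing Sling class.
final class TimeoutRunner {
    enum Status { OK, FAILED, TIMED_OUT }

    // Runs all checks in parallel and returns one status per check,
    // in the iteration order of the input map (no sorting by outcome).
    static Map<String, Status> runAll(Map<String, Callable<Boolean>> checks, long timeoutMs) {
        // Assumption: a plain JDK pool stands in for a Sling-managed thread pool.
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            Map<String, Future<Boolean>> futures = new LinkedHashMap<>();
            for (Map.Entry<String, Callable<Boolean>> e : checks.entrySet()) {
                futures.put(e.getKey(), pool.submit(e.getValue()));
            }
            Map<String, Status> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Boolean>> e : futures.entrySet()) {
                results.put(e.getKey(), await(e.getValue(), timeoutMs));
            }
            return results;
        } finally {
            pool.shutdownNow(); // interrupt any checks that are still running
        }
    }

    private static Status await(Future<Boolean> future, long timeoutMs) {
        try {
            Boolean ok = future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return Boolean.TRUE.equals(ok) ? Status.OK : Status.FAILED;
        } catch (TimeoutException e) {
            future.cancel(true);
            return Status.TIMED_OUT;
        } catch (ExecutionException e) {
            return Status.FAILED; // the check itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return Status.FAILED;
        }
    }
}
```

Because the futures are awaited one after another, each individual wait is capped at timeoutMs, so the whole call takes at most the number of checks times timeoutMs. Callers who want a stable order should pass a LinkedHashMap (or another ordered map); any sorting of failed checks to the top can then happen in the presentation layer.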

-Bertrand

[1] http://markmail.org/message/ioatdxdogexacu2b

[2] http://sling.apache.org/documentation/bundles/apache-sling-commons-thread-pool.html

[3] http://sling.apache.org/documentation/bundles/sling-health-check-tool.html
