Hi,

Looking at SLING-3207 I think this deserves a bit more discussion: I
don't think this is only about JMX, and providing an executor service
that takes care of caching and async execution can help make the
individual health checks simpler.

From a client's point of view I would suggest the following behavior:

1) Executing a health check (via a new HealthCheckExecutor service
that we'll add) is guaranteed to take at most T msec (configurable)

2) If an individual health check's execute() method takes longer than
T, the executor returns the last result that was previously computed,
or an empty result with state=NODATA if we don't have that yet. The
Result contains the timestamp of when it was computed.

3) The executor service prevents concurrent execution of a given health check

This is very similar to how an HTTP cache works, except that in 2) we
return an old result instead of waiting.
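
To make 1) to 3) a bit more concrete, here is a rough sketch of what
such an executor could look like. This is not an existing Sling API:
the class name, the Status enum, the Result class and the use of a
plain Supplier as the health check are all placeholders that I made up
for illustration.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical sketch only, names and types are assumptions.
class CachingHealthCheckExecutor {

    enum Status { OK, WARN, CRITICAL, NODATA }

    static final class Result {
        final Status status;
        final long computedAt; // timestamp of when the result was computed

        Result(Status status, long computedAt) {
            this.status = status;
            this.computedAt = computedAt;
        }
    }

    private final long timeoutMsec; // the configurable T
    private final Map<String, Result> lastResults = new ConcurrentHashMap<>();
    private final Map<String, Future<Result>> running = new ConcurrentHashMap<>();

    // Daemon threads, so a hanging check does not keep the JVM alive
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    CachingHealthCheckExecutor(long timeoutMsec) {
        this.timeoutMsec = timeoutMsec;
    }

    Result execute(String name, Supplier<Status> check) {
        Callable<Result> task = () -> {
            try {
                Result r = new Result(check.get(), System.currentTimeMillis());
                lastResults.put(name, r); // cache for later callers
                return r;
            } finally {
                running.remove(name); // allow the next execution
            }
        };

        // 3) at most one execution per check: callers share the running Future
        Future<Result> future = running.computeIfAbsent(name, n -> pool.submit(task));

        try {
            // 1) wait at most T msec
            return future.get(timeoutMsec, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // 2) too slow: last known result, or NODATA if there is none yet
            return cachedOrNoData(name);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return cachedOrNoData(name);
        } catch (ExecutionException e) {
            // the check itself threw an exception
            return new Result(Status.CRITICAL, System.currentTimeMillis());
        }
    }

    private Result cachedOrNoData(String name) {
        return lastResults.getOrDefault(name,
                new Result(Status.NODATA, System.currentTimeMillis()));
    }
}
```

In this sketch a check that times out keeps running in the background
and updates the cache when it finishes, so a later call can return its
result. A real implementation would probably also want an expiry time
for cached results and a bounded thread pool.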

With this we can drop the "execute() method must be fast" requirement
(within reasonable bounds) which can simplify the actual health check
implementations.

WDYT?
-Bertrand
