Re: Health checks execution service
I'm not sure a separate interface is a good thing: it requires the developer to choose, and if they make the wrong choice there is no way to correct it other than changing the implementation. If we consider today that 1 minute is the limit but decide tomorrow that it's 30 seconds, this gets even more complicated (I just picked random time intervals here).

So maybe, if we go with the executor service and have additional metadata (service properties) on the health checks indicating whether they are long running etc., this could be handled in the executor service. Of course, this would require that a client of a health check always uses the executor service, but I think that's fine.

Carsten

2013/10/28 Felix Meschberger fmesc...@adobe.com

Hi

Apart from JMX (which is a separate beast), I agree that we have to think about fixing the long-running check issue.

How about a "LongRunningHealthCheck" service? Such services would be picked up by the HealthCheck infrastructure and executed as tasks (maybe the service properties could even provide scheduling properties). The HealthCheck infra would then register a HealthCheck service object providing the most recent results from the LongRunningHealthCheck tasks.

Regards
Felix

--
Carsten Ziegeler
cziege...@apache.org
Re: Health checks execution service
On Tue, Oct 29, 2013 at 9:11 AM, Carsten Ziegeler cziege...@apache.org wrote:

...maybe, if we go with the executor service and have additional metadata (service properties) on the health checks indicating if they are long running etc. this could be handled in the executor service...

While the intention is fine, I don't think this can be more than a hint: a health check that calls an external service, for example, might usually execute very quickly, and take very long when TCP timeouts happen.

My suggestion earlier in this thread takes the execution time into account dynamically, so there is no need for metadata, it's reality driven ;-)

...Of course, this would require that a client of a health check always uses the executor service, but I think that's fine...

I agree with that. The alternative would be to manipulate the health checks' bytecode, but that sounds complicated for not much benefit.

-Bertrand
Health checks execution service
Hi,

Looking at SLING-3207 I think this deserves a bit more discussion: I don't think this is only about JMX, and providing an executor service that takes care of caching and async execution can help make the individual health checks simpler.

From a client's point of view I would suggest the following behavior:

1) Executing a health check (via a new HealthCheckExecutor service that we'll add) is guaranteed to take at most T msec (configurable)

2) If an individual health check's execute() method takes longer than T, the executor returns the last result that was previously computed, or an empty result with state=NODATA if we don't have that yet. The Result contains the timestamp of when it was computed.

3) The executor service prevents concurrent execution of a given health check

This is very similar to how an HTTP cache works, except that in 2) we return an old result instead of waiting.

With this we can drop the "execute() method must be fast" requirement (within reasonable bounds), which can simplify the actual health check implementations.

WDYT?

-Bertrand
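To make points 1) to 3) concrete, here is a minimal sketch in plain Java. It is only an illustration under stated assumptions, not the proposed Sling API: the names CheckResult and TimeBoundedExecutor are hypothetical, a health check is modelled as a simple Supplier, and recording CRITICAL when a check throws is my own choice rather than something specified above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical result type: a state plus the time at which it was computed.
final class CheckResult {
    enum State { OK, WARN, CRITICAL, NODATA }
    final State state;
    final long computedAtMillis; // 0 means nothing has been computed yet
    CheckResult(State state, long computedAtMillis) {
        this.state = state;
        this.computedAtMillis = computedAtMillis;
    }
}

// Hypothetical executor illustrating points 1) to 3); not an actual Sling class.
final class TimeBoundedExecutor {
    private final long timeoutMillis;
    private final Map<String, CheckResult> lastResults = new ConcurrentHashMap<>();
    private final Map<String, Future<?>> running = new ConcurrentHashMap<>();
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "health-check");
        t.setDaemon(true); // a stuck check must not keep the JVM alive
        return t;
    });

    TimeBoundedExecutor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    CheckResult execute(String name, Supplier<CheckResult.State> check) {
        // 3) start a new run only if none is in flight for this check
        Future<?> run = running.computeIfAbsent(name, n -> pool.submit(() -> {
            try {
                lastResults.put(n, new CheckResult(check.get(), System.currentTimeMillis()));
            } catch (RuntimeException e) {
                // Assumption: a failing check is recorded as CRITICAL
                lastResults.put(n, new CheckResult(CheckResult.State.CRITICAL, System.currentTimeMillis()));
            } finally {
                running.remove(n);
            }
        }));
        try {
            run.get(timeoutMillis, TimeUnit.MILLISECONDS); // 1) wait at most T
        } catch (TimeoutException | ExecutionException e) {
            // 2) too slow: fall through and serve whatever was computed last
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        CheckResult last = lastResults.get(name);
        return last != null ? last : new CheckResult(CheckResult.State.NODATA, 0L);
    }
}
```

A check that exceeds T keeps running in the background, so a later call can pick up its result once it finishes. Until then, callers get the previous result, or NODATA on the very first run.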
Re: Health checks execution service
I think these are different issues. I'm not against the service per se (need to think about it), but getting an attribute of an MBean should not alter its state.

Carsten
Re: Health checks execution service
Yes, the problems are really different and must be handled differently. And yes, both are indeed problems.

Regards
Felix

--
Felix Meschberger | Principal Scientist | Adobe

On 28.10.2013 at 15:51, Carsten Ziegeler cziege...@apache.org wrote:

I think these are different issues. I'm not against the service per se (need to think about it), but getting an attribute of an MBean should not alter its state.

Carsten
Re: Health checks execution service
On Mon, Oct 28, 2013 at 3:51 PM, Carsten Ziegeler cziege...@apache.org wrote:

... I think these are different issues. I'm not against the service per se (need to think about it), but getting an attribute of an MBean should not alter its state

Agreed. If that can be implemented without having to reinvent things when we get to the executor service, I'm fine.

-Bertrand
Re: Health checks execution service
Hi

Apart from JMX (which is a separate beast), I agree that we have to think about fixing the long-running check issue.

How about a "LongRunningHealthCheck" service? Such services would be picked up by the HealthCheck infrastructure and executed as tasks (maybe the service properties could even provide scheduling properties). The HealthCheck infra would then register a HealthCheck service object providing the most recent results from the LongRunningHealthCheck tasks.

Regards
Felix

--
Felix Meschberger | Principal Scientist | Adobe
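As a rough illustration of the "LongRunningHealthCheck" idea, the sketch below runs a slow check as a scheduled task and serves only the most recent result to callers. Everything here is an assumption made for illustration: the class name ScheduledCheck is hypothetical, results are plain strings, the period would come from service properties in a real setup, and mapping an exception to "CRITICAL" is my own choice.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical wrapper: runs a slow check on a schedule and exposes the latest result.
final class ScheduledCheck implements AutoCloseable {
    // Until the first run completes there is no data to report.
    private final AtomicReference<String> latest = new AtomicReference<>("NODATA");
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "long-running-check");
            t.setDaemon(true); // do not keep the JVM alive for background checks
            return t;
        });

    ScheduledCheck(Supplier<String> slowCheck, long periodMillis) {
        // Fixed delay: the next run starts periodMillis after the previous one ends,
        // so a slow check never overlaps with itself.
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                latest.set(slowCheck.get());
            } catch (RuntimeException e) {
                latest.set("CRITICAL"); // assumption: a failing check counts as critical
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    // Returns immediately with the most recent result; never runs the check itself.
    String execute() {
        return latest.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The trade-off compared with the time-bounded executor is that results can be up to one period old even when the check happens to be fast.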