Bertrand Delacretaz created SLING-3321:
------------------------------------------
Summary: Incorrect caching/timeout behavior with slow health check
Key: SLING-3321
URL: https://issues.apache.org/jira/browse/SLING-3321
Project: Sling
Issue Type: Bug
Components: Extensions
Affects Versions: Health Check Core 1.0.8
Reporter: Bertrand Delacretaz
Assignee: Bertrand Delacretaz
We might not need to fix this right now; I'm just making a note of some tests
I did with the SlowHealthCheckSample.
By default SlowHealthCheckSample takes 1200-3700 msec to execute, and I have
set the cache lifetime to 5 seconds.
With these settings, executing the health check every second should always
provide a result: even if a particular execute call takes more than the default
2-second execution timeout, an older cached result should still be available,
as 3700 (max execution time) + 1000 (execution period) is smaller than 5000
(time to live in cache).
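To make the timing argument above concrete, here is a small sketch of the arithmetic. The class and constant names are made up for illustration and are not part of the Health Check Core API; the numbers are the ones quoted above.

```java
// Illustrative only: names are hypothetical, values come from the description above.
class CacheTimingCheck {
    static final long MAX_EXECUTION_MSEC = 3700; // slowest SlowHealthCheckSample run
    static final long CALL_PERIOD_MSEC = 1000;   // the check is requested every second
    static final long CACHE_TTL_MSEC = 5000;     // configured cache lifetime

    // Max execution time plus one call period, the bound used in the argument above.
    static long worstCaseResultAgeMsec() {
        return MAX_EXECUTION_MSEC + CALL_PERIOD_MSEC;
    }

    // True if that bound stays below the cache time to live.
    static boolean cachedResultShouldBeAvailable() {
        return worstCaseResultAgeMsec() < CACHE_TTL_MSEC;
    }

    public static void main(String[] args) {
        // prints "4700 < 5000: true"
        System.out.println(worstCaseResultAgeMsec() + " < " + CACHE_TTL_MSEC
                + ": " + cachedResultShouldBeAvailable());
    }
}
```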
I'll attach an execution log which shows that this is not the case. I see two
problems:
# A result which times out is cached and reused, even though the actual
execution might have finished in the meantime. We then get a timeout result and
the actual result is thrown away. There's no " execution counter=2" result in
my log for example.
# There's no way to say "execute the health check, but if it times out use an
older result if still valid". We might need an execution option for that, as
you don't always want that.
I think this is a realistic use case: checking external systems, for example,
might have that kind of timing characteristics. I should be able to call the
executor for such an HC every second, for example, and get a result every time.
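As a rough illustration of the behavior I have in mind for both points, here is a minimal standalone sketch. It is not the Health Check Core executor: FallbackExecutorSketch, its execute method and the useOlderResultOnTimeout flag are hypothetical names, and results are plain strings to keep it short. It assumes the check itself stores its result when it finishes, so a timeout is never cached and a late result is not thrown away.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch, not the Sling executor API.
class FallbackExecutorSketch {
    static final String TIMEOUT = "TIMEOUT";

    // A completed result plus the time at which it was stored.
    private static final class Cached {
        final String value;
        final long storedAtMsec;
        Cached(String value, long storedAtMsec) {
            this.value = value;
            this.storedAtMsec = storedAtMsec;
        }
    }

    // Daemon threads so a still-running check does not block JVM exit.
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });
    private final long timeoutMsec;
    private final long ttlMsec;
    private volatile Cached lastCompleted; // only completed executions land here
    private Future<String> inFlight;       // execution currently running, if any

    FallbackExecutorSketch(long timeoutMsec, long ttlMsec) {
        this.timeoutMsec = timeoutMsec;
        this.ttlMsec = ttlMsec;
    }

    synchronized String execute(Callable<String> check, boolean useOlderResultOnTimeout) {
        // Start a new execution only if the previous one has finished.
        if (inFlight == null || inFlight.isDone()) {
            inFlight = pool.submit(() -> {
                String value = check.call();
                // Stored by the task itself, even if the caller already timed out.
                lastCompleted = new Cached(value, System.currentTimeMillis());
                return value;
            });
        }
        try {
            return inFlight.get(timeoutMsec, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The timeout is not cached; optionally fall back to an older valid result.
            Cached c = lastCompleted;
            boolean valid = c != null
                    && System.currentTimeMillis() - c.storedAtMsec < ttlMsec;
            return (useOlderResultOnTimeout && valid) ? c.value : TIMEOUT;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("health check failed", e);
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

With this sketch, a caller that passes true for the flag gets the previous result while a slow execution is still running, and the next call picks up the slow execution's own result once it is done. A caller that passes false still sees the timeout.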
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)