Stefan Egli created SLING-5290:
----------------------------------

             Summary: Ensure heartbeat self-check works with any timeout/interval config
                 Key: SLING-5290
                 URL: https://issues.apache.org/jira/browse/SLING-5290
             Project: Sling
          Issue Type: Improvement
          Components: Extensions
    Affects Versions: Discovery Impl 1.2.0
            Reporter: Stefan Egli
            Assignee: Stefan Egli
            Priority: Minor
             Fix For: Discovery Impl 1.2.2


SLING-5195 introduced a 'heartbeat self-check': a separate thread that checks 
if the local instance is successfully storing heartbeats to the repository. 
That check fails if the time since last heartbeat is 'too long'.

SLING-5284 adjusted what 'too long' means: the check should take into account 
not only the timeout but also the interval. As a result, 
{{timeout - 2*interval}} was used as the threshold.

Although not very likely, suppose you use {{timeout=60sec}} and 
{{interval=25sec}}. The self-check would then fail after only 10sec, 
which is clearly too soon, so this formula needs to be improved. In fact, a 
test (ClusterTest.testDuplicateInstance3726) highlighted this exact problem.
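
To make the arithmetic concrete, here is a minimal sketch of the SLING-5284 threshold. The class and method names are hypothetical and not taken from the Discovery Impl code; values are assumed to be in seconds.

```java
public class OldSelfCheckThreshold {

    // Threshold as described for SLING-5284: timeout - 2*interval.
    // Hypothetical helper for illustration only; units are seconds.
    static long oldThresholdSeconds(long timeoutSeconds, long intervalSeconds) {
        return timeoutSeconds - 2 * intervalSeconds;
    }

    public static void main(String[] args) {
        // timeout=60sec, interval=25sec: 60 - 2*25 = 10
        System.out.println(oldThresholdSeconds(60, 25)); // prints 10
    }
}
```

With a 25sec interval, a 10sec threshold is shorter than the gap between two regular heartbeats, so the self-check can fail even when heartbeats are written on schedule.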

The new threshold should be:
* at minimum {{2*interval}} (or {{timeout}} if that's even lower)
* at maximum {{timeout - 2*interval}}
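
One possible reading of the two bounds above is sketched below: use {{timeout - 2*interval}} where it is large enough, but never go below {{2*interval}} (or {{timeout}}, if that is lower). This is an interpretation, not the committed fix; the class and method names are hypothetical and values are assumed to be in seconds.

```java
public class NewSelfCheckThreshold {

    // Hypothetical helper illustrating one reading of the proposed formula.
    static long thresholdSeconds(long timeoutSeconds, long intervalSeconds) {
        // Lower bound: 2*interval, or timeout if that is even lower.
        long lowerBound = Math.min(2 * intervalSeconds, timeoutSeconds);
        // Preferred value from SLING-5284: timeout - 2*interval.
        long preferred = timeoutSeconds - 2 * intervalSeconds;
        // Never let the threshold drop below the lower bound.
        return Math.max(lowerBound, preferred);
    }

    public static void main(String[] args) {
        System.out.println(thresholdSeconds(60, 25));  // prints 50
        System.out.println(thresholdSeconds(120, 15)); // prints 90
        System.out.println(thresholdSeconds(30, 20));  // prints 30
    }
}
```

Under this reading, the {{timeout=60sec}}, {{interval=25sec}} case yields 50sec instead of 10sec, while configurations with a timeout much larger than the interval keep the {{timeout - 2*interval}} behaviour.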



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
