Marcel Reutegger created OAK-1732:
-------------------------------------

             Summary: Cluster node lease not renewed in time
                 Key: OAK-1732
                 URL: https://issues.apache.org/jira/browse/OAK-1732
             Project: Jackrabbit Oak
          Issue Type: Bug
          Components: core, mongomk
    Affects Versions: 0.20.0
            Reporter: Marcel Reutegger
            Assignee: Marcel Reutegger
            Priority: Blocker
             Fix For: 1.1


A cluster node lease is renewed periodically to indicate that a given clusterId 
is in use. The renewal happens at twice the async delay before the lease 
expires. With the default configuration this means the lease is renewed two 
seconds before it expires. Load tests with an Oak cluster show that the lease 
is sometimes not renewed in time and the LastRevRecoveryAgent kicks in even 
though the cluster nodes are still running.
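
To make the timing concrete, here is a minimal sketch of the renewal margin described above. It is a simplified model and not the actual Oak code: the class, the method names and the 60 second lease end are hypothetical, and the one second async delay is inferred from the "two seconds before it expires" statement.

```java
public class LeaseTimingSketch {

    /** Point in time at which the renewal is attempted: 2 * asyncDelay before expiry. */
    static long renewAt(long leaseEndMs, long asyncDelayMs) {
        return leaseEndMs - 2 * asyncDelayMs;
    }

    /**
     * Returns true if the renewal still completes before the lease expires
     * when the background thread is stalled for stallMs (e.g. by a slow
     * _lastRev update) at the moment the renewal is due.
     */
    static boolean renewedInTime(long leaseEndMs, long asyncDelayMs, long stallMs) {
        return renewAt(leaseEndMs, asyncDelayMs) + stallMs < leaseEndMs;
    }

    public static void main(String[] args) {
        long leaseEnd = 60_000;   // hypothetical lease end, in ms
        long asyncDelay = 1_000;  // assumed default async delay, in ms

        // A 1.5 s stall fits into the 2 s margin.
        System.out.println(renewedInTime(leaseEnd, asyncDelay, 1_500)); // true
        // A 2.5 s stall exceeds the margin and the lease expires.
        System.out.println(renewedInTime(leaseEnd, asyncDelay, 2_500)); // false
    }
}
```

In this model any stall of the background thread longer than twice the async delay lets the lease expire, which would match the behaviour seen under load.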

I see a couple of options to fix this:

- Renew the lease earlier, e.g. after half of the lease timeout, or even with 
every background operation run. This would be roughly once a second.
- Decouple the lease renewal from the background operation thread, which also 
executes more expensive operations (_lastRev updates and cache invalidation).
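
The two options could be combined roughly as follows. This is only a sketch under assumptions, not a proposed patch: the Lease class, the thread name and the 60 second lease timeout are hypothetical stand-ins and do not come from the Oak code base. It renews the lease on a dedicated thread after half of the lease timeout, independent of the background operations.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class LeaseRenewalSketch {

    // Hypothetical value; the issue does not state the lease timeout.
    static final long LEASE_TIMEOUT_MS = 60_000;

    /** Hypothetical stand-in for the lease state of a cluster node. */
    static class Lease {
        private final AtomicLong leaseEnd = new AtomicLong();

        void renew(long nowMs) {
            leaseEnd.set(nowMs + LEASE_TIMEOUT_MS);
        }

        long getLeaseEnd() {
            return leaseEnd.get();
        }

        boolean isExpired(long nowMs) {
            return nowMs >= leaseEnd.get();
        }
    }

    /** Option 1: renew after half of the lease timeout. */
    static long renewInterval(long leaseTimeoutMs) {
        return leaseTimeoutMs / 2;
    }

    /** Option 2: renew on a dedicated thread, not on the background operation thread. */
    static ScheduledExecutorService startRenewal(final Lease lease, long intervalMs) {
        ScheduledExecutorService executor =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "lease-renewal");
                    t.setDaemon(true);
                    return t;
                });
        // The first renewal runs immediately, then once per interval.
        executor.scheduleAtFixedRate(
                () -> lease.renew(System.currentTimeMillis()),
                0, intervalMs, TimeUnit.MILLISECONDS);
        return executor;
    }

    public static void main(String[] args) throws InterruptedException {
        Lease lease = new Lease();
        ScheduledExecutorService executor =
                startRenewal(lease, renewInterval(LEASE_TIMEOUT_MS));
        Thread.sleep(100); // give the renewal thread time for its first run
        System.out.println("expired: " + lease.isExpired(System.currentTimeMillis()));
        executor.shutdownNow();
    }
}
```

With a dedicated thread, a slow _lastRev update or cache invalidation no longer delays the renewal. The renewal itself still writes to the store, so it can be delayed if the store is slow to respond.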



--
This message was sent by Atlassian JIRA
(v6.2#6252)
