Martin Böttcher created OAK-4739:
------------------------------------

             Summary: lease: immediate renew after long renew call
                 Key: OAK-4739
                 URL: https://issues.apache.org/jira/browse/OAK-4739
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: documentmk
    Affects Versions: 1.5.8
            Reporter: Martin Böttcher


A single temporary network issue can shut down the DocumentStore. We observed 
the following situation:

# org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo.renewLease was 
called (this is done regularly and completely normal)
# the network had a temporary problem (the exact cause does not matter here)
# the database call returned only after a long time (the default db maxWaitTime 
is 120 seconds).
# org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo.renewLease decided 
that the current lease was too old (older than 120 seconds, which is the default 
for the oak.documentMK.leaseDurationSeconds property), set a leaseCheckFailed 
variable and threw an Exception
# because leaseCheckFailed is set, all subsequent attempts (if any) immediately 
throw an Exception, too.
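
The sequence above can be illustrated with a small model. This is only a sketch of the behavior as described in this report, not the actual ClusterNodeInfo code: the class StickyLeaseModel, its fields and the exception type are hypothetical, and the time at which the database call returns is passed in explicitly to keep the example deterministic.

```java
/** Hypothetical, simplified model of the reported behavior; not Oak's real code. */
class StickyLeaseModel {
    // Mirrors the 120 second default lease duration named in the report.
    static final long LEASE_MILLIS = 120_000L;

    private long leaseEndTime;
    private boolean leaseCheckFailed;

    StickyLeaseModel(long now) {
        this.leaseEndTime = now + LEASE_MILLIS;
    }

    /**
     * @param now the time (in ms) at which the database call returned
     */
    void renewLease(long now) {
        if (leaseCheckFailed) {
            // Step 5: once the flag is set, every later attempt fails at once.
            throw new IllegalStateException("lease check failed earlier");
        }
        if (now > leaseEndTime) {
            // Step 4: the slow call let the lease expire; the flag is never reset.
            leaseCheckFailed = true;
            throw new IllegalStateException(
                    "lease expired " + (now - leaseEndTime) + " ms ago");
        }
        leaseEndTime = now + LEASE_MILLIS;
    }

    boolean hasFailed() {
        return leaseCheckFailed;
    }
}
```

In this model a single call that returns after the lease end is enough to set the flag, and a later call fails even if it arrives at a time that would otherwise be acceptable.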

I'd recommend making the ClusterNodeInfo code more robust, so that at least one 
retry is attempted before the lease is treated as failed.
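
One possible shape for such a retry is sketched below. Everything in it is an assumption made for illustration: RetryingLeaseRenewer, the LeaseStore interface and its writeLeaseEnd method are hypothetical names, and writeLeaseEnd returning true is assumed to mean that the write succeeded and that no other cluster node has claimed the entry in the meantime. A real change would have to verify that last point against the actual recovery logic.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch of "retry once before giving up"; not Oak's real code. */
class RetryingLeaseRenewer {
    /** Assumed store call: true if the new lease end was written successfully. */
    interface LeaseStore {
        boolean writeLeaseEnd(long leaseEnd);
    }

    static final long LEASE_MILLIS = 120_000L; // 120 s default from the report
    static final int MAX_ATTEMPTS = 2;         // the first try plus one retry

    private final LeaseStore store;
    private final LongSupplier clock;
    private long leaseEndTime;
    private boolean leaseCheckFailed;

    RetryingLeaseRenewer(LeaseStore store, LongSupplier clock) {
        this.store = store;
        this.clock = clock;
        this.leaseEndTime = clock.getAsLong() + LEASE_MILLIS;
    }

    void renewLease() {
        if (leaseCheckFailed) {
            throw new IllegalStateException("lease check failed earlier");
        }
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            long proposed = clock.getAsLong() + LEASE_MILLIS;
            boolean written = store.writeLeaseEnd(proposed); // may block for long
            long now = clock.getAsLong();
            if (written && now < proposed) {
                // The lease that was just written is still valid: accept it.
                leaseEndTime = proposed;
                return;
            }
            // Otherwise the call was too slow or failed; retry immediately.
        }
        leaseCheckFailed = true; // only set after all attempts were used up
        throw new IllegalStateException("could not renew lease in time");
    }

    long getLeaseEndTime() {
        return leaseEndTime;
    }

    boolean hasFailed() {
        return leaseCheckFailed;
    }
}
```

With this shape a single slow database call no longer sets the flag on its own; the flag is set only if the immediate retry is also too slow or unsuccessful.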



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
