[ 
https://issues.apache.org/jira/browse/OAK-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15838142#comment-15838142
 ] 

Stefan Egli commented on OAK-5446:
----------------------------------

Just one comment: maybe we should have two flavours of the test, one with the 
delay and one without, as both cases seem useful.

> leaseUpdateThread might be blocked by leaseUpdateCheck
> ------------------------------------------------------
>
>                 Key: OAK-5446
>                 URL: https://issues.apache.org/jira/browse/OAK-5446
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.4, 1.5.14
>            Reporter: Stefan Eissing
>            Assignee: Julian Reschke
>              Labels: candidate_oak_1_4, candidate_oak_1_6
>         Attachments: OAK-5446.diff, OAK-5446-jr.diff, OAK-5446.testcase
>
>
> We are fighting with cluster nodes losing their lease and shutting down 
> oak-core in a cloud environment. For reasons unknown at this point in time, 
> the whole process seems to skip about two minutes of real time.
> This is a situation from which Oak currently does not recover. Code analysis 
> shows that {{ClusterNodeInfo}} is handed the 
> {{LeaseCheckDocumentStoreWrapper}} instance to use as its store. This is fatal 
> since any action {{renewLease()}} tries to perform will first invoke 
> {{performLeaseCheck()}}. The lease check will, when the {{FailureMargin}} is 
> reached, _stall the renewLease() thread_ for 5 retry attempts and then 
> declare the lease to be lost.
> The {{ClusterNodeInfo}} should instead be using the "real" {{DocumentStore}}, 
> not the wrapped one, IMO.
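
The interaction described above can be sketched in a few lines of Java. This is only an illustrative model, not Oak code: the types Store, RealStore, Lease and LeaseCheckingStore are hypothetical stand-ins, the clock is injected by hand, and the retry/stall behaviour and failure margin are left out, so the wrapper simply rejects the call once the lease end has passed.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the problem described in the issue. Every type and
// method here is hypothetical; none of this is the Oak API.
class LeaseSketch {

    /** Minimal stand-in for a document store. */
    interface Store {
        void put(String key, long value);
    }

    /** The "real" store: writes always go through. */
    static class RealStore implements Store {
        final Map<String, Long> data = new HashMap<>();

        @Override
        public void put(String key, long value) {
            data.put(key, value);
        }
    }

    /** Holds the lease end and renews it through whichever store it is given. */
    static class Lease {
        static final long DURATION_MS = 120_000;
        long leaseEnd;

        Lease(long now) {
            leaseEnd = now + DURATION_MS;
        }

        void renew(Store store, long now) {
            long newEnd = now + DURATION_MS;
            store.put("leaseEnd", newEnd); // throws if the store checks the lease
            leaseEnd = newEnd;
        }
    }

    /** Wrapper that refuses every operation once the lease looks expired. */
    static class LeaseCheckingStore implements Store {
        final Store delegate;
        final Lease lease;
        long now; // injected clock, so the sketch needs no sleeping

        LeaseCheckingStore(Store delegate, Lease lease) {
            this.delegate = delegate;
            this.lease = lease;
        }

        @Override
        public void put(String key, long value) {
            if (now >= lease.leaseEnd) {
                throw new IllegalStateException("lease lost");
            }
            delegate.put(key, value);
        }
    }

    /** Simulates a process that skips past the lease end, then tries to renew. */
    static boolean renewAfterClockSkip(boolean useWrapper) {
        long start = 0;
        Lease lease = new Lease(start);
        RealStore real = new RealStore();
        LeaseCheckingStore wrapped = new LeaseCheckingStore(real, lease);
        long now = start + Lease.DURATION_MS + 1; // just past the lease end
        wrapped.now = now;
        try {
            lease.renew(useWrapper ? wrapped : real, now);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("renew via wrapper:    " + renewAfterClockSkip(true));
        System.out.println("renew via real store: " + renewAfterClockSkip(false));
    }
}
```

In this model the renewal that goes through the checking wrapper can never succeed once the clock has jumped past the lease end, because the check runs before the write that would extend the lease. The same renewal against the unwrapped store goes through.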



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
