[ https://issues.apache.org/jira/browse/OAK-4378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Reschke updated OAK-4378:
--------------------------------
    Description: 
We've seen cases where instances were shut down due to the lease update check,
without any clear reason why this would happen.

At least in one case we found that previous to the lease update check starting 
to complain, there was a temporary out-of-memory condition. Increasing the Java 
heap size fixed the problem for that server.

The DocumentNodeStore's {{NodeStoreTask}} *does* catch all Throwables and log 
them, but AFAIU, there's a risk that the OOM condition will cause the code in 
the catch clause to throw again.

At this point I'm not suggesting any change here, but for the lease update 
check, it might be good to check the state of the background threads as well. 
That might help in debugging things in the future.

  was:
We've seen cases where instances were shut down due to the lease update check,
without any clear reason why this would happen.

At least in one case we found that previous to the lease update check starting 
to complain, there was a temporary out-of-memory condition. Increasing the Java 
heap size fixed the problem for that server.

The DocumentNodeStore's {{NodeStoreTask}} *does* catch all Throwables and log 
them, but AFAIU, there's a risk that the OOM condition will cause the code in 
the catch clause to throw again.

At this point I'm not suggesting any change here, but for the lease update 
check, it might be good to check the state of the background threads as well.


> LeaseUpdateCheck: check state of background threads
> ---------------------------------------------------
>
>                 Key: OAK-4378
>                 URL: https://issues.apache.org/jira/browse/OAK-4378
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: documentmk
>            Reporter: Julian Reschke
>            Priority: Minor
>
> We've seen cases where instances were shut down due to the lease update check,
> without any clear reason why this would happen.
> At least in one case we found that previous to the lease update check 
> starting to complain, there was a temporary out-of-memory condition. 
> Increasing the Java heap size fixed the problem for that server.
> The DocumentNodeStore's {{NodeStoreTask}} *does* catch all Throwables and log 
> them, but AFAIU, there's a risk that the OOM condition will cause the code in 
> the catch clause to throw again.
> At this point I'm not suggesting any change here, but for the lease update 
> check, it might be good to check the state of the background threads as well. 
> That might help in debugging things in the future.
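
For illustration only, checking "the state of the background threads" could look roughly like the sketch below. This is not Oak code: the class {{BackgroundThreadCheck}} and the method {{deadThreads}} are hypothetical names, and it assumes the lease check has access to the {{Thread}} objects that run the background tasks. It only uses the standard {{Thread.isAlive()}} and {{Thread.getName()}} calls.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Oak: reports background threads
// that are no longer running.
final class BackgroundThreadCheck {

    private BackgroundThreadCheck() {
    }

    /**
     * Returns the names of all given threads that are not alive.
     * A thread is "not alive" if it has terminated (for example because
     * a Throwable escaped its run loop) or if it was never started.
     */
    static List<String> deadThreads(List<Thread> threads) {
        List<String> dead = new ArrayList<>();
        for (Thread t : threads) {
            if (!t.isAlive()) {
                dead.add(t.getName());
            }
        }
        return dead;
    }
}
```

A lease check built along these lines could log the returned names before it complains about a missed lease update, which might make it easier to tell whether a background thread died (for instance after an OOM) rather than merely running late.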



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
