[
https://issues.apache.org/jira/browse/OOZIE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ryota Egashira updated OOZIE-2206:
----------------------------------
Description:
OOZIE-1906 added a znode cleanup thread.
It currently passes Reaper.Mode.REAP_INDEFINITELY, which forces the Oozie server
to keep reaping a znode even after that znode has been cleaned up.
(https://github.com/apache/curator/blob/master/curator-recipes/src/main/java/org/apache/curator/framework/recipes/locks/Reaper.java)
This adds memory pressure on the Oozie server. We need to change the mode to
REAP_UNTIL_GONE or REAP_UNTIL_DELETE.
{code}
reaper = new ChildReaper(zk.getClient(), LOCKS_NODE,
        Reaper.Mode.REAP_INDEFINITELY, getExecutorService(),
        ConfigurationService.getInt(services.getConf(), REAPING_THRESHOLD) * 1000,
        REAPING_LEADER_PATH);
{code}
We hit one scenario where one ZK quorum slowed down for a short period, causing
many ZK locks not to be released properly. Right after that, ChildReaper (which
runs every 5 min) ran and has kept checking that list of znodes ever since. In
the end, the Oozie server hit OOM.
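To illustrate why the mode matters, here is a rough toy model of how the three Reaper modes decide whether a path stays in the reaper's tracked set after one reaping pass. It is only a sketch based on my reading of the Reaper.java linked above, not Curator or Oozie code: the class ReaperModeSketch, the method trackedAfterPass, and the /locks/... paths are all made up for illustration, and the real Reaper also waits for a znode to be seen empty several times before deleting it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Set;

// Toy model only: mimics the "keep tracking or drop" decision per mode.
// Names here are hypothetical and do not exist in Curator or Oozie.
class ReaperModeSketch {
    public enum Mode { REAP_INDEFINITELY, REAP_UNTIL_DELETE, REAP_UNTIL_GONE }

    /**
     * Returns how many paths are still tracked after one reaping pass.
     *
     * @param deletedByReaper number of empty znodes that exist and get deleted by the reaper
     * @param alreadyGone     number of tracked paths whose znode no longer exists
     */
    public static int trackedAfterPass(Mode mode, int deletedByReaper, int alreadyGone) {
        Set<String> existing = new HashSet<>();
        Set<String> tracked = new HashSet<>();
        for (int i = 0; i < deletedByReaper; i++) {
            String path = "/locks/present-" + i;
            existing.add(path);
            tracked.add(path);
        }
        for (int i = 0; i < alreadyGone; i++) {
            tracked.add("/locks/gone-" + i);
        }

        // Iterate over a copy so paths can be dropped from the tracked set.
        for (String path : new ArrayList<>(tracked)) {
            boolean keep;
            if (existing.remove(path)) {
                // The znode existed and the reaper deleted it.
                keep = (mode == Mode.REAP_INDEFINITELY);
            } else {
                // The znode was already gone when the reaper looked.
                keep = (mode != Mode.REAP_UNTIL_GONE);
            }
            if (!keep) {
                tracked.remove(path);
            }
        }
        return tracked.size();
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + trackedAfterPass(mode, 1000, 1000));
        }
    }
}
```

In this model REAP_INDEFINITELY never drops a path, so the tracked set only grows as more lock znodes are added, which matches the memory growth described above. REAP_UNTIL_GONE drops a path both when the reaper deletes it and when it finds the znode already missing, so it looks like the safer choice of the two proposed modes.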
was:
OOZIE-1906 added a znode cleanup thread.
It currently passes Reaper.Mode.REAP_INDEFINITELY, which forces the Oozie server
to keep reaping a znode even after that znode has been cleaned up. This adds
memory pressure on the Oozie server. We need to change the mode to
REAP_UNTIL_GONE or REAP_UNTIL_DELETE.
{code}
reaper = new ChildReaper(zk.getClient(), LOCKS_NODE,
        Reaper.Mode.REAP_INDEFINITELY, getExecutorService(),
        ConfigurationService.getInt(services.getConf(), REAPING_THRESHOLD) * 1000,
        REAPING_LEADER_PATH);
{code}
We hit one scenario where one ZK quorum slowed down for a short period, causing
many ZK locks not to be released properly. Right after that, ChildReaper (which
runs every 5 min) ran and has kept checking that list of znodes ever since. In
the end, the Oozie server hit OOM.
> Change Reaper mode on ChildReaper in ZKLocksService
> ---------------------------------------------------
>
> Key: OOZIE-2206
> URL: https://issues.apache.org/jira/browse/OOZIE-2206
> Project: Oozie
> Issue Type: Bug
> Reporter: Ryota Egashira