[ https://issues.apache.org/jira/browse/OOZIE-3254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17607242#comment-17607242 ]
Andras Salamon commented on OOZIE-3254: --------------------------------------- [~jmakai] [~dionusos] Sorry for the slow response. The change looks good to me. If it was tested with the same config where the original code and my partial fix gave OOM and this one works, then we are good to go. > [coordinator] LAST_ONLY and NONE execution modes: possible OutOfMemoryError > when there are too many coordinator actions to materialize > -------------------------------------------------------------------------------------------------------------------------------------- > > Key: OOZIE-3254 > URL: https://issues.apache.org/jira/browse/OOZIE-3254 > Project: Oozie > Issue Type: Bug > Components: coordinator > Affects Versions: 5.0.0 > Reporter: Andras Piros > Assignee: Janos Makai > Priority: Major > Attachments: OOZIE-3254-002.patch, OOZIE-3254-01-wip.patch > > > If there is a coordinator job defined with a {{frequency}} by the minute > (e.g. {{frequency="* * * * *"}}), and {{start-time}} lies well in the past, > and the coordinator job's {{execution-mode}} is {{LAST_ONLY}} or {{NONE}}, it > can happen that too many {{CoordinatorActionBean}} instances are kept on JVM > heap within {{CoordMaterializeTransitionXCommand#insertList}} as those > execution modes [*omit the check for the {{throttle}} > value*|https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java#L439-L443]. > As a consequence, we can see as many as multiple hundred thousands of log > entries [*trying to increase > {{CoordMaterializeTransitionXCommand#insertList}}*|https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java#L560-L566]: > {noformat} > [user@host ~]$ grep 'In storeToDB() coord action id' > /var/log/oozie/oozie-HOSTNAME.log.out | wc -l > 478408 > {noformat} > A much worse consequence is that those {{CoordinatorActionBean}} instances > are attached to GC root (the {{insertList}} itself), and thus, JVM is unable > to free them until a consequent call to {{insertList.clear()}}. This will > result in {{OutOfMemoryError}} occurrence in worst case. > {{CoordMaterializeTransitionXCommand#insertList}} should be watched for a > configurable limit parameter (default value something like 1000), and > persisted / cleared when that limit is reached. -- This message was sent by Atlassian Jira (v8.20.10#820010)