[ https://issues.apache.org/jira/browse/UIMA-3659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lou DeGenaro updated UIMA-3659:
-------------------------------
Fix Version/s: 1.1.0-Ducc
> DUCC Job Driver (JD) OOMs when Total number of work items is large
> ------------------------------------------------------------------
>
> Key: UIMA-3659
> URL: https://issues.apache.org/jira/browse/UIMA-3659
> Project: UIMA
> Issue Type: Bug
> Components: DUCC
> Affects Versions: 1.0.0-Ducc
> Reporter: Lou DeGenaro
> Assignee: Lou DeGenaro
> Fix For: 1.1.0-Ducc
>
>
> A Job with 300,000+ Total work items failed with Reason Premature after
> processing 70,000+ of them, because the JD ran out of memory.
> The Job Driver (JD) maintains a file in the user's log directory named
> work-item-status.json.gz comprising the information shown by the WebServer on
> the Work Items tab of the Job Details page. As each work item is processed,
> the JD's WorkItemStateManager (WiSm) keeps an in-memory record of its Id,
> Node, PID, State, and Start and End times. Periodically, the JD calls the
> WiSm's export method to rewrite the above file.
> Although each work item needs only a small amount of information, these
> in-memory representations are kept for the lifetime of the Job, so memory
> consumption grows with the total number of work items and becomes large
> for large Jobs.
> To alleviate this "designed-in" memory leak, the WiSm should keep only
> active work items in memory. Terminal work items should be flushed to disk.
> This change will affect the DUCC components that employ the WiSm,
> specifically CLI, WS and JD.
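
The proposed change (keep only active work items in memory, flush terminal
ones to disk) could look roughly like the minimal sketch below. This is not
the actual DUCC WorkItemStateManager: the class name ActiveOnlyWorkItemStore,
the State values, the update signature and the line-per-item JSON layout are
all assumptions made for illustration, and gzip compression of the output
file is left out for brevity.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the actual DUCC WorkItemStateManager.
class ActiveOnlyWorkItemStore {

    // Assumed state names; the real set of DUCC work item states may differ.
    enum State {
        QUEUED, OPERATING, ENDED, ERROR;

        boolean isTerminal() {
            return this == ENDED || this == ERROR;
        }
    }

    // Only non-terminal (active) work items are held here, keyed by Id.
    private final Map<String, String> active = new HashMap<>();
    private final Path terminalFile;
    private long flushedCount = 0;

    ActiveOnlyWorkItemStore(Path terminalFile) {
        this.terminalFile = terminalFile;
    }

    // Record the latest status of one work item.
    synchronized void update(String id, String node, int pid, State state,
                             long start, long end) {
        // No JSON escaping is done; ids and node names are assumed to be plain.
        String json = String.format(
            "{\"id\":\"%s\",\"node\":\"%s\",\"pid\":%d,\"state\":\"%s\","
                + "\"start\":%d,\"end\":%d}%n",
            id, node, pid, state, start, end);
        if (state.isTerminal()) {
            // Write to disk first, then drop the in-memory entry.
            append(json);
            active.remove(id);
            flushedCount++;
        } else {
            active.put(id, json);
        }
    }

    // Append one line to the on-disk file of terminal work items.
    private void append(String line) {
        try {
            Files.write(terminalFile, line.getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    synchronized int activeCount() {
        return active.size();
    }

    synchronized long flushedCount() {
        return flushedCount;
    }
}
```

With this layout, memory use is bounded by the number of work items in
flight at any one time rather than by the Total. A reader such as the WS
would then need to combine the on-disk terminal records with the active ones
to show the full Work Items tab.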
--
This message was sent by Atlassian JIRA
(v6.2#6252)