[ 
https://issues.apache.org/jira/browse/FLINK-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Michels reassigned FLINK-1442:
----------------------------------

    Assignee: Max Michels

> Archived Execution Graph consumes too much memory
> -------------------------------------------------
>
>                 Key: FLINK-1442
>                 URL: https://issues.apache.org/jira/browse/FLINK-1442
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 0.9
>            Reporter: Stephan Ewen
>            Assignee: Max Michels
>
> The JobManager archives the execution graphs for analysis of jobs. The
> graphs may consume a lot of memory.
> The execution edges in all2all connection patterns in particular are
> extremely numerous and add up in memory consumption.
> The execution edges connect all parallel tasks, so for an all2all pattern
> between n and m tasks, there are n*m edges. At a parallelism of several
> hundred tasks, this can easily reach 100k objects and more, each with a set
> of metadata.
> I propose the following to solve that:
> 1. Clear all execution edges from the graph (the majority of the memory
> consumers) when it is handed to the archiver.
> 2. Keep the map/list of the archived graphs behind a soft reference, so it
> will be removed under memory pressure before the JVM crashes. That may remove
> graphs from the history early, but is much preferable to the JVM crashing, in
> which case the graphs are lost as well.
> 3. Long term: The graph should be archived somewhere else. Something like the
> History server used by Hadoop and Hive would be a good idea.
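
To put a number on the n*m growth described in the issue, here is a minimal
sketch. The class and method names are hypothetical and are not part of Flink;
the parallelism values in the note below are only illustrative.

```java
/** Hypothetical helper, not a Flink class: counts edges in an all2all pattern. */
class EdgeCount {

    /**
     * Every one of the n producing tasks is connected to every one of the
     * m consuming tasks, so the number of execution edges is n * m.
     * The result is computed as a long so large parallelism cannot overflow.
     */
    static long allToAllEdges(int producers, int consumers) {
        return (long) producers * consumers;
    }
}
```

With 300 producing and 400 consuming tasks, allToAllEdges(300, 400) gives
120000 edges for a single all2all connection, which matches the "100k objects
and more" estimate.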

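Proposal 2 could look roughly like the sketch below. It is only an
illustration of the soft-reference idea using java.lang.ref.SoftReference; the
class SoftArchive and its methods are assumptions made up for this example and
do not reflect the actual archiver code in the JobManager.

```java
import java.lang.ref.SoftReference;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical archive, not the Flink archiver: K is a job id, V a graph. */
class SoftArchive<K, V> {

    // The whole history map sits behind a single soft reference, so the
    // garbage collector may reclaim it under memory pressure.
    private SoftReference<Map<K, V>> ref =
            new SoftReference<>(new LinkedHashMap<>());

    /** Archives a graph, starting a fresh history if the old one was reclaimed. */
    synchronized void put(K jobId, V graph) {
        Map<K, V> map = ref.get();
        if (map == null) {
            // The collector cleared the reference; earlier graphs are gone.
            map = new LinkedHashMap<>();
            ref = new SoftReference<>(map);
        }
        map.put(jobId, graph);
    }

    /** Returns the archived graph, or null if it is unknown or was reclaimed. */
    synchronized V get(K jobId) {
        Map<K, V> map = ref.get();
        return map == null ? null : map.get(jobId);
    }
}
```

The JVM clears all softly reachable references before it throws an
OutOfMemoryError, so the history is dropped instead of taking down the
process. Callers have to treat a null result from get as "no longer
available".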


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
