[ https://issues.apache.org/jira/browse/FLINK-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167377#comment-15167377 ]

Gabor Gevay commented on FLINK-3322:
------------------------------------

I have constructed a much simpler example to demonstrate the problem: 
ConnectedComponents on a graph that is a 1000-node path: 1->2, 2->3, 3->4, 
4->5, ..., 999->1000:
https://github.com/ggevay/flink/tree/MemoryManager-crazy-gc-2
The class to run is org.apache.flink.graph.example.ConnectedComponents. Try 
increasing the memory from e.g. 500m to 5000m, and compare the runtimes with 
taskmanager.memory.preallocate set to true and to false (TaskManager.scala:1713).
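
For reference, the input is nothing more than the 999 edges listed above. Here 
is a minimal sketch of generating that edge list as a CSV file; the class name 
and the file name are made up, and the linked branch may well build the graph 
in code instead of reading a file, so this only makes the input shape concrete:

{code:java}
import java.io.FileNotFoundException;
import java.io.PrintWriter;

// Writes the edge list of the 1000-node path graph (1->2, 2->3, ..., 999->1000).
public class PathGraphGenerator {
    public static void main(String[] args) throws FileNotFoundException {
        try (PrintWriter out = new PrintWriter("path-edges.csv")) {
            for (long i = 1; i < 1000; i++) {
                out.println(i + "," + (i + 1)); // edge i -> i+1
            }
        }
    }
}
{code}
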
I measured the following times on my laptop:
false, 500m:  14s
false, 5000m: 115s
true, 500m:   8s
true, 5000m:  13s

(My guess is that the difference between the two preallocate=true runs is the 
time the JVM needs to allocate all of the memory once, up front, but it should 
be checked that it is not caused by something else unexpected.)
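
A cheap way to check this is to print the JVM's own GC counters at the end of 
each run: if the extra time of the slow preallocate=false runs shows up as 
collection time, the GC explanation is confirmed. A minimal sketch using the 
standard GarbageCollectorMXBean API (the helper class name is made up; call it 
e.g. at the end of the example's main method):

{code:java}
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    /** Prints the total collection count and accumulated collection time per collector. */
    public static void print() {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
    }
}
{code}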

So the bottom line is that the problem gets worse when there are more 
iterations. (We have 1001 iterations in the linked example.)
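
To make the mechanism concrete, here is a simplified, standalone sketch of the 
two strategies. It is not Flink's actual MemoryManager code and not a benchmark 
(the segment count and superstep count are made-up numbers); it only shows the 
structural difference: with preallocation, segments released at the end of a 
superstep go back into a pool and are handed out again, while without it every 
superstep allocates fresh byte arrays and leaves the old ones to the garbage 
collector.

{code:java}
import java.util.ArrayDeque;
import java.util.Queue;

// Simplified illustration only; NOT Flink's actual MemoryManager.
public class SegmentReuseSketch {

    static final int SEGMENT_SIZE = 32 * 1024; // Flink's default page size is 32 KiB
    static final int SEGMENTS = 4096;          // made-up number of segments used per superstep
    static final int SUPERSTEPS = 1000;        // roughly the iteration count of the path example

    public static void main(String[] args) {
        boolean preallocate = args.length > 0 && "true".equals(args[0]);

        // With preallocation, all segments exist up front and are reused forever.
        Queue<byte[]> pool = new ArrayDeque<>();
        if (preallocate) {
            for (int i = 0; i < SEGMENTS; i++) {
                pool.add(new byte[SEGMENT_SIZE]);
            }
        }

        for (int step = 0; step < SUPERSTEPS; step++) {
            byte[][] inUse = new byte[SEGMENTS][];

            // Allocation at the start of the superstep: either hand out pooled
            // segments or create brand new ones.
            for (int i = 0; i < SEGMENTS; i++) {
                inUse[i] = preallocate ? pool.poll() : new byte[SEGMENT_SIZE];
            }

            // ... the operators of the superstep would fill and read the segments here ...

            // Release at the end of the superstep.
            for (int i = 0; i < SEGMENTS; i++) {
                if (preallocate) {
                    pool.add(inUse[i]); // back into the pool, reused in the next superstep
                } else {
                    inUse[i] = null;    // becomes garbage; the GC must reclaim ~128 MiB per superstep
                }
            }
        }
    }
}
{code}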

> MemoryManager creates too much GC pressure with iterative jobs
> --------------------------------------------------------------
>
>                 Key: FLINK-3322
>                 URL: https://issues.apache.org/jira/browse/FLINK-3322
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Runtime
>    Affects Versions: 1.0.0
>            Reporter: Gabor Gevay
>            Priority: Critical
>             Fix For: 1.0.0
>
>
> When taskmanager.memory.preallocate is false (the default), released memory 
> segments are not added to a pool, but the GC is expected to take care of 
> them. This puts too much pressure on the GC with iterative jobs, where the 
> operators reallocate all memory at every superstep.
> See the following discussion on the mailing list:
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Memory-manager-behavior-in-iterative-jobs-tt10066.html
> Reproducing the issue:
> https://github.com/ggevay/flink/tree/MemoryManager-crazy-gc
> The class to start is malom.Solver. If you increase the memory given to the 
> JVM from 1 to 50 GB, performance gradually degrades by more than a factor of 10. 
> (On the first run it will spend a few minutes generating some lookup tables in /tmp.) 
> (I think the slowdown might also depend somewhat on 
> taskmanager.memory.fraction, because more unused non-managed memory results 
> in rarer GCs.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
