ASF GitHub Bot commented on FLINK-3322:

Github user ramkrish86 commented on the issue:

    Thank you. I agree that this is a very important part of Flink because 
memory is involved, which is one reason I wanted to understand how it is 
managed and used, and to work on this. I have already split this into 2 tasks: 
one that deals with the sorters and another that deals with the iterators.
    In an iterative task I found that memory is allocated for the 
iterators and then for the sorters.
    This PR is only for the sorters.
    The other one is for the iterators.
    @ggevay is shepherding it and helping me with the idea of making 
ResettableDrivers, which I was not fully sure could be done.
    I have added some docs to the JIRA itself about splitting up the tasks, 
what is being addressed here, and how it can be addressed.
    `SorterMemoryAllocators` are just holders of the memory segments that 
need to be passed to the sorters. They hold the memory so that the read 
buffers, write buffers and large memory buffers can pull memory from them and 
put it back at the end of each iteration. My initial understanding was that 
this change should have minimal impact on unrelated areas, because otherwise 
it would be difficult to review.
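    As a rough illustration of the holder idea only (a hypothetical sketch, 
not the code in this PR): `SegmentHolder`, `acquire` and `release` are made-up 
names, and plain byte arrays stand in for Flink memory segments. The point is 
that segments are allocated once and then pulled and put back in every 
iteration instead of being reallocated.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical holder of reusable memory segments; not Flink's API. */
final class SegmentHolder {
    private final Deque<byte[]> free = new ArrayDeque<>();
    private int allocations = 0; // number of segments ever allocated

    SegmentHolder(int numSegments, int segmentSize) {
        // Allocate everything up front, once.
        for (int i = 0; i < numSegments; i++) {
            free.push(new byte[segmentSize]);
            allocations++;
        }
    }

    /** Hands out n held segments; fails instead of allocating new ones. */
    List<byte[]> acquire(int n) {
        if (n > free.size()) {
            throw new IllegalStateException(
                "requested " + n + " segments, only " + free.size() + " available");
        }
        List<byte[]> out = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            out.add(free.pop());
        }
        return out;
    }

    /** Puts segments back at the end of an iteration so the next one reuses them. */
    void release(List<byte[]> segments) {
        for (byte[] segment : segments) {
            free.push(segment);
        }
        segments.clear();
    }

    int available() {
        return free.size();
    }

    int allocations() {
        return allocations;
    }
}
```

    In this sketch a sorter would call `acquire` for its read, write and 
large buffers at the start of a superstep and `release` them at the end, so 
the number of allocations stays constant across iterations.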
    As said above, the PR was just meant to show the intention of the changes. 
I am well aware that these changes need time for a thorough review and that 
any design needs to be focused on future enhancements, as I mentioned in the 
doc. A PR always helps in understanding a proposal better because the changes 
are in code.

> MemoryManager creates too much GC pressure with iterative jobs
> --------------------------------------------------------------
>                 Key: FLINK-3322
>                 URL: https://issues.apache.org/jira/browse/FLINK-3322
>             Project: Flink
>          Issue Type: Bug
>          Components: Local Runtime
>    Affects Versions: 1.0.0
>            Reporter: Gabor Gevay
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 1.0.0
>         Attachments: FLINK-3322.docx, FLINK-3322_reusingmemoryfordrivers.docx
> When taskmanager.memory.preallocate is false (the default), released memory 
> segments are not added to a pool, but the GC is expected to take care of 
> them. This puts too much pressure on the GC with iterative jobs, where the 
> operators reallocate all memory at every superstep.
> See the following discussion on the mailing list:
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Memory-manager-behavior-in-iterative-jobs-tt10066.html
> Reproducing the issue:
> https://github.com/ggevay/flink/tree/MemoryManager-crazy-gc
> The class to start is malom.Solver. If you increase the memory given to the 
> JVM from 1 to 50 GB, performance gradually degrades by more than 10 times. 
> (It will generate some lookuptables to /tmp on first run for a few minutes.) 
> (I think the slowdown might also depend somewhat on 
> taskmanager.memory.fraction, because more unused non-managed memory results 
> in rarer GCs.)

This message was sent by Atlassian JIRA
