Hi Gabor and Marton,

the taskmanager.memory.preallocate switch basically replaces Flink's streaming mode. The current stream runtime code does not operate on managed memory. Hence, any memory allocated by the memory manager cannot be used by streaming jobs and is "lost". If the switch is set to false, memory is requested on demand as operators require it, as you observed. Setting the parameter to true restores the original behavior and does not have any downsides for batch programs.

The effect of the switch on the performance of iterative jobs is interesting and it sounds like it should be improved.

Best, Fabian

2016-01-30 14:04 GMT+01:00 Gábor Gévay <gga...@gmail.com>:

> Hello!
>
> We have a strangely behaving iterative Flink job: when we give it more
> memory, it gets much slower (more than 10 times). The problem seems to
> be mostly caused by GCs. Enabling object reuse didn’t help.
>
> With some profiling and debugging, we traced the problem to the
> operators requesting new memory segments from the memory manager at
> every superstep of the iteration, and the memory manager satisfying
> these requests by allocating new memory segments from the Java heap
> [1], and then the old ones have to be eventually reclaimed by garbage
> collections. We found the option “taskmanager.memory.preallocate”,
> which mostly solved the GC problem, but we would like to understand
> the situation better.
>
> What is the reason for the default value of this setting being false?
> Is there a downside to enabling this option? If the only downside is
> the slower startup of the task managers, then we could have the best
> of both worlds, by modifying the logic of the memory manager to use
> pooling only after releases. I mean the memory manager would give the
> segments back to the pool when the operators release them even when
> “preallocate” is false, and then `allocatePages` would use a new
> method of the memory pool, which would first check if there are
> segments in the pool and calls `allocateNewSegment` or
> `requestSegmentFromPool` accordingly. (Instead of the current
> behaviour, which is to basically disable pooling, when the
> “preallocate” setting is false.)
>
> [1]
> https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/memory/MemoryManager.java#L293-L307
>
> Best,
> Gábor and Márton
>
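
For reference, the "pool only after releases" idea from the quoted message could be sketched roughly as below. This is a minimal illustration and not Flink's actual MemoryManager code: the class names `Segment` and `LazySegmentPool`, the `maxSegments` budget, and the method names are all hypothetical. The pool starts empty (so task manager startup stays fast), allocates a new segment only when no released one is available, and keeps released segments for reuse in later supersteps.

```java
import java.util.ArrayDeque;

/** Hypothetical stand-in for a memory segment; not Flink's MemorySegment. */
final class Segment {
    final byte[] data;

    Segment(int size) {
        this.data = new byte[size];
    }
}

/** Hypothetical pool that is filled lazily: allocate on demand, reuse after release. */
final class LazySegmentPool {
    private final ArrayDeque<Segment> free = new ArrayDeque<>();
    private final int segmentSize;
    private final int maxSegments; // assumed upper bound on segments ever created
    private int created = 0;

    LazySegmentPool(int segmentSize, int maxSegments) {
        this.segmentSize = segmentSize;
        this.maxSegments = maxSegments;
    }

    /** Hands out a previously released segment if one exists, otherwise allocates a new one. */
    synchronized Segment requestSegment() {
        Segment pooled = free.pollFirst();
        if (pooled != null) {
            return pooled; // reuse: no new heap allocation, nothing for the GC to reclaim
        }
        if (created >= maxSegments) {
            throw new IllegalStateException("memory budget exhausted");
        }
        created++;
        return new Segment(segmentSize);
    }

    /** Returns a segment to the pool instead of dropping the reference. */
    synchronized void release(Segment segment) {
        free.addFirst(segment);
    }

    /** Number of segments allocated from the heap so far. */
    synchronized int createdCount() {
        return created;
    }
}
```

With a pool like this, an iterative job that requests and releases the same number of segments in every superstep would allocate from the heap only during the first superstep, under the assumption that operators release their segments before the next superstep requests them.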