A forced GC does not mean that the JVM will even try to release the freed memory 
back to the operating system. This depends heavily on the JVM and garbage 
collector used for your Flink setup, but most probably it's JDK 8 with the 
ParallelGC collector.

ParallelGC is known not to be very aggressive about releasing free heap memory 
back to the OS. I see multiple possible solutions here:
1. Ask yourself why you really need to release any memory back. Is there a 
logical reason behind it? The next time you submit a job, the memory is going 
to be reused.
2. You can switch to G1GC and use JVM args like "-XX:MinHeapFreeRatio=<n> 
-XX:MaxHeapFreeRatio=<n>" to make it more aggressive about releasing memory 
(see the sketch after this list).
3. You can use unofficial JVM builds from RedHat with the ShenandoahGC 
backport, which is also able to do the job: 
https://builds.shipilev.net/openjdk-shenandoah-jdk8/
4. Flink 1.10 will (hopefully) be able to run on JDK 11, where G1 is much 
more aggressive about releasing memory: 
https://bugs.openjdk.java.net/browse/JDK-8146436
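
For option 2, that could look roughly like the following in flink-conf.yaml of 
a standalone setup (the env.java.opts.taskmanager key and the ratio values 
here are only an illustration, adjust them for your Flink version and 
workload):

  # extra JVM args for the task manager JVM
  env.java.opts.taskmanager: -XX:+UseG1GC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20

You can then check with "jstat -gc <pid>" whether the committed heap actually 
shrinks after a full GC.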

Roman Grebennikov | g...@dfdx.me


On Sat, Oct 12, 2019, at 08:38, Timothy Victor wrote:
> This part about the GC not cleaning up after the job finishes makes sense. 
> However, I observed that even after I run "jcmd <pid> GC.run" against the 
> task manager process ID, the memory is still not released. This is what 
> concerns me.
> 
> Tim
> 
> 
> On Sat, Oct 12, 2019, 2:53 AM Xintong Song <tonysong...@gmail.com> wrote:
>> Generally yes, with one slight difference. 
>> 
>> Once the job is done, the buffer is released by the Flink task manager 
>> (because pre-allocation is configured to be disabled), but the corresponding 
>> memory may not be released by the JVM (because no GC has collected it yet). 
>> So it's not the task manager that keeps the buffer to be used for the next 
>> batch job. When the new batch job is running, the task executor allocates 
>> new buffers, which reuse the memory of the previous buffers that the JVM 
>> hasn't released.
>> 
>> Thank you~
>> 
>> Xintong Song
>> 
>> On Sat, Oct 12, 2019 at 7:28 AM Timothy Victor <vict...@gmail.com> wrote:
>>> Thanks Xintong! In my case both of those parameters are set to false 
>>> (default). I think I am sort of following what's happening here.
>>> 
>>> I have one TM with heap size set to 1GB. When the cluster is started, the TM 
>>> doesn't use that 1GB (no allocations). Once the first batch job is 
>>> submitted, I can see the memory go up by roughly 1GB. I presume this is when 
>>> the TM allocates its 1GB on the heap, and if I read correctly this is 
>>> essentially a large byte buffer that is tenured so that it is never GCed. 
>>> Flink serializes any POJOs into this byte buffer, essentially to circumvent 
>>> GC for performance. Once the job is done, this byte buffer remains on the 
>>> heap, and the task manager keeps it to use for the next batch job. This is 
>>> why I never see the memory go down after a batch job completes. 
>>> 
>>> Does this make sense? Please let me know what you think.
>>> 
>>> Thanks
>>> 
>>> Tim
>>> 
>>> On Thu, Oct 10, 2019, 11:16 PM Xintong Song <tonysong...@gmail.com> wrote:
>>>> I think it depends on your configurations (see the sketch right after 
>>>> this list).
>>>> - Are you using on-heap or off-heap managed memory? (configured by 
>>>> 'taskmanager.memory.off-heap', false by default)
>>>> - Is managed memory pre-allocated? (configured by 
>>>> 'taskmanager.memory.preallocate', false by default)
>>>> 
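>>>> Just as an illustration, in flink-conf.yaml those two settings (with 
>>>> their default values) look like this:
>>>> 
>>>>   # keep managed memory on the JVM heap (default)
>>>>   taskmanager.memory.off-heap: false
>>>>   # allocate managed memory lazily instead of at startup (default)
>>>>   taskmanager.memory.preallocate: false
>>>> 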
>>>> If managed memory is pre-allocated, then the allocated memory segments 
>>>> will never be released. If it's not pre-allocated, memory segments should 
>>>> be released when the task finishes, but the actual memory will not be 
>>>> de-allocated until the next GC. Since the job is finished, there may not 
>>>> be enough heap activity to trigger a GC. If on-heap memory is used, you 
>>>> may not be able to observe a decrease in TM memory usage, because the JVM 
>>>> heap size does not scale down. Only if off-heap memory is used might you 
>>>> be able to observe a decrease in TM memory usage after a GC, but not from 
>>>> a jmap dump, because jmap reports heap memory usage only.
>>>> 
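>>>> As a rough way to tell those apart (just a sketch, the exact tools are up 
>>>> to you): "jstat -gc <pid>" shows how much of the heap is used versus 
>>>> committed inside the JVM, while "ps -o rss= -p <pid>" shows how much 
>>>> memory the OS has actually given to the task manager process. After a 
>>>> "jcmd <pid> GC.run" you may see the used heap drop while the committed 
>>>> heap and the RSS stay where they are.
>>>> 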
>>>> Besides, I don't think you need to worry about whether memory is released 
>>>> after one job is finished. Sometimes Flink/the JVM does not release memory 
>>>> after jobs/tasks finish, so that it can be reused directly by other 
>>>> jobs/tasks, which reduces allocation/deallocation overhead and improves 
>>>> performance.
>>>> 
>>>> Thank you~
>>>> 
>>>> Xintong Song
>>>> 
>>>> On Thu, Oct 10, 2019 at 7:55 PM Timothy Victor <vict...@gmail.com> wrote:
>>>>> After a batch job finishes in a Flink standalone cluster, I notice that 
>>>>> the memory isn't freed up. I understand Flink uses its own memory 
>>>>> manager and just allocates a large tenured byte array that is not GC'ed. 
>>>>> But does the memory used in this byte array get released when the batch 
>>>>> job is done?
>>>>> 
>>>>> The scenario I am facing is that I am running a series of scheduled batch 
>>>>> jobs on a standalone cluster with 1 TM and 1 slot. I notice that after a 
>>>>> job is complete, the memory used in the TM isn't freed up. I can confirm 
>>>>> this by running a jmap dump.
>>>>> 
>>>>> Has anyone else run into this issue? This is on 1.9.
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Tim
