This part about the GC not cleaning up after the job finishes makes sense. However, I observed that even after I run "jcmd <pid> GC.run" on the task manager process ID, the memory is still not released. This is what concerns me.
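The distinction that may matter here is between the JVM's *used* heap, which a forced GC can lower, and its *committed* heap, which (as Xintong notes below) generally does not scale down and is what top and similar tools report as the process footprint. A minimal probe of the two values, assuming it runs inside the JVM being inspected (HeapProbe is an illustrative name, not anything from the Flink codebase; the same numbers are also reachable remotely over JMX):

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryUsage;

    public class HeapProbe {
        public static void main(String[] args) {
            MemoryUsage heap =
                    ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
            // "used" should drop after a GC.run; "committed" is what the
            // OS sees as the process footprint and typically stays flat.
            System.out.printf("used=%d MB, committed=%d MB, max=%d MB%n",
                    heap.getUsed() >> 20,
                    heap.getCommitted() >> 20,
                    heap.getMax() >> 20);
        }
    }

If "committed" stays flat after "jcmd <pid> GC.run" while "used" drops, the memory is being held by the JVM rather than by Flink.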
Tim

On Sat, Oct 12, 2019, 2:53 AM Xintong Song <tonysong...@gmail.com> wrote:

> Generally yes, with one slight difference.
>
> Once the job is done, the buffer is released by the Flink task manager
> (because pre-allocation is configured to be disabled), but the
> corresponding memory may not be released by the JVM (because no GC cleans
> it). So it's not the task manager that keeps the buffer to be used for
> the next batch job. When the new batch job is running, the task executor
> allocates new buffers, which will use the memory of the previous buffers
> that the JVM hasn't released yet.
>
> Thank you~
>
> Xintong Song
>
>
> On Sat, Oct 12, 2019 at 7:28 AM Timothy Victor <vict...@gmail.com> wrote:
>
>> Thanks Xintong! In my case both of those parameters are set to false
>> (the default). I think I am sort of following what's happening here.
>>
>> I have one TM with the heap size set to 1GB. When the cluster is
>> started, the TM doesn't use that 1GB (no allocations). Once the first
>> batch job is submitted, I can see the memory go up by roughly 1GB. I
>> presume this is when the TM allocates its 1GB on the heap, and if I read
>> correctly this is essentially a large byte buffer that is tenured so
>> that it is never GC'ed. Flink serializes any POJOs into this byte
>> buffer, essentially to circumvent GC for performance (see the sketch
>> after the quoted thread below). Once the job is done, this byte buffer
>> remains on the heap, and the task manager keeps it to use for the next
>> batch job. This is why I never see the memory go down after a batch job
>> is complete.
>>
>> Does this make sense? Please let me know what you think.
>>
>> Thanks
>>
>> Tim
>>
>> On Thu, Oct 10, 2019, 11:16 PM Xintong Song <tonysong...@gmail.com>
>> wrote:
>>
>>> I think it depends on your configuration.
>>>
>>> - Are you using on-heap or off-heap managed memory? (configured by
>>> 'taskmanager.memory.off-heap', false by default)
>>>
>>> - Is managed memory pre-allocated? (configured by
>>> 'taskmanager.memory.preallocate', false by default; both settings are
>>> shown in the flink-conf.yaml excerpt after the quoted thread below)
>>>
>>> If managed memory is pre-allocated, the allocated memory segments will
>>> never be released. If it's not pre-allocated, memory segments should be
>>> released when the task finishes, but the actual memory will not be
>>> de-allocated until the next GC. Since the job is finished, there may
>>> not be enough heap activity to trigger a GC. If on-heap memory is used,
>>> you may not be able to observe the decrease in TM memory usage, because
>>> the JVM heap size does not scale down. Only if off-heap memory is used
>>> might you be able to observe the decrease in TM memory usage after a
>>> GC, but not from a jmap dump, because jmap reports heap memory usage
>>> only.
>>>
>>> Besides, I don't think you need to worry about whether memory is
>>> released after one job finishes. Sometimes Flink and the JVM do not
>>> release memory after jobs/tasks finish, so that it can be reused
>>> directly by other jobs/tasks, reducing allocation/deallocation
>>> overheads and improving performance.
>>>
>>> Thank you~
>>>
>>> Xintong Song
>>>
>>>
>>> On Thu, Oct 10, 2019 at 7:55 PM Timothy Victor <vict...@gmail.com>
>>> wrote:
>>>
>>>> After a batch job finishes in a Flink standalone cluster, I notice
>>>> that the memory isn't freed up. I understand Flink uses its own memory
>>>> manager and just allocates a large tenured byte array that is not
>>>> GC'ed. But does the memory used in this byte array get released when
>>>> the batch job is done?
>>>>
>>>> The scenario I am facing is that I am running a series of scheduled
>>>> batch jobs on a standalone cluster with 1 TM and 1 slot. I notice that
>>>> after a job is complete, the memory used in the TM isn't freed up. I
>>>> can confirm this by running a jmap dump.
>>>>
>>>> Has anyone else run into this issue? This is on 1.9.
>>>>
>>>> Thanks
>>>>
>>>> Tim
>>>>
>>>
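For readers skimming the thread, here is a simplified sketch of the pattern Tim describes above: allocate one large, long-lived buffer up front, serialize records into it, and reuse it across jobs so the GC never has to reclaim it. This is illustrative code only (ManagedBufferSketch is an invented name), not Flink's actual implementation; the real logic lives in Flink's MemorySegment and MemoryManager classes.

    import java.nio.ByteBuffer;

    // Illustrative only -- not Flink source. Shows why the footprint stays
    // flat: the buffer is allocated once, tenured, and reused, so a
    // completed job frees nothing at the OS level.
    public class ManagedBufferSketch {
        private final ByteBuffer managed; // one large, long-lived buffer

        ManagedBufferSketch(int capacityBytes) {
            // On-heap variant; ByteBuffer.allocateDirect would be the
            // off-heap analogue.
            this.managed = ByteBuffer.allocate(capacityBytes);
        }

        // Records are serialized into the managed buffer rather than kept
        // as short-lived objects the GC would have to trace.
        void write(byte[] serializedRecord) {
            if (managed.remaining() < serializedRecord.length) {
                managed.clear(); // reuse for the next job; nothing is freed
            }
            managed.put(serializedRecord);
        }
    }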
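And for completeness, the two settings Xintong refers to, as they would appear in flink-conf.yaml, shown at the defaults Tim reports (both false):

    # flink-conf.yaml
    taskmanager.memory.off-heap: false    # on-heap managed memory
    taskmanager.memory.preallocate: false # segments allocated lazily, freed on next GC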