[ 
https://issues.apache.org/jira/browse/FLINK-21419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17301408#comment-17301408
 ] 

Kezhu Wang commented on FLINK-21419:
------------------------------------

Glad to see that we all lean toward failing on concurrent free by default, at 
least in dev. I guess a core dump is the most notable and destructive failure 
:(. I tried throwing an exception in my local env, but that does not work, at 
least for FLINK-21728, because many exceptions are simply swallowed in the 
closing phase. I think we could write some code to print stack traces when a 
concurrent free is detected, just like [~xintongsong] has done in FLINK-21728.
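
A minimal sketch of what such detection could look like, assuming a hypothetical {{FreeGuard}} wrapper (this is not the code from FLINK-21728 and not Flink's {{MemorySegment}}). It remembers the stack trace of the first free and prints both traces on a repeated free instead of throwing, so the report is still visible if exceptions get swallowed:

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical guard for illustration only; not part of Flink. */
final class FreeGuard {
    // Stack trace captured at the first successful free, or null if not freed yet.
    private final AtomicReference<Throwable> firstFree = new AtomicReference<>();
    // The action that actually releases the memory (assumed to be supplied by the caller).
    private final Runnable releaser;

    FreeGuard(Runnable releaser) {
        this.releaser = releaser;
    }

    /** Returns true if this call released the memory, false if it was a repeated free. */
    boolean free() {
        Throwable here =
                new Throwable("free() called on thread " + Thread.currentThread().getName());
        if (firstFree.compareAndSet(null, here)) {
            // Only the single winner of the CAS runs the releaser.
            releaser.run();
            return true;
        }
        // Repeated or concurrent free: print both call sites rather than throwing.
        System.err.println("Repeated free detected; this call, then the first free:");
        here.printStackTrace();
        firstFree.get().printStackTrace();
        return false;
    }
}
```

Capturing a {{Throwable}} on every free has a cost, so something like this would probably only be enabled in dev or test runs.
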
{quote}the original purpose of using GC cleaner was to rely on both explicit 
segment free and GC for releasing the memory, while make sure the memory is 
released only once.
{quote}
I actually traced this history a bit over the weekend. This changed in 
FLINK-14894 (later superseded by FLINK-15758), where the conclusion at that 
time was:
{quote}The conclusion at the moment is that release unsafe memory, while 
potentially having link on it in Java code, is dangerous.
{quote}
 
 That fact has not changed.

The existing options are:
 * Rely on manual free, with a safety net for leak detection. We are currently 
going this way. It was also the approach before FLINK-14894. It can be 
dangerous if the timing of a free is wrong.
 * Rely entirely on GC to reclaim native memory. This was adopted in 
FLINK-15758. It is the safest option, but reclamation is not as immediate as 
with manual free.

I like #1, but diagnosing and fixing potential mistakes could be time 
consuming. I wonder whether we could combine the two by introducing an option 
to switch between them, so that if bad things happen with #1, users could work 
around them with #2 for a while.
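
A rough sketch of such a switch, assuming Java 9+ {{java.lang.ref.Cleaner}} and hypothetical names ({{SwitchableRelease}}, {{Mode}}); none of this is existing Flink API or configuration. In {{MANUAL}} mode an explicit free releases immediately; in {{GC_ONLY}} mode the explicit free is ignored and the release runs only after the owner becomes unreachable:

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of a switch between option #1 and option #2; not Flink code. */
final class SwitchableRelease {
    enum Mode { MANUAL, GC_ONLY }

    private static final Cleaner CLEANER = Cleaner.create();

    private final Mode mode;
    private final Runnable releaseOnce;

    /**
     * The releaser must not reference the owner; otherwise the owner never
     * becomes unreachable and the cleaner action never runs.
     */
    SwitchableRelease(Mode mode, Object owner, Runnable releaser) {
        this.mode = mode;
        AtomicBoolean released = new AtomicBoolean();
        // Wrap the releaser so that it runs at most once.
        this.releaseOnce = () -> {
            if (released.compareAndSet(false, true)) {
                releaser.run();
            }
        };
        if (mode == Mode.GC_ONLY) {
            // Option #2: release is deferred until the owner is unreachable.
            CLEANER.register(owner, releaseOnce);
        }
    }

    /** Explicit free: immediate in MANUAL mode, ignored in GC_ONLY mode. */
    void free() {
        if (mode == Mode.MANUAL) {
            releaseOnce.run();
        }
    }
}
```

In this sketch {{MANUAL}} mode has no leak safety net; that part of #1 would still need to be added separately.
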

If we go solely with #1, I also think {{finalize}} should meet our goal.

 

> Remove GC cleaner mechanism for unsafe memory segments
> ------------------------------------------------------
>
>                 Key: FLINK-21419
>                 URL: https://issues.apache.org/jira/browse/FLINK-21419
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>            Reporter: Xintong Song
>            Assignee: Nicholas Jiang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.13.0
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
