[ 
https://issues.apache.org/jira/browse/FLINK-24401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

future updated FLINK-24401:
---------------------------
    Attachment:     (was: image-2021-09-29-11-45-48-098.png)

> TM cannot exit after Metaspace OOM
> ----------------------------------
>
>                 Key: FLINK-24401
>                 URL: https://issues.apache.org/jira/browse/FLINK-24401
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Task
>    Affects Versions: 1.12.0, 1.13.0
>            Reporter: future
>            Priority: Major
>             Fix For: 1.13.3, 1.14.1
>
>         Attachments: image-2021-09-29-11-47-47-157.png
>
>
> Hi masters, from the code and the logs we can see that an ordinary 
> OutOfMemoryError makes the TaskManager (TM) call terminateJVM() directly, 
> while a Metaspace OutOfMemoryError triggers a graceful shutdown. The code 
> comment gives the reason: {{_it does not usually require more class loading 
> to fail again with the Metaspace OutOfMemoryError_}}.
> However, we ran into the following after a Metaspace OutOfMemoryError: 
> {{_java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner$Result_}}. This 
> leaves the TM unable to exit: it keeps retrying, keeps hitting 
> NoClassDefFoundError because class loading keeps failing, and stays in that 
> state until the TM is killed manually.
> I would like to add a catch for Throwable in the onFatalError method and 
> call terminateJVM() directly in the catch block. Is there any problem with 
> this strategy? 
>  
> [code link 
> |https://github.com/apache/flink/blob/4fe9f525a92319acc1e3434bebed601306f7a16f/flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskManagerRunner.java#L312]
> Screenshots:
> !image-2021-09-29-11-45-48-098.png|width=663,height=343!
>  
> !image-2021-09-29-11-47-47-157.png!
>  
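
The strategy proposed in the description can be sketched roughly as below. This is a minimal, self-contained illustration and not the actual TaskManagerRunner code: the class FatalErrorHandlerSketch, the isMetaspaceOom check (a simple match on the error message), the exit code, and the injected gracefulShutdown and halter callbacks are all hypothetical names introduced for this example. The halt action is injected so that the fallback path can be exercised without killing the JVM.

```java
import java.util.function.IntConsumer;

// Hypothetical sketch of the proposed fallback; not Flink's implementation.
class FatalErrorHandlerSketch {

    // Assumed exit code for illustration only.
    static final int FAILURE_EXIT_CODE = 1;

    private final Runnable gracefulShutdown; // stands in for the graceful shutdown path
    private final IntConsumer halter;        // stands in for terminateJVM()

    FatalErrorHandlerSketch(Runnable gracefulShutdown, IntConsumer halter) {
        this.gracefulShutdown = gracefulShutdown;
        this.halter = halter;
    }

    // Assumption: a Metaspace OOM is recognised by its message text.
    static boolean isMetaspaceOom(Throwable t) {
        return t instanceof OutOfMemoryError
                && t.getMessage() != null
                && t.getMessage().contains("Metaspace");
    }

    void onFatalError(Throwable error) {
        try {
            if (error instanceof OutOfMemoryError && !isMetaspaceOom(error)) {
                // Ordinary OOM: terminate immediately.
                halter.accept(FAILURE_EXIT_CODE);
            } else {
                // Metaspace OOM and other errors: attempt a graceful shutdown.
                gracefulShutdown.run();
            }
        } catch (Throwable t) {
            // The graceful path itself failed (for example with
            // NoClassDefFoundError), so fall back to a hard exit.
            halter.accept(FAILURE_EXIT_CODE);
        }
    }

    public static void main(String[] args) {
        // Simulate a graceful shutdown that fails on class initialization.
        Runnable failingShutdown = () -> {
            throw new NoClassDefFoundError("simulated class loading failure");
        };
        // Print instead of halting so the demo JVM survives.
        IntConsumer recordingHalter =
                code -> System.out.println("halt requested with exit code " + code);

        new FatalErrorHandlerSketch(failingShutdown, recordingHalter)
                .onFatalError(new OutOfMemoryError("Metaspace"));
    }
}
```

In a real process the halter would presumably be something like `code -> Runtime.getRuntime().halt(code)`, which exits without running shutdown hooks and so avoids depending on further class loading during shutdown. Note that the catch block only covers failures thrown synchronously from the shutdown call; failures raised later on another thread would need separate handling.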



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
