[jira] [Commented] (FLINK-16408) Bind user code class loader to lifetime of a slot

Echo Lee (Jira) Wed, 19 Aug 2020 00:00:53 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-16408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180311#comment-17180311
 ]


Echo Lee commented on FLINK-16408:
----------------------------------

[~trohrmann] Yes, the problem I am currently hitting is that a Metaspace OOM 
causes the TaskManager to shut down.

Here is the error log:
{code:java}
Fatal error occurred while executing the TaskManager. Shutting it down...
java.lang.OutOfMemoryError: Metaspace. The metaspace out-of-memory error has 
occurred. This can mean two things: either the job requires a larger size of 
JVM metaspace to load classes or there is a class loading leak. In the first 
case 'taskmanager.memory.jvm-metaspace.size' configuration option should be 
increased. If the error persists (usually in cluster after several job 
(re-)submissions) then there is probably a class loading leak in user code or 
some of its dependencies which has to be investigated and fixed. The task 
executor has to be shutdown...
{code}
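
To tell the two cases in the message apart (metaspace too small vs. a class loading leak), it can help to watch Metaspace usage and class counts across job (re-)submissions. Below is a minimal sketch using only the standard JMX beans. `MetaspaceProbe` is a hypothetical helper name, not part of Flink, and the pool name "Metaspace" is what HotSpot reports, so other JVMs may yield -1.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical helper (not a Flink class) for observing class loading in the current JVM. */
final class MetaspaceProbe {

    /** Returns the bytes used by the "Metaspace" memory pool, or -1 if no such pool is found. */
    static long usedMetaspaceBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // "Metaspace" is the HotSpot pool name; this is an assumption for other JVMs.
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1L : usage.getUsed();
            }
        }
        return -1L;
    }

    /** Returns the number of classes currently loaded in this JVM. */
    static int loadedClassCount() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getLoadedClassCount();
    }

    /** Returns the total number of classes unloaded since the JVM started. */
    static long unloadedClassCount() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        return bean.getUnloadedClassCount();
    }
}
```

If the loaded class count keeps growing after each re-submission while the unloaded count stays flat, that points towards a leak rather than an undersized 'taskmanager.memory.jvm-metaspace.size'.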

> Bind user code class loader to lifetime of a slot
> -------------------------------------------------
>
>                 Key: FLINK-16408
>                 URL: https://issues.apache.org/jira/browse/FLINK-16408
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.9.2, 1.10.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.11.0
>
>         Attachments: Metaspace-OOM.png
>
>
> In order to avoid class leaks due to creating multiple user code class 
> loaders and loading classes multiple times in a recovery case, I would suggest 
> binding the lifetime of a user code class loader to the lifetime of a slot. 
> More precisely, the user code class loader should live at most as long as the 
> slot which is using it.
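
The idea in the description above can be sketched as follows. This is only an illustration under stated assumptions, not Flink's actual implementation: `SlotClassLoaderRegistry`, `getOrCreate`, `releaseSlot` and `hasLoader` are hypothetical names, and slots are identified by plain strings for simplicity.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry: at most one user code class loader per slot, closed with the slot. */
final class SlotClassLoaderRegistry {

    private final Map<String, URLClassLoader> loadersBySlot = new HashMap<>();

    /** Returns the slot's class loader, creating it on first use so that recoveries reuse it. */
    synchronized ClassLoader getOrCreate(String slotId, URL[] userJars) {
        return loadersBySlot.computeIfAbsent(
                slotId,
                id -> new URLClassLoader(userJars, SlotClassLoaderRegistry.class.getClassLoader()));
    }

    /** Closes and forgets the slot's class loader; intended to be called when the slot is freed. */
    synchronized void releaseSlot(String slotId) {
        URLClassLoader loader = loadersBySlot.remove(slotId);
        if (loader != null) {
            try {
                loader.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Returns true while a class loader is registered for the given slot. */
    synchronized boolean hasLoader(String slotId) {
        return loadersBySlot.containsKey(slotId);
    }
}
```

With this shape, a task restarted in the same slot gets the same class loader back instead of a fresh one, and the class loader does not outlive the slot.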



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
