[
https://issues.apache.org/jira/browse/FLINK-16408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180311#comment-17180311
]
Echo Lee commented on FLINK-16408:
----------------------------------
[~trohrmann] Yes, the problem I am currently encountering is that a Metaspace
OOM causes the TaskManager to shut down.
Here is the error log:
{code:java}
Fatal error occurred while executing the TaskManager. Shutting it down...
java.lang.OutOfMemoryError: Metaspace. The metaspace out-of-memory error has
occurred. This can mean two things: either the job requires a larger size of
JVM metaspace to load classes or there is a class loading leak. In the first
case 'taskmanager.memory.jvm-metaspace.size' configuration option should be
increased. If the error persists (usually in cluster after several job
(re-)submissions) then there is probably a class loading leak in user code or
some of its dependencies which has to be investigated and fixed. The task
executor has to be shutdown...
{code}
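To tell apart the two cases named in the error message (Metaspace that is simply too small versus a class loading leak), one option is to record Metaspace usage and the loaded-class count after each job (re-)submission. The following is only a minimal sketch using the standard JMX beans. The class name `MetaspaceProbe` is made up for illustration and is not part of Flink, and the pool name "Metaspace" is an assumption that holds on HotSpot JVMs.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration only; not part of Flink.
final class MetaspaceProbe {

    private MetaspaceProbe() {}

    /** Returns the bytes currently used by Metaspace, or -1 if no such pool is exposed. */
    static long usedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Assumption: HotSpot names this pool "Metaspace".
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1L : usage.getUsed();
            }
        }
        return -1L;
    }

    /** Returns the number of classes currently loaded in this JVM. */
    static int loadedClassCount() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    /** Returns the number of classes unloaded since the JVM started. */
    static long unloadedClassCount() {
        return ManagementFactory.getClassLoadingMXBean().getUnloadedClassCount();
    }
}
```

If `usedBytes()` and `loadedClassCount()` keep growing across resubmissions of the same job while `unloadedClassCount()` stays flat, that would point towards a leak rather than an undersized 'taskmanager.memory.jvm-metaspace.size'.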
> Bind user code class loader to lifetime of a slot
> -------------------------------------------------
>
> Key: FLINK-16408
> URL: https://issues.apache.org/jira/browse/FLINK-16408
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.9.2, 1.10.0
> Reporter: Till Rohrmann
> Assignee: Till Rohrmann
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.11.0
>
> Attachments: Metaspace-OOM.png
>
>
> In order to avoid class leaks due to creating multiple user code class
> loaders and loading classes multiple times in a recovery case, I would suggest
> binding the lifetime of a user code class loader to the lifetime of a slot.
> More precisely, the user code class loader should live at most as long as the
> slot which is using it.
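
The binding proposed in the description could be sketched roughly as follows. This is an illustration of the idea only, not Flink's actual implementation: the class `SlotClassLoaders`, its methods, and the use of a plain string as slot id are all hypothetical. A loader is created the first time a slot needs it, reused when tasks are redeployed into the same slot during recovery, and closed when the slot is freed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; names are hypothetical and not taken from Flink.
final class SlotClassLoaders {

    private final Map<String, URLClassLoader> bySlot = new HashMap<>();

    /** Returns the loader bound to the slot, creating it on first use. */
    synchronized ClassLoader getOrCreate(String slotId, URL[] userJars) {
        return bySlot.computeIfAbsent(
                slotId,
                id -> new URLClassLoader(userJars, SlotClassLoaders.class.getClassLoader()));
    }

    /** Closes and forgets the slot's loader, so it lives at most as long as the slot. */
    synchronized void releaseSlot(String slotId) {
        URLClassLoader loader = bySlot.remove(slotId);
        if (loader == null) {
            return; // nothing was bound to this slot
        }
        try {
            loader.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the number of slots that currently have a loader bound. */
    synchronized int size() {
        return bySlot.size();
    }
}
```

With such a registry, a task restarted in the same slot gets the same loader back instead of a fresh one, so user classes are not loaded again on every recovery attempt.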
--
This message was sent by Atlassian Jira
(v8.3.4#803005)