[
https://issues.apache.org/jira/browse/FLINK-16408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180363#comment-17180363
]
Till Rohrmann commented on FLINK-16408:
---------------------------------------
How many concurrent {{WordCount}} jobs do you have running when the cluster
fails? How long does a {{WordCount}} job take to execute? Maybe you could share
the cluster logs with us to see what is going on.
> Bind user code class loader to lifetime of a slot
> -------------------------------------------------
>
> Key: FLINK-16408
> URL: https://issues.apache.org/jira/browse/FLINK-16408
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.9.2, 1.10.0
> Reporter: Till Rohrmann
> Assignee: Till Rohrmann
> Priority: Critical
> Labels: pull-request-available
> Fix For: 1.11.0
>
> Attachments: Metaspace-OOM.png
>
>
> To avoid class leaks caused by creating multiple user code class loaders
> and loading the same classes multiple times in a recovery case, I suggest
> binding the lifetime of a user code class loader to the lifetime of a slot.
> More precisely, the user code class loader should live at most as long as the
> slot which is using it.
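
To make the proposal concrete, here is a minimal sketch of what "one user code class loader per slot, released with the slot" could look like. It is not Flink's actual implementation: the class `SlotClassLoaderRegistry`, its methods, and the string slot ids are all hypothetical names chosen for illustration, and a plain `URLClassLoader` stands in for whatever loader the runtime really uses.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; names and structure are assumptions,
// not taken from the Flink code base.
class SlotClassLoaderRegistry {

    // At most one user code class loader per slot id.
    private final Map<String, URLClassLoader> loadersBySlot = new HashMap<>();

    // Returns the slot's class loader, creating it only on first use.
    // A recovering task in the same slot gets the existing loader back,
    // so user classes are not loaded a second time.
    synchronized ClassLoader getOrCreate(String slotId, URL[] userJars) {
        return loadersBySlot.computeIfAbsent(
                slotId,
                id -> new URLClassLoader(
                        userJars, SlotClassLoaderRegistry.class.getClassLoader()));
    }

    // Called when the slot is freed: closes and forgets the loader, so the
    // loader lives at most as long as the slot that used it.
    synchronized void releaseSlot(String slotId) {
        URLClassLoader loader = loadersBySlot.remove(slotId);
        if (loader != null) {
            try {
                loader.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Number of slots that currently hold a class loader.
    synchronized int activeLoaders() {
        return loadersBySlot.size();
    }
}
```

In this sketch, repeated calls to `getOrCreate` for the same slot return the same loader, and `releaseSlot` is the only place a loader is discarded. Closing the loader does not by itself guarantee that its classes are unloaded; that still depends on nothing else holding references to them.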