[https://issues.apache.org/jira/browse/BEAM-9030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002682#comment-17002682]
sunjincheng commented on BEAM-9030:
-----------------------------------
Discussion details can be found here:
[https://lists.apache.org/thread.html/ef5b24766d94d3d389bee9c03e59003b9cf417c81cde50ede5856ad7%40%3Cdev.beam.apache.org%3E]
> Metaspace memory leak when running python jobs with flink runner
> ----------------------------------------------------------------
>
> Key: BEAM-9030
> URL: https://issues.apache.org/jira/browse/BEAM-9030
> Project: Beam
> Issue Type: Bug
> Components: java-fn-execution, runner-flink
> Reporter: sunjincheng
> Assignee: sunjincheng
> Priority: Major
> Fix For: 2.19.0
>
>
> When a Python word count job is repeatedly submitted to a Flink
> session/standalone cluster, the metaspace usage of the Flink task manager
> grows continuously (by about 40 MB per submission). The reason is that the
> Beam classes are loaded by Flink's user class loader, and problems in the
> implementations of `ProcessManager` (from Beam) and `ThreadPoolCache` (from
> Netty) can prevent the user class loader from being garbage collected even
> after the job has finished, which eventually leaks metaspace memory. See
> FLINK-15338[1] for more information; a minimal sketch of the leak pattern is
> included at the end of this description. Regarding `ProcessManager`, I have
> created BEAM-9006[2] to track it. Regarding `ThreadPoolCache`, it is a Netty
> problem that has been fixed in NETTY#8955[3]. Netty 4.1.35.Final includes
> this fix, and gRPC 1.22.0 already depends on Netty 4.1.35.Final, so we need
> to bump the gRPC version to 1.22.0+ (currently 1.21.0).
>
> What do you think?
> [1] https://issues.apache.org/jira/browse/FLINK-15338
> [2] https://issues.apache.org/jira/browse/BEAM-9006
> [3] https://github.com/netty/netty/pull/8955
>
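> To make the failure mode concrete, here is a minimal, hypothetical Java
> sketch of the pattern (the names are illustrative; this is not actual Beam
> or Netty code). A static cache lazily starts a long-lived background
> thread; since a new thread inherits the context class loader of the thread
> that created it, the Flink user code class loader of the job that triggered
> initialization stays reachable for as long as the thread lives:
>
>     import java.util.concurrent.ExecutorService;
>     import java.util.concurrent.Executors;
>
>     // Hypothetical illustration of the leak described above.
>     public final class LeakyCache {
>         // Static field: lives as long as the class loader that loaded
>         // LeakyCache, and is initialized the first time a job touches it.
>         private static final ExecutorService CLEANER =
>             Executors.newSingleThreadExecutor(runnable -> {
>                 Thread t = new Thread(runnable, "leaky-cleaner");
>                 // The new thread's context class loader defaults to that of
>                 // the creating thread. If that is a Flink user code class
>                 // loader, this long-lived daemon thread now pins it, along
>                 // with the metaspace of every class that loader loaded.
>                 t.setDaemon(true);
>                 return t;
>             });
>
>         private LeakyCache() {}
>
>         public static void schedule(Runnable task) {
>             CLEANER.submit(task);
>         }
>     }
>
> Fixes for this class of leak generally either clear the background thread's
> context class loader (e.g. `t.setContextClassLoader(null)`) or shut the
> thread down when the job finishes; the sketch is only meant to show why the
> loader stays reachable.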
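> As a quick way to verify the dependency bump, one can print the Netty
> artifact versions actually present on the classpath using Netty's own
> version registry (an optional verification step, not part of the proposed
> change itself):
>
>     import io.netty.util.Version;
>     import java.util.Map;
>
>     // After bumping gRPC to 1.22.0+, the netty-* entries printed here
>     // should report 4.1.35.Final or newer, i.e. a release that contains
>     // the ThreadPoolCache fix from NETTY#8955.
>     public final class NettyVersionCheck {
>         public static void main(String[] args) {
>             Map<String, Version> versions = Version.identify();
>             versions.forEach((artifact, version) ->
>                 System.out.println(artifact + " -> " + version.artifactVersion()));
>         }
>     }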