[ https://issues.apache.org/jira/browse/BEAM-9030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17002682#comment-17002682 ]

sunjincheng commented on BEAM-9030:
-----------------------------------

Discussion details can be found here: 
[https://lists.apache.org/thread.html/ef5b24766d94d3d389bee9c03e59003b9cf417c81cde50ede5856ad7%40%3Cdev.beam.apache.org%3E]

> Metaspace memory leak when running python jobs with flink runner
> ----------------------------------------------------------------
>
>                 Key: BEAM-9030
>                 URL: https://issues.apache.org/jira/browse/BEAM-9030
>             Project: Beam
>          Issue Type: Bug
>          Components: java-fn-execution, runner-flink
>            Reporter: sunjincheng
>            Assignee: sunjincheng
>            Priority: Major
>             Fix For: 2.19.0
>
>
> When submitting a Python word count job to a Flink session/standalone cluster 
> repeatedly, the metaspace usage of the Flink task manager continuously 
> increases (by about 40 MB per submission). The reason is that the Beam classes 
> are loaded with the user class loader in Flink, and problems in the 
> implementations of `ProcessManager` (from Beam) and `ThreadPoolCache` (from 
> Netty) can prevent the user class loader from being garbage collected even 
> after the job finishes, which eventually leads to the metaspace memory leak. 
> You can refer to FLINK-15338 [1] for more information.
> Regarding `ProcessManager`, I have created BEAM-9006 [2] to track it. Regarding 
> `ThreadPoolCache`, it is a Netty problem and has been fixed in NETTY#8955 [3]. 
> Netty 4.1.35.Final already includes this fix and gRPC 1.22.0 already depends 
> on Netty 4.1.35.Final, so we need to bump the gRPC version to 1.22.0+ 
> (currently 1.21.0).
>  
> What do you think?
> [1] https://issues.apache.org/jira/browse/FLINK-15338
> [2] https://issues.apache.org/jira/browse/BEAM-9006
> [3] https://github.com/netty/netty/pull/8955
>  
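To make the pinning mechanism described above more concrete, here is a minimal, hypothetical Java sketch of the general pattern (it is not the actual `ProcessManager` or `ThreadPoolCache` code): a class loaded by Flink's user class loader starts a long-lived daemon thread, and that thread keeps a strong reference back to the user class loader, so the class loader and every class it loaded stay in metaspace after the job finishes.

{code:java}
// Hypothetical illustration only; not Beam or Netty code.
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public final class ClassLoaderPinningSketch {

  // A JVM-wide, lazily created, never-shut-down daemon thread, similar in spirit
  // to a library-level thread-pool cache. This class is loaded by Flink's *user*
  // class loader, and the started thread inherits that class loader as its
  // context class loader.
  private static final ScheduledExecutorService CACHE_CLEANER =
      Executors.newSingleThreadScheduledExecutor(
          runnable -> {
            Thread t = new Thread(runnable, "cache-cleaner");
            t.setDaemon(true); // keeps running after the job finishes
            return t;
          });

  static {
    // While this thread is alive it strongly references this class (and thus its
    // defining class loader), so the user class loader can never be collected and
    // every re-submission of the job leaks another copy of all loaded classes
    // into metaspace.
    CACHE_CLEANER.scheduleAtFixedRate(
        () -> { /* periodic cleanup work */ }, 1, 1, TimeUnit.SECONDS);
  }

  private ClassLoaderPinningSketch() {}
}
{code}

Picking up the Netty release that contains the NETTY#8955 fix (via the gRPC version bump) and fixing `ProcessManager` in BEAM-9006 removes such lingering references, so the user class loader can be unloaded when the job ends.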



