[ 
https://issues.apache.org/jira/browse/FLINK-21437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17288356#comment-17288356
 ] 

Xintong Song edited comment on FLINK-21437 at 4/9/21, 1:46 AM:
---------------------------------------------------------------

[~qccash],

It seems this is a long-standing, well-known JDK bug:
https://bugs.openjdk.java.net/browse/JDK-4872014

From the Apache Flink side, I don't see much we can do about it. If the 
JDK team is not going to fix it, maybe try filing an issue with the Apache Hadoop 
community, asking that {{LocalDirAllocator}} maintain its own shutdown hooks 
instead of relying on {{File#deleteOnExit}}?
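
To illustrate what "maintaining shutdown hooks itself" could look like, here is a minimal sketch. It assumes a hypothetical helper class, {{TrackedTempFiles}}, which is not part of Hadoop or Flink and is not how {{LocalDirAllocator}} is implemented today. The idea is to register a single JVM shutdown hook and to forget each path as soon as its file is deleted, so the bookkeeping does not grow with the number of files ever created.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper for illustration only; not an existing Hadoop or Flink class. */
final class TrackedTempFiles {
    // Paths that still need cleanup. Unlike File#deleteOnExit, entries are
    // removed once the file has been deleted explicitly.
    private static final Set<Path> PENDING = ConcurrentHashMap.newKeySet();

    static {
        // One shutdown hook for the whole JVM; it deletes whatever is still pending.
        Runtime.getRuntime().addShutdownHook(
                new Thread(TrackedTempFiles::deleteAll, "tracked-temp-file-cleanup"));
    }

    private TrackedTempFiles() {}

    /** Creates a temporary file and remembers it for cleanup at JVM exit. */
    static Path create(String prefix) throws IOException {
        Path file = Files.createTempFile(prefix, ".tmp");
        PENDING.add(file);
        return file;
    }

    /** Deletes the file now and drops it from the pending set. */
    static void release(Path file) throws IOException {
        Files.deleteIfExists(file);
        PENDING.remove(file);
    }

    /** Number of files still registered for cleanup. */
    static int pendingCount() {
        return PENDING.size();
    }

    private static void deleteAll() {
        for (Path file : PENDING) {
            try {
                Files.deleteIfExists(file);
            } catch (IOException ignored) {
                // Best effort during shutdown; nothing useful can be done here.
            }
        }
    }
}
```

With something along these lines, a stream that has finished uploading its temporary file would call {{release}} when it closes, and only files that were never released would be left for the shutdown hook.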



> Memory leak when using filesystem state backend on Alibaba Cloud OSS
> --------------------------------------------------------------------
>
>                 Key: FLINK-21437
>                 URL: https://issues.apache.org/jira/browse/FLINK-21437
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / State Backends
>            Reporter: Qian Chao
>            Priority: Minor
>
> When using the filesystem state backend and storing checkpoints on Alibaba Cloud 
> OSS, with the following flink-conf.yaml:
> {code:java}
> state.backend: filesystem
> state.checkpoints.dir: oss://yourBucket/checkpoints
> fs.oss.endpoint: xxxxx
> fs.oss.accessKeyId: xxxxx
> fs.oss.accessKeySecret: xxxxx{code}
> A memory leak occurs on both the JobManager and the TaskManager after a period of 
> time. The objects retained in the JVM heap look like:
> {code:java}
> The class "java.io.DeleteOnExitHook", loaded by "<system class loader>", 
> occupies 1,018,323,960 (96.47%) bytes. The memory is accumulated in one 
> instance of "java.util.LinkedHashMap", loaded by "<system class loader>", 
> which occupies 1,018,323,832 (96.47%) bytes.
> {code}
>  
> The root cause appears to be that when flink-oss-fs-hadoop uploads a file 
> to OSS, AliyunOSSFileSystem creates a temporary file and calls deleteOnExit on it. 
> Registered paths are only cleared at JVM exit, so the LinkedHashSet<String> files 
> in DeleteOnExitHook keeps growing.
> {code:java}
> org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem::create
> -> 
> org.apache.hadoop.fs.aliyun.oss.AliyunOSSOutputStream::new 
> -> 
> dirAlloc.createTmpFileForWrite("output-", -1L, conf) 
> -> 
> org.apache.hadoop.fs.LocalDirAllocator::createTmpFileForWrite 
> -> 
> result.deleteOnExit()
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
