[ https://issues.apache.org/jira/browse/FLINK-15239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997205#comment-16997205 ]

Rui Li commented on FLINK-15239:
--------------------------------

I thought we could find this {{StatisticsDataReferenceCleaner}} thread by name 
and interrupt it at the end of the job. However, in earlier Hadoop versions 
(like the one Hive depends on), this thread catches all throwables and 
does not respond to interruption :(
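For reference, the approach described above could be sketched roughly as follows. This is only an illustration, not code from Flink: the thread-name fragment is an assumption based on the Hadoop class name, and the demo thread merely stands in for the real cleaner (which, as noted, swallows interrupts in older Hadoop versions).

{noformat}
public class CleanerInterruptSketch {

    // Interrupt the first live thread whose name contains the given
    // fragment; returns true if such a thread was found.
    static boolean interruptByName(String nameFragment) {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().contains(nameFragment)) {
                t.interrupt();
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for Hadoop's cleaner thread; the name fragment is an
        // assumption based on the StatisticsDataReferenceCleaner class name.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // A cooperative thread exits here; older Hadoop versions
                // instead catch all throwables and keep running.
            }
        }, "StatisticsDataReferenceCleaner-demo");
        worker.start();

        boolean found = interruptByName("StatisticsDataReferenceCleaner");
        worker.join(5_000);
        System.out.println(found && !worker.isAlive());
    }
}
{noformat}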

> TM Metaspace memory leak
> ------------------------
>
>                 Key: FLINK-15239
>                 URL: https://issues.apache.org/jira/browse/FLINK-15239
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Runtime
>            Reporter: Rui Li
>            Priority: Blocker
>             Fix For: 1.10.0
>
>
> Start a standalone cluster and then submit multiple queries for Hive tables 
> via SQL CLI. Hive connector dependencies are specified via the {{library}} 
> option. TM will fail eventually with:
> {noformat}
> 2019-12-13 15:11:03,698 INFO  org.apache.flink.runtime.taskmanager.Task       
>               - Source: Values(tuples=[[{ 4.3 }]], values=[EXPR$0]) -> 
> SinkConversionToRow -> Sink: Unnamed (1/1) (b9f9667f686fd97c1c5af65b8b163c44) 
> switched from RUNNING to FAILED.
> java.lang.OutOfMemoryError: Metaspace
>         at java.lang.ClassLoader.defineClass1(Native Method)
>         at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
>         at 
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>         at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>         at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>         at 
> org.apache.flink.util.ChildFirstClassLoader.loadClass(ChildFirstClassLoader.java:60)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>         at 
> org.apache.hadoop.hdfs.NameNodeProxies.createNNProxyWithClientProtocol(NameNodeProxies.java:439)
>         at 
> org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:324)
>         at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
>         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:687)
>         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:628)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
>         at 
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
>         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
>         at 
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2701)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2683)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:372)
>         at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
>         at 
> org.apache.flink.connectors.hive.HadoopFileSystemFactory.create(HadoopFileSystemFactory.java:46)
>         at 
> org.apache.flink.table.filesystem.PartitionTempFileManager.<init>(PartitionTempFileManager.java:73)
>         at 
> org.apache.flink.table.filesystem.FileSystemOutputFormat.open(FileSystemOutputFormat.java:104)
>         at 
> org.apache.flink.streaming.api.functions.sink.OutputFormatSinkFunction.open(OutputFormatSinkFunction.java:65)
>         at 
> org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
>         at 
> org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:102)
>         at 
> org.apache.flink.streaming.api.operators.StreamSink.open(StreamSink.java:48)
>         at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.initializeStateAndOpen(StreamTask.java:1018)
> {noformat}
> Even for queries that succeed, the TM prints the following errors:
> {noformat}
> Exception in thread "LeaseRenewer:lirui@localhost:8020" 
> java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/LeaseRenewer$2
>         at org.apache.hadoop.hdfs.LeaseRenewer.renew(LeaseRenewer.java:412)
>         at org.apache.hadoop.hdfs.LeaseRenewer.run(LeaseRenewer.java:448)
>         at 
> org.apache.hadoop.hdfs.LeaseRenewer.access$700(LeaseRenewer.java:71)
>         at org.apache.hadoop.hdfs.LeaseRenewer$1.run(LeaseRenewer.java:304)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hdfs.LeaseRenewer$2
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>         at 
> org.apache.flink.util.ChildFirstClassLoader.loadClass(ChildFirstClassLoader.java:60)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>         ... 5 more
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)