yittg edited a comment on issue #3776:
URL: https://github.com/apache/iceberg/issues/3776#issuecomment-998502908


   After some digging, I found that the direct cause is 
[this](https://github.com/apache/iceberg/blob/5009949ba4377ac5a8572ff7ae70e886c9e33bec/core/src/main/java/org/apache/iceberg/ManifestReader.java?#L100-L102):
   when we build an `AvroIterable`, we don't set the class loader, so the 
default `Thread#getContextClassLoader` is used.
   
   The task runs in a persistent, shared [thread 
pool](https://github.com/apache/iceberg/blob/5009949ba4377ac5a8572ff7ae70e886c9e33bec/core/src/main/java/org/apache/iceberg/util/ThreadPools.java?#L60-L62),
   whose context class loader is closed by the Flink TaskManager after it has 
been used for the first time.
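
To illustrate why a long-lived pool keeps seeing the first class loader, here is a minimal sketch using only the JDK. It is not Iceberg's `ThreadPools` or Flink's user-code class loader: the class name `SharedPoolLoaderDemo`, the helper `observe`, and the two empty `URLClassLoader` instances standing in for two jobs are all made up for the demo. It relies on the fact that a new `Thread` inherits the context class loader of the thread that creates it, and that a `ThreadPoolExecutor` creates its workers lazily from the submitting thread.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SharedPoolLoaderDemo {
  // Submits one task from a caller whose context class loader is `loader`
  // and returns the context class loader that the worker thread observed.
  static ClassLoader observe(ExecutorService pool, ClassLoader loader) throws Exception {
    Thread current = Thread.currentThread();
    ClassLoader previous = current.getContextClassLoader();
    current.setContextClassLoader(loader);
    try {
      Callable<ClassLoader> task = () -> Thread.currentThread().getContextClassLoader();
      return pool.submit(task).get();
    } finally {
      current.setContextClassLoader(previous);
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for a shared, long-lived pool with a single worker.
    ExecutorService shared = Executors.newFixedThreadPool(1);
    try (URLClassLoader jobA = new URLClassLoader(new URL[0]);
         URLClassLoader jobB = new URLClassLoader(new URL[0])) {
      // The worker is created during the first submit and inherits jobA.
      ClassLoader seenByA = observe(shared, jobA);
      // The second submit reuses the same worker, which still holds jobA.
      ClassLoader seenByB = observe(shared, jobB);
      System.out.println("first job sees its own loader: " + (seenByA == jobA));
      System.out.println("second job still sees the first loader: " + (seenByB == jobA));
    } finally {
      shared.shutdown();
    }
  }
}
```

If the first loader is closed when its job ends, as described above, any later task on that worker that goes through the context class loader would hit the closed loader.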
   
   However, just setting the class loader manually is not enough, because other 
code may still use the context class loader, which cannot be avoided.
   An example exception stack is shown below; it is caused by 
`ServiceLoader#load` in the JDK:
   <details>
   
   <summary>at javax.xml.parsers.FactoryFinder.findServiceProvider</summary>
   
   ```
   java.lang.IllegalStateException: Trying to access closed classloader. Please check if you store classloaders directly or indirectly in static fields. If the stacktrace suggests that the leak occurs in a third party library and cannot be fixed immediately, you can disable this check with the configuration 'classloader.check-leaked-classloader'.
           at org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:159)
           at org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResources(FlinkUserCodeClassLoaders.java:188)
           at java.util.ServiceLoader$LazyClassPathLookupIterator.nextProviderClass(ServiceLoader.java:1196)
           at java.util.ServiceLoader$LazyClassPathLookupIterator.hasNextService(ServiceLoader.java:1221)
           at java.util.ServiceLoader$LazyClassPathLookupIterator.hasNext(ServiceLoader.java:1265)
           at java.util.ServiceLoader$2.hasNext(ServiceLoader.java:1300)
           at java.util.ServiceLoader$3.hasNext(ServiceLoader.java:1385)
           at javax.xml.parsers.FactoryFinder$1.run(FactoryFinder.java:287)
           at java.security.AccessController.doPrivileged(Native Method)
           at javax.xml.parsers.FactoryFinder.findServiceProvider(FactoryFinder.java:283)
           at javax.xml.parsers.FactoryFinder.find(FactoryFinder.java:261)
           at javax.xml.parsers.SAXParserFactory.newInstance(SAXParserFactory.java:147)
           at org.jdom.input.JAXPParserFactory.createParser(JAXPParserFactory.java:125)
           at jdk.internal.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)
           at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.lang.reflect.Method.invoke(Method.java:566)
           at org.jdom.input.SAXBuilder.createParser(SAXBuilder.java:585)
           at org.jdom.input.SAXBuilder.build(SAXBuilder.java:460)
           at org.jdom.input.SAXBuilder.build(SAXBuilder.java:807)
           at com.aliyun.oss.internal.ResponseParsers.getXmlRootElement(ResponseParsers.java:1015)
           at com.aliyun.oss.internal.ResponseParsers.parseListObjects(ResponseParsers.java:1028)
           at com.aliyun.oss.internal.ResponseParsers$ListObjectsReponseParser.parse(ResponseParsers.java:562)
           at com.aliyun.oss.internal.ResponseParsers$ListObjectsReponseParser.parse(ResponseParsers.java:556)
           at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:152)
           at com.aliyun.oss.internal.OSSOperation.doOperation(OSSOperation.java:113)
           at com.aliyun.oss.internal.OSSBucketOperation.listObjects(OSSBucketOperation.java:421)
           at com.aliyun.oss.OSSClient.listObjects(OSSClient.java:445)
           at org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystemStore.listObjects(AliyunOSSFileSystemStore.java:434)
           at org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem.getFileStatus(AliyunOSSFileSystem.java:273)
           at org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem.create(AliyunOSSFileSystem.java:115)
           at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1118)
           at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1098)
           at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:987)
           at org.apache.iceberg.hadoop.HadoopOutputFile.createOrOverwrite(HadoopOutputFile.java:85)
           at org.apache.iceberg.avro.AvroFileAppender.<init>(AvroFileAppender.java:51)
           at org.apache.iceberg.avro.Avro$WriteBuilder.build(Avro.java:198)
           at org.apache.iceberg.ManifestWriter$V1Writer.newAppender(ManifestWriter.java:277)
           at org.apache.iceberg.ManifestWriter.<init>(ManifestWriter.java:58)
           at org.apache.iceberg.ManifestWriter.<init>(ManifestWriter.java:34)
           at org.apache.iceberg.ManifestWriter$V1Writer.<init>(ManifestWriter.java:256)
           at org.apache.iceberg.ManifestFiles.write(ManifestFiles.java:117)
           at org.apache.iceberg.SnapshotProducer.newManifestWriter(SnapshotProducer.java:370)
           at org.apache.iceberg.MergingSnapshotProducer$DataFileFilterManager.newManifestWriter(MergingSnapshotProducer.java:711)
           at org.apache.iceberg.ManifestFilterManager.filterManifestWithDeletedFiles(ManifestFilterManager.java:383)
           at org.apache.iceberg.ManifestFilterManager.filterManifest(ManifestFilterManager.java:308)
           at org.apache.iceberg.ManifestFilterManager.lambda$filterManifests$0(ManifestFilterManager.java:186)
           at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:405)
           at org.apache.iceberg.util.Tasks$Builder.access$300(Tasks.java:71)
           at org.apache.iceberg.util.Tasks$Builder$1.run(Tasks.java:311)
           at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
           at java.util.concurrent.FutureTask.run(FutureTask.java:264)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
           at java.lang.Thread.run(Thread.java:829)
   ```
   </details>
   
   
   So it looks like we should avoid sharing a thread pool across different jobs.
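
One way this could look, as a rough sketch only and not a proposed patch to Iceberg: a pool that is owned by a single job, pins that job's class loader on its worker threads, and is shut down together with the job. The class `PerJobPool`, its `call` method, and the empty `URLClassLoader` standing in for a job's user-code class loader are hypothetical names for illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class PerJobPool implements AutoCloseable {
  private final ExecutorService pool;

  public PerJobPool(int size, ClassLoader jobLoader) {
    // Every worker gets the job's loader explicitly instead of inheriting
    // whatever loader the first submitting thread happened to have.
    ThreadFactory factory = runnable -> {
      Thread worker = new Thread(runnable);
      worker.setDaemon(true);
      worker.setContextClassLoader(jobLoader);
      return worker;
    };
    this.pool = Executors.newFixedThreadPool(size, factory);
  }

  // Runs the task on a worker of this job's pool and waits for the result.
  public <T> T call(Callable<T> task) throws Exception {
    return pool.submit(task).get();
  }

  // Shut the pool down with the job so no worker outlives the job's loader.
  @Override
  public void close() {
    pool.shutdown();
  }

  public static void main(String[] args) throws Exception {
    try (URLClassLoader jobLoader = new URLClassLoader(new URL[0]);
         PerJobPool jobPool = new PerJobPool(2, jobLoader)) {
      Callable<ClassLoader> task = () -> Thread.currentThread().getContextClassLoader();
      System.out.println("worker uses the job loader: " + (jobPool.call(task) == jobLoader));
    }
  }
}
```

Because the workers are scoped to one job, code that reads the context class loader indirectly, such as `ServiceLoader#load`, would see a loader that is still open for as long as the pool exists.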


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


