[
https://issues.apache.org/jira/browse/FLINK-31092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693779#comment-17693779
]
luoyuxia commented on FLINK-31092:
----------------------------------
While analyzing the heap dump, I found that most of the objects are
`ServiceLoaderUtil#LoadResult(IllegalStateException)`, and the exception
message is `Trying to access closed classloader.xxxx`. I think that's the
cause of the OOM.
From the code:
{code:java}
static <T> List<LoadResult<T>> load(Class<T> clazz, ClassLoader classLoader) {
    List<LoadResult<T>> loadResults = new ArrayList<>();
    Iterator<T> serviceLoaderIterator = ServiceLoader.load(clazz, classLoader).iterator();
    while (true) {
        try {
            T next = serviceLoaderIterator.next();
            loadResults.add(new LoadResult<>(next));
        } catch (NoSuchElementException e) {
            break;
        } catch (Throwable t) {
            loadResults.add(new LoadResult<>(t));
        }
    }
    return loadResults;
}
{code}
It seems the loop never terminates when `serviceLoaderIterator.next()` keeps
throwing an exception other than `NoSuchElementException`: every iteration
adds another `LoadResult` to the list until the OOM occurs.
The stack trace where the OOM happens is as follows:
{code:java}
at java.lang.OutOfMemoryError.<init>(OutOfMemoryError.java:48)
at org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.ensureInner(FlinkUserCodeClassLoaders.java:179)
at org.apache.flink.util.FlinkUserCodeClassLoaders$SafetyNetWrapperClassLoader.getResources(FlinkUserCodeClassLoaders.java:213)
at java.util.ServiceLoader$LazyIterator.hasNextService(ServiceLoader.java:348)
at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:364)
at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
at org.apache.flink.table.factories.ServiceLoaderUtil.load(ServiceLoaderUtil.java:42)
at org.apache.flink.table.factories.FactoryUtil.discoverFactories(FactoryUtil.java:805)
at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:524)
at org.apache.flink.table.factories.PlannerFactoryUtil.createPlanner(PlannerFactoryUtil.java:45)
at org.apache.flink.table.gateway.service.operation.OperationExecutor.createStreamTableEnvironment(OperationExecutor.java:375)
at org.apache.flink.table.gateway.service.operation.OperationExecutor.getTableEnvironment(OperationExecutor.java:332)
at org.apache.flink.table.gateway.service.operation.OperationExecutor.executeStatement(OperationExecutor.java:190)
at org.apache.flink.table.gateway.service.SqlGatewayServiceImpl.lambda$executeStatement$1(SqlGatewayServiceImpl.java:212)
at org.apache.flink.table.gateway.service.SqlGatewayServiceImpl$$Lambda$1007.apply(<unknown string>)
at org.apache.flink.table.gateway.service.operation.OperationManager.lambda$submitOperation$1(OperationManager.java:110)
at org.apache.flink.table.gateway.service.operation.OperationManager$$Lambda$1008.call(<unknown string>)
at org.apache.flink.table.gateway.service.operation.OperationManager$Operation.lambda$run$0(OperationManager.java:242)
at org.apache.flink.table.gateway.service.operation.OperationManager$Operation$$Lambda$1010.run(<unknown string>)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}
[~fsk119] Could you please have a look? Is it possible that a closed
classloader is being accessed here?
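To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not Flink code and not necessarily the fix that should be adopted: `BoundedServiceLoading`, `drain`, `alwaysFailing`, and the `maxConsecutiveFailures` parameter are hypothetical names, and the nested `LoadResult` is only a stand-in for `ServiceLoaderUtil.LoadResult`. It assumes that an iterator which fails several times in a row is not making progress, so the loop gives up instead of collecting failures forever.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class BoundedServiceLoading {

    /** Hypothetical stand-in for ServiceLoaderUtil.LoadResult: a service or the error hit while loading it. */
    static final class LoadResult<T> {
        final T service;
        final Throwable error;

        LoadResult(T service, Throwable error) {
            this.service = service;
            this.error = error;
        }

        boolean hasFailed() {
            return error != null;
        }
    }

    /**
     * Same loop shape as the quoted load() method, but it stops once
     * maxConsecutiveFailures failures occur back to back (an assumed policy).
     */
    static <T> List<LoadResult<T>> drain(Iterator<T> iterator, int maxConsecutiveFailures) {
        List<LoadResult<T>> results = new ArrayList<>();
        int consecutiveFailures = 0;
        while (true) {
            try {
                T next = iterator.next();
                results.add(new LoadResult<>(next, null));
                consecutiveFailures = 0; // progress was made, reset the counter
            } catch (NoSuchElementException e) {
                break; // regular end of iteration
            } catch (Throwable t) {
                results.add(new LoadResult<>(null, t));
                consecutiveFailures++;
                if (consecutiveFailures >= maxConsecutiveFailures) {
                    break; // the iterator does not seem to advance any more
                }
            }
        }
        return results;
    }

    /** Simulates an iterator backed by a closed classloader: next() fails on every call. */
    static <T> Iterator<T> alwaysFailing() {
        return new Iterator<T>() {
            @Override
            public boolean hasNext() {
                return true;
            }

            @Override
            public T next() {
                throw new IllegalStateException("Trying to access closed classloader.");
            }
        };
    }

    public static void main(String[] args) {
        List<LoadResult<String>> failed = drain(BoundedServiceLoading.<String>alwaysFailing(), 3);
        System.out.println("results from failing iterator: " + failed.size());

        List<LoadResult<String>> ok = drain(new ArrayList<>(Arrays.asList("a", "b")).iterator(), 3);
        System.out.println("results from healthy iterator: " + ok.size());
    }
}
```

With the unbounded loop from `load()`, the `alwaysFailing()` iterator would keep appending `LoadResult` objects until the heap is exhausted. Whether to stop silently, as done here, or to rethrow the last error is a design choice the sketch leaves open.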
> Hive ITCases fail with OutOfMemoryError
> ---------------------------------------
>
> Key: FLINK-31092
> URL: https://issues.apache.org/jira/browse/FLINK-31092
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Hive
> Affects Versions: 1.17.0
> Reporter: Matthias Pohl
> Assignee: luoyuxia
> Priority: Critical
> Labels: test-stability
> Attachments: VisualVM-FLINK-31092.png
>
>
> We're experiencing an OutOfMemoryError where the heap space reaches the upper
> limit:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=46161&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461&l=23142
> {code}
> Feb 15 05:05:14 [INFO] Running org.apache.flink.table.catalog.hive.HiveCatalogITCase
> Feb 15 05:05:17 [INFO] java.lang.OutOfMemoryError: Java heap space
> Feb 15 05:05:17 [INFO] Dumping heap to java_pid9669.hprof ...
> Feb 15 05:05:28 [INFO] Heap dump file created [1957090051 bytes in 11.718 secs]
> java.lang.OutOfMemoryError: Java heap space
> at org.apache.maven.surefire.booter.ForkedBooter.cancelPingScheduler(ForkedBooter.java:209)
> at org.apache.maven.surefire.booter.ForkedBooter.acknowledgedExit(ForkedBooter.java:419)
> at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:186)
> at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:562)
> at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:548)
> {code}