yittg commented on issue #3776: URL: https://github.com/apache/iceberg/issues/3776#issuecomment-999427065
To be clear, I encountered this issue after putting the Iceberg Flink runtime jar into the `lib` directory, as [Flink suggests](https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/ops/debugging/debugging_classloading/#avoiding-dynamic-classloading-for-user-code).

Next I tested the other scenario: submitting the Iceberg jar as user code. As expected, the first several jobs run fine, and then subsequent jobs fail with a metaspace OOM:

```
java.lang.OutOfMemoryError: Metaspace. The metaspace out-of-memory error has occurred. This can mean two things: either the job requires a larger size of JVM metaspace to load classes or there is a class loading leak. In the first case 'taskmanager.memory.jvm-metaspace.size' configuration option should be increased. If the error persists (usually in cluster after several job (re-)submissions) then there is probably a class loading leak in user code or some of its dependencies which has to be investigated and fixed. The task executor has to be shutdown...
```

You can also see many `iceberg-worker-pool-0` threads, because the worker pool cannot be released along with the loaded user classes.

So we either need to pass an `ExecutorService` to the tasks from the Flink context, or provide a mechanism to release these threads after the job completes, for example by using `RuntimeContext#registerUserCodeClassLoaderReleaseHookIfAbsent`.
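A minimal sketch of the release-hook option, not a proposed patch. To keep it runnable without Flink on the classpath, `ReleaseHookRegistry` and `SimpleRegistry` are hypothetical stand-ins that only mirror the shape of `RuntimeContext#registerUserCodeClassLoaderReleaseHookIfAbsent(String, Runnable)`; `newWorkerPool` and the hook name are likewise illustrative and are not existing Iceberg code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class WorkerPoolReleaseSketch {

  /** Hypothetical stand-in for the one RuntimeContext method used here. */
  interface ReleaseHookRegistry {
    void registerUserCodeClassLoaderReleaseHookIfAbsent(String name, Runnable hook);
  }

  /** Toy registry: runs the registered hooks when the "user classloader" is released. */
  static final class SimpleRegistry implements ReleaseHookRegistry {
    private final Map<String, Runnable> hooks = new LinkedHashMap<>();

    @Override
    public synchronized void registerUserCodeClassLoaderReleaseHookIfAbsent(
        String name, Runnable hook) {
      // "IfAbsent": only the first hook registered under a given name is kept.
      hooks.putIfAbsent(name, hook);
    }

    synchronized void release() {
      hooks.values().forEach(Runnable::run);
      hooks.clear();
    }
  }

  /**
   * Creates a worker pool and ties its shutdown to the release of the user classloader.
   * Callers should use a distinct hook name per pool, otherwise later pools are not
   * covered by any hook and their threads would still leak.
   */
  static ExecutorService newWorkerPool(
      ReleaseHookRegistry registry, String hookName, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    registry.registerUserCodeClassLoaderReleaseHookIfAbsent(hookName, pool::shutdownNow);
    return pool;
  }

  public static void main(String[] args) throws Exception {
    SimpleRegistry registry = new SimpleRegistry();
    ExecutorService pool = newWorkerPool(registry, "iceberg-worker-pool", 2);

    pool.submit(() -> {}).get(); // the job does some work on the pool

    registry.release();          // job finished: user classloader is released
    System.out.println("pool terminated: " + pool.awaitTermination(5, TimeUnit.SECONDS));
  }
}
```

In a real Flink operator the registry would presumably be the `RuntimeContext` itself, so that the pool's threads stop holding references to classes loaded by the user classloader once the job ends.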
