When the job shuts down, the user code classloader is cleared, so no new
classes can be loaded afterwards. The LoadingCache in JobBundleFactory attempts
to load the RemovalCause class after job shutdown in order to evict its entries.
This fails with the following exception, which prevents cleanup of Docker containers:
```
2018-09-06 15:37:07,996 ERROR org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory - Unable to close.
java.lang.NoClassDefFoundError: org/apache/beam/repackaged/beam_runners_java_fn_execution/com/google/common/cache/RemovalCause
    at org.apache.beam.repackaged.beam_runners_java_fn_execution.com.google.common.cache.LocalCache$Segment.clear(LocalCache.java:3290)
    at org.apache.beam.repackaged.beam_runners_java_fn_execution.com.google.common.cache.LocalCache.clear(LocalCache.java:4322)
    at org.apache.beam.repackaged.beam_runners_java_fn_execution.com.google.common.cache.LocalCache$LocalManualCache.invalidateAll(LocalCache.java:4937)
    at org.apache.beam.runners.fnexecution.control.JobBundleFactoryBase.close(JobBundleFactoryBase.java:186)
    at org.apache.beam.runners.flink.translation.functions.FlinkBatchExecutableStageContext.close(FlinkBatchExecutableStageContext.java:68)
    at org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.closeActual(ReferenceCountingFlinkExecutableStageContextFactory.java:186)
    at org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory$WrappedContext.access$200(ReferenceCountingFlinkExecutableStageContextFactory.java:162)
    at org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory.release(ReferenceCountingFlinkExecutableStageContextFactory.java:150)
    at org.apache.beam.runners.flink.translation.functions.ReferenceCountingFlinkExecutableStageContextFactory.lambda$scheduleRelease$1(ReferenceCountingFlinkExecutableStageContextFactory.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.beam.repackaged.beam_runners_java_fn_execution.com.google.common.cache.RemovalCause
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders$ChildFirstClassLoader.loadClass(FlinkUserCodeClassLoaders.java:129)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 16 more
```
The solution for now is to attempt to clean up the cache synchronously when
closing the JobBundleFactory.
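
As a rough illustration of the idea (not the actual change in this PR), the sketch below closes every cached entry directly inside `close()` instead of relying on the cache's own eviction path. `SyncCloseBundleFactory`, `Environment`, and `forStage` are hypothetical names, and a plain `ConcurrentHashMap` stands in for the Guava cache so the example needs only the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a bundle factory that caches closeable environments. */
public class SyncCloseBundleFactory implements AutoCloseable {

  /** Hypothetical cached resource, e.g. a handle to a Docker container. */
  public static final class Environment implements AutoCloseable {
    private final String id;
    private volatile boolean closed;

    Environment(String id) {
      this.id = id;
    }

    public String id() {
      return id;
    }

    public boolean isClosed() {
      return closed;
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  // Assumption: a plain map replaces the real LoadingCache for this sketch.
  private final Map<String, Environment> cache = new ConcurrentHashMap<>();

  /** Returns the cached environment for a stage, creating it on first use. */
  public Environment forStage(String stageId) {
    return cache.computeIfAbsent(stageId, Environment::new);
  }

  public int size() {
    return cache.size();
  }

  /**
   * Closes all cached environments on the calling thread before returning,
   * so no cleanup work is left to run after the caller has shut down.
   */
  @Override
  public void close() {
    // Snapshot the entries, then empty the map so nothing is closed twice.
    List<Environment> environments = new ArrayList<>(cache.values());
    cache.clear();

    RuntimeException first = null;
    for (Environment environment : environments) {
      try {
        environment.close();
      } catch (RuntimeException e) {
        // Keep closing the remaining entries; report all failures at the end.
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }
}
```

Because the cleanup runs on the thread that calls `close()`, it finishes while that caller's classes are still loadable, which is the property the fix depends on.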
CC @angoenka @tweise
[ Full content available at: https://github.com/apache/beam/pull/6342 ]