[
https://issues.apache.org/jira/browse/FLINK-20261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Till Rohrmann updated FLINK-20261:
----------------------------------
Component/s: Connectors / Common
> Uncaught exception in ExecutorNotifier due to split assignment broken by
> failed task
> ------------------------------------------------------------------------------------
>
> Key: FLINK-20261
> URL: https://issues.apache.org/jira/browse/FLINK-20261
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Common, Connectors / FileSystem
> Affects Versions: 1.12.0
> Reporter: Andrey Zagrebin
> Priority: Blocker
> Fix For: 1.12.0
>
>
> While trying to extend
> FileSourceTextLinesITCase::testContinuousTextFileSource with recovery test
> after TM failure (TestingMiniCluster::terminateTaskExecutor,
> [branch|https://github.com/azagrebin/flink/tree/FLINK-20118-it]) in
> FLINK-20118, I encountered the following case:
> * SourceCoordinatorContext::assignSplits schedules async assignment (all
> reader tasks alive)
> * call TestingMiniCluster::terminateTaskExecutor while doing writeFile in a
> loop of testContinuousTextFileSource
> * causes graceful TaskExecutor::onStop shutdown
> * causes TM/RM disconnect and failing slot allocations in JM by RM
> * eventually causes SourceCoordinatorContext::unregisterSourceReader
> * actual assignment starts (SourceCoordinatorContext::assignSplits:
> callInCoordinatorThread)
> * registeredReaders.containsKey(subtaskId) check fails (due to failed task)
> with IllegalArgumentException which is uncaught in single thread executor
> * forces ThreadPool to recreate the single thread
> * calls CoordinatorExecutorThreadFactory::newThread
> * fails expected condition of single thread creation with
> IllegalStateException which is uncaught
> * calls FatalExitExceptionHandler and exits JVM abruptly
> {code:java}
> [SourceCoordinator-Source: file-source] ERROR
> org.apache.flink.runtime.util.FatalExitExceptionHandler - FATAL: Thread
> 'SourceCoordinator-Source: file-source' produced an uncaught exception.
> Stopping the process...
> java.lang.IllegalStateException: Should never happen. This factory should
> only be used by a SingleThreadExecutor.
> at
> org.apache.flink.runtime.source.coordinator.SourceCoordinatorProvider$CoordinatorExecutorThreadFactory.newThread(SourceCoordinatorProvider.java:94)
> ~[classes/:?]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.<init>(ThreadPoolExecutor.java:619)
> ~[?:1.8.0_172]
> at
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:932)
> ~[?:1.8.0_172]
> at
> java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1025)
> ~[?:1.8.0_172]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
> ~[?:1.8.0_172]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> ~[?:1.8.0_172]
> at java.lang.Thread.run(Thread.java:748) [?:1.8.0_172]
> Process finished with exit code 239
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)