[ https://issues.apache.org/jira/browse/FLINK-21053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Weijie Guo updated FLINK-21053:
-------------------------------
    Affects Version/s: 2.1.0

> Prevent potential RejectedExecutionExceptions in CheckpointCoordinator failing JM
> ---------------------------------------------------------------------------------
>
>                 Key: FLINK-21053
>                 URL: https://issues.apache.org/jira/browse/FLINK-21053
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing
>    Affects Versions: 2.1.0
>            Reporter: Roman Khachatryan
>            Priority: Minor
>              Labels: auto-unassigned
>             Fix For: 2.0.0
>
>
> In the past, there were multiple bugs caused by throwing/handling
> RejectedExecutionException in CheckpointCoordinator (FLINK-18290,
> FLINK-20992).
>
> And I think it's still possible as there are many places where an executor is
> passed to calls to CompletableFuture.xxxAsync while it can already be shut
> down.
>
> In FLINK-20992 we discussed two approaches to fix this.
> One approach is to check executor state inside a synchronized block every
> time when it is used.
> Second approach is to
> # Create executors inside CheckpointCoordinator (both io & timer thread
> pools)
> # Check isShutdown() in their RejectedExecution handlers (if yes and it's
> RejectedExecutionException then just log; otherwise delegate to
> FatalExitExceptionHandler)
> # (this will allow to remove such RejectedExecutionException checks from
> coordinator code)

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
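
The second approach in the issue description could look roughly like the sketch below. It is an illustration only, not code from Flink: `ShutdownAwareRejectionHandler` and `CoordinatorExecutors` are hypothetical names, and a plain `Thread.UncaughtExceptionHandler` stands in for FatalExitExceptionHandler. The executor is created with a rejection handler that only logs when the pool is already shut down and escalates in every other case.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical handler (not a Flink class): tolerate rejections after shutdown. */
final class ShutdownAwareRejectionHandler implements RejectedExecutionHandler {
    /** Counts tasks dropped because the executor was already shut down. */
    final AtomicInteger ignoredAfterShutdown = new AtomicInteger();

    /** Assumed stand-in for a fatal-exit handler; supplied by the caller. */
    private final Thread.UncaughtExceptionHandler fatalHandler;

    ShutdownAwareRejectionHandler(Thread.UncaughtExceptionHandler fatalHandler) {
        this.fatalHandler = fatalHandler;
    }

    @Override
    public void rejectedExecution(Runnable task, ThreadPoolExecutor executor) {
        if (executor.isShutdown()) {
            // Expected while the coordinator is shutting down: log and drop the task.
            ignoredAfterShutdown.incrementAndGet();
            System.err.println("Ignoring task submitted after shutdown: " + task);
        } else {
            // The pool is still running, so the rejection is unexpected: escalate.
            fatalHandler.uncaughtException(
                    Thread.currentThread(),
                    new RejectedExecutionException("Task rejected by running executor: " + task));
        }
    }
}

/** Hypothetical factory showing the executor being created by its owner. */
final class CoordinatorExecutors {
    private CoordinatorExecutors() {}

    static ThreadPoolExecutor newIoExecutor(RejectedExecutionHandler handler) {
        // Single-threaded pool with an unbounded queue; sizes are arbitrary here.
        return new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(), handler);
    }
}
```

With such a handler installed, a call like `CompletableFuture.runAsync(task, executor)` on a shut-down pool no longer throws RejectedExecutionException. One consequence of this design is that the returned future is never completed, so callers should not block on it after shutdown.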