Yangze Guo created FLINK-24161:
----------------------------------
Summary: Can not stop the job with savepoint while a task is
finishing
Key: FLINK-24161
URL: https://issues.apache.org/jira/browse/FLINK-24161
Project: Flink
Issue Type: Bug
Components: Runtime / Checkpointing
Affects Versions: 1.14.0
Reporter: Yangze Guo
Fix For: 1.14.0
When stopping the job with a savepoint, if a task is finishing at that moment, the
stop action times out.
Testing job:
https://github.com/KarmaGYZ/flink/blob/test-147/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/wordcount/WordCount.java
Flink conf:
{code:bash}
state.savepoints.dir: /tmp/flink-savepoints
state.backend: rocksdb
state.backend.incremental: true
state.checkpoints.dir: file:///tmp/flink-ckp/
execution.checkpointing.aligned-checkpoint-timeout: 30 s
execution.checkpointing.interval: 5 s
taskmanager.numberOfTaskSlots: 2
{code}
How to reproduce:
{code:bash}
bin/flink run -d -p 4 examples/streaming/WordCount.jar
# while one task is finishing
bin/flink stop $JOB_ID
{code}
Client log:
{code:bash}
------------------------------------------------------------
The program finished with the following exception:
org.apache.flink.util.FlinkException: Could not stop with a savepoint job "e139a2eba7f8dc0b07fab65e84421ee4".
at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:581)
at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1002)
at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:569)
at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1069)
at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132)
at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28)
at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)
Caused by: java.util.concurrent.TimeoutException
at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:579)
... 6 more
{code}
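For readers unfamiliar with the trace above: the TimeoutException comes from a timed CompletableFuture.get on the client side, which suggests the client never received the stop-with-savepoint result before its timeout elapsed. The following is only a minimal sketch of that waiting pattern, not the actual CliFrontend code. It uses no Flink classes, and the class name StopTimeoutSketch, the method awaitSavepoint, and the 200 ms timeout are hypothetical names and values chosen for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for a blocking client; not part of Flink.
final class StopTimeoutSketch {

    // Blocks until the savepoint path arrives or the timeout elapses.
    static String awaitSavepoint(CompletableFuture<String> savepointFuture, long timeoutMillis) {
        try {
            return savepointFuture.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Wrap the failure, similar in shape to the client error in the log above.
            throw new IllegalStateException("Could not stop with a savepoint job.", e);
        } catch (InterruptedException e) {
            // Restore the interrupt flag before reporting the failure.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the savepoint.", e);
        }
    }

    public static void main(String[] args) {
        // A future that is never completed stands in for a stop request
        // that the cluster never answers (assumption for illustration).
        CompletableFuture<String> neverCompleted = new CompletableFuture<>();
        try {
            awaitSavepoint(neverCompleted, 200L);
            System.out.println("Unexpected: savepoint completed");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()
                    + " Caused by: " + e.getCause().getClass().getName());
        }
    }
}
```

The sketch only shows why the client surfaces a bare TimeoutException as the cause. It says nothing about why the savepoint does not complete while a task is finishing, which is the actual subject of this issue.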