Hi,

The above exception may be caused by either the savepoint timing out or
the job termination timing out.
To distinguish between these two cases, could you please check the
status of the savepoint and of the tasks in the Flink Web UI? IIUC, the
job is still running after you get this exception on the client.
Could you also check if there are any exceptions in "Exceptions
history" or in the logs?
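
If the Web UI is inconvenient to reach, the same information should be
available from the JobManager REST API. Below is a rough sketch, not a
tested tool: it assumes the `/jobs/<jobid>` and `/jobs/<jobid>/checkpoints`
endpoints, and the JSON field names it reads (`state`, `vertices`,
`history`, `is_savepoint`, `status`, `trigger_timestamp`) are written from
memory of the REST API docs, so please verify them against 1.12. The base
URL and the helper names are placeholders.

```python
import json
import urllib.request


def fetch_json(base_url, path):
    """GET a JobManager REST path and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}{path}", timeout=10) as resp:
        return json.load(resp)


def summarize(job, checkpoints):
    """Summarize job/task states and the most recently triggered savepoint.

    `job` is the payload of /jobs/<jobid>, `checkpoints` the payload of
    /jobs/<jobid>/checkpoints. Field names are assumptions (see above).
    """
    savepoints = [c for c in checkpoints.get("history", []) if c.get("is_savepoint")]
    latest = max(savepoints, key=lambda c: c.get("trigger_timestamp", 0), default=None)
    return {
        "job_state": job.get("state"),
        "task_states": {v.get("name"): v.get("status") for v in job.get("vertices", [])},
        "latest_savepoint_status": latest.get("status") if latest else None,
    }


# Example usage (placeholder URL; adjust to your deployment):
#   base = "http://localhost:8081"
#   job_id = "5d6100984035db9541e9f08ecbd311bf"
#   print(summarize(fetch_json(base, f"/jobs/{job_id}"),
#                   fetch_json(base, f"/jobs/{job_id}/checkpoints")))
```

A savepoint that stays IN_PROGRESS while the job is RUNNING would point at
the savepoint itself timing out; a COMPLETED savepoint with tasks that never
finish would point at job termination instead.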

Regards,
Roman

On Mon, Sep 27, 2021 at 6:49 AM Marco Villalobos
<mvillalo...@kineteque.com> wrote:
>
> Today, I kept on receiving a timeout exception when stopping my job with a 
> savepoint.
> This happened with Flink version 1.12.2 running in EMR.
>
> I had to use the deprecated cancel with savepoint feature instead.
>
> In fact, stopping with a savepoint, creating a savepoint, and cancelling with 
> a savepoint all gave me the timeout exception.
>
> However, the cancel with savepoint started creating a savepoint on the 
> cluster.
>
> The program finished with the following exception:
>
> org.apache.flink.util.FlinkException: Could not stop with a savepoint job "5d6100984035db9541e9f08ecbd311bf".
> at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:585)
> at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1006)
> at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:573)
> at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1073)
> at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1136)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
> at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1136)
> Caused by: java.util.concurrent.TimeoutException
> at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784)
> at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
> at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:583)
> ... 9 more
>
>
>
