Thanks.

I tried this command and it worked.

> flink stop -p s3a://path_to_savepoint/savepoints 5f9241d336ea2c652a84f79ac3158597 -yid application_1620673166934_0001


I will also look at "client.timeout" to figure out what actually
happened.
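
For anyone hitting the same exception, here is a minimal sketch of raising
the client-side timeout in flink-conf.yaml on the machine where the flink
CLI runs. The 600 s value is only an assumption for illustration; pick
something longer than the savepoint actually takes.

```yaml
# flink-conf.yaml, as read by the CLI client
# Assumed value for illustration only; the key itself is the "client.timeout"
# option mentioned in the reply quoted below.
client.timeout: 600 s
```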

Thanks.

On Tue, May 11, 2021 at 3:04 AM Chesnay Schepler <ches...@apache.org> wrote:

> Essentially this exception just means that the savepoint operation took
> longer than the CLI expected.
>
> This can occur for a number of reasons; maybe everything is working as
> expected but the timeout is just too low (controlled via "client.timeout").
> It could also be that the savepoint operation takes abnormally long; for
> example due to IO bottlenecks.
>
> I suggest looking into the JobManager logs to see whether the savepoint
> was actually created and the application shut down, and if so, maybe just
> increase the timeouts.
>
> On 5/11/2021 9:06 AM, Diwakar Jha wrote:
>
> Hello,
>
> I'm trying to use the Flink 1.11 stop command to gracefully
> shut down an application with a savepoint.
>
>> flink stop --savepointPath s3a://path_to_save_point c5d52e0146258f80fd52a3bf002d2a1b -yid application_1620673166934_0001
>>
>
>> 2021-05-11 06:26:57,852 ERROR org.apache.flink.client.cli.CliFrontend [] - Error while running the command.
>> org.apache.flink.util.FlinkException: Could not stop with a savepoint job "c5d52e0146258f80fd52a3bf002d2a1b".
>>     at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:495) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
>>     at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:864) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
>>     at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:487) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
>>     at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:931) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
>>     at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:992) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
>>     at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_252]
>>     at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_252]
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) [hadoop-common-3.2.1-amzn-1.jar:?]
>>     at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) [flink-dist_2.12-1.11.0.jar:1.11.0]
>>     at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:992) [flink-dist_2.12-1.11.0.jar:1.11.0]
>> Caused by: java.util.concurrent.TimeoutException
>>     at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) ~[?:1.8.0_252]
>>     at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) ~[?:1.8.0_252]
>>     at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:493) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
>>     ... 9 more
>>
>
> The cancel command seems to be working fine.
> Please let me know how to fix this TimeoutException.
>
> Thanks.
>
>
>
