Thanks. I tried this command and it worked.
> flink stop -p s3a://path_to_savepoint/savepoints 5f9241d336ea2c652a84f79ac3158597 -yid application_1620673166934_0001

I will look at the "client.timeout" also to figure out what actually happened.

Thanks.

On Tue, May 11, 2021 at 3:04 AM Chesnay Schepler <ches...@apache.org> wrote:

> Essentially this exception just means that the savepoint operation took
> longer than the CLI expected.
>
> This can occur for a number of reasons; maybe everything is working as
> expected but the timeout is just too low (controlled via "client.timeout").
> It could also be that the savepoint operation takes abnormally long; for
> example due to IO bottlenecks.
>
> I suggest to look into the JobManager logs to see whether the savepoint
> was actually created / the application shut down, and if so then maybe just
> increase the timeouts.
>
> On 5/11/2021 9:06 AM, Diwakar Jha wrote:
>
> Hello,
>
> I'm trying to use the flink 1.11 stop command to gracefully
> shutdown application with savepoint.
>
>> flink stop --savepointPath s3a://path_to_save_point c5d52e0146258f80fd52a3bf002d2a1b -yid application_1620673166934_0001
>
>> 2021-05-11 06:26:57,852 ERROR org.apache.flink.client.cli.CliFrontend [] - Error while running the command.
>> org.apache.flink.util.FlinkException: Could not stop with a savepoint job "c5d52e0146258f80fd52a3bf002d2a1b".
>>     at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:495) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
>>     at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:864) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
>>     at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:487) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
>>     at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:931) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
>>     at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:992) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
>>     at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_252]
>>     at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_252]
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) [hadoop-common-3.2.1-amzn-1.jar:?]
>>     at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) [flink-dist_2.12-1.11.0.jar:1.11.0]
>>     at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:992) [flink-dist_2.12-1.11.0.jar:1.11.0]
>> Caused by: java.util.concurrent.TimeoutException
>>     at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) ~[?:1.8.0_252]
>>     at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) ~[?:1.8.0_252]
>>     at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:493) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
>>     ... 9 more
>
> Cancel command seems to be working fine.
> Please let me know how to fix this TimeoutException.
>
> Thanks.
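
For anyone hitting the same TimeoutException: the "client.timeout" option mentioned in the thread can be raised in the flink-conf.yaml that the CLI reads. This is only a sketch. The 600 s value is an arbitrary illustration and does not come from the thread, and it assumes the savepoint itself eventually completes (check the JobManager logs first, as suggested in the thread).

```yaml
# flink-conf.yaml on the machine where `flink stop` is run.
# Assumed value for illustration only; choose one that fits how long
# your savepoints actually take.
client.timeout: 600 s
```

A longer timeout only changes how long the CLI waits for the stop-with-savepoint result; it does not make a slow savepoint (for example one held up by IO bottlenecks) any faster.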