Hi,
Is your job running on YARN? If so, you need to specify -yid (the YARN application ID) when triggering the savepoint.
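
For reference, here is a rough sketch of what the command might look like with -yid added. The job ID and savepoint directory are taken from your original mail; the YARN application ID below is a made-up placeholder (an assumption, not from your cluster), so replace it with the real one, for example from the output of `yarn application -list`.

```shell
# Job ID and target directory as given in the original mail.
JOB_ID="a3a2e6c3a5a00bbe4c0c9e351dc58c47"
SAVEPOINT_DIR="hdfs://hadoopnamenodeHA/flink/flink-savepoints"

# Hypothetical placeholder: substitute the application ID of the
# YARN session/cluster that is actually running the job.
YARN_APP_ID="application_0000000000000_0000"

# Assemble the savepoint command with the YARN application ID attached.
CMD="flink savepoint $JOB_ID $SAVEPOINT_DIR -yid $YARN_APP_ID"

# Print the command so it can be checked before running it by hand.
echo "$CMD"
```

This only prints the assembled command. Once the application ID is correct, run the printed line on the machine where the Flink client is installed.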

> On Nov 6, 2020, at 1:31 PM, Congxian Qiu <qcx978132...@gmail.com> wrote:
> 
> Hi,
>     Can you see any other exceptions in the client-side log or the JM log?
> Best,
> Congxian
> 
> 
> 张锴 <zk357794...@gmail.com> wrote on Fri, Nov 6, 2020 at 11:42 AM:
> 
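>> There are no abnormal restarts and no abnormal backpressure.
>> I also increased the timeout from the client to the master, but the problem persists.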
>> 
>> hailongwang <18868816...@163.com> wrote on Fri, Nov 6, 2020 at 10:54:
>> 
>>> Hi,
>>> 
>>> 
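>>> This error only means that the savepoint did not complete within the allotted time, so the client's connection to the Master timed out.
>>> To find the actual cause, you need to look at the JobMaster log.
>>> PS: When a job keeps restarting or is under backpressure, savepoints generally fail.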
>>> 
>>> 
>>> Best,
>>> Hailong Wang
>>> 
>>> 
>>> 
>>> 
>>> At 2020-11-06 09:33:48, "张锴" <zk357794...@gmail.com> wrote:
>>>> I ran into an error while taking a snapshot with flink savepoint, and I don't yet know what is causing it. Could anyone passing by take a look?
>>>> 
>>>> Flink version: 1.10.1
>>>> 
>>>> I ran:
>>>> 
>>>> flink savepoint a3a2e6c3a5a00bbe4c0c9e351dc58c47 hdfs://hadoopnamenodeHA/flink/flink-savepoints
>>>> 
>>>> The following error appeared:
>>>> 
>>>> org.apache.flink.util.FlinkException: Triggering a savepoint for the job a3a2e6c3a5a00bbe4c0c9e351dc58c47 failed.
>>>>     at org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:631)
>>>>     at org.apache.flink.client.cli.CliFrontend.lambda$savepoint$9(CliFrontend.java:609)
>>>>     at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:841)
>>>>     at org.apache.flink.client.cli.CliFrontend.savepoint(CliFrontend.java:606)
>>>>     at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:908)
>>>>     at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:966)
>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>     at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
>>>>     at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>>>>     at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:966)
>>>> Caused by: java.util.concurrent.TimeoutException
>>>>     at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
>>>>     at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
>>>>     at org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:625)
>>> 
>> 
