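Hi SmileSmile,

I have run into a similar host resolution problem before. I would suggest investigating it from the angle of how the pod IPs are mapped to hostnames in your Kubernetes network. I hope this helps.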
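As a rough way to check this, one could run a small reverse-lookup test from inside the JobManager pod. The sketch below assumes the warning is logged when the reverse lookup of a TaskManager IP returns nothing better than the IP itself; the class name `ReverseDnsCheck` and its helpers are hypothetical names of my own, not part of Flink. It prints, for each IP, whether a hostname came back and how long the lookup took. If each lookup is slow, that may be worth looking at as a possible cause of the timeout with 30 TaskManagers, though I have not confirmed that for your setup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; not part of Flink.
final class ReverseDnsCheck {

    // True when the reverse lookup produced nothing better than the IP literal itself.
    static boolean isUnresolved(String hostAddress, String canonicalName) {
        return canonicalName == null || canonicalName.equals(hostAddress);
    }

    // Performs one reverse lookup and reports the result together with its duration.
    static String describe(String ip) throws UnknownHostException {
        // For an IP literal, getByName only parses the address and sends no DNS query.
        InetAddress address = InetAddress.getByName(ip);
        long start = System.nanoTime();
        // This call performs the reverse lookup; it falls back to the IP string on failure.
        String canonical = address.getCanonicalHostName();
        long millis = (System.nanoTime() - start) / 1_000_000;
        String status = isUnresolved(address.getHostAddress(), canonical) ? "UNRESOLVED" : canonical;
        return ip + " -> " + status + " (" + millis + " ms)";
    }

    public static void main(String[] args) throws UnknownHostException {
        // Defaults are the IPs from the quoted log; pass other IPs as arguments.
        String[] ips = args.length > 0
                ? args
                : new String[] {"10.32.160.7", "10.44.224.7", "10.40.32.9"};
        for (String ip : ips) {
            System.out.println(describe(ip));
        }
    }
}
```

Running it with the TaskManager pod IPs as arguments shows which of them have no reverse DNS entry and whether the lookups themselves are slow.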
Best regards,
Roc Marshal

At 2020-07-15 17:04:18, "SmileSmile" <a511955...@163.com> wrote:
>
>Hi
>
>I am using Flink 1.11, deployed as a Kubernetes session cluster, with 30 TMs and 4 slots per TM. The job parallelism is 120. When I submit the job, a large number of "No hostname could be resolved for the IP address" warnings appear, the JM times out, and the job submission fails. The web UI also hangs and stops responding.
>
>When I submit WordCount with a parallelism of only 1, the "No hostname" warning is still logged a few times, and then the job is submitted normally. As soon as the parallelism goes up, the submission times out.
>
>Part of the log:
>
>2020-07-15 16:58:46,460 WARN org.apache.flink.runtime.taskmanager.TaskManagerLocation [] - No hostname could be resolved for the IP address 10.32.160.7, using IP address as host name. Local input split assignment (such as for HDFS files) may be impacted.
>2020-07-15 16:58:46,460 WARN org.apache.flink.runtime.taskmanager.TaskManagerLocation [] - No hostname could be resolved for the IP address 10.44.224.7, using IP address as host name. Local input split assignment (such as for HDFS files) may be impacted.
>2020-07-15 16:58:46,461 WARN org.apache.flink.runtime.taskmanager.TaskManagerLocation [] - No hostname could be resolved for the IP address 10.40.32.9, using IP address as host name. Local input split assignment (such as for HDFS files) may be impacted.
>
>2020-07-15 16:59:10,236 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - The heartbeat of JobManager with id 69a0d460de468888a9f41c770d963c0a timed out.
>2020-07-15 16:59:10,236 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Disconnect job manager 00000000000000000000000000000...@akka.tcp://flink@flink-jobmanager:6123/user/rpc/jobmanager_2 for job e1554c737e37ed79688a15c746b6e9ef from the resource manager.
>
>How should I deal with this?
>
>Best!
>
>a511955993
>Email: a511955...@163.com