Hi

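I'm using Flink 1.11, deployed as a Kubernetes session cluster, with 30 TaskManagers and 4 slots per TaskManager. The job parallelism is 120. When I submit the job, a large number of "No hostname could be resolved for the IP address" warnings are logged, the JobManager times out, and the submission fails. The web UI also hangs and becomes unresponsive.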

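Even submitting WordCount with a parallelism of only 1 triggers the warning: a few "No hostname" lines are logged and then the job is submitted normally. As soon as the parallelism goes up, the submission times out.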


Part of the log follows:

2020-07-15 16:58:46,460 WARN  org.apache.flink.runtime.taskmanager.TaskManagerLocation     [] - No hostname could be resolved for the IP address 10.32.160.7, using IP address as host name. Local input split assignment (such as for HDFS files) may be impacted.
2020-07-15 16:58:46,460 WARN  org.apache.flink.runtime.taskmanager.TaskManagerLocation     [] - No hostname could be resolved for the IP address 10.44.224.7, using IP address as host name. Local input split assignment (such as for HDFS files) may be impacted.
2020-07-15 16:58:46,461 WARN  org.apache.flink.runtime.taskmanager.TaskManagerLocation     [] - No hostname could be resolved for the IP address 10.40.32.9, using IP address as host name. Local input split assignment (such as for HDFS files) may be impacted.

2020-07-15 16:59:10,236 INFO  org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - The heartbeat of JobManager with id 69a0d460de468888a9f41c770d963c0a timed out.
2020-07-15 16:59:10,236 INFO  org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Disconnect job manager 00000000000000000000000000000...@akka.tcp://flink@flink-jobmanager:6123/user/rpc/jobmanager_2 for job e1554c737e37ed79688a15c746b6e9ef from the resource manager.
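In case it helps with diagnosis: as far as I understand, this warning is logged when a reverse lookup via `InetAddress.getCanonicalHostName()` just returns the IP address, so one guess is that slow reverse DNS inside the cluster is what stalls the JobManager. That is only an assumption, not something I have confirmed. Below is a minimal sketch for timing such lookups from inside the JobManager pod. The class name `ReverseDnsCheck` and its helpers are hypothetical and not part of Flink.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical diagnostic helper (not part of Flink): times reverse DNS lookups. */
public class ReverseDnsCheck {

    /** Outcome of one reverse lookup. */
    public static final class Result {
        public final String hostAddress;   // textual IP as Java formats it
        public final String canonicalName; // name returned by the reverse lookup
        public final long elapsedMillis;   // wall-clock time spent in the lookup

        Result(String hostAddress, String canonicalName, long elapsedMillis) {
            this.hostAddress = hostAddress;
            this.canonicalName = canonicalName;
            this.elapsedMillis = elapsedMillis;
        }

        /** True if the lookup returned something other than the bare IP address. */
        public boolean resolved() {
            return !canonicalName.equals(hostAddress);
        }
    }

    /** Performs one reverse lookup for a literal IP address and measures how long it takes. */
    public static Result lookup(String ip) throws UnknownHostException {
        // For a literal IP, getByName only validates the format; no DNS query yet.
        InetAddress address = InetAddress.getByName(ip);
        long start = System.nanoTime();
        // Reverse lookup; falls back to the textual IP if no name is found.
        String name = address.getCanonicalHostName();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Result(address.getHostAddress(), name, elapsedMillis);
    }

    public static void main(String[] args) throws UnknownHostException {
        if (args.length == 0) {
            System.out.println("Usage: java ReverseDnsCheck <ip> [<ip> ...]");
            return;
        }
        for (String ip : args) {
            Result r = lookup(ip);
            System.out.printf("%s -> %s (resolved=%b, %d ms)%n",
                    r.hostAddress, r.canonicalName, r.resolved(), r.elapsedMillis);
        }
    }
}
```

Running it with the TaskManager IPs from the log, e.g. `java ReverseDnsCheck 10.32.160.7 10.44.224.7 10.40.32.9`, would show whether each lookup takes milliseconds or seconds. If each one takes seconds, 30 TaskManagers times 4 slots could plausibly add up to the heartbeat timeout shown above.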


How should I deal with this?


Best!

a511955993
Email: a511955...@163.com
