Weike Dong created FLINK-19677:
----------------------------------
Summary: TaskManager takes abnormally long time to register with
JobManager on Kubernetes
Key: FLINK-19677
URL: https://issues.apache.org/jira/browse/FLINK-19677
Project: Flink
Issue Type: Bug
Components: Runtime / Task
Affects Versions: 1.11.2, 1.11.1, 1.11.0
Reporter: Weike Dong
During TaskManager registration, the JobManager creates a
_TaskManagerLocation_ instance, which tries to obtain the hostname of the
TaskManager via a reverse DNS lookup.
However, this lookup always fails in a Kubernetes environment for pods that
are not exposed by Services: CoreDNS cannot resolve their IPs to domain names,
so _InetAddress#getCanonicalHostName()_ takes ~5 seconds to return,
blocking the whole registration process.
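
The delay can be observed outside Flink with a small timing harness. The sketch below is not Flink code; the class name _ReverseLookupTiming_ and its helper are hypothetical and only wrap the JDK call named above. When the lookup fails, the JDK call generally falls back to returning the textual IP address, so a slow return followed by a bare IP is the symptom to look for.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Flink: times one reverse DNS lookup.
public class ReverseLookupTiming {

    /** Outcome of a single timed lookup. */
    public static final class Result {
        public final String hostName;
        public final long elapsedMillis;

        Result(String hostName, long elapsedMillis) {
            this.hostName = hostName;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Calls InetAddress#getCanonicalHostName() and measures how long it blocks. */
    public static Result timeReverseLookup(InetAddress address) {
        long start = System.nanoTime();
        String hostName = address.getCanonicalHostName();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Result(hostName, elapsedMillis);
    }

    public static void main(String[] args) throws UnknownHostException {
        // Pass a pod IP as the first argument; 127.0.0.1 is only a placeholder.
        // getByName with an IP literal does not perform a forward lookup.
        InetAddress address = InetAddress.getByName(args.length > 0 ? args[0] : "127.0.0.1");
        Result result = timeReverseLookup(address);
        System.out.println(result.hostName + " resolved in " + result.elapsedMillis + " ms");
    }
}
```

Running this from inside the JobManager pod against a TaskManager pod IP should show whether the reverse lookup is the source of the delay in a given cluster.
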
Therefore, Flink should provide a configuration parameter to turn off the
reverse DNS lookup. Also, even when the hostname is actually needed, the lookup
could be done lazily to avoid blocking the registration of other TaskManagers.
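
A minimal sketch of how both proposals could fit together, assuming a boolean switch and a lazily cached hostname. The class _LazyHostName_, the _reverseLookupEnabled_ flag and the injectable resolver are hypothetical names chosen for illustration; they are not existing Flink classes or configuration options.

```java
import java.net.InetAddress;
import java.util.function.Function;

// Hypothetical sketch, not Flink code: defers the reverse DNS lookup until the
// hostname is first requested, and skips it entirely when the switch is off.
public final class LazyHostName {

    private final InetAddress address;
    private final boolean reverseLookupEnabled; // stands in for the proposed config parameter
    private final Function<InetAddress, String> resolver;
    private volatile String hostName; // cached after the first call to get()

    public LazyHostName(InetAddress address, boolean reverseLookupEnabled) {
        this(address, reverseLookupEnabled, InetAddress::getCanonicalHostName);
    }

    // The resolver is injectable so the behaviour can be checked without real DNS.
    public LazyHostName(
            InetAddress address,
            boolean reverseLookupEnabled,
            Function<InetAddress, String> resolver) {
        this.address = address;
        this.reverseLookupEnabled = reverseLookupEnabled;
        this.resolver = resolver;
    }

    /** Returns the hostname, resolving it at most once and only on demand. */
    public String get() {
        String result = hostName;
        if (result == null) {
            synchronized (this) {
                result = hostName;
                if (result == null) {
                    result = reverseLookupEnabled
                            ? resolver.apply(address)    // may block on DNS
                            : address.getHostAddress();  // textual IP, no lookup
                    hostName = result;
                }
            }
        }
        return result;
    }
}
```

With this shape, constructing the object during registration never touches DNS; only a caller that actually needs the hostname pays for the lookup, and only once.
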