Elias Levy created FLINK-8358:
---------------------------------
Summary: Hostname used by DataDog metric reporter is not configurable
Key: FLINK-8358
URL: https://issues.apache.org/jira/browse/FLINK-8358
Project: Flink
Issue Type: Bug
Components: Metrics
Affects Versions: 1.4.0
Reporter: Elias Levy
The hostname used by the DataDog metric reporter to report metrics is not
configurable. This can be problematic if the hostname that Flink uses is
different from the hostname used by the system's DataDog agent.
For instance, in our environment we use Chef, and through the DataDog Chef
Handler, certain metadata such as host roles is associated with the hostname in
the DataDog service. The hostname used to submit this metadata is the name we
have given the host. Flink, however, picks up the default name that EC2 gives
the instance, so metrics submitted by Flink to DataDog under that hostname are
not associated with the tags derived from Chef.
In the Job Manager we can avoid this issue by explicitly setting the config
{{jobmanager.rpc.address}} to the hostname we desire. I attempted to do the
same on the Task Manager by setting the {{taskmanager.hostname}} config, but
the DataDog reporter does not seem to pick up that value.
Digging through the code, it seems the DD metric reporter gets the hostname from
the {{TaskManagerMetricGroup}} host variable, which seems to be set from
{{taskManagerLocation.getHostname}}. That in turn seems to be set by calling
{{this.inetAddress.getCanonicalHostName()}}, which merely performs a reverse
lookup on the IP address, and then calling {{NetUtils.getHostnameFromFQDN}} on
the result. The latter is further problematic because its result is a
non-fully qualified hostname.
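
To make the behaviour described above concrete, here is a minimal sketch of the two-step derivation as I understand it. It does not use Flink's classes: {{stripDomain}} is a hypothetical stand-in for {{NetUtils.getHostnameFromFQDN}}, assumed to keep only the part before the first dot, and the class and method names are made up for illustration. Only {{InetAddress.getCanonicalHostName()}} is the real JDK call named above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostnameSketch {

    // Hypothetical stand-in for NetUtils.getHostnameFromFQDN (assumption):
    // keep only the label before the first dot, dropping the domain part.
    static String stripDomain(String fqdn) {
        int dot = fqdn.indexOf('.');
        return dot == -1 ? fqdn : fqdn.substring(0, dot);
    }

    // Sketch of the derivation: reverse lookup on the address, then strip
    // the domain. No configuration value is consulted at any point.
    static String reportedHost(InetAddress address) {
        String canonical = address.getCanonicalHostName();
        // If the reverse lookup fails, the JDK returns the textual IP address;
        // this sketch passes it through unchanged instead of cutting it at a dot.
        if (canonical.equals(address.getHostAddress())) {
            return canonical;
        }
        return stripDomain(canonical);
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress local = InetAddress.getLocalHost();
        System.out.println("canonical: " + local.getCanonicalHostName());
        System.out.println("reported:  " + reportedHost(local));
    }
}
```

Under these assumptions, the reported name depends only on what reverse DNS returns for the IP address, which is why a value set in {{taskmanager.hostname}} would never reach the reporter.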
More generally, there seems to be a need to specify the hostname of a JM or TM
node so that it can be reused across Flink components.