[
https://issues.apache.org/jira/browse/FLINK-7935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310618#comment-16310618
]
Wei-Che Wei commented on FLINK-7935:
------------------------------------
Hi [~elevy]
What you described is almost correct. FLINK-7692 lets users expose their own
variables through {{MetricGroup}}, but mapping the metric name and the
metric's variables to a third-party metric system is the reporter's
responsibility.
You can use {{MetricGroup#getAllVariables()}} to get {{type:messageType}} and
other system scope variables. These can be mapped to tags in the DataDog
reporter.
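
To illustrate the mapping from variables to tags, here is a minimal,
self-contained sketch. It does not use Flink classes: a plain {{Map}} stands
in for the result of {{MetricGroup#getAllVariables()}}, and the class and
method names ({{VariableTagSketch}}, {{toTags}}, {{stripBrackets}}) are
hypothetical. It assumes variable keys arrive wrapped in angle brackets, such
as {{<type>}}.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class VariableTagSketch {

    // Assumption: variable keys look like "<type>"; keys without brackets pass through unchanged.
    static String stripBrackets(String key) {
        if (key.length() >= 2 && key.startsWith("<") && key.endsWith(">")) {
            return key.substring(1, key.length() - 1);
        }
        return key;
    }

    // Turns each variable into a "key:value" tag string, preserving the map's iteration order.
    static List<String> toTags(Map<String, String> variables) {
        List<String> tags = new ArrayList<>();
        for (Map.Entry<String, String> entry : variables.entrySet()) {
            tags.add(stripBrackets(entry.getKey()) + ":" + entry.getValue());
        }
        return tags;
    }

    public static void main(String[] args) {
        // Stand-in for the map a metric group would return; the values are made up.
        Map<String, String> variables = new LinkedHashMap<>();
        variables.put("<job_name>", "myJob");
        variables.put("<type>", "messageType");
        System.out.println(toTags(variables)); // prints [job_name:myJob, type:messageType]
    }
}
```

A real reporter would pass the map obtained from the metric group instead of
the hand-built one in {{main}}.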
{{AbstractMetricGroup#getLogicalScope(CharacterFilter)}} returns
{{<parent.logical.scope>.messages.type}}, so this function can be used to
build the metric name, which will be
{{<parent.logical.scope>.messages.type.counts}}. For example, the Prometheus
reporter uses it to build the metric name.
[[1|https://github.com/apache/flink/blob/beb11976fe63c20a5dc9f22ea713c05b4d5e9585/flink-metrics/flink-metrics-prometheus/src/main/java/org/apache/flink/metrics/prometheus/PrometheusReporter.java#L217]]
However, {{MetricGroup#getMetricIdentifier(String)}} will still return
{{<parent.identifier>.messages.type.<messageType>}}. It seems that the DataDog
reporter uses this function to get the metric name.
[[2|https://github.com/apache/flink/blob/master/flink-metrics/flink-metrics-datadog/src/main/java/org/apache/flink/metrics/datadog/DatadogHttpReporter.java#L63]]
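
The difference between the two naming schemes can be sketched with a toy
model. This is not Flink's implementation: {{ScopeNamingSketch}}, {{Level}},
and both methods are hypothetical stand-ins, and the parent scope
{{taskmanager.job.operator}} and the value {{messageA}} are made-up examples.
The only behavior modeled is that the identifier keeps the variable's value
while the logical name drops it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ScopeNamingSketch {

    /** Hypothetical stand-in for one level of a metric group hierarchy. */
    static final class Level {
        final String name;
        final boolean variableValue; // true if this level holds a user variable's value

        Level(String name, boolean variableValue) {
            this.name = name;
            this.variableValue = variableValue;
        }
    }

    // Identifier-style name: every level is kept, including variable values.
    static String identifier(List<Level> path, String metricName) {
        List<String> parts = new ArrayList<>();
        for (Level level : path) {
            parts.add(level.name);
        }
        parts.add(metricName);
        return String.join(".", parts);
    }

    // Logical-scope-style name: levels that hold a variable's value are skipped.
    static String logicalName(List<Level> path, String metricName) {
        List<String> parts = new ArrayList<>();
        for (Level level : path) {
            if (!level.variableValue) {
                parts.add(level.name);
            }
        }
        parts.add(metricName);
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        List<Level> path = Arrays.asList(
                new Level("taskmanager.job.operator", false), // made-up parent scope
                new Level("messages", false),
                new Level("type", false),
                new Level("messageA", true));                 // made-up variable value
        System.out.println(identifier(path, "counts"));
        System.out.println(logicalName(path, "counts"));
    }
}
```

With the identifier-style name, every distinct message type yields a distinct
metric name. With the logical name, all message types share one name and can
be told apart by the {{type}} variable instead.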
I think that is a limitation of the DataDog reporter. Maybe we can make
{{AbstractMetricGroup#getLogicalScope(CharacterFilter)}} a public API and
update the DataDog reporter.
cc [~Zentol]
Do you have any suggestions or comments? If I made any mistakes in my comment,
please correct me. Thank you.
> Metrics with user supplied scope variables
> ------------------------------------------
>
> Key: FLINK-7935
> URL: https://issues.apache.org/jira/browse/FLINK-7935
> Project: Flink
> Issue Type: Improvement
> Components: Metrics
> Affects Versions: 1.3.2
> Reporter: Elias Levy
>
> We use DataDog for metrics. DD and Flink differ somewhat in how they track
> metrics.
> Flink names and scopes metrics together, at least by default. E.g. by default
> the System scope for operator metrics is
> {{<host>.taskmanager.<tm_id>.<job_name>.<operator_name>.<subtask_index>}}.
> The scope variables become part of the metric's full name.
> In DD the metric would be named something generic, e.g.
> {{taskmanager.job.operator}}, and they would be distinguished by their tag
> values, e.g. {{tm_id=foo}}, {{job_name=var}}, {{operator_name=baz}}.
> Flink allows you to configure the format string for system scopes, so it is
> possible to set the operator scope format to {{taskmanager.job.operator}}.
> We do this for all scopes:
> {code}
> metrics.scope.jm: jobmanager
> metrics.scope.jm.job: jobmanager.job
> metrics.scope.tm: taskmanager
> metrics.scope.tm.job: taskmanager.job
> metrics.scope.task: taskmanager.job.task
> metrics.scope.operator: taskmanager.job.operator
> {code}
> This seems to work. The DataDog Flink metrics plugin submits all scope
> variables as tags, even if they are not used within the scope format. And it
> appears that internally this does not lead to metrics conflicting with each
> other.
> We would like to extend this to user-defined metrics, but you cannot define
> variables/scopes when adding a metric group or metric with the user API.
> Being able to do so would give us a single metric in DD with a tag that
> takes many different values, rather than hundreds of metrics for the one
> value we want to measure across different event types.