[
https://issues.apache.org/jira/browse/NIFI-4713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt Burgess updated NIFI-4713:
-------------------------------
Affects Version/s: (was: 1.4.0)
Status: Patch Available (was: Open)
> Datadog Metrics Alignment
> -------------------------
>
> Key: NIFI-4713
> URL: https://issues.apache.org/jira/browse/NIFI-4713
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Extensions
> Reporter: Robert Batts
> Priority: Major
> Labels: datadog, metrics
>
> Metrics fed into Datadog from NiFi do not seem to align with the NiFi
> model. Therefore, I propose the following:
> # Change the metric names to work better with Datadog
> # Become more reliant on tagging
> # Allow custom tagging
> Currently, metrics are being sent to Datadog in the following format:
> <metricsPrefix>.<processorName/flow>.<metricName>
> However, Datadog favors reusing one metric name and filtering via tags. So
> in Datadog, a metric named <metricsPrefix>.<metricName> with a tag of
> <processorName> works better than one unique metric per processor (when
> there is no processorName, omit the tag instead of adding 'flow').
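For illustration, the renaming described above could be sketched in Python as follows. This is not the DataDogReportingTask code. The `processor` tag key, the `nifi` prefix, and the `bytes_read` metric name are placeholders assumed for the example.

```python
from typing import NamedTuple, Optional, Tuple


class Metric(NamedTuple):
    """A metric name plus the tags attached to it."""
    name: str
    tags: Tuple[str, ...]


def current_name(prefix: str, metric_name: str,
                 processor_name: Optional[str] = None) -> str:
    # Current format: <metricsPrefix>.<processorName/flow>.<metricName>
    return f"{prefix}.{processor_name or 'flow'}.{metric_name}"


def proposed_metric(prefix: str, metric_name: str,
                    processor_name: Optional[str] = None) -> Metric:
    # Proposed format: <metricsPrefix>.<metricName>, processor carried as a tag.
    # The "processor" tag key is an assumption; the issue does not name one.
    # With no processor name, the tag is omitted rather than set to 'flow'.
    tags = (f"processor:{processor_name}",) if processor_name else ()
    return Metric(f"{prefix}.{metric_name}", tags)


# Placeholder values, for illustration only.
print(current_name("nifi", "bytes_read", "GetSQS"))
print(proposed_metric("nifi", "bytes_read", "GetSQS"))
print(proposed_metric("nifi", "bytes_read"))
```

With this shape, every processor reports to the same metric name, and the processor is selected by tag at query time.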
> Consider the way Datadog does Kafka. The metric kafka.consumer_lag represents
> the current lag of a topic (tag) for a given consumer_group (tag) over all
> partitions (tag).
> For the same moment in time:
> kafka.consumer_lag = 5 <topic:a, consumer_group:nifi, partition:0>
> kafka.consumer_lag = 7 <topic:a, consumer_group:nifi, partition:1>
> kafka.consumer_lag = 22 <topic:a, consumer_group:python, partition:0>
> kafka.consumer_lag = 19 <topic:a, consumer_group:python, partition:1>
> kafka.consumer_lag = 2 <topic:b, consumer_group:nifi, partition:0>
> If I wanted to know the current lag for a given consumer_group on all
> topics, I would include those tags and then sum over the remaining records
> (that is, across the partitions).
> For the same moment in time:
> kafka.consumer_lag = 12 for topic:a and consumer_group:nifi
> kafka.consumer_lag = 2 for topic:b and consumer_group:nifi
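The aggregation above can be reproduced with a small Python sketch. It only mimics the filter-then-sum idea on the five sample points from this issue; it is not Datadog's query engine, and the `sum_by` helper is a hypothetical name.

```python
from collections import defaultdict

# The five kafka.consumer_lag points listed in the issue, as (value, tags).
points = [
    (5,  {"topic": "a", "consumer_group": "nifi",   "partition": "0"}),
    (7,  {"topic": "a", "consumer_group": "nifi",   "partition": "1"}),
    (22, {"topic": "a", "consumer_group": "python", "partition": "0"}),
    (19, {"topic": "a", "consumer_group": "python", "partition": "1"}),
    (2,  {"topic": "b", "consumer_group": "nifi",   "partition": "0"}),
]


def sum_by(points, group_keys, filters=None):
    """Keep points matching `filters`, then sum values per `group_keys`."""
    filters = filters or {}
    totals = defaultdict(int)
    for value, tags in points:
        if all(tags.get(k) == v for k, v in filters.items()):
            # Tags not listed in group_keys (here: partition) are summed away.
            totals[tuple(tags[k] for k in group_keys)] += value
    return dict(totals)


lag = sum_by(points, ["topic", "consumer_group"], {"consumer_group": "nifi"})
print(lag)  # {('a', 'nifi'): 12, ('b', 'nifi'): 2}
```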
> In NiFi terms, this could let you (for example) add a tag noting that a
> processor is an aws-sqs pull, then aggregate the average number of records
> pulled across the entire system instead of for a single processor.
> Additionally, there is room for custom tagging. For example, I want to be
> able to aggregate across all NiFi clusters I control. Setting a unique
> prefix for each cluster breaks this aggregation, while setting no prefix
> might not allow me to filter properly later. But if custom tagging were
> allowed, I could set a tag such as cluster_name:nifi-1, have all metrics
> aggregated, and still filter down to that specific cluster for other
> operations. In my opinion, the easiest way to implement this would be to
> take all non-required attributes from the Datadog controller and use them
> as the custom tags (these attributes should be considered final/static
> when loaded). The attributes are already in Key=Value format, so it should
> be easy enough to convert them to Key:Value format for tagging (once the
> required attributes are removed).
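A minimal sketch of the Key=Value to Key:Value conversion, written in Python for brevity. The property names in `required_keys` and the sample values are hypothetical and do not reflect the actual DataDogReportingTask property names.

```python
def custom_tags(properties, required_keys):
    """Turn every non-required property into a 'key:value' tag string."""
    return sorted(
        f"{key}:{value}"
        for key, value in properties.items()
        if key not in required_keys
    )


# Hypothetical configuration: two required properties plus two user-defined ones.
required_keys = {"API key", "Metrics prefix"}
properties = {
    "API key": "xxxx",
    "Metrics prefix": "nifi",
    "cluster_name": "nifi-1",
    "env": "prod",
}

tags = custom_tags(properties, required_keys)
print(tags)  # ['cluster_name:nifi-1', 'env:prod']
```

Computing the tag list once at load time would match the suggestion that these attributes be treated as final/static.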
> (Most if not all work for this is centered on
> org.apache.nifi.reporting.datadog.DataDogReportingTask)