[ 
https://issues.apache.org/jira/browse/NIFI-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16592352#comment-16592352
 ] 

Corey Fritz edited comment on NIFI-5535 at 8/25/18 12:52 AM:
-------------------------------------------------------------

So I attempted to fix the tagging issue, and I did, but that just exacerbated
another problem: the DataDogReportingTask sends far too many metrics, with far
too many tags. Each processor generates 6 metrics with 2 tags each. Each port
generates 9 metrics with 5 tags each. Each connection generates 6 metrics with
8 tags each. On top of that there are 10 aggregated flow-level metrics and 13
JVM metrics, each with 2 tags. Datadog considers each unique combination of a
metric name + tag to be a "custom metric". The lowest Datadog plan allows an
average of 100 "custom metrics" per host (some hosts can have more and some
fewer, as long as the total number of custom metrics works out to 100/host).
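
To put rough numbers on that, here is a minimal sketch that adds up the
per-component figures quoted above. The class name, the method names, and the
example flow (30 processors, 4 ports, 25 connections) are made up for
illustration and are not part of NiFi or the Datadog client. It only counts
metric names, so under the name + tag definition the real "custom metric"
count would be at least this large.

```java
/** Rough metric-count estimate using the per-component figures quoted above. */
final class MetricCountEstimate {

    // Per-component metric counts as stated in the comment above.
    static final int METRICS_PER_PROCESSOR = 6;
    static final int METRICS_PER_PORT = 9;
    static final int METRICS_PER_CONNECTION = 6;
    static final int FLOW_LEVEL_METRICS = 10;
    static final int JVM_METRICS = 13;

    // Average per-host allowance on the lowest Datadog plan, as stated above.
    static final int CUSTOM_METRICS_PER_HOST = 100;

    /** Number of metrics reported for a flow with the given component counts. */
    static int totalMetrics(int processors, int ports, int connections) {
        return processors * METRICS_PER_PROCESSOR
                + ports * METRICS_PER_PORT
                + connections * METRICS_PER_CONNECTION
                + FLOW_LEVEL_METRICS
                + JVM_METRICS;
    }

    /** How many per-host allowances the given metric count uses up, rounded up. */
    static int hostAllowancesNeeded(int totalMetrics) {
        return (totalMetrics + CUSTOM_METRICS_PER_HOST - 1) / CUSTOM_METRICS_PER_HOST;
    }

    public static void main(String[] args) {
        // Hypothetical flow; the port and connection counts are assumptions.
        int total = totalMetrics(30, 4, 25);
        System.out.println("metrics reported: " + total);
        System.out.println("host allowances needed: " + hostAllowancesNeeded(total));
    }
}
```

For that hypothetical flow the total is 389 metrics, which is already about
four hosts' worth of allowance before any tag combinations are counted.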

I have a flow with about 30 processors that resulted in 370 metrics being sent 
to Datadog (I didn't bother to figure out how many tags). I noticed that some 
of the metrics I was actually interested in monitoring were not showing up in 
Datadog, and I'm sure it's because we're way over our limit. There should 
probably be an opt-in strategy for choosing which sets of metrics we send to 
Datadog.

So... my proposal is this (and I'm willing to tackle this as time allows):

1. Add an _Enable Monitoring_ property to all processors that is off by default

2. Add an _Enable Monitoring_ property to all ports that is off by default

3. Add an _Enable Monitoring_ property to all connections that is off by default

4. Add the following properties to the DataDogReportingTask
 * _Enable Flow-level Monitoring_, off by default
 * _Enable JVM Monitoring_, off by default

5. Update the DataDogReportingTask to only submit metrics for components that 
have had monitoring explicitly enabled

6. Update the DataDogReportingTask to remove all metric tags except for 
_Environment_. I just don't see much value in any of the other tags.

This seems like a pretty large refactoring with a wide scope since it would 
touch processors, ports, and connections, as well as the other metric reporting 
services, so I'd like to discuss further with someone before proceeding.
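
To make steps 5 and 6 a bit more concrete, here is a minimal sketch of the
opt-in filtering. Everything in it is hypothetical: the Component class is a
stand-in for a processor, port, or connection and is not a NiFi API type, and
the method names are invented for illustration. The "env" tag name is the one
from the issue description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class OptInMetricFilter {

    /** Hypothetical stand-in for a processor, port, or connection. */
    static final class Component {
        final String name;
        // Models the proposed "Enable Monitoring" property (off by default).
        final boolean monitoringEnabled;

        Component(String name, boolean monitoringEnabled) {
            this.name = name;
            this.monitoringEnabled = monitoringEnabled;
        }
    }

    /** Step 5: keep only components that have explicitly opted in. */
    static List<Component> selectMonitored(List<Component> all) {
        List<Component> monitored = new ArrayList<>();
        for (Component c : all) {
            if (c.monitoringEnabled) {
                monitored.add(c);
            }
        }
        return monitored;
    }

    /** Step 6: the only tag left on each metric is the environment. */
    static List<String> tagsFor(String environment) {
        return Collections.singletonList("env:" + environment);
    }

    public static void main(String[] args) {
        List<Component> all = Arrays.asList(
                new Component("FetchOrders", true),
                new Component("LogAttribute", false));
        for (Component c : selectMonitored(all)) {
            System.out.println(c.name + " " + tagsFor("dev"));
        }
    }
}
```

With this shape, a component that nobody has touched reports nothing, so the
metric count only grows when someone deliberately enables monitoring.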


> DataDogReportingTask is not tagging metrics properly
> ----------------------------------------------------
>
>                 Key: NIFI-5535
>                 URL: https://issues.apache.org/jira/browse/NIFI-5535
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>    Affects Versions: 1.7.1
>            Reporter: Corey Fritz
>            Priority: Major
>         Attachments: Screen Shot 2018-08-19 at 12.33.58 AM.png
>
>
> The current (and apparently original) implementation of the 
> DataDogReportingTask is not applying metric tags correctly, and as a result, 
> the "Environment" configuration property on that task does not work. This 
> means that you're not going to be able to use tags to differentiate the 
> metric values coming from different environments.
> Currently, every metric reported by this task gets the same set of tags 
> applied:
> {code:java}
> connection-destination-id
> connection-destination-name
> connection-group-id
> connection-id
> connection-name
> connection-source-id
> connection-source-name
> dataflow_id
> env
> port-group-id
> port-id
> port-name{code}
> This list is defined here: 
> [https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-datadog-bundle/nifi-datadog-reporting-task/src/main/java/org/apache/nifi/reporting/datadog/metrics/MetricsService.java#L111-L126]
> I've attached a screenshot from Datadog demonstrating a JVM metric with all 
> of these tags applied.
> Each of these tags should include a value, e.g. "env:dev" instead of just 
> "env".
> Other observations:
>  * it doesn't make sense to attach the _connection-_ and _port-_ tags to JVM 
> metrics
>  * I'm not sure I see any value in the _dataflow_id_ tag
> I was hoping for a quick fix when I noticed the environment tagging wasn't 
> working, but after reviewing the code I think a not insignificant refactoring 
> will be required. I'll try to tackle this if/when time allows.
> See here for more context on Datadog tagging: 
> [https://docs.datadoghq.com/tagging]
>  
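
The "name:value" tag format asked for in the description above could be
sketched like this. It is not the MetricsService code linked in the issue; the
TagFormatter class and its format method are hypothetical names, and skipping
tags that have no value is an assumption about the desired behavior.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TagFormatter {

    /** Turns tag name/value pairs into "name:value" strings, e.g. "env:dev". */
    static List<String> format(Map<String, String> tags) {
        List<String> formatted = new ArrayList<>();
        for (Map.Entry<String, String> tag : tags.entrySet()) {
            String value = tag.getValue();
            if (value == null || value.isEmpty()) {
                // Assumption: omit a tag with no value instead of sending a bare name.
                continue;
            }
            formatted.add(tag.getKey() + ":" + value);
        }
        return formatted;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the tags in insertion order.
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("env", "dev");
        tags.put("port-name", "");
        System.out.println(format(tags));
    }
}
```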



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)