Yun Tang commented on FLINK-13418:

[~Zentol], the comparison of these two different concepts is listed below and 
summarized from [influxdb official 
||concept||whether required||whether indexed||whether count for series||
|tag|optional|indexed|yes|
|field|required|not indexed|no|

Take the input channel metric {{numBytesInRemotePerSecond}} as an example. It 
is a task-scope metric with the default naming 
{{<host>.taskmanager.<tm_id>.<job_name>.<task_name>.<subtask_index>}}. If we 
want to track the input byte throughput of specific tasks, we need to 
query by specific {{task_name}} and {{subtask_index}}. As was said, if 
{{task_name}} and {{subtask_index}} were fields instead of tags, InfluxDB 
would have to scan every value of {{task_name}} and {{subtask_index}}. 
Instead, they can be set as tags to optimize query performance, since tags 
are indexed in InfluxDB. The value of {{numBytesInRemotePerSecond}} is 
appropriate as a field, since in most cases we would not query on a condition 
over its value.
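
To make the tag/field split concrete, here is a minimal sketch that renders one point in InfluxDB line protocol. The class {{LineProtocolSketch}}, the measurement name and the tag values are hypothetical and chosen only for illustration; this is not the reporter's code, and it skips the escaping that real line protocol requires for spaces and commas.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical helper for illustration only; not Flink or InfluxDB client code. */
class LineProtocolSketch {

    /** Renders "measurement,tagKey=tagValue,... fieldKey=fieldValue" with tags sorted by key. */
    static String toLine(String measurement, Map<String, String> tags,
                         String fieldKey, double fieldValue) {
        // Tags are the indexed part of the point: they follow the measurement name.
        String tagPart = new TreeMap<>(tags).entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(","));
        // The field is the non-indexed part: it follows the first space.
        return measurement + (tagPart.isEmpty() ? "" : "," + tagPart)
                + " " + fieldKey + "=" + fieldValue;
    }

    public static void main(String[] args) {
        // Assumed example values; task_name and subtask_index are queried on, so they are tags.
        Map<String, String> tags = Map.of("task_name", "Map", "subtask_index", "0");
        System.out.println(toLine("numBytesInRemotePerSecond", tags, "value", 1024.0));
    }
}
```

Running {{main}} prints {{numBytesInRemotePerSecond,subtask_index=0,task_name=Map value=1024.0}}: a query filtering on {{task_name}} can use the tag index, while the throughput itself stays an unindexed field.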

However, in InfluxDB, a {{series}} is the collection of data that shares a 
retention policy, measurement, and tag set. In other words, the more tags, 
the more series are counted. The series count impacts overall index 
performance, especially with InfluxDB's default in-memory index, and the 
default limit of series per database is only one million. If we include 
{{task_attempt_id}} and other unnecessary tags in InfluxDB, the total series 
count increases dramatically, especially since {{task_attempt_id}} is a 
high-cardinality set. That's why [~TheoD] came across the huge memory usage 
of InfluxDB.
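
The effect on series count can be sketched by counting distinct tag sets. The class {{SeriesCountSketch}}, the tag values and the numbers of subtasks and attempts below are made up for illustration; the sketch assumes a single measurement and retention policy, so each distinct tag set is one series.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of how an extra tag multiplies the series count. */
class SeriesCountSketch {

    /** Builds one tag set per (subtask, attempt) pair, as if every attempt reported once. */
    static List<Map<String, String>> simulate(int subtasks, int attemptsPerSubtask) {
        List<Map<String, String>> points = new ArrayList<>();
        for (int s = 0; s < subtasks; s++) {
            for (int a = 0; a < attemptsPerSubtask; a++) {
                Map<String, String> tags = new HashMap<>();
                tags.put("task_name", "Map");
                tags.put("subtask_index", String.valueOf(s));
                // Every restart yields a fresh attempt id (made-up format).
                tags.put("task_attempt_id", "attempt-" + s + "-" + a);
                points.add(tags);
            }
        }
        return points;
    }

    /** Returns copies of the tag sets with one tag key removed. */
    static List<Map<String, String>> dropTag(List<Map<String, String>> points, String key) {
        List<Map<String, String>> result = new ArrayList<>();
        for (Map<String, String> tags : points) {
            Map<String, String> copy = new HashMap<>(tags);
            copy.remove(key);
            result.add(copy);
        }
        return result;
    }

    /** Counts distinct tag sets, i.e. distinct series within one measurement. */
    static int countSeries(List<Map<String, String>> points) {
        return new HashSet<>(points).size();
    }

    public static void main(String[] args) {
        List<Map<String, String>> points = simulate(2, 3);
        System.out.println(countSeries(points));                              // 6
        System.out.println(countSeries(dropTag(points, "task_attempt_id"))); // 2
    }
}
```

With two subtasks and three attempts each, keeping {{task_attempt_id}} gives six series, while dropping it leaves two; the gap grows with every restart.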


> Avoid InfluxdbReporter to report unnecessary tags
> -------------------------------------------------
>                 Key: FLINK-13418
>                 URL: https://issues.apache.org/jira/browse/FLINK-13418
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Metrics
>            Reporter: Yun Tang
>            Priority: Major
>             Fix For: 1.10.0
> Currently, when building measurement info within {{InfluxdbReporter}}, it 
> includes all variables as tags (please see code 
> [here|https://github.com/apache/flink/blob/d57741cef9d4773cc487418baa961254d0d47524/flink-metrics/flink-metrics-influxdb/src/main/java/org/apache/flink/metrics/influxdb/MeasurementInfoProvider.java#L54]).
>  However, users can adjust their own scope format to drop unnecessary 
> scopes, while {{InfluxdbReporter}} still reports all the scopes as tags to 
> InfluxDB.
> This is because the current {{MetricGroup}} lacks any method to get only the 
> necessary scopes; it offers only {{#getScopeComponents()}} and 
> {{#getAllVariables()}}. In other words, InfluxDB needs a tag-key and 
> tag-value to compose its tags, while we can only get all variables (without 
> any filtering according to the scope format) or only the scope components 
> (which could be treated as tag-values). I think that's why the previous 
> implementation had to report all tags.
> From our experience with InfluxDB, since the number of tags contributes to 
> the overall series count in InfluxDB, it is never a good idea to include too 
> many tags, not to mention that the [default value of series per 
> database|https://docs.influxdata.com/influxdb/v1.7/troubleshooting/errors/#error-max-series-per-database-exceeded]
>  is only one million.
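
One possible way to filter variables by scope format is sketched below. The class {{ScopeTagFilterSketch}} and its method are hypothetical and not part of Flink's API, and the sketch assumes the variable map uses keys of the form {{<name>}}, the same form in which they appear in the scope format string. It is not necessarily the fix adopted for this issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; not part of Flink's MetricGroup or InfluxdbReporter. */
class ScopeTagFilterSketch {

    /**
     * Keeps only the variables whose key (assumed to look like "<task_name>")
     * occurs in the scope format, and strips the angle brackets for the tag key.
     */
    static Map<String, String> tagsFor(String scopeFormat, Map<String, String> allVariables) {
        Map<String, String> tags = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : allVariables.entrySet()) {
            String key = e.getKey();
            boolean bracketed = key.length() > 2 && key.startsWith("<") && key.endsWith(">");
            if (bracketed && scopeFormat.contains(key)) {
                tags.put(key.substring(1, key.length() - 1), e.getValue());
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        String scopeFormat = "<host>.taskmanager.<tm_id>.<job_name>.<task_name>.<subtask_index>";
        // Made-up variable values for illustration.
        Map<String, String> all = new LinkedHashMap<>();
        all.put("<host>", "host-1");
        all.put("<tm_id>", "tm-1");
        all.put("<job_name>", "WordCount");
        all.put("<task_name>", "Map");
        all.put("<subtask_index>", "0");
        all.put("<task_attempt_id>", "attempt-0-0");
        // task_attempt_id is not in the scope format, so it is not reported as a tag.
        System.out.println(tagsFor(scopeFormat, all).keySet());
    }
}
```

Here the scope format acts as the filter: a variable the user removed from the format never becomes a tag, so it cannot inflate the series count.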
