[
https://issues.apache.org/jira/browse/FLINK-10243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16604079#comment-16604079
]
ASF GitHub Bot commented on FLINK-10243:
----------------------------------------
zentol opened a new pull request #6658: [FLINK-10243][metrics] Make latency
metrics granularity configurable
URL: https://github.com/apache/flink/pull/6658
## What is the purpose of the change
This PR makes the latency metric granularity configurable.
Let's say we have 2 sources, S1 and S2, and one operator O, each with a
parallelism of 2.
In SUBTASK mode the latencies are tracked from each source subtask to each
operator subtask, which is the current behavior.
In OPERATOR mode we no longer differentiate between source subtasks. This
mode is the new default since it is significantly more stable than SUBTASK:
the number of metrics scales linearly with the number of operators and the
parallelism.
In SINGLE mode we no longer differentiate between different sources. This
mode is a bit questionable, but comes at virtually no cost so why not.
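
To make the difference between the modes concrete, here is a rough sketch that counts the latency metrics each mode would produce for the example above (2 sources, 1 operator, parallelism 2). The `LatencyMetricCount` class, its local `Granularity` enum and the `count` method are hypothetical stand-ins for illustration only, not the `LatencyStats#Granularity` code from this PR, and the counts are derived purely from the mode descriptions above.

```java
public class LatencyMetricCount {

    // Hypothetical stand-in for the granularity modes described above.
    enum Granularity { SINGLE, OPERATOR, SUBTASK }

    // Estimated number of latency metrics, assuming every source and every
    // operator runs with the same parallelism.
    static long count(Granularity granularity, long sources, long operators, long parallelism) {
        long perOperatorSubtask;
        switch (granularity) {
            case SINGLE:
                // sources are not differentiated at all
                perOperatorSubtask = 1;
                break;
            case OPERATOR:
                // one metric per source, source subtasks are merged
                perOperatorSubtask = sources;
                break;
            case SUBTASK:
                // one metric per source subtask
                perOperatorSubtask = sources * parallelism;
                break;
            default:
                throw new IllegalArgumentException("Unknown granularity: " + granularity);
        }
        // every operator subtask keeps its own set of metrics
        return operators * parallelism * perOperatorSubtask;
    }

    public static void main(String[] args) {
        for (Granularity granularity : Granularity.values()) {
            System.out.println(granularity + ": " + count(granularity, 2, 1, 2));
        }
    }
}
```

Under these assumptions the example yields 2 metrics in SINGLE mode, 4 in OPERATOR mode and 8 in SUBTASK mode; only SUBTASK grows with the square of the parallelism.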
## Brief change log
* add `MetricOptions#LATENCY_SOURCE_GRANULARITY` config option
* add `LatencyStats#Granularity` enum that contains granularity-dependent
behavior
* extend `LatencyStats` to accept a `Granularity` argument
* adjust `AbstractStreamOperator` to read configured granularity
* update documentation
## Verifying this change
The `LatencyStats` are currently untested. This PR adds
the `LatencyStatsTest` class covering all aspects of this PR.
> Add option to reduce latency metrics granularity
> ------------------------------------------------
>
> Key: FLINK-10243
> URL: https://issues.apache.org/jira/browse/FLINK-10243
> Project: Flink
> Issue Type: Sub-task
> Components: Configuration, Metrics
> Affects Versions: 1.7.0
> Reporter: Chesnay Schepler
> Assignee: Chesnay Schepler
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.7.0
>
>
> The latency is currently tracked separately from each operator subtask to
> each source subtask. The total number of latency metrics in the cluster is
> thus {{(# of sources) * (# of operators) * parallelism²}}, i.e. quadratic
> scaling.
> If we ignored the source subtask, the scaling would be a lot more manageable.