[jira] [Updated] (FLINK-39306) Non-source vertices do not use per-second rate metrics, producing inaccurate scaling decisions

Dennis-Mircea Ciupitu (Jira) Fri, 05 Jun 2026 02:09:11 -0700


     [ 
https://issues.apache.org/jira/browse/FLINK-39306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Dennis-Mircea Ciupitu updated FLINK-39306:
------------------------------------------
    Description: 
h1. Summary

The busy-time {{TRUE_PROCESSING_RATE}} is computed as a ratio, 
{{busyTimeTpr = inputRate / busyTime}}, but the numerator and denominator are estimated 
over the metric window with different methods. The denominator is computed 
consistently with the configured busy-time aggregator, while the numerator 
always uses {{getRate}} on the cumulative records counter. Under the default 
{{MAX}} aggregator the two halves of the ratio therefore use different temporal 
estimators, which is internally inconsistent and can skew the ratio under 
non-uniform metric sampling. This issue aligns the numerator estimator with the 
denominator so the ratio is internally consistent.
h1. Background

{{TRUE_PROCESSING_RATE}} (the vertex capacity used by the scaler) is the larger 
of two sub-paths, a busy-time based estimate and an observed estimate, selected 
by {{selectTprMetric}}.

The busy-time estimate is {{busyTimeTpr = inputRate / (busyTimeAvg / 1000)}}. 
The denominator {{busyTimeAvg}} depends on the 
{{kubernetes.operator.metrics.busy-time.aggregator}} option:
 * {{AVG}}: {{getRate(ACCUMULATED_BUSY_TIME) / parallelism}}, a cumulative 
(time-integral) rate.
 * {{MAX}} or {{MIN}} (default {{MAX}}): {{getAverage(LOAD) * 1000}}, an 
arithmetic mean of the per-second busy-time gauge samples.

The numerator {{inputRate}} is always {{getRate(NUM_RECORDS_IN)}}, a cumulative 
endpoint rate.

{{getRate}} (a time-weighted, time-integral average) and {{getAverage}} of a 
per-second gauge (an unweighted sample mean) are two different linear 
estimators. They agree under uniform sampling but diverge under non-uniform 
sampling (bursts, recovery, transients).
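
The divergence described above can be illustrated with a minimal sketch. The helpers below ({{get_rate}}, {{get_average}}, {{gauge_samples}}) and all sample values are hypothetical stand-ins that mimic the two estimators; they are not the operator's actual implementation. The gauge is assumed to report, at each collection, the rate over the preceding interval.

```python
# Hypothetical illustration only: names and numbers are assumptions,
# not the operator's actual code.

def get_rate(timestamps, counter):
    """Endpoint rate of a cumulative counter: (last - first) / elapsed seconds."""
    return (counter[-1] - counter[0]) / (timestamps[-1] - timestamps[0])


def get_average(samples):
    """Unweighted arithmetic mean of gauge samples."""
    return sum(samples) / len(samples)


def gauge_samples(timestamps, counter):
    """Per-second gauge at each collection: rate over the preceding interval."""
    points = list(zip(timestamps, counter))
    return [(c1 - c0) / (t1 - t0) for (t0, c0), (t1, c1) in zip(points, points[1:])]


# Same underlying load in both cases: 100 records/s for 20 s, then 300 records/s for 20 s.
uniform_ts = [0, 10, 20, 30, 40]
uniform_counter = [0, 1000, 2000, 5000, 8000]

# Non-uniform collection: the second half is sampled twice as often.
bursty_ts = [0, 10, 20, 25, 30, 35, 40]
bursty_counter = [0, 1000, 2000, 3500, 5000, 6500, 8000]

uniform_rate = get_rate(uniform_ts, uniform_counter)
uniform_mean = get_average(gauge_samples(uniform_ts, uniform_counter))
bursty_rate = get_rate(bursty_ts, bursty_counter)
bursty_mean = get_average(gauge_samples(bursty_ts, bursty_counter))

print(uniform_rate, uniform_mean)  # the two estimators agree
print(bursty_rate, bursty_mean)    # the sample mean over-weights the densely sampled half
```

Under uniform sampling both estimators give 200 records/s. Under the non-uniform schedule the endpoint rate is still 200, while the unweighted sample mean rises to roughly 233 because the high-rate half contributes more samples.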
h1. Problem

Under the default {{MAX}} aggregator, {{busyTimeTpr}}'s denominator is a 
per-second sample mean while its numerator is a cumulative endpoint rate. 
Because the value is a ratio of two co-varying quantities, using a single 
shared estimator for both lets their common sampling weighting cancel, whereas 
mixing estimators leaves the denominator's sampling artifact in the result. The 
numerator is also the odd one out relative to the observed estimate it is 
compared against in {{selectTprMetric}}, which is built entirely from 
per-second gauges.
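
A small worked example may make the cancellation argument concrete. It assumes a hypothetical vertex whose true capacity is 1000 records/s, so that its busy time (ms per second) tracks the input rate exactly; the constant {{CAPACITY}}, the sample values, and the variable names are all illustrative assumptions rather than anything taken from the operator.

```python
# Hypothetical numbers: a vertex with a true capacity of 1000 records/s,
# sampled non-uniformly (the high-load half is collected twice as often).
CAPACITY = 1000.0

timestamps = [0, 10, 20, 25, 30, 35, 40]              # collection times (s)
records_in = [0, 1000, 2000, 3500, 5000, 6500, 8000]  # cumulative counter


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


points = list(zip(timestamps, records_in))
# Per-second input-rate gauge: rate over the interval preceding each collection.
rate_gauge = [(c1 - c0) / (t1 - t0) for (t0, c0), (t1, c1) in zip(points, points[1:])]
# Busy-time gauge in ms per second, co-varying with the input rate.
busy_gauge_ms = [r * 1000 / CAPACITY for r in rate_gauge]

# Denominator under MAX/MIN: sample mean of the busy-time gauge.
busy_time_avg = mean(busy_gauge_ms)

# Mixed estimators: cumulative endpoint rate over a gauge sample mean.
endpoint_rate = (records_in[-1] - records_in[0]) / (timestamps[-1] - timestamps[0])
mixed_tpr = endpoint_rate / (busy_time_avg / 1000)

# Consistent estimators: gauge sample mean over gauge sample mean.
consistent_tpr = mean(rate_gauge) / (busy_time_avg / 1000)

print(mixed_tpr, consistent_tpr)
```

With these numbers the mixed ratio comes out near 857 records/s, while the consistent ratio recovers the assumed capacity of 1000 records/s, because the sampling weighting appears in both halves and cancels.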
h1. Goal

Make the busy-time {{TRUE_PROCESSING_RATE}} numerator follow the same estimator 
as its busy-time denominator: per-second gauge mean under {{MAX}} or 
{{MIN}}, cumulative {{getRate}} under {{AVG}}. This is an 
internal-consistency fix for the ratio. It is scoped to that numerator only. It 
is not a change to the demand or edge data-rate paths, and it does not attempt 
to revalidate the underlying capacity model, the observed-rate formula, or the 
subtask aggregation, which are out of scope.
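
The intended selection rule can be sketched as follows. This is only an outline of the idea stated in the goal: the enum, the function {{busy_time_tpr}}, and its parameters are hypothetical names, and the sketch deliberately leaves out edge cases such as a zero busy time.

```python
from enum import Enum


class BusyTimeAggregator(Enum):
    """Mirrors the AVG / MAX / MIN values named in the issue (hypothetical type)."""
    AVG = "AVG"
    MAX = "MAX"
    MIN = "MIN"


def busy_time_tpr(aggregator, counter_rate, gauge_mean_rate, busy_time_avg_ms):
    """Busy-time TPR with the numerator estimator matched to the denominator.

    counter_rate:     cumulative endpoint rate of the records counter (records/s)
    gauge_mean_rate:  sample mean of the per-second input-rate gauge (records/s)
    busy_time_avg_ms: busy time in ms per second, computed per the aggregator
    """
    if aggregator is BusyTimeAggregator.AVG:
        # Denominator is a cumulative rate, so the numerator stays cumulative.
        input_rate = counter_rate
    else:
        # Denominator is a gauge sample mean under MAX or MIN, so use the gauge mean.
        input_rate = gauge_mean_rate
    return input_rate / (busy_time_avg_ms / 1000)


# Example with assumed values: the two rate estimates differ under non-uniform sampling.
tpr_avg = busy_time_tpr(BusyTimeAggregator.AVG, 200.0, 250.0, 200.0)
tpr_max = busy_time_tpr(BusyTimeAggregator.MAX, 200.0, 250.0, 250.0)
print(tpr_avg, tpr_max)
```

When the two rate estimates coincide, as they do under uniform sampling, both branches return the same value, which matches the statement that behavior changes only under non-uniform sampling.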
h1. Notes

Behavior is unchanged under {{AVG}} and unchanged whenever metric sampling is 
uniform (the common steady state). The new estimator differs only under 
non-uniform sampling. Covered by unit tests.

  was:
h1. Summary

The busy-time {{TRUE_PROCESSING_RATE}} is computed as a ratio, {{busyTimeTpr = 
inputRate / busyTime}}, but the numerator and denominator are estimated over 
the metric window with different methods. The denominator is computed 
consistently with the configured busy-time aggregator, while the numerator 
always uses {{getRate}} on the cumulative records counter. Under the default 
{{MAX}} aggregator the two halves of the ratio therefore use different temporal 
estimators, which is internally inconsistent and can skew the ratio under 
non-uniform metric sampling. This issue aligns the numerator estimator with the 
denominator so the ratio is internally consistent.

h1. Background

{{TRUE_PROCESSING_RATE}} (the vertex capacity used by the scaler) is the larger 
of two sub-paths, a busy-time based estimate and an observed estimate, selected 
by {{selectTprMetric}}.

The busy-time estimate is {{busyTimeTpr = inputRate / (busyTimeAvg / 1000)}}. 
The denominator {{busyTimeAvg}} depends on the 
{{kubernetes.operator.metrics.busy-time.aggregator}} option:

{{AVG}}: {{getRate(ACCUMULATED_BUSY_TIME) / parallelism}}, a cumulative 
(time-integral) rate.
{{MAX}} or {{MIN}} (default {{MAX}}): {{getAverage(LOAD) * 1000}}, an 
arithmetic mean of the per-second busy-time gauge samples.
The numerator {{inputRate}} is always {{getRate(NUM_RECORDS_IN)}}, a cumulative 
endpoint rate.

{{getRate}} (a time-weighted, time-integral average) and {{getAverage}} of a 
per-second gauge (an unweighted sample mean) are two different linear 
estimators. They agree under uniform sampling but diverge under non-uniform 
sampling (bursts, recovery, transients).

h1. Problem

Under the default {{MAX}} aggregator, {{busyTimeTpr}}'s denominator is a 
per-second sample mean while its numerator is a cumulative endpoint rate. 
Because the value is a ratio of two co-varying quantities, using a single 
shared estimator for both lets their common sampling weighting cancel, whereas 
mixing estimators leaves the denominator's sampling artifact in the result. The 
numerator is also the odd one out relative to the observed estimate it is 
compared against in {{selectTprMetric}}, which is built entirely from 
per-second gauges.

h1. Goal

Make the busy-time {{TRUE_PROCESSING_RATE}} numerator follow the same estimator 
as its busy-time denominator: per-second gauge mean under {{MAX}} or {{MIN}}, 
cumulative {{getRate}} under {{AVG}}. This is an internal-consistency fix for 
the ratio. It is scoped to that numerator only. It is not a change to the 
demand or edge data-rate paths, and it does not attempt to revalidate the 
underlying capacity model, the observed-rate formula, or the subtask 
aggregation, which are out of scope.

h1. Notes

Behavior is unchanged under {{AVG}} and unchanged whenever metric sampling is 
uniform (the common steady state). The new estimator only differs under 
non-uniform sampling. Covered by unit tests.


> Non-source vertices do not use per-second rate metrics, producing inaccurate 
> scaling decisions
> ----------------------------------------------------------------------------------------------
>
>                 Key: FLINK-39306
>                 URL: https://issues.apache.org/jira/browse/FLINK-39306
>             Project: Flink
>          Issue Type: Bug
>          Components: Autoscaler, Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.14.0
>            Reporter: Dennis-Mircea Ciupitu
>            Priority: Major
>              Labels: autoscaling, operator, pull-request-available
>             Fix For: kubernetes-operator-1.16.0
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
