[
https://issues.apache.org/jira/browse/FLINK-31976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tan Kim updated FLINK-31976:
----------------------------
Description:
Whether a scale-up is inefficient is determined as follows:
{code:java}
double lastProcRate =
        lastSummary.getMetrics().get(TRUE_PROCESSING_RATE).getAverage();
double lastExpectedProcRate =
        lastSummary.getMetrics().get(EXPECTED_PROCESSING_RATE).getCurrent();
var currentProcRate = evaluatedMetrics.get(TRUE_PROCESSING_RATE).getAverage();

double expectedIncrease = lastExpectedProcRate - lastProcRate;
double actualIncrease = currentProcRate - lastProcRate;

boolean withinEffectiveThreshold =
        (actualIncrease / expectedIncrease)
                >= conf.get(AutoScalerOptions.SCALING_EFFECTIVENESS_THRESHOLD);
{code}
Because expectedIncrease is derived from the last scaling summary, it does not
change unless another scale-up happens; only actualIncrease changes.
actualIncrease depends on currentProcRate (the average of
TRUE_PROCESSING_RATE), and TRUE_PROCESSING_RATE is calculated as follows:
trueProcessingRate = busyTimeMultiplier * numRecordsInPerSecond.getSum()
For example, suppose a scale-up has been marked as inefficient, but the lag
continues to build up.
Another scale-up is needed to eliminate the growing lag, but it will not happen
because the previous scale-up is marked as inefficient.
For the scale-up to stop being treated as inefficient, the following condition
must be met:
actualIncrease / expectedIncrease >= SCALING_EFFECTIVENESS_THRESHOLD (default 0.1)
Here, expectedIncrease is constant because it comes from lastSummary, so
actualIncrease must increase.
However, actualIncrease depends on busyTimeMultiplier and
numRecordsInPerSecond, and both converge to a steady value if no scaling
occurs.
Therefore, actualIncrease also converges.
If the converged value does not reach the threshold, no further scale-up is
possible, even if the lag continues to build up.
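The stuck state can be illustrated with a minimal, self-contained sketch of the
check above. It is not the operator's code: the class and method names are
hypothetical, the two "last" values are the sample figures quoted in the
earlier revision of this description, and the converged rates of 23000 and
25000 records/s are made-up inputs chosen only to land on either side of the
threshold.

```java
/** Hypothetical sketch of the effectiveness check; not the operator's actual class. */
final class ScaleUpEffectivenessSketch {

    /** Mirrors the comparison quoted in the description. */
    static boolean withinEffectiveThreshold(
            double lastProcRate,
            double lastExpectedProcRate,
            double currentProcRate,
            double threshold) {
        double expectedIncrease = lastExpectedProcRate - lastProcRate;
        double actualIncrease = currentProcRate - lastProcRate;
        return (actualIncrease / expectedIncrease) >= threshold;
    }

    /**
     * Smallest currentProcRate that passes the check.
     * Assumes expectedIncrease is positive (the scale-up was expected to help).
     */
    static double minEffectiveProcRate(
            double lastProcRate, double lastExpectedProcRate, double threshold) {
        return lastProcRate + threshold * (lastExpectedProcRate - lastProcRate);
    }

    public static void main(String[] args) {
        // Sample values taken from the earlier revision of the description.
        double lastProcRate = 22569.315633422066;
        double lastExpectedProcRate = 37340.0;
        double threshold = 0.1; // default stated in the description

        // Both "last" values are fixed until the next scale-up, so this bound is fixed too.
        System.out.println("required rate: "
                + minEffectiveProcRate(lastProcRate, lastExpectedProcRate, threshold));

        // Made-up converged rates: one below the bound, one above it.
        System.out.println("23000 effective? "
                + withinEffectiveThreshold(lastProcRate, lastExpectedProcRate, 23_000.0, threshold));
        System.out.println("25000 effective? "
                + withinEffectiveThreshold(lastProcRate, lastExpectedProcRate, 25_000.0, threshold));
    }
}
```

With these sample values the required rate is roughly 24046 records/s. If
TRUE_PROCESSING_RATE settles below that bound, the check keeps failing on every
evaluation, because nothing on the right-hand side can change without a new
scale-up.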
was:
Whether a scale-up is inefficient is determined as follows:
{code:java}
double lastProcRate =
        lastSummary.getMetrics().get(TRUE_PROCESSING_RATE).getAverage(); // 22569.315633422066
double lastExpectedProcRate =
        lastSummary.getMetrics().get(EXPECTED_PROCESSING_RATE).getCurrent(); // 37340.0
var currentProcRate = evaluatedMetrics.get(TRUE_PROCESSING_RATE).getAverage();

double expectedIncrease = lastExpectedProcRate - lastProcRate;
double actualIncrease = currentProcRate - lastProcRate;

boolean withinEffectiveThreshold =
        (actualIncrease / expectedIncrease)
                >= conf.get(AutoScalerOptions.SCALING_EFFECTIVENESS_THRESHOLD);
{code}
Because expectedIncrease is derived from the last scaling summary, it does not
change unless another scale-up happens; only actualIncrease changes.
actualIncrease depends on currentProcRate (the average of
TRUE_PROCESSING_RATE), and TRUE_PROCESSING_RATE is calculated as follows:
trueProcessingRate = busyTimeMultiplier * numRecordsInPerSecond.getSum()
For example, suppose a scale-up has been marked as inefficient, but the lag
continues to build up.
Another scale-up is needed to eliminate the growing lag, but it will not happen
because the previous scale-up is marked as inefficient.
For the scale-up to stop being treated as inefficient, the following condition
must be met:
actualIncrease / expectedIncrease >= SCALING_EFFECTIVENESS_THRESHOLD (default 0.1)
Here, expectedIncrease is constant because it comes from lastSummary, so
actualIncrease must increase.
However, actualIncrease depends on busyTimeMultiplier and
numRecordsInPerSecond, and both converge to a steady value if no scaling
occurs.
Therefore, actualIncrease also converges.
If the converged value does not reach the threshold, no further scale-up is
possible, even if the lag continues to build up.
> Once marked as an inefficient scale-up, further scaling may not happen forever
> ------------------------------------------------------------------------------
>
> Key: FLINK-31976
> URL: https://issues.apache.org/jira/browse/FLINK-31976
> Project: Flink
> Issue Type: Improvement
> Components: Autoscaler
> Affects Versions: 1.17.0
> Reporter: Tan Kim
> Priority: Major