[
https://issues.apache.org/jira/browse/FLINK-31826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhanghao Chen closed FLINK-31826.
---------------------------------
Resolution: Duplicate
> Incorrect estimation of the target data rate of a vertex when only a subset
> of its upstream vertex's output is consumed
> -----------------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-31826
> URL: https://issues.apache.org/jira/browse/FLINK-31826
> Project: Flink
> Issue Type: Improvement
> Components: Autoscaler
> Reporter: Zhanghao Chen
> Priority: Major
> Attachments: LHL7VKOG4B.jpg
>
>
> Currently, a vertex's target data rate = the sum over its upstream vertices
> of (target data rate * input/output ratio). This assumes that all of an
> upstream vertex's output goes into the current vertex. However, this does not
> always hold. Consider the following job plan generated by a Flink SQL job.
> The vertex in the middle has multiple Calc(select xx) operators chained, each
> connected to a separate downstream task. The total num_rec_out_rate of the
> middle vertex = SUM num_rec_in_rate of its downstream tasks, so each
> downstream task consumes only a part of that output.
> To fix this problem, we need operator-level output metrics and edge info. The
> operator-level metrics part is easy, but AFAIK, there's no way to get the
> operator-level edge info from the current Flink REST APIs.
> !LHL7VKOG4B.jpg!
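
The mismatch described above can be illustrated with a small sketch. This is
not the autoscaler's actual code: the function names, the rates, and the
downstream task names (sink_a, sink_b, sink_c) are all hypothetical, chosen
only to show how using the vertex-level output rate overestimates each
downstream target when the output is split across several edges.

```python
# Hypothetical numbers for the middle vertex (not taken from a real job).
upstream_in_rate = 1000.0     # observed num_rec_in_rate of the middle vertex
upstream_target_in = 2000.0   # target data rate computed for the middle vertex

# Observed num_rec_in_rate of each downstream task, i.e. the per-edge output
# of the chained Calc operators (assumed to be available for this sketch).
edge_out_rates = {"sink_a": 600.0, "sink_b": 300.0, "sink_c": 100.0}

# Vertex-level num_rec_out_rate is the sum over all downstream edges.
upstream_total_out_rate = sum(edge_out_rates.values())


def vertex_level_target(target_in, in_rate, total_out_rate):
    """Estimate that assumes all upstream output goes to one downstream vertex."""
    return target_in * total_out_rate / in_rate


def edge_level_target(target_in, in_rate, edge_out_rate):
    """Estimate that uses only the output actually sent along one edge."""
    return target_in * edge_out_rate / in_rate


# Every downstream vertex gets the same (too large) vertex-level estimate.
vertex_level = {
    name: vertex_level_target(upstream_target_in, upstream_in_rate,
                              upstream_total_out_rate)
    for name in edge_out_rates
}

# With per-edge rates, each downstream vertex gets its own share.
edge_level = {
    name: edge_level_target(upstream_target_in, upstream_in_rate, rate)
    for name, rate in edge_out_rates.items()
}

for name in edge_out_rates:
    print(name, "vertex-level:", vertex_level[name],
          "edge-level:", edge_level[name])
```

With these assumed numbers, the vertex-level estimate assigns the full scaled
output to every downstream task, while the edge-level estimates sum to that
same total. Computing the edge-level version in practice would require the
operator-level edge info that the issue says is not exposed by the REST APIs.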
--
This message was sent by Atlassian Jira
(v8.20.10#820010)