mxm commented on PR #847: URL: https://github.com/apache/flink-kubernetes-operator/pull/847#issuecomment-2243227124
> The `actual_target_data_rate_join` can still be greater than `target_data_rate_upstream_1` or `target_data_rate_upstream_2`. In that case, the upstreams' `target_data_rate` remains unchanged.
>
> Also, the `actual_target_data_rate_join` can be less than the upstreams' `target_data_rate`, making them equal to `actual_target_data_rate_join`. But then the `target_data_rate` of the join will be twice what was expected.
>
> In both cases, the upstream_1 and upstream_2 operators will remain blocked after scaling. This is why the simpler approach may not be good enough.

I think it can work if we apply the same logic that we used to determine `target_data_rate_join`. As you pointed out, we determined the target data rate via:

`actual_data_rate_join = actual_data_rate_upstream_1 + actual_data_rate_upstream_2`

Consequently, we would need to satisfy the following equation for the backpropagation:

`target_data_rate_join = target_data_rate_upstream_1 + target_data_rate_upstream_2`

That would mean that each input vertex gets the following limit applied:

`actual_data_rate_upstream_i = target_data_rate_upstream_i - (target_data_rate_join - actual_target_data_rate_join) / N`

where `N` is the number of inputs. Do you think that would work? The benefit of this approach is that we leverage all the available information without having to introduce and feed back additional factors.
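To make the arithmetic concrete, here is a minimal Java sketch of the even-split rule above. The class and method names (`BackpropagationSketch`, `backpropagateLimits`) are hypothetical and not part of the operator's codebase; this only illustrates the proposed formula, not an actual implementation.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch (names are illustrative, not the operator's API) of the
 * proposed backpropagation rule: distribute the join's unmet capacity evenly
 * across its N inputs, i.e.
 *   limit_i = target_data_rate_upstream_i
 *             - (target_data_rate_join - actual_target_data_rate_join) / N
 */
public class BackpropagationSketch {

    static double[] backpropagateLimits(
            double targetDataRateJoin,
            double actualTargetDataRateJoin,
            double[] targetDataRateUpstream) {
        int n = targetDataRateUpstream.length;
        // The join's shortfall, split evenly over all inputs.
        double deficitPerInput = (targetDataRateJoin - actualTargetDataRateJoin) / n;
        double[] limits = new double[n];
        for (int i = 0; i < n; i++) {
            // Each input gives up an equal share of the join's shortfall.
            limits[i] = targetDataRateUpstream[i] - deficitPerInput;
        }
        return limits;
    }

    public static void main(String[] args) {
        // Example: the join can only sustain 800 rec/s of a 1000 rec/s target,
        // so each of the two upstreams is capped 100 rec/s below its own target.
        double[] limits = backpropagateLimits(1000, 800, new double[] {600, 400});
        System.out.println(Arrays.toString(limits)); // [500.0, 300.0]
    }
}
```

Note that the resulting limits sum to `actual_target_data_rate_join` (500 + 300 = 800 in the example), which is exactly the invariant `target_data_rate_join = target_data_rate_upstream_1 + target_data_rate_upstream_2` applied to the achievable rate.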
