[
https://issues.apache.org/jira/browse/FLINK-31245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17694008#comment-17694008
]
Zhanghao Chen commented on FLINK-31245:
---------------------------------------
[~dmvk] Looking forward to your opinions on this. Personally, I think we can
deprecate the use of GlobalAggregateManager.
> Adaptive scheduler does not reset the state of GlobalAggregateManager when
> rescaling
> ------------------------------------------------------------------------------------
>
> Key: FLINK-31245
> URL: https://issues.apache.org/jira/browse/FLINK-31245
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.16.1
> Reporter: Zhanghao Chen
> Priority: Major
> Fix For: 1.18.0
>
>
> *Problem*
> GlobalAggregateManager is used to share state amongst parallel tasks in a job
> and thus coordinate their execution. It maintains a state (the _accumulators_
> field in JobMaster) in JM memory. The content of the accumulator state is
> defined in user code. At my company, a user stores the task parallelism in the
> accumulator, assuming that task parallelism never changes. However, this
> assumption is broken when using the adaptive scheduler, which can rescale a
> job without resetting this state.
> *Possible Solutions*
> # Mark GlobalAggregateManager as deprecated. It seems that the operator
> coordinator can completely replace GlobalAggregateManager and is a more
> elegant solution. If so, it is fine to deprecate GlobalAggregateManager
> and leave this issue unfixed. In that case, we can open another ticket
> for the deprecation.
> # If we decide to continue supporting GlobalAggregateManager, then we need
> to reset its state when rescaling.
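
The stale-state problem and the reset proposed in option 2 can be illustrated with a minimal sketch. It uses no Flink classes: InMemoryAggregateStore, registerParallelism and reset() are hypothetical stand-ins for the JM-side accumulator map and the user's aggregate function, and the "first write wins" merge is only an assumption about how the user code might behave.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

public class AggregateResetSketch {

    /** Hypothetical model of a named-aggregate map held in JobManager memory. */
    static final class InMemoryAggregateStore {
        private final Map<String, Integer> accumulators = new HashMap<>();

        /** Merges value into the named accumulator and returns the stored result. */
        synchronized int update(String name, int value, BinaryOperator<Integer> merge) {
            return accumulators.merge(name, value, merge);
        }

        /** Hypothetical hook that a scheduler could call when the job is rescaled. */
        synchronized void reset() {
            accumulators.clear();
        }
    }

    /** User code (assumed): record the parallelism once and keep the first value. */
    static int registerParallelism(InMemoryAggregateStore store, int parallelism) {
        return store.update("parallelism", parallelism, (existing, incoming) -> existing);
    }

    public static void main(String[] args) {
        InMemoryAggregateStore store = new InMemoryAggregateStore();

        // Initial deployment with parallelism 4.
        System.out.println(registerParallelism(store, 4));

        // Rescale to 2 without a reset: the stored value stays at 4.
        System.out.println(registerParallelism(store, 2));

        // Rescale to 2 after a reset: the tasks see the new parallelism.
        store.reset();
        System.out.println(registerParallelism(store, 2));
    }
}
```

In this model the second call still returns the old parallelism, which mirrors the broken assumption described in the issue. Clearing the map on rescale restores the expected value, at the cost of dropping any other aggregates kept in the same store.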