[ https://issues.apache.org/jira/browse/FLINK-30198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638331#comment-17638331 ]
Aitozi commented on FLINK-30198:
--------------------------------

cc [~wanglijie], what do you think of this?

> Support AdaptiveBatchScheduler to set per-task size for reducer task
> ---------------------------------------------------------------------
>
>                 Key: FLINK-30198
>                 URL: https://issues.apache.org/jira/browse/FLINK-30198
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>            Reporter: Aitozi
>            Priority: Major
>
> When we use AdaptiveBatchScheduler in our case, we found that it works
> well in most cases, but there is one limitation: there is only a single
> global parameter for the per-task data size,
> {{jobmanager.adaptive-batch-scheduler.avg-data-volume-per-task}}.
> However, in a map-reduce architecture, the reducer tasks usually have
> more complex computation logic, such as aggregate/sort/join operators. So I
> think it would be nicer if we could set the per-task data size for reducer
> and mapper tasks individually.
> Then, how do we distinguish a reducer task?
> IMO, we can let the parallelism decider know whether the vertex has a
> hash-edge input. If yes, it should be treated as a reducer task.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
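
The idea in the issue description could be sketched roughly as below. This is only an illustration under stated assumptions, not Flink's actual parallelism decider: the class `PerTaskSizeParallelismDecider`, the `EdgeType` enum, the separate mapper/reducer sizes, and the min/max clamping are all hypothetical names and choices made for this example. It treats a vertex as a reducer when any of its input edges is a hash edge, then divides the total input size by the matching per-task size.

```java
import java.util.List;

/** Hypothetical sketch; none of these names are existing Flink APIs. */
final class PerTaskSizeParallelismDecider {

    /** Simplified, hypothetical view of how an input edge distributes data. */
    enum EdgeType { HASH, REBALANCE, FORWARD, BROADCAST }

    private final long mapperBytesPerTask;
    private final long reducerBytesPerTask;
    private final int minParallelism;
    private final int maxParallelism;

    PerTaskSizeParallelismDecider(
            long mapperBytesPerTask,
            long reducerBytesPerTask,
            int minParallelism,
            int maxParallelism) {
        if (mapperBytesPerTask <= 0 || reducerBytesPerTask <= 0) {
            throw new IllegalArgumentException("per-task sizes must be positive");
        }
        if (minParallelism < 1 || maxParallelism < minParallelism) {
            throw new IllegalArgumentException("invalid parallelism bounds");
        }
        this.mapperBytesPerTask = mapperBytesPerTask;
        this.reducerBytesPerTask = reducerBytesPerTask;
        this.minParallelism = minParallelism;
        this.maxParallelism = maxParallelism;
    }

    /** Assumption from the issue: a vertex with a hash input edge is a reducer. */
    static boolean isReducer(List<EdgeType> inputEdges) {
        return inputEdges.contains(EdgeType.HASH);
    }

    /** Picks the per-task size by vertex kind, then clamps the task count. */
    int decideParallelism(List<EdgeType> inputEdges, long totalInputBytes) {
        if (totalInputBytes < 0) {
            throw new IllegalArgumentException("totalInputBytes must be >= 0");
        }
        long bytesPerTask =
                isReducer(inputEdges) ? reducerBytesPerTask : mapperBytesPerTask;
        // Ceiling division without risking overflow on very large inputs.
        long tasks = totalInputBytes / bytesPerTask
                + (totalInputBytes % bytesPerTask == 0 ? 0 : 1);
        return (int) Math.max(minParallelism, Math.min(maxParallelism, tasks));
    }
}
```

With a 1 GiB mapper size and a 256 MiB reducer size, the same 10 GiB of input would give a parallelism of 10 for a vertex with only forward inputs and 40 for a vertex with a hash input, which is the per-vertex-kind behaviour the issue asks for.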