[
https://issues.apache.org/jira/browse/FLINK-32288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730470#comment-17730470
]
Zhu Zhu commented on FLINK-32288:
---------------------------------
Thanks for reporting this! [~xiasun]
The root cause is that the VertexwiseSchedulingStrategy used by
AdaptiveBatchScheduler is less performant than the
PipelinedRegionSchedulingStrategy used by DefaultScheduler. But given that
AdaptiveBatchScheduler is the recommended and default scheduler for batch jobs,
we should use it to benchmark the batch job scheduling.
I have assigned you the ticket. Feel free to open a PR for it.
> Improve the scheduling performance of AdaptiveBatchScheduler
> ------------------------------------------------------------
>
> Key: FLINK-32288
> URL: https://issues.apache.org/jira/browse/FLINK-32288
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Affects Versions: 1.18.0
> Reporter: xingbe
> Assignee: xingbe
> Priority: Major
> Fix For: 1.18.0
>
>
> After adding the benchmark of AdaptiveBatchScheduler in FLINK-30480, we
> noticed a regression in the performance of
> SchedulingDownstreamTasksInBatchJobBenchmark#SchedulingDownstreamTasks. When
> scheduling a batch job with a parallelism of 4000*4000, the time spent
> increased from 32ms to 1336ms on my local PC.
> To improve performance, we can optimize the traversal by checking whether each
> of the consumedPartitionGroups has finished all of its partitions.
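
The group-level check proposed in the description can be sketched roughly as follows. This is a minimal illustration, not the actual Flink change: PartitionGroupSketch, Group, and isSchedulable are hypothetical stand-ins, and the sketch assumes each group keeps a counter of unfinished partitions that is decremented once per finished partition.

```java
import java.util.List;

/** Hypothetical, simplified stand-ins for illustration; not Flink's actual classes. */
final class PartitionGroupSketch {

    /** A group of result partitions that downstream vertices consume together. */
    static final class Group {
        // Assumption: the group tracks how many of its partitions are still unfinished.
        private int unfinishedPartitions;

        Group(int totalPartitions) {
            this.unfinishedPartitions = totalPartitions;
        }

        /** Called once for each partition of this group when it finishes. */
        void partitionFinished() {
            if (unfinishedPartitions == 0) {
                throw new IllegalStateException("all partitions already finished");
            }
            unfinishedPartitions--;
        }

        /** Constant-time check; no iteration over the individual partitions. */
        boolean areAllPartitionsFinished() {
            return unfinishedPartitions == 0;
        }
    }

    /** A downstream vertex is schedulable only if every consumed group is fully finished. */
    static boolean isSchedulable(List<Group> consumedGroups) {
        for (Group group : consumedGroups) {
            if (!group.areAllPartitionsFinished()) {
                return false;
            }
        }
        return true;
    }
}
```

Assuming an all-to-all edge where 4000 downstream vertices share a single group of 4000 upstream partitions, inspecting every partition for every vertex costs 4000 * 4000 = 16,000,000 checks, whereas the group-level check costs one per vertex, i.e. 4000 in total.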