[ 
https://issues.apache.org/jira/browse/FLINK-32306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17731458#comment-17731458
 ] 

xingbe commented on FLINK-32306:
--------------------------------

Hi [~martijnvisser], thanks for reporting this!

As [~zhuzh] mentioned, replacing the scheduler type in the batch job 
benchmark exposed the existing performance regressions. I would like to 
explain the third issue mentioned above in more detail. Because the 
AdaptiveBatchScheduler schedules a dynamic graph, vertex initialization no 
longer runs in the createScheduler phase. It is now done lazily in the 
startScheduling phase, where 
`AdaptiveBatchScheduler#startSchedulingInternal` calls 
`initializeVerticesIfPossible()`.

As the latest benchmark shows, the createScheduler phase now takes about 
200ms less ([benchmark 
url|http://codespeed.dak8s.net:8000/timeline/#/?exe=1,3,5,6,8,9&ben=createScheduler.BATCH&env=2&revs=200&equid=off&quarts=on&extr=on]),
while the startScheduling phase takes 200ms more, which confirms this. The 
cost has moved from one phase to the other rather than grown, and since both 
methods are called only once during initialization, overall performance is 
not reduced.
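
To illustrate the shift described above, here is a minimal sketch of eager 
versus lazy initialization. It is not Flink's scheduler code: 
`LazyInitSketch`, `EagerScheduler`, `LazyScheduler` and 
`initializeVertices` are hypothetical names, and a list of strings stands in 
for the real vertex initialization. It only shows that the same work is done 
once in either variant, just in a different phase.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is an illustration, not Flink code.
final class LazyInitSketch {

    /** Stand-in for the expensive per-vertex initialization. */
    static List<String> initializeVertices(int count) {
        List<String> vertices = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            vertices.add("vertex-" + i);
        }
        return vertices;
    }

    /** Eager variant: the initialization cost is paid when the scheduler is created. */
    static final class EagerScheduler {
        private final List<String> vertices;

        EagerScheduler(int vertexCount) {
            this.vertices = initializeVertices(vertexCount);
        }

        void startScheduling() {
            // Nothing left to initialize here.
        }

        int initializedVertices() {
            return vertices.size();
        }
    }

    /** Lazy variant: creation is cheap, the same cost is paid when scheduling starts. */
    static final class LazyScheduler {
        private final int vertexCount;
        private List<String> vertices = new ArrayList<>();

        LazyScheduler(int vertexCount) {
            this.vertexCount = vertexCount;
        }

        void startScheduling() {
            // Initialize only once, on the first call.
            if (vertices.isEmpty()) {
                vertices = initializeVertices(vertexCount);
            }
        }

        int initializedVertices() {
            return vertices.size();
        }
    }
}
```

In this sketch, a benchmark that times only construction would favor 
`LazyScheduler` and one that times only `startScheduling()` would favor 
`EagerScheduler`, while the total work stays the same.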

> Multiple batch scheduler performance regressions
> ------------------------------------------------
>
>                 Key: FLINK-32306
>                 URL: https://issues.apache.org/jira/browse/FLINK-32306
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>            Reporter: Martijn Visser
>            Priority: Blocker
>
> InitScheduling.BATCH
> http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=initSchedulingStrategy.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200
> schedulingDownstreamTasks.BATCH 
> http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=schedulingDownstreamTasks.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200
> startScheduling.BATCH
> http://codespeed.dak8s.net:8000/timeline/#/?exe=5&ben=startScheduling.BATCH&extr=on&quarts=on&equid=off&env=2&revs=200



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
