[jira] [Commented] (FLINK-23826) Verify optimized scheduler performance for large-scale jobs

Zhu Zhu (Jira) Tue, 14 Sep 2021 02:32:05 -0700


    [ 
https://issues.apache.org/jira/browse/FLINK-23826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17414833#comment-17414833
 ]


Zhu Zhu commented on FLINK-23826:
---------------------------------

Here are the test results:

||JM/Scheduler Initializing||Task deployment||Making failure recovery decision||
|0.2s|16s|0.2s|

A similar benchmark against Flink 1.12 (which does not include all of the 
performance improvements) is in progress but is blocked by some environment 
problems. The results will be posted a bit later.
This ticket is no longer a blocker for Flink 1.14.


> Verify optimized scheduler performance for large-scale jobs
> -----------------------------------------------------------
>
>                 Key: FLINK-23826
>                 URL: https://issues.apache.org/jira/browse/FLINK-23826
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.14.0
>            Reporter: Zhu Zhu
>            Assignee: Zhu Zhu
>            Priority: Critical
>             Fix For: 1.14.0
>
>
> This ticket is used to verify the result of FLINK-21110.
> It should check, with a real job running on a cluster, whether scheduling of 
> large-scale jobs works well and how well it performs.
> The conclusion should include, for a *10000 --- all-to-all-connected 
> --> 10000* job:
> 1. time of job initialization on master (job received -> scheduling started)
> 2. time of task deployment (task deploying started -> all tasks in 
> INITIALIZATION)
> 3. time of making task failure recovery decision (JM notified about task 
> failure -> tasks to restart decided)
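
For orientation, the three measurements listed above are each the interval between a start event and an end event, and an all-to-all connection between 10000 upstream and 10000 downstream subtasks implies 10000 x 10000 logical connections. The following is only a minimal sketch of that arithmetic. The event names and the helper functions are hypothetical labels chosen for illustration; they are not Flink APIs or log fields.

```python
from typing import Dict, Tuple

# Hypothetical event names (NOT Flink identifiers), one start/end pair per
# measurement described in the ticket.
PHASES: Dict[str, Tuple[str, str]] = {
    "job_initialization": ("job_received", "scheduling_started"),
    "task_deployment": ("task_deploying_started", "all_tasks_in_initialization"),
    "failure_recovery_decision": ("jm_notified_of_task_failure", "tasks_to_restart_decided"),
}


def all_to_all_edge_count(upstream: int, downstream: int) -> int:
    """Number of logical connections when every upstream subtask feeds every downstream subtask."""
    return upstream * downstream


def phase_durations(events: Dict[str, float]) -> Dict[str, float]:
    """Return end-minus-start (in seconds) for each phase whose two events are both present."""
    durations: Dict[str, float] = {}
    for phase, (start, end) in PHASES.items():
        if start in events and end in events:
            durations[phase] = events[end] - events[start]
    return durations
```

Calling `all_to_all_edge_count(10000, 10000)` gives 100000000, which is the scale of the topology the ticket asks to verify. `phase_durations` skips any phase for which a timestamp is missing rather than guessing a value.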



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
