[ https://issues.apache.org/jira/browse/FLINK-20612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17283906#comment-17283906 ]

Zhu Zhu commented on FLINK-20612:
---------------------------------

[~pnowojski] The main problem for scheduler performance is that a 
high-complexity process can be introduced unexpectedly, raising the 
computational complexity from O(N) to O(N^2) or even O(N^3). From this 
perspective, I think the scheduler benchmarks do not need to be that stable; 
a variance within 50% may be acceptable. Especially after some processes are 
optimized in FLINK-21110, the benchmark times may drop to much smaller 
numbers and the curve may become even less stable as a result. Even so, the 
benchmarks still serve the initial goal: a very steep jump in the curve will 
show up if an unexpectedly high-complexity process is introduced.
Besides that, I think we can also increase the parallelism from 4000 to 
10000, given that the benchmarks are currently a bit short (within 10 s). 
The parallelism was previously set to 4000 to keep the benchmarks from 
running too long, but it now seems that the running time at parallelism 
10000 is still acceptable.
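
For reference, a minimal sketch of how the parallelism could be exposed as a 
JMH parameter in the scheduler benchmarks (the class and field names here 
are hypothetical, not the actual benchmark code):

{code:java}
// Hypothetical sketch: expose the parallelism as a JMH parameter so both
// scales can be compared; not the actual flink-benchmarks code.
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class SchedulerBenchmarkState {

    // Raising the upper value from 4000 to 10000 keeps the measured time
    // large enough to be stable while the total runtime stays acceptable.
    @Param({"4000", "10000"})
    public int parallelism;
}
{code}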

> Add benchmarks for scheduler
> ----------------------------
>
>                 Key: FLINK-20612
>                 URL: https://issues.apache.org/jira/browse/FLINK-20612
>             Project: Flink
>          Issue Type: Improvement
>          Components: Benchmarks, Runtime / Coordination
>    Affects Versions: 1.13.0
>            Reporter: Zhilong Hong
>            Assignee: Zhilong Hong
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.13.0
>
>
> With Flink 1.12, we failed to run large-scale jobs on our cluster. When we 
> tried to run them, we hit exceptions such as heap out-of-memory errors and 
> taskmanager heartbeat timeouts. Even after we increased the heap size and 
> extended the heartbeat timeout, the jobs still failed. After 
> troubleshooting, we found several performance bottlenecks in the 
> jobmaster. These bottlenecks are highly related to the complexity of the 
> topology.
> We implemented several benchmarks for these bottlenecks based on 
> flink-benchmark. The topology of the benchmarks is a simple graph that 
> consists of only two vertices: one source vertex and one sink vertex, 
> connected with all-to-all blocking edges. The parallelism of both vertices 
> is 8000, and the execution mode is batch. The results of the benchmarks 
> are shown below:
> Table 1: Benchmark results for the bottlenecks in the jobmaster
> |*Procedure*|*Time spent*|
> |Build topology|45725.466 ms|
> |Init scheduling strategy|38960.602 ms|
> |Deploy tasks|17472.884 ms|
> |Calculate failover region to restart|12960.912 ms|
> We'd like to propose these benchmarks for procedures related to the 
> scheduler. There are three main benefits:
>  # They help us understand the current status of task deployment 
> performance and locate where the bottlenecks are.
>  # We can use the benchmarks to evaluate future optimizations.
>  # Since the benchmarks run daily, they will help us track how the 
> performance changes over time and locate the commit that introduces a 
> performance regression, if any.
> In the first version of the benchmarks, we mainly focus on the procedures we 
> mentioned above. The methods corresponding to the procedures are:
>  # Building topology: {{ExecutionGraph#attachJobGraph}}
>  # Initializing scheduling strategies: 
> {{PipelinedRegionSchedulingStrategy#init}}
>  # Deploying tasks: {{Execution#deploy}}
>  # Calculating failover regions: 
> {{RestartPipelinedRegionFailoverStrategy#getTasksNeedingRestart}}
> In the benchmarks, the topology consists of two vertices: source -> sink, 
> connected with all-to-all edges. The result partition types ({{PIPELINED}} 
> and {{BLOCKING}}) should be covered separately.
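> A minimal sketch of how such a two-vertex topology could be assembled with 
> the runtime {{JobGraph}} API is shown below. The helper classes and 
> constructors used here (e.g. {{NoOpInvokable}} and the varargs 
> {{JobGraph}} constructor) are assumptions for illustration, not 
> necessarily what the benchmarks will use.
> {code:java}
> // Sketch only: builds a source -> sink JobGraph connected by an
> // all-to-all edge with the given result partition type.
> import org.apache.flink.runtime.io.network.partition.ResultPartitionType;
> import org.apache.flink.runtime.jobgraph.DistributionPattern;
> import org.apache.flink.runtime.jobgraph.JobGraph;
> import org.apache.flink.runtime.jobgraph.JobVertex;
> import org.apache.flink.runtime.testtasks.NoOpInvokable;
>
> public class TwoVertexTopology {
>
>     public static JobGraph create(int parallelism, ResultPartitionType partitionType) {
>         JobVertex source = new JobVertex("source");
>         source.setInvokableClass(NoOpInvokable.class);
>         source.setParallelism(parallelism);
>
>         JobVertex sink = new JobVertex("sink");
>         sink.setInvokableClass(NoOpInvokable.class);
>         sink.setParallelism(parallelism);
>
>         // All-to-all edge; run once with BLOCKING and once with PIPELINED.
>         sink.connectNewDataSetAsInput(source, DistributionPattern.ALL_TO_ALL, partitionType);
>
>         return new JobGraph("two-vertex-topology", source, sink);
>     }
> }
> {code}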



