[ 
https://issues.apache.org/jira/browse/FLINK-22767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhu Zhu reassigned FLINK-22767:
-------------------------------

    Assignee: Zhilong Hong

> Optimize the initialization of LocalInputPreferredSlotSharingStrategy
> ---------------------------------------------------------------------
>
>                 Key: FLINK-22767
>                 URL: https://issues.apache.org/jira/browse/FLINK-22767
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.13.0
>            Reporter: Zhilong Hong
>            Assignee: Zhilong Hong
>            Priority: Major
>              Labels: pull-request-available
>
> Based on the scheduler benchmark introduced in FLINK-21731, we find that 
> during the initialization of {{LocalInputPreferredSlotSharingStrategy}}, 
> there's a procedure that has O(N^2) complexity: 
> {{ExecutionSlotSharingGroupBuilder#tryFindAvailableProducerExecutionSlotSharingGroupFor}}
>  located in {{LocalInputPreferredSlotSharingStrategy}}.
> The original implementation is: 
> {code:java}
> for all SchedulingExecutionVertex in DefaultScheduler:
>   for all consumed SchedulingResultPartition of the SchedulingExecutionVertex:
>     get the result partition's producer vertex and determine the 
> ExecutionSlotSharingGroup where the producer vertex locates is available for 
> current vertex{code}
> This procedure has O(N^2) complexity.
> It's obvious that the result partitions in the same ConsumedPartitionGroup 
> have the same producer vertex. So we can just iterate over the 
> ConsumedPartitionGroups instead of all the consumed partitions. This will 
> decrease the complexity from O(N^2) to O(N).
> The optimization of this procedure will speed up the initialization of 
> DefaultScheduler. It will accelerate the submission of a new job, especially 
> for OLAP jobs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to