[
https://issues.apache.org/jira/browse/FLINK-22767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhilong Hong updated FLINK-22767:
---------------------------------
Summary: Optimize the initialization of
LocalInputPreferredSlotSharingStrategy (was: Optimize the initialization of
ExecutionSlotSharingGroupBuilder)
> Optimize the initialization of LocalInputPreferredSlotSharingStrategy
> ---------------------------------------------------------------------
>
> Key: FLINK-22767
> URL: https://issues.apache.org/jira/browse/FLINK-22767
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Affects Versions: 1.13.0
> Reporter: Zhilong Hong
> Priority: Major
>
> Based on the scheduler benchmark introduced in FLINK-21731, we find that
> during the initialization of ExecutionSlotSharingGroupBuilder, there's a
> procedure that has O(N^2) complexity:
> {{ExecutionSlotSharingGroupBuilder#tryFindAvailableProducerExecutionSlotSharingGroupFor}}.
> This initialization happens during the initialization of
> LocalInputPreferredSlotSharingStrategy.
> The original implementation is:
> {code:java}
> for all SchedulingExecutionVertex in DefaultScheduler:
> for all consumed SchedulingResultPartition of the SchedulingExecutionVertex:
> get the result partition's producer vertex and determine the
> ExecutionSlotSharingGroup where the producer vertex locates is available for
> current vertex{code}
> This procedure has O(N^2) complexity.
> It's obvious that the result partitions in the same ConsumedPartitionGroup
> have the same producer vertex. So we can just iterate over the
> ConsumedPartitionGroups instead of all the consumed partitions. This will
> decrease the complexity from O(N^2) to O(N).
> The optimization of this procedure will speed up the initialization of
> DefaultScheduler. It will accelerate the submission of a new job, especially
> for OLAP jobs.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)