[
https://issues.apache.org/jira/browse/FLINK-22767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhilong Hong updated FLINK-22767:
---------------------------------
Fix Version/s: 1.14.0
Description:
Based on the scheduler benchmark introduced in FLINK-21731, we find that during
the initialization of {{LocalInputPreferredSlotSharingStrategy}}, there's a
procedure that has O(N^2) complexity:
{{ExecutionSlotSharingGroupBuilder#build}} located in
{{LocalInputPreferredSlotSharingStrategy}}.
The original implementation is:
{code:java}
for all SchedulingExecutionVertex in DefaultScheduler:
for all consumed SchedulingResultPartition of the SchedulingExecutionVertex:
get the result partition's producer vertex and determine the
ExecutionSlotSharingGroup where the producer vertex locates is available for
current vertex{code}
This procedure has O(N^2) complexity.
Instead of iterating over the ExecutionSlotSharingGroups of producer vertices
for all consumed result partitions, we can maintain a set of all available
ExecutionSlotSharingGroups for the consumed result partitions. Once a group is
assigned, it will be removed from the available group set. This will decrease
the complexity from O(N^2) to O(N).
The optimization of this procedure will speed up the initialization of
DefaultScheduler. It will accelerate the submission of a new job, especially
for OLAP jobs.
was:
Based on the scheduler benchmark introduced in FLINK-21731, we find that during
the initialization of {{LocalInputPreferredSlotSharingStrategy}}, there's a
procedure that has O(N^2) complexity:
{{ExecutionSlotSharingGroupBuilder#tryFindAvailableProducerExecutionSlotSharingGroupFor}}
located in {{LocalInputPreferredSlotSharingStrategy}}.
The original implementation is:
{code:java}
for all SchedulingExecutionVertex in DefaultScheduler:
for all consumed SchedulingResultPartition of the SchedulingExecutionVertex:
get the result partition's producer vertex and determine the
ExecutionSlotSharingGroup where the producer vertex locates is available for
current vertex{code}
This procedure has O(N^2) complexity.
Instead of iterating over the ExecutionSlotSharingGroups of producer vertices
for all consumed result partitions, we can maintain a set of all available
ExecutionSlotSharingGroups for the consumed result partitions. Once a group is
assigned, it will be removed from the available group set. This will decrease
the complexity from O(N^2) to O(N).
The optimization of this procedure will speed up the initialization of
DefaultScheduler. It will accelerate the submission of a new job, especially
for OLAP jobs.
> Optimize the initialization of LocalInputPreferredSlotSharingStrategy
> ---------------------------------------------------------------------
>
> Key: FLINK-22767
> URL: https://issues.apache.org/jira/browse/FLINK-22767
> Project: Flink
> Issue Type: Sub-task
> Components: Runtime / Coordination
> Affects Versions: 1.13.0
> Reporter: Zhilong Hong
> Assignee: Zhilong Hong
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Based on the scheduler benchmark introduced in FLINK-21731, we find that
> during the initialization of {{LocalInputPreferredSlotSharingStrategy}},
> there's a procedure that has O(N^2) complexity:
> {{ExecutionSlotSharingGroupBuilder#build}} located in
> {{LocalInputPreferredSlotSharingStrategy}}.
> The original implementation is:
> {code:java}
> for all SchedulingExecutionVertex in DefaultScheduler:
> for all consumed SchedulingResultPartition of the SchedulingExecutionVertex:
> get the result partition's producer vertex and determine the
> ExecutionSlotSharingGroup where the producer vertex locates is available for
> current vertex{code}
> This procedure has O(N^2) complexity.
> Instead of iterating over the ExecutionSlotSharingGroups of producer vertices
> for all consumed result partitions, we can maintain a set of all available
> ExecutionSlotSharingGroups for the consumed result partitions. Once a group
> is assigned, it will be removed from the available group set. This will
> decrease the complexity from O(N^2) to O(N).
> The optimization of this procedure will speed up the initialization of
> DefaultScheduler. It will accelerate the submission of a new job, especially
> for OLAP jobs.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)