[ https://issues.apache.org/jira/browse/FLINK-31771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17712223#comment-17712223 ]
Weihua Hu commented on FLINK-31771:
-----------------------------------
[~wanglijie] Thanks for the reply. I think selecting slots for a whole bulk
would be very helpful. How about creating a new issue to track it?
Also, selecting a slot from getFreeSlotsInformation currently only occurs
during failover. IMO we should also add this scenario to the benchmark.
I would like to create tickets for both if they make sense.
> Improve select available slot from SlotPool
> -------------------------------------------
>
> Key: FLINK-31771
> URL: https://issues.apache.org/jira/browse/FLINK-31771
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Coordination
> Reporter: Weihua Hu
> Priority: Major
>
> DefaultScheduler requests slots from the SlotPool for tasks one by one.
> For each task, PhysicalSlotProviderImpl#tryAllocateFromAvailable
> retrieves all available slots from
> DefaultAllocatedSlotPool#getFreeSlotsInformation and then selects the best
> slot using the SlotSelectionStrategy.
> Currently, DefaultAllocatedSlotPool#getFreeSlotsInformation always calculates
> the taskExecutorUtilization. This makes task scheduling too slow when
> there are many slots, for example 20000 slots in total. However, only the
> EvenlySpreadOutLocationPreferenceSlotSelectionStrategy uses this utilization.
> So I would like to move the calculation of taskExecutorUtilization to the
> place where it is used: DefaultAllocatedSlotPool would provide a function,
> getTaskExecutorUtilization, that is called only by
> EvenlySpreadOutLocationPreferenceSlotSelectionStrategy.
> This change could reduce the latency of allocating 20000 slots from 72s to 12s
> in my local IDE.
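
The idea in the quoted description can be sketched roughly as follows. This is a minimal, self-contained illustration and not Flink's actual code: the classes Slot and SlotPool, the helper methods selectFirst and selectEvenlySpreadOut, and the utilizationCalls counter are all hypothetical stand-ins. Only the name getTaskExecutorUtilization comes from the issue text. The point is that listing free slots no longer computes utilization, and only the strategy that needs it asks for it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical, simplified model of the proposal; these are not Flink's real classes. */
final class LazyUtilizationSketch {

    /** A slot identified by an id and the task executor hosting it. */
    static final class Slot {
        final String slotId;
        final String taskExecutorId;

        Slot(String slotId, String taskExecutorId) {
            this.slotId = slotId;
            this.taskExecutorId = taskExecutorId;
        }
    }

    /** Stand-in for a slot pool that computes utilization only on demand. */
    static final class SlotPool {
        private final List<Slot> freeSlots = new ArrayList<>();
        private final Map<String, Integer> totalPerExecutor = new HashMap<>();
        private final Map<String, Integer> freePerExecutor = new HashMap<>();
        int utilizationCalls = 0; // instrumentation for this sketch only

        void addFreeSlot(Slot slot) {
            freeSlots.add(slot);
            totalPerExecutor.merge(slot.taskExecutorId, 1, Integer::sum);
            freePerExecutor.merge(slot.taskExecutorId, 1, Integer::sum);
        }

        void markAllocated(Slot slot) {
            if (freeSlots.remove(slot)) {
                freePerExecutor.merge(slot.taskExecutorId, -1, Integer::sum);
            }
        }

        /** Returns the free slots without computing any utilization. */
        List<Slot> getFreeSlots() {
            return new ArrayList<>(freeSlots);
        }

        /** Fraction of the executor's slots that are allocated, computed on request. */
        double getTaskExecutorUtilization(String taskExecutorId) {
            utilizationCalls++;
            int total = totalPerExecutor.getOrDefault(taskExecutorId, 0);
            if (total == 0) {
                return 0.0;
            }
            int free = freePerExecutor.getOrDefault(taskExecutorId, 0);
            return (double) (total - free) / total;
        }
    }

    /** A strategy that ignores utilization: takes the first free slot. */
    static Optional<Slot> selectFirst(SlotPool pool) {
        List<Slot> free = pool.getFreeSlots();
        return free.isEmpty() ? Optional.empty() : Optional.of(free.get(0));
    }

    /** A strategy that prefers the least utilized executor; the only utilization caller. */
    static Optional<Slot> selectEvenlySpreadOut(SlotPool pool) {
        Slot best = null;
        double bestUtilization = Double.MAX_VALUE;
        for (Slot slot : pool.getFreeSlots()) {
            double utilization = pool.getTaskExecutorUtilization(slot.taskExecutorId);
            if (utilization < bestUtilization) {
                bestUtilization = utilization;
                best = slot;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

In this sketch, a strategy like selectFirst never triggers a utilization computation, while selectEvenlySpreadOut pays for it only for the slots it actually inspects. The real change in Flink may cache or structure this differently.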
--
This message was sent by Atlassian Jira
(v8.20.10#820010)