[ 
https://issues.apache.org/jira/browse/FLINK-38622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18035301#comment-18035301
 ] 

RocMarshal commented on FLINK-38622:
------------------------------------

During batch allocation, all resource requests are matched to slots in a 
single, non-interleaved operation, so a request that ends up unmatched cannot 
be adjusted individually on the spot. As a result, some resource requests may 
remain unfulfilled even though a complete assignment exists. For example:
  resource requests:
   - resource request-1: ResourceProfile-1(UNKNOWN)
   - resource request-2: ResourceProfile-2(cpu=2 core, memory=2G)
 
  available slots:
   - slot-a: ResourceProfile-a(cpu=1 core, memory=1G)
   - slot-b: ResourceProfile-b(cpu=2 core, memory=2G)
  
When TasksBalancedRequestSlotMatchingStrategy performs the allocation, the 
following mapping may occur. It leaves one slot request unassigned and thus 
hinders the scheduling of the entire job:
  the unexpected mapping case:
    - resource request-1: ResourceProfile-1(UNKNOWN) was matched with slot-b: ResourceProfile-b(cpu=2 core, memory=2G)
    - resource request-2: ResourceProfile-2(cpu=2 core, memory=2G) was not matched
  
Therefore, it is crucial to determine how ResourceProfiles should be matched 
before the batch allocation of resource requests, with the aim of at least 
ensuring that the allocation succeeds. An ideal matching would be:
  - ResourceProfile-1(UNKNOWN)               -> ResourceProfile-a(cpu=1 core, memory=1G)
  - ResourceProfile-2(cpu=2 core, memory=2G) -> ResourceProfile-b(cpu=2 core, memory=2G)
  
This is the motivation for introducing the current ticket.
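
To make the example above concrete, here is a minimal, self-contained sketch. 
It does not use Flink's classes: Profile, matchInOrder and matchSpecificFirst 
are hypothetical names, and the fitting rule (an UNKNOWN request fits any slot, 
a specific request needs a slot with at least the requested cpu and memory) is 
a simplifying assumption. Ordering specific requests first is only one possible 
heuristic for this example, not necessarily what the ticket will implement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SlotMatchingSketch {

    /** Hypothetical, simplified stand-in for a resource profile (not Flink's ResourceProfile). */
    static final class Profile {
        final String name;
        final double cpuCores;
        final int memoryGb;
        final boolean unknown;

        private Profile(String name, double cpuCores, int memoryGb, boolean unknown) {
            this.name = name;
            this.cpuCores = cpuCores;
            this.memoryGb = memoryGb;
            this.unknown = unknown;
        }

        static Profile unknown(String name) {
            return new Profile(name, 0, 0, true);
        }

        static Profile of(String name, double cpuCores, int memoryGb) {
            return new Profile(name, cpuCores, memoryGb, false);
        }

        /** Assumed rule: an UNKNOWN request fits any slot; otherwise the slot must cover cpu and memory. */
        boolean fitsInto(Profile slot) {
            return unknown || (slot.cpuCores >= cpuCores && slot.memoryGb >= memoryGb);
        }
    }

    /** Gives each request, in the given order, the first free slot it fits into. */
    static Map<String, String> matchInOrder(List<Profile> requests, List<Profile> slots) {
        List<Profile> free = new ArrayList<>(slots);
        Map<String, String> matched = new LinkedHashMap<>();
        for (Profile request : requests) {
            for (Iterator<Profile> it = free.iterator(); it.hasNext(); ) {
                Profile slot = it.next();
                if (request.fitsInto(slot)) {
                    matched.put(request.name, slot.name);
                    it.remove();
                    break;
                }
            }
        }
        return matched;
    }

    /** Heuristic: handle specific requests before UNKNOWN ones, then match as above. */
    static Map<String, String> matchSpecificFirst(List<Profile> requests, List<Profile> slots) {
        List<Profile> ordered = new ArrayList<>(requests);
        // Boolean ordering puts false (specific) before true (UNKNOWN); List.sort is stable.
        ordered.sort(Comparator.comparing((Profile p) -> p.unknown));
        return matchInOrder(ordered, slots);
    }

    public static void main(String[] args) {
        List<Profile> requests = Arrays.asList(
                Profile.unknown("request-1"),
                Profile.of("request-2", 2, 2));
        // slot-b is listed first so that the unexpected mapping is reproduced.
        List<Profile> slots = Arrays.asList(
                Profile.of("slot-b", 2, 2),
                Profile.of("slot-a", 1, 1));

        System.out.println(matchInOrder(requests, slots));       // {request-1=slot-b}
        System.out.println(matchSpecificFirst(requests, slots)); // {request-2=slot-b, request-1=slot-a}
    }
}
```

In the first call request-2 stays unmatched because the UNKNOWN request has 
already taken the only slot large enough for it. In the second call both 
requests are assigned.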

> Enhance the requests and slots balanced allocation logic in DefaultScheduler
> ----------------------------------------------------------------------------
>
>                 Key: FLINK-38622
>                 URL: https://issues.apache.org/jira/browse/FLINK-38622
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>            Reporter: RocMarshal
>            Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
