zentol opened a new pull request #13553:
URL: https://github.com/apache/flink/pull/13553


   Based on #13544.
   
   Adds the `DeclarativeSlotManager` (DSM), an alternative `SlotManager` 
implementation that supports FLIP-138.
   
   The core difference from the existing implementation is that the DSM does 
not receive individual slot requests from the JobMaster; instead it receives a 
`ResourceRequirements` instance that declares the absolute requirements of the 
job.
   The DSM keeps track of these, along with all registered and pending slots, 
matches requirements to slots, and initiates the allocation of the matched 
slots.
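   
   To illustrate what "absolute requirements" means here, a minimal sketch 
follows. It is not code from this PR: the class `DeclaredRequirementsSketch` 
and all of its methods are hypothetical, and it counts slots per job instead 
of tracking resource profiles. The point is only that a new declaration 
replaces the previous one rather than adding to it, and that the missing 
amount is derived from the declared total and what has been acquired so far.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; this is not Flink's ResourceTracker. */
final class DeclaredRequirementsSketch {
    // jobId -> total number of slots the job currently requires (absolute, not a delta)
    private final Map<String, Integer> required = new HashMap<>();
    // jobId -> number of slots already allocated to the job
    private final Map<String, Integer> acquired = new HashMap<>();

    /** A new declaration overwrites the previous one instead of adding to it. */
    void declareRequirements(String jobId, int totalSlots) {
        required.put(jobId, totalSlots);
    }

    /** Records that one more slot has been allocated to the job. */
    void slotAcquired(String jobId) {
        acquired.merge(jobId, 1, Integer::sum);
    }

    /** Missing slots are derived from the declared total, never stored as requests. */
    int missingSlots(String jobId) {
        int need = required.getOrDefault(jobId, 0);
        int have = acquired.getOrDefault(jobId, 0);
        return Math.max(0, need - have);
    }
}
```

   With individual slot requests, lowering the demand would require cancelling 
specific requests; in this sketch the job simply declares a smaller total.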
   
   Most of this logic and book-keeping is handled by recently added components 
(`SlotTracker`, `ResourceTracker`, `TaskExecutorManager`).
   
   What is left in the DSM itself is:
   a) a bunch of glue between these components and the `SlotManager` API
   b) the logic for matching missing resources to available/pending ones and 
initiating the corresponding slot allocations.
   
   The entry point for b) is `checkResourceRequirements()`, which is called on 
any significant change to the state of requirements or resources: when 
requirements change, when slots are added or removed, and when allocations 
complete or fail.
   This method essentially does the following:
   - retrieve the currently missing resources from the `ResourceTracker`
   - try assigning these resources to free slots, provided by the `SlotTracker`
     - if a match was found, start the allocation of the slot
   - if at the end of this process there are still missing resources, match 
the remaining missing resources against the pending slots, provided by the 
`TaskExecutorManager`
     - if a match was found for a requirement, do nothing. We do not maintain a 
mapping of requirements to pending resources as it would double the 
book-keeping. We just consider this requirement as fulfilled within the scope 
of this check.
     - if no match was found for a requirement, try allocating a new worker 
(implicitly increasing the pool of pending slots)
   - if at the end of this process missing resources are still present (== no 
more workers can be allocated), inform the JobMaster that there are not 
enough resources to fulfill the requirements
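   
   The steps above can be sketched roughly as follows. This is a simplified 
illustration under stated assumptions, not the implementation in this PR: the 
class `RequirementCheckSketch`, its `check` method and the `Result` type are 
hypothetical, resources are reduced to plain slot counts (the actual matching 
works on resource profiles), and every new worker is assumed to provide a 
fixed number of slots.

```java
/** Hypothetical, count-based sketch of the requirement check described above. */
final class RequirementCheckSketch {

    /** Outcome of one check; all fields are illustrative. */
    static final class Result {
        final int allocatedFromFreeSlots; // allocations started on free slots
        final int coveredByPendingSlots;  // treated as fulfilled for this check only
        final int newWorkersRequested;    // workers requested to grow the pending pool
        final int unfulfillable;          // what the JobMaster would be notified about

        Result(int allocated, int covered, int newWorkers, int unfulfillable) {
            this.allocatedFromFreeSlots = allocated;
            this.coveredByPendingSlots = covered;
            this.newWorkersRequested = newWorkers;
            this.unfulfillable = unfulfillable;
        }
    }

    static Result check(int missing, int freeSlots, int pendingSlots,
                        int slotsPerWorker, int workersStillAllocatable) {
        if (slotsPerWorker < 1) {
            throw new IllegalArgumentException("slotsPerWorker must be >= 1");
        }
        // Step 1: assign missing resources to free slots and start those allocations.
        int allocated = Math.min(missing, freeSlots);
        int remaining = missing - allocated;

        // Step 2: match the rest against pending slots. No requirement-to-slot
        // mapping is kept; a match only counts as fulfilled within this check.
        int covered = 0;
        int newWorkers = 0;
        int availablePending = pendingSlots;
        while (remaining > 0) {
            if (availablePending == 0) {
                if (newWorkers == workersStillAllocatable) {
                    break; // no more workers can be allocated
                }
                newWorkers++; // implicitly increases the pool of pending slots
                availablePending += slotsPerWorker;
            }
            availablePending--;
            covered++;
            remaining--;
        }
        // Step 3: anything still remaining cannot be fulfilled.
        return new Result(allocated, covered, newWorkers, remaining);
    }
}
```

   Because nothing is stored about which pending slot covers which 
requirement, a later call simply repeats the matching against the then-current 
state, which is what keeps the book-keeping from doubling.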
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]