[ 
https://issues.apache.org/jira/browse/FLINK-9455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624754#comment-16624754
 ] 

ASF GitHub Bot commented on FLINK-9455:
---------------------------------------

tillrohrmann opened a new pull request #6734: [FLINK-9455][RM] Add support for 
multi task slot TaskExecutors
URL: https://github.com/apache/flink/pull/6734
 
 
   ## What is the purpose of the change
   
   This PR adds support for multi task slot TaskExecutors to Flink. Previously, it 
was recommended to start a Flink cluster with single-slot `TaskExecutors`. Now 
multi-slot `TaskExecutors` can also be configured without Flink requesting more 
resources than it needs.
   
   ## Brief change log
   
   - Extend `ResourceActions#allocateResource` to return 
`Collection<ResourceProfile>` indicating the set of slots to expect
   - Store expected slots as `PendingTaskManagerSlot`
   - Use `PendingTaskManagerSlot` to fulfill `PendingSlotRequest`
   - Only ask for new resources if there are no free `TaskManagerSlots` and no 
unassigned `PendingTaskManagerSlots` left
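   
The bookkeeping in the change log above can be illustrated with a minimal 
sketch. It is not the code from this PR: `PendingSlotSketch`, 
`ResourceAllocator`, and `resourceRequestsFor` are hypothetical names, slots are 
reduced to plain counters instead of `ResourceProfile`s, and the registration of 
the started `TaskExecutor` is left out.

```java
/** Simplified, hypothetical model of the bookkeeping; not Flink's SlotManager. */
final class PendingSlotSketch {

    /** Stand-in for ResourceActions#allocateResource (assumption: returns a slot count). */
    interface ResourceAllocator {
        int allocateResource();
    }

    private final ResourceAllocator allocator;
    private int freeRegisteredSlots = 0;     // slots of already registered TaskExecutors
    private int unassignedPendingSlots = 0;  // announced slots not yet assigned to a request
    private int resourceRequests = 0;        // how often a new TaskExecutor was requested

    PendingSlotSketch(ResourceAllocator allocator) {
        this.allocator = allocator;
    }

    /** Handles one slot request; returns true if a registered or pending slot was assigned. */
    boolean requestSlot() {
        if (freeRegisteredSlots > 0) {
            freeRegisteredSlots--;
            return true;
        }
        // Only ask for new resources if no unassigned pending slot is left.
        if (unassignedPendingSlots == 0) {
            unassignedPendingSlots += allocator.allocateResource();
            resourceRequests++;
        }
        if (unassignedPendingSlots > 0) {
            unassignedPendingSlots--;
            return true;
        }
        return false;
    }

    /** Counts the resource requests triggered by a number of slot requests on an empty cluster. */
    static int resourceRequestsFor(int slotRequests, int slotsPerTaskExecutor) {
        PendingSlotSketch sketch = new PendingSlotSketch(() -> slotsPerTaskExecutor);
        for (int i = 0; i < slotRequests; i++) {
            sketch.requestSlot();
        }
        return sketch.resourceRequests;
    }

    public static void main(String[] args) {
        System.out.println(resourceRequestsFor(4, 4)); // prints 1
        System.out.println(resourceRequestsFor(4, 1)); // prints 4
    }
}
```

In this sketch, four slot requests against `TaskExecutors` with four slots each 
trigger a single resource request, whereas single-slot `TaskExecutors` trigger 
one resource request per slot request.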
   
   ## Verifying this change
   
   - Added the following tests to `SlotManagerTest`: `testRequestNewResources`, 
`testFailingAllocationReturnsPendingTaskManagerSlot`, 
`testPendingTaskManagerSlotCompletion`, `testRegistrationOfDifferentSlot`, 
`testOnlyFreeSlotsCanFulfillPendingTaskManagerSlot`
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
     - The serializers: (no)
     - The runtime per-record code paths (performance sensitive): (no)
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Yarn/Mesos, ZooKeeper: (**yes**)
     - The S3 file system connector: (no)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (no)
     - If yes, how is the feature documented? (not applicable)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Make SlotManager aware of multi slot TaskManagers
> -------------------------------------------------
>
>                 Key: FLINK-9455
>                 URL: https://issues.apache.org/jira/browse/FLINK-9455
>             Project: Flink
>          Issue Type: Improvement
>          Components: Distributed Coordination, ResourceManager
>    Affects Versions: 1.5.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.7.0
>
>
> The {{SlotManager}}, which is responsible for managing all available slots of a 
> Flink cluster, can request to start new {{TaskManagers}} if it cannot fulfill a 
> slot request. A newly started {{TaskManager}} can be configured with multiple 
> slots, but currently the {{SlotManager}} assumes that it will be started with a 
> single slot. As a consequence, it might issue multiple requests to start new 
> {{TaskManagers}} even though a single one would be sufficient to fulfill all 
> pending slot requests.
> In order to avoid requesting unnecessary resources, which are only freed after 
> the idle timeout, I suggest making the {{SlotManager}} aware of how many slots a 
> {{TaskManager}} is started with. That way the {{SlotManager}} only needs to 
> request a new {{TaskManager}} if all slots of the previously started 
> {{TaskManagers}} (potentially not yet registered and, thus, future slots) are 
> already assigned to slot requests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
