[ 
https://issues.apache.org/jira/browse/FLINK-9455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16502918#comment-16502918
 ] 

Till Rohrmann commented on FLINK-9455:
--------------------------------------

Hi [~sihuazhou], the issue is indeed a bit more involved than it might look at 
first glance.
* Given FLINK-5131, we should be able to handle TMs that are started with 
different {{ResourceProfiles}}. However, until that issue is resolved, all 
slots will have the same immutable {{ResourceProfile}}, defined at cluster 
startup time.
* I think we should assign slot ids on the {{ResourceManager}} side in order to 
match registered slots to slot allocations. However, it should also be possible 
to fulfill a pending slot allocation with any other freed slot that meets the 
resource requirements. In a sense, the {{SlotManager}} should work in a similar 
fashion to how the {{SlotPool}} works at the moment.
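
To make the second point concrete, here is a minimal sketch of fulfilling a pending slot allocation with any freed slot that meets its resource requirements. It is not Flink code: {{PendingRequestMatcher}}, {{Profile}}, and all method names are hypothetical, and the profile is reduced to CPU cores and memory purely for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;

public class PendingRequestMatcher {

    /** Hypothetical stand-in for a resource profile (CPU cores and memory only). */
    public static final class Profile {
        final double cpuCores;
        final int memoryMb;

        public Profile(double cpuCores, int memoryMb) {
            this.cpuCores = cpuCores;
            this.memoryMb = memoryMb;
        }

        /** True if this (offered) profile provides at least what is required. */
        boolean satisfies(Profile required) {
            return cpuCores >= required.cpuCores && memoryMb >= required.memoryMb;
        }
    }

    // Pending allocations in arrival order; each entry is the required profile.
    private final List<Profile> pendingRequests = new ArrayList<>();

    public void addPendingRequest(Profile required) {
        pendingRequests.add(required);
    }

    /**
     * Offers a freed slot. The oldest pending request whose requirements the
     * slot meets is removed and returned; empty if no pending request fits.
     */
    public Optional<Profile> offerFreedSlot(Profile slot) {
        Iterator<Profile> it = pendingRequests.iterator();
        while (it.hasNext()) {
            Profile required = it.next();
            if (slot.satisfies(required)) {
                it.remove();
                return Optional.of(required);
            }
        }
        return Optional.empty();
    }

    public int pendingCount() {
        return pendingRequests.size();
    }
}
```

The point of the sketch is only that the freed slot does not have to be the one originally requested for that allocation; any slot with a sufficient profile can serve it.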

Since this change will be quite delicate, let's start with a design document 
where we capture all the requirements such that we are sure that we are not 
going off in the wrong direction.

> Make SlotManager aware of multi slot TaskManagers
> -------------------------------------------------
>
>                 Key: FLINK-9455
>                 URL: https://issues.apache.org/jira/browse/FLINK-9455
>             Project: Flink
>          Issue Type: Improvement
>          Components: ResourceManager
>    Affects Versions: 1.5.0
>            Reporter: Till Rohrmann
>            Assignee: Sihua Zhou
>            Priority: Major
>             Fix For: 1.6.0, 1.5.1
>
>
> The {{SlotManager}} responsible for managing all available slots of a Flink 
> cluster can request to start new {{TaskManagers}} if it cannot fulfill a slot 
> request. The started {{TaskManager}} can be started with multiple slots 
> configured but currently, the {{SlotManager}} thinks that it will be started 
> with a single slot. As a consequence, it might issue multiple requests to 
> start new TaskManagers even though a single one would be sufficient to 
> fulfill all pending slot requests.
> In order to avoid requesting unnecessary resources that are freed after the 
> idle timeout, I suggest making the {{SlotManager}} aware of how many slots a 
> {{TaskManager}} is started with. That way the SlotManager only needs to 
> request a new {{TaskManager}} if all of the previously started slots 
> (potentially not yet registered and, thus, future slots) are being assigned 
> to slot requests.
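
The arithmetic behind the issue description can be sketched as follows. This is an illustration under stated assumptions, not the {{SlotManager}} implementation: the class and method names are hypothetical, all slots are assumed to share one resource profile, and {{slotsPerTaskManager}} stands for the number of slots each new TaskManager is configured with.

```java
public class TaskManagerDemand {

    /**
     * Number of additional TaskManagers to request.
     *
     * @param pendingRequests        slot requests that are not yet fulfilled
     * @param unassignedPendingSlots slots of already requested TaskManagers
     *                               (possibly not yet registered) that are
     *                               not assigned to any request
     * @param slotsPerTaskManager    slots each new TaskManager starts with
     */
    public static int additionalTaskManagers(
            int pendingRequests, int unassignedPendingSlots, int slotsPerTaskManager) {
        if (slotsPerTaskManager <= 0) {
            throw new IllegalArgumentException("slotsPerTaskManager must be positive");
        }
        // Requests that the already requested slots cannot cover.
        int uncovered = Math.max(0, pendingRequests - unassignedPendingSlots);
        // Ceiling division: one TaskManager covers up to slotsPerTaskManager requests.
        return (uncovered + slotsPerTaskManager - 1) / slotsPerTaskManager;
    }
}
```

With {{slotsPerTaskManager}} fixed at 1, the function reproduces the behavior described above: four pending requests lead to four TaskManager requests, whereas with four slots per TaskManager a single one suffices.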



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
