[ 
https://issues.apache.org/jira/browse/FLINK-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16710041#comment-16710041
 ] 

TisonKun edited comment on FLINK-10640 at 12/5/18 1:36 PM:
-----------------------------------------------------------

@[~wuzang]

After an offline discussion with [~till.rohrmann] about part of the "TM 
Management" issue, i.e., starting an arbitrary number of TMs when a YARN 
session is launched, I propose introducing a pair (min, max) that represents 
the minimum and maximum number of running {{TaskExecutor}}s.

With such an option, setting {{minimum = maximum = n}} effectively gives the 
same behaviour as the pre-FLIP-6 code, that is, a fixed number of 
pre-allocated TMs; setting {{minimum = 0, maximum = inf}} effectively gives 
the same behaviour as the current code path. I think such a feature improves 
"TM Management", especially when a user wants to run jobs on a specific 
cluster, and it requires fewer changes than achieving an arbitrarily flexible 
"TM Management".
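
To make the two corner cases concrete, here is a minimal sketch of how such a 
pair could bound the number of running {{TaskExecutor}}s. The class 
{{TaskExecutorBounds}}, its {{target}} method and the {{UNBOUNDED}} constant 
are hypothetical names used only for illustration; they are not existing 
Flink APIs, and the real option names are still open.

```java
/**
 * Hypothetical sketch of the proposed (min, max) pair.
 * None of these names exist in Flink; they only illustrate the idea.
 */
final class TaskExecutorBounds {
    /** Stand-in for "inf": no upper limit on the number of TaskExecutors. */
    static final int UNBOUNDED = Integer.MAX_VALUE;

    final int min;
    final int max;

    TaskExecutorBounds(int min, int max) {
        if (min < 0 || max < min) {
            throw new IllegalArgumentException("require 0 <= min <= max");
        }
        this.min = min;
        this.max = max;
    }

    /** Number of TaskExecutors to keep running when the jobs demand {@code demanded} of them. */
    int target(int demanded) {
        return Math.max(min, Math.min(max, demanded));
    }

    public static void main(String[] args) {
        // minimum = maximum = n: a fixed number of pre-allocated TMs
        TaskExecutorBounds fixed = new TaskExecutorBounds(4, 4);
        // minimum = 0, maximum = inf: TMs follow the demand
        TaskExecutorBounds elastic = new TaskExecutorBounds(0, UNBOUNDED);

        System.out.println(fixed.target(0));    // 4
        System.out.println(fixed.target(10));   // 4
        System.out.println(elastic.target(0));  // 0
        System.out.println(elastic.target(10)); // 10
    }
}
```

Any setting in between, e.g. (2, 8), would keep at least two TMs warm and 
never start more than eight.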

What do you think? (FYI, I created a separate JIRA, FLINK-11078, to discuss 
this topic.)


> Enable Slot Resource Profile for Resource Management
> ----------------------------------------------------
>
>                 Key: FLINK-10640
>                 URL: https://issues.apache.org/jira/browse/FLINK-10640
>             Project: Flink
>          Issue Type: New Feature
>          Components: ResourceManager
>            Reporter: Tony Xintong Song
>            Priority: Major
>
> Motivation & Background
>  * The existing concept of task slots roughly represents how many pipelines 
> of tasks a TaskManager can hold. However, it does not consider the 
> differences in resource needs and usage of individual tasks. Enabling 
> resource profiles for slots may allow Flink to better allocate execution 
> resources according to tasks' fine-grained resource needs.
>  * The community version of Flink already contains APIs and some 
> implementation for slot resource profiles. However, such logic is not truly 
> used. (The ResourceProfile of slot requests is by default set to UNKNOWN 
> with negative values, and thus matches any given slot.)
> Preliminary Design
>  * Slot Management
>  A slot represents a certain amount of resources for a single pipeline of 
> tasks to run in on a TaskManager. Initially, a TaskManager does not have any 
> slots but a total amount of resources. When allocating, the ResourceManager 
> finds proper TMs to generate new slots for the tasks to run according to the 
> slot requests. Once generated, the slot's size (resource profile) does not 
> change until it's freed. ResourceManager can apply different, portable 
> strategies to allocate slots from TaskManagers.
>  * TM Management
>  The size and number of TaskManagers and when to start them can also be 
> flexible. TMs can be started and released dynamically, and may have different 
> sizes. We may have many different, portable strategies, e.g., an elastic 
> session that can run multiple jobs like the session mode while dynamically 
> adjusting the size of the session (number of TMs) according to the real-time 
> workload.
>  * About Slot Sharing
>  Slot sharing is a good heuristic to easily calculate how many slots are 
> needed to get the job running and to get better utilization when there is no 
> resource profile in slots. However, with resource profiles enabling 
> finer-grained resource management, each individual task has its specific 
> resource need and it does not make much sense to have multiple tasks sharing 
> the resources of the same slot. Instead, we may introduce locality 
> preferences/constraints to support the semantics of putting tasks in the 
> same/different TMs in a more general way.
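
The "Slot Management" idea quoted above (a TaskManager starts with a total 
amount of resources and slots are generated on request) can be sketched 
roughly as follows. This is only an illustration under simplifying 
assumptions: {{SlotCarvingSketch}}, {{Profile}}, {{TaskManagerSketch}} and 
{{firstFit}} are hypothetical names, resources are reduced to whole CPU cores 
and memory in MB, and first-fit merely stands in for one of the possible 
allocation strategies. It is not Flink's {{ResourceProfile}} or 
{{ResourceManager}} code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; these classes are not part of Flink. */
final class SlotCarvingSketch {

    /** Simplified resource vector: whole CPU cores and memory in MB. */
    static final class Profile {
        final int cpuCores;
        final int memoryMb;

        Profile(int cpuCores, int memoryMb) {
            this.cpuCores = cpuCores;
            this.memoryMb = memoryMb;
        }

        /** True if this profile fits into the given amount of resources. */
        boolean fitsIn(Profile other) {
            return cpuCores <= other.cpuCores && memoryMb <= other.memoryMb;
        }
    }

    /** A TaskManager that starts without slots, holding only a total amount of resources. */
    static final class TaskManagerSketch {
        Profile available;
        final List<Profile> slots = new ArrayList<>();

        TaskManagerSketch(Profile total) {
            this.available = total;
        }

        /** Generates a slot of exactly the requested size, or returns false if it does not fit. */
        boolean allocate(Profile request) {
            if (!request.fitsIn(available)) {
                return false;
            }
            available = new Profile(
                    available.cpuCores - request.cpuCores,
                    available.memoryMb - request.memoryMb);
            slots.add(request);
            return true;
        }

        /** Frees the slot at the given index and returns its resources to the pool. */
        void release(int index) {
            Profile freed = slots.remove(index);
            available = new Profile(
                    available.cpuCores + freed.cpuCores,
                    available.memoryMb + freed.memoryMb);
        }
    }

    /** First-fit strategy: index of the first TM that could host the request, or -1. */
    static int firstFit(List<TaskManagerSketch> taskManagers, Profile request) {
        for (int i = 0; i < taskManagers.size(); i++) {
            if (taskManagers.get(i).allocate(request)) {
                return i;
            }
        }
        return -1;
    }
}
```

A different strategy (best-fit, spreading, locality-aware) would only replace 
{{firstFit}}; the TaskManager side stays the same, and a slot's size does not 
change between {{allocate}} and {{release}}.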



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)