[ 
https://issues.apache.org/jira/browse/FLINK-23190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

loyi updated FLINK-23190:
-------------------------
    Description: 
Description:

FLINK-12122 only guarantees spreading tasks out across the set of TMs that are 
registered at the time of scheduling. However, our jobs all run in active 
YARN mode, and jobs with a smaller source parallelism often run into 
load-balancing issues. 

 

For this job:
{code:java}
//  -ys 4 (i.e. 4 slots per TaskManager) means the job gets 10 TaskManagers

env.addSource(...).name("A").setParallelism(10)
   .map(...).name("B").setParallelism(30)
   .map(...).name("C").setParallelism(40)
   .addSink(...).name("D").setParallelism(20);
{code}
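(Assuming the default slot sharing group, the job requests as many slots as its maximum operator parallelism, i.e. 40; with -ys 4, that is 40 / 4 = 10 TaskManagers.)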
 

 Allocation with release-1.12.3: 
||operator||tm1||tm2||tm3||tm4||tm5||tm6||tm7||tm8||tm9||tm10||
|A|1|{color:#de350b}2{color}|{color:#de350b}2{color}|1|1|{color:#de350b}3{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|
|B|3|3|3|3|3|3|3|3|{color:#de350b}2{color}|{color:#de350b}4{color}|
|C|4|4|4|4|4|4|4|4|4|4|
|D|2|2|2|2|2|{color:#de350b}1{color}|{color:#de350b}1{color}|2|2|{color:#de350b}4{color}|

 

Suggestions:

When a TM registers its slots with the SlotManager, we could group the 
pendingRequests by their "ExecutionVertexGroup" and then allocate the slots 
proportionally to each group, as sketched below.
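
A minimal sketch of the proportional split, assuming the per-group counts of outstanding requests are already known. The class and method names below are illustrative only, not actual SlotManager internals:
{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

final class ProportionalSlotAssigner {

    /**
     * Splits the newly registered slots of one TaskManager across the
     * execution-vertex groups in proportion to their outstanding requests.
     */
    static Map<String, Integer> assign(Map<String, Integer> pendingPerGroup, int newSlots) {
        int totalPending = pendingPerGroup.values().stream().mapToInt(Integer::intValue).sum();
        int toDistribute = Math.min(newSlots, totalPending);
        Map<String, Integer> shares = new LinkedHashMap<>();
        pendingPerGroup.keySet().forEach(g -> shares.put(g, 0));
        if (toDistribute == 0) {
            return shares;
        }
        int assigned = 0;
        // First pass: floor of each group's proportional share.
        for (Map.Entry<String, Integer> e : pendingPerGroup.entrySet()) {
            int share = e.getValue() * toDistribute / totalPending;
            shares.put(e.getKey(), share);
            assigned += share;
        }
        // Second pass: hand the remainder out one slot at a time to groups
        // whose pending requests are not yet fully covered.
        for (Map.Entry<String, Integer> e : pendingPerGroup.entrySet()) {
            if (assigned == toDistribute) {
                break;
            }
            if (shares.get(e.getKey()) < e.getValue()) {
                shares.put(e.getKey(), shares.get(e.getKey()) + 1);
                assigned++;
            }
        }
        return shares;
    }
}
{code}
The intent is that a low-parallelism group such as source A no longer gets packed onto whichever TMs happen to register first, because every registering TM hands each group a share close to its fraction of the remaining requests.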

 

I have implemented a proof-of-concept version based on release-1.12.3, and with 
it the job gets a fully even task allocation. I would like to know whether there 
are other points that have not been considered.

> Make task-slot allocation much more evenly
> ------------------------------------------
>
>                 Key: FLINK-23190
>                 URL: https://issues.apache.org/jira/browse/FLINK-23190
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Task
>    Affects Versions: 1.12.3
>            Reporter: loyi
>            Priority: Minor
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
