[
https://issues.apache.org/jira/browse/FLINK-23190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
loyi updated FLINK-23190:
-------------------------
Description:
FLINK-12122 only guarantees spreading tasks out across the set of TaskManagers
that are registered at the time of scheduling. Our jobs all run in active YARN
mode, where TaskManagers register one after another, so a job with a smaller
source parallelism often ends up with load-balancing issues.
For this job:
{code:java}
// With -ys 4 (4 slots per TaskManager) the job needs
// max-parallelism = 40 slots, i.e. 10 TaskManagers.
env.addSource(...).name("A").setParallelism(10)
   .map(...).name("B").setParallelism(30)
   .map(...).name("C").setParallelism(40)
   .addSink(...).name("D").setParallelism(20);
{code}
Allocation on release-1.12.3 (cells that deviate from the even 1/3/4/2 split in red):
||operator||tm1||tm2||tm3||tm4||tm5||tm6||tm7||tm8||tm9||tm10||
|A|1|{color:#de350b}2{color}|{color:#de350b}2{color}|1|1|{color:#de350b}3{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|
|B|3|3|3|3|3|3|3|3|{color:#de350b}2{color}|{color:#de350b}4{color}|
|C|4|4|4|4|4|4|4|4|4|4|
|D|2|2|2|2|2|{color:#de350b}1{color}|{color:#de350b}1{color}|2|2|{color:#de350b}4{color}|
Suggestions:
When a TaskManager registers its slots with the SlotManager, we could group the
pending requests by their "ExecutionVertexGroup" and then allocate the slots
proportionally to each group.
I have implemented a proof-of-concept version based on release-1.12.3, and with
it the job gets a fully even task allocation. I would like to know whether
there are any points I have not considered.
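To make the suggestion concrete, here is a minimal, self-contained Java sketch of the proportional step (this is not my actual patch and not Flink's SlotManager API; the class name, the group keys, and the largest-remainder rounding are illustrative assumptions). With the default slot sharing group, the example job's 40 slot requests fall into four profiles: roughly, slots 1-10 each host a subtask of {A,B,C,D}, 11-20 host {B,C,D}, 21-30 host {B,C}, and 31-40 host {C}. If every registering TaskManager takes a proportional share of each profile, a TM with -ys 4 gets one slot per profile, i.e. exactly A=1, B=3, C=4, D=2 tasks:
{code:java}
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch only -- the names here are hypothetical, not Flink
// internals. One registering TaskManager's slots are spread across the
// pending-request groups in proportion to each group's share of all
// pending requests, using largest-remainder rounding.
public class ProportionalSlotAllocation {

    static Map<String, Integer> allocate(Map<String, Integer> pendingByGroup, int offeredSlots) {
        int totalPending = pendingByGroup.values().stream().mapToInt(Integer::intValue).sum();
        Map<String, Integer> allocation = new LinkedHashMap<>();
        Map<String, Double> remainders = new LinkedHashMap<>();
        int assigned = 0;
        for (Map.Entry<String, Integer> e : pendingByGroup.entrySet()) {
            double exact = (double) offeredSlots * e.getValue() / totalPending;
            int floor = (int) exact;                 // integral proportional share first
            allocation.put(e.getKey(), floor);
            remainders.put(e.getKey(), exact - floor);
            assigned += floor;
        }
        // Leftover slots go to the groups with the largest fractional remainders.
        List<String> order = remainders.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
        for (int i = 0; i < offeredSlots - assigned; i++) {
            allocation.merge(order.get(i % order.size()), 1, Integer::sum);
        }
        return allocation;
    }

    public static void main(String[] args) {
        // The example job's 40 pending requests, grouped by request profile.
        Map<String, Integer> pending = new LinkedHashMap<>();
        pending.put("{A,B,C,D}", 10);
        pending.put("{B,C,D}", 10);
        pending.put("{B,C}", 10);
        pending.put("{C}", 10);
        // A TM registering with -ys 4 takes one slot of each profile,
        // which is A=1, B=3, C=4, D=2 tasks on that TM -- fully even.
        System.out.println(allocate(pending, 4));    // {{A,B,C,D}=1, {B,C,D}=1, {B,C}=1, {C}=1}
    }
}
{code}
The rounding rule is a detail; the property that matters is that no single TaskManager drains one group before the others, so the last TMs to register do not end up with only the leftovers of the low-parallelism vertices.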
> Make task-slot allocation much more even
> ------------------------------------------
>
> Key: FLINK-23190
> URL: https://issues.apache.org/jira/browse/FLINK-23190
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Task
> Affects Versions: 1.12.3
> Reporter: loyi
> Priority: Minor
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)