[jira] [Comment Edited] (YARN-5139) [Umbrella] Move YARN scheduler towards global scheduler

2019-09-21 Thread zhoukang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16935113#comment-16935113
 ] 

zhoukang edited comment on YARN-5139 at 9/21/19 6:14 PM:
-

4. Add PlacementSet and score nodes implementation.
Has this not been implemented yet? [~leftnoteasy]



> [Umbrella] Move YARN scheduler towards global scheduler
> ---
>
> Key: YARN-5139
> URL: https://issues.apache.org/jira/browse/YARN-5139
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Priority: Major
> Attachments: Explanantions of Global Scheduling (YARN-5139) 
> Implementation.pdf, YARN-5139-Concurrent-scheduling-performance-report.pdf, 
> YARN-5139-Global-Schedulingd-esign-and-implementation-notes-v2.pdf, 
> YARN-5139-Global-Schedulingd-esign-and-implementation-notes.pdf, 
> YARN-5139.000.patch, wip-1.YARN-5139.patch, wip-2.YARN-5139.patch, 
> wip-3.YARN-5139.patch, wip-4.YARN-5139.patch, wip-5.YARN-5139.patch
>
>
> The existing YARN scheduler is based on node heartbeats. This can lead to 
> sub-optimal decisions because the scheduler can only look at one node at a 
> time when scheduling resources.
> Pseudo code of the existing scheduling logic looks like:
> {code}
> for node in allNodes:
>   Go to parentQueue
>     Go to leafQueue
>       for application in leafQueue.applications:
>         for resource-request in application.resource-requests:
>           try to schedule on node
> {code}
> Considering future complex resource placement requirements, such as node 
> constraints (give me "a && b || c") or anti-affinity (do not allocate HBase 
> RegionServers and Storm workers on the same host), we may need to consider 
> moving the YARN scheduler towards global scheduling.
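
To make the contrast in the description concrete, here is a minimal sketch in Python. It is not YARN code: the `Node` class, both function names, the `avoid` parameter, and the "most free capacity" score are all assumptions made for illustration. The first function decides on whichever node heartbeats first; the second filters and ranks every node before choosing, which is what makes a constraint such as anti-affinity straightforward to express.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free: int                                  # free capacity, abstract units
    apps: set = field(default_factory=set)     # applications already on the node


def heartbeat_schedule(app, size, heartbeat_order):
    """Per-node scheduling: accept the first heartbeating node that fits."""
    for node in heartbeat_order:
        if node.free >= size:
            node.free -= size
            node.apps.add(app)
            return node
    return None


def global_schedule(app, size, nodes, avoid=frozenset()):
    """Global scheduling: filter and rank all candidate nodes at once."""
    candidates = [n for n in nodes
                  if n.free >= size and not (n.apps & avoid)]
    if not candidates:
        return None
    # Hypothetical score: prefer the node with the most free capacity.
    best = max(candidates, key=lambda n: n.free)
    best.free -= size
    best.apps.add(app)
    return best


# Node "a" heartbeats first and already hosts HBase; node "b" is emptier.
per_node = heartbeat_schedule("storm", 2, [Node("a", 4, {"hbase"}), Node("b", 8)])
whole_cluster = global_schedule("storm", 2, [Node("a", 4, {"hbase"}), Node("b", 8)],
                                avoid={"hbase"})
print(per_node.name, whole_cluster.name)
```

In this toy setup the heartbeat-driven path settles for node "a" simply because it reported first, while the global pass can both skip the HBase host and pick the node with more headroom.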



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5139) [Umbrella] Move YARN scheduler towards global scheduler

2018-06-05 Thread zhuqi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501548#comment-16501548
 ] 

zhuqi edited comment on YARN-5139 at 6/5/18 3:15 PM:
-

Hi [~wangda], I would like to know which version of Hadoop this patch is 
based on. I also want to try to help move the Fair Scheduler to global 
scheduling, because our company uses the Fair Scheduler and we are looking 
forward to global scheduling for faster scheduling, especially for 
high-priority applications.

Thanks.

 


was (Author: zhuqi):
Hi [~wangda], I would like to know which version of Hadoop this patch is 
based on. I also want to try to help move the fairscheduler to global 
scheduling, because our company uses the fairscheduler and we are looking 
forward to global scheduling for faster scheduling, especially for 
high-priority applications.

Thanks.

 




[jira] [Comment Edited] (YARN-5139) [Umbrella] Move YARN scheduler towards global scheduler

2018-06-05 Thread zhuqi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501548#comment-16501548
 ] 

zhuqi edited comment on YARN-5139 at 6/5/18 11:54 AM:
--

Hi [~wangda], I would like to know which version of Hadoop this patch is 
based on. I also want to try to help move the fairscheduler to global 
scheduling, because our company uses the fairscheduler and we are looking 
forward to global scheduling for faster scheduling, especially for 
high-priority applications.

Thanks.

 


was (Author: zhuqi):
Hi [~wangda], which version of Hadoop is this patch based on? I also want to 
try to help move the fairscheduler to global scheduling, because our company 
uses the fairscheduler and we are looking forward to global scheduling to 
solve our problems.

Thanks.

 




[jira] [Comment Edited] (YARN-5139) [Umbrella] Move YARN scheduler towards global scheduler

2016-10-30 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15620108#comment-15620108
 ] 

Arun Suresh edited comment on YARN-5139 at 10/30/16 3:32 PM:
-

[~leftnoteasy], I was just going through the design.

I was wondering how you tackle uniform distribution of allocations.
This is one nice thing the existing node-heartbeat-based implementation gives 
you for free.

For example, assume you have just a single default queue and a cluster of, 
say, 1 nodes, with around 100 apps running. Since the ClusterNodeTracker will 
always give the same ordering of the 1 nodes, it is possible that this new 
scheduling logic would 'front-load' all allocations onto the nodes that appear 
at the front of the PlacementSet (since the placement set provided to each 
application would be fundamentally the same). In the node-heartbeat-driven 
case, the node that has just heartbeated is preferred for allocation, and 
since heartbeats from all nodes are distributed uniformly, you will generally 
never see this issue. This is probably not much of an issue in a fully pegged 
cluster, but in clusters running at around 50% utilization, you will probably 
see half the nodes fully pegged and the other half mostly sitting idle.
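
The front-loading concern can be illustrated with a toy simulation in Python. Everything in it is an assumption for illustration: the cluster size, the slot counts, and both function names are made up, and the rotating start index is only a stand-in for the order in which heartbeats arrive, not how YARN orders nodes.

```python
def first_fit(nodes_free, containers):
    """Always scan the nodes in the same fixed order (front-loading)."""
    placed = [0] * len(nodes_free)
    for _ in range(containers):
        for i, free in enumerate(nodes_free):
            if placed[i] < free:
                placed[i] += 1
                break
    return placed


def rotating_start(nodes_free, containers):
    """Start each scan one node later, a stand-in for heartbeat arrival order."""
    n = len(nodes_free)
    placed = [0] * n
    for c in range(containers):
        for step in range(n):
            i = (c + step) % n
            if placed[i] < nodes_free[i]:
                placed[i] += 1
                break
    return placed


capacity = [4] * 10                   # 10 nodes with 4 slots each (made-up sizes)
print(first_fit(capacity, 20))        # [4, 4, 4, 4, 4, 0, 0, 0, 0, 0]
print(rotating_start(capacity, 20))   # [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
```

At 50% utilization the fixed ordering fills half the nodes and leaves the rest idle, which is the pattern described above, while the rotating order spreads the same load evenly.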

Another thing that came to mind is that you are effectively 'late-binding' 
the request to a group of nodes. In large clusters (more than 10K nodes), it 
is very common for around 5% of the nodes to keep going up and down. In that 
case, you might have to redo your allocation if the node you had selected has 
gone down. In a node-heartbeat-driven scheme, the chances of that happening 
are lower, since you are allocating on a node that has just heartbeated, so 
you can be fairly certain that the node is healthy.
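
One way to picture the late-binding problem is a commit step that re-checks liveness and falls back to the next candidate. The Python sketch below is only an illustration under assumptions: `commit_allocation` and the `is_alive` callback are hypothetical names, not part of YARN or of the attached patches.

```python
def commit_allocation(ranked_nodes, is_alive):
    """Return the first ranked candidate that is still alive at commit time.

    ranked_nodes: node names ordered by preference when the proposal was made.
    is_alive:     callable reporting current liveness (hypothetical hook).
    Returns (node, attempts); node is None if every candidate has gone down.
    """
    for attempt, node in enumerate(ranked_nodes, start=1):
        if is_alive(node):
            return node, attempt
    return None, len(ranked_nodes)


down = {"n1"}                          # n1 went down after it was selected
node, attempts = commit_allocation(["n1", "n2", "n3"], lambda n: n not in down)
print(node, attempts)                  # n2 2
```

The extra attempt is the redo cost mentioned above; a heartbeat-driven allocation rarely pays it because the chosen node has just reported in.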

Let me know what you think.




















[jira] [Comment Edited] (YARN-5139) [Umbrella] Move YARN scheduler towards global scheduler

2016-08-04 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408573#comment-15408573
 ] 

Wangda Tan edited comment on YARN-5139 at 8/4/16 10:04 PM:
---

Attached design and implementation notes. Thanks [~vinodkv], [~hitesh] and 
[~bikassaha] for valuable offline suggestions.


was (Author: leftnoteasy):
Attached design and implementation notes. Thanks [~vinodkv], [~hitesh] and 
[~bikassaha] for great offline suggestions.

> [Umbrella] Move YARN scheduler towards global scheduler
> ---
>
> Key: YARN-5139
> URL: https://issues.apache.org/jira/browse/YARN-5139
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Attachments: 
> YARN-5139-Global-Schedulingd-esign-and-implementation-notes.pdf, 
> wip-1.YARN-5139.patch, wip-2.YARN-5139.patch, wip-3.YARN-5139.patch



[jira] [Comment Edited] (YARN-5139) [Umbrella] Move YARN scheduler towards global scheduler

2016-05-24 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15299024#comment-15299024
 ] 

Karthik Kambatla edited comment on YARN-5139 at 5/24/16 10:07 PM:
--

[~leftnoteasy] - thanks for filing this.

In our experience with continuous scheduling, this approach will help with 
faster scheduling, especially for high-priority applications. That said, once 
you gain more experience with your prototype, can we decide on the right 
abstractions so that the schedulers share more code in the new world?


was (Author: kasha):
[~leftnoteasy] - thanks for filing this.

In our experience with continuous scheduling, this approach will help for 
faster scheduling especially for high-priority applications. That said, once 
you gain more experience with your prototype, can we actually decide on the 
right abstracts so in the new world the schedulers share more code. 

> [Umbrella] Move YARN scheduler towards global scheduler
> ---
>
> Key: YARN-5139
> URL: https://issues.apache.org/jira/browse/YARN-5139
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Attachments: wip-1.YARN-5139.patch