[jira] [Updated] (YARN-5846) Improve the fairscheduler attemptScheduler

2017-01-05 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-5846:
-
Fix Version/s: (was: 2.7.1)

> Improve the fairscheduler attemptScheduler 
> ---
>
> Key: YARN-5846
> URL: https://issues.apache.org/jira/browse/YARN-5846
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.7.1
> Environment: CentOS-7.1
>Reporter: zhengchenyu
>Priority: Critical
>  Labels: fairscheduler
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> when I assign a container, we must consider two factor:
> (1) sort the queue and application, and select the proper request. 
> (2) then we assure this request's host is just this node (data locality). 
> or skip this loop!
> this algorithm regard the sorting queue and application as primary factor. 
> when yarn consider data locality, for example, 
> yarn.scheduler.fair.locality.threshold.node=1, 
> yarn.scheduler.fair.locality.threshold.rack=1 (or 
> yarn.scheduler.fair.locality-delay-rack-ms and 
> yarn.scheduler.fair.locality-delay-node-ms is very large) and lots of 
> applications are runnig, the process of assigning contianer becomes very slow.
> I think data locality is more important then the sequence of the queue and 
> applications. 
> I wanna a new algorithm like this:
>   (1) when resourcemanager accept a new request, notice the RMNodeImpl, 
> and then record this association between RMNode and request
>   (2) when assign containers for node, we assign container by 
> RMNodeImpl's association between RMNode and request directly
>   (3) then I consider the priority of queue and applation. In one object 
> of RMNodeImpl, we sort the request of association.
>   (4) and I think the sorting of current algorithm is consuming, in 
> especial, losts of applications are running, lots of sorting are called. so I 
> think we should sort the queue and applicaiton in a daemon thread, because 
> less error of queues's sequences is allowed.
>   
>   
>   
>   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5846) Improve the fairscheduler attemptScheduler

2016-11-09 Thread zhengchenyu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhengchenyu updated YARN-5846:
--
Priority: Critical  (was: Minor)

> Improve the fairscheduler attemptScheduler 
> ---
>
> Key: YARN-5846
> URL: https://issues.apache.org/jira/browse/YARN-5846
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.7.1
> Environment: CentOS-7.1
>Reporter: zhengchenyu
>Priority: Critical
>  Labels: fairscheduler
> Fix For: 2.7.1
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> when I assign a container, we must consider two factor:
> (1) sort the queue and application, and select the proper request. 
> (2) then we assure this request's host is just this node (data locality). 
> or skip this loop!
> this algorithm regard the sorting queue and application as primary factor. 
> when yarn consider data locality, for example, 
> yarn.scheduler.fair.locality.threshold.node=1, 
> yarn.scheduler.fair.locality.threshold.rack=1 (or 
> yarn.scheduler.fair.locality-delay-rack-ms and 
> yarn.scheduler.fair.locality-delay-node-ms is very large) and lots of 
> applications are runnig, the process of assigning contianer becomes very slow.
> I think data locality is more important then the sequence of the queue and 
> applications. 
> I wanna a new algorithm like this:
>   (1) when resourcemanager accept a new request, notice the RMNodeImpl, 
> and then record this association between RMNode and request
>   (2) when assign containers for node, we assign container by 
> RMNodeImpl's association between RMNode and request directly
>   (3) then I consider the priority of queue and applation. In one object 
> of RMNodeImpl, we sort the request of association.
>   (4) and I think the sorting of current algorithm is consuming, in 
> especial, losts of applications are running, lots of sorting are called. so I 
> think we should sort the queue and applicaiton in a daemon thread, because 
> less error of queues's sequences is allowed.
>   
>   
>   
>   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org