[jira] [Commented] (YARN-3453) Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2017-11-01 Thread George Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16234816#comment-16234816
 ] 

George Wu commented on YARN-3453:
-

Just curious, was there any reason why this wasn't backported to the 2.7 branch?

> Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even 
> in DRF mode causing thrashing
> ---
>
> Key: YARN-3453
> URL: https://issues.apache.org/jira/browse/YARN-3453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Ashwin Shankar
>Assignee: Arun Suresh
>Priority: Major
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: YARN-3453.1.patch, YARN-3453.2.patch, YARN-3453.3.patch, 
> YARN-3453.4.patch, YARN-3453.5.patch
>
>
> There are two places in the preemption code flow where DefaultResourceCalculator 
> is used even in DRF mode. This results in more resources being preempted than 
> needed, and those extra preempted containers don't even reach the "starved" queue, 
> since the scheduling logic is based on DRF's calculator.
> The two places are:
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn't be marked as "starved" if the dominant resource usage
> is >= fair/minshare (see the illustrative sketch below).
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --
> One more thing that I believe needs to change in DRF mode: during a 
> preemption round, if preempting a few containers satisfies the need for 
> a resource type, then we should exit that preemption round, since the 
> containers we just preempted should bring the dominant resource usage up to 
> the min/fair share.
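
For illustration only (this is not the committed patch), a minimal sketch of the DRF-aware starvation check that item 1 above calls for, assuming the comparison is delegated to DominantResourceCalculator; the helper name and signature here are hypothetical:

{code:title=Illustrative sketch of a DRF-aware starvation check|borderStyle=solid}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.DominantResourceCalculator;
import org.apache.hadoop.yarn.util.resource.ResourceCalculator;
import org.apache.hadoop.yarn.util.resource.Resources;

// Hypothetical helper: the queue counts as starved with respect to a share
// only if its usage is below that share in the *dominant* dimension,
// rather than in memory alone (which is what DefaultResourceCalculator checks).
static boolean isStarvedForShare(Resource usage, Resource share,
    Resource clusterResource) {
  ResourceCalculator drc = new DominantResourceCalculator();
  return !Resources.greaterThanOrEqual(drc, clusterResource, usage, share);
}
{code}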



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3453) Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626489#comment-14626489
 ] 

Hudson commented on YARN-3453:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #254 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/254/])
YARN-3453. Ensure preemption logic in FairScheduler uses 
DominantResourceCalculator in DRF queues to prevent unnecessary thrashing. 
(asuresh) (Arun Suresh: rev ac94ba3e185115b83351e35c610c2b8ff91b1ebc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java
* hadoop-yarn-project/CHANGES.txt




[jira] [Commented] (YARN-3453) Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626449#comment-14626449
 ] 

Hudson commented on YARN-3453:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #2202 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2202/])
YARN-3453. Ensure preemption logic in FairScheduler uses 
DominantResourceCalculator in DRF queues to prevent unnecessary thrashing. 
(asuresh) (Arun Suresh: rev ac94ba3e185115b83351e35c610c2b8ff91b1ebc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java




[jira] [Commented] (YARN-3453) Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626400#comment-14626400
 ] 

Hudson commented on YARN-3453:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #244 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/244/])
YARN-3453. Ensure preemption logic in FairScheduler uses 
DominantResourceCalculator in DRF queues to prevent unnecessary thrashing. 
(asuresh) (Arun Suresh: rev ac94ba3e185115b83351e35c610c2b8ff91b1ebc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java




[jira] [Commented] (YARN-3453) Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626384#comment-14626384
 ] 

Hudson commented on YARN-3453:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2183 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2183/])
YARN-3453. Ensure preemption logic in FairScheduler uses 
DominantResourceCalculator in DRF queues to prevent unnecessary thrashing. 
(asuresh) (Arun Suresh: rev ac94ba3e185115b83351e35c610c2b8ff91b1ebc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java




[jira] [Commented] (YARN-3453) Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626238#comment-14626238
 ] 

Hudson commented on YARN-3453:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #986 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/986/])
YARN-3453. Ensure preemption logic in FairScheduler uses 
DominantResourceCalculator in DRF queues to prevent unnecessary thrashing. 
(asuresh) (Arun Suresh: rev ac94ba3e185115b83351e35c610c2b8ff91b1ebc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java




[jira] [Commented] (YARN-3453) Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626227#comment-14626227
 ] 

Hudson commented on YARN-3453:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #256 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/256/])
YARN-3453. Ensure preemption logic in FairScheduler uses 
DominantResourceCalculator in DRF queues to prevent unnecessary thrashing. 
(asuresh) (Arun Suresh: rev ac94ba3e185115b83351e35c610c2b8ff91b1ebc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java




[jira] [Commented] (YARN-3453) Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625985#comment-14625985
 ] 

Hudson commented on YARN-3453:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8160 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/8160/])
YARN-3453. Ensure preemption logic in FairScheduler uses 
DominantResourceCalculator in DRF queues to prevent unnecessary thrashing. 
(asuresh) (Arun Suresh: rev ac94ba3e185115b83351e35c610c2b8ff91b1ebc)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/SchedulingPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSLeafQueue.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/DominantResourceFairnessPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java




[jira] [Commented] (YARN-3453) Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-13 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625019#comment-14625019
 ] 

Arun Suresh commented on YARN-3453:
---

Thanks for the reviews, [~kasha], [~ashwinshankar77] and [~peng.zhang].
Will be committing this shortly.



[jira] [Commented] (YARN-3453) Fair Scheduler: Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-13 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625000#comment-14625000
 ] 

Karthik Kambatla commented on YARN-3453:


+1



[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-10 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622450#comment-14622450
 ] 

Arun Suresh commented on YARN-3453:
---

The test case failures are unrelated; they run fine on my laptop.
[~kasha], can you please give it a quick review? The only major change, apart from 
addressing your previous comments, is the addition of test cases.



[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621970#comment-14621970
 ] 

Hadoop QA commented on YARN-3453:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m  4s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 47s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 36s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 46s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  6s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 19s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 25s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  50m 50s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  88m 53s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | 
hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions |
|   | 
hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCNodeUpdates |
|   | hadoop.yarn.server.resourcemanager.TestResourceTrackerService |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12744652/YARN-3453.5.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / d66302e |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8493/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8493/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8493/console |


This message was automatically generated.



[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-10 Thread Peng Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621919#comment-14621919
 ] 

Peng Zhang commented on YARN-3453:
--

Yes, I agree with doing the additional fix in a separate JIRA. And {{YARN-3453.5.patch}} LGTM.



[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-10 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621902#comment-14621902
 ] 

Arun Suresh commented on YARN-3453:
---

[~peng.zhang],

bq. After reviewing the comments above, I am reminded that the case (0 GB, non-zero 
cores), like (non-zero GB, 0 cores), will also cause preemption of more resources 
than necessary.
I agree... but instead of fixing it here, I feel that if we can have a comprehensive 
fix as requested by YARN-2154 ([~kasha] and I had an offline discussion about how we 
should actually break from the preemption loop when incoming requests are satisfied), 
then we won't even hit this case.
Furthermore, this JIRA fixes the {{isStarved()}} method in the queue correctly, so at 
the very least the {{toPreempt}} resource object would be smaller (and thus would 
implicitly result in fewer preemptions).

I also agree that finding the ratio of demand is definitely useful. But again, let us 
grab all the low-hanging fruit first. I propose we create a separate JIRA for that.
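
Purely as a rough sketch of the early-exit idea mentioned above (this is not the 
YARN-2154 design; {{toPreempt}}, {{candidates}} and {{warnOrKillContainer}} are 
assumed names standing in for whatever the scheduler actually uses), a preemption 
round could stop as soon as the accumulated preemptions cover the deficit:

{code:title=Early exit from a preemption round (illustrative sketch)|borderStyle=solid}
Resource preemptedSoFar = Resources.createResource(0, 0);
for (RMContainer container : candidates) {
  // Stop once what has already been preempted covers the deficit in every
  // dimension; preempting further containers would only cause thrashing.
  if (Resources.fitsIn(toPreempt, preemptedSoFar)) {
    break;
  }
  warnOrKillContainer(container);  // assumed stand-in for the actual preemption call
  Resources.addTo(preemptedSoFar, container.getContainer().getResource());
}
{code}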




[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-10 Thread Peng Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621883#comment-14621883
 ] 

Peng Zhang commented on YARN-3453:
--

I understood your thought. My suggestion is based on our practice: I found it 
confusing to use different policies in the queue configuration, e.g. if the parent 
uses fair and the child uses drf, the child queue may get no resources on the cpu 
dimension, so jobs will hang there. So we use only drf in one cluster, and changed 
the code to support setting the calculator class at scheduler scope.

After reviewing the comments above, I am reminded that the case (0 GB, non-zero 
cores), like (non-zero GB, 0 cores), will also cause preemption of more resources 
than necessary.

I mentioned before:
bq. To decrease this kind of waste, I want to find what ratio of the demand can be 
fulfilled by resourceUpperBound, and use this ratio * resourceUpperBound as 
targetResource.
Actually, the current implementation ignores the resource boundary of each requested 
container, so even with the above logic there will still be some waste.

As for YARN-2154, if we want to preempt only containers that can satisfy incoming 
requests, IMHO we should do preemption for each incoming request instead of counting 
them up with {{resourceDeficit}}.
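
To make the quoted idea concrete, here is a hypothetical paraphrase of that scaling 
(not code from any patch; {{demand}} and {{resourceUpperBound}} are assumed inputs, 
and only memory and vcores are considered):

{code:title=Scaling the preemption target by the satisfiable ratio (hypothetical)|borderStyle=solid}
// Fraction of the demand that resourceUpperBound can actually satisfy,
// taken from the most constrained dimension and capped at 1.0.
double memRatio = demand.getMemory() == 0 ? 1.0
    : Math.min(1.0, (double) resourceUpperBound.getMemory() / demand.getMemory());
double cpuRatio = demand.getVirtualCores() == 0 ? 1.0
    : Math.min(1.0, (double) resourceUpperBound.getVirtualCores() / demand.getVirtualCores());
double ratio = Math.min(memRatio, cpuRatio);

// Use that fraction of the upper bound as the preemption target, instead of
// the full upper bound, so we do not preempt resources the demand cannot use.
Resource targetResource = Resources.multiply(resourceUpperBound, ratio);
{code}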



[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-09 Thread Peng Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621631#comment-14621631
 ] 

Peng Zhang commented on YARN-3453:
--

Thanks, [~asuresh], for working on this.

Comments:
# Why not change all usages of the calculator in FairScheduler to be policy-related? 
In the code below, RESOURCE_CALCULATOR only looks at memory, so it may return false 
when resToPreempt is (0, non-zero) under the DRF policy:
{code:title=FairScheduler.java|borderStyle=solid}
if (Resources.greaterThan(RESOURCE_CALCULATOR, clusterResource, resToPreempt,
    Resources.none())) {
{code}
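
For comparison, a policy-aware version of that check would pass the queue's own 
calculator instead of the memory-only default. This is only a sketch; 
{{getPolicy().getResourceCalculator()}} stands in for however the queue's policy 
exposes its calculator:

{code:title=FairScheduler.java (illustrative sketch, not the patch)|borderStyle=solid}
// Use the calculator matching the queue's scheduling policy (e.g.
// DominantResourceCalculator under DRF), so a (0 GB, non-zero vcores)
// deficit is still treated as something worth preempting for.
ResourceCalculator calc = sched.getPolicy().getResourceCalculator();  // assumed accessor
if (Resources.greaterThan(calc, clusterResource, resToPreempt, Resources.none())) {
  // proceed with preemption for this queue
}
{code}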






[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-09 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620643#comment-14620643
 ] 

Arun Suresh commented on YARN-3453:
---

The test case failure is unrelated. Jenkins had passed when I kicked it off manually 
[here|https://issues.apache.org/jira/browse/YARN-3453?focusedCommentId=14620218&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14620218]



[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620228#comment-14620228
 ] 

Hadoop QA commented on YARN-3453:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m  0s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 41s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 44s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  6s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 23s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 26s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  61m 14s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  99m 10s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
 |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/1276/YARN-3453.4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 63d0365 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8473/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8473/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8473/console |


This message was automatically generated.



[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-09 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620218#comment-14620218
 ] 

Hadoop QA commented on YARN-3453:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  16m  5s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 43s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 34s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 47s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  6s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 20s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 27s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  50m 58s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  89m  1s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/1276/YARN-3453.4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 63d0365 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8474/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8474/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8474/console |


This message was automatically generated.

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> 
>
> Key: YARN-3453
> URL: https://issues.apache.org/jira/browse/YARN-3453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Ashwin Shankar
>Assignee: Arun Suresh
> Attachments: YARN-3453.1.patch, YARN-3453.2.patch, YARN-3453.3.patch, 
> YARN-3453.4.patch
>
>
> There are two places in preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode.
> Which basically results in more resources getting preempted than needed, and 
> those extra preempted containers aren’t even getting to the “starved” queue 
> since scheduling logic is based on DRF's Calculator.
> Following are the two places :
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >=  fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --
> One more thing that I believe needs to change in DRF mode is : during a 
> preemption round,if preempting a few containers results in satisfying needs 
> of a resource type, then we should exit that preemption round, since the 
> containers that we just preempted should bring the dominant resource usage to 
> min/fair share.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-07-06 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615236#comment-14615236
 ] 

Arun Suresh commented on YARN-3453:
---

Apologies for the delay.
Thanks for the review, [~ashwinshankar77].

bq. Why are we not using componentwisemin here ?
Yup, this can be changed to componentwiseMin.
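For reference, a minimal sketch of what that change might look like (assuming the minshare-based branch of resToPreempt quoted above; not the committed patch):
{code}
// Take the per-resource minimum of min share and demand, so neither memory
// nor vcores is over-counted when computing the preemption target.
Resource target = Resources.componentwiseMin(sched.getMinShare(), sched.getDemand());
{code}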

bq. FairScheduler.preemptResources() uses DefaultResourceCalculator and hence 
would look at only memory.
Agreed. And as you mentioned, I think we will have to fix YARN-2154 too.

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> 
>
> Key: YARN-3453
> URL: https://issues.apache.org/jira/browse/YARN-3453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Ashwin Shankar
>Assignee: Arun Suresh
> Attachments: YARN-3453.1.patch, YARN-3453.2.patch
>
>
> There are two places in preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode.
> Which basically results in more resources getting preempted than needed, and 
> those extra preempted containers aren’t even getting to the “starved” queue 
> since scheduling logic is based on DRF's Calculator.
> Following are the two places :
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >=  fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --
> One more thing that I believe needs to change in DRF mode is : during a 
> preemption round,if preempting a few containers results in satisfying needs 
> of a resource type, then we should exit that preemption round, since the 
> containers that we just preempted should bring the dominant resource usage to 
> min/fair share.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-06-09 Thread Ashwin Shankar (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14579588#comment-14579588
 ] 

Ashwin Shankar commented on YARN-3453:
--

Hey Arun,
Thanks for working on this!
A couple more comments in addition to Karthik's:
1. Why are we not using componentwiseMin here?
Resource target = Resources.min(calc, clusterResource,
   sched.getMinShare(), sched.getDemand());

2. FairScheduler.preemptResources() uses DefaultResourceCalculator and hence 
would look at only memory.
This could lead to a problem in the following scenario:
Preemption round 0: toPreempt = (100G, 10 cores)
...

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> 
>
> Key: YARN-3453
> URL: https://issues.apache.org/jira/browse/YARN-3453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Ashwin Shankar
>Assignee: Arun Suresh
> Attachments: YARN-3453.1.patch, YARN-3453.2.patch
>
>
> There are two places in preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode.
> Which basically results in more resources getting preempted than needed, and 
> those extra preempted containers aren’t even getting to the “starved” queue 
> since scheduling logic is based on DRF's Calculator.
> Following are the two places :
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >=  fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --
> One more thing that I believe needs to change in DRF mode is : during a 
> preemption round,if preempting a few containers results in satisfying needs 
> of a resource type, then we should exit that preemption round, since the 
> containers that we just preempted should bring the dominant resource usage to 
> min/fair share.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-06-05 Thread Ashwin Shankar (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14574928#comment-14574928
 ] 

Ashwin Shankar commented on YARN-3453:
--

Hey folks, looking into the patch; I will get back with comments.

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> 
>
> Key: YARN-3453
> URL: https://issues.apache.org/jira/browse/YARN-3453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Ashwin Shankar
>Assignee: Arun Suresh
> Attachments: YARN-3453.1.patch, YARN-3453.2.patch
>
>
> There are two places in preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode.
> Which basically results in more resources getting preempted than needed, and 
> those extra preempted containers aren’t even getting to the “starved” queue 
> since scheduling logic is based on DRF's Calculator.
> Following are the two places :
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >=  fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --
> One more thing that I believe needs to change in DRF mode is : during a 
> preemption round,if preempting a few containers results in satisfying needs 
> of a resource type, then we should exit that preemption round, since the 
> containers that we just preempted should bring the dominant resource usage to 
> min/fair share.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-06-04 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573754#comment-14573754
 ] 

Karthik Kambatla commented on YARN-3453:


A few comments:
# The new imports in FairScheduler and FSLeafQueue are not required.
# Looking at the remaining uses of DefaultResourceCalculator in FairScheduler, 
could we benefit from updating all of them to DominantResourceCalculator? 
[~ashwinshankar77], do you concur?
# In FairScheduler, changing the scope of RESOURCE_CALCULATOR and 
DOMINANT_RESOURCE_CALCULATOR is not required.
# We should add unit tests to avoid regressions in the future.
# Nit: in each of the policies, my preference would be to not make the calculator 
and comparator members static unless required. We have had cases where our tests 
invoked multiple instances of a class, leading to issues. Not that I foresee 
multiple instantiations of these classes, but I would like to avoid it if we can 
(a minimal illustration follows below).
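For item 5, a minimal illustration of the non-static layout (a hypothetical, simplified class; the real policy has more members and methods):
{code}
public class DominantResourceFairnessPolicy extends SchedulingPolicy {
  // Instance members: each policy object owns its comparator and calculator,
  // so tests that build several schedulers in one JVM do not share hidden state.
  private final DominantResourceFairnessComparator comparator =
      new DominantResourceFairnessComparator();
  private final DominantResourceCalculator calculator =
      new DominantResourceCalculator();
  // ... remaining members and methods omitted ...
}
{code}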

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> 
>
> Key: YARN-3453
> URL: https://issues.apache.org/jira/browse/YARN-3453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Ashwin Shankar
>Assignee: Arun Suresh
> Attachments: YARN-3453.1.patch, YARN-3453.2.patch
>
>
> There are two places in preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode.
> Which basically results in more resources getting preempted than needed, and 
> those extra preempted containers aren’t even getting to the “starved” queue 
> since scheduling logic is based on DRF's Calculator.
> Following are the two places :
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >=  fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --
> One more thing that I believe needs to change in DRF mode is : during a 
> preemption round,if preempting a few containers results in satisfying needs 
> of a resource type, then we should exit that preemption round, since the 
> containers that we just preempted should bring the dominant resource usage to 
> min/fair share.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-06-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571859#comment-14571859
 ] 

Hadoop QA commented on YARN-3453:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch |  15m  7s | Findbugs (version ) appears to 
be broken on trunk. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 39s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 30s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 25s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  50m  7s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  86m 58s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12737386/YARN-3453.2.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / bc85959 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8187/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8187/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8187/console |


This message was automatically generated.

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> 
>
> Key: YARN-3453
> URL: https://issues.apache.org/jira/browse/YARN-3453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Ashwin Shankar
>Assignee: Arun Suresh
> Attachments: YARN-3453.1.patch, YARN-3453.2.patch
>
>
> There are two places in preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode.
> Which basically results in more resources getting preempted than needed, and 
> those extra preempted containers aren’t even getting to the “starved” queue 
> since scheduling logic is based on DRF's Calculator.
> Following are the two places :
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >=  fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --
> One more thing that I believe needs to change in DRF mode is : during a 
> preemption round,if preempting a few containers results in satisfying needs 
> of a resource type, then we should exit that preemption round, since the 
> containers that we just preempted should bring the dominant resource usage to 
> min/fair share.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-06-03 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571667#comment-14571667
 ] 

Karthik Kambatla commented on YARN-3453:


Should we add {{SchedulingPolicy#getResourceCalculator()}} and use that 
instead? 
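A rough sketch of the suggested shape (assumed wiring, not necessarily what the final patch does): each policy exposes the calculator matching its fairness semantics, and the preemption code asks the queue's policy instead of hard-coding one.
{code}
public abstract class SchedulingPolicy {
  // ... existing members omitted ...
  public abstract ResourceCalculator getResourceCalculator();
}

public class DominantResourceFairnessPolicy extends SchedulingPolicy {
  private final ResourceCalculator calculator = new DominantResourceCalculator();

  @Override
  public ResourceCalculator getResourceCalculator() {
    return calculator;
  }
}
{code}
The preemption paths (isStarved(), resToPreempt()) could then call something like {{sched.getPolicy().getResourceCalculator()}} instead of referring to a fixed calculator.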

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> 
>
> Key: YARN-3453
> URL: https://issues.apache.org/jira/browse/YARN-3453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Ashwin Shankar
>Assignee: Arun Suresh
> Attachments: YARN-3453.1.patch
>
>
> There are two places in preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode.
> Which basically results in more resources getting preempted than needed, and 
> those extra preempted containers aren’t even getting to the “starved” queue 
> since scheduling logic is based on DRF's Calculator.
> Following are the two places :
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >=  fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --
> One more thing that I believe needs to change in DRF mode is : during a 
> preemption round,if preempting a few containers results in satisfying needs 
> of a resource type, then we should exit that preemption round, since the 
> containers that we just preempted should bring the dominant resource usage to 
> min/fair share.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-06-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571480#comment-14571480
 ] 

Hadoop QA commented on YARN-3453:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 55s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 37s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 35s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 46s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 25s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  50m 14s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  88m 10s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12737319/YARN-3453.1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / c59e745 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8179/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8179/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8179/console |


This message was automatically generated.

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> 
>
> Key: YARN-3453
> URL: https://issues.apache.org/jira/browse/YARN-3453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Ashwin Shankar
>Assignee: Arun Suresh
> Attachments: YARN-3453.1.patch
>
>
> There are two places in preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode.
> Which basically results in more resources getting preempted than needed, and 
> those extra preempted containers aren’t even getting to the “starved” queue 
> since scheduling logic is based on DRF's Calculator.
> Following are the two places :
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >=  fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --
> One more thing that I believe needs to change in DRF mode is : during a 
> preemption round,if preempting a few containers results in satisfying needs 
> of a resource type, then we should exit that preemption round, since the 
> containers that we just preempted should bring the dominant resource usage to 
> min/fair share.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-04-22 Thread Peng Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14508519#comment-14508519
 ] 

Peng Zhang commented on YARN-3453:
--

Updated code snippet for the calculation of resDueToFairShare:
{code}
  Resource target;
  if (resourceCalculator instanceof DominantResourceCalculator) {
    // fraction of the demand that fits under resourceUpperBound (min per-resource ratio)
    float targetRatio = Math.min(1,
        ((DominantResourceCalculator) resourceCalculator)
            .getResourceAsValue(sched.getDemand(), resourceUpperBound, false));
    target = Resources.multiply(sched.getDemand(), targetRatio);
  } else {
    target = resourceUpperBound;
  }
  resDueToFairShare = Resources.max(resourceCalculator, clusterResource,
      Resources.none(), Resources.subtract(target, sched.getResourceUsage()));
{code}

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> 
>
> Key: YARN-3453
> URL: https://issues.apache.org/jira/browse/YARN-3453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Ashwin Shankar
>
> There are two places in preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode.
> Which basically results in more resources getting preempted than needed, and 
> those extra preempted containers aren’t even getting to the “starved” queue 
> since scheduling logic is based on DRF's Calculator.
> Following are the two places :
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >=  fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --
> One more thing that I believe needs to change in DRF mode is : during a 
> preemption round,if preempting a few containers results in satisfying needs 
> of a resource type, then we should exit that preemption round, since the 
> containers that we just preempted should bring the dominant resource usage to 
> min/fair share.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-04-13 Thread Peng Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493529#comment-14493529
 ] 

Peng Zhang commented on YARN-3453:
--

[~ashwinshankar77]
bq. Would it help if we have a separate method or modify 
DominantResourceCalculator.compare() (in a backward compatible way), such that 
if the dominant resource is equal, then return -1.
Changing compare() to fix this corner case may conflict with the "dominant" notion.
I find that some operations in DominantResourceCalculator are already not 
consistent with the "dominant" notion; they are really about multi-resource 
calculation in general.
So maybe we should extract a new layer for these operations: one layer for the 
resource dimension (memory only vs. multiple resources), and another for the 
policy (fair, dominant, or possibly others).

bq. Doing componentwiseMin makes sense. But why should we calculate targetRatio 
? Shouldnt we just preempt (resourceUpperBound - usage) ? Can you pls give an 
example ?
For example: fair share (100g, 3 cores), demand (10g, 10 cores); componentwiseMin 
returns (10g, 3 cores).
During preemption, assume each preempted container is (1g, 1 core).
After preempting 3 containers, toPreempt is (7g, 0 cores), which is still bigger 
than none (0g, 0 cores) even with the dominant calculator. So it will still 
preempt the other 7 containers (7g, 7 cores), which cannot be scheduled. This 
wastes preemption.
To reduce this waste, I want to find what ratio of the demand can be fulfilled 
under resourceUpperBound, and use ratio * demand as the target resource (see the 
sketch below).

Actually, the current implementation ignores the resource boundary of each 
requested container, so even with the above logic there will still be some waste.
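To make the numbers concrete, a small standalone sketch (plain Java, not the Hadoop API) of the targetRatio idea using the example above:
{code}
// fair share (100 GB, 3 cores), demand (10 GB, 10 cores), in MB and vcores
long[] fairShare = {100L * 1024, 3};
long[] demand    = {10L * 1024, 10};

// componentwiseMin(fairShare, demand): the most that preemption can usefully deliver
long[] upperBound = {Math.min(fairShare[0], demand[0]), Math.min(fairShare[1], demand[1])};

// fraction of the demand that fits under that bound: min(10240/10240, 3/10) = 0.3
double targetRatio = Math.min(1.0,
    Math.min((double) upperBound[0] / demand[0], (double) upperBound[1] / demand[1]));

// target = demand * ratio = (3 GB, 3 cores): preempting three (1 GB, 1 core)
// containers is enough, instead of ten
long[] target = {Math.round(demand[0] * targetRatio), Math.round(demand[1] * targetRatio)};
{code}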

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> 
>
> Key: YARN-3453
> URL: https://issues.apache.org/jira/browse/YARN-3453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Ashwin Shankar
>
> There are two places in preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode.
> Which basically results in more resources getting preempted than needed, and 
> those extra preempted containers aren’t even getting to the “starved” queue 
> since scheduling logic is based on DRF's Calculator.
> Following are the two places :
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >=  fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --
> One more thing that I believe needs to change in DRF mode is : during a 
> preemption round,if preempting a few containers results in satisfying needs 
> of a resource type, then we should exit that preemption round, since the 
> containers that we just preempted should bring the dominant resource usage to 
> min/fair share.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-04-13 Thread Ashwin Shankar (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492792#comment-14492792
 ] 

Ashwin Shankar commented on YARN-3453:
--

[~peng.zhang],
bq. Calculator should be configurable, so isStarved() can use 
DominantResourceCalculator to test. But this still has corner case: 
cluster(40960, 8), expected( 16384, 4), real(4096, 4) it still result with 
"starved". My idea for this is to lower starve threshold to ignore this. Any 
good suggestions?

Would it help if we had a separate method, or modified 
DominantResourceCalculator.compare() (in a backward-compatible way), such that 
if the dominant resource is equal, it returns -1? Basically, NOT do the following 
while checking for starvation:
{code}
else {
  l = getResourceAsValue(clusterResource, lhs, false);
  r = getResourceAsValue(clusterResource, rhs, false);
  if (l < r) {
return -1;
  } else if (l > r) {
return 1;
  }
}
{code} 
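To illustrate the intent on the corner case above (cluster (40960, 8), expected (16384, 4), real usage (4096, 4)), here is a standalone sketch (hypothetical helper, not the Hadoop API) of a starvation check that looks only at the dominant share:
{code}
// Returns true only if the dominant share of the usage is below the dominant
// share of the target; the secondary resource is ignored.
static boolean belowDominantShare(long usedMem, long usedCores,
                                  long targetMem, long targetCores,
                                  long clusterMem, long clusterCores) {
  double usedDominant   = Math.max((double) usedMem / clusterMem,
                                   (double) usedCores / clusterCores);
  double targetDominant = Math.max((double) targetMem / clusterMem,
                                   (double) targetCores / clusterCores);
  return usedDominant < targetDominant;
}

// belowDominantShare(4096, 4, 16384, 4, 40960, 8) -> false: the queue already
// holds its full share of the dominant resource (cores), so it is not starved.
{code}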

bq. in resToPreempt(), we should change logic of calculation, changing 
calculator is not enough. 

Doing componentwiseMin makes sense. But why should we calculate targetRatio? 
Shouldn't we just preempt (resourceUpperBound - usage)? Can you please give an 
example?

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> 
>
> Key: YARN-3453
> URL: https://issues.apache.org/jira/browse/YARN-3453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Ashwin Shankar
>
> There are two places in preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode.
> Which basically results in more resources getting preempted than needed, and 
> those extra preempted containers aren’t even getting to the “starved” queue 
> since scheduling logic is based on DRF's Calculator.
> Following are the two places :
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >=  fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --
> One more thing that I believe needs to change in DRF mode is : during a 
> preemption round,if preempting a few containers results in satisfying needs 
> of a resource type, then we should exit that preemption round, since the 
> containers that we just preempted should bring the dominant resource usage to 
> min/fair share.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing

2015-04-13 Thread Peng Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492440#comment-14492440
 ] 

Peng Zhang commented on YARN-3453:
--

[~ashwinshankar77]
I've hit the same problem in our cluster. I think two changes are needed to make 
it work:
# The calculator should be configurable, so isStarved() can use 
DominantResourceCalculator for the check. But there is still a corner case: with 
cluster (40960, 8), expected (16384, 4), and real usage (4096, 4), the queue is 
still reported as "starved". My idea is to lower the starvation threshold to 
ignore this. Any better suggestions?
# In resToPreempt(), we should change the calculation logic; changing the 
calculator is not enough. My hacked code is like below:
{code}
-  Resource target = Resources.min(getResourceCalculator(), clusterResource,
-      sched.getFairShare(), sched.getDemand());
+  // per-resource min of fair share and demand: the upper bound on what this
+  // queue can usefully request via preemption
+  Resource resourceUpperBound = Resources.componentwiseMin(
+      sched.getFairShare(), sched.getDemand());
+  float targetRatio = 0;
+  // getResourceAsValue is not defined on ResourceCalculator itself, and I cannot
+  // figure out a good name for this calculation, so hack like this
+  if (resourceCalculator instanceof DominantResourceCalculator) {
+    // fraction of the demand that can be satisfied under resourceUpperBound
+    // (the min of the per-resource ratios for cpu and memory)
+    targetRatio = ((DominantResourceCalculator) resourceCalculator)
+        .getResourceAsValue(sched.getDemand(), resourceUpperBound, false);
+  } else {
+    targetRatio = Resources.ratio(resourceCalculator, sched.getDemand(),
+        resourceUpperBound);
+  }
+  // the part of the demand that can be fulfilled under the fair share
+  Resource target = Resources.multiply(sched.getDemand(), targetRatio);
{code} 

Besides "drf" policy problem with preemption, I also filed YARN-3405 to 
describe some common problems with preemption.

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator 
> even in DRF mode causing thrashing
> 
>
> Key: YARN-3453
> URL: https://issues.apache.org/jira/browse/YARN-3453
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Ashwin Shankar
>
> There are two places in preemption code flow where DefaultResourceCalculator 
> is used, even in DRF mode.
> Which basically results in more resources getting preempted than needed, and 
> those extra preempted containers aren’t even getting to the “starved” queue 
> since scheduling logic is based on DRF's Calculator.
> Following are the two places :
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >=  fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --
> One more thing that I believe needs to change in DRF mode is : during a 
> preemption round,if preempting a few containers results in satisfying needs 
> of a resource type, then we should exit that preemption round, since the 
> containers that we just preempted should bring the dominant resource usage to 
> min/fair share.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)