[jira] [Commented] (YARN-5047) Refactor nodeUpdate across schedulers
[ https://issues.apache.org/jira/browse/YARN-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593965#comment-15593965 ]

Hudson commented on YARN-5047:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10652 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10652/])
YARN-5047. Refactor nodeUpdate across schedulers. (Ray Chiang via kasha) (kasha: rev 754cb4e30fac1c5fe8d44626968c0ddbfe459335)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java

> Refactor nodeUpdate across schedulers
> -------------------------------------
>
>                 Key: YARN-5047
>                 URL: https://issues.apache.org/jira/browse/YARN-5047
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler, fairscheduler, scheduler
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Ray Chiang
>            Assignee: Ray Chiang
>             Fix For: 2.9.0
>
>         Attachments: YARN-5047.001.patch, YARN-5047.002.patch, YARN-5047.003.patch, YARN-5047.004.patch, YARN-5047.005.patch, YARN-5047.006.patch, YARN-5047.007.patch, YARN-5047.008.patch, YARN-5047.009.patch, YARN-5047.010.patch, YARN-5047.011.patch, YARN-5047.012.patch
>
> FairScheduler#nodeUpdate() and CapacityScheduler#nodeUpdate() have a lot of commonality in their code. See about refactoring the common parts into AbstractYarnScheduler.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
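The refactoring described in YARN-5047 above is essentially the template-method pattern: shared nodeUpdate() bookkeeping moves into the abstract base class, and each scheduler overrides only its scheduling hook. A minimal sketch of that idea, with hypothetical class and field names (not the actual Hadoop code):

```java
// Hypothetical sketch of the YARN-5047 refactoring idea: common
// nodeUpdate() bookkeeping in the base class, scheduler-specific
// work in an abstract hook. Names are illustrative, not Hadoop's.
abstract class BaseScheduler {
    protected int nodeUpdateCount = 0;

    // Shared path: every scheduler runs the same bookkeeping, then
    // delegates to its own allocation attempt.
    public final void nodeUpdate(String nodeId) {
        nodeUpdateCount++;          // common accounting
        attemptScheduling(nodeId);  // scheduler-specific hook
    }

    protected abstract void attemptScheduling(String nodeId);
}

class FairLikeScheduler extends BaseScheduler {
    String lastNode;

    @Override
    protected void attemptScheduling(String nodeId) {
        lastNode = nodeId;          // stand-in for fair-share logic
    }
}
```

With this shape, FairScheduler, CapacityScheduler, and FifoScheduler would share one nodeUpdate() implementation instead of three near-copies.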
[jira] [Comment Edited] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593945#comment-15593945 ]

Arun Suresh edited comment on YARN-4597 at 10/21/16 4:31 AM:
-------------------------------------------------------------

[~jianhe], thanks again for taking a look.

bq. I think there might be some behavior change or bug for scheduling guaranteed containers when the opportunistic-queue is enabled. Previously, when launching a container, the NM would not check current vmem and cpu usage. It assumed what the RM allocated could be launched. Now, the NM will check these limits and won't launch the container if it hits the limit.

Yup, we do a *hasResources* check only at the start of a container and when a container is killed. We assumed that the resources requested by a container are constant; essentially, we considered only the actual *allocated* resources, which we assume will not vary during the lifetime of the container. This implies there is no point in checking at any time other than the start and kill of containers. But like you stated, if we consider container resource *utilization*, based on the work [~kasha] is doing in YARN-1011, then yes, we should have a timer thread that periodically checks vmem and cpu usage and starts (and kills) containers based on that.

bq. the ResourceUtilizationManager looks like it only incorporated some utility methods; not sure how we will make this pluggable later.

Following on my point above, the idea was to have a {{ResourceUtilizationManager}} that can provide different values for {{getCurrentUtilization}}, {{addResource}} and {{subtractResource}}, which are used by the ContainerScheduler to calculate the resources to free up. For instance, the current default one only takes into account the actual resources *allocated* to containers; for YARN-1011, we might replace that with the resources *utilized* by running containers and provide a different value for {{getCurrentUtilization}}. The timer thread I mentioned in the previous point, which can be a part of this new ResourceUtilizationManager, can send events to the scheduler to re-process queued containers when utilization has changed.

bq. The logic to select opportunistic containers: we may kill more opportunistic containers than required. e.g...

Good catch. In {{resourcesToFreeUp}}, I needed to decrement any already-marked-for-kill opportunistic container. It was there earlier; I had removed it while testing something but forgot to put it back :)

bq. we don't need to synchronize on the currentUtilization object? I don't see any other place it's synchronized

Yup, it isn't required. Varun did point out the same; I thought I had fixed it, but I think I might have missed 'git add'ing the change. W.r.t. adding the new transitions, I was seeing some error messages in some test cases. I will rerun and see if they are required, but in any case, having them there should be harmless, right?

The rest of your comments make sense; I will address them shortly.
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593945#comment-15593945 ]

Arun Suresh commented on YARN-4597:
-----------------------------------

[~jianhe], thanks again for taking a look.

bq. I think there might be some behavior change or bug for scheduling guaranteed containers when the opportunistic-queue is enabled. Previously, when launching a container, the NM would not check current vmem and cpu usage. It assumed what the RM allocated could be launched. Now, the NM will check these limits and won't launch the container if it hits the limit.

Yup, we do a *hasResources* check only at the start of a container and when a container is killed. We assumed that the resources requested by a container are constant; essentially, we considered only the actual *allocated* resources, which we assume will not vary during the lifetime of the container. This implies there is no point in checking at any time other than the start and kill of containers. But like you stated, if we consider container resource *utilization*, based on the work [~kasha] is doing in YARN-1011, then yes, we should have a timer thread that periodically checks vmem and cpu usage and starts (and kills) containers based on that.

bq. the ResourceUtilizationManager looks like it only incorporated some utility methods; not sure how we will make this pluggable later.

Following on my point above, the idea was to have a {{ResourceUtilizationManager}} that can provide different values for {{getCurrentUtilization}}, {{addResource}} and {{subtractResource}}, which are used by the ContainerScheduler to calculate the resources to free up. For instance, the current default one only takes into account the actual resources *allocated* to containers; for YARN-1011, we might replace that with the resources *utilized* by running containers and provide a different value for {{getCurrentUtilization}}. The timer thread I mentioned in the previous point, which can be a part of this new ResourceUtilizationManager, can send events to the scheduler to re-process queued containers when utilization has changed.

bq. The logic to select opportunistic containers: we may kill more opportunistic containers than required. e.g...

Good catch. In {{resourcesToFreeUp}}, I needed to decrement any already-marked-for-kill opportunistic container. It was there earlier; I had removed it while testing something but forgot to put it back :)

bq. we don't need to synchronize on the currentUtilization object? I don't see any other place it's synchronized

Yup, it isn't required. Varun did point out the same; I thought I had fixed it, but I think I might have missed 'git add'ing the change. W.r.t. adding the new transitions, I was seeing some error messages in some test cases. I will rerun and see if they are required, but in any case, having them there should be harmless, right?

The rest of your comments make sense; I will address them shortly.

> Add SCHEDULE to NM container lifecycle
> --------------------------------------
>
>                 Key: YARN-4597
>                 URL: https://issues.apache.org/jira/browse/YARN-4597
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Chris Douglas
>            Assignee: Arun Suresh
>         Attachments: YARN-4597.001.patch, YARN-4597.002.patch, YARN-4597.003.patch
>
> Currently, the NM immediately launches containers after resource localization. Several features could be more cleanly implemented if the NM included a separate stage for reserving resources.
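The pluggability discussed in the comment above can be pictured as a small interface whose default implementation tracks *allocated* resources, with a utilization-based variant swappable in for YARN-1011. This is an illustrative sketch only; the interface and class names are hypothetical, not the actual patch:

```java
// Hypothetical sketch of a pluggable utilization source for the NM's
// ContainerScheduler. The default implementation tracks *allocated*
// memory; a YARN-1011-style variant could instead report memory
// actually *utilized* by running containers.
interface UtilizationSource {
    void addResource(long memMb);
    void subtractResource(long memMb);
    long getCurrentUtilization();
}

// Default: utilization == sum of allocations of running containers.
class AllocationBasedSource implements UtilizationSource {
    private long allocatedMb = 0;

    public void addResource(long memMb)      { allocatedMb += memMb; }
    public void subtractResource(long memMb) { allocatedMb -= memMb; }
    public long getCurrentUtilization()      { return allocatedMb; }
}
```

The scheduler would call only the interface, so swapping in a utilization-based source (plus the timer thread mentioned above) would not change the freeing-up logic itself.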
[jira] [Commented] (YARN-4911) Bad placement policy in FairScheduler causes the RM to crash
[ https://issues.apache.org/jira/browse/YARN-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593930#comment-15593930 ]

Hudson commented on YARN-4911:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10651 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10651/])
YARN-4911. Bad placement policy in FairScheduler causes the RM to crash (kasha: rev a064865abf7dceee46d3c42eca67a04a25af9d4e)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestQueuePlacementPolicy.java

> Bad placement policy in FairScheduler causes the RM to crash
> ------------------------------------------------------------
>
>                 Key: YARN-4911
>                 URL: https://issues.apache.org/jira/browse/YARN-4911
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>            Reporter: Ray Chiang
>            Assignee: Ray Chiang
>              Labels: supportability
>             Fix For: 2.9.0
>
>         Attachments: YARN-4911.001.patch, YARN-4911.002.patch, YARN-4911.003.patch, YARN-4911.004.patch
>
> When you have a fair-scheduler.xml with the rule:
>
>
>
> and the queue okay1 doesn't exist, the following exception occurs in the RM:
> 2016-04-01 16:56:33,383 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type APP_ADDED to the scheduler
> java.lang.IllegalStateException: Should have applied a rule before reaching here
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueuePlacementPolicy.assignAppToQueue(QueuePlacementPolicy.java:173)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.assignToQueue(FairScheduler.java:728)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:634)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1224)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:112)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:691)
>         at java.lang.Thread.run(Thread.java:745)
> which causes the RM to crash.
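The stack trace above shows the failure mode: the placement policy runs out of rules without making a decision, and the resulting IllegalStateException escapes on the scheduler's event-handling thread, taking down the RM. One direction for a fix is to reject such a policy when the configuration is loaded rather than when an application arrives. A hypothetical, simplified sketch of that validation (illustrative names, not the actual YARN-4911 patch):

```java
import java.util.List;

// Hypothetical sketch: validate at config-load time that a queue
// placement policy ends in a terminal rule (one that always produces
// a decision), so a bad policy fails the config refresh instead of
// crashing the RM's event dispatcher when an app is added.
class PlacementRule {
    final String name;
    final boolean terminal;  // e.g. a "default"-style catch-all rule

    PlacementRule(String name, boolean terminal) {
        this.name = name;
        this.terminal = terminal;
    }
}

class PlacementPolicyValidator {
    static void validate(List<PlacementRule> rules) {
        if (rules.isEmpty() || !rules.get(rules.size() - 1).terminal) {
            throw new IllegalArgumentException(
                "Placement policy must end with a terminal rule");
        }
    }
}
```

Failing fast at refresh time turns a fatal runtime crash into an admin-visible configuration error.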
[jira] [Commented] (YARN-5047) Refactor nodeUpdate across schedulers
[ https://issues.apache.org/jira/browse/YARN-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593913#comment-15593913 ]

Karthik Kambatla commented on YARN-5047:
----------------------------------------

The reported checkstyle issues are benign, and the test failures are flaky: the last two builds have two different test failures. +1. Checking this in.

> Refactor nodeUpdate across schedulers
> -------------------------------------
>
>                 Key: YARN-5047
>                 URL: https://issues.apache.org/jira/browse/YARN-5047
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler, fairscheduler, scheduler
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Ray Chiang
>            Assignee: Ray Chiang
>         Attachments: YARN-5047.001.patch, YARN-5047.002.patch, YARN-5047.003.patch, YARN-5047.004.patch, YARN-5047.005.patch, YARN-5047.006.patch, YARN-5047.007.patch, YARN-5047.008.patch, YARN-5047.009.patch, YARN-5047.010.patch, YARN-5047.011.patch, YARN-5047.012.patch
>
> FairScheduler#nodeUpdate() and CapacityScheduler#nodeUpdate() have a lot of commonality in their code. See about refactoring the common parts into AbstractYarnScheduler.
[jira] [Commented] (YARN-5047) Refactor nodeUpdate across schedulers
[ https://issues.apache.org/jira/browse/YARN-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593886#comment-15593886 ]

Hadoop QA commented on YARN-5047:
---------------------------------

(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 7m 12s | trunk passed |
| +1 | compile | 0m 32s | trunk passed |
| +1 | checkstyle | 0m 28s | trunk passed |
| +1 | mvnsite | 0m 39s | trunk passed |
| +1 | mvneclipse | 0m 16s | trunk passed |
| +1 | findbugs | 1m 3s | trunk passed |
| +1 | javadoc | 0m 20s | trunk passed |
| +1 | mvninstall | 0m 32s | the patch passed |
| +1 | compile | 0m 30s | the patch passed |
| +1 | javac | 0m 30s | the patch passed |
| -1 | checkstyle | 0m 25s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 622 unchanged - 5 fixed = 624 total (was 627) |
| +1 | mvnsite | 0m 37s | the patch passed |
| +1 | mvneclipse | 0m 13s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 4s | the patch passed |
| +1 | javadoc | 0m 17s | the patch passed |
| -1 | unit | 35m 5s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 14s | The patch does not generate ASF License warnings. |
| | | 50m 25s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832210/YARN-5047.012.patch |
| JIRA Issue | YARN-5047 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux f119cf665123 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d7d87de |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13463/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/13463/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/13463/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13463/testReport/ |
| modules | C:
[jira] [Commented] (YARN-5047) Refactor nodeUpdate across schedulers
[ https://issues.apache.org/jira/browse/YARN-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593888#comment-15593888 ]

Hadoop QA commented on YARN-5047:
---------------------------------

(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 22s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 54s | trunk passed |
| +1 | compile | 0m 43s | trunk passed |
| +1 | checkstyle | 0m 34s | trunk passed |
| +1 | mvnsite | 0m 47s | trunk passed |
| +1 | mvneclipse | 0m 20s | trunk passed |
| +1 | findbugs | 1m 11s | trunk passed |
| +1 | javadoc | 0m 28s | trunk passed |
| +1 | mvninstall | 0m 42s | the patch passed |
| +1 | compile | 0m 39s | the patch passed |
| +1 | javac | 0m 39s | the patch passed |
| -1 | checkstyle | 0m 31s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 622 unchanged - 5 fixed = 624 total (was 627) |
| +1 | mvnsite | 0m 46s | the patch passed |
| +1 | mvneclipse | 0m 19s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 20s | the patch passed |
| +1 | javadoc | 0m 24s | the patch passed |
| -1 | unit | 39m 16s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 15s | The patch does not generate ASF License warnings. |
| | | 58m 21s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMAdminService |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12832210/YARN-5047.012.patch |
| JIRA Issue | YARN-5047 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux e46ce9b94fe2 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d7d87de |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13462/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/13462/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/13462/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13462/testReport/ |
| modules | C:
[jira] [Commented] (YARN-2009) Priority support for preemption in ProportionalCapacityPreemptionPolicy
[ https://issues.apache.org/jira/browse/YARN-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593887#comment-15593887 ]

Sunil G commented on YARN-2009:
-------------------------------

Thanks [~eepayne] for sharing the detailed test scenario. I also tested manually, but the test code I posted was from my unit test. I will try to mock the case you mentioned and will update with the cause. Thank you.

> Priority support for preemption in ProportionalCapacityPreemptionPolicy
> -----------------------------------------------------------------------
>
>                 Key: YARN-2009
>                 URL: https://issues.apache.org/jira/browse/YARN-2009
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>            Reporter: Devaraj K
>            Assignee: Sunil G
>         Attachments: YARN-2009.0001.patch, YARN-2009.0002.patch, YARN-2009.0003.patch, YARN-2009.0004.patch, YARN-2009.0005.patch, YARN-2009.0006.patch, YARN-2009.0007.patch, YARN-2009.0008.patch, YARN-2009.0009.patch, YARN-2009.0010.patch, YARN-2009.0011.patch
>
> While preempting containers based on the queue ideal assignment, we may need to consider preempting the low-priority application containers first.
[jira] [Commented] (YARN-4911) Bad placement policy in FairScheduler causes the RM to crash
[ https://issues.apache.org/jira/browse/YARN-4911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15593882#comment-15593882 ]

Karthik Kambatla commented on YARN-4911:
----------------------------------------

+1. Checking this in.

> Bad placement policy in FairScheduler causes the RM to crash
> ------------------------------------------------------------
>
>                 Key: YARN-4911
>                 URL: https://issues.apache.org/jira/browse/YARN-4911
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>            Reporter: Ray Chiang
>            Assignee: Ray Chiang
>              Labels: supportability
>         Attachments: YARN-4911.001.patch, YARN-4911.002.patch, YARN-4911.003.patch, YARN-4911.004.patch
>
> When you have a fair-scheduler.xml with the rule:
>
>
>
> and the queue okay1 doesn't exist, the following exception occurs in the RM:
> 2016-04-01 16:56:33,383 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type APP_ADDED to the scheduler
> java.lang.IllegalStateException: Should have applied a rule before reaching here
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueuePlacementPolicy.assignAppToQueue(QueuePlacementPolicy.java:173)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.assignToQueue(FairScheduler.java:728)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:634)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1224)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:112)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:691)
>         at java.lang.Thread.run(Thread.java:745)
> which causes the RM to crash.
[jira] [Commented] (YARN-5388) MAPREDUCE-6719 requires changes to DockerContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-5388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593878#comment-15593878 ] Karthik Kambatla commented on YARN-5388: +1 on the trunk patch. For the branch-2 patch, I am not sure we should make the code improvements in DockerContainerExecutor.java. > MAPREDUCE-6719 requires changes to DockerContainerExecutor > -- > > Key: YARN-5388 > URL: https://issues.apache.org/jira/browse/YARN-5388 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Critical > Attachments: YARN-5388.001.patch, YARN-5388.002.patch, > YARN-5388.003.patch, YARN-5388.branch-2.001.patch, > YARN-5388.branch-2.002.patch, YARN-5388.branch-2.003.patch > > > Because the {{DockerContainerExecutor}} overrides the {{writeLaunchEnv()}} > method, it must also have the wildcard processing logic from > YARN-4958/YARN-5373 added to it. Without it, the use of -libjars will fail > unless wildcarding is disabled.
[jira] [Commented] (YARN-5724) [Umbrella] Better Queue Management in YARN
[ https://issues.apache.org/jira/browse/YARN-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593753#comment-15593753 ] Karthik Kambatla commented on YARN-5724: [~xgong] - is this proposal specific to CapacityScheduler? Or, are you suggesting common changes that all schedulers could benefit from? > [Umbrella] Better Queue Management in YARN > -- > > Key: YARN-5724 > URL: https://issues.apache.org/jira/browse/YARN-5724 > Project: Hadoop YARN > Issue Type: Task >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: Designdocv1-Configuration-basedQueueManagementinYARN.pdf > > > This serves as an umbrella ticket for tasks related to better queue > management in YARN. > Today, the only way to manage queues is for admins to edit > configuration files and then issue a refresh command. This brings many > inconveniences. For example, users cannot create/delete/modify their > own queues without talking to site-level admins. > Even in today's configuration-based approach, several places still > need improvement: > * It is possible today to add or modify queues without restarting the RM, > via a CS refresh. But to delete a queue, we have to restart the > ResourceManager. > * When a queue is STOPPED, resources allocated to the queue can be handled > better. Currently, they'll only be used if the other queues are set up to go > over their capacity.
[jira] [Commented] (YARN-5711) AM cannot reconnect to RM after failover when using RequestHedgingRMFailoverProxyProvider
[ https://issues.apache.org/jira/browse/YARN-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593657#comment-15593657 ] Hadoop QA commented on YARN-5711: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 6s {color} | {color:red} YARN-5711 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834588/YARN-5711-v1.patch | | JIRA Issue | YARN-5711 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13461/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > AM cannot reconnect to RM after failover when using > RequestHedgingRMFailoverProxyProvider > - > > Key: YARN-5711 > URL: https://issues.apache.org/jira/browse/YARN-5711 > Project: Hadoop YARN > Issue Type: Bug > Components: applications, resourcemanager >Affects Versions: 2.9.0, 3.0.0-alpha1 >Reporter: Subru Krishnan >Assignee: Subru Krishnan >Priority: Critical > Attachments: YARN-5711-v1.patch > > > When the RM fails over, it does _not_ automatically re-register running apps, so they > need to re-register when reconnecting to the new primary. This is done by > catching {{ApplicationMasterNotRegisteredException}} in *allocate* calls and > re-registering. But *RequestHedgingRMFailoverProxyProvider* does _not_ > propagate {{YarnException}}, as the actual invocation is done asynchronously > using separate threads, so AMs cannot reconnect to the RM after failover. > This JIRA proposes that the *RequestHedgingRMFailoverProxyProvider* propagate > any {{YarnException}} that it encounters.
[jira] [Comment Edited] (YARN-5711) AM cannot reconnect to RM after failover when using RequestHedgingRMFailoverProxyProvider
[ https://issues.apache.org/jira/browse/YARN-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593646#comment-15593646 ] Subru Krishnan edited comment on YARN-5711 at 10/21/16 1:42 AM: Attaching a patch that returns any exception encountered with the active RM, as discussed offline with [~jianhe]. Thanks to [~ellenfkh] for extensively testing this out in our cluster. FYI, there are some formatting fixes in *RequestHedgingRMFailoverProxyProvider*, as it seems to follow the IntelliJ formatter rather than the standard Hadoop one. was (Author: subru): Attaching a patch that returns any exception encountered with the active RM as discussed offline with [~jianhe]. Thanks to [~ellenfkh] for extensively testing this out in our cluster. > AM cannot reconnect to RM after failover when using > RequestHedgingRMFailoverProxyProvider > - > > Key: YARN-5711 > URL: https://issues.apache.org/jira/browse/YARN-5711 > Project: Hadoop YARN > Issue Type: Bug > Components: applications, resourcemanager >Affects Versions: 2.9.0, 3.0.0-alpha1 >Reporter: Subru Krishnan >Assignee: Subru Krishnan >Priority: Critical > Attachments: YARN-5711-v1.patch > > > When the RM fails over, it does _not_ automatically re-register running apps, so they > need to re-register when reconnecting to the new primary. This is done by > catching {{ApplicationMasterNotRegisteredException}} in *allocate* calls and > re-registering. But *RequestHedgingRMFailoverProxyProvider* does _not_ > propagate {{YarnException}}, as the actual invocation is done asynchronously > using separate threads, so AMs cannot reconnect to the RM after failover. > This JIRA proposes that the *RequestHedgingRMFailoverProxyProvider* propagate > any {{YarnException}} that it encounters.
[jira] [Updated] (YARN-5711) AM cannot reconnect to RM after failover when using RequestHedgingRMFailoverProxyProvider
[ https://issues.apache.org/jira/browse/YARN-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-5711: - Attachment: YARN-5711-v1.patch Attaching a patch that returns any exception encountered with the active RM, as discussed offline with [~jianhe]. Thanks to [~ellenfkh] for extensively testing this out in our cluster. > AM cannot reconnect to RM after failover when using > RequestHedgingRMFailoverProxyProvider > - > > Key: YARN-5711 > URL: https://issues.apache.org/jira/browse/YARN-5711 > Project: Hadoop YARN > Issue Type: Bug > Components: applications, resourcemanager >Affects Versions: 2.9.0, 3.0.0-alpha1 >Reporter: Subru Krishnan >Assignee: Subru Krishnan >Priority: Critical > Attachments: YARN-5711-v1.patch > > > When the RM fails over, it does _not_ automatically re-register running apps, so they > need to re-register when reconnecting to the new primary. This is done by > catching {{ApplicationMasterNotRegisteredException}} in *allocate* calls and > re-registering. But *RequestHedgingRMFailoverProxyProvider* does _not_ > propagate {{YarnException}}, as the actual invocation is done asynchronously > using separate threads, so AMs cannot reconnect to the RM after failover. > This JIRA proposes that the *RequestHedgingRMFailoverProxyProvider* propagate > any {{YarnException}} that it encounters.
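The propagation problem described in YARN-5711 comes from hedged calls being run on separate threads, so a checked `YarnException` arrives wrapped in an `ExecutionException` rather than reaching the caller. The idea of the fix can be sketched as follows; the `YarnException` class and `invokeHedged` method here are stand-ins, not the real Hadoop types:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

public class HedgingSketch {
  // Stand-in for org.apache.hadoop.yarn.exceptions.YarnException.
  static class YarnException extends Exception {
    YarnException(String m) { super(m); }
  }

  // Run the call on a worker thread, then unwrap and rethrow the original
  // checked exception so callers (e.g. an AM catching
  // ApplicationMasterNotRegisteredException) can react and re-register.
  static Object invokeHedged(Callable<Object> call, ExecutorService pool)
      throws YarnException {
    Future<Object> f = pool.submit(call);
    try {
      return f.get();
    } catch (ExecutionException e) {
      if (e.getCause() instanceof YarnException) {
        throw (YarnException) e.getCause(); // propagate, don't swallow
      }
      throw new RuntimeException(e.getCause());
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(ie);
    }
  }
}
```

Without the unwrapping step, the caller only ever sees a generic wrapper exception, which is exactly why the AM's re-registration logic never triggered after failover.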
[jira] [Updated] (YARN-5725) Test uncaught exception in TestContainersMonitorResourceChange.testContainersResourceChange when setting IP and host
[ https://issues.apache.org/jira/browse/YARN-5725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5725: - Attachment: YARN-5725.002.patch [~templedf] I agree. I added the necessary mock containers, so that we have both solutions now. > Test uncaught exception in > TestContainersMonitorResourceChange.testContainersResourceChange when setting > IP and host > > > Key: YARN-5725 > URL: https://issues.apache.org/jira/browse/YARN-5725 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Minor > Attachments: YARN-5725.000.patch, YARN-5725.001.patch, > YARN-5725.002.patch > > Original Estimate: 2h > Remaining Estimate: 2h > > The issue is a warning, but it prevents the container monitor from continuing: > 2016-10-12 14:38:23,280 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(594)) - > Uncaught exception in ContainersMonitorImpl while monitoring resource of > container_123456_0001_01_01 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl$MonitoringThread.run(ContainersMonitorImpl.java:455) > 2016-10-12 14:38:23,281 WARN [Container Monitor] > monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(613)) - > org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl > is interrupted. Exiting.
[jira] [Commented] (YARN-5611) Provide an API to update lifetime of an application.
[ https://issues.apache.org/jira/browse/YARN-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593578#comment-15593578 ] Vinod Kumar Vavilapalli commented on YARN-5611: --- HADOOP-11552 may be an easy out as long as we agree it makes it into the next release, where the app-timeouts and update-priorities features exist. Otherwise, we risk regressing on high availability. Let's make sure this is a blocker. OTOH, even assuming HADOOP-11552, there is one API concern I have. If multiple users start trying to update the timeout for a single application, the behavior from an individual user's point of view is arbitrary. Both users' clients block, and after the call returns, the timeout may be set to the value they asked for or to what the other user asked for. To avoid this, the API should be something like {{targetTimeoutFromNow(long absoluteTimeoutValueAtRequest, targetTimeout)}} so that the server can reject requests if the current recorded value has changed by the time the request is accepted. Essentially this is {{Read -> CompareAndSetOrFail}}. > Provide an API to update lifetime of an application. > > > Key: YARN-5611 > URL: https://issues.apache.org/jira/browse/YARN-5611 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-5611.patch, 0002-YARN-5611.patch, > 0003-YARN-5611.patch, YARN-5611.v0.patch > > > YARN-4205 monitors the lifetime of an application if required. > Add a client API to update the lifetime of an application.
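The {{Read -> CompareAndSetOrFail}} semantics proposed in the comment above can be sketched with a compare-and-set on the recorded timeout. The class and method names here are illustrative, not the actual YARN API:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: a timeout update that the server rejects if the recorded value
// changed between the client's read and its update request.
public class TimeoutCas {
  private final AtomicLong timeoutAtMillis = new AtomicLong();

  TimeoutCas(long initial) { timeoutAtMillis.set(initial); }

  long read() { return timeoutAtMillis.get(); }

  // Succeeds only if the caller's view of the current timeout is still
  // accurate; a stale caller gets 'false' and must re-read and retry.
  boolean updateTimeout(long expectedCurrent, long target) {
    return timeoutAtMillis.compareAndSet(expectedCurrent, target);
  }
}
```

With this shape, when two users race to update the same application's timeout, exactly one wins and the other is told its view was stale, instead of both calls silently succeeding in arbitrary order.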
[jira] [Commented] (YARN-5750) YARN-4126 broke Oozie on unsecure cluster
[ https://issues.apache.org/jira/browse/YARN-5750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593549#comment-15593549 ] Robert Kanter commented on YARN-5750: - This code goes back to 2012/2013. Looking at OOZIE-1148 and OOZIE-1159, Oozie originally had this hardcoded to "oozie mr token" before these two JIRAs, and it worked fine because the JT couldn't renew tokens anyway. These two JIRAs changed it so that Oozie would use the correct value in a secure cluster and the dummy value in a non-secure cluster. https://github.com/apache/oozie/blob/master/core/src/main/java/org/apache/oozie/service/HadoopAccessorService.java#L609 I can't speak as to why Oozie was always getting a delegation token. Perhaps it's needed for impersonation? Have we verified that impersonation works without delegation tokens in a non-secure cluster? > YARN-4126 broke Oozie on unsecure cluster > - > > Key: YARN-5750 > URL: https://issues.apache.org/jira/browse/YARN-5750 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Peter Cseh > > Oozie is using a DummyRenewer on unsecure clusters and can't submit workflows > on an unsecure cluster after YARN-4126. 
> {noformat}
> org.apache.oozie.action.ActionExecutorException: JA009: org.apache.hadoop.yarn.exceptions.YarnException: java.io.IOException: Delegation Token can be issued only with kerberos authentication
> at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
> at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:1092)
> at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getDelegationToken(ApplicationClientProtocolPBServiceImpl.java:335)
> at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:515)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:663)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2423)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2419)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2419)
> Caused by: java.io.IOException: Delegation Token can be issued only with kerberos authentication
> at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:1065)
> ... 10 more
> at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:457)
> at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:437)
> at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1128)
> at org.apache.oozie.action.hadoop.TestJavaActionExecutor.submitAction(TestJavaActionExecutor.java:343)
> at org.apache.oozie.action.hadoop.TestJavaActionExecutor.submitAction(TestJavaActionExecutor.java:363)
> at org.apache.oozie.action.hadoop.TestJavaActionExecutor.testKill(TestJavaActionExecutor.java:602)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at junit.framework.TestCase.runTest(TestCase.java:168)
> at junit.framework.TestCase.runBare(TestCase.java:134)
> at junit.framework.TestResult$1.protect(TestResult.java:110)
> at junit.framework.TestResult.runProtected(TestResult.java:128)
> at junit.framework.TestResult.run(TestResult.java:113)
> at junit.framework.TestCase.run(TestCase.java:124)
> at junit.framework.TestSuite.runTest(TestSuite.java:232)
> at junit.framework.TestSuite.run(TestSuite.java:227)
> at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
> at org.junit.runners.Suite.runChild(Suite.java:128)
> at org.junit.runners.Suite.runChild(Suite.java:24)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
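The failure above is the RM refusing to issue a delegation token on a non-Kerberos cluster. One direction the discussion points at is guarding the token fetch on the client side; in real Hadoop code that check is `UserGroupInformation.isSecurityEnabled()`, but the sketch below uses stand-in types so it is self-contained and illustrative only:

```java
public class TokenGuard {
  // Stand-in for the YARN client protocol's token call.
  interface RmClient {
    String getDelegationToken(String renewer);
  }

  // Only ask the RM for a delegation token when security (Kerberos) is on;
  // on a non-secure cluster the RM rejects the request with an IOException.
  static String maybeFetchToken(boolean securityEnabled, RmClient rm,
                                String renewer) {
    if (!securityEnabled) {
      return null; // non-secure cluster: skip the token request entirely
    }
    return rm.getDelegationToken(renewer);
  }
}
```

Whether Oozie actually needs the token on non-secure clusters (e.g. for impersonation) is the open question in the comment above; the sketch only shows the guard, not an answer to that question.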
[jira] [Commented] (YARN-5716) Add global scheduler interface definition and update CapacityScheduler to use it.
[ https://issues.apache.org/jira/browse/YARN-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593544#comment-15593544 ] Hadoop QA commented on YARN-5716: -
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 13 new or modified test files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 59s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 35s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 47s {color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s {color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 3s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 55s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 0s {color} | {color:red} hadoop-yarn-project/hadoop-yarn: The patch generated 132 new + 1472 unchanged - 164 fixed = 1604 total (was 1636) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 41s {color} | {color:green} hadoop-yarn-project_hadoop-yarn generated 0 new + 6484 unchanged - 10 fixed = 6484 total (was 6494) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager generated 0 new + 928 unchanged - 10 fixed = 928 total (was 938) {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 21m 51s {color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 39m 41s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 95m 51s {color} | {color:black} {color} |
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager |
| | hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices |
|| Subsystem || Report/Notes
[jira] [Comment Edited] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593535#comment-15593535 ] Botong Huang edited comment on YARN-5525 at 10/21/16 12:50 AM: --- Thanks all for the comment! bq. Are we going to change ContainerLaunchContext to convey the information [~jianhe]: did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default/fallback implementation, and then add an optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! was (Author: botong): Thanks all for the comment! bq. Are we going to change ContainerLaunchContext to convey the information [~jianhe]: did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add an optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593535#comment-15593535 ] Botong Huang edited comment on YARN-5525 at 10/21/16 12:50 AM: --- Thanks all for the comment! bq. Are we going to change ContainerLaunchContext to convey the information [~jianhe]: did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add an optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! was (Author: botong): Thanks all or the comment! bq. Are we going to change ContainerLaunchContext to convey the information [~jianhe]: did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add an optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593535#comment-15593535 ] Botong Huang commented on YARN-5525: Thanks all or the comment! bq. Are we going to change ContainerLaunchContext to convey the information [~jianhe]: did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add an optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-5525: --- Comment: was deleted (was: Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add an optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! ) > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
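The "configurable class with a default, instantiated reflectively" pattern being discussed for YARN-5525 can be sketched as below. The config key, interface, and loader here are hypothetical stand-ins (real Hadoop code would use `Configuration.getClass` and `ReflectionUtils.newInstance`):

```java
import java.util.Properties;

public class AggregatorLoader {
  // Stand-in for the pluggable log-aggregator interface.
  public interface AppLogAggregator {
    String name();
  }

  // Default/fallback implementation used when no class is configured.
  public static class DefaultAggregator implements AppLogAggregator {
    public String name() { return "default"; }
  }

  // Read the implementation class name from configuration, falling back to
  // the default, and instantiate it reflectively.
  static AppLogAggregator load(Properties conf) throws Exception {
    String cls = conf.getProperty("yarn.nodemanager.log-aggregator.class",
        DefaultAggregator.class.getName());
    return (AppLogAggregator) Class.forName(cls)
        .getDeclaredConstructor().newInstance();
  }
}
```

The per-application override discussed in the thread (an optional entry in `LogAggregationContext`) would layer on top of this: check the app's submission context first, then fall back to the configured class.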
[jira] [Commented] (YARN-5686) DefaultContainerExecutor random working dir algorithm skews results
[ https://issues.apache.org/jira/browse/YARN-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593533#comment-15593533 ] Miklos Szegedi commented on YARN-5686: -- +1 (non-binding) It looks good to me. Thank you, [~vrushalic]! > DefaultContainerExecutor random working dir algorithm skews results > > > Key: YARN-5686 > URL: https://issues.apache.org/jira/browse/YARN-5686 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Vrushali C >Priority: Minor > Attachments: YARN-5686.001.patch, YARN-5686.002.patch >
> {code}
> long randomPosition = RandomUtils.nextLong() % totalAvailable;
> ...
> while (randomPosition > availableOnDisk[dir]) {
>   randomPosition -= availableOnDisk[dir++];
> }
> {code}
> The code above selects a disk based on a random number weighted by the free space on each disk. For example, if I have two disks with 100 bytes each, totalAvailable is 200. The value of randomPosition will be 0..199. 0..99 should select the first disk, and 100..199 should select the second disk, inclusively. Random number 100 should select the second disk to be fair, but this is not the case right now.
> We need to use
> {code}
> while (randomPosition >= availableOnDisk[dir])
> {code}
> instead of
> {code}
> while (randomPosition > availableOnDisk[dir])
> {code}
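The off-by-one in the weighted disk selection above is easy to demonstrate. This sketch applies the `>=` fix from the issue description (the surrounding class is illustrative; only the loop mirrors the code in the JIRA):

```java
public class WeightedPick {
  // Select a disk index proportionally to free space. With strict '>' the
  // boundary value (e.g. 100 for two 100-byte disks) would incorrectly land
  // on the earlier disk; '>=' keeps the selection exactly proportional.
  static int pickDisk(long randomPosition, long[] availableOnDisk) {
    int dir = 0;
    while (randomPosition >= availableOnDisk[dir]) { // '>=' is the fix
      randomPosition -= availableOnDisk[dir++];
    }
    return dir;
  }
}
```

For two disks with 100 bytes each, positions 0..99 map to disk 0 and 100..199 map to disk 1, giving each disk exactly 100 of the 200 outcomes, which is the fairness the issue asks for.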
[jira] [Comment Edited] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593384#comment-15593384 ] Botong Huang edited comment on YARN-5525 at 10/21/16 12:37 AM: --- Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add an optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! was (Author: botong): Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add the optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593384#comment-15593384 ] Botong Huang edited comment on YARN-5525 at 10/21/16 12:36 AM: --- Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the {{AppLogAggregator}} in Yarn conf as the default, and then add the optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! was (Author: botong): Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the default {{AppLogAggregator}} in Yarn conf, and then add the optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593384#comment-15593384 ] Botong Huang edited comment on YARN-5525 at 10/21/16 12:35 AM: --- Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? How about I add the default {{AppLogAggregator}} in Yarn conf, and then add the optional entry in {{LogAggregationContext}} so that it is possible for apps to use a different one without restarting the NM? Thanks! was (Author: botong): Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? Which one do you prefer? > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593507#comment-15593507 ] Hadoop QA commented on YARN-5761: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 6 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 11s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s {color} | 
{color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 31s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 25 new + 913 unchanged - 14 fixed = 938 total (was 927) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 7s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 39m 3s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 56m 21s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834573/YARN-5761.1.rebase.patch | | JIRA Issue | YARN-5761 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 1dd8d8d30bfa 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 262827c | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13460/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13460/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13460/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Separate QueueManager from Scheduler > > > Key: YARN-5761 > URL:
[jira] [Commented] (YARN-5280) Allow YARN containers to run with Java Security Manager
[ https://issues.apache.org/jira/browse/YARN-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593491#comment-15593491 ] Robert Kanter commented on YARN-5280: - Thanks for continuing your work on this [~gphillips]. Here's some more feedback on the latest patch. I haven't had the time to test it out, so this is all based on reading through the code changes: # Can you look into the test failures reported above? Also the checkstyle and other warnings. Unfortunately, it looks like the Jenkins job has been purged, so we don't have that info there anymore. # Why do we add the queue name to the env? It looks like you're only using the queue in the {{JavaSandboxLinuxContainerRuntime}}, so I think it could go in the {{ContainerRuntimeContext}} instead. #- Also, it's in MR code, so it's only going to be added for MR apps and not other JVM-based apps (e.g. Spark, Oozie-on-Yarn Launcher, etc). # The class Javadoc comment in {{DelegatingLinuxContainerRuntime}} should be updated now that it can also delegate to the {{JavaSandboxLinuxContainerRuntime}}. # The config properties added to {{JavaSandboxLinuxContainerRuntime}} (i.e. {{"yarn.nodemanager.linux-container-executor.sandbox-mode.*"}}) should be defined in {{YarnConfiguration}} along with a default value. See the other properties in {{YarnConfiguration}} for examples. # Instead of inlining {{PosixFilePermissions.fromString("rwxr-xr-x"))}} and similar in {{JavaSandboxLinuxContainerRuntime}}, they should be declared as private constants. # We could use some additional unit tests. There are some complicated regexes, different operating modes, etc. that we should make sure to cover more fully. 
> Allow YARN containers to run with Java Security Manager > --- > > Key: YARN-5280 > URL: https://issues.apache.org/jira/browse/YARN-5280 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, yarn >Affects Versions: 2.6.4 >Reporter: Greg Phillips >Assignee: Greg Phillips >Priority: Minor > Attachments: YARN-5280.001.patch, YARN-5280.002.patch, > YARN-5280.patch, YARNContainerSandbox.pdf > > > YARN applications have the ability to perform privileged actions which have > the potential to add instability into the cluster. The Java Security Manager > can be used to prevent users from running privileged actions while still > allowing their core data processing use cases. > Introduce a YARN flag which will allow a Hadoop administrator to enable the > Java Security Manager for user code, while still providing complete > permissions to core Hadoop libraries. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5757) RM Cluster Node REST API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5757: - Summary: RM Cluster Node REST API documentation is not up to date (was: RM Cluster Node API documentation is not up to date) > RM Cluster Node REST API documentation is not up to date > > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, yarn >Affects Versions: 2.7.3, 3.0.0-alpha1 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Fix For: 3.0.0-alpha2 > > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5757: - Affects Version/s: 2.7.3 > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, yarn >Affects Versions: 2.7.3, 3.0.0-alpha1 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Fix For: 3.0.0-alpha2 > > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593486#comment-15593486 ] Miklos Szegedi commented on YARN-5757: -- [~gsohn], I added the requested fields. Let me know, if you need anything else. > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, yarn >Affects Versions: 3.0.0-alpha1 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Fix For: 3.0.0-alpha2 > > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5757: - Fix Version/s: 3.0.0-alpha2 > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, yarn >Affects Versions: 3.0.0-alpha1 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Fix For: 3.0.0-alpha2 > > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5757: - Component/s: yarn resourcemanager > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, yarn >Affects Versions: 3.0.0-alpha1 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Fix For: 3.0.0-alpha2 > > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5757: - Affects Version/s: 3.0.0-alpha1 > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0-alpha1 >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4126) RM should not issue delegation tokens in unsecure mode
[ https://issues.apache.org/jira/browse/YARN-4126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593445#comment-15593445 ] Jian He commented on YARN-4126: --- [~jlowe], I think this jira's intention is not wrong; a delegation token is not required for an unsecure cluster. Why does Oozie require the delegation token in an unsecure cluster? This jira was actually opened because of that; the previous [comment | https://issues.apache.org/jira/browse/YARN-4126?focusedCommentId=14735170=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14735170] said so. But I don't remember what the exact issue is... > RM should not issue delegation tokens in unsecure mode > -- > > Key: YARN-4126 > URL: https://issues.apache.org/jira/browse/YARN-4126 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jian He >Assignee: Bibin A Chundatt > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: 0001-YARN-4126.patch, 0002-YARN-4126.patch, > 0003-YARN-4126.patch, 0004-YARN-4126.patch, 0005-YARN-4126.patch, > 0006-YARN-4126.patch > > > ClientRMService#getDelegationToken is currently returning a delegation token > in insecure mode. We should not return the token if it's in insecure mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5750) YARN-4126 broke Oozie on unsecure cluster
[ https://issues.apache.org/jira/browse/YARN-5750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593427#comment-15593427 ] Jian He commented on YARN-5750: --- [~gezapeti], what is the DummyRenewer used for? I wonder why Oozie requires the delegation token in an unsecure cluster in the first place. > YARN-4126 broke Oozie on unsecure cluster > - > > Key: YARN-5750 > URL: https://issues.apache.org/jira/browse/YARN-5750 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 2.8.0 >Reporter: Peter Cseh > > Oozie is using a DummyRenewer on unsecure clusters and can't submit workflows > on an unsecure cluster after YARN-4126. > {noformat} > org.apache.oozie.action.ActionExecutorException: JA009: > org.apache.hadoop.yarn.exceptions.YarnException: java.io.IOException: > Delegation Token can be issued only with kerberos authentication > at > org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:1092) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getDelegationToken(ApplicationClientProtocolPBServiceImpl.java:335) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:515) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:663) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2423) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2419) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2419) > Caused by: java.io.IOException: Delegation Token 
can be issued only with > kerberos authentication > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:1065) > ... 10 more > at > org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:457) > at > org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:437) > at > org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1128) > at > org.apache.oozie.action.hadoop.TestJavaActionExecutor.submitAction(TestJavaActionExecutor.java:343) > at > org.apache.oozie.action.hadoop.TestJavaActionExecutor.submitAction(TestJavaActionExecutor.java:363) > at > org.apache.oozie.action.hadoop.TestJavaActionExecutor.testKill(TestJavaActionExecutor.java:602) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:483) > at junit.framework.TestCase.runTest(TestCase.java:168) > at junit.framework.TestCase.runBare(TestCase.java:134) > at junit.framework.TestResult$1.protect(TestResult.java:110) > at junit.framework.TestResult.runProtected(TestResult.java:128) > at junit.framework.TestResult.run(TestResult.java:113) > at junit.framework.TestCase.run(TestCase.java:124) > at junit.framework.TestSuite.runTest(TestSuite.java:232) > at junit.framework.TestSuite.run(TestSuite.java:227) > at > org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83) > at org.junit.runners.Suite.runChild(Suite.java:128) > at org.junit.runners.Suite.runChild(Suite.java:24) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.io.IOException: > org.apache.hadoop.yarn.exceptions.YarnException: java.io.IOException: > Delegation Token can be issued only with kerberos authentication > at > org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getDelegationToken(ClientRMService.java:1092) > at >
[jira] [Commented] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593384#comment-15593384 ] Botong Huang commented on YARN-5525: Thanks [~jianhe] for the comment! bq. Are we going to change ContainerLaunchContext to convey the information Did you mean {{LogAggregationContext}} inside {{ApplicationSubmissionContext}}, vs. Yarn configuration? Which one do you prefer? > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-5761: Attachment: YARN-5761.1.rebase.patch > Separate QueueManager from Scheduler > > > Key: YARN-5761 > URL: https://issues.apache.org/jira/browse/YARN-5761 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-5761.1.patch, YARN-5761.1.rebase.patch > > > Currently, in scheduler code, we are doing queue manager and scheduling work. > We'd better separate the queue manager out of scheduler logic. In that case, > it would be much easier and safer to extend. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593349#comment-15593349 ] Xuan Gong commented on YARN-5761: - This patch applies for branch-2. > Separate QueueManager from Scheduler > > > Key: YARN-5761 > URL: https://issues.apache.org/jira/browse/YARN-5761 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-5761.1.patch > > > Currently, in scheduler code, we are doing queue manager and scheduling work. > We'd better separate the queue manager out of scheduler logic. In that case, > it would be much easier and safer to extend. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Assigned] (YARN-5762) Summarize ApplicationNotFoundException in the RM log
[ https://issues.apache.org/jira/browse/YARN-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash reassigned YARN-5762: -- Assignee: Ravi Prakash > Summarize ApplicationNotFoundException in the RM log > > > Key: YARN-5762 > URL: https://issues.apache.org/jira/browse/YARN-5762 > Project: Hadoop YARN > Issue Type: Task >Affects Versions: 2.7.2 >Reporter: Ravi Prakash >Assignee: Ravi Prakash >Priority: Minor > Attachments: YARN-5762.01.patch > > > We found a lot of {{ApplicationNotFoundException}} in the RM logs. These were > most likely caused by the {{AggregatedLogDeletionService}} [which > checks|https://github.com/apache/hadoop/blob/262827cf75bf9c48cd95335eb04fd8ff1d64c538/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L156] > that the application is not running anymore. e.g. > {code}2016-10-17 15:25:26,542 INFO org.apache.hadoop.ipc.Server: IPC Server > handler 20 on 8032, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from :12205 Call#35401 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1473396553140_1451' doesn't exist in RM. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > 2016-10-17 15:25:26,633 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 47 on 8032, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from :12205 Call#35404 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1473396553140_1452' doesn't exist in RM. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5762) Summarize ApplicationNotFoundException in the RM log
[ https://issues.apache.org/jira/browse/YARN-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated YARN-5762: --- Attachment: YARN-5762.01.patch Here's a simple 1 line patch > Summarize ApplicationNotFoundException in the RM log > > > Key: YARN-5762 > URL: https://issues.apache.org/jira/browse/YARN-5762 > Project: Hadoop YARN > Issue Type: Task >Affects Versions: 2.7.2 >Reporter: Ravi Prakash >Priority: Minor > Attachments: YARN-5762.01.patch > > > We found a lot of {{ApplicationNotFoundException}} in the RM logs. These were > most likely caused by the {{AggregatedLogDeletionService}} [which > checks|https://github.com/apache/hadoop/blob/262827cf75bf9c48cd95335eb04fd8ff1d64c538/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L156] > that the application is not running anymore. e.g. > {code}2016-10-17 15:25:26,542 INFO org.apache.hadoop.ipc.Server: IPC Server > handler 20 on 8032, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from :12205 Call#35401 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1473396553140_1451' doesn't exist in RM. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > 2016-10-17 15:25:26,633 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 47 on 8032, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from :12205 Call#35404 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1473396553140_1452' doesn't exist in RM. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
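The summarization being proposed can be sketched as follows: instead of logging the full stack trace for an expected condition, collapse the exception to a single class-plus-message line. The class and method names below are illustrative only, not taken from the YARN-5762 patch (Hadoop's IPC `Server` also exposes an `addTerseExceptions` facility for exactly this purpose).

```java
// Hypothetical helper: collapse an expected exception into one log line
// instead of printing the full stack trace. Names are illustrative,
// not from the actual YARN-5762 patch.
public class ExceptionSummarizer {

    /** Returns "ExceptionClass: message" with no stack frames. */
    public static String summarize(Throwable t) {
        return t.getClass().getSimpleName() + ": " + t.getMessage();
    }

    public static void main(String[] args) {
        Throwable t = new IllegalStateException(
            "Application with id 'application_1473396553140_1451' doesn't exist in RM.");
        // One line in the log instead of a dozen stack frames.
        System.out.println(summarize(t));
    }
}
```

The trade-off is losing the call site; that is acceptable here because the exception is expected and the caller (AggregatedLogDeletionService) is known.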
[jira] [Commented] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593336#comment-15593336 ] Sangjin Lee commented on YARN-5760: --- A connection pool is an apt description. I do see value in minimizing the number of connections, but this should be done correctly or it could become a source of complexity and issues down the road. > [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of whether the NM is handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens a connection to Zookeeper as soon as the NM > starts up, instead of opening a connection when at least one app arrives for > publishing and closing it when no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
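The lazy, reference-counted connection idea discussed above can be sketched like this. `Connection` here is a stand-in interface, not HBase's; a real change would wrap `ConnectionFactory#createConnection` in the same way, and all names are hypothetical.

```java
import java.util.function.Supplier;

// Sketch: create the (expensive) connection only when the first app
// collector starts publishing, and close it once no apps remain.
public class LazyConnectionHolder {
    // Stand-in for an HBase Connection; only close() matters here.
    interface Connection { void close(); }

    private final Supplier<Connection> factory;
    private Connection conn;   // created on first acquire, not at NM startup
    private int activeApps;    // app collectors currently publishing

    LazyConnectionHolder(Supplier<Connection> factory) {
        this.factory = factory;
    }

    /** Called when an app collector starts publishing from this NM. */
    synchronized Connection acquire() {
        if (activeApps++ == 0) {
            conn = factory.get();  // connect to HBase/ZooKeeper only now
        }
        return conn;
    }

    /** Called when the app finishes; closes once no apps remain. */
    synchronized void release() {
        if (--activeApps == 0) {
            conn.close();
            conn = null;
        }
    }

    synchronized boolean isConnected() { return conn != null; }
}
```

Reference counting is what makes this "done correctly" in Sangjin's sense: without it, closing on any single app finish would break other apps still publishing.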
[jira] [Commented] (YARN-4597) Add SCHEDULE to NM container lifecycle
[ https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593320#comment-15593320 ] Jian He commented on YARN-4597: --- [~asuresh], some more questions and comments on the patch: - why are these two transitions added? {code} .addTransition(ContainerState.DONE, ContainerState.DONE, ContainerEventType.CONTAINER_RESOURCES_CLEANEDUP) .addTransition(ContainerState.DONE, ContainerState.DONE, ContainerEventType.CONTAINER_LAUNCHED) .addTransition(ContainerState.DONE, ContainerState.DONE, {code} - storeContainerKilled will be called in ContainerLaunch#cleanupContainer later, so we don't need to call it here? {code} public void sendKillEvent(int exitStatus, String description) { try { context.getNMStateStore().storeContainerKilled(containerId); } catch (IOException ioe) { LOG.error("Could not log container state change to state store..", ioe); } {code} - remove unused imports in ContainersMonitor.java - remove the unused ContainersMonitorImpl#allocatedCpuUsage method - why do you need to add the additional check for the SCHEDULED state? {code} // Process running containers if (remoteContainer.getState() == ContainerState.RUNNING || remoteContainer.getState() == ContainerState.SCHEDULED) { {code} - why does this test need to be changed? {code} testGetContainerStatus(container, i, EnumSet.of(ContainerState.RUNNING, ContainerState.SCHEDULED), "", {code} - similarly here in TestNodeManagerShutdown, do we still need to change the test to make sure the container reaches the running state? {code} Assert.assertTrue( EnumSet.of(ContainerState.RUNNING, ContainerState.SCHEDULED) .contains(containerStatus.getState())); {code} - why do we need to change the test to run for 10 minutes? {code} @Test(timeout = 60) public void testAMRMClient() throws YarnException, IOException { {code} - unrelated to this patch: should ResourceUtilization#pmem,vmem be changed to the long type?
we had specifically changed it for the Resource object - we don't need to synchronize on the currentUtilization object? I don't see any other place it's synchronized {code} synchronized (currentUtilization) { {code} - In case we exceed the max queue length, we kill the container directly instead of queueing it; in that case, we should not store the container as queued? {code} try { this.context.getNMStateStore().storeContainerQueued( container.getContainerId()); } catch (IOException e) { LOG.warn("Could not store container state into store..", e); } {code} - The ResourceUtilizationManager looks like it only incorporates some utility methods; I'm not sure how we will make this pluggable later.. - I think there might be a behavior change or bug for scheduling guaranteed containers when the opportunistic queue is enabled. -- Previously, when launching a container, the NM would not check current vmem and cpu usage. It assumed that what the RM allocated could be launched. -- Now, the NM checks these limits and won't launch the container if it hits a limit. -- Suppose a guaranteed container hits a limit: it will be queued into queuedGuaranteedContainers, and it will never be launched until some other container finishes and triggers the code path, even if the limits are no longer being hit. This is a problem especially when the other containers are long-running and never finish. - The logic to select opportunistic containers: we may kill more opportunistic containers than required, e.g. -- one guaranteed container comes, and we select one opportunistic container -- before the selected opportunistic container is killed, another guaranteed container comes, and we select two opportunistic containers to kill -- the process repeats, and we may end up killing more opportunistic containers than required. 
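The over-kill race described in the last bullet can be avoided by remembering which opportunistic containers have already been selected as victims. A minimal sketch of that idea, with all names hypothetical (not from the YARN-4597 patch):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: if containers already selected for preemption are not tracked,
// a second guaranteed request arriving before the first kill completes
// would select the same (or extra) victims again.
public class OpportunisticVictimSelector {
    private final Deque<String> runningOpportunistic = new ArrayDeque<>();
    private final Set<String> alreadySelected = new HashSet<>();

    void add(String containerId) {
        runningOpportunistic.add(containerId);
    }

    /**
     * Selects victims for one guaranteed container, skipping containers
     * whose kill has been requested but has not yet completed.
     */
    List<String> selectVictims(int needed) {
        List<String> victims = new ArrayList<>();
        for (String id : runningOpportunistic) {
            if (victims.size() == needed) {
                break;
            }
            if (alreadySelected.add(id)) {  // false if selected earlier
                victims.add(id);
            }
        }
        return victims;
    }
}
```

A real implementation would also remove entries from both structures once a kill actually completes; this sketch only shows the "pending kill" bookkeeping that prevents double selection.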
> Add SCHEDULE to NM container lifecycle > -- > > Key: YARN-4597 > URL: https://issues.apache.org/jira/browse/YARN-4597 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Chris Douglas >Assignee: Arun Suresh > Attachments: YARN-4597.001.patch, YARN-4597.002.patch, > YARN-4597.003.patch > > > Currently, the NM immediately launches containers after resource > localization. Several features could be more cleanly implemented if the NM > included a separate stage for reserving resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5716) Add global scheduler interface definition and update CapacityScheduler to use it.
[ https://issues.apache.org/jira/browse/YARN-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593292#comment-15593292 ] Wangda Tan commented on YARN-5716: -- Thanks [~jianhe] for the thorough review! bq. how about check whether "getNode(node.getNodeID())" equals to null, I feel that's easier to reason for a removed node It is also possible that the ref to the node has changed, for example, when the node's resource is updated. In that case, we may need to skip such a node for safety. bq. This if condition can be merged into previous "if (reservedContainer != null) {" condition, as they are the same. No, we cannot do this merge, because it is possible that in the previous reservedContainer != null branch we reserve a new container, so the check is not valid. bq. Looks like one behavior change is that previously on node heartbeat, we always satisfy reservedContainer first, now in async scheduling, it's not the case any more ? It is still the same: if you look at {{allocateContainerOnSingleNode}}, we try to satisfy the reserved container first. bq. PlacementSet, placement is an abstract name, how about NodeSet to be more concrete? I would prefer to use "PlacementSet": since it is for "placement", we could add more information to it, for example, racks. bq. PlacementSetUtils.getSingleNode -> hasSingleNode But what we need to do is return the node if it is a single-node placement set, so I think this name is better. bq. nodePartition parameter is not needed, it can be inferred from 'node' parameter The original purpose of adding the partition is that the partition of the node could be updated between a proposal being proposed and applied; it will be used to check whether we should reject the proposal when the partition of the node has changed. I have a separate "TODO" in FiCaSchedulerApp: {code} // TODO, make sure all node labels are not changed {code} bq. 
In LeafQueue#updateCurrentResourceLimits, multiple threads will update cachedResourceLimitsForHeadroom without synchronization This is intentional: we want resourceLimitsForHeadroom to be up-to-date. It is possible that one thread sees some inconsistent data, but it will be corrected soon by other threads. Since resourceLimitsForHeadroom is only used to give hints to the application, this should be fine. And ResourceLimits is volatile, so it is safe as well. bq. SchedulerApplicationAttempt#incNumAllocatedContainers, all the locality statistics functionality are removed ? Oh, I missed that; I will update it in the next iteration. Addressed all the other comments. Uploaded patch ver.5 > Add global scheduler interface definition and update CapacityScheduler to use > it. > - > > Key: YARN-5716 > URL: https://issues.apache.org/jira/browse/YARN-5716 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-5716.001.patch, YARN-5716.002.patch, > YARN-5716.003.patch, YARN-5716.004.patch, YARN-5716.005.patch > > > Target of this JIRA: > - Definition of interfaces / objects which will be used by global scheduling; > this will be shared by different schedulers. > - Modify CapacityScheduler to use it.
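The "intentional lack of synchronization" argument above is the classic benign-race pattern: a volatile reference to an immutable object may be overwritten concurrently without locks, because readers only need *some* recent value, not necessarily the latest. A self-contained illustration (the class names are stand-ins, not the real ResourceLimits/LeafQueue code):

```java
// Sketch of the benign-race pattern: last-writer-wins on a volatile
// reference to an immutable hint object. Readers never see a torn or
// partially constructed value, and any stale value is soon replaced.
public class HeadroomHint {

    // Immutable snapshot, so publishing it via a volatile field is safe.
    static final class ResourceLimitsHint {
        final long memoryMb;
        ResourceLimitsHint(long memoryMb) { this.memoryMb = memoryMb; }
    }

    private volatile ResourceLimitsHint cached = new ResourceLimitsHint(0);

    // Many scheduler threads may call this concurrently without locking;
    // that is acceptable because the value is only an advisory hint.
    void update(long memoryMb) {
        cached = new ResourceLimitsHint(memoryMb);
    }

    long currentMemoryMb() {
        return cached.memoryMb;
    }
}
```

The key requirements for this pattern are that the published object is immutable and that occasional staleness is harmless, which is exactly Wangda's justification for the headroom hint.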
[jira] [Updated] (YARN-5716) Add global scheduler interface definition and update CapacityScheduler to use it.
[ https://issues.apache.org/jira/browse/YARN-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-5716: - Attachment: YARN-5716.005.patch > Add global scheduler interface definition and update CapacityScheduler to use > it. > - > > Key: YARN-5716 > URL: https://issues.apache.org/jira/browse/YARN-5716 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-5716.001.patch, YARN-5716.002.patch, > YARN-5716.003.patch, YARN-5716.004.patch, YARN-5716.005.patch > > > Target of this JIRA: > - Definition of interfaces / objects which will be used by global scheduling, > this will be shared by different schedulers. > - Modify CapacityScheduler to use it.
[jira] [Updated] (YARN-5764) NUMA awareness support for launching containers
[ https://issues.apache.org/jira/browse/YARN-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-5764: Affects Version/s: (was: 2.6.0) > NUMA awareness support for launching containers > --- > > Key: YARN-5764 > URL: https://issues.apache.org/jira/browse/YARN-5764 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, yarn > Environment: SW: CentOS 6.7, Hadoop 2.6.0 > Processors: Intel Xeon CPU E5-2699 v4 @2.20GHz > Memory: 256GB 4 NUMA nodes >Reporter: Olasoji > > The purpose of this feature is to improve Hadoop performance by minimizing > costly remote memory accesses on non SMP systems. Yarn containers, on launch, > will be pinned to a specific NUMA node and all subsequent memory allocations > will be served by the same node, reducing remote memory accesses. The current > default behavior is to spread memory across all NUMA nodes.
[jira] [Updated] (YARN-5764) NUMA awareness support for launching containers
[ https://issues.apache.org/jira/browse/YARN-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-5764: Fix Version/s: (was: 2.6.0) > NUMA awareness support for launching containers > --- > > Key: YARN-5764 > URL: https://issues.apache.org/jira/browse/YARN-5764 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager, yarn >Affects Versions: 2.6.0 > Environment: SW: CentOS 6.7, Hadoop 2.6.0 > Processors: Intel Xeon CPU E5-2699 v4 @2.20GHz > Memory: 256GB 4 NUMA nodes >Reporter: Olasoji > > The purpose of this feature is to improve Hadoop performance by minimizing > costly remote memory accesses on non SMP systems. Yarn containers, on launch, > will be pinned to a specific NUMA node and all subsequent memory allocations > will be served by the same node, reducing remote memory accesses. The current > default behavior is to spread memory across all NUMA nodes.
[jira] [Created] (YARN-5764) NUMA awareness support for launching containers
Olasoji created YARN-5764: - Summary: NUMA awareness support for launching containers Key: YARN-5764 URL: https://issues.apache.org/jira/browse/YARN-5764 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, yarn Affects Versions: 2.6.0 Environment: SW: CentOS 6.7, Hadoop 2.6.0 Processors: Intel Xeon CPU E5-2699 v4 @2.20GHz Memory: 256GB 4 NUMA nodes Reporter: Olasoji Fix For: 2.6.0 The purpose of this feature is to improve Hadoop performance by minimizing costly remote memory accesses on non SMP systems. Yarn containers, on launch, will be pinned to a specific NUMA node and all subsequent memory allocations will be served by the same node, reducing remote memory accesses. The current default behavior is to spread memory across all NUMA nodes.
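The pinning described above is typically done by prefixing the container launch command with numactl. The `--cpunodebind`/`--membind` flags are real numactl options; wiring them into the NM's container executor, and the class below, are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: prefix a container launch command so the process runs only on
// the CPUs of one NUMA node and allocates memory from that node.
public class NumaLauncher {

    static List<String> pinToNode(int numaNode, List<String> command) {
        List<String> pinned = new ArrayList<>();
        pinned.add("numactl");
        pinned.add("--cpunodebind=" + numaNode);  // CPUs of this node only
        pinned.add("--membind=" + numaNode);      // allocate memory locally
        pinned.addAll(command);
        return pinned;
    }
}
```

For example, `pinToNode(2, List.of("bash", "launch_container.sh"))` yields a command line starting with `numactl --cpunodebind=2 --membind=2`, which is what turns the default spread-across-all-nodes allocation into local allocation.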
[jira] [Commented] (YARN-5747) Application timeline metric aggregation in timeline v2 will lose last round aggregation when an application finishes
[ https://issues.apache.org/jira/browse/YARN-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593276#comment-15593276 ] Sangjin Lee commented on YARN-5747: --- +1. > Application timeline metric aggregation in timeline v2 will lose last round > aggregation when an application finishes > > > Key: YARN-5747 > URL: https://issues.apache.org/jira/browse/YARN-5747 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > Attachments: YARN-5747-trunk.001.patch > > > As discussed in YARN-3816, when an application finishes we should perform an > extra round of application level timeline aggregation. Otherwise data posted > after the last round of aggregation will get lost.
[jira] [Commented] (YARN-5715) introduce entity prefix for return and sort order
[ https://issues.apache.org/jira/browse/YARN-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593274#comment-15593274 ] Sangjin Lee commented on YARN-5715: --- The jenkins appears to be unstable right now, and that might be why the build hasn't kicked in. I think the latest patch is almost there. Should this be committed to trunk? We know that more parts to the reader code are needed. Should we wait until those parts are done before we commit this to trunk? Is this needed on the trunk now? (TimelineEntity.java) - l.597: nit: “Set” -> “Sets” - also, for “user”, let’s say either “users” or “the user” - Can we move the statement “User can use …” to the end of the javadoc (after “Entities will be stored…”)? IMO it is more important to state that the entities will be stored in the id prefix order than how to invert the prefix. (TimelineServiceHelper.java) - l.50: nit: “Invert” -> “Inverts” (EntityRowKey.java) - l.230: we should use “long” here (not “Long”) > introduce entity prefix for return and sort order > - > > Key: YARN-5715 > URL: https://issues.apache.org/jira/browse/YARN-5715 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Rohith Sharma K S >Priority: Critical > Attachments: YARN-5715-YARN-5355.01.patch, > YARN-5715-YARN-5355.02.patch, YARN-5715-YARN-5355.03.patch, > YARN-5715-YARN-5355.04.patch, YARN-5715-YARN-5355.05.patch > > > While looking into YARN-5585, we have come across the need to provide a sort > order different than the current entity id order. The current entity id order > returns entities strictly in the lexicographical order, and as such it > returns the earliest entities first. This may not be the most natural return > order. A more natural return/sort order would be from the most recent > entities. > To solve this, we would like to add what we call the "entity prefix" in the > row key for the entity table. 
It is a number (long) that can be easily > provided by the client on write. In the row key, it would be added before the > entity id itself. > The entity prefix would be considered mandatory. On all writes (including > updates) the correct entity prefix should be set by the client so that the > correct row key is used. The entity prefix needs to be unique only within the > scope of the application and the entity type. > For queries that return a list of entities, the prefix values will be > returned along with the entity id's. Queries that specify the prefix and the > id should be returned quickly using the row key. If the query omits the > prefix but specifies the id (query by id), the query may be less efficient. > This JIRA should add the entity prefix to the entity API and add its handling > to the schema and the write path. The read path will be addressed in > YARN-5585. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
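The "entity prefix" ordering trick discussed above relies on inverting the long before it goes into the row key: storing `Long.MAX_VALUE - value` makes larger (more recent) prefixes sort first under the ascending lexicographic order HBase row keys use. A minimal sketch of the inversion (mirroring what a helper such as TimelineServiceHelper#invertLong does, for non-negative values):

```java
// Sketch: invert a long so that descending order of the original values
// becomes ascending order of the stored values, giving "most recent
// first" scans over HBase row keys.
public class RowKeyInversion {

    static long invertLong(long value) {
        return Long.MAX_VALUE - value;
    }

    public static void main(String[] args) {
        long older = 1000L, newer = 2000L;
        // After inversion the newer entity's key is the smaller number,
        // so it sorts (and is scanned) first.
        System.out.println(invertLong(newer) < invertLong(older));
    }
}
```

The inversion is its own inverse, so the original prefix can be recovered from the row key on read; this is also why the nit at l.230 matters, since boxing the value in a `Long` adds nothing here.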
[jira] [Created] (YARN-5763) HttpListener takes upwards of 3 minutes to start on failover
Sean Po created YARN-5763: - Summary: HttpListener takes upwards of 3 minutes to start on failover Key: YARN-5763 URL: https://issues.apache.org/jira/browse/YARN-5763 Project: Hadoop YARN Issue Type: Bug Reporter: Sean Po Assignee: Sean Po When Yarn RM fails over to another instance, it takes multiple minutes before the new master Yarn RM can begin accepting requests.
[jira] [Commented] (YARN-5762) Summarize ApplicationNotFoundException in the RM log
[ https://issues.apache.org/jira/browse/YARN-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593147#comment-15593147 ] Ravi Prakash commented on YARN-5762: IMHO we should summarize these into 1 line instead of whole stack traces. > Summarize ApplicationNotFoundException in the RM log > > > Key: YARN-5762 > URL: https://issues.apache.org/jira/browse/YARN-5762 > Project: Hadoop YARN > Issue Type: Task >Affects Versions: 2.7.2 >Reporter: Ravi Prakash >Priority: Minor > > We found a lot of {{ApplicationNotFoundException}} in the RM logs. These were > most likely caused by the {{AggregatedLogDeletionService}} [which > checks|https://github.com/apache/hadoop/blob/262827cf75bf9c48cd95335eb04fd8ff1d64c538/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L156] > that the application is not running anymore. e.g. > {code}2016-10-17 15:25:26,542 INFO org.apache.hadoop.ipc.Server: IPC Server > handler 20 on 8032, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from :12205 Call#35401 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1473396553140_1451' doesn't exist in RM. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > 2016-10-17 15:25:26,633 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 47 on 8032, call > org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport > from :12205 Call#35404 Retry#0 > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1473396553140_1452' doesn't exist in RM. 
> at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5762) Summarize ApplicationNotFoundException in the RM log
Ravi Prakash created YARN-5762: -- Summary: Summarize ApplicationNotFoundException in the RM log Key: YARN-5762 URL: https://issues.apache.org/jira/browse/YARN-5762 Project: Hadoop YARN Issue Type: Task Affects Versions: 2.7.2 Reporter: Ravi Prakash Priority: Minor We found a lot of {{ApplicationNotFoundException}} in the RM logs. These were most likely caused by the {{AggregatedLogDeletionService}} [which checks|https://github.com/apache/hadoop/blob/262827cf75bf9c48cd95335eb04fd8ff1d64c538/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L156] that the application is not running anymore. e.g. {code}2016-10-17 15:25:26,542 INFO org.apache.hadoop.ipc.Server: IPC Server handler 20 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from :12205 Call#35401 Retry#0 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1473396553140_1451' doesn't exist in RM. 
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) 2016-10-17 15:25:26,633 INFO org.apache.hadoop.ipc.Server: IPC Server handler 47 on 8032, call org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport from :12205 Call#35404 Retry#0 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1473396553140_1452' doesn't exist in RM. 
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5575) Many classes use bare yarn. properties instead of the defined constants
[ https://issues.apache.org/jira/browse/YARN-5575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593136#comment-15593136 ] Hadoop QA commented on YARN-5575: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 25 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 53s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 6s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 45s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 44s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 2m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 14s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 50s {color} | {color:red} root: The patch generated 2 new + 1765 unchanged - 57 fixed = 1767 total (was 1822) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 11s {color} | {color:red} hadoop-yarn-server-applicationhistoryservice in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 38m 41s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 16m 15s {color} | {color:red} hadoop-yarn-client in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 25s {color} | {color:green} hadoop-yarn-applications-distributedshell in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 119m 30s {color} | {color:red} hadoop-mapreduce-client-jobclient in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 11s {color} | {color:green} hadoop-gridmix in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 253m 39s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices | | | hadoop.yarn.client.cli.TestLogsCLI | | | hadoop.hdfs.TestNNBench | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834469/YARN-5575.003.patch | | JIRA Issue | YARN-5575 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit
[jira] [Commented] (YARN-5746) The state of the parentQueue and its childQueues should be synchronized.
[ https://issues.apache.org/jira/browse/YARN-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593129#comment-15593129 ] Hadoop QA commented on YARN-5746: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 58s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 55s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s {color} | 
{color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 68 unchanged - 0 fixed = 69 total (was 68) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 34m 55s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 49m 18s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834546/YARN-5746.2.patch | | JIRA Issue | YARN-5746 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 969c8800fde6 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 262827c | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13457/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13457/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13457/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > The state of the parentQueue and its childQueues should be synchronized. > > > Key: YARN-5746 >
[jira] [Commented] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593098#comment-15593098 ] Hadoop QA commented on YARN-5757: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 36s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 34s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 8m 47s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834551/YARN-5757.000.patch | | JIRA Issue | YARN-5757 | | Optional Tests | asflicense mvnsite | | uname | Linux d6f706506cc6 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 262827c | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs U: . | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13458/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5759) Capability to register for a notification/callback on the expiry of timeouts
[ https://issues.apache.org/jira/browse/YARN-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593093#comment-15593093 ] Hitesh Shah commented on YARN-5759: --- Will this address support for a post-app action executed by YARN after the application reaches an end state? i.e. somewhat like a finally block for a yarn app? > Capability to register for a notification/callback on the expiry of timeouts > > > Key: YARN-5759 > URL: https://issues.apache.org/jira/browse/YARN-5759 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Gour Saha > > There is a need for the YARN native services REST-API service to take > certain actions once a timeout of an application expires. For example, an > immediate requirement is to destroy a Slider application once its lifetime > timeout expires and YARN has stopped the application. Destroying a Slider > application means cleanup of the Slider HDFS state store and ZK paths for that > application. > Potentially, there will be advanced requirements from the REST-API service > and other services in the future, which will make this feature very handy.
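No patch exists yet for this issue, so as a rough illustration of the "finally block" semantics being asked about, a registry along these lines could let a service (e.g. the Slider REST-API service) attach a cleanup action that YARN fires after the application reaches an end state. All class and method names here are hypothetical, not part of any YARN API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a registry of per-application cleanup callbacks,
// fired once after YARN stops an application whose lifetime timeout
// expired. None of these names exist in YARN.
class TimeoutCallbackRegistry {
  private final Map<String, Runnable> callbacks = new ConcurrentHashMap<>();

  void register(String appId, Runnable onExpiry) {
    callbacks.put(appId, onExpiry);
  }

  // Invoked after the application reaches an end state -- effectively the
  // "finally block" semantics. The callback is removed so that cleanup
  // (e.g. deleting HDFS state store and ZK paths) runs at most once.
  void fireExpiry(String appId) {
    Runnable cb = callbacks.remove(appId);
    if (cb != null) {
      cb.run();
    }
  }
}
```

A second call to `fireExpiry` for the same application is a no-op, which matches the at-most-once cleanup a destroy operation would need.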
[jira] [Commented] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593082#comment-15593082 ] Grant Sohn commented on YARN-5757: -- Can you add the version and components fields to the JIRA? Thanks. > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy
[jira] [Commented] (YARN-5755) Enhancements to STOP queue handling
[ https://issues.apache.org/jira/browse/YARN-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593076#comment-15593076 ] Vrushali C commented on YARN-5755: -- I would like to work on this if you can give me some more context. > Enhancements to STOP queue handling > --- > > Key: YARN-5755 > URL: https://issues.apache.org/jira/browse/YARN-5755 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong >
[jira] [Updated] (YARN-5757) RM Cluster Node API documentation is not up to date
[ https://issues.apache.org/jira/browse/YARN-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miklos Szegedi updated YARN-5757: - Attachment: YARN-5757.000.patch Update the documentation to reflect the current state of the REST API > RM Cluster Node API documentation is not up to date > --- > > Key: YARN-5757 > URL: https://issues.apache.org/jira/browse/YARN-5757 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Miklos Szegedi >Priority: Trivial > Attachments: YARN-5757.000.patch > > > For an example please refer to this field that does not exist since YARN-686: > healthStatus string The health status of the node - Healthy or Unhealthy
[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager
[ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593057#comment-15593057 ] Hadoop QA commented on YARN-5356: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 40s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 3s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 50s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 49s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 37s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 1m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 43s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 28s {color} | {color:red} root: The patch generated 4 new + 161 unchanged - 3 fixed = 165 total (was 164) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s {color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 56s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 35m 38s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 54s {color} | {color:green} hadoop-sls in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 93m 52s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834522/YARN-5356.006.patch | | JIRA Issue | YARN-5356 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux fd527e52320b 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6d2da38 | | Default
[jira] [Commented] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593015#comment-15593015 ] Xuan Gong commented on YARN-5525: - bq. push it down completely to AppLogAggregatorImpl rather than make it completely pluggable. +1 for this approach. This is the per-app behavior (customize app specific log aggregation directory, customize log aggregation format). > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like app specific log aggregation > directory, log aggregation format can be implemented and plugged in.
[jira] [Commented] (YARN-5267) RM REST API doc for app lists "Application Type" instead of "applicationType"
[ https://issues.apache.org/jira/browse/YARN-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593014#comment-15593014 ] Hadoop QA commented on YARN-5267: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 0s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 8m 12s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12811999/YARN-5267.001.patch | | JIRA Issue | YARN-5267 | | Optional Tests | asflicense mvnsite | | uname | Linux a4105f2ba5a1 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 262827c | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13456/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > RM REST API doc for app lists "Application Type" instead of "applicationType" > -- > > Key: YARN-5267 > URL: https://issues.apache.org/jira/browse/YARN-5267 > Project: Hadoop YARN > Issue Type: Bug > Components: api, documentation >Affects Versions: 2.6.4 >Reporter: Grant Sohn >Priority: Trivial > Labels: documentation > Attachments: YARN-5267.001.patch > > > From the docs: > {noformat} > Note that depending on security settings a user might not be able to see all > the fields. > Item Data Type Description > id string The application id > user string The user who started the application > name string The application name > Application Type string The application type > > {noformat}
[jira] [Commented] (YARN-2009) Priority support for preemption in ProportionalCapacityPreemptionPolicy
[ https://issues.apache.org/jira/browse/YARN-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593001#comment-15593001 ] Eric Payne commented on YARN-2009: -- Hi [~sunilg]. I am confused by something you said in the [comment above|https://issues.apache.org/jira/browse/YARN-2009?focusedCommentId=15591597=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15591597]: {quote} I tested below case {code} ... "b\t" // app3 in b + "(4,1,n1,,40,false,20,_user1_);" + // app3 b "b\t" // app1 in a + "(6,1,n1,,5,false,30,_user2_)"; ... {code} {quote} I assumed that the above was from a unit test. As far as I can tell, nothing in the {{o.a.h.y.s.r.monitor.capacity}} framework supports testing with different users. Were you using the above code as pseudocode to document a manual test? > Priority support for preemption in ProportionalCapacityPreemptionPolicy > --- > > Key: YARN-2009 > URL: https://issues.apache.org/jira/browse/YARN-2009 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Devaraj K >Assignee: Sunil G > Attachments: YARN-2009.0001.patch, YARN-2009.0002.patch, > YARN-2009.0003.patch, YARN-2009.0004.patch, YARN-2009.0005.patch, > YARN-2009.0006.patch, YARN-2009.0007.patch, YARN-2009.0008.patch, > YARN-2009.0009.patch, YARN-2009.0010.patch, YARN-2009.0011.patch > > > While preempting containers based on the queue ideal assignment, we may need > to consider preempting the low priority application containers first.
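The selection order the YARN-2009 description asks for can be sketched in a few lines. This is illustrative Java, not ProportionalCapacityPreemptionPolicy code; the `Candidate` type and its fields are invented for the example.

```java
import java.util.Comparator;
import java.util.List;

// Illustrative sketch: when reclaiming resources toward a queue's ideal
// assignment, order preemption candidates so that containers of
// lower-priority applications are taken first.
class PreemptionOrder {
  static class Candidate {
    final String container;
    final int appPriority;  // higher value = higher application priority

    Candidate(String container, int appPriority) {
      this.container = container;
      this.appPriority = appPriority;
    }
  }

  // Sorts in place: lowest application priority first, i.e. preempted first.
  static List<Candidate> order(List<Candidate> candidates) {
    candidates.sort(Comparator.comparingInt(c -> c.appPriority));
    return candidates;
  }
}
```

A real policy would break ties on further criteria (e.g. container launch time or size); the point here is only the priority-first ordering.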
[jira] [Updated] (YARN-5746) The state of the parentQueue and its childQueues should be synchronized.
[ https://issues.apache.org/jira/browse/YARN-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-5746: Attachment: YARN-5746.2.patch > The state of the parentQueue and its childQueues should be synchronized. > > > Key: YARN-5746 > URL: https://issues.apache.org/jira/browse/YARN-5746 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-5746.1.patch, YARN-5746.2.patch > > > The state of the parentQueue and its childQueues need to be synchronized. > * If the state of the parentQueue becomes STOPPED, the state of its > childQueues needs to become STOPPED as well. > * If we change the state of a queue to RUNNING, we should make sure that > all its ancestors are in the RUNNING state. Otherwise, we need to fail this > operation.
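The two invariants in the YARN-5746 description can be sketched as follows. This is illustrative Java, not the actual CapacityScheduler queue classes; all names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the two rules: stopping a parent queue stops its
// whole subtree, and a queue may only move to RUNNING when every ancestor
// is RUNNING.
class SchedQueue {
  enum State { RUNNING, STOPPED }

  final String name;
  final SchedQueue parent;
  final List<SchedQueue> children = new ArrayList<>();
  State state = State.RUNNING;

  SchedQueue(String name, SchedQueue parent) {
    this.name = name;
    this.parent = parent;
    if (parent != null) {
      parent.children.add(this);
    }
  }

  // Rule 1: STOPPED propagates down to every descendant queue.
  void stop() {
    state = State.STOPPED;
    for (SchedQueue child : children) {
      child.stop();
    }
  }

  // Rule 2: fail the RUNNING transition if any ancestor is not RUNNING.
  void activate() {
    for (SchedQueue q = parent; q != null; q = q.parent) {
      if (q.state != State.RUNNING) {
        throw new IllegalStateException(
            "Cannot run " + name + ": ancestor " + q.name + " is STOPPED");
      }
    }
    state = State.RUNNING;
  }
}
```

Stopping `root` therefore stops every queue beneath it, and re-activating a leaf fails until each of its ancestors is RUNNING again.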
[jira] [Assigned] (YARN-5756) Add state-machine implementation for queues
[ https://issues.apache.org/jira/browse/YARN-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong reassigned YARN-5756: --- Assignee: Xuan Gong > Add state-machine implementation for queues > --- > > Key: YARN-5756 > URL: https://issues.apache.org/jira/browse/YARN-5756 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong >
[jira] [Commented] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592976#comment-15592976 ] Hadoop QA commented on YARN-5761: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} | {color:red} YARN-5761 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834541/YARN-5761.1.patch | | JIRA Issue | YARN-5761 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13455/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Separate QueueManager from Scheduler > > > Key: YARN-5761 > URL: https://issues.apache.org/jira/browse/YARN-5761 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-5761.1.patch > > > Currently, the scheduler code does both queue management and scheduling work. > We'd better separate the queue manager out of the scheduler logic. In that case, > it would be much easier and safer to extend.
[jira] [Assigned] (YARN-5755) Enhancements to STOP queue handling
[ https://issues.apache.org/jira/browse/YARN-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong reassigned YARN-5755: --- Assignee: Xuan Gong > Enhancements to STOP queue handling > --- > > Key: YARN-5755 > URL: https://issues.apache.org/jira/browse/YARN-5755 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong >
[jira] [Commented] (YARN-5739) Provide timeline reader API to list available timeline entity types for one application
[ https://issues.apache.org/jira/browse/YARN-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592927#comment-15592927 ] Li Lu commented on YARN-5739: - We may not want to introduce another table for storing entity types for each application, since this bothers the write path a lot. Considering this is a relatively rare use case in a cluster, we would rather put most of the burden on the reader side. I had an offline discussion with Enis from the HBase community, and it seems like we can provide a custom filter on the storage layer to "jump" on a scan. In this way, we can quickly jump over all entities within one application, collecting only the distinct entity types. In this JIRA we also need to add endpoints serving this data. > Provide timeline reader API to list available timeline entity types for one > application > --- > > Key: YARN-5739 > URL: https://issues.apache.org/jira/browse/YARN-5739 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelinereader >Reporter: Li Lu >Assignee: Li Lu > > Right now we only show a part of the available timeline entity data in the new > YARN UI. However, some data (especially library specific data) cannot > be queried via the web UI. It will be appealing for the UI to > provide an "entity browser" for each YARN application. Actually, simply > dumping out available timeline entities (with proper pagination, of course) > would be pretty helpful for UI users. > On the timeline side, we're not far away from this goal. Right now I believe the > only thing missing is to list all available entity types within one > application. The challenge here is that we're not storing this data for each > application, but given this kind of call is relatively rare (compared to > writes and updates) we can perform some scanning during the read time.
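The "jump" scan described in the comment above can be modeled without HBase at all: after seeing the first row of each entity type, seek straight to the next type prefix instead of visiting every entity. The sketch below uses an in-memory sorted key set in place of a real HBase filter, and the "type!entity" key layout is illustrative only, not the timeline schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Sketch of the distinct-entity-type scan: collect each type once, then
// seek past all remaining rows of that type.
class EntityTypeLister {
  static List<String> listTypes(TreeSet<String> rows) {
    List<String> types = new ArrayList<>();
    String row = rows.isEmpty() ? null : rows.first();
    while (row != null) {
      String type = row.substring(0, row.indexOf('!'));
      types.add(type);
      // '"' is the character right after '!', so every "type!..." key
      // sorts strictly below type + '"'; ceiling() jumps past them all.
      row = rows.ceiling(type + '"');
    }
    return types;
  }
}
```

In a real HBase implementation the same effect would come from a filter issuing seek hints to the scanner, so the cost scales with the number of distinct types rather than the number of entities.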
[jira] [Updated] (YARN-5761) Separate QueueManager from Scheduler
[ https://issues.apache.org/jira/browse/YARN-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-5761: Attachment: YARN-5761.1.patch > Separate QueueManager from Scheduler > > > Key: YARN-5761 > URL: https://issues.apache.org/jira/browse/YARN-5761 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-5761.1.patch > > > Currently, the scheduler code does both queue management and scheduling work. > We'd better separate the queue manager out of the scheduler logic. In that case, > it would be much easier and safer to extend.
[jira] [Created] (YARN-5761) Separate QueueManager from Scheduler
Xuan Gong created YARN-5761: --- Summary: Separate QueueManager from Scheduler Key: YARN-5761 URL: https://issues.apache.org/jira/browse/YARN-5761 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong Currently, the scheduler code does both queue management and scheduling work. We'd better separate the queue manager out of the scheduler logic. In that case, it would be much easier and safer to extend.
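The separation proposed in YARN-5761 can be sketched as composition behind an interface. Every name below is illustrative, not the eventual patch: queue bookkeeping lives behind its own interface, and the scheduler holds a queue manager instead of inlining queue-management logic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface owning queue lifecycle and lookup.
interface QueueManagerSketch {
  void addQueue(String name);
  boolean exists(String name);
}

// A trivial implementation; a real one would manage queue hierarchy,
// capacities, and state.
class InMemoryQueueManager implements QueueManagerSketch {
  private final Map<String, Object> queues = new HashMap<>();

  public void addQueue(String name) { queues.put(name, new Object()); }

  public boolean exists(String name) { return queues.containsKey(name); }
}

// The scheduler composes a QueueManagerSketch rather than embedding the
// queue bookkeeping, so either side can be extended independently.
class SchedulerSketch {
  private final QueueManagerSketch queueManager;

  SchedulerSketch(QueueManagerSketch qm) {
    this.queueManager = qm;
  }

  boolean canSubmitTo(String queue) {
    return queueManager.exists(queue);
  }
}
```

With this shape, queue-management changes (e.g. the state machine of YARN-5756) touch only the manager, not the scheduling paths.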
[jira] [Commented] (YARN-5525) Make log aggregation service class configurable
[ https://issues.apache.org/jira/browse/YARN-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592878#comment-15592878 ] Jian He commented on YARN-5525: --- bq. push it down completely to AppLogAggregatorImpl I vote for this. This way, we don't need an additional config to replace the LogAggregationService and restart the NM to apply the config. Are we going to change ContainerLaunchContext to convey the information about which LogAggregator the app should use? > Make log aggregation service class configurable > --- > > Key: YARN-5525 > URL: https://issues.apache.org/jira/browse/YARN-5525 > Project: Hadoop YARN > Issue Type: Improvement > Components: log-aggregation >Reporter: Giovanni Matteo Fumarola >Assignee: Botong Huang >Priority: Minor > Attachments: YARN-5525.v1.patch, YARN-5525.v2.patch, > YARN-5525.v3.patch > > > Make the log aggregation class configurable and extensible, so that > alternative log aggregation behaviors like an app-specific log aggregation > directory or log aggregation format can be implemented and plugged in.
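The per-app direction the commenters favor can be sketched as a resolver keyed by a value the application supplies (e.g. carried in its launch context), rather than an NM-wide config that would require an NM restart to change. None of the names below are real YARN APIs; they only illustrate the shape of the idea.

```java
// Hypothetical per-application log aggregation policy.
interface PerAppLogPolicy {
  String aggregationDir(String appId);
}

// Resolves the policy per application from a requested value, falling
// back to a default when the app asked for nothing special. In YARN this
// value would plausibly ride along in the app's launch context.
class LogPolicyResolver {
  static PerAppLogPolicy resolve(String requested) {
    if ("custom-dir".equals(requested)) {
      // App asked for an app-specific aggregation directory.
      return appId -> "/custom-logs/" + appId;
    }
    // Default behavior when the app specified nothing.
    return appId -> "/app-logs/" + appId;
  }
}
```

Because resolution happens per application at aggregation time, two apps on the same NodeManager can use different directories or formats without any daemon-level reconfiguration.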
[jira] [Commented] (YARN-5747) Application timeline metric aggregation in timeline v2 will lose last round aggregation when an application finishes
[ https://issues.apache.org/jira/browse/YARN-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592836#comment-15592836 ] Vrushali C commented on YARN-5747: -- LGTM too. > Application timeline metric aggregation in timeline v2 will lose last round > aggregation when an application finishes > > > Key: YARN-5747 > URL: https://issues.apache.org/jira/browse/YARN-5747 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > Attachments: YARN-5747-trunk.001.patch > > > As discussed in YARN-3816, when an application finishes we should perform an > extra round of application level timeline aggregation. Otherwise data posted > after the last round of aggregation will get lost.
[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager
[ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592792#comment-15592792 ] Hadoop QA commented on YARN-5356: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 45s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 56s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 32s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 54s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 59s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 43s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | 
{color:green} 1m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 51s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 29s {color} | {color:red} root: The patch generated 4 new + 160 unchanged - 3 fixed = 164 total (was 163) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 50s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 59s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 27s {color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 1s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 39m 2s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 37s {color} | {color:green} hadoop-sls in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 102m 50s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834471/YARN-5356.005.patch | | JIRA Issue | YARN-5356 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux cabfe3866fce 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 6d2da38 | | Default Java |
[jira] [Updated] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager
[ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Inigo Goiri updated YARN-5356: -- Attachment: YARN-5356.006.patch Using the configured capacity when the physical capacity is not available. > NodeManager should communicate physical resource capability to ResourceManager > -- > > Key: YARN-5356 > URL: https://issues.apache.org/jira/browse/YARN-5356 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Nathan Roberts >Assignee: Inigo Goiri > Attachments: YARN-5356.000.patch, YARN-5356.001.patch, > YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, > YARN-5356.004.patch, YARN-5356.005.patch, YARN-5356.006.patch > > > Currently ResourceUtilization contains absolute quantities of resource used > (e.g. 4096MB memory used). It would be good if the NM also communicated the > actual physical resource capabilities of the node so that the RM can use this > data to schedule more effectively (overcommit, etc) > Currently the only available information is the Resource the node registered > with (or later updated using updateNodeResource). However, these aren't > really sufficient to get a good view of how utilized a resource is. For > example, if a node reports 400% CPU utilization, does that mean it's > completely full, or barely utilized? Today there is no reliable way to figure > this out. > [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you > have thoughts/opinions on this?
[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager
[ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592776#comment-15592776 ] Inigo Goiri commented on YARN-5356: --- OK, that makes sense too. Let me switch to that. > NodeManager should communicate physical resource capability to ResourceManager > -- > > Key: YARN-5356 > URL: https://issues.apache.org/jira/browse/YARN-5356 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Nathan Roberts >Assignee: Inigo Goiri > Attachments: YARN-5356.000.patch, YARN-5356.001.patch, > YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, > YARN-5356.004.patch, YARN-5356.005.patch > > > Currently ResourceUtilization contains absolute quantities of resource used > (e.g. 4096MB memory used). It would be good if the NM also communicated the > actual physical resource capabilities of the node so that the RM can use this > data to schedule more effectively (overcommit, etc) > Currently the only available information is the Resource the node registered > with (or later updated using updateNodeResource). However, these aren't > really sufficient to get a good view of how utilized a resource is. For > example, if a node reports 400% CPU utilization, does that mean it's > completely full, or barely utilized? Today there is no reliable way to figure > this out. > [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you > have thoughts/opinions on this?
[jira] [Comment Edited] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592721#comment-15592721 ] Joep Rottinghuis edited comment on YARN-5760 at 10/20/16 7:20 PM: -- If we lazily initialize the connection then that code should be done thread-safe. If we don't want to close and discard the connection right after an application lifecycle ends, then we need to keep track of reference count and have a separate timer, as well as code to ensure that we don't introduce a race condition between the timer expiring and a new application starting and bumping up the reference count for a connection. Similar concerns for re-using and/or re-establishing a connection in the face of failures (YARN-4061) will have to be safe. I can see this code start simple in intention and get complicated quickly in implementation, or be a source of subtle bugs. In other words, it sounds like we're heading towards implementing our own ConnectionManager / connection pool (of size 1) here. was (Author: jrottinghuis): If we lazily initialize the connection then that code should be done thread-safe. If we don't want to close and discard the connection right after an application lifecycle ends, then we need to keep track of reference count and have a separate timer, as well as code to ensure that we don't introduce a race condition between the timer expiring and a new application starting and bumping up the reference count for a connection. Similar concerns for re-using and/or re-establishing a connection in the face of failures (YARN-4061) will have to be safe. I can see this code start simple in intention and get complicated quickly in implementation, or be a source of subtle bugs. 
> [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
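The ref-counted, lazily initialized connection Joep describes can be sketched minimally as below. The class and method names are invented; a real version would hold an HBase `Connection`, and would close it on a delayed inactivity timer (to avoid churn on busy clusters) rather than immediately when the count hits zero.

```java
// Minimal sketch of a ref-counted, lazily created shared connection.
// RefCountedConnection is an invented name; Object stands in for an
// org.apache.hadoop.hbase.client.Connection.
public class RefCountedConnection {
  private Object connection;
  private int refCount;

  // Synchronized so concurrent app starts cannot double-create the connection.
  public synchronized Object acquire() {
    if (connection == null) {
      connection = new Object();  // the expensive createConnection() goes here
    }
    refCount++;
    return connection;
  }

  // Synchronized so a release racing with an acquire sees a consistent count.
  public synchronized void release() {
    if (--refCount == 0) {
      // Real code would schedule the close after an idle timeout, and cancel
      // that close if a new app acquires the connection in the meantime.
      connection = null;
    }
  }

  public synchronized boolean isOpen() { return connection != null; }
}
```

Keeping acquire/release under one lock is what avoids the race Joep warns about between the idle timer expiring and a new application bumping the count.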
[jira] [Commented] (YARN-2009) Priority support for preemption in ProportionalCapacityPreemptionPolicy
[ https://issues.apache.org/jira/browse/YARN-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592740#comment-15592740 ] Eric Payne commented on YARN-2009: -- Hi [~sunilg]. Here is a description of my test environment, the steps I executed, and the results I am seeing. I don't know why the unit test you described above is not catching this, but I will continue to investigate. In the meantime, can you please try the following and let me know what you discover? - ||Property Name||Property Value|| |monitoring_interval (ms)|1000| |max_wait_before_kill (ms)|500| |total_preemption_per_round|1.0| |max_ignored_over_capacity|0.2| |select_based_on_reserved_containers|true| |natural_termination_factor|2.0| |intra-queue-preemption.enabled|true| |intra-queue-preemption.minimum-threshold|0.5| |intra-queue-preemption.max-allowable-limit|0.1| {noformat:title=Cluster} Nodes: 3 Mem per node: 4 GB Total Cluster Size: 12 GB Container size: 0.5 GB {noformat} ||Queue||Guarantee||Max||Minimum user limit percent||User Limit Factor|| |root|100% (12 GB)|100% (12 GB)|N/A|N/A| |default|50% (6 GB)|100% (12 GB)|50% (2 users can run in queue simultaneously)|2.0 (one user can consume twice the queue's Guarantee)| |eng|50% (6 GB)|100% (12 GB)|50% (2 users can run in queue simultaneously)|2.0 (one user can consume twice the queue's Guarantee)| - {{user1}} starts {{app1}} at priority 1 in the {{default}} queue, and requests 30 mappers which want to run for 10 minutes each: -- Sleep job: {{-m 30 -mt 60}} -- Total requested resources are 15.5 GB: ((30 map containers * 0.5 GB per container) + 0.5 GB AM container) ||App Name||User Name||Priority||Used||Pending|| |app1|user1|1|0|15.5 GB| - The RM assigns {{app1}} 24 containers, consuming 12 GB (all cluster resources): -- {{(23 mappers * 0.5 GB) + 0.5 GB AM = 12 GB}} ||App Name||User Name||Priority||Used||Pending|| |app1|user1|1|12 GB|3.5 GB| - {{user2}} starts {{app2}} at priority 2 in the {{default}} queue, and requests 30
mappers which want to run for 10 minutes each: ||App Name||User Name||Priority||Used||Pending|| |app1|user1|1|12 GB|3.5 GB| |app2|user2|2|0|15.5 GB| - The intra-queue preemption monitor iterates over the containers for several {{monitoring_interval}}s and preempts 12 containers (6 GB of resources) - The RM assigns the preempted containers to {{app2}} ||App Name||User Name||Priority||Used||Pending|| |app1|user1|1|6 GB|3.5 GB| |app2|user2|2|6 GB|3.5 GB| - The intra-queue preemption monitor continues to preempt containers from {{app1}}. -- However, since the MULP for the {{default}} queue should be 6 GB, the RM gives the preempted containers back to {{app1}} -- This repeats indefinitely. > Priority support for preemption in ProportionalCapacityPreemptionPolicy > --- > > Key: YARN-2009 > URL: https://issues.apache.org/jira/browse/YARN-2009 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler >Reporter: Devaraj K >Assignee: Sunil G > Attachments: YARN-2009.0001.patch, YARN-2009.0002.patch, > YARN-2009.0003.patch, YARN-2009.0004.patch, YARN-2009.0005.patch, > YARN-2009.0006.patch, YARN-2009.0007.patch, YARN-2009.0008.patch, > YARN-2009.0009.patch, YARN-2009.0010.patch, YARN-2009.0011.patch > > > While preempting containers based on the queue ideal assignment, we may need > to consider preempting the low priority application containers first.
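The oscillation Eric observes follows from the user-limit arithmetic. A deliberately simplified calculation (ignoring many CapacityScheduler details, and assuming a user's floor is simply used capacity times minimum-user-limit-percent) shows why the scheduler keeps giving containers back:

```java
// Simplified sketch, not the actual CapacityScheduler user-limit code.
// With the queue fully used and a minimum-user-limit-percent (MULP) of 0.5,
// each of two active users has a floor of used * MULP.
public class UserLimitSketch {
  public static double userFloorGB(double queueUsedGB, double mulp) {
    return queueUsedGB * mulp;
  }
}
```

With 12 GB used and a MULP of 0.5, each user's floor is 6 GB; preempting app1 below 6 GB is therefore immediately undone when the RM reassigns containers to honor the floor, matching the repeated preempt/reassign loop in the test.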
[jira] [Updated] (YARN-5715) introduce entity prefix for return and sort order
[ https://issues.apache.org/jira/browse/YARN-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-5715: Attachment: YARN-5715-YARN-5355.05.patch Updated patch with following changes # Added DEFAULT_ENTITY_PREFIX constant in TimelineEntity. # Modified the java doc as per Varun's review comment. > introduce entity prefix for return and sort order > - > > Key: YARN-5715 > URL: https://issues.apache.org/jira/browse/YARN-5715 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Rohith Sharma K S >Priority: Critical > Attachments: YARN-5715-YARN-5355.01.patch, > YARN-5715-YARN-5355.02.patch, YARN-5715-YARN-5355.03.patch, > YARN-5715-YARN-5355.04.patch, YARN-5715-YARN-5355.05.patch > > > While looking into YARN-5585, we have come across the need to provide a sort > order different than the current entity id order. The current entity id order > returns entities strictly in the lexicographical order, and as such it > returns the earliest entities first. This may not be the most natural return > order. A more natural return/sort order would be from the most recent > entities. > To solve this, we would like to add what we call the "entity prefix" in the > row key for the entity table. It is a number (long) that can be easily > provided by the client on write. In the row key, it would be added before the > entity id itself. > The entity prefix would be considered mandatory. On all writes (including > updates) the correct entity prefix should be set by the client so that the > correct row key is used. The entity prefix needs to be unique only within the > scope of the application and the entity type. > For queries that return a list of entities, the prefix values will be > returned along with the entity id's. Queries that specify the prefix and the > id should be returned quickly using the row key. 
If the query omits the > prefix but specifies the id (query by id), the query may be less efficient. > This JIRA should add the entity prefix to the entity API and add its handling > to the schema and the write path. The read path will be addressed in > YARN-5585.
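The row-key idea in YARN-5715 can be illustrated with a toy encoding: a client-supplied long "entity prefix" placed before the entity id. The real ATSv2 row key is binary, not a string; the string form below (an invented `EntityRowKey` class) only makes the ordering visible. Using `Long.MAX_VALUE - timestamp` as the prefix is one common way to make newer entities sort first.

```java
// Illustration-only row-key sketch; not the actual ATSv2 encoding.
public class EntityRowKey {
  // Zero-pad to 19 digits so lexicographic order matches numeric order
  // (assumes a non-negative prefix).
  public static String rowKey(long prefix, String entityId) {
    return String.format("%019d!%s", prefix, entityId);
  }

  // An inverted timestamp makes more recent entities sort earlier in a scan.
  public static long invertedPrefix(long timestampMs) {
    return Long.MAX_VALUE - timestampMs;
  }
}
```

Because the prefix precedes the id, a query that supplies both can seek directly to the row, while a query-by-id alone has to scan across prefixes, which is exactly the efficiency trade-off the issue describes.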
[jira] [Commented] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592721#comment-15592721 ] Joep Rottinghuis commented on YARN-5760: If we lazily initialize the connection then that code should be done thread-safe. If we don't want to close and discard the connection right after an application lifecycle ends, then we need to keep track of reference count and have a separate timer, as well as code to ensure that we don't introduce a race condition between the timer expiring and a new application starting and bumping up the reference count for a connection. Similar concerns for re-using and/or re-establishing a connection in the face of failures (YARN-4061) will have to be safe. I can see this code start simple in intention and get complicated quickly in implementation, or be a source of subtle bugs. > [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Updated] (YARN-5686) DefaultContainerExecutor random working dir algorithm skews results
[ https://issues.apache.org/jira/browse/YARN-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vrushali C updated YARN-5686: - Attachment: YARN-5686.002.patch Uploading patch 002 that reorders the test case data and fixes the checkstyle comment. > DefaultContainerExecutor random working dir algorithm skews results > > > Key: YARN-5686 > URL: https://issues.apache.org/jira/browse/YARN-5686 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Miklos Szegedi >Assignee: Vrushali C >Priority: Minor > Attachments: YARN-5686.001.patch, YARN-5686.002.patch > > > {code} > long randomPosition = RandomUtils.nextLong() % totalAvailable; > ... > while (randomPosition > availableOnDisk[dir]) { > randomPosition -= availableOnDisk[dir++]; > } > {code} > The code above selects a disk based on the random number weighted by the free > space on each disk respectively. For example, if I have two disks with 100 > bytes each, totalAvailable is 200. The value of randomPosition will be > 0..199. 0..99 should select the first disk, 100..199 should select the second > disk inclusively. Random number 100 should select the second disk to be fair > but this is not the case right now. > We need to use > {code} > while (randomPosition >= availableOnDisk[dir]) > {code} > instead of > {code} > while (randomPosition > availableOnDisk[dir]) > {code}
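The boundary fix in the description can be checked with a few concrete values. This is a standalone sketch of the corrected selection loop (not the actual DefaultContainerExecutor code): with two disks of 100 bytes each, the `>=` comparison maps positions 0..99 to disk 0 and 100..199 to disk 1, so each disk is chosen in proportion to its free space.

```java
// Standalone sketch of the corrected weighted disk selection from YARN-5686.
public class WeightedDiskSelector {
  // randomPosition is assumed uniform in [0, sum(availableOnDisk)).
  public static int pick(long[] availableOnDisk, long randomPosition) {
    int dir = 0;
    // The fix: >= instead of >, so the boundary position falls to the next
    // disk. With > the first disk would also absorb position 100, skewing
    // the distribution toward earlier disks.
    while (randomPosition >= availableOnDisk[dir]) {
      randomPosition -= availableOnDisk[dir++];
    }
    return dir;
  }
}
```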
[jira] [Commented] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592693#comment-15592693 ] Vrushali C commented on YARN-5760: -- Also, a related thought: we could use node labels to restrict AMs to only certain nodes. That way we don't have too many nodes connecting to hbase/zk. That may be helpful if there are very frequently created, short-lived AMs. We could add it as a suggestion in the docs for people who might want to use it. > [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Commented] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592688#comment-15592688 ] Varun Saxena commented on YARN-5760: Probably we can be a little conservative with teardowns. Tearing it down after an interval of inactivity sounds like a good idea. > [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Commented] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592684#comment-15592684 ] Vrushali C commented on YARN-5760: -- Yes, opening a connection (checking for an open connection) the first time an app shows up for publishing may be a good idea. I am thinking about how/when to close the connection. Do we want to close it after an interval of inactivity on the NM, or right after an app on that NM ends? On a busy cluster, closing right after each app ends might mean too many build-ups and teardowns. > [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Updated] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-5760: --- Issue Type: Sub-task (was: Bug) Parent: YARN-5355 > [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Updated] (YARN-5760) [ATSv2] Create HBase connection only if an app collector is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-5760: --- Summary: [ATSv2] Create HBase connection only if an app collector is publishing from NM (was: [ATSv2] Create HBase connection only if an app is publishing from NM) > [ATSv2] Create HBase connection only if an app collector is publishing from NM > -- > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Updated] (YARN-5760) [ATSv2] Create HBase connection only if an app is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-5760: --- Summary: [ATSv2] Create HBase connection only if an app is publishing from NM (was: [ATSv2] Create HBase connection only if an application is publishing from NM) > [ATSv2] Create HBase connection only if an app is publishing from NM > > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Updated] (YARN-5760) [ATSv2] Create HBase connection only if an application is publishing from NM
[ https://issues.apache.org/jira/browse/YARN-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-5760: --- Description: Irrespective of NM handling an app or not, we initialize HBaseTimelineWriterImpl in TimelineCollectorManager. This in turn calls ConnectionFactory#createConnection to manage connections with HBase. But it seems this opens up a connection with Zookeeper (i.e. as soon as NM starts up) instead of opening a connection when at least one app arrives for publishing and closing it if no apps are being published from this NM. This leads to unnecessary connections to Zookeeper. > [ATSv2] Create HBase connection only if an application is publishing from NM > > > Key: YARN-5760 > URL: https://issues.apache.org/jira/browse/YARN-5760 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Varun Saxena >Assignee: Varun Saxena > > Irrespective of NM handling an app or not, we initialize > HBaseTimelineWriterImpl in TimelineCollectorManager. > This in turn calls ConnectionFactory#createConnection to manage connections > with HBase. > But it seems this opens up a connection with Zookeeper (i.e. as soon as NM > starts up) instead of opening a connection when at least one app arrives for > publishing and closing it if no apps are being published from this NM. > This leads to unnecessary connections to Zookeeper.
[jira] [Created] (YARN-5760) [ATSv2] Create HBase connection only if an application is publishing from NM
Varun Saxena created YARN-5760: -- Summary: [ATSv2] Create HBase connection only if an application is publishing from NM Key: YARN-5760 URL: https://issues.apache.org/jira/browse/YARN-5760 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Reporter: Varun Saxena Assignee: Varun Saxena
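The behavior the ticket asks for — open the HBase/Zookeeper connection only once at least one app is publishing, and close it when no apps remain — can be sketched as a small reference-counted manager. Everything below is illustrative: `LazyWriterManager` and its methods are hypothetical names, not the actual TimelineCollectorManager or HBaseTimelineWriterImpl API.

```java
// Hypothetical sketch of the proposed fix: reference-count publishing apps
// and open the backing connection lazily. A plain Object stands in for an
// HBase Connection obtained via ConnectionFactory#createConnection.
class LazyWriterManager {
    private Object connection; // stand-in for an HBase Connection
    private int activeApps;

    // Open the backing connection only when the first app starts publishing.
    synchronized void appStarted() {
        if (activeApps == 0 && connection == null) {
            connection = openConnection();
        }
        activeApps++;
    }

    // Close the connection once no apps are publishing from this NM.
    synchronized void appFinished() {
        if (activeApps > 0) {
            activeApps--;
        }
        if (activeApps == 0 && connection != null) {
            closeConnection();
            connection = null;
        }
    }

    synchronized boolean isConnected() {
        return connection != null;
    }

    private Object openConnection() { return new Object(); }
    private void closeConnection() { /* release ZK/HBase resources */ }
}
```

With this shape, an idle NM holds no Zookeeper connection at all; the first `appStarted()` pays the connection cost and the last `appFinished()` releases it.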
[jira] [Commented] (YARN-5742) Serve aggregated logs of historical apps from timeline service
[ https://issues.apache.org/jira/browse/YARN-5742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592635#comment-15592635 ] Joep Rottinghuis commented on YARN-5742: We should consider really carefully whether serving YARN application logs is a timeline service concern. I'd argue that it belongs in a separate YARN service. It would be perfectly acceptable to have that separate service store metadata (about the current log location of tasks, whether that is on local disk, or aggregated to a location such as HDFS) in the timeline service, but I think the serving itself doesn't belong here. Providing an API that can read files from HDFS and stream them out would open up security concerns and would duplicate the efforts of services such as WebHDFS/HttpFS. The HttpFS approach of a central pool of nodes serving data from HDFS has been superseded by a distributed WebHDFS approach. Note, by the way, that WebHDFS as it stands today still has some compatibility challenges with HDFS federation. Both of these general approaches to serving HDFS data have to deal with proxying user requests and correctly limiting visibility of HDFS files to the users with the appropriate access. > Serve aggregated logs of historical apps from timeline service > -- > > Key: YARN-5742 > URL: https://issues.apache.org/jira/browse/YARN-5742 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Varun Saxena >Assignee: Rohith Sharma K S > Attachments: YARN-5742-POC-v0.patch > >
[jira] [Commented] (YARN-5754) Variable earliest missing null check in computeShares() in FifoPolicy.java
[ https://issues.apache.org/jira/browse/YARN-5754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592617#comment-15592617 ] Yufei Gu commented on YARN-5754: Additional unit tests are not necessary. > Variable earliest missing null check in computeShares() in FifoPolicy.java > -- > > Key: YARN-5754 > URL: https://issues.apache.org/jira/browse/YARN-5754 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 3.0.0-alpha1 >Reporter: Yufei Gu >Assignee: Yufei Gu > Attachments: YARN-5754.001.patch > >
[jira] [Commented] (YARN-4061) [Fault tolerance] Fault tolerant writer for timeline v2
[ https://issues.apache.org/jira/browse/YARN-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592611#comment-15592611 ] Joep Rottinghuis commented on YARN-4061: You do bring up an interesting question [~gtCarrera9], and that is what happens if the timeline collector / writer is down. In the current implementation this would occur when the nodemanager is down (is restarted). Once collectors become dedicated / separate per-application containers, something similar can happen. The clients will time out and will have to do retries. I think the concern you indicated here is what happens to data buffered in memory in the collector before it is written to either HBase or even spooled to disk (or HDFS). Even in the HDFS case there will be buffering. The current TimelineWriter interface covers this by assuming that all writes are buffered and providing an explicit flush call to flush all previously buffered data to permanent storage. For the spooling case to HDFS that would mean we'd have to do an hsync/flush there as well. This jira is mainly focused on what happens if we cannot persist data to the distributed back-end system (HBase in the current implementation). > [Fault tolerance] Fault tolerant writer for timeline v2 > --- > > Key: YARN-4061 > URL: https://issues.apache.org/jira/browse/YARN-4061 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Joep Rottinghuis > Labels: YARN-5355 > Attachments: FaulttolerantwriterforTimelinev2.pdf > > > We need to build a timeline writer that can be resistant to backend storage > down time and timeline collector failures.
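The buffered-write contract described in the comment above — writes may sit in memory until an explicit flush pushes them to permanent storage — can be sketched as follows. The class and method names are hypothetical stand-ins, not the real TimelineWriter interface, and in-memory lists stand in for HBase/HDFS.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for a buffered writer with an explicit flush():
// write() may only buffer, so durability is guaranteed only after flush().
class BufferedWriterSketch {
    private final List<String> buffer = new ArrayList<>();    // in-memory, lost on crash
    private final List<String> persisted = new ArrayList<>(); // stand-in for HBase/HDFS

    // Buffered write: nothing durable happens yet.
    void write(String entity) {
        buffer.add(entity);
    }

    // Explicit flush: push all previously buffered data to permanent storage.
    void flush() {
        persisted.addAll(buffer);
        buffer.clear();
    }

    int persistedCount() {
        return persisted.size();
    }
}
```

The fault-tolerance question the jira raises lives exactly in the window between `write()` and `flush()`: anything still in `buffer` when the collector dies is lost unless it is spooled somewhere durable first.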
[jira] [Commented] (YARN-5734) OrgQueue for easy CapacityScheduler queue configuration management
[ https://issues.apache.org/jira/browse/YARN-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592596#comment-15592596 ] Jonathan Hung commented on YARN-5734: - [~rémy], glad to hear this is useful for your company. With this enabled, {{refreshQueue}} will no longer use the configuration from {{capacity-scheduler.xml}} as the latest conf, since calling capacity scheduler's reinitialize will load the capacity scheduler configuration from the backing store (e.g. derby database). The intent behind {{reset}} is to clear the configuration from the DB and load it from the xml file. > OrgQueue for easy CapacityScheduler queue configuration management > -- > > Key: YARN-5734 > URL: https://issues.apache.org/jira/browse/YARN-5734 > Project: Hadoop YARN > Issue Type: New Feature >Reporter: Min Shen >Assignee: Min Shen > Attachments: OrgQueue_Design_v0.pdf > > > The current xml based configuration mechanism in CapacityScheduler makes it > very inconvenient to apply any changes to the queue configurations. We saw 2 > main drawbacks in the file based configuration mechanism: > # This makes it very inconvenient to automate queue configuration updates. > For example, in our cluster setup, we leverage the queue mapping feature from > YARN-2411 to route users to their dedicated organization queues. It could be > extremely cumbersome to keep updating the config file to manage the very > dynamic mapping between users to organizations. > # Even a user has the admin permission on one specific queue, that user is > unable to make any queue configuration changes to resize the subqueues, > changing queue ACLs, or creating new queues. All these operations need to be > performed in a centralized manner by the cluster administrators. > With these current limitations, we realized the need of a more flexible > configuration mechanism that allows queue configurations to be stored and > managed more dynamically. 
We developed the feature internally at LinkedIn, > which introduces the concept of MutableConfigurationProvider. What it > essentially does is provide a set of configuration mutation APIs that > allows queue configurations to be updated externally via a set of REST APIs. > When performing queue configuration changes, the queue ACLs will be > honored, which means only queue administrators can make configuration changes > to a given queue. MutableConfigurationProvider is implemented as a pluggable > interface, and we have one implementation of this interface which is based on > the Derby embedded database. > This feature has been deployed on LinkedIn's Hadoop cluster for a year now, > and has gone through several iterations of gathering feedback from users > and improving accordingly. With this feature, cluster administrators are able > to automate many of the queue configuration management tasks, such as setting > the queue capacities to adjust cluster resources between queues based on > established resource consumption patterns, or managing updates to the > user-to-queue mappings. We have attached our design documentation to this ticket > and would like to receive feedback from the community regarding how to best > integrate it with the latest version of YARN.
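A rough sketch of what a pluggable mutable-configuration provider could look like, with an in-memory implementation standing in for the Derby-backed one. The interface name, method names, and property key below are all illustrative assumptions, not the actual LinkedIn code.

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of a pluggable mutable-configuration provider; the real
// feature backs this with an embedded Derby database and enforces queue
// ACLs on every mutation before it is accepted.
interface MutableConfProvider {
    String get(String key);

    // Apply a batch of key/value updates (e.g. from a REST call).
    void mutate(Map<String, String> updates);
}

class InMemoryConfProvider implements MutableConfProvider {
    private final Map<String, String> store = new HashMap<>();

    @Override
    public String get(String key) {
        return store.get(key);
    }

    @Override
    public void mutate(Map<String, String> updates) {
        store.putAll(updates);
    }
}
```

Because the provider is an interface, the scheduler's reinitialize path can load its configuration from whichever backing store is plugged in, rather than re-reading capacity-scheduler.xml.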
[jira] [Created] (YARN-5759) Capability to register for a notification/callback on the expiry of timeouts for an application
Gour Saha created YARN-5759: --- Summary: Capability to register for a notification/callback on the expiry of timeouts for an application Key: YARN-5759 URL: https://issues.apache.org/jira/browse/YARN-5759 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Gour Saha There is a need for the YARN native services REST-API service, to take certain actions once a timeout of an application expires. For example, an immediate requirement is to destroy a Slider application, once its lifetime timeout expires and YARN has stopped the application. Destroying a Slider application means cleanup of Slider HDFS state store and ZK paths for that application. Potentially, there will be advanced requirements from the REST-API service and other services in the future, which will make this feature very handy.
[jira] [Updated] (YARN-5759) Capability to register for a notification/callback on the expiry of timeouts
[ https://issues.apache.org/jira/browse/YARN-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gour Saha updated YARN-5759: Summary: Capability to register for a notification/callback on the expiry of timeouts (was: Capability to register for a notification/callback on the expiry of timeouts for an application) > Capability to register for a notification/callback on the expiry of timeouts > > > Key: YARN-5759 > URL: https://issues.apache.org/jira/browse/YARN-5759 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Gour Saha > > There is a need for the YARN native services REST-API service, to take > certain actions once a timeout of an application expires. For example, an > immediate requirement is to destroy a Slider application, once its lifetime > timeout expires and YARN has stopped the application. Destroying a Slider > application means cleanup of Slider HDFS state store and ZK paths for that > application. > Potentially, there will be advanced requirements from the REST-API service > and other services in the future, which will make this feature very handy.
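A minimal sketch of the requested notification/callback registration. The listener interface and class names here are hypothetical — designing the actual RM-side API is precisely what this ticket asks for.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: callers register a callback that fires when an
// application's lifetime timeout expires and the app has been stopped.
class TimeoutNotifier {
    interface ExpiryListener {
        void onExpiry(String applicationId);
    }

    private final List<ExpiryListener> listeners = new ArrayList<>();

    void register(ExpiryListener listener) {
        listeners.add(listener);
    }

    // Invoked after the timeout has expired and YARN has stopped the app;
    // e.g. a Slider service could clean up its HDFS state store and ZK
    // paths for the application from its callback.
    void fireExpiry(String applicationId) {
        for (ExpiryListener listener : listeners) {
            listener.onExpiry(applicationId);
        }
    }
}
```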
[jira] [Commented] (YARN-3649) Allow configurable prefix for hbase table names (like prod, exp, test etc)
[ https://issues.apache.org/jira/browse/YARN-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592560#comment-15592560 ] Hadoop QA commented on YARN-3649: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 13s {color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 52s {color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s {color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 33s {color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 55s {color} | {color:green} YARN-5355 passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 16s {color} | {color:green} YARN-5355 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s {color} | {color:green} YARN-5355 
passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 7s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 40s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s {color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 209 unchanged - 1 fixed = 209 total (was 210) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 27s {color} | {color:green} hadoop-yarn-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 52s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 6m 32s {color} | {color:green} hadoop-yarn-server-timelineservice-hbase-tests in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 7s {color} | {color:green} hadoop-yarn-site in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 37m 41s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12834462/YARN-3649-YARN-5355.004.patch | | JIRA Issue | YARN-3649 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux
[jira] [Commented] (YARN-5356) NodeManager should communicate physical resource capability to ResourceManager
[ https://issues.apache.org/jira/browse/YARN-5356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592554#comment-15592554 ] Kuhu Shukla commented on YARN-5356: --- I think that if we do not have the values from the plugin we could initialize the physicalResource to totalNodeResource (from conf values) instead of zero. Maybe something like:
{code}
int physicalMemoryMb;
int physicalCores;
if (rcp != null) {
  physicalMemoryMb = (int) rcp.getPhysicalMemorySize() / (1024 * 1024);
  physicalCores = rcp.getNumProcessors();
} else {
  physicalMemoryMb = conf.getInt(
      YarnConfiguration.NM_PMEM_MB,
      YarnConfiguration.DEFAULT_NM_PMEM_MB)
      + conf.getInt(
          YarnConfiguration.NM_SYSTEM_RESERVED_PMEM_MB, 0);
  ..
}
{code}
> NodeManager should communicate physical resource capability to ResourceManager > -- > > Key: YARN-5356 > URL: https://issues.apache.org/jira/browse/YARN-5356 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager, resourcemanager >Affects Versions: 3.0.0-alpha1 >Reporter: Nathan Roberts >Assignee: Inigo Goiri > Attachments: YARN-5356.000.patch, YARN-5356.001.patch, > YARN-5356.002.patch, YARN-5356.002.patch, YARN-5356.003.patch, > YARN-5356.004.patch, YARN-5356.005.patch > > > Currently ResourceUtilization contains absolute quantities of resource used > (e.g. 4096MB memory used). It would be good if the NM also communicated the > actual physical resource capabilities of the node so that the RM can use this > data to schedule more effectively (overcommit, etc.) > Currently the only available information is the Resource the node registered > with (or later updated using updateNodeResource). However, these aren't > really sufficient to get a good view of how utilized a resource is. For > example, if a node reports 400% CPU utilization, does that mean it's > completely full, or barely utilized? Today there is no reliable way to figure > this out.
> [~elgoiri] - Lots of good work is happening in YARN-2965 so curious if you > have thoughts/opinions on this?
[jira] [Commented] (YARN-5575) Many classes use bare yarn. properties instead of the defined constants
[ https://issues.apache.org/jira/browse/YARN-5575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592515#comment-15592515 ] Miklos Szegedi commented on YARN-5575: -- +1 (non-binding) The change looks good to me. Thank you, [~templedf]! > Many classes use bare yarn. properties instead of the defined constants > --- > > Key: YARN-5575 > URL: https://issues.apache.org/jira/browse/YARN-5575 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Attachments: YARN-5575.001.patch, YARN-5575.002.patch, > YARN-5575.003.patch > > > MAPREDUCE-5870 introduced the following line: > {code} > conf.setInt("yarn.cluster.max-application-priority", 10); > {code} > It should instead be: > {code} > conf.setInt(YarnConfiguration.MAX_CLUSTER_LEVEL_APPLICATION_PRIORITY, > 10); > {code}