[jira] [Commented] (YARN-2187) FairScheduler: Disable max-AM-share check by default

2014-06-21 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039726#comment-14039726
 ] 

Karthik Kambatla commented on YARN-2187:


+1. Committing this. 

 FairScheduler: Disable max-AM-share check by default
 

 Key: YARN-2187
 URL: https://issues.apache.org/jira/browse/YARN-2187
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Robert Kanter
Assignee: Robert Kanter
 Attachments: YARN-2187.patch


 Say you have a small cluster with 8GB memory and 5 queues.  This means that 
 each queue can have 8GB / 5 = 1.6GB, but an AM requires 2GB to start, so no 
 AMs can be started.  By default, the max-AM-share check should be disabled so 
 users don't see a regression. On medium-sized clusters, it still makes sense 
 to set the max-AM-share to a value between 0 and 1. 
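 For context, a minimal sketch of the check being disabled (method and names 
 here are hypothetical; the real logic lives in FSLeafQueue and reads 
 maxAMShare from the allocation file):
{code}
// Hedged sketch: can an AM of size amDemandMb start under the queue's
// max-AM-share? A negative maxAMShare disables the check entirely,
// which is the default this JIRA proposes.
static boolean canRunAppAM(float maxAMShare, long queueFairShareMb,
                           long amMemoryUsedMb, long amDemandMb) {
  if (maxAMShare < 0) {
    return true; // check disabled (e.g. maxAMShare = -1.0f)
  }
  long maxAMMemoryMb = (long) (maxAMShare * queueFairShareMb);
  return amMemoryUsedMb + amDemandMb <= maxAMMemoryMb;
}
{code}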



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2187) FairScheduler: Disable max-AM-share check by default

2014-06-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039729#comment-14039729
 ] 

Hudson commented on YARN-2187:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5749 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5749/])
YARN-2187. FairScheduler: Disable max-AM-share check by default. (Robert Kanter 
via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1604321)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm


 FairScheduler: Disable max-AM-share check by default
 

 Key: YARN-2187
 URL: https://issues.apache.org/jira/browse/YARN-2187
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Robert Kanter
Assignee: Robert Kanter
 Fix For: 2.5.0

 Attachments: YARN-2187.patch


 Say you have a small cluster with 8GB memory and 5 queues.  This means that 
 each queue can have 8GB / 5 = 1.6GB, but an AM requires 2GB to start, so no 
 AMs can be started.  By default, the max-AM-share check should be disabled so 
 users don't see a regression. On medium-sized clusters, it still makes sense 
 to set the max-AM-share to a value between 0 and 1. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1680) availableResources sent to applicationMaster in heartbeat should exclude blacklistedNodes free memory.

2014-06-21 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039733#comment-14039733
 ] 

Niels Basjes commented on YARN-1680:


Looks like YARN-2105 has been fixed. Can someone please retrigger this patch?

 availableResources sent to applicationMaster in heartbeat should exclude 
 blacklistedNodes free memory.
 --

 Key: YARN-1680
 URL: https://issues.apache.org/jira/browse/YARN-1680
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.2.0, 2.3.0
 Environment: SuSE 11 SP2 + Hadoop-2.3 
Reporter: Rohith
Assignee: Chen He
 Attachments: YARN-1680-v2.patch, YARN-1680.patch


 There are 4 NodeManagers with 8GB each; total cluster capacity is 32GB. Cluster 
 slow start is set to 1.
 A job is running whose reducer tasks occupy 29GB of the cluster. One 
 NodeManager (NM-4) became unstable (3 maps got killed), so the MRAppMaster 
 blacklisted the unstable NodeManager (NM-4). All reducer tasks are now running 
 in the cluster.
 The MRAppMaster does not preempt the reducers because, for the reducer 
 preemption calculation, the headroom includes the blacklisted nodes' memory. 
 This makes jobs hang forever (the ResourceManager does not assign any new 
 containers on blacklisted nodes, but the availableResources it returns counts 
 the whole cluster's free memory). 
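 A hedged sketch of the adjustment being asked for (the method name is 
 hypothetical; NodeReport and Resource are the existing records API):
{code}
import java.util.Collection;
import org.apache.hadoop.yarn.api.records.NodeReport;

// Hedged sketch: subtract blacklisted nodes' free memory from the
// headroom reported to the AM, so the preemption math is not inflated.
static long adjustedHeadroomMb(long clusterFreeMb,
                               Collection<NodeReport> blacklistedNodes) {
  long unusableFreeMb = 0;
  for (NodeReport node : blacklistedNodes) {
    unusableFreeMb += node.getCapability().getMemory()
        - node.getUsed().getMemory();
  }
  return Math.max(0L, clusterFreeMb - unusableFreeMb);
}
{code}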



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived

2014-06-21 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039744#comment-14039744
 ] 

Varun Vasudev commented on YARN-1039:
-

I agree with [~zjshen]. Using the tags field also means we don't have to worry 
about switching to an enum, as [~cwelch] mentioned in one of the earlier comments.
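
For reference, a minimal client-side sketch of the tag-based approach (the 
"long-lived" tag name is only an assumption; setApplicationTags is the 
existing submission-context API):
{code}
import java.util.Collections;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;

// Hedged sketch: mark an app long-lived via the existing tags field,
// avoiding a new enum or flag on the submission context.
static void markLongLived(ApplicationSubmissionContext ctx) {
  ctx.setApplicationTags(Collections.singleton("long-lived"));
}
{code}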

 Add parameter for YARN resource requests to indicate long lived
 -

 Key: YARN-1039
 URL: https://issues.apache.org/jira/browse/YARN-1039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
Priority: Minor
 Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch


 A container request could support a new parameter, long-lived. This could be 
 used by a scheduler that would know not to host the service on a transient 
 (cloud: spot-priced) node.
 Schedulers could also decide whether or not to allocate multiple long-lived 
 containers on the same node.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2187) FairScheduler: Disable max-AM-share check by default

2014-06-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039783#comment-14039783
 ] 

Hudson commented on YARN-2187:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #590 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/590/])
YARN-2187. FairScheduler: Disable max-AM-share check by default. (Robert Kanter 
via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1604321)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm


 FairScheduler: Disable max-AM-share check by default
 

 Key: YARN-2187
 URL: https://issues.apache.org/jira/browse/YARN-2187
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Robert Kanter
Assignee: Robert Kanter
 Fix For: 2.5.0

 Attachments: YARN-2187.patch


 Say you have a small cluster with 8GB memory and 5 queues.  This means that 
 each queue can have 8GB / 5 = 1.6GB, but an AM requires 2GB to start, so no 
 AMs can be started.  By default, the max-AM-share check should be disabled so 
 users don't see a regression. On medium-sized clusters, it still makes sense 
 to set the max-AM-share to a value between 0 and 1. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2187) FairScheduler: Disable max-AM-share check by default

2014-06-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039843#comment-14039843
 ] 

Hudson commented on YARN-2187:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1781 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1781/])
YARN-2187. FairScheduler: Disable max-AM-share check by default. (Robert Kanter 
via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1604321)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm


 FairScheduler: Disable max-AM-share check by default
 

 Key: YARN-2187
 URL: https://issues.apache.org/jira/browse/YARN-2187
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Robert Kanter
Assignee: Robert Kanter
 Fix For: 2.5.0

 Attachments: YARN-2187.patch


 Say you have a small cluster with 8GB memory and 5 queues.  This means that 
 each queue can have 8GB / 5 = 1.6GB, but an AM requires 2GB to start, so no 
 AMs can be started.  By default, the max-AM-share check should be disabled so 
 users don't see a regression. On medium-sized clusters, it still makes sense 
 to set the max-AM-share to a value between 0 and 1. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2187) FairScheduler: Disable max-AM-share check by default

2014-06-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039867#comment-14039867
 ] 

Hudson commented on YARN-2187:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1808 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1808/])
YARN-2187. FairScheduler: Disable max-AM-share check by default. (Robert Kanter 
via kasha) (kasha: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1604321)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationFileLoaderService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/FairScheduler.apt.vm


 FairScheduler: Disable max-AM-share check by default
 

 Key: YARN-2187
 URL: https://issues.apache.org/jira/browse/YARN-2187
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.5.0
Reporter: Robert Kanter
Assignee: Robert Kanter
 Fix For: 2.5.0

 Attachments: YARN-2187.patch


 Say you have a small cluster with 8GB memory and 5 queues.  This means that 
 each queue can have 8GB / 5 = 1.6GB, but an AM requires 2GB to start, so no 
 AMs can be started.  By default, the max-AM-share check should be disabled so 
 users don't see a regression. On medium-sized clusters, it still makes sense 
 to set the max-AM-share to a value between 0 and 1. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2144) Add logs when preemption occurs

2014-06-21 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039934#comment-14039934
 ] 

Carlo Curino commented on YARN-2144:


I only skimmed the patch quickly, but I see several places where you are 
changing method signatures by adding booleans to communicate that the container 
was preempted. 
Would it be possible to use/extend some of the container state / event objects 
that are already passed around? It might be less intrusive, and if we ever get 
to different levels of preemption or anything like that, it would also be a 
more flexible mechanism.
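
One hedged sketch of the kind of reuse meant here (the helper is illustrative; 
ContainerExitStatus.PREEMPTED is an existing constant):
{code}
import org.apache.hadoop.yarn.api.records.ContainerExitStatus;
import org.apache.hadoop.yarn.api.records.ContainerStatus;

// Hedged sketch: read preemption off the container status that is
// already passed around, instead of threading extra boolean params.
static boolean wasPreempted(ContainerStatus status) {
  return status.getExitStatus() == ContainerExitStatus.PREEMPTED;
}
{code}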

 Add logs when preemption occurs
 ---

 Key: YARN-2144
 URL: https://issues.apache.org/jira/browse/YARN-2144
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 2.5.0
Reporter: Tassapol Athiapinya
Assignee: Wangda Tan
 Attachments: AM-page-preemption-info.png, YARN-2144.patch, 
 YARN-2144.patch, YARN-2144.patch, YARN-2144.patch, YARN-2144.patch


 There should be easy-to-read logs when preemption does occur. 
 RM logs should have the following properties:
 * Logs are retrievable while an application is still running, and are flushed often.
 * Can distinguish between AM container preemption and task container 
 preemption, with the container ID shown.
 * Should be INFO-level logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2190) Provide a Windows container executor that can limit memory and CPU

2014-06-21 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039973#comment-14039973
 ] 

Steve Loughran commented on YARN-2190:
--

# what would the implications of the move to the Windows 8 API be? What does it 
mean for server versions and builds?
# something is cutting off all the ASF copyright comments ... probably the IDE. 
That'll have the RAT tool complaining. It may be something to ignore during 
iterative development, but it would need to be fixed before committing
# would it be possible to have a command line like
{code}
task create --memory 2048 name command-line
{code}
so that new options could go in (--cpu, --io) without confusion ... the current 
approach looks a bit brittle


 Provide a Windows container executor that can limit memory and CPU
 --

 Key: YARN-2190
 URL: https://issues.apache.org/jira/browse/YARN-2190
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager
Reporter: Chuan Liu
 Attachments: YARN-2190-prototype.patch


 The YARN default container executor on Windows does not currently set resource 
 limits on the containers. The memory limit is enforced by a separate 
 monitoring thread. The container implementation on Windows uses Job Objects 
 right now. The latest Windows (8 or later) API allows CPU and memory limits 
 on job objects. We want to create a Windows container executor that sets 
 the limits on job objects, thus providing resource enforcement at the OS level.
 http://msdn.microsoft.com/en-us/library/windows/desktop/ms686216(v=vs.85).aspx



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived

2014-06-21 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039976#comment-14039976
 ] 

Steve Loughran commented on YARN-1039:
--

# I'd make the long-lived flag a container request, *not the AM launch 
request*. An AM may wish to indicate that some containers are short-lived, 
others long-lived. 
# If the tag approach lets my AM add this request while running with the 2.4 
JARs -even though the hint will be ignored- I'm happy. Protobuf may be agile, 
but the generated proto classes aren't; working with fields directly is hard 
to do, and introspection is brittle. I know that from working with the AM 
restart flag.
# Otherwise, I'd like a long64 with bits we can set and read. It's the 
cross-platform way and would give us a single field for future additions
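
A hedged sketch of the bit-field idea from point 3 (all names are hypothetical, 
not an existing YARN API):
{code}
// Hedged sketch: a single long64 flags field on the request, with one
// bit per hint so future additions don't need new fields.
public final class RequestFlags {
  public static final long LONG_LIVED = 1L << 0; // future hints: 1L << 1, ...

  public static boolean isLongLived(long flags) {
    return (flags & LONG_LIVED) != 0;
  }

  private RequestFlags() {}
}
{code}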

 Add parameter for YARN resource requests to indicate long lived
 -

 Key: YARN-1039
 URL: https://issues.apache.org/jira/browse/YARN-1039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
Priority: Minor
 Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch


 A container request could support a new parameter, long-lived. This could be 
 used by a scheduler that would know not to host the service on a transient 
 (cloud: spot-priced) node.
 Schedulers could also decide whether or not to allocate multiple long-lived 
 containers on the same node.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate long lived

2014-06-21 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040001#comment-14040001
 ] 

Zhijie Shen commented on YARN-1039:
---

bq. An AM may wish to indicate that some containers are short-lived, others 
long-lived.

A container-level long-lived flag is an interesting idea. Given that any 
container of an app is long-lived, the AM container is automatically going to 
be long-lived as well, right? Suppose the AM should last until the exit of the 
whole app. Shall we mark an app long-lived, and then allow a long-lived app to 
start long-lived containers?

bq. If the tag approach lets my AM add this request while running with the 2.4 
JARs even though the hint will be ignored I'm happy.

If the granularity is going to be the container, the tag may not help, as it's 
application-level information



 Add parameter for YARN resource requests to indicate long lived
 -

 Key: YARN-1039
 URL: https://issues.apache.org/jira/browse/YARN-1039
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 3.0.0, 2.1.1-beta
Reporter: Steve Loughran
Assignee: Craig Welch
Priority: Minor
 Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch


 A container request could support a new parameter, long-lived. This could be 
 used by a scheduler that would know not to host the service on a transient 
 (cloud: spot-priced) node.
 Schedulers could also decide whether or not to allocate multiple long-lived 
 containers on the same node.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-2144) Add logs when preemption occurs

2014-06-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040010#comment-14040010
 ] 

Wangda Tan commented on YARN-2144:
--

As suggested by [~jianhe], for the purpose of this JIRA, I will simply add logs 
to CapacityScheduler.killContainer().
[~curino], thanks for your comment. Since I may not need to change the event 
objects, per Jian's suggestion, I will do as you suggested when working on 
other items like YARN-2181
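
A hedged sketch of the kind of log line this could produce (the helper and its 
wiring into killContainer() are assumptions; only the container ID accessor is 
existing API):
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainer;

// Hedged sketch: one INFO line per preempted container, distinguishing
// AM containers from task containers, with the container ID shown.
class PreemptionLogger {
  private static final Log LOG = LogFactory.getLog(PreemptionLogger.class);

  static void logPreemption(RMContainer container, boolean isAMContainer) {
    LOG.info("Preempting " + (isAMContainer ? "AM" : "task")
        + " container " + container.getContainerId());
  }
}
{code}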

 Add logs when preemption occurs
 ---

 Key: YARN-2144
 URL: https://issues.apache.org/jira/browse/YARN-2144
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 2.5.0
Reporter: Tassapol Athiapinya
Assignee: Wangda Tan
 Attachments: AM-page-preemption-info.png, YARN-2144.patch, 
 YARN-2144.patch, YARN-2144.patch, YARN-2144.patch, YARN-2144.patch


 There should be easy-to-read logs when preemption does occur. 
 RM logs should have the following properties:
 * Logs are retrievable while an application is still running, and are flushed often.
 * Can distinguish between AM container preemption and task container 
 preemption, with the container ID shown.
 * Should be INFO-level logs.



--
This message was sent by Atlassian JIRA
(v6.2#6252)