[jira] [Commented] (YARN-370) CapacityScheduler app submission fails when min alloc size not multiple of AM size

2013-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572318#comment-13572318
 ] 

Hudson commented on YARN-370:
-

Integrated in Hadoop-Yarn-trunk #119 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/119/])
YARN-370. Fix SchedulerUtils to correctly round up the resource for 
containers. Contributed by Zhijie Shen. (Revision 1442840)

 Result = FAILURE
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442840
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java


 CapacityScheduler app submission fails when min alloc size not multiple of AM 
 size
 --

 Key: YARN-370
 URL: https://issues.apache.org/jira/browse/YARN-370
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.0.3-alpha
Reporter: Thomas Graves
Assignee: Zhijie Shen
Priority: Blocker
 Attachments: YARN-370-branch-2_1.patch, YARN-370-branch-2.patch


 I was running 2.0.3-SNAPSHOT with the capacity scheduler configured with a 
 minimum allocation size of 1G. The AM size was set to 1.5G. I didn't specify a 
 resource calculator, so it was using the DefaultResourceCalculator. The AM launch 
 failed with the error below:
 Application application_1359688216672_0001 failed 1 times due to Error 
 launching appattempt_1359688216672_0001_01. Got exception: RemoteTrace: 
 at LocalTrace: 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
 RemoteTrace: at LocalTrace: 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
 Unauthorized request to start container. Expected resource memory:2048, 
 vCores:1 but found memory:1536, vCores:1 at 
 org.apache.hadoop.yarn.factories.impl.pb.YarnRemoteExceptionFactoryPBImpl.createYarnRemoteException(YarnRemoteExceptionFactoryPBImpl.java:39)
  at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:47) at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.authorizeRequest(ContainerManagerImpl.java:383)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainer(ContainerManagerImpl.java:400)
  at 
 org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagerPBServiceImpl.startContainer(ContainerManagerPBServiceImpl.java:68)
  at 
 org.apache.hadoop.yarn.proto.ContainerManager$ContainerManagerService$2.callBlockingMethod(ContainerManager.java:83)
  at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1735) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1731) at 
 java.security.AccessController.doPrivileged(Native Method) at 
 javax.security.auth.Subject.doAs(Subject.java:415) at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1729) at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
  at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:525) at 
 org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
  at 
 org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
  at 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:123)
  at 
 org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:109)
  at 
 org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:111)
  at 
 org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:255)
  at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:722) . Failing the application. 
 It looks like the launchcontext for the app didn't have the resources rounded 
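 As an aside for readers following the fix: the normalization rounds the requested 
 memory up to the nearest multiple of the configured minimum allocation, which is 
 why a 1.5G AM request becomes 2G against a 1G minimum. A minimal sketch of that 
 rounding (illustrative only, not the actual SchedulerUtils code):
{code:java}
// Round a requested memory size up to the nearest multiple of the scheduler's
// minimum allocation (illustrative sketch, not the actual SchedulerUtils code).
public final class RoundUpSketch {
  static int roundUpMemory(int requestedMb, int minAllocMb) {
    return ((requestedMb + minAllocMb - 1) / minAllocMb) * minAllocMb;
  }

  public static void main(String[] args) {
    // 1536 MB (the 1.5G AM) against a 1024 MB minimum allocation -> 2048 MB
    System.out.println(roundUpMemory(1536, 1024));
  }
}
{code}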
 

[jira] [Commented] (YARN-370) CapacityScheduler app submission fails when min alloc size not multiple of AM size

2013-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572389#comment-13572389
 ] 

Hudson commented on YARN-370:
-

Integrated in Hadoop-Hdfs-trunk #1308 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1308/])
YARN-370. Fix SchedulerUtils to correctly round up the resource for 
containers. Contributed by Zhijie Shen. (Revision 1442840)

 Result = SUCCESS
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442840
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java


 CapacityScheduler app submission fails when min alloc size not multiple of AM 
 size
 --

 Key: YARN-370
 URL: https://issues.apache.org/jira/browse/YARN-370
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.0.3-alpha
Reporter: Thomas Graves
Assignee: Zhijie Shen
Priority: Blocker
 Attachments: YARN-370-branch-2_1.patch, YARN-370-branch-2.patch


 I was running 2.0.3-SNAPSHOT with the capacity scheduler configured with a 
 minimum allocation size of 1G. The AM size was set to 1.5G. I didn't specify a 
 resource calculator, so it was using the DefaultResourceCalculator. The AM launch 
 failed with the error below:
 Application application_1359688216672_0001 failed 1 times due to Error 
 launching appattempt_1359688216672_0001_01. Got exception: RemoteTrace: 
 at LocalTrace: 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
 RemoteTrace: at LocalTrace: 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
 Unauthorized request to start container. Expected resource memory:2048, 
 vCores:1 but found memory:1536, vCores:1 at 
 org.apache.hadoop.yarn.factories.impl.pb.YarnRemoteExceptionFactoryPBImpl.createYarnRemoteException(YarnRemoteExceptionFactoryPBImpl.java:39)
  at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:47) at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.authorizeRequest(ContainerManagerImpl.java:383)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainer(ContainerManagerImpl.java:400)
  at 
 org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagerPBServiceImpl.startContainer(ContainerManagerPBServiceImpl.java:68)
  at 
 org.apache.hadoop.yarn.proto.ContainerManager$ContainerManagerService$2.callBlockingMethod(ContainerManager.java:83)
  at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1735) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1731) at 
 java.security.AccessController.doPrivileged(Native Method) at 
 javax.security.auth.Subject.doAs(Subject.java:415) at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1729) at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
  at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:525) at 
 org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
  at 
 org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
  at 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:123)
  at 
 org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:109)
  at 
 org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:111)
  at 
 org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:255)
  at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:722) . Failing the application. 
 It looks like the launchcontext for the app didn't have the resources rounded 
 

[jira] [Commented] (YARN-370) CapacityScheduler app submission fails when min alloc size not multiple of AM size

2013-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572421#comment-13572421
 ] 

Hudson commented on YARN-370:
-

Integrated in Hadoop-Mapreduce-trunk #1336 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1336/])
YARN-370. Fix SchedulerUtils to correctly round up the resource for 
containers. Contributed by Zhijie Shen. (Revision 1442840)

 Result = SUCCESS
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442840
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulerUtils.java


 CapacityScheduler app submission fails when min alloc size not multiple of AM 
 size
 --

 Key: YARN-370
 URL: https://issues.apache.org/jira/browse/YARN-370
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.0.3-alpha
Reporter: Thomas Graves
Assignee: Zhijie Shen
Priority: Blocker
 Attachments: YARN-370-branch-2_1.patch, YARN-370-branch-2.patch


 I was running 2.0.3-SNAPSHOT with the capacity scheduler configured with a 
 minimum allocation size of 1G. The AM size was set to 1.5G. I didn't specify a 
 resource calculator, so it was using the DefaultResourceCalculator. The AM launch 
 failed with the error below:
 Application application_1359688216672_0001 failed 1 times due to Error 
 launching appattempt_1359688216672_0001_01. Got exception: RemoteTrace: 
 at LocalTrace: 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
 RemoteTrace: at LocalTrace: 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
 Unauthorized request to start container. Expected resource memory:2048, 
 vCores:1 but found memory:1536, vCores:1 at 
 org.apache.hadoop.yarn.factories.impl.pb.YarnRemoteExceptionFactoryPBImpl.createYarnRemoteException(YarnRemoteExceptionFactoryPBImpl.java:39)
  at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:47) at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.authorizeRequest(ContainerManagerImpl.java:383)
  at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.startContainer(ContainerManagerImpl.java:400)
  at 
 org.apache.hadoop.yarn.api.impl.pb.service.ContainerManagerPBServiceImpl.startContainer(ContainerManagerPBServiceImpl.java:68)
  at 
 org.apache.hadoop.yarn.proto.ContainerManager$ContainerManagerService$2.callBlockingMethod(ContainerManager.java:83)
  at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:454)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1014) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1735) at 
 org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1731) at 
 java.security.AccessController.doPrivileged(Native Method) at 
 javax.security.auth.Subject.doAs(Subject.java:415) at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1441)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1729) at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
  at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:525) at 
 org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
  at 
 org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
  at 
 org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:123)
  at 
 org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:109)
  at 
 org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:111)
  at 
 org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:255)
  at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
  at java.lang.Thread.run(Thread.java:722) . Failing the application. 
 It looks like the launchcontext for the app didn't have the resources 

[jira] [Updated] (YARN-382) SchedulerUtils improve way normalizeRequest sets the resource capabilities

2013-02-06 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated YARN-382:
---

Description: 
In YARN-370, we changed it from setting the capability to directly setting 
memory and cores:

-ask.setCapability(normalized);
+ask.getCapability().setMemory(normalized.getMemory());
+ask.getCapability().setVirtualCores(normalized.getVirtualCores());

We did this because it directly sets the values in the original resource 
object passed in when the AM gets allocated; without it, the AM doesn't get 
the resource normalized correctly in the submission context. See YARN-370 for 
more details.

I think we should find a better way of doing this long term: one, so we don't 
have to keep adding things there when new resources are added; two, because it's 
a bit confusing as to what it's doing and prone to someone accidentally breaking 
it in the future again. Something closer to what Arun suggested in YARN-370 
would be better, but we need to make sure all the places work and get some more 
testing on it before putting it in. 

  was:
In YARN-370, we changed it from setting the capability to directly setting 
memory and cores:

-ask.setCapability(normalized);
+ask.getCapability().setMemory(normalized.getMemory());
+ask.getCapability().setVirtualCores(normalized.getVirtualCores());

We did this because before it is directly setting the values in the original 
resource object passed in when the AM gets allocated and without it the AM 
doesn't get the resource normalized correctly in the submission context. See 
YARN-370 for more details.

I think we should find a better way of doing this long term, one so we don't 
have to keep adding things there when new resources are added, two because its 
a bit confusing as to what its doing and prone to someone accidentally breaking 
it in the future again.  Something closer to what Arun suggested in YARN-370 
would be better but we need to make sure all the places work and get some more 
testing on it before putting it in. 


 SchedulerUtils improve way normalizeRequest sets the resource capabilities
 --

 Key: YARN-382
 URL: https://issues.apache.org/jira/browse/YARN-382
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Thomas Graves

 In YARN-370, we changed it from setting the capability to directly setting 
 memory and cores:
 -ask.setCapability(normalized);
 +ask.getCapability().setMemory(normalized.getMemory());
 +ask.getCapability().setVirtualCores(normalized.getVirtualCores());
 We did this because it directly sets the values in the original 
 resource object passed in when the AM gets allocated; without it, the AM 
 doesn't get the resource normalized correctly in the submission context. See 
 YARN-370 for more details.
 I think we should find a better way of doing this long term: one, so we don't 
 have to keep adding things there when new resources are added; two, because 
 it's a bit confusing as to what it's doing and prone to someone accidentally 
 breaking it in the future again. Something closer to what Arun suggested in 
 YARN-370 would be better, but we need to make sure all the places work and get 
 some more testing on it before putting it in. 
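 To make the reasoning above concrete, here is a minimal sketch with simplified 
 stand-in types (not the YARN API): the same capability object is also held by 
 the submission context, so only an in-place update of its fields is visible to 
 every holder of the reference, whereas replacing the reference on the ask is not.
{code:java}
// Simplified stand-in types, not the YARN API: shows why mutating the existing
// capability object differs from replacing the reference on the ask.
class Resource {
  private int memory;
  private int vcores;
  Resource(int memory, int vcores) { this.memory = memory; this.vcores = vcores; }
  int getMemory() { return memory; }
  void setMemory(int m) { memory = m; }
  int getVirtualCores() { return vcores; }
  void setVirtualCores(int v) { vcores = v; }
}

class Ask {
  private Resource capability;
  Ask(Resource capability) { this.capability = capability; }
  Resource getCapability() { return capability; }
  void setCapability(Resource r) { capability = r; }
}

public class NormalizeSketch {
  public static void main(String[] args) {
    Resource original = new Resource(1536, 1); // also referenced by the submission context
    Ask ask = new Ask(original);
    Resource normalized = new Resource(2048, 1);

    // ask.setCapability(normalized) would leave 'original' at 1536 MB.
    // Setting the fields in place updates every holder of the reference:
    ask.getCapability().setMemory(normalized.getMemory());
    ask.getCapability().setVirtualCores(normalized.getVirtualCores());

    System.out.println(original.getMemory()); // prints 2048
  }
}
{code}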

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-3) Add support for CPU isolation/monitoring of containers

2013-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572493#comment-13572493
 ] 

Hudson commented on YARN-3:
---

Integrated in Hadoop-trunk-Commit #3329 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3329/])
YARN-3. Merged to branch-2. (Revision 1443011)

 Result = SUCCESS
acmurthy : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1443011
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt


 Add support for CPU isolation/monitoring of containers
 --

 Key: YARN-3
 URL: https://issues.apache.org/jira/browse/YARN-3
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun C Murthy
Assignee: Andrew Ferguson
 Fix For: 2.0.3-alpha

 Attachments: mapreduce-4334-design-doc.txt, 
 mapreduce-4334-design-doc-v2.txt, MAPREDUCE-4334-executor-v1.patch, 
 MAPREDUCE-4334-executor-v2.patch, MAPREDUCE-4334-executor-v3.patch, 
 MAPREDUCE-4334-executor-v4.patch, MAPREDUCE-4334-pre1.patch, 
 MAPREDUCE-4334-pre2.patch, MAPREDUCE-4334-pre2-with_cpu.patch, 
 MAPREDUCE-4334-pre3.patch, MAPREDUCE-4334-pre3-with_cpu.patch, 
 MAPREDUCE-4334-v1.patch, MAPREDUCE-4334-v2.patch, YARN-3-lce_only-v1.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-357) App submission should not be synchronized

2013-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572545#comment-13572545
 ] 

Hadoop QA commented on YARN-357:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12568245/YARN-357.branch-23.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/384//console

This message is automatically generated.

 App submission should not be synchronized
 -

 Key: YARN-357
 URL: https://issues.apache.org/jira/browse/YARN-357
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 0.23.3, 3.0.0, 2.0.0-alpha
Reporter: Daryn Sharp
Assignee: Daryn Sharp
 Attachments: YARN-357.branch-23.patch, YARN-357.patch, 
 YARN-357.patch, YARN-357.txt


 MAPREDUCE-2953 fixed a race condition with querying of app status by making 
 {{RMClientService#submitApplication}} synchronously invoke 
 {{RMAppManager#submitApplication}}. However, the {{synchronized}} keyword was 
 also added to {{RMAppManager#submitApplication}} with the comment:
 bq. I made the submitApplication synchronized to keep it consistent with the 
 other routines in RMAppManager although I do not believe it needs it since 
 the rmapp datastructure is already a concurrentMap and I don't see anything 
 else that would be an issue.
 It's been observed that app submission latency is being unnecessarily 
 impacted.
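 A minimal sketch of the point being made, with hypothetical names rather than 
 the RMAppManager code: when the shared state is already a ConcurrentMap, a 
 method-level lock serializes every submission without adding safety.
{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical names, not the RMAppManager code: contrasts a method-level lock
// with relying on the ConcurrentMap's own per-key atomicity.
class SubmissionSketch {
  private final ConcurrentMap<String, String> apps = new ConcurrentHashMap<String, String>();

  // Serialized: every caller waits on the same lock, adding submission latency.
  synchronized void submitSerialized(String appId, String app) {
    apps.put(appId, app);
  }

  // Unsynchronized: the ConcurrentMap already makes the per-key insert atomic.
  void submitConcurrent(String appId, String app) {
    apps.putIfAbsent(appId, app);
  }
}
{code}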

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-355) RM app submission jams under load

2013-02-06 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated YARN-355:
-

Attachment: YARN-355.patch
YARN-355.branch-23.patch

Thanks Sid.  Updated patches to set TSM service after client has started.

 RM app submission jams under load
 -

 Key: YARN-355
 URL: https://issues.apache.org/jira/browse/YARN-355
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.0-alpha, 0.23.6
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: YARN-355.branch-23.patch, YARN-355.branch-23.patch, 
 YARN-355.branch-23.patch, YARN-355.patch, YARN-355.patch, YARN-355.patch


 The RM performs a loopback connection to itself to renew its own tokens. If 
 app submissions consume all RPC handlers for {{ClientRMProtocol}}, then app 
 submissions block because the RM cannot loop back to itself to do the renewal.
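 The jam can be pictured as a pool exhausting itself: a toy illustration (not 
 YARN code) in which every worker in a fixed-size pool blocks on a task that 
 itself needs a worker from the same pool.
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Toy illustration, not YARN code: a fixed pool stands in for the RPC handlers,
// and the inner task stands in for the loopback token renewal.
public class LoopbackJamSketch {
  public static void main(String[] args) throws Exception {
    final ExecutorService handlers = Executors.newFixedThreadPool(2);
    for (int i = 0; i < 2; i++) {
      handlers.submit(new Runnable() {
        public void run() {
          // Each "submission" waits for a renewal that needs a free handler
          // from the same, already exhausted, pool.
          Future<?> renewal = handlers.submit(new Runnable() {
            public void run() { /* renew token */ }
          });
          try {
            renewal.get(2, TimeUnit.SECONDS);
          } catch (TimeoutException e) {
            System.out.println("renewal stuck: all handlers are busy");
          } catch (Exception ignored) {
          }
        }
      });
    }
    handlers.shutdown();
    handlers.awaitTermination(10, TimeUnit.SECONDS);
  }
}
{code}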

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-355) RM app submission jams under load

2013-02-06 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated YARN-355:
---

Target Version/s: 2.0.3-alpha, 0.23.7  (was: 2.0.3-alpha, 0.23.6)

 RM app submission jams under load
 -

 Key: YARN-355
 URL: https://issues.apache.org/jira/browse/YARN-355
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.0-alpha, 0.23.6
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
 Attachments: YARN-355.branch-23.patch, YARN-355.branch-23.patch, 
 YARN-355.branch-23.patch, YARN-355.patch, YARN-355.patch, YARN-355.patch


 The RM performs a loopback connection to itself to renew its own tokens. If 
 app submissions consume all RPC handlers for {{ClientRMProtocol}}, then app 
 submissions block because the RM cannot loop back to itself to do the renewal.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-383) AMRMClientImpl should handle null rmClient in stop()

2013-02-06 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-383:
-

Attachment: YARN-383.1.patch

Trivial patch.

 AMRMClientImpl should handle null rmClient in stop()
 

 Key: YARN-383
 URL: https://issues.apache.org/jira/browse/YARN-383
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
Priority: Minor
 Attachments: YARN-383.1.patch


 2013-02-06 09:31:33,813 INFO  [Thread-2] service.CompositeService 
 (CompositeService.java:stop(101)) - Error stopping 
 org.apache.hadoop.yarn.client.AMRMClientImpl
 org.apache.hadoop.HadoopIllegalArgumentException: Cannot close proxy since it 
 is null
 at org.apache.hadoop.ipc.RPC.stopProxy(RPC.java:605)
 at 
 org.apache.hadoop.yarn.client.AMRMClientImpl.stop(AMRMClientImpl.java:150)
 at 
 org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:99)
 at 
 org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:89)
 at 
 org.apache.hadoop.yarn.app.ampool.AMPoolAppMaster.stop(AMPoolAppMaster.java:171)
 at 
 org.apache.hadoop.yarn.app.ampool.AMPoolAppMaster$AMPoolAppMasterShutdownHook.run(AMPoolAppMaster.java:196)
 at 
 org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
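 The attached patch is described as trivial; the shape of the guard can be 
 sketched as follows (hypothetical field and types, not the actual 
 AMRMClientImpl code): skip closing the proxy when it was never created, so 
 stop() is safe on a client that never started.
{code:java}
// Hypothetical field and types, not the actual AMRMClientImpl code: only stop
// the proxy if it exists, avoiding the HadoopIllegalArgumentException above.
class StopGuardSketch {
  private Object rmClient; // stand-in for the RM proxy; null until start()

  public void stop() {
    if (rmClient != null) {
      // RPC.stopProxy(rmClient) would be called here; passing null is what
      // triggers "Cannot close proxy since it is null" in the log above.
      rmClient = null;
    }
  }
}
{code}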

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-40) Provide support for missing yarn commands

2013-02-06 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-40?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated YARN-40:
--

Fix Version/s: 0.23.7

 Provide support for missing yarn commands
 -

 Key: YARN-40
 URL: https://issues.apache.org/jira/browse/YARN-40
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.0.0-alpha
Reporter: Devaraj K
Assignee: Devaraj K
 Fix For: 2.0.3-alpha, 0.23.7

 Attachments: MAPREDUCE-4155-1.patch, MAPREDUCE-4155.patch, 
 YARN-40-1.patch, YARN-40-20120917.1.txt, YARN-40-20120917.txt, 
 YARN-40-20120924.txt, YARN-40-20121008.txt, YARN-40.patch


 1. status app-id
 2. kill app-id (Already issue present with Id : MAPREDUCE-3793)
 3. list-apps [all]
 4. nodes-report

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-249) Capacity Scheduler web page should show list of active users per queue like it used to (in 1.x)

2013-02-06 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated YARN-249:
--

Attachment: YARN-249.branch-0.23.patch

Updated patch for branch-0.23


 Capacity Scheduler web page should show list of active users per queue like 
 it used to (in 1.x)
 ---

 Key: YARN-249
 URL: https://issues.apache.org/jira/browse/YARN-249
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 2.0.2-alpha, 3.0.0, 0.23.5
Reporter: Ravi Prakash
Assignee: Ravi Prakash
  Labels: scheduler, web-ui
 Attachments: YARN-249.branch-0.23.patch, YARN-249.branch-0.23.patch, 
 YARN-249.branch-0.23.patch, YARN-249.branch-0.23.patch, YARN-249.patch, 
 YARN-249.patch, YARN-249.patch, YARN-249.patch, YARN-249.patch, 
 YARN-249.patch, YARN-249.patch, YARN-249.png


 On the jobtracker, the web UI showed the active users for each queue and how 
 many resources each of those users was using. That currently isn't being 
 displayed on the RM capacity scheduler web UI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-249) Capacity Scheduler web page should show list of active users per queue like it used to (in 1.x)

2013-02-06 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated YARN-249:
--

Attachment: YARN-249.patch

Updated patch for trunk


 Capacity Scheduler web page should show list of active users per queue like 
 it used to (in 1.x)
 ---

 Key: YARN-249
 URL: https://issues.apache.org/jira/browse/YARN-249
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 2.0.2-alpha, 3.0.0, 0.23.5
Reporter: Ravi Prakash
Assignee: Ravi Prakash
  Labels: scheduler, web-ui
 Attachments: YARN-249.branch-0.23.patch, YARN-249.branch-0.23.patch, 
 YARN-249.branch-0.23.patch, YARN-249.branch-0.23.patch, YARN-249.patch, 
 YARN-249.patch, YARN-249.patch, YARN-249.patch, YARN-249.patch, 
 YARN-249.patch, YARN-249.patch, YARN-249.png


 On the jobtracker, the web UI showed the active users for each queue and how 
 many resources each of those users was using. That currently isn't being 
 displayed on the RM capacity scheduler web UI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-355) RM app submission jams under load

2013-02-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572695#comment-13572695
 ] 

Hudson commented on YARN-355:
-

Integrated in Hadoop-trunk-Commit # (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit//])
YARN-355. Fixes a bug where RM app submission could jam under load. 
Contributed by Daryn Sharp. (Revision 1443131)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1443131
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/YarnClientImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/security/RMDelegationTokenRenewer.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/resources
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/client/RMDelegationTokenIdentifier.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/META-INF/services/org.apache.hadoop.security.token.TokenRenewer
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMTokens.java


 RM app submission jams under load
 -

 Key: YARN-355
 URL: https://issues.apache.org/jira/browse/YARN-355
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.0-alpha, 0.23.6
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Blocker
 Fix For: 2.0.3-alpha, 0.23.7

 Attachments: YARN-355.branch-23.patch, YARN-355.branch-23.patch, 
 YARN-355.branch-23.patch, YARN-355.patch, YARN-355.patch, YARN-355.patch


 The RM performs a loopback connection to itself to renew its own tokens. If 
 app submissions consume all RPC handlers for {{ClientRMProtocol}}, then app 
 submissions block because the RM cannot loop back to itself to do the renewal.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-150) AppRejectedTransition does not unregister app from master service and scheduler

2013-02-06 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated YARN-150:
---

Fix Version/s: 0.23.7

 AppRejectedTransition does not unregister app from master service and 
 scheduler
 ---

 Key: YARN-150
 URL: https://issues.apache.org/jira/browse/YARN-150
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 0.23.3, 3.0.0, 2.0.0-alpha
Reporter: Bikas Saha
Assignee: Bikas Saha
 Fix For: 2.0.3-alpha, 0.23.7

 Attachments: MAPREDUCE-4436.1.patch


 AttemptStartedTransition() adds the app to the ApplicationMasterService and 
 scheduler. When the scheduler rejects the app, AppRejectedTransition() 
 forgets to unregister it from the ApplicationMasterService.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-249) Capacity Scheduler web page should show list of active users per queue like it used to (in 1.x)

2013-02-06 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated YARN-249:
--

Attachment: YARN-249.branch-0.23.patch

Updated docs to include active and pending applications

 Capacity Scheduler web page should show list of active users per queue like 
 it used to (in 1.x)
 ---

 Key: YARN-249
 URL: https://issues.apache.org/jira/browse/YARN-249
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 2.0.2-alpha, 3.0.0, 0.23.5
Reporter: Ravi Prakash
Assignee: Ravi Prakash
  Labels: scheduler, web-ui
 Attachments: YARN-249.branch-0.23.patch, YARN-249.branch-0.23.patch, 
 YARN-249.branch-0.23.patch, YARN-249.branch-0.23.patch, 
 YARN-249.branch-0.23.patch, YARN-249.patch, YARN-249.patch, YARN-249.patch, 
 YARN-249.patch, YARN-249.patch, YARN-249.patch, YARN-249.patch, 
 YARN-249.patch, YARN-249.png


 On the jobtracker, the web UI showed the active users for each queue and how 
 many resources each of those users was using. That currently isn't being 
 displayed on the RM capacity scheduler web UI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-249) Capacity Scheduler web page should show list of active users per queue like it used to (in 1.x)

2013-02-06 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572709#comment-13572709
 ] 

Ravi Prakash commented on YARN-249:
---

Thanks a lot for your review, Tom! I've incorporated all your suggestions in 
these updated patches for branch-0.23 and trunk. I chose to put the % in a span 
element, so that when you mouse over it, it shows what that percentage is based on.


 Capacity Scheduler web page should show list of active users per queue like 
 it used to (in 1.x)
 ---

 Key: YARN-249
 URL: https://issues.apache.org/jira/browse/YARN-249
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 2.0.2-alpha, 3.0.0, 0.23.5
Reporter: Ravi Prakash
Assignee: Ravi Prakash
  Labels: scheduler, web-ui
 Attachments: YARN-249.branch-0.23.patch, YARN-249.branch-0.23.patch, 
 YARN-249.branch-0.23.patch, YARN-249.branch-0.23.patch, 
 YARN-249.branch-0.23.patch, YARN-249.patch, YARN-249.patch, YARN-249.patch, 
 YARN-249.patch, YARN-249.patch, YARN-249.patch, YARN-249.patch, 
 YARN-249.patch, YARN-249.png


 On the jobtracker, the web UI showed the active users for each queue and how 
 many resources each of those users was using. That currently isn't being 
 displayed on the RM capacity scheduler web UI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-249) Capacity Scheduler web page should show list of active users per queue like it used to (in 1.x)

2013-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572742#comment-13572742
 ] 

Hadoop QA commented on YARN-249:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12568277/YARN-249.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/388//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/388//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/388//console

This message is automatically generated.

 Capacity Scheduler web page should show list of active users per queue like 
 it used to (in 1.x)
 ---

 Key: YARN-249
 URL: https://issues.apache.org/jira/browse/YARN-249
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacityscheduler
Affects Versions: 2.0.2-alpha, 3.0.0, 0.23.5
Reporter: Ravi Prakash
Assignee: Ravi Prakash
  Labels: scheduler, web-ui
 Attachments: YARN-249.branch-0.23.patch, YARN-249.branch-0.23.patch, 
 YARN-249.branch-0.23.patch, YARN-249.branch-0.23.patch, 
 YARN-249.branch-0.23.patch, YARN-249.patch, YARN-249.patch, YARN-249.patch, 
 YARN-249.patch, YARN-249.patch, YARN-249.patch, YARN-249.patch, 
 YARN-249.patch, YARN-249.png


 On the jobtracker, the web UI showed the active users for each queue and how 
 many resources each of those users was using. That currently isn't being 
 displayed on the RM capacity scheduler web UI.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-383) AMRMClientImpl should handle null rmClient in stop()

2013-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572892#comment-13572892
 ] 

Hadoop QA commented on YARN-383:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12568299/YARN-383.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/389//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/389//console

This message is automatically generated.

 AMRMClientImpl should handle null rmClient in stop()
 

 Key: YARN-383
 URL: https://issues.apache.org/jira/browse/YARN-383
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah
Assignee: Hitesh Shah
Priority: Minor
 Attachments: YARN-383.1.patch, YARN-383.2.patch, YARN-383.3.patch


 2013-02-06 09:31:33,813 INFO  [Thread-2] service.CompositeService 
 (CompositeService.java:stop(101)) - Error stopping 
 org.apache.hadoop.yarn.client.AMRMClientImpl
 org.apache.hadoop.HadoopIllegalArgumentException: Cannot close proxy since it 
 is null
 at org.apache.hadoop.ipc.RPC.stopProxy(RPC.java:605)
 at 
 org.apache.hadoop.yarn.client.AMRMClientImpl.stop(AMRMClientImpl.java:150)
 at 
 org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:99)
 at 
 org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:89)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-385) ResourceRequestPBImpl's toString() is missing location and # containers

2013-02-06 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-385:
---

 Summary: ResourceRequestPBImpl's toString() is missing location 
and # containers
 Key: YARN-385
 URL: https://issues.apache.org/jira/browse/YARN-385
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza


ResourceRequestPBImpl's toString method includes priority and resource 
capability, but omits location and number of containers.
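A sketch of the kind of output being asked for (simplified fields, not the 
PBImpl code), covering all four pieces of information: priority, capability, 
location, and number of containers.
{code:java}
// Simplified fields, not the PBImpl code: a toString() that also reports the
// request's location (host/rack) and the number of containers.
class ResourceRequestSketch {
  int priority;
  String capability;   // e.g. "memory:1536, vCores:1"
  String hostName;     // the request's location
  int numContainers;

  @Override
  public String toString() {
    return "{Priority: " + priority
        + ", Capability: " + capability
        + ", Location: " + hostName
        + ", #Containers: " + numContainers + "}";
  }
}
{code}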

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (YARN-359) NodeManager container-related tests fail on branch-trunk-win

2013-02-06 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-359.
--

Resolution: Fixed

I just committed this to branch-trunk-win. Thanks Chris!

 NodeManager container-related tests fail on branch-trunk-win
 

 Key: YARN-359
 URL: https://issues.apache.org/jira/browse/YARN-359
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: trunk-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: YARN-359-branch-trunk-win.1.patch, 
 YARN-359-branch-trunk-win.2.patch


 On branch-trunk-win, there are test failures in {{TestContainerManager}}, 
 {{TestNodeManagerShutdown}}, {{TestContainerLaunch}}, and 
 {{TestContainersMonitor}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-359) NodeManager container-related tests fail on branch-trunk-win

2013-02-06 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573116#comment-13573116
 ] 

Bikas Saha commented on YARN-359:
-

The main reason for moving these to Shell was to reduce the number of places 
where OS specific forks happen in code and limit all such behavior to the Shell 
object that mainly performs OS dependent tasks that cannot be done in Java.

 NodeManager container-related tests fail on branch-trunk-win
 

 Key: YARN-359
 URL: https://issues.apache.org/jira/browse/YARN-359
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: trunk-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: YARN-359-branch-trunk-win.1.patch, 
 YARN-359-branch-trunk-win.2.patch


 On branch-trunk-win, there are test failures in {{TestContainerManager}}, 
 {{TestNodeManagerShutdown}}, {{TestContainerLaunch}}, and 
 {{TestContainersMonitor}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-20) More information for yarn.resourcemanager.webapp.address in yarn-default.xml

2013-02-06 Thread nemon lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-20?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nemon lou updated YARN-20:
--

Attachment: YARN-20.patch

Adding the annotation just as Harsh J said. Sorry for coming back so late. No test 
case is added since it's only a trivial documentation change.

 More information for yarn.resourcemanager.webapp.address in yarn-default.xml
 --

 Key: YARN-20
 URL: https://issues.apache.org/jira/browse/YARN-20
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.0.0-alpha
Reporter: nemon lou
Priority: Trivial
 Attachments: YARN-20.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

   The parameter yarn.resourcemanager.webapp.address in yarn-default.xml is 
 in host:port format, which is noted in the cluster setup guide 
 (http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/ClusterSetup.html).
   When I read through the code, I found that the host format is also supported. In 
 the host format, the port will be random.
   So we may add more documentation in yarn-default.xml to make this easier to understand.
   I will submit a patch if it's helpful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-111) Application level priority in Resource Manager Schedulers

2013-02-06 Thread nemon lou (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573142#comment-13573142
 ] 

nemon lou commented on YARN-111:


Finally I used two queues in the Capacity Scheduler to basically meet our needs.
Both queues have an Absolute Max Capacity of 100%. The queue with higher priority 
has more Absolute Capacity configured (85%).
Jobs which need high priority are submitted to the queue which has more 
Absolute Capacity configured.

 Application level priority in Resource Manager Schedulers
 -

 Key: YARN-111
 URL: https://issues.apache.org/jira/browse/YARN-111
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.1-alpha
Reporter: nemon lou

 We need application-level priority for Hadoop 2.0, both in the FIFO scheduler and 
 the Capacity Scheduler.
 In Hadoop 1.0.x, job priority is supported.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-362) Unexpected extra results when using the task attempt table search

2013-02-06 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated YARN-362:
--

Attachment: YARN-362.branch-0.23.patch

Thanks for the review Jason! I hadn't realized that I hadn't jsonified the 
attempts table. I'm doing so in this patch. I've also fixed the pollution of 
search results, along with some minor code improvements.

 Unexpected extra results when using the task attempt table search
 -

 Key: YARN-362
 URL: https://issues.apache.org/jira/browse/YARN-362
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.3-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Ravi Prakash
Priority: Minor
 Attachments: MAPREDUCE-4960.patch, YARN-362.branch-0.23.patch, 
 YARN-362.patch


 When using the search box on the web UI to search for a specific task number 
 (e.g.: 0831), sometimes unexpected extra results are shown.  Using the web 
 browser's built-in search-within-page does not show any hits, so these look 
 like completely spurious results.
 It looks like the raw timestamp value for time columns, which is not shown in 
 the table, is also being searched with the search box.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-362) Unexpected extra results when using the task attempt table search

2013-02-06 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash updated YARN-362:
--

Attachment: YARN-362.patch

The patch ported to trunk

 Unexpected extra results when using the task attempt table search
 -

 Key: YARN-362
 URL: https://issues.apache.org/jira/browse/YARN-362
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.3-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Ravi Prakash
Priority: Minor
 Attachments: MAPREDUCE-4960.patch, YARN-362.branch-0.23.patch, 
 YARN-362.patch


 When using the search box on the web UI to search for a specific task number 
 (e.g.: 0831), sometimes unexpected extra results are shown.  Using the web 
 browser's built-in search-within-page does not show any hits, so these look 
 like completely spurious results.
 It looks like the raw timestamp value for time columns, which is not shown in 
 the table, is also being searched with the search box.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-111) Application level priority in Resource Manager Schedulers

2013-02-06 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573157#comment-13573157
 ] 

Vinod Kumar Vavilapalli commented on YARN-111:
--

So, can we close this as won't fix?

Though it is a useful feature, it has many dangerous pitfalls as noted and 
clearly also has alternative means of achieving it.

 Application level priority in Resource Manager Schedulers
 -

 Key: YARN-111
 URL: https://issues.apache.org/jira/browse/YARN-111
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.1-alpha
Reporter: nemon lou

 We need application-level priority for Hadoop 2.0, both in the FIFO scheduler and 
 the Capacity Scheduler.
 In Hadoop 1.0.x, job priority is supported.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-374) Job History Server doesn't show jobs which killed by ClientRMProtocol.forceKillApplication

2013-02-06 Thread nemon lou (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573165#comment-13573165
 ] 

nemon lou commented on YARN-374:


Thanks for the information.
But why not have one more API like gracefullyKillApplication (or just change 
force kill's behavior)?
With this method, the RM will ask the AM to kill the app itself;
a force kill will be triggered if the AM hasn't killed itself within some period.
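A sketch of the flow being proposed here, with entirely hypothetical names: ask 
the AM to shut the application down itself and fall back to the existing force 
kill if it has not exited within a grace period.
{code:java}
// Entirely hypothetical names, sketching the commenter's proposal: cooperative
// shutdown first, forced kill only after a grace period expires.
interface AppMasterHandle {
  void requestShutdown(); // ask the AM to kill the app itself
  boolean hasExited();
  void forceKill();       // the existing forceKillApplication path
}

class GracefulKillSketch {
  void gracefullyKill(AppMasterHandle am, long gracePeriodMs) throws InterruptedException {
    am.requestShutdown();
    long deadline = System.currentTimeMillis() + gracePeriodMs;
    while (System.currentTimeMillis() < deadline) {
      if (am.hasExited()) {
        return; // the AM cleaned up on its own, so history is written normally
      }
      Thread.sleep(100);
    }
    am.forceKill(); // grace period expired
  }
}
{code}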

 Job History Server doesn't show jobs which killed by 
 ClientRMProtocol.forceKillApplication
 --

 Key: YARN-374
 URL: https://issues.apache.org/jira/browse/YARN-374
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client, resourcemanager
Affects Versions: 2.0.1-alpha
Reporter: nemon lou

 After I kill an app by typing bin/yarn rmadmin app -kill APP_ID,
 no job info is kept on the JHS web page.
 However, when I kill a job by typing bin/mapred job -kill JOB_ID,
 I can see a killed job left on the JHS.
 Some Hive users are confused that their jobs have been killed but nothing is left 
 on the JHS, and the killed app's info on the RM web page is not enough. (They kill 
 jobs via ClientRMProtocol.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-359) NodeManager container-related tests fail on branch-trunk-win

2013-02-06 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573166#comment-13573166
 ] 

Vinod Kumar Vavilapalli commented on YARN-359:
--

bq. The main reason for moving these to Shell was to reduce the number of 
places where OS specific forks happen in code and limit all such behavior to 
the Shell object that mainly performs OS dependent tasks that cannot be done in 
Java.
Sure, I found only one usage and didn't see these arguments otherwise, so I 
suggested moving it out. As I mentioned, if we already have other uses, we can 
promote it.

 NodeManager container-related tests fail on branch-trunk-win
 

 Key: YARN-359
 URL: https://issues.apache.org/jira/browse/YARN-359
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: trunk-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: YARN-359-branch-trunk-win.1.patch, 
 YARN-359-branch-trunk-win.2.patch


 On branch-trunk-win, there are test failures in {{TestContainerManager}}, 
 {{TestNodeManagerShutdown}}, {{TestContainerLaunch}}, and 
 {{TestContainersMonitor}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-362) Unexpected extra results when using the task attempt table search

2013-02-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573168#comment-13573168
 ] 

Hadoop QA commented on YARN-362:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12568368/YARN-362.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/391//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/391//console

This message is automatically generated.

 Unexpected extra results when using the task attempt table search
 -

 Key: YARN-362
 URL: https://issues.apache.org/jira/browse/YARN-362
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.3-alpha, 0.23.5
Reporter: Jason Lowe
Assignee: Ravi Prakash
Priority: Minor
 Attachments: MAPREDUCE-4960.patch, YARN-362.branch-0.23.patch, 
 YARN-362.patch


 When using the search box on the web UI to search for a specific task number 
 (e.g.: 0831), sometimes unexpected extra results are shown.  Using the web 
 browser's built-in search-within-page does not show any hits, so these look 
 like completely spurious results.
 It looks like the raw timestamp value for time columns, which is not shown in 
 the table, is also being searched with the search box.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-236) RM should point tracking URL to RM web page when app fails to start

2013-02-06 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573184#comment-13573184
 ] 

Vinod Kumar Vavilapalli commented on YARN-236:
--

I agree that the null check can be hit before the app starts, which is where 
redirecting is useful.

But for crashing AMs: shouldn't YARN-165 have already fixed the original 
tracking URL to point to the RM web page in that case? I just checked the 
patch and it seems so.

 RM should point tracking URL to RM web page when app fails to start
 ---

 Key: YARN-236
 URL: https://issues.apache.org/jira/browse/YARN-236
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 0.23.4
Reporter: Jason Lowe
Assignee: Jason Lowe
 Attachments: YARN-236.patch


 Similar to YARN-165, the RM should redirect the tracking URL to the specific 
 app page on the RM web UI when the application fails to start.  For example, 
 if the AM completely fails to start due to a bad AM config or a bad job config 
 like an invalid queue name, then the user gets the unhelpful message "The 
 requested application exited before setting a tracking URL."
 Usually the diagnostic string on the RM app page has something useful, so we 
 might as well point there.
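
To make the proposed behavior concrete, here is a minimal sketch of the 
fallback idea (illustrative only, not the attached patch; the class, method, 
and parameter names are assumptions): if an attempt never reported a tracking 
URL, point users at the RM's own app page, where the diagnostics live.

{code:java}
// Illustrative sketch only; not the attached YARN-236 patch.
public final class TrackingUrlFallback {
  private TrackingUrlFallback() {}

  /** Returns the reported tracking URL, or the RM app page if none was set. */
  public static String resolve(String reportedTrackingUrl,
      String rmWebAppAddress, String applicationId) {
    if (reportedTrackingUrl == null || reportedTrackingUrl.trim().isEmpty()) {
      // e.g. http://<rm-address>/cluster/app/<application id>
      return "http://" + rmWebAppAddress + "/cluster/app/" + applicationId;
    }
    return reportedTrackingUrl;
  }
}
{code}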

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-359) NodeManager container-related tests fail on branch-trunk-win

2013-02-06 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573183#comment-13573183
 ] 

Chris Nauroth commented on YARN-359:


Thanks for the commit.

Sorry, Bikas.  I had forgotten the earlier discussion on YARN-233 when we chose 
to place these methods in Shell, so I forgot to point this out to Vinod during 
his review of this patch.  We don't currently have other uses for these 
methods.  However, a potential argument for moving them back to Shell is that 
if a need arises, then developers are far more likely to look in Shell for a 
utility method than to remember to promote something out of the nodemanager 
codebase.

I'd be happy to do more refactoring if you want to discuss further.

 NodeManager container-related tests fail on branch-trunk-win
 

 Key: YARN-359
 URL: https://issues.apache.org/jira/browse/YARN-359
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: trunk-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Attachments: YARN-359-branch-trunk-win.1.patch, 
 YARN-359-branch-trunk-win.2.patch


 On branch-trunk-win, there are test failures in {{TestContainerManager}}, 
 {{TestNodeManagerShutdown}}, {{TestContainerLaunch}}, and 
 {{TestContainersMonitor}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-209) Capacity scheduler can leave application in pending state

2013-02-06 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573188#comment-13573188
 ] 

Vinod Kumar Vavilapalli commented on YARN-209:
--

Haven't looked at the code yet, trying to understand the scenario.

So, in other words, if an application gets submitted to the RM before any NM 
has registered, the application will be stuck in the pending state. Right?

If so, we can write a test for exactly that.
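
A rough sketch of such a test follows (illustrative only, not the attached 
YARN-209-test.patch). It assumes the MockRM/MockNM helpers and the 
waitForState signature used by the existing resourcemanager tests; the exact 
state to assert depends on the scheduler's activation path.

{code:java}
// Rough sketch, assuming the MockRM/MockNM test helpers behave as in the
// existing resourcemanager tests; not the attached test patch.
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.resourcemanager.MockNM;
import org.apache.hadoop.yarn.server.resourcemanager.MockRM;
import org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMApp;
import org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppState;

public class TestAppActivationAfterNodeAddition {

  public void testAppSubmittedBeforeAnyNodeRegisters() throws Exception {
    MockRM rm = new MockRM(new YarnConfiguration());
    rm.start();
    try {
      // Submit before any NM has registered: no cluster resources yet,
      // so the application cannot be activated and stays pending.
      RMApp app = rm.submitApp(1024);

      // Add capacity afterwards; the application should now leave the
      // pending state instead of being stuck there forever.
      MockNM nm = rm.registerNode("host1:1234", 4 * 1024);
      nm.nodeHeartbeat(true);

      // ACCEPTED is used here as a stand-in for "no longer pending"; the
      // precise state to assert depends on the scheduler's activation path.
      rm.waitForState(app.getApplicationId(), RMAppState.ACCEPTED);
    } finally {
      rm.stop();
    }
  }
}
{code}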

 Capacity scheduler can leave application in pending state
 -

 Key: YARN-209
 URL: https://issues.apache.org/jira/browse/YARN-209
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
 Fix For: 3.0.0

 Attachments: YARN-209.1.patch, YARN-209-test.patch


 Say application A is submitted but at that time does not meet the bar for 
 activation because of resource limit settings for applications. If more 
 hardware is then added to the system and the application becomes valid, it 
 still remains in the pending state, likely forever.
 This might be rare to hit in real life because enough NMs heartbeat to the 
 RM before applications can get submitted. But a change in settings or in the 
 heartbeat interval might make it easier to repro. In RM restart scenarios, 
 this will likely hit more often if restart is implemented by re-playing 
 events and re-submitting applications to the scheduler before the RPC to the 
 NMs is activated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-387) Fix inconsistent protocol naming

2013-02-06 Thread Vinod Kumar Vavilapalli (JIRA)
Vinod Kumar Vavilapalli created YARN-387:


 Summary: Fix inconsistent protocol naming
 Key: YARN-387
 URL: https://issues.apache.org/jira/browse/YARN-387
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli


We now have different and inconsistent naming schemes for the various protocols. 
Such naming is hard to explain to users, mainly in direct interactions at 
talks/presentations and user group meetings.

We should fix these names before we go beta.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-387) Fix inconsistent protocol naming

2013-02-06 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-387:
-

Labels: incompatible  (was: )

This is going to be an incompatible change for existing users of the alpha 
releases.

 Fix inconsistent protocol naming
 

 Key: YARN-387
 URL: https://issues.apache.org/jira/browse/YARN-387
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
  Labels: incompatible

 We now have different and inconsistent naming schemes for the various 
 protocols. Such naming is hard to explain to users, mainly in direct 
 interactions at talks/presentations and user group meetings.
 We should fix these names before we go beta.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-387) Fix inconsistent protocol naming

2013-02-06 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573206#comment-13573206
 ] 

Vinod Kumar Vavilapalli commented on YARN-387:
--

I propose we do the following conversions:

Main protocols:
 - client_RM_protocol.proto -> client_rm_protocol.proto
 - AM_RM_protocol.proto -> am_rm_protocol.proto
 - container_manager.proto -> am_nm_protocol.proto
 - ResourceTracker.proto -> rm_nm_protocol.proto
 - LocalizationProtocol.proto -> nm_localizer_protocol.proto
 - RMAdminProtocol.proto -> rm_admin_protocol.proto

Misc:
 - yarnprototunnelrpc.proto -> yarn_rpc_tunnel_protos.proto

In addition, we should
 - similarly rename all the Java API classes backing the above protocols
 - add comments to all the proto files describing what they do and can contain.

Thoughts?

 Fix inconsistent protocol naming
 

 Key: YARN-387
 URL: https://issues.apache.org/jira/browse/YARN-387
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli

 We now have different and inconsistent naming schemes for the various 
 protocols. Such naming is hard to explain to users, mainly in direct 
 interactions at talks/presentations and user group meetings.
 We should fix these names before we go beta.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-387) Fix inconsistent protocol naming

2013-02-06 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573234#comment-13573234
 ] 

Sandy Ryza commented on YARN-387:
-

+1 to the proposal

If I understand the RMAdminProtocol correctly, RPCs are sent to the RM?  Would 
it make sense to call it the AdminRMProtocol to reflect this in line with the 
ordering in the other protocols?

I think it would also be helpful to add to and go over the comments for the Java 
protocol classes, as that is the first place many developers will go when 
trying to understand how YARN works and how to program against it. I'm not sure 
whether that's in the scope of this JIRA or not.

 Fix inconsistent protocol naming
 

 Key: YARN-387
 URL: https://issues.apache.org/jira/browse/YARN-387
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
  Labels: incompatible

 We now have different and inconsistent naming schemes for the various 
 protocols. Such naming is hard to explain to users, mainly in direct 
 interactions at talks/presentations and user group meetings.
 We should fix these names before we go beta.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-374) Job History Server doesn't show jobs which killed by ClientRMProtocol.forceKillApplication

2013-02-06 Thread nemon lou (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573238#comment-13573238
 ] 

nemon lou commented on YARN-374:


Agree that YARN-321 will help.

 Job History Server doesn't show jobs which killed by 
 ClientRMProtocol.forceKillApplication
 --

 Key: YARN-374
 URL: https://issues.apache.org/jira/browse/YARN-374
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client, resourcemanager
Affects Versions: 2.0.1-alpha
Reporter: nemon lou

 After I kill an app by typing bin/yarn rmadmin app -kill APP_ID,
 no job info is kept on the JHS web page.
 However, when I kill a job by typing bin/mapred job -kill JOB_ID,
 I can see the killed job left on the JHS.
 Some Hive users are confused that their jobs have been killed but nothing is 
 left on the JHS, and the killed app's info on the RM web page is not enough. 
 (They kill jobs via ClientRMProtocol.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler

2013-02-06 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573244#comment-13573244
 ] 

Siddharth Seth commented on YARN-365:
-

This isn't very different from configuring all nodes to have a higher heartbeat 
interval. With a high heartbeat interval, the NM would send a batch of updates 
over to the RM, and this heartbeat would trigger a scheduling pass.

This change de-links RM scheduling passes from NM heartbeats. The NM can 
continue to provide node updates with a smaller interval, and the RM handles 
these, along with a scheduling pass, as and when it chooses to. In this 
particular case, the scheduler queue ends up with a single scheduling event per 
node - but will attempt a scheduling run only on the next heartbeat from that 
node. At a later point, the scheduling could be changed to be triggered by the 
arrival of a new application - or to just run in a tight loop.

If the scheduler cannot keep up, it ends up scheduling as fast as it can - 
without node heartbeats affecting the queue size. Also, completed container 
information from heartbeats is processed earlier (instead of waiting for the 
event in the queue to be processed) - making each scheduler pass more efficient.
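
As a minimal sketch of the coalescing behavior described above (illustrative 
only; the class and method names are not the actual RM scheduler code), 
repeated heartbeats from one node collapse into a single pending update, and a 
scheduling pass drains each node at most once:

{code:java}
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only; not the actual RM scheduler code.
public class NodeUpdateCoalescer {
  // Nodes with at least one unprocessed heartbeat since the last pass.
  private final Set<String> dirtyNodes = new LinkedHashSet<String>();

  /** Called on every NM heartbeat; repeated heartbeats from the same node
   *  collapse into one pending update instead of growing the queue. */
  public synchronized void onNodeHeartbeat(String nodeId) {
    dirtyNodes.add(nodeId);
  }

  /** Called when the scheduler decides to run a pass; each node is handed
   *  over at most once per pass, however many heartbeats it sent. */
  public synchronized List<String> drainForSchedulingPass() {
    List<String> batch = new ArrayList<String>(dirtyNodes);
    dirtyNodes.clear();
    return batch;
  }
}
{code}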

bq. I can see cases where the all at once is actually worse as it will spend 
more time on a single heartbeat and potentially not get to other things in the 
queue like apps added as fast. 
The event should not be delayed more than the time required to complete one 
scheduling pass across all nodes. I don't think this will be much better in the 
case of a growing scheduler queue.

bq. The only way I can see this being beneficial is if we can aggregate the 
heartbeats and have the scheduler process less.
Do you mean somehow aggregating heartbeats across nodes? This approach does 
aggregate heartbeats for a single node.

 Each NM heartbeat should not generate an event for the Scheduler
 -

 Key: YARN-365
 URL: https://issues.apache.org/jira/browse/YARN-365
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager, scheduler
Affects Versions: 0.23.5
Reporter: Siddharth Seth
Assignee: Xuan Gong
 Attachments: Prototype2.txt, Prototype3.txt, YARN-365.1.patch, 
 YARN-365.2.patch, YARN-365.3.patch


 Follow up from YARN-275
 https://issues.apache.org/jira/secure/attachment/12567075/Prototype.txt

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-387) Fix inconsistent protocol naming

2013-02-06 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573273#comment-13573273
 ] 

Karthik Kambatla commented on YARN-387:
---

Good idea!

 Fix inconsistent protocol naming
 

 Key: YARN-387
 URL: https://issues.apache.org/jira/browse/YARN-387
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
  Labels: incompatible

 We now have different and inconsistent naming schemes for the various 
 protocols. Such naming is hard to explain to users, mainly in direct 
 interactions at talks/presentations and user group meetings.
 We should fix these names before we go beta.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira