[jira] [Updated] (YARN-556) RM Restart phase 2 - Work preserving restart

2013-04-25 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated YARN-556:


Summary: RM Restart phase 2 - Work preserving restart  (was: RM Restart 
phase 2 - Design for work preserving restart)

> RM Restart phase 2 - Work preserving restart
> 
>
> Key: YARN-556
> URL: https://issues.apache.org/jira/browse/YARN-556
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Bikas Saha
>Assignee: Bikas Saha
>  Labels: gsoc2013
>
> The basic idea is already documented on YARN-128. This will describe further 
> details.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642610#comment-13642610
 ] 

Vinod Kumar Vavilapalli commented on YARN-562:
--

Jian, the patch isn't applying on branch-2. Can you please generate a patch for 
branch-2, run all the tests and upload it? Tx.

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.10.patch, YARN-562.11.patch, YARN-562.12.patch, 
> YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch, YARN-562.4.patch, 
> YARN-562.5.patch, YARN-562.6.patch, YARN-562.7.patch, YARN-562.8.patch, 
> YARN-562.9.patch
>
>
> It's possible that after the RM shuts down, but before the AM goes down, the AM 
> still calls startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether this container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.
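
For illustration only, a minimal sketch of the kind of check being proposed, 
assuming the NM can read the identifier of the RM that allocated a container from 
the launch request. The standalone class, method name and accessor below are 
hypothetical stand-ins, not the actual patch (which touches ContainerManagerImpl 
and adds its own InvalidContainerException):

{code:java}
// Hypothetical sketch of the NM-side check, not the actual YARN-562 patch.
public class PreviousRMContainerCheck {

  /** Stand-in for the InvalidContainerException added by the real patch. */
  public static class InvalidContainerException extends Exception {
    public InvalidContainerException(String msg) { super(msg); }
  }

  private final long currentRMIdentifier;

  public PreviousRMContainerCheck(long currentRMIdentifier) {
    this.currentRMIdentifier = currentRMIdentifier;
  }

  // Reject startContainer requests whose container carries a stale RM identifier.
  public void validate(long containerRMIdentifier, String containerId)
      throws InvalidContainerException {
    if (containerRMIdentifier != currentRMIdentifier) {
      throw new InvalidContainerException("Container " + containerId
          + " was allocated by a previous ResourceManager; rejecting launch");
    }
  }
}
{code}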

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642604#comment-13642604
 ] 

Hudson commented on YARN-562:
-

Integrated in Hadoop-trunk-Commit #3669 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3669/])
YARN-562. Missed files from previous commit. (Revision 1476038)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1476038
Files : 
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/ResourceManagerConstants.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/InvalidContainerException.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/NMNotYetReadyException.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationmasterservice
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/applicationmasterservice/TestApplicationMasterService.java


> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.10.patch, YARN-562.11.patch, YARN-562.12.patch, 
> YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch, YARN-562.4.patch, 
> YARN-562.5.patch, YARN-562.6.patch, YARN-562.7.patch, YARN-562.8.patch, 
> YARN-562.9.patch
>
>
> It's possible that after the RM shuts down, but before the AM goes down, the AM 
> still calls startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether this container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-25 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642603#comment-13642603
 ] 

Tsuyoshi OZAWA commented on YARN-562:
-

This build failure can be fixed by using the attached patch in YARN-619.

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.10.patch, YARN-562.11.patch, YARN-562.12.patch, 
> YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch, YARN-562.4.patch, 
> YARN-562.5.patch, YARN-562.6.patch, YARN-562.7.patch, YARN-562.8.patch, 
> YARN-562.9.patch
>
>
> It's possible that after the RM shuts down, but before the AM goes down, the AM 
> still calls startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether this container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-619) Build break due to changes in YARN-562

2013-04-25 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642600#comment-13642600
 ] 

Tsuyoshi OZAWA commented on YARN-619:
-

NP!

> Build break due to changes in YARN-562
> --
>
> Key: YARN-619
> URL: https://issues.apache.org/jira/browse/YARN-619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Tsuyoshi OZAWA
>Assignee: Vinod Kumar Vavilapalli
> Attachments: YARN-619.patch
>
>
> YARN-562 breaks the build in trunk because ResourceManagerConstants, 
> NMNotYetReadyException, and InvalidContainerException are not included.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-619) Build break due to changes in YARN-562

2013-04-25 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-619:


Description: YARN-562 breaks the build in trunk because ResourceManagerConstants, 
NMNotYetReadyException, and InvalidContainerException are not included.  (was: 
YARN-562 breaks the build in trunk because ResourceManagerConstants is not 
included.)

> Build break due to changes in YARN-562
> --
>
> Key: YARN-619
> URL: https://issues.apache.org/jira/browse/YARN-619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Tsuyoshi OZAWA
>Assignee: Vinod Kumar Vavilapalli
> Attachments: YARN-619.patch
>
>
> YARN-562 breaks the build in trunk because ResourceManagerConstants, 
> NMNotYetReadyException, and InvalidContainerException are not included.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-619) Build break due to changes in YARN-562

2013-04-25 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-619:


Attachment: YARN-619.patch

Fixed build by adding forgotten files.

> Build break due to changes in YARN-562
> --
>
> Key: YARN-619
> URL: https://issues.apache.org/jira/browse/YARN-619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Tsuyoshi OZAWA
>Assignee: Vinod Kumar Vavilapalli
> Attachments: YARN-619.patch
>
>
> YARN-562 breaks the build in trunk because ResourceManagerConstants is not 
> included.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (YARN-619) Build break due to changes in YARN-562

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-619.
--

Resolution: Fixed
  Assignee: Vinod Kumar Vavilapalli

Okay, I originally missed some new files in the commit. I just put them in. 
Thanks for reporting!

> Build break due to changes in YARN-562
> --
>
> Key: YARN-619
> URL: https://issues.apache.org/jira/browse/YARN-619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Tsuyoshi OZAWA
>Assignee: Vinod Kumar Vavilapalli
>
> YARN-562 breaks the build in trunk because ResourceManagerConstants is not 
> included.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-619) Build break due to changes in YARN-562

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642592#comment-13642592
 ] 

Vinod Kumar Vavilapalli commented on YARN-619:
--

I am looking at this. I committed the trunk changes a little while ago. Let me 
look at it and revert it if I can reproduce the build issues.

> Build break due to changes in YARN-562
> --
>
> Key: YARN-619
> URL: https://issues.apache.org/jira/browse/YARN-619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Tsuyoshi OZAWA
>
> YARN-562 breaks the build in trunk because ResourceManagerConstants is not 
> included.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-619) Build break due to changes in YARN-562

2013-04-25 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-619:


Description: YARN-562 breaks the build in trunk because ResourceManagerConstants 
is not included.  (was: YARN-562 doesn't take into account the changes to 
ResourceManagerConstants.RM_INVALID_IDENTIFIER. This breaks the build in trunk.)

> Build break due to changes in YARN-562
> --
>
> Key: YARN-619
> URL: https://issues.apache.org/jira/browse/YARN-619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Tsuyoshi OZAWA
>
> YARN-562 breaks the build in trunk because ResourceManagerConstants is not 
> included.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-619) Build break due to changes in YARN-562

2013-04-25 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-619:


Summary: Build break due to changes in YARN-562  (was: Build break due to 
changes in YARN related to ApplicationConstants.Environment.)

> Build break due to changes in YARN-562
> --
>
> Key: YARN-619
> URL: https://issues.apache.org/jira/browse/YARN-619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Tsuyoshi OZAWA
>
> YARN-562 doesn't take into account the changes to 
> ResourceManagerConstants.RM_INVALID_IDENTIFIER.
> This breaks the build in trunk.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-619) Build break due to changes in YARN related to ApplicationConstants.Environment.

2013-04-25 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-619:


Description: 
YARN-562 doesn't take into account the changes to 
ResourceManagerConstants.RM_INVALID_IDENTIFIER.
This breaks the build in trunk.

  was:
YARN-562 doesn't take into account the changes to ApplicationConstants.Environment.
This breaks the build in trunk.


> Build break due to changes in YARN related to 
> ApplicationConstants.Environment.
> ---
>
> Key: YARN-619
> URL: https://issues.apache.org/jira/browse/YARN-619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Tsuyoshi OZAWA
>
> YARN-562 doesn't take into account the changes to 
> ResourceManagerConstants.RM_INVALID_IDENTIFIER.
> This breaks the build in trunk.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-619) Build break due to changes in YARN related to ApplicationConstants.Environment.

2013-04-25 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-619:


Component/s: nodemanager

> Build break due to changes in YARN related to 
> ApplicationConstants.Environment.
> ---
>
> Key: YARN-619
> URL: https://issues.apache.org/jira/browse/YARN-619
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Tsuyoshi OZAWA
>
> YARN-562 doesn't take into account the changes to 
> ApplicationConstants.Environment.
> This breaks the build in trunk.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-619) Build break due to changes in YARN related to ApplicationConstants.Environment.

2013-04-25 Thread Tsuyoshi OZAWA (JIRA)
Tsuyoshi OZAWA created YARN-619:
---

 Summary: Build break due to changes in YARN related to 
ApplicationConstants.Environment.
 Key: YARN-619
 URL: https://issues.apache.org/jira/browse/YARN-619
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Tsuyoshi OZAWA


YARN-562 doesn't take into account the changes to ApplicationConstants.Environment.
This breaks the build in trunk.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642563#comment-13642563
 ] 

Hudson commented on YARN-562:
-

Integrated in Hadoop-trunk-Commit #3668 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3668/])
YARN-562. Modified NM to reject any containers allocated by a previous 
ResourceManager. Contributed by Jian He.
MAPREDUCE-5167. Update MR App after YARN-562 to use the new builder API for the 
container. Contributed by Jian He. (Revision 1476034)

 Result = FAILURE
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1476034
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRApp.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/test/java/org/apache/hadoop/mapreduce/v2/app/MRAppBenchmark.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/Container.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ContainerPBImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestAMRMClientAsync.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/BuilderUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestContainerLaunchRPC.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestRPC.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerResponse.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerResponsePBImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/Context.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdater.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/DummyContainerManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerManagerWithLCE.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerShutdown.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn

[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642554#comment-13642554
 ] 

Vinod Kumar Vavilapalli commented on YARN-562:
--

The latest patch looks good to me, +1. 

Agreed about setting it to, say, a -ve number. +1 for doing it via YARN-618; this 
one has become big enough already.

Thanks for the testing too. I am checking this in.

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.10.patch, YARN-562.11.patch, YARN-562.12.patch, 
> YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch, YARN-562.4.patch, 
> YARN-562.5.patch, YARN-562.6.patch, YARN-562.7.patch, YARN-562.8.patch, 
> YARN-562.9.patch
>
>
> It's possible that after the RM shuts down, but before the AM goes down, the AM 
> still calls startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether this container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-578) NodeManager should use SecureIOUtils for serving logs and intermediate outputs

2013-04-25 Thread Omkar Vinit Joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642524#comment-13642524
 ] 

Omkar Vinit Joshi commented on YARN-578:


There are 3 issues related to symlink attacks in serving logs and the ShuffleService:
* Index file (file.out.index) :- [Location - SpillRecord.SpillRecord() - 
FSDataInputStream] Here we read directly from the file.out.index file, so the 
ShuffleHandler may end up reading files owned by the yarn user or yarn group 
(the NodeManager runs as yarn:yarn).
* Map output file (file.out) :- [Location - ShuffleHandler.sendMapOutput() - 
RandomAccessFile] Here too we access the file.out file directly.
* Container logs :- [Location - ContainerLogsPage.printLogs() - FileInputStream] 
Here we access the container logs directly as the yarn:yarn user.

At present SecureIOUtils supports only FileInputStream, so I am adding support 
for 2 more streams: FSDataInputStream (required if you want a stream that is 
position-readable or seekable) and RandomAccessFile. Filing a separate JIRA for 
this: HADOOP-9511.
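
As a rough illustration of the SecureIOUtils approach described above (not the 
actual patch; the wrapper class, owner lookup and null group argument are 
assumptions), serving a container log could go through SecureIOUtils.openForRead 
instead of opening the file directly:

{code:java}
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import org.apache.hadoop.io.SecureIOUtils;

// Hypothetical helper: open a container log for reading only if the file is
// really owned by the application owner, so a planted symlink cannot make the
// NM (running as yarn:yarn) leak someone else's file.
public class ContainerLogReader {
  public static FileInputStream openLogForUser(File logFile, String appOwner)
      throws IOException {
    return SecureIOUtils.openForRead(logFile, appOwner, null);
  }
}
{code}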

> NodeManager should use SecureIOUtils for serving logs and intermediate outputs
> --
>
> Key: YARN-578
> URL: https://issues.apache.org/jira/browse/YARN-578
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Omkar Vinit Joshi
>
> Log servlets for serving logs and the ShuffleService for serving intermediate 
> outputs should both use SecureIOUtils to avoid symlink attacks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Closed] (YARN-429) capacity-scheduler config missing from yarn-test artifact

2013-04-25 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy closed YARN-429.
--


> capacity-scheduler config missing from yarn-test artifact
> -
>
> Key: YARN-429
> URL: https://issues.apache.org/jira/browse/YARN-429
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.3-alpha
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Blocker
> Fix For: 2.0.4-alpha
>
> Attachments: YARN-429.txt
>
>
> MiniYARNCluster and MiniMRCluster are unusable by downstream projects with 
> the 2.0.3-alpha release, since the capacity-scheduler configuration is 
> missing from the test artifact.
> hadoop-yarn-server-tests-3.0.0-SNAPSHOT-tests.jar should include the default 
> capacity-scheduler configuration. Also, this doesn't need to be part of the 
> default classpath - and should be moved out of the top level directory in the 
> dist package.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Closed] (YARN-449) HBase test failures when running against Hadoop 2

2013-04-25 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy closed YARN-449.
--


> HBase test failures when running against Hadoop 2
> -
>
> Key: YARN-449
> URL: https://issues.apache.org/jira/browse/YARN-449
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.0.3-alpha
>Reporter: Siddharth Seth
>Priority: Blocker
> Fix For: 2.0.4-alpha
>
> Attachments: 7904-v5.txt, hbase-7904-v3.txt, 
> hbase-TestHFileOutputFormat-wip.txt, hbase-TestingUtility-wip.txt, 
> minimr_randomdir-branch2.txt
>
>
> Post YARN-429, unit tests for HBase continue to fail since the classpath for 
> the MRAppMaster is not being set correctly.
> Reverting YARN-129 may fix this, but I'm not sure that's the correct 
> solution. My guess is, as Alexandro pointed out in YARN-129, maven 
> classloader magic is messing up java.class.path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Closed] (YARN-470) Support a way to disable resource monitoring on the NodeManager

2013-04-25 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy closed YARN-470.
--


> Support a way to disable resource monitoring on the NodeManager
> ---
>
> Key: YARN-470
> URL: https://issues.apache.org/jira/browse/YARN-470
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Hitesh Shah
>Assignee: Siddharth Seth
>  Labels: usability
> Fix For: 2.0.4-alpha
>
> Attachments: YARN-470_2.txt, YARN-470.txt
>
>
> Currently, the memory management monitor's check is disabled when the maxMem 
> is set to -1. However, the maxMem is also sent to the RM when the NM 
> registers with it (to define the maximum limit of allocatable resources). 
> We need an explicit flag to disable monitoring to avoid the problems caused 
> by overloading the max memory value.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Closed] (YARN-443) allow OS scheduling priority of NM to be different than the containers it launches

2013-04-25 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy closed YARN-443.
--


> allow OS scheduling priority of NM to be different than the containers it 
> launches
> --
>
> Key: YARN-443
> URL: https://issues.apache.org/jira/browse/YARN-443
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.0.3-alpha, 0.23.6
>Reporter: Thomas Graves
>Assignee: Thomas Graves
> Fix For: 3.0.0, 0.23.7, 2.0.4-alpha
>
> Attachments: YARN-443-branch-0.23.patch, YARN-443-branch-0.23.patch, 
> YARN-443-branch-0.23.patch, YARN-443-branch-0.23.patch, 
> YARN-443-branch-2.patch, YARN-443-branch-2.patch, YARN-443-branch-2.patch, 
> YARN-443.patch, YARN-443.patch, YARN-443.patch, YARN-443.patch, 
> YARN-443.patch, YARN-443.patch, YARN-443.patch
>
>
> It would be nice if we could have the nodemanager run at a different OS 
> scheduling priority than the containers so that you can still communicate 
> with the nodemanager if the containers are out of control.
> On Linux we could launch the nodemanager at a higher priority, but then all 
> the containers it launches would also be at that higher priority, so we need 
> a way for the container executor to launch them at a lower priority.
> I'm not sure how this applies to Windows, if at all.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-25 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642499#comment-13642499
 ] 

Jian He commented on YARN-562:
--

bq. RM_INVALID_IDENTIFIER set to 0 doesn't sound right as many tests set it to 
0. Probably a -ve number is what we want. Looks good other than that.
The JIRA to track this is YARN-618.

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.10.patch, YARN-562.11.patch, YARN-562.12.patch, 
> YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch, YARN-562.4.patch, 
> YARN-562.5.patch, YARN-562.6.patch, YARN-562.7.patch, YARN-562.8.patch, 
> YARN-562.9.patch
>
>
> It's possible that after the RM shuts down, but before the AM goes down, the AM 
> still calls startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether this container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-618) Modify RM_INVALID_IDENTIFIER to a -ve number

2013-04-25 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-618:
-

Description: RM_INVALID_IDENTIFIER set to 0 doesn't sound right as many 
tests set it to 0. Probably a -ve number is what we want.

> Modify RM_INVALID_IDENTIFIER to  a -ve number
> -
>
> Key: YARN-618
> URL: https://issues.apache.org/jira/browse/YARN-618
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
>
> RM_INVALID_IDENTIFIER set to 0 doesn't sound right as many tests set it to 0. 
> Probably a -ve number is what we want.
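
A one-line sketch of the proposed change, assuming the constant stays in the 
ResourceManagerConstants interface added by YARN-562 (the exact negative value 
is illustrative):

{code:java}
// Sketch of the YARN-618 proposal: a value no test would accidentally use,
// so 0 stops looking like a valid RM identifier.
public interface ResourceManagerConstants {
  long RM_INVALID_IDENTIFIER = -1L; // was 0
}
{code}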

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-618) Modify RM_INVALID_IDENTIFIER to a -ve number

2013-04-25 Thread Jian He (JIRA)
Jian He created YARN-618:


 Summary: Modify RM_INVALID_IDENTIFIER to  a -ve number
 Key: YARN-618
 URL: https://issues.apache.org/jira/browse/YARN-618
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-25 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642485#comment-13642485
 ] 

Jian He commented on YARN-562:
--

bq. RM_INVALID_IDENTIFIER set to 0 doesn't sound right as many tests set it to 
0. Probably a -ve number is what we want. Looks good other than that.
This will be fixed in a separate JIRA.
bq. Did we run a single node setup with a few jobs to do a real sanity check?
I did a few single-node tests, looks good. 

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.10.patch, YARN-562.11.patch, YARN-562.12.patch, 
> YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch, YARN-562.4.patch, 
> YARN-562.5.patch, YARN-562.6.patch, YARN-562.7.patch, YARN-562.8.patch, 
> YARN-562.9.patch
>
>
> It's possible that after the RM shuts down, but before the AM goes down, the AM 
> still calls startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether this container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-25 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642482#comment-13642482
 ] 

Bikas Saha commented on YARN-562:
-

RM_INVALID_IDENTIFIER set to 0 doesn't sound right as many tests set it to 0. 
Probably a -ve number is what we want. Looks good other than that.

Did we run a single node setup with a few jobs to do a real sanity check?

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.10.patch, YARN-562.11.patch, YARN-562.12.patch, 
> YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch, YARN-562.4.patch, 
> YARN-562.5.patch, YARN-562.6.patch, YARN-562.7.patch, YARN-562.8.patch, 
> YARN-562.9.patch
>
>
> It's possible that after the RM shuts down, but before the AM goes down, the AM 
> still calls startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether this container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642475#comment-13642475
 ] 

Hadoop QA commented on YARN-562:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12580633/YARN-562.12.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 17 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/827//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/827//console

This message is automatically generated.

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.10.patch, YARN-562.11.patch, YARN-562.12.patch, 
> YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch, YARN-562.4.patch, 
> YARN-562.5.patch, YARN-562.6.patch, YARN-562.7.patch, YARN-562.8.patch, 
> YARN-562.9.patch
>
>
> It's possible that after the RM shuts down, but before the AM goes down, the AM 
> still calls startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether this container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-435) Make it easier to access cluster topology information in an AM

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642450#comment-13642450
 ] 

Vinod Kumar Vavilapalli commented on YARN-435:
--

I think that opening a new connection isn't really a problem. I propose:
 - Keep the protocols separate - ClientRMProtocol and AMRMProtocol - and do not 
duplicate any APIs
 - Expect AMs to always open two connections. The AMRMClient library can do this 
for Java users
 - For secure mode, make ClientRMProtocol also accept AMTokens. Once we do that, 
any AM, even in secure mode, can talk to the RM on both protocols. After this and 
YARN-613, the AMToken becomes a single sign-on token - with an AMToken, an AM can 
talk to ClientRMProtocol and AMRMProtocol on the ResourceManager and also to 
ContainerManagerProtocol on the NodeManager.
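
A rough sketch of the "two connections" point above, following the 2.0.x-era 
YarnRPC proxy pattern; the class, addresses and configuration wiring here are 
assumptions for illustration, not the AMRMClient implementation:

{code:java}
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.AMRMProtocol;
import org.apache.hadoop.yarn.api.ClientRMProtocol;
import org.apache.hadoop.yarn.ipc.YarnRPC;

// Hypothetical AM-side wiring: one proxy per RM protocol.
public class AMConnections {
  public static void connect(Configuration conf,
      InetSocketAddress schedulerAddr, InetSocketAddress clientAddr) {
    YarnRPC rpc = YarnRPC.create(conf);
    // Connection 1: the scheduling protocol the AM already uses.
    AMRMProtocol scheduler =
        (AMRMProtocol) rpc.getProxy(AMRMProtocol.class, schedulerAddr, conf);
    // Connection 2: the client protocol, e.g. for getClusterNodes().
    ClientRMProtocol client =
        (ClientRMProtocol) rpc.getProxy(ClientRMProtocol.class, clientAddr, conf);
  }
}
{code}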

> Make it easier to access cluster topology information in an AM
> --
>
> Key: YARN-435
> URL: https://issues.apache.org/jira/browse/YARN-435
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
>
> ClientRMProtocol exposes a getClusterNodes api that provides a report on all 
> nodes in the cluster including their rack information. 
> However, this requires the AM to open and establish a separate connection to 
> the RM in addition to one for the AMRMProtocol. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-575) ContainerManager APIs should be user accessible

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642445#comment-13642445
 ] 

Vinod Kumar Vavilapalli commented on YARN-575:
--

I don't think we really want the APIs to be user-accessible by opening up NM 
itself to the users.

startContainer():
 - should only be called by the AM.

stopContainer()/getContainerStatus():
 - Today these are only callable by the AM which launched the containers - which 
is bad, of course. Once YARN-613 is done, we will use the AMToken for 
authentication to the NM, so any AM can talk to an NM irrespective of whether it 
launched containers or not.
 - If a user really wants to stop a container, or get a container status, we can 
add this as an RM API - the RM has enough information to tell the users - should 
we go that way?

> ContainerManager APIs should be user accessible
> ---
>
> Key: YARN-575
> URL: https://issues.apache.org/jira/browse/YARN-575
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.0.4-alpha
>Reporter: Siddharth Seth
>Priority: Critical
>
> Auth for ContainerManager is based on the containerId being accessed - since 
> this is what is used to launch containers (There's likely another jira 
> somewhere to change this to not be containerId based).
> What this also means is the API is effectively not usable with kerberos 
> credentials.
> Also, it should be possible to use this API with some generic tokens 
> (RMDelegation?), instead of with Container specific tokens.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-575) ContainerManager APIs should be user accessible

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reassigned YARN-575:


Assignee: Vinod Kumar Vavilapalli

> ContainerManager APIs should be user accessible
> ---
>
> Key: YARN-575
> URL: https://issues.apache.org/jira/browse/YARN-575
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.0.4-alpha
>Reporter: Siddharth Seth
>Assignee: Vinod Kumar Vavilapalli
>Priority: Critical
>
> Auth for ContainerManager is based on the containerId being accessed - since 
> this is what is used to launch containers (There's likely another jira 
> somewhere to change this to not be containerId based).
> What this also means is the API is effectively not usable with kerberos 
> credentials.
> Also, it should be possible to use this API with some generic tokens 
> (RMDelegation?), instead of with Container specific tokens.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-562) NM should reject containers allocated by previous RM

2013-04-25 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-562:
-

Attachment: YARN-562.12.patch

New patch, rebased.

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.10.patch, YARN-562.11.patch, YARN-562.12.patch, 
> YARN-562.1.patch, YARN-562.2.patch, YARN-562.3.patch, YARN-562.4.patch, 
> YARN-562.5.patch, YARN-562.6.patch, YARN-562.7.patch, YARN-562.8.patch, 
> YARN-562.9.patch
>
>
> It's possible that after the RM shuts down, but before the AM goes down, the AM 
> still calls startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether this container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-613) Create NM proxy per NM instead of per container

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642435#comment-13642435
 ] 

Vinod Kumar Vavilapalli commented on YARN-613:
--

bq. I wanted to do it all together at YARN-571, but in retrospect, I think we 
should keep it separate.
Apologies, I meant YARN-617.

> Create NM proxy per NM instead of per container
> ---
>
> Key: YARN-613
> URL: https://issues.apache.org/jira/browse/YARN-613
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bikas Saha
>Assignee: Vinod Kumar Vavilapalli
>
> Currently a new NM proxy has to be created per container since the secure 
> authentication is using a containertoken from the container.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (YARN-613) Create NM proxy per NM instead of per container

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reopened YARN-613:
--

  Assignee: Vinod Kumar Vavilapalli

I wanted to do it all together at YARN-571, but in retrospect, I think we 
should keep it separate.

Here's my proposal:
 - Use the AMToken (today called ApplicationToken, but it is per 
AM/ApplicationAttemptId) for authentication to the NM. Due to this, we only 
need to create one connection per NM. So, we will no longer need to latch onto 
ContainerTokens for the sake of {{stopContainer()/getContainerStatus()}}.
 - Add authorization checks also for {{stopContainer()/getContainerStatus()}} - 
today there are none.
 - Use ContainerToken for authorization of {{startContainer()}} irrespective of 
security, like I proposed on YARN-617.
 - Today we have authentication based on ContainerTokens for 
{{stopContainer()/getContainerStatus()}}, but not authorization. Once we 
authenticate based on AMTokens, they become automatically accessible to users 
(YARN-575 will be a duplicate) without latching onto ContainerTokens for long 
periods. We just need to add more authorization checks for these two RPCs.
 - One catch is AM restart - thanks to [~bikassaha] for bringing this up 
offline. If the AM restarts, it will get a new AMToken and will be able to 
authenticate to NMs with it, but authorization can be an issue for 
{{stopContainer()/getContainerStatus()}}. For this to work, authorization 
should only be based on ApplicationId and not ApplicationAttemptId - that way a 
second appAttempt can kill containers spawned by the previous appAttempt 
(sketched below).
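
A hedged sketch of that last point: authorize on the application rather than the 
attempt, so a restarted AM can manage containers from its previous attempt. Only 
the ApplicationId/ContainerId accessors are real YARN records; the checking 
class itself is hypothetical:

{code:java}
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ContainerId;

// Illustrative app-level authorization for stopContainer()/getContainerStatus().
public class ContainerAuthCheck {
  /** Allow any attempt of the same application to manage the container. */
  public static boolean isAuthorized(ApplicationAttemptId caller, ContainerId target) {
    ApplicationId callerApp = caller.getApplicationId();
    ApplicationId containerApp = target.getApplicationAttemptId().getApplicationId();
    return callerApp.equals(containerApp);
  }
}
{code}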

> Create NM proxy per NM instead of per container
> ---
>
> Key: YARN-613
> URL: https://issues.apache.org/jira/browse/YARN-613
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bikas Saha
>Assignee: Vinod Kumar Vavilapalli
>
> Currently a new NM proxy has to be created per container since the secure 
> authentication is using a containertoken from the container.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-562) NM should reject containers allocated by previous RM

2013-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642425#comment-13642425
 ] 

Hadoop QA commented on YARN-562:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12580629/YARN-562.11.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/826//console

This message is automatically generated.

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.10.patch, YARN-562.11.patch, YARN-562.1.patch, 
> YARN-562.2.patch, YARN-562.3.patch, YARN-562.4.patch, YARN-562.5.patch, 
> YARN-562.6.patch, YARN-562.7.patch, YARN-562.8.patch, YARN-562.9.patch
>
>
> It's possible that after the RM shuts down, but before the AM goes down, the AM 
> still calls startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether this container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-562) NM should reject containers allocated by previous RM

2013-04-25 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-562:
-

Attachment: YARN-562.11.patch

New patch addresses the above comments and renames 
ContainerAllocatedByPreviousRMException to InvalidContainerException.

> NM should reject containers allocated by previous RM
> 
>
> Key: YARN-562
> URL: https://issues.apache.org/jira/browse/YARN-562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-562.10.patch, YARN-562.11.patch, YARN-562.1.patch, 
> YARN-562.2.patch, YARN-562.3.patch, YARN-562.4.patch, YARN-562.5.patch, 
> YARN-562.6.patch, YARN-562.7.patch, YARN-562.8.patch, YARN-562.9.patch
>
>
> It's possible that after the RM shuts down, but before the AM goes down, the AM 
> still calls startContainer on the NM with containers allocated by the previous 
> RM. When the RM comes back, the NM doesn't know whether this container launch 
> request comes from the previous RM or the current RM. We should reject 
> containers allocated by the previous RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642405#comment-13642405
 ] 

Hadoop QA commented on YARN-326:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12580624/YARN-326.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/825//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/825//console

This message is automatically generated.

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc.pdf, YARN-326.patch, 
> YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.
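
For reference, the arithmetic behind dominant resource fairness: an application's 
dominant share is the largest ratio of its usage to cluster capacity across the 
resource types, and the scheduler favors the application with the smaller 
dominant share. A small illustrative sketch (not the fair scheduler's code) for 
memory and CPU:

{code:java}
// Illustration of the DRF comparison only, not YARN's scheduler code.
public class DominantShare {
  public static double dominantShare(long usedMemMB, long usedVcores,
                                     long clusterMemMB, long clusterVcores) {
    double memShare = (double) usedMemMB / clusterMemMB;
    double cpuShare = (double) usedVcores / clusterVcores;
    return Math.max(memShare, cpuShare);
  }

  public static void main(String[] args) {
    // App using 2 GB and 1 core on a 10 GB / 10-core cluster:
    // memory share 0.2, CPU share 0.1 -> dominant share 0.2 (memory-dominant)
    System.out.println(dominantShare(2048, 1, 10240, 10));
  }
}
{code}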

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-617) In unsecure mode, AM can fake resource requirements

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642403#comment-13642403
 ] 

Vinod Kumar Vavilapalli commented on YARN-617:
--

Like I mentioned in the description, we can do this by adding ContainerTokens 
to the payload and still using the same ContainerTokens for authentication. We 
don't want to remove the authentication altogether as we need mutual 
authentication (AMs need to be sure they are talking to valid NMs). So,
 - in unsecure mode, RM and NMs share the container-master-key, use it to 
validate the ContainerTokens from the payload
 - in secure mode, RM and NMs continue to share the container-master-key, use 
it to validate the ContainerTokens from the payload. On top of that, 
ContainerTokens will be used to authenticate the connection.
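
A generic sketch of the shared-key validation described above: the NM recomputes 
the token password over the ContainerTokenIdentifier bytes using the 
container-master-key it shares with the RM, and compares it with the password 
carried in the payload. The HmacSHA1 choice and helper names are assumptions, 
not YARN's secret-manager code:

{code:java}
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration of master-key based ContainerToken validation.
public class ContainerTokenCheck {
  public static boolean isValid(byte[] tokenIdentifierBytes, byte[] suppliedPassword,
                                byte[] sharedMasterKey) throws Exception {
    Mac mac = Mac.getInstance("HmacSHA1");
    mac.init(new SecretKeySpec(sharedMasterKey, "HmacSHA1"));
    byte[] expected = mac.doFinal(tokenIdentifierBytes);      // recompute the password
    return MessageDigest.isEqual(expected, suppliedPassword); // constant-time compare
  }
}
{code}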

> In unsecure mode, AM can fake resource requirements 
> -
>
> Key: YARN-617
> URL: https://issues.apache.org/jira/browse/YARN-617
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Minor
>
> Without security, it is impossible to completely avoid AMs faking resources. 
> We can at least make it as difficult as possible by using the same 
> container tokens and the RM-NM shared-key mechanism over an unauthenticated 
> RM-NM channel.
> At the minimum, this will avoid accidental bugs in AMs in unsecure mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-614) Retry attempts automatically for hardware failures or YARN issues and set default app retries to 1

2013-04-25 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642401#comment-13642401
 ] 

Bikas Saha commented on YARN-614:
-

1) Node lost
2) ContainerExitStatus of the AM denotes a hardware error, e.g. the existing 
DISK_FAILED status and others like it in the future.
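
A rough sketch of the resulting retry decision; the constant values and helper
below are hypothetical placeholders, not a verified YARN API:

{code:java}
// Hypothetical sketch: the exit-status constants are placeholders modeled on
// the values mentioned above (e.g. DISK_FAILED), not the actual ContainerExitStatus.
public class AttemptFailurePolicy {
  static final int DISK_FAILED = -101;  // placeholder value
  static final int NODE_LOST = -100;    // placeholder value

  // Failures caused by hardware or by YARN itself should not be charged
  // against the application's retry budget.
  static boolean countsTowardRetries(int exitStatus, boolean nodeLost) {
    return !nodeLost && exitStatus != DISK_FAILED && exitStatus != NODE_LOST;
  }
}
{code}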

> Retry attempts automatically for hardware failures or YARN issues and set 
> default app retries to 1
> --
>
> Key: YARN-614
> URL: https://issues.apache.org/jira/browse/YARN-614
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Bikas Saha
>
> Attempts can fail due to a large number of user errors and they should not be 
> retried unnecessarily. The only reason YARN should retry an attempt is when 
> the hardware fails or YARN has an error. NM failing, lost NM and NM disk 
> errors are the hardware errors that come to mind.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-591) RM recovery related records do not belong to the API

2013-04-25 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642399#comment-13642399
 ] 

Bikas Saha commented on YARN-591:
-

It's a clean move. Thanks! I should have done it correctly the first time. +1

> RM recovery related records do not belong to the API
> 
>
> Key: YARN-591
> URL: https://issues.apache.org/jira/browse/YARN-591
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Attachments: YARN-591-20130418.txt
>
>
> We need to move ApplicationStateData and ApplicationAttemptStateData out into 
> the resourcemanager module. They are not part of the public API.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-613) Create NM proxy per NM instead of per container

2013-04-25 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642387#comment-13642387
 ] 

Bikas Saha commented on YARN-613:
-

Looks related to YARN-617 but not a duplicate. [~vinodkv] Can you please check?

> Create NM proxy per NM instead of per container
> ---
>
> Key: YARN-613
> URL: https://issues.apache.org/jira/browse/YARN-613
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bikas Saha
>
> Currently a new NM proxy has to be created per container since the secure 
> authentication is using a containertoken from the container.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-25 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-326:


Attachment: YARN-326.patch

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc.pdf, YARN-326.patch, 
> YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-25 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642380#comment-13642380
 ] 

Sandy Ryza commented on YARN-326:
-

The latest patch fixes the findbugs warning and adds some tests I forgot to 
include.

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc.pdf, YARN-326.patch, 
> YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-616) Support guaranteed shares in the fair scheduler

2013-04-25 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642371#comment-13642371
 ] 

Sandy Ryza commented on YARN-616:
-

Agreed.

> Support guaranteed shares in the fair scheduler
> ---
>
> Key: YARN-616
> URL: https://issues.apache.org/jira/browse/YARN-616
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>
> A commonly requested feature in the fair scheduler is to reserve shares of the 
> cluster for queues that no other queue can trample on, even if they are 
> unused.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-616) Support guaranteed shares in the fair scheduler

2013-04-25 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642369#comment-13642369
 ] 

Karthik Kambatla commented on YARN-616:
---

Sure. I think the best way to achieve this might be to add another field in the 
allocations file (say, nonpreempt) and verify the value specified is less than 
min.
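
For illustration, a queue entry in the allocations file might then look like 
the sketch below; the <nonpreempt> element and the value format are 
hypothetical, following the suggestion above, and are not an existing fair 
scheduler setting:

{code:xml}
<!-- Hypothetical sketch: <nonpreempt> is the suggested new field, not an
     existing fair scheduler configuration option. -->
<allocations>
  <queue name="research">
    <minResources>10240 mb, 10 vcores</minResources>
    <nonpreempt>4096 mb, 4 vcores</nonpreempt>  <!-- must not exceed minResources -->
  </queue>
</allocations>
{code}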

> Support guaranteed shares in the fair scheduler
> ---
>
> Key: YARN-616
> URL: https://issues.apache.org/jira/browse/YARN-616
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>
> A commonly requested feature in the fair scheduler is to reserve shares of the 
> cluster for queues that no other queue can trample on, even if they are 
> unused.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-25 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642370#comment-13642370
 ] 

Sandy Ryza commented on YARN-326:
-

Thanks for taking a look, Andrew.  Minimum shares exceeding the cluster size, 
and particularly the cluster size when nodes are down, are definitely a 
concern.  However, I'm wary of diverging from how it's been configured in the 
past, which is in terms of absolute amounts.  Currently, nothing crazy will 
happen in this situation either; resources will just go to the apps that are 
the farthest behind their fair shares.

Also, I think it's reasonable that minimum shares should be able to be 
different amounts for different resources.  Suppose the geology department, 
which requires a bunch of cores, and the history department, which runs jobs 
that need a lot of memory, decide to get together on a cluster.  I think we 
should be able to give the geology department lots of CPU without giving it 
lots of RAM.
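
As background for readers of this thread, the core of dominant resource
fairness is simple: a queue's dominant share is its largest per-resource share,
and the scheduler favors the queue with the smallest dominant share. A minimal
sketch, with memory and CPU standing in for the full resource vector:

{code:java}
// Minimal DRF sketch; the real scheduler works over a full resource vector.
public class DrfSketch {
  // Dominant share = the larger of the queue's memory share and CPU share.
  static double dominantShare(long usedMem, long usedCpu,
                              long clusterMem, long clusterCpu) {
    return Math.max((double) usedMem / clusterMem,
                    (double) usedCpu / clusterCpu);
  }

  // Under DRF, the queue with the smaller dominant share gets the next container.
  static boolean schedulesBefore(double shareA, double shareB) {
    return shareA < shareB;
  }
}
{code}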

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc.pdf, YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-528) Make IDs read only

2013-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642366#comment-13642366
 ] 

Hadoop QA commented on YARN-528:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12580600/y528_ApplicationIdComplete_WIP.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 27 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-shuffle
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/824//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/824//console

This message is automatically generated.

> Make IDs read only
> --
>
> Key: YARN-528
> URL: https://issues.apache.org/jira/browse/YARN-528
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Attachments: y528_AppIdPart_01_Refactor.txt, 
> y528_AppIdPart_02_AppIdChanges.txt, y528_AppIdPart_03_fixUsage.txt, 
> y528_ApplicationIdComplete_WIP.txt, YARN-528.txt, YARN-528.txt
>
>
> I really would like to rip out most if not all of the abstraction layer that 
> sits in between Protocol Buffers, the RPC, and the actual user code.  We have 
> no plans to support any other serialization type, and the abstraction layer 
> just makes it more difficult to change protocols, makes changing them more 
> error prone, and slows down the objects themselves.  
> Completely doing that is a lot of work.  This JIRA is a first step towards 
> that.  It makes the various ID objects immutable.  If this patch is well 
> received I will try to go through other objects/classes of objects and update 
> them in a similar way.
> This is probably the last time we will be able to make a change like this 
> before 2.0 stabilizes and the YARN APIs can no longer be changed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-616) Support guaranteed shares in the fair scheduler

2013-04-25 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642361#comment-13642361
 ] 

Sandy Ryza commented on YARN-616:
-

Currently, minimum shares in the fair scheduler will go to other queues if they 
are unused.  This means that, even with preemption, it can take some time for a 
queue to get the minimum resources it deserves.

> Support guaranteed shares in the fair scheduler
> ---
>
> Key: YARN-616
> URL: https://issues.apache.org/jira/browse/YARN-616
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>
> A commonly requested feature in the fair scheduler is to reserve shares of the 
> cluster for queues that no other queue can trample on, even if they are 
> unused.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-616) Support guaranteed shares in the fair scheduler

2013-04-25 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642357#comment-13642357
 ] 

Karthik Kambatla commented on YARN-616:
---

Doesn't the min value address this? 

> Support guaranteed shares in the fair scheduler
> ---
>
> Key: YARN-616
> URL: https://issues.apache.org/jira/browse/YARN-616
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
>
> A commonly requested feature in the fair scheduler is to reserve shares of the 
> cluster for queues that no other queue can trample on, even if they are 
> unused.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-614) Retry attempts automatically for hardware failures or YARN issues and set default app retries to 1

2013-04-25 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642354#comment-13642354
 ] 

Karthik Kambatla commented on YARN-614:
---

Will the hardware errors include IOException and SocketTimeoutException etc.?

> Retry attempts automatically for hardware failures or YARN issues and set 
> default app retries to 1
> --
>
> Key: YARN-614
> URL: https://issues.apache.org/jira/browse/YARN-614
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Bikas Saha
>
> Attempts can fail due to a large number of user errors and they should not be 
> retried unnecessarily. The only reason YARN should retry an attempt is when 
> the hardware fails or YARN has an error. NM failing, lost NM and NM disk 
> errors are the hardware errors that come to mind.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-617) In unsecure mode, AM can fake resource requirements

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reassigned YARN-617:


Assignee: Vinod Kumar Vavilapalli

> In unsecure mode, AM can fake resource requirements 
> -
>
> Key: YARN-617
> URL: https://issues.apache.org/jira/browse/YARN-617
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>Priority: Minor
>
> Without security, it is impossible to completely avoid AMs faking resources. 
> We can at least make it as difficult as possible by using the same 
> container tokens and the RM-NM shared key mechanism over the unauthenticated 
> RM-NM channel.
> At a minimum, this will avoid accidental bugs in AMs in unsecure mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (YARN-613) Create NM proxy per NM instead of per container

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-613.
--

Resolution: Duplicate

Just moved YARN-617 over from mapreduce. Closing this as duplicate.

> Create NM proxy per NM instead of per container
> ---
>
> Key: YARN-613
> URL: https://issues.apache.org/jira/browse/YARN-613
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bikas Saha
>
> Currently a new NM proxy has to be created per container since the secure 
> authentication is using a containertoken from the container.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-617) In unsecure mode, AM can fake resource requirements

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-617:
-

Issue Type: Sub-task  (was: Bug)
Parent: YARN-47

> In unsecure mode, AM can fake resource requirements 
> -
>
> Key: YARN-617
> URL: https://issues.apache.org/jira/browse/YARN-617
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Priority: Minor
>
> Without security, it is impossible to completely avoid AMs faking resources. 
> We can at least make it as difficult as possible by using the same 
> container tokens and the RM-NM shared key mechanism over the unauthenticated 
> RM-NM channel.
> At a minimum, this will avoid accidental bugs in AMs in unsecure mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Moved] (YARN-617) [MR-279] In unsecure mode, AM can fake resource requirements

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli moved MAPREDUCE-2744 to YARN-617:
-

  Component/s: (was: mrv2)
   (was: security)
Fix Version/s: (was: 0.24.0)
Affects Version/s: (was: 0.23.0)
  Key: YARN-617  (was: MAPREDUCE-2744)
  Project: Hadoop YARN  (was: Hadoop Map/Reduce)

> [MR-279] In unsecure mode, AM can fake resource requirements 
> --
>
> Key: YARN-617
> URL: https://issues.apache.org/jira/browse/YARN-617
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Priority: Minor
>
> Without security, it is impossible to completely avoid AMs faking resources. 
> We can at least make it as difficult as possible by using the same 
> container tokens and the RM-NM shared key mechanism over the unauthenticated 
> RM-NM channel.
> At a minimum, this will avoid accidental bugs in AMs in unsecure mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-617) In unsecure mode, AM can fake resource requirements

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-617:
-

Summary: In unsecure mode, AM can fake resource requirements   (was: 
[MR-279] In unsecure mode, AM can fake resource requirements )

> In unsecure mode, AM can fake resource requirements 
> -
>
> Key: YARN-617
> URL: https://issues.apache.org/jira/browse/YARN-617
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Priority: Minor
>
> Without security, it is impossible to completely avoid AMs faking resources. 
> We can at least make it as difficult as possible by using the same 
> container tokens and the RM-NM shared key mechanism over the unauthenticated 
> RM-NM channel.
> At a minimum, this will avoid accidental bugs in AMs in unsecure mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-589) Expose a REST API for monitoring the fair scheduler

2013-04-25 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642305#comment-13642305
 ] 

Sandy Ryza commented on YARN-589:
-

Attached is XML output from a first cut at this.  Is it missing anything?  Any 
opinions on whether it is a bad idea to output the whole thing as one chunk (as 
the capacity scheduler does), rather than making metrics available for 
individual queues at different URLs?

> Expose a REST API for monitoring the fair scheduler
> ---
>
> Key: YARN-589
> URL: https://issues.apache.org/jira/browse/YARN-589
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: fairscheduler.xml
>
>
> The fair scheduler should have an HTTP interface that exposes information 
> such as applications per queue, fair shares, demands, current allocations.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-589) Expose a REST API for monitoring the fair scheduler

2013-04-25 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-589:


Attachment: fairscheduler.xml

> Expose a REST API for monitoring the fair scheduler
> ---
>
> Key: YARN-589
> URL: https://issues.apache.org/jira/browse/YARN-589
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: scheduler
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: fairscheduler.xml
>
>
> The fair scheduler should have an HTTP interface that exposes information 
> such as applications per queue, fair shares, demands, current allocations.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-25 Thread Andrew Ferguson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642286#comment-13642286
 ] 

Andrew Ferguson commented on YARN-326:
--

hi Sandy,

I'm wondering if you want minimum and maximum shares to actually be fractions 
of the cluster, rather than resource vectors? That would fit better with the 
"fairness" aspect of the FairScheduler, but it's completely a design decision.

For example, what happens if the sum of the minimum shares for each queue 
exceeds the size of the cluster? (Or the size of the cluster during a failure?)

Or, if my queue has been given a minimum share of (2 CPU, 240 GB RAM) because I 
was originally running high-memory tasks, what happens if I decide to switch to 
high-CPU, low-memory tasks?  I think a minimum share of "1/8" might make more 
sense since it would allow the queue's users to request the resources as they 
see fit.

Anyway, just a thought.


cheers,
Andrew
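
A tiny sketch of the fraction-based alternative Andrew describes, resolving a
queue's minimum share against whatever the cluster currently provides; the
names are illustrative only:

{code:java}
// Sketch of a fraction-based minimum share: the guarantee shrinks and grows
// with the live cluster capacity, e.g. fraction = 1.0 / 8.
public class FractionalMinShare {
  static long resolveMinShare(double fraction, long clusterResource) {
    return (long) (fraction * clusterResource);
  }
}
{code}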

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc.pdf, YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-528) Make IDs read only

2013-04-25 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated YARN-528:


Attachment: y528_ApplicationIdComplete_WIP.txt

Complete patch for jenkins.

> Make IDs read only
> --
>
> Key: YARN-528
> URL: https://issues.apache.org/jira/browse/YARN-528
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Attachments: y528_AppIdPart_01_Refactor.txt, 
> y528_AppIdPart_02_AppIdChanges.txt, y528_AppIdPart_03_fixUsage.txt, 
> y528_ApplicationIdComplete_WIP.txt, YARN-528.txt, YARN-528.txt
>
>
> I really would like to rip out most if not all of the abstraction layer that 
> sits in between Protocol Buffers, the RPC, and the actual user code.  We have 
> no plans to support any other serialization type, and the abstraction layer 
> just makes it more difficult to change protocols, makes changing them more 
> error prone, and slows down the objects themselves.  
> Completely doing that is a lot of work.  This JIRA is a first step towards 
> that.  It makes the various ID objects immutable.  If this patch is well 
> received I will try to go through other objects/classes of objects and update 
> them in a similar way.
> This is probably the last time we will be able to make a change like this 
> before 2.0 stabilizes and the YARN APIs can no longer be changed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-528) Make IDs read only

2013-04-25 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated YARN-528:


Attachment: y528_AppIdPart_03_fixUsage.txt
y528_AppIdPart_02_AppIdChanges.txt
y528_AppIdPart_01_Refactor.txt

Uploading a bunch of patches to change ApplicationId to be immutable, without 
letting PB leak in.
01_Refactor - moves some of the common Record code into the API module.
02_ApplicationId - the actual ApplicationId change.
03_fixUsage - fixes places where the set API was being used.

Changes ApplicationId creation to be via a static method in ApplicationId 
itself.

Will upload another patch to run through Jenkins.
This is all WIP.
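
For readers following along, a rough sketch of the shape such an immutable id
with static creation could take; this is illustrative only, not the contents of
the patches above:

{code:java}
// Illustrative sketch of an immutable id built through a static method;
// no setters exist, so the object cannot be mutated after creation.
public final class ImmutableAppId {
  private final long clusterTimestamp;
  private final int id;

  private ImmutableAppId(long clusterTimestamp, int id) {
    this.clusterTimestamp = clusterTimestamp;
    this.id = id;
  }

  public static ImmutableAppId newInstance(long clusterTimestamp, int id) {
    return new ImmutableAppId(clusterTimestamp, id);
  }

  public long getClusterTimestamp() { return clusterTimestamp; }
  public int getId() { return id; }
}
{code}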

> Make IDs read only
> --
>
> Key: YARN-528
> URL: https://issues.apache.org/jira/browse/YARN-528
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Robert Joseph Evans
>Assignee: Robert Joseph Evans
> Attachments: y528_AppIdPart_01_Refactor.txt, 
> y528_AppIdPart_02_AppIdChanges.txt, y528_AppIdPart_03_fixUsage.txt, 
> y528_ApplicationIdComplete_WIP.txt, YARN-528.txt, YARN-528.txt
>
>
> I really would like to rip out most if not all of the abstraction layer that 
> sits in between Protocol Buffers, the RPC, and the actual user code.  We have 
> no plans to support any other serialization type, and the abstraction layer 
> just makes it more difficult to change protocols, makes changing them more 
> error prone, and slows down the objects themselves.  
> Completely doing that is a lot of work.  This JIRA is a first step towards 
> that.  It makes the various ID objects immutable.  If this patch is well 
> received I will try to go through other objects/classes of objects and update 
> them in a similar way.
> This is probably the last time we will be able to make a change like this 
> before 2.0 stabilizes and the YARN APIs can no longer be changed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-506) Move to common utils FileUtil#setReadable/Writable/Executable and FileUtil#canRead/Write/Execute

2013-04-25 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642204#comment-13642204
 ] 

Chris Nauroth commented on YARN-506:


Similar to HDFS-4610, I still saw a failure in {{TestNodeHealthService}} after 
applying this patch.  Ivan, did this patch fix that test for you?


> Move to common utils FileUtil#setReadable/Writable/Executable and 
> FileUtil#canRead/Write/Execute
> 
>
> Key: YARN-506
> URL: https://issues.apache.org/jira/browse/YARN-506
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan Mitic
>Assignee: Ivan Mitic
> Attachments: YARN-506.commonfileutils.2.patch, 
> YARN-506.commonfileutils.patch
>
>
> Move to common utils described in HADOOP-9413 that work well cross-platform.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-611) Add an AM retry count reset window to YARN RM

2013-04-25 Thread Chris Riccomini (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642168#comment-13642168
 ] 

Chris Riccomini commented on YARN-611:
--

Correction. We wouldn't need -1 to solve this problem. Simply setting 
am.max-retries to a reasonable number and ignoring system/hardware/YARN 
failures would fix this, I think.

> Add an AM retry count reset window to YARN RM
> -
>
> Key: YARN-611
> URL: https://issues.apache.org/jira/browse/YARN-611
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Riccomini
>
> YARN currently has the following config:
> yarn.resourcemanager.am.max-retries
> This config defaults to 2, and defines how many times to retry a "failed" AM 
> before failing the whole YARN job. YARN counts an AM as failed if the node 
> that it was running on dies (the NM will timeout, which counts as a failure 
> for the AM), or if the AM dies.
> This configuration is insufficient for long running (or infinitely running) 
> YARN jobs, since the machine (or NM) that the AM is running on will 
> eventually need to be restarted (or the machine/NM will fail). In such an 
> event, the AM has not done anything wrong, but this is counted as a "failure" 
> by the RM. Since the retry count for the AM is never reset, eventually, at 
> some point, the number of machine/NM failures will result in the AM failure 
> count going above the configured value for 
> yarn.resourcemanager.am.max-retries. Once this happens, the RM will mark the 
> job as failed, and shut it down. This behavior is not ideal.
> I propose that we add a second configuration:
> yarn.resourcemanager.am.retry-count-window-ms
> This configuration would define a window of time that would define when an AM 
> is "well behaved", and it's safe to reset its failure count back to zero. 
> Every time an AM fails the RmAppImpl would check the last time that the AM 
> failed. If the last failure was less than retry-count-window-ms ago, and the 
> new failure count is > max-retries, then the job should fail. If the AM has 
> never failed, the retry count is < max-retries, or if the last failure was 
> OUTSIDE the retry-count-window-ms, then the job should be restarted. 
> Additionally, if the last failure was outside the retry-count-window-ms, then 
> the failure count should be set back to 0.
> This would give developers a way to have well-behaved AMs run forever, while 
> still failing mis-behaving AMs after a short period of time.
> I think the work to be done here is to change the RmAppImpl to actually look 
> at app.attempts, and see if there have been more than max-retries failures in 
> the last retry-count-window-ms milliseconds. If there have, then the job 
> should fail, if not, then the job should go forward. Additionally, we might 
> also need to add an endTime in either RMAppAttemptImpl or 
> RMAppFailedAttemptEvent, so that the RmAppImpl can check the time of the 
> failure.
> Thoughts?
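
A sketch of the window check described in the proposal above; the RmAppImpl
wiring is omitted and the names below are stand-ins, not the actual
resourcemanager code:

{code:java}
// Sketch of the proposed retry-count window; all names are illustrative.
public class RetryWindowSketch {
  // Fail the app only if this failure lands inside the window AND the
  // accumulated count exceeds max-retries.
  static boolean shouldFailApp(long lastFailureTimeMs, long nowMs,
                               int failureCount, int maxRetries,
                               long retryCountWindowMs) {
    boolean insideWindow = (nowMs - lastFailureTimeMs) < retryCountWindowMs;
    return insideWindow && failureCount > maxRetries;
  }

  // If the previous failure fell outside the window, the count is reset
  // before the current failure is added.
  static int nextFailureCount(long lastFailureTimeMs, long nowMs,
                              int failureCount, long retryCountWindowMs) {
    boolean insideWindow = (nowMs - lastFailureTimeMs) < retryCountWindowMs;
    return insideWindow ? failureCount + 1 : 1;
  }
}
{code}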

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-611) Add an AM retry count reset window to YARN RM

2013-04-25 Thread Chris Riccomini (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642162#comment-13642162
 ] 

Chris Riccomini commented on YARN-611:
--

@zshen Simply bumping the retries to infinite is not a good solution because, 
in the case where an AM is legitimately failing, we don't want it to just run 
forever.

@sandy/@vinod If we had -1 as infinite retries, and separated 
machine/hardware/YARN failures from AM failures, then this would probably be 
fine.

> Add an AM retry count reset window to YARN RM
> -
>
> Key: YARN-611
> URL: https://issues.apache.org/jira/browse/YARN-611
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Riccomini
>
> YARN currently has the following config:
> yarn.resourcemanager.am.max-retries
> This config defaults to 2, and defines how many times to retry a "failed" AM 
> before failing the whole YARN job. YARN counts an AM as failed if the node 
> that it was running on dies (the NM will timeout, which counts as a failure 
> for the AM), or if the AM dies.
> This configuration is insufficient for long running (or infinitely running) 
> YARN jobs, since the machine (or NM) that the AM is running on will 
> eventually need to be restarted (or the machine/NM will fail). In such an 
> event, the AM has not done anything wrong, but this is counted as a "failure" 
> by the RM. Since the retry count for the AM is never reset, eventually, at 
> some point, the number of machine/NM failures will result in the AM failure 
> count going above the configured value for 
> yarn.resourcemanager.am.max-retries. Once this happens, the RM will mark the 
> job as failed, and shut it down. This behavior is not ideal.
> I propose that we add a second configuration:
> yarn.resourcemanager.am.retry-count-window-ms
> This configuration would define a window of time that would define when an AM 
> is "well behaved", and it's safe to reset its failure count back to zero. 
> Every time an AM fails the RmAppImpl would check the last time that the AM 
> failed. If the last failure was less than retry-count-window-ms ago, and the 
> new failure count is > max-retries, then the job should fail. If the AM has 
> never failed, the retry count is < max-retries, or if the last failure was 
> OUTSIDE the retry-count-window-ms, then the job should be restarted. 
> Additionally, if the last failure was outside the retry-count-window-ms, then 
> the failure count should be set back to 0.
> This would give developers a way to have well-behaved AMs run forever, while 
> still failing mis-behaving AMs after a short period of time.
> I think the work to be done here is to change the RmAppImpl to actually look 
> at app.attempts, and see if there have been more than max-retries failures in 
> the last retry-count-window-ms milliseconds. If there have, then the job 
> should fail, if not, then the job should go forward. Additionally, we might 
> also need to add an endTime in either RMAppAttemptImpl or 
> RMAppFailedAttemptEvent, so that the RmAppImpl can check the time of the 
> failure.
> Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers

2013-04-25 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642163#comment-13642163
 ] 

Chris Douglas commented on YARN-45:
---

If everyone's OK with the current patch as a base, I'll commit it in the next 
couple days.

> Scheduler feedback to AM to release containers
> --
>
> Key: YARN-45
> URL: https://issues.apache.org/jira/browse/YARN-45
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Chris Douglas
>Assignee: Carlo Curino
> Attachments: YARN-45.patch, YARN-45.patch, YARN-45.patch
>
>
> The ResourceManager strikes a balance between cluster utilization and strict 
> enforcement of resource invariants in the cluster. Individual allocations of 
> containers must be reclaimed- or reserved- to restore the global invariants 
> when cluster load shifts. In some cases, the ApplicationMaster can respond to 
> fluctuations in resource availability without losing the work already 
> completed by that task (MAPREDUCE-4584). Supplying it with this information 
> would be helpful for overall cluster utilization [1]. To this end, we want to 
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-611) Add an AM retry count reset window to YARN RM

2013-04-25 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642132#comment-13642132
 ] 

Sandy Ryza commented on YARN-611:
-

Not suggesting this as a complete solution to either problem, but maybe it 
would make sense to allow -1 to mean an infinite number of retries?

> Add an AM retry count reset window to YARN RM
> -
>
> Key: YARN-611
> URL: https://issues.apache.org/jira/browse/YARN-611
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Riccomini
>
> YARN currently has the following config:
> yarn.resourcemanager.am.max-retries
> This config defaults to 2, and defines how many times to retry a "failed" AM 
> before failing the whole YARN job. YARN counts an AM as failed if the node 
> that it was running on dies (the NM will timeout, which counts as a failure 
> for the AM), or if the AM dies.
> This configuration is insufficient for long running (or infinitely running) 
> YARN jobs, since the machine (or NM) that the AM is running on will 
> eventually need to be restarted (or the machine/NM will fail). In such an 
> event, the AM has not done anything wrong, but this is counted as a "failure" 
> by the RM. Since the retry count for the AM is never reset, eventually, at 
> some point, the number of machine/NM failures will result in the AM failure 
> count going above the configured value for 
> yarn.resourcemanager.am.max-retries. Once this happens, the RM will mark the 
> job as failed, and shut it down. This behavior is not ideal.
> I propose that we add a second configuration:
> yarn.resourcemanager.am.retry-count-window-ms
> This configuration would define a window of time that would define when an AM 
> is "well behaved", and it's safe to reset its failure count back to zero. 
> Every time an AM fails the RmAppImpl would check the last time that the AM 
> failed. If the last failure was less than retry-count-window-ms ago, and the 
> new failure count is > max-retries, then the job should fail. If the AM has 
> never failed, the retry count is < max-retries, or if the last failure was 
> OUTSIDE the retry-count-window-ms, then the job should be restarted. 
> Additionally, if the last failure was outside the retry-count-window-ms, then 
> the failure count should be set back to 0.
> This would give developers a way to have well-behaved AMs run forever, while 
> still failing mis-behaving AMs after a short period of time.
> I think the work to be done here is to change the RmAppImpl to actually look 
> at app.attempts, and see if there have been more than max-retries failures in 
> the last retry-count-window-ms milliseconds. If there have, then the job 
> should fail, if not, then the job should go forward. Additionally, we might 
> also need to add an endTime in either RMAppAttemptImpl or 
> RMAppFailedAttemptEvent, so that the RmAppImpl can check the time of the 
> failure.
> Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-616) Support guaranteed shares in the fair scheduler

2013-04-25 Thread Sandy Ryza (JIRA)
Sandy Ryza created YARN-616:
---

 Summary: Support guaranteed shares in the fair scheduler
 Key: YARN-616
 URL: https://issues.apache.org/jira/browse/YARN-616
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza


A commonly requested feature in the fair scheduler is to reserve shares of the 
cluster for queues that no other queue can trample on, even if they are unused.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-422) Add AM-NM client library

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642119#comment-13642119
 ] 

Vinod Kumar Vavilapalli commented on YARN-422:
--

startContainer/stopContainer can fail; we need to handle the exceptions 
correctly and report them back to the callers. The CallBackHandler also needs 
APIs to report back such failures.
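
One illustrative shape for such error-reporting callbacks; the method names
below are hypothetical, not the committed AM-NM client library API:

{code:java}
// Hypothetical callback interface: failures from startContainer/stopContainer
// are surfaced to the caller instead of being swallowed inside the library.
public interface ContainerCallbackHandler {
  void onContainerStarted(String containerId);
  void onContainerStopped(String containerId);
  void onStartContainerError(String containerId, Throwable cause);
  void onStopContainerError(String containerId, Throwable cause);
}
{code}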

> Add AM-NM client library
> 
>
> Key: YARN-422
> URL: https://issues.apache.org/jira/browse/YARN-422
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bikas Saha
>Assignee: Zhijie Shen
> Attachments: AMNMClient_Defination.txt, proposal_v1.pdf
>
>
> Create a simple wrapper over the AM-NM container protocol to hide the 
> details of the protocol implementation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-614) Retry attempts automatically for hardware failures or YARN issues and set default app retries to 1

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642113#comment-13642113
 ] 

Vinod Kumar Vavilapalli commented on YARN-614:
--

We need to consolidate the attempt retries discussion along with what is being 
proposed at YARN-611.

> Retry attempts automatically for hardware failures or YARN issues and set 
> default app retries to 1
> --
>
> Key: YARN-614
> URL: https://issues.apache.org/jira/browse/YARN-614
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Bikas Saha
>
> Attempts can fail due to a large number of user errors and they should not be 
> retried unnecessarily. The only reason YARN should retry an attempt is when 
> the hardware fails or YARN has an error. NM failing, lost NM and NM disk 
> errors are the hardware errors that come to mind.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-615) ContainerLaunchContext.containerTokens should simply be called tokens

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)
Vinod Kumar Vavilapalli created YARN-615:


 Summary: ContainerLaunchContext.containerTokens should simply be 
called tokens
 Key: YARN-615
 URL: https://issues.apache.org/jira/browse/YARN-615
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli


ContainerToken is the name of the specific token that AMs use to launch 
containers on NMs, so we should rename CLC.containerTokens to be simply tokens.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-611) Add an AM retry count reset window to YARN RM

2013-04-25 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642086#comment-13642086
 ] 

Zhijie Shen commented on YARN-611:
--

Each YARN application can specify its individual retry number in 
ApplicationSubmissionContext. Therefore, to be fair to long-running 
applications, we can simply allow a larger number of retries. However, since 
an individual retry count cannot be larger than the global one, 
yarn.resourcemanager.am.max-retries needs to be changed to a larger number as 
well.

Anyway, the proposed method sounds interesting. We can get more retry quota 
without setting a larger max-retries, and it still prevents the case of many 
attempt failures within a short period.
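
The capping rule described above amounts to a one-line check; a minimal sketch,
with illustrative names:

{code:java}
// Sketch: an application's own retry count is bounded by the global
// yarn.resourcemanager.am.max-retries value.
public class RetryCap {
  static int effectiveMaxRetries(int appRequestedRetries, int globalMaxRetries) {
    return Math.min(appRequestedRetries, globalMaxRetries);
  }
}
{code}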

> Add an AM retry count reset window to YARN RM
> -
>
> Key: YARN-611
> URL: https://issues.apache.org/jira/browse/YARN-611
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.0.3-alpha
>Reporter: Chris Riccomini
>
> YARN currently has the following config:
> yarn.resourcemanager.am.max-retries
> This config defaults to 2, and defines how many times to retry a "failed" AM 
> before failing the whole YARN job. YARN counts an AM as failed if the node 
> that it was running on dies (the NM will timeout, which counts as a failure 
> for the AM), or if the AM dies.
> This configuration is insufficient for long running (or infinitely running) 
> YARN jobs, since the machine (or NM) that the AM is running on will 
> eventually need to be restarted (or the machine/NM will fail). In such an 
> event, the AM has not done anything wrong, but this is counted as a "failure" 
> by the RM. Since the retry count for the AM is never reset, eventually, at 
> some point, the number of machine/NM failures will result in the AM failure 
> count going above the configured value for 
> yarn.resourcemanager.am.max-retries. Once this happens, the RM will mark the 
> job as failed, and shut it down. This behavior is not ideal.
> I propose that we add a second configuration:
> yarn.resourcemanager.am.retry-count-window-ms
> This configuration would define a window of time that would define when an AM 
> is "well behaved", and it's safe to reset its failure count back to zero. 
> Every time an AM fails the RmAppImpl would check the last time that the AM 
> failed. If the last failure was less than retry-count-window-ms ago, and the 
> new failure count is > max-retries, then the job should fail. If the AM has 
> never failed, the retry count is < max-retries, or if the last failure was 
> OUTSIDE the retry-count-window-ms, then the job should be restarted. 
> Additionally, if the last failure was outside the retry-count-window-ms, then 
> the failure count should be set back to 0.
> This would give developers a way to have well-behaved AMs run forever, while 
> still failing mis-behaving AMs after a short period of time.
> I think the work to be done here is to change the RmAppImpl to actually look 
> at app.attempts, and see if there have been more than max-retries failures in 
> the last retry-count-window-ms milliseconds. If there have, then the job 
> should fail, if not, then the job should go forward. Additionally, we might 
> also need to add an endTime in either RMAppAttemptImpl or 
> RMAppFailedAttemptEvent, so that the RmAppImpl can check the time of the 
> failure.
> Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-614) Retry attempts automatically for hardware failures or YARN issues and set default app retries to 1

2013-04-25 Thread Bikas Saha (JIRA)
Bikas Saha created YARN-614:
---

 Summary: Retry attempts automatically for hardware failures or 
YARN issues and set default app retries to 1
 Key: YARN-614
 URL: https://issues.apache.org/jira/browse/YARN-614
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Bikas Saha


Attempts can fail due to a large number of user errors and they should not be 
retried unnecessarily. The only reason YARN should retry an attempt is when the 
hardware fails or YARN has an error. NM failing, lost NM and NM disk errors are 
the hardware errors that come to mind.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-542) Change the default global AM max-attempts value to be not one

2013-04-25 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642067#comment-13642067
 ] 

Bikas Saha commented on YARN-542:
-

I think that the global default should be one because an app attempt can fail 
due to a large number of user-level reasons and should not be retried. The 
system should retry an attempt automatically upon hardware failure or YARN 
error, and this default should be set to 1.
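
For reference, the global value under discussion is the
yarn.resourcemanager.am.max-retries property mentioned in YARN-611 above; a
yarn-site.xml override might look like this (the value shown is only an
example):

{code:xml}
<!-- Example yarn-site.xml override; the value is illustrative only. -->
<property>
  <name>yarn.resourcemanager.am.max-retries</name>
  <value>2</value>
</property>
{code}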

> Change the default global AM max-attempts value to be not one
> -
>
> Key: YARN-542
> URL: https://issues.apache.org/jira/browse/YARN-542
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Zhijie Shen
> Fix For: 2.0.5-beta
>
> Attachments: YARN-542.1.patch
>
>
> Today, the global AM max-attempts is set to 1, which is a bad choice. AM 
> max-attempts accounts for both AM-level failures as well as container crashes 
> due to localization issues, lost nodes, etc. To account for AM crashes due to 
> problems that are not caused by user code, mainly lost nodes, we want to give 
> AMs some retries.
> I propose we change it to at least two. We can change it to 4 to match other 
> retry-configs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-613) Create NM proxy per NM instead of per container

2013-04-25 Thread Bikas Saha (JIRA)
Bikas Saha created YARN-613:
---

 Summary: Create NM proxy per NM instead of per container
 Key: YARN-613
 URL: https://issues.apache.org/jira/browse/YARN-613
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha


Currently a new NM proxy has to be created per container since the secure 
authentication is using a containertoken from the container.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-612) Cleanup BuilderUtils

2013-04-25 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated YARN-612:


Issue Type: Sub-task  (was: Bug)
Parent: YARN-386

> Cleanup BuilderUtils
> 
>
> Key: YARN-612
> URL: https://issues.apache.org/jira/browse/YARN-612
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>
> There are 4 different methods to create ApplicationId. There are likely other 
> such methods as well which could be consolidated.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642059#comment-13642059
 ] 

Hadoop QA commented on YARN-326:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12580560/YARN-326.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/823//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/823//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/823//console

This message is automatically generated.

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc.pdf, YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.
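
As a quick illustration of dominant resource fairness (the numbers are invented
for the example): each application's dominant share is its largest
per-resource fraction of the cluster, and the scheduler favours the application
with the smallest dominant share:

{code:java}
public class DominantShareExample {
  // Dominant share = max over resource types of (usage / cluster capacity).
  static double dominantShare(long usedMem, long usedCpu, long clusterMem, long clusterCpu) {
    return Math.max((double) usedMem / clusterMem, (double) usedCpu / clusterCpu);
  }

  public static void main(String[] args) {
    // Cluster: 100 GB memory, 40 vcores. App A is memory-heavy, app B is CPU-heavy.
    double a = dominantShare(30, 4, 100, 40);   // max(0.30, 0.10) = 0.30
    double b = dominantShare(10, 16, 100, 40);  // max(0.10, 0.40) = 0.40
    // Under DRF the scheduler would allocate to A next, since its dominant share is smaller.
    System.out.printf("A=%.2f B=%.2f -> next allocation goes to %s%n",
        a, b, a < b ? "A" : "B");
  }
}
{code}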

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-612) Cleanup BuilderUtils

2013-04-25 Thread Siddharth Seth (JIRA)
Siddharth Seth created YARN-612:
---

 Summary: Cleanup BuilderUtils
 Key: YARN-612
 URL: https://issues.apache.org/jira/browse/YARN-612
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siddharth Seth


There are 4 different methods to create ApplicationId. There are likely other 
such methods as well which could be consolidated.
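
One possible shape of the consolidation, as a sketch only: newAppId is a
hypothetical replacement rather than an existing BuilderUtils signature, and it
assumes the ApplicationId record still exposes the setters it had at the time:

{code:java}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.util.Records;

public final class AppIdFactory {
  private AppIdFactory() {}

  // One canonical factory instead of several near-duplicate creation methods.
  public static ApplicationId newAppId(long clusterTimestamp, int id) {
    ApplicationId appId = Records.newRecord(ApplicationId.class);
    appId.setClusterTimestamp(clusterTimestamp);
    appId.setId(id);
    return appId;
  }
}
{code}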

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-386) [Umbrella] YARN API Changes

2013-04-25 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth reassigned YARN-386:
---

Assignee: Siddharth Seth

> [Umbrella] YARN API Changes
> ---
>
> Key: YARN-386
> URL: https://issues.apache.org/jira/browse/YARN-386
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Siddharth Seth
>
> This is the umbrella ticket to capture any and every API cleanup that we wish 
> to do before YARN can be deemed beta/stable. Doing this API cleanup now and 
> ASAP will help us escape the pain of supporting bad APIs in beta/stable 
> releases.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-610) ClientToken should not be set in the environment

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-610:
-

Issue Type: Sub-task  (was: Bug)
Parent: YARN-47

> ClientToken should not be set in the environment
> 
>
> Key: YARN-610
> URL: https://issues.apache.org/jira/browse/YARN-610
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>
> Similar to YARN-579, this can be set via ContainerTokens

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-610) ClientToken should not be set in the environment

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reassigned YARN-610:


Assignee: Vinod Kumar Vavilapalli

> ClientToken should not be set in the environment
> 
>
> Key: YARN-610
> URL: https://issues.apache.org/jira/browse/YARN-610
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Vinod Kumar Vavilapalli
>
> Similar to YARN-579, this can be set via ContainerTokens

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-611) Add an AM retry count reset window to YARN RM

2013-04-25 Thread Chris Riccomini (JIRA)
Chris Riccomini created YARN-611:


 Summary: Add an AM retry count reset window to YARN RM
 Key: YARN-611
 URL: https://issues.apache.org/jira/browse/YARN-611
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: Chris Riccomini


YARN currently has the following config:

yarn.resourcemanager.am.max-retries

This config defaults to 2, and defines how many times to retry a "failed" AM 
before failing the whole YARN job. YARN counts an AM as failed if the node that 
it was running on dies (the NM will time out, which counts as a failure for the 
AM), or if the AM itself dies.

This configuration is insufficient for long-running (or infinitely running) 
YARN jobs, since the machine (or NM) that the AM is running on will eventually 
need to be restarted (or the machine/NM will fail). In such an event, the AM 
has not done anything wrong, but this is counted as a "failure" by the RM. 
Since the retry count for the AM is never reset, the accumulated machine/NM 
failures will eventually push the AM failure count above the configured value 
for yarn.resourcemanager.am.max-retries. Once this happens, the RM marks the 
job as failed and shuts it down. This behavior is not ideal.

I propose that we add a second configuration:

yarn.resourcemanager.am.retry-count-window-ms

This configuration would define a window of time used to decide when an AM is 
"well behaved" and it is safe to reset its failure count back to zero. Every 
time an AM fails, RMAppImpl would check the last time that the AM failed. If 
the last failure was less than retry-count-window-ms ago, and the new failure 
count is > max-retries, then the job should fail. If the AM has never failed, 
if the retry count is < max-retries, or if the last failure was OUTSIDE the 
retry-count-window-ms, then the job should be restarted. Additionally, if the 
last failure was outside the retry-count-window-ms, then the failure count 
should be set back to 0.

This would give developers a way to have well-behaved AMs run forever, while 
still failing misbehaving AMs after a short period of time.

I think the work to be done here is to change RMAppImpl to actually look at 
app.attempts, and see if there have been more than max-retries failures in the 
last retry-count-window-ms milliseconds. If there have, then the job should 
fail; if not, the job should go forward. Additionally, we might also need to 
add an endTime in either RMAppAttemptImpl or RMAppFailedAttemptEvent, so that 
RMAppImpl can check the time of the failure. (A sketch of this window check 
follows below.)

Thoughts?
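
A minimal sketch of one way the window check could work; the class name and the
timestamp bookkeeping are invented for illustration, and the real change would
live in RMAppImpl against app.attempts:

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: tracks AM failure timestamps and decides whether the app
// should fail, using a sliding validity window for the retry count.
public class AmRetryWindowTracker {
  private final int maxRetries;
  private final long windowMs;
  private final Deque<Long> failureTimes = new ArrayDeque<Long>();

  public AmRetryWindowTracker(int maxRetries, long windowMs) {
    this.maxRetries = maxRetries;
    this.windowMs = windowMs;
  }

  /** Record an AM failure; return true if the whole app should now be failed. */
  public synchronized boolean onAmFailure(long nowMs) {
    // Failures older than the window no longer count against the AM.
    while (!failureTimes.isEmpty() && nowMs - failureTimes.peekFirst() > windowMs) {
      failureTimes.removeFirst();
    }
    failureTimes.addLast(nowMs);
    return failureTimes.size() > maxRetries;
  }
}
{code}

With maxRetries=2 and a 10-minute window, for example, three NM losses spread
over several days never exceed the threshold, while three crashes within ten
minutes do.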

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-25 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642040#comment-13642040
 ] 

Sandy Ryza commented on YARN-326:
-

Attached a design doc PDF and an initial patch.  It still needs configuration 
of non-memory min resources, and probably needs more tests.

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc.pdf, YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-25 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-326:


Attachment: (was: YARN-326.patch)

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc.pdf, YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-25 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-326:


Attachment: YARN-326.patch

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc.pdf, YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-25 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-326:


Attachment: FairSchedulerDRFDesignDoc.pdf

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: FairSchedulerDRFDesignDoc.pdf, YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-422) Add AM-NM client library

2013-04-25 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13642038#comment-13642038
 ] 

Bikas Saha commented on YARN-422:
-

Overall the approach looks fine. The patch looks like a work in progress, so I 
am not reviewing it thoroughly.
A lot of the complexity and verbosity comes from the fact that we need to 
create a new proxy per container rather than per NM.
We should file a JIRA to fix that in the protocol.

> Add AM-NM client library
> 
>
> Key: YARN-422
> URL: https://issues.apache.org/jira/browse/YARN-422
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Bikas Saha
>Assignee: Zhijie Shen
> Attachments: AMNMClient_Defination.txt, proposal_v1.pdf
>
>
> Create a simple wrapper over the AM-NM container protocol to hide the details 
> of the protocol implementation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-610) ClientToken should not be set in the environment

2013-04-25 Thread Siddharth Seth (JIRA)
Siddharth Seth created YARN-610:
---

 Summary: ClientToken should not be set in the environment
 Key: YARN-610
 URL: https://issues.apache.org/jira/browse/YARN-610
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siddharth Seth


Similar to YARN-579, this can be set via ContainerTokens

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-326) Add multi-resource scheduling to the fair scheduler

2013-04-25 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-326:


Attachment: YARN-326.patch

> Add multi-resource scheduling to the fair scheduler
> ---
>
> Key: YARN-326
> URL: https://issues.apache.org/jira/browse/YARN-326
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Attachments: YARN-326.patch
>
>
> With YARN-2 in, the capacity scheduler has the ability to schedule based on 
> multiple resources, using dominant resource fairness.  The fair scheduler 
> should be able to do multiple resource scheduling as well, also using 
> dominant resource fairness.
> More details to come on how the corner cases with fair scheduler configs such 
> as min and max resources will be handled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-609) Fix synchronization issues in APIs which take in lists

2013-04-25 Thread Vinod Kumar Vavilapalli (JIRA)
Vinod Kumar Vavilapalli created YARN-609:


 Summary: Fix synchronization issues in APIs which take in lists
 Key: YARN-609
 URL: https://issues.apache.org/jira/browse/YARN-609
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli


Some of the APIs take in lists and the setter-APIs don't always do proper 
synchronization. We need to fix these.
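
For illustration, the kind of pattern such a fix usually lands on: synchronize
the setter and take a defensive copy so a caller mutating its list later cannot
race with readers. The class and field names below are made up, not the actual
YARN records:

{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative record-style holder whose setter accepts a list.
public class ResourceRequestList {
  private List<String> asks = new ArrayList<String>();

  public synchronized void setAsks(List<String> newAsks) {
    // Copy instead of aliasing the caller's list.
    this.asks = newAsks == null
        ? new ArrayList<String>()
        : new ArrayList<String>(newAsks);
  }

  public synchronized List<String> getAsks() {
    // Hand out a read-only snapshot.
    return Collections.unmodifiableList(new ArrayList<String>(asks));
  }
}
{code}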

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-579) Make ApplicationToken part of Container's token list to help RM-restart

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641829#comment-13641829
 ] 

Hudson commented on YARN-579:
-

Integrated in Hadoop-Mapreduce-trunk #1410 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1410/])
YARN-579. Stop setting the Application Token in the AppMaster env, in 
favour of the copy present in the container token field. Contributed by Vinod 
Kumar Vavilapalli. (Revision 1471814)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1471814
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/AMRMClientImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/amlauncher/AMLauncher.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAMAuthorization.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestApplicationTokens.java


> Make ApplicationToken part of Container's token list to help RM-restart
> ---
>
> Key: YARN-579
> URL: https://issues.apache.org/jira/browse/YARN-579
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.4-alpha
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Fix For: 2.0.5-beta
>
> Attachments: YARN-579-20130422.1.txt, 
> YARN-579-20130422.1_YARNChanges.txt
>
>
> Container is already persisted for helping RM restart. Instead of explicitly 
> setting ApplicationToken in AM's env, if we change it to be in Container, we 
> can avoid env and can also help restart.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-605) Failing unit test in TestNMWebServices when using git for source control

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641827#comment-13641827
 ] 

Hudson commented on YARN-605:
-

Integrated in Hadoop-Mapreduce-trunk #1410 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1410/])
YARN-605. Fix failing unit test in TestNMWebServices when versionInfo has 
parentheses, like when running on a git checkout. Contributed by Hitesh Shah. 
(Revision 1471608)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1471608
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/WebServicesTestUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMWebServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java


> Failing unit test in TestNMWebServices when using git for source control 
> -
>
> Key: YARN-605
> URL: https://issues.apache.org/jira/browse/YARN-605
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Fix For: 2.0.5-beta
>
> Attachments: YARN-605.1.patch
>
>
> Failed tests:   
> testNode(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices): 
> hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeSlash(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeDefault(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeInfo(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeInfoSlash(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeInfoDefault(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testSingleNodesXML(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dc

[jira] [Commented] (YARN-577) ApplicationReport does not provide progress value of application

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641826#comment-13641826
 ] 

Hudson commented on YARN-577:
-

Integrated in Hadoop-Mapreduce-trunk #1410 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1410/])
YARN-577. Add application-progress also to ApplicationReport. Contributed 
by Hitesh Shah.
MAPREDUCE-5178. Update MR App to set progress in ApplicationReport after 
YARN-577. Contributed by Hitesh Shah. (Revision 1475636)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1475636
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/NotRunningJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestClientServiceDelegate.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationReport.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ApplicationReportPBImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/ApplicationCLI.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/BuilderUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java


> ApplicationReport does not provide progress value of application
> 
>
> Key: YARN-577
> URL: https://issues.apache.org/jira/browse/YARN-577
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Fix For: 2.0.5-beta
>
> Attachments: YARN-577.1.patch, YARN-577.2.patch, 
> YARN-577.combined.2.patch, YARN-577.combinedwithMR.patch
>
>
> An application sends its progress % to the RM via AllocateRequest. This 
> should be able to be retrieved by a client via the ApplicationReport.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-595) Refactor fair scheduler to use common Resources

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641821#comment-13641821
 ] 

Hudson commented on YARN-595:
-

Integrated in Hadoop-Mapreduce-trunk #1410 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1410/])
YARN-595. Refactor fair scheduler to use common Resources. Contributed by 
Sandy Ryza. (Revision 1475670)

 Result = SUCCESS
tomwhite : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1475670
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/resource/Resources.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSParentQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSSchedulerNode.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/Resources.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resource
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resource/TestResources.java


> Refactor fair scheduler to use common Resources
> ---
>
> Key: YARN-595
> URL: https://issues.apache.org/jira/browse/YARN-595
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.0.5-beta
>
> Attachments: YARN-595-1.patch, YARN-595.patch, YARN-595.patch
>
>
> resourcemanager.fair and resourcemanager.resources have two copies of 
> basically the same code for operations on Resource objects
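
A toy example of the kind of helper both packages currently duplicate, written
against the public Resource record; the method names here are illustrative, not
the exact ones in either copy:

{code:java}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.Records;

// Illustrative arithmetic on Resource objects that a single shared
// Resources utility could provide to both packages.
public final class ResourceOps {
  private ResourceOps() {}

  public static Resource add(Resource a, Resource b) {
    Resource out = Records.newRecord(Resource.class);
    out.setMemory(a.getMemory() + b.getMemory());
    return out;
  }

  public static boolean fitsIn(Resource ask, Resource available) {
    return ask.getMemory() <= available.getMemory();
  }
}
{code}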

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-289) Fair scheduler allows reservations that won't fit on node

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641822#comment-13641822
 ] 

Hudson commented on YARN-289:
-

Integrated in Hadoop-Mapreduce-trunk #1410 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1410/])
YARN-289. Fair scheduler allows reservations that won't fit on node. 
Contributed by Sandy Ryza. (Revision 1475681)

 Result = SUCCESS
tomwhite : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1475681
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java


> Fair scheduler allows reservations that won't fit on node
> -
>
> Key: YARN-289
> URL: https://issues.apache.org/jira/browse/YARN-289
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.0.5-beta
>
> Attachments: YARN-289-1.patch, YARN-289.patch
>
>
> An application requests a container with 1024 MB.  It then requests a 
> container with 2048 MB.  A node shows up with 1024 MB available.  Even if the 
> application is the only one running, neither request will be scheduled on it.
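
The gist of the needed guard, as a standalone sketch with made-up numbers:
never place a reservation on a node whose total capacity can never satisfy the
request, so smaller requests are not starved behind it:

{code:java}
public class ReservationCheckExample {
  // Only reserve if the node could ever satisfy the request, even when busy now.
  static boolean shouldReserve(int requestMb, int nodeTotalMb) {
    return requestMb <= nodeTotalMb;
  }

  public static void main(String[] args) {
    int nodeTotalMb = 1024;
    System.out.println("reserve 1024MB ask? " + shouldReserve(1024, nodeTotalMb)); // true
    System.out.println("reserve 2048MB ask? " + shouldReserve(2048, nodeTotalMb)); // false: can never fit
  }
}
{code}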

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-579) Make ApplicationToken part of Container's token list to help RM-restart

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641753#comment-13641753
 ] 

Hudson commented on YARN-579:
-

Integrated in Hadoop-Hdfs-trunk #1383 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1383/])
YARN-579. Stop setting the Application Token in the AppMaster env, in 
favour of the copy present in the container token field. Contributed by Vinod 
Kumar Vavilapalli. (Revision 1471814)

 Result = FAILURE
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1471814
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/AMRMClientImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/amlauncher/AMLauncher.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAMAuthorization.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestApplicationTokens.java


> Make ApplicationToken part of Container's token list to help RM-restart
> ---
>
> Key: YARN-579
> URL: https://issues.apache.org/jira/browse/YARN-579
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.4-alpha
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Fix For: 2.0.5-beta
>
> Attachments: YARN-579-20130422.1.txt, 
> YARN-579-20130422.1_YARNChanges.txt
>
>
> Container is already persisted for helping RM restart. Instead of explicitly 
> setting ApplicationToken in AM's env, if we change it to be in Container, we 
> can avoid env and can also help restart.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-577) ApplicationReport does not provide progress value of application

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641750#comment-13641750
 ] 

Hudson commented on YARN-577:
-

Integrated in Hadoop-Hdfs-trunk #1383 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1383/])
YARN-577. Add application-progress also to ApplicationReport. Contributed 
by Hitesh Shah.
MAPREDUCE-5178. Update MR App to set progress in ApplicationReport after 
YARN-577. Contributed by Hitesh Shah. (Revision 1475636)

 Result = FAILURE
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1475636
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/NotRunningJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestClientServiceDelegate.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationReport.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ApplicationReportPBImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/ApplicationCLI.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/BuilderUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java


> ApplicationReport does not provide progress value of application
> 
>
> Key: YARN-577
> URL: https://issues.apache.org/jira/browse/YARN-577
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Fix For: 2.0.5-beta
>
> Attachments: YARN-577.1.patch, YARN-577.2.patch, 
> YARN-577.combined.2.patch, YARN-577.combinedwithMR.patch
>
>
> An application sends its progress % to the RM via AllocateRequest. This 
> should be able to be retrieved by a client via the ApplicationReport.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-605) Failing unit test in TestNMWebServices when using git for source control

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641751#comment-13641751
 ] 

Hudson commented on YARN-605:
-

Integrated in Hadoop-Hdfs-trunk #1383 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1383/])
YARN-605. Fix failing unit test in TestNMWebServices when versionInfo has 
parentheses, like when running on a git checkout. Contributed by Hitesh Shah. 
(Revision 1471608)

 Result = FAILURE
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1471608
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/WebServicesTestUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMWebServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java


> Failing unit test in TestNMWebServices when using git for source control 
> -
>
> Key: YARN-605
> URL: https://issues.apache.org/jira/browse/YARN-605
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Fix For: 2.0.5-beta
>
> Attachments: YARN-605.1.patch
>
>
> Failed tests:   
> testNode(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices): 
> hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeSlash(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeDefault(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeInfo(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeInfoSlash(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeInfoDefault(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testSingleNodesXML(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac9

[jira] [Commented] (YARN-595) Refactor fair scheduler to use common Resources

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641745#comment-13641745
 ] 

Hudson commented on YARN-595:
-

Integrated in Hadoop-Hdfs-trunk #1383 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1383/])
YARN-595. Refactor fair scheduler to use common Resources. Contributed by 
Sandy Ryza. (Revision 1475670)

 Result = FAILURE
tomwhite : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1475670
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/resource/Resources.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSParentQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSSchedulerNode.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/Resources.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FairSharePolicy.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/policies/FifoPolicy.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resource
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/resource/TestResources.java


> Refactor fair scheduler to use common Resources
> ---
>
> Key: YARN-595
> URL: https://issues.apache.org/jira/browse/YARN-595
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Affects Versions: 2.0.3-alpha
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.0.5-beta
>
> Attachments: YARN-595-1.patch, YARN-595.patch, YARN-595.patch
>
>
> resourcemanager.fair and resourcemanager.resources have two copies of 
> basically the same code for operations on Resource objects

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-289) Fair scheduler allows reservations that won't fit on node

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641746#comment-13641746
 ] 

Hudson commented on YARN-289:
-

Integrated in Hadoop-Hdfs-trunk #1383 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1383/])
YARN-289. Fair scheduler allows reservations that won't fit on node. 
Contributed by Sandy Ryza. (Revision 1475681)

 Result = FAILURE
tomwhite : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1475681
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java


> Fair scheduler allows reservations that won't fit on node
> -
>
> Key: YARN-289
> URL: https://issues.apache.org/jira/browse/YARN-289
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sandy Ryza
>Assignee: Sandy Ryza
> Fix For: 2.0.5-beta
>
> Attachments: YARN-289-1.patch, YARN-289.patch
>
>
> An application requests a container with 1024 MB.  It then requests a 
> container with 2048 MB.  A node shows up with 1024 MB available.  Even if the 
> application is the only one running, neither request will be scheduled on it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-579) Make ApplicationToken part of Container's token list to help RM-restart

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641655#comment-13641655
 ] 

Hudson commented on YARN-579:
-

Integrated in Hadoop-Yarn-trunk #194 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/194/])
YARN-579. Stop setting the Application Token in the AppMaster env, in 
favour of the copy present in the container token field. Contributed by Vinod 
Kumar Vavilapalli. (Revision 1471814)

 Result = SUCCESS
sseth : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1471814
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/AMRMClientImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/amlauncher/AMLauncher.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAMAuthorization.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/security/TestApplicationTokens.java


> Make ApplicationToken part of Container's token list to help RM-restart
> ---
>
> Key: YARN-579
> URL: https://issues.apache.org/jira/browse/YARN-579
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.0.4-alpha
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Fix For: 2.0.5-beta
>
> Attachments: YARN-579-20130422.1.txt, 
> YARN-579-20130422.1_YARNChanges.txt
>
>
> Container is already persisted for helping RM restart. Instead of explicitly 
> setting ApplicationToken in AM's env, if we change it to be in Container, we 
> can avoid env and can also help restart.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-577) ApplicationReport does not provide progress value of application

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641652#comment-13641652
 ] 

Hudson commented on YARN-577:
-

Integrated in Hadoop-Yarn-trunk #194 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/194/])
YARN-577. Add application-progress also to ApplicationReport. Contributed 
by Hitesh Shah.
MAPREDUCE-5178. Update MR App to set progress in ApplicationReport after 
YARN-577. Contributed by Hitesh Shah. (Revision 1475636)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1475636
Files : 
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/NotRunningJob.java
* 
/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestClientServiceDelegate.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ApplicationReport.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ApplicationReportPBImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/ApplicationCLI.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/BuilderUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java


> ApplicationReport does not provide progress value of application
> 
>
> Key: YARN-577
> URL: https://issues.apache.org/jira/browse/YARN-577
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Fix For: 2.0.5-beta
>
> Attachments: YARN-577.1.patch, YARN-577.2.patch, 
> YARN-577.combined.2.patch, YARN-577.combinedwithMR.patch
>
>
> An application sends its progress % to the RM via AllocateRequest. This 
> should be able to be retrieved by a client via the ApplicationReport.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-605) Failing unit test in TestNMWebServices when using git for source control

2013-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641653#comment-13641653
 ] 

Hudson commented on YARN-605:
-

Integrated in Hadoop-Yarn-trunk #194 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/194/])
YARN-605. Fix failing unit test in TestNMWebServices when versionInfo has 
parentheses, like when running on a git checkout. Contributed by Hitesh Shah. 
(Revision 1471608)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1471608
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/WebServicesTestUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestNMWebServices.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServices.java


> Failing unit test in TestNMWebServices when using git for source control 
> -
>
> Key: YARN-605
> URL: https://issues.apache.org/jira/browse/YARN-605
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Hitesh Shah
>Assignee: Hitesh Shah
> Fix For: 2.0.5-beta
>
> Attachments: YARN-605.1.patch
>
>
> Failed tests:   
> testNode(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices): 
> hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeSlash(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeDefault(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeInfo(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeInfoSlash(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testNodeInfoDefault(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, 
> origin/trunk, origin/HEAD, mrx-track) by Hitesh source checksum 
> f89f5c9b9c9d44cf3be5c2686f2d789
>   
> testSingleNodesXML(org.apache.hadoop.yarn.server.nodemanager.webapp.TestNMWebServices):
>  hadoopBuildVersion doesn't match, got: 3.0.0-SNAPSHOT from 
> fddcdcfb3cfe7dcc4f77c1ac953dd2cc0a890c62 (HEAD, origin/trunk, origin/HEAD, 
> mrx-track) by Hitesh source checksum f89f5c9b9c9d44cf3be5c2686f2d789 
> expected: 3.0.0-SNAPSHOT from fddcdcfb3cfe7dcc4f77c1ac953
