[jira] [Commented] (YARN-378) ApplicationMaster retry times should be set by Client

2013-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599778#comment-13599778
 ] 

Hadoop QA commented on YARN-378:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573265/YARN-378_6.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/503//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/503//console

This message is automatically generated.

 ApplicationMaster retry times should be set by Client
 -

 Key: YARN-378
 URL: https://issues.apache.org/jira/browse/YARN-378
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client, resourcemanager
 Environment: suse
Reporter: xieguiming
Assignee: Zhijie Shen
  Labels: usability
 Attachments: YARN-378_1.patch, YARN-378_2.patch, YARN-378_3.patch, 
 YARN-378_4.patch, YARN-378_5.patch, YARN-378_6.patch


 We should support different ApplicationMaster retry counts for different 
 clients or users. In other words, yarn.resourcemanager.am.max-retries should 
 be settable by the client. 
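A minimal client-side sketch of what the request amounts to, assuming the per-application limit ends up on ApplicationSubmissionContext (the setMaxAppAttempts name below is an assumption for illustration, not necessarily what this patch adds):

{code}
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.util.Records;

public class AmRetryExample {
  // Cap AM attempts for this application only, instead of relying on the
  // cluster-wide yarn.resourcemanager.am.max-retries value.
  static ApplicationSubmissionContext contextWithRetryLimit(int maxAppAttempts) {
    ApplicationSubmissionContext ctx =
        Records.newRecord(ApplicationSubmissionContext.class);
    ctx.setMaxAppAttempts(maxAppAttempts); // assumed per-app override
    return ctx;
  }
}
{code}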

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-468) coverage fix for org.apache.hadoop.yarn.server.webproxy.amfilter

2013-03-12 Thread Aleksey Gorshkov (JIRA)
Aleksey Gorshkov created YARN-468:
-

 Summary: coverage fix for 
org.apache.hadoop.yarn.server.webproxy.amfilter 
 Key: YARN-468
 URL: https://issues.apache.org/jira/browse/YARN-468
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov


coverage fix for org.apache.hadoop.yarn.server.webproxy.amfilter 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-468) coverage fix for org.apache.hadoop.yarn.server.webproxy.amfilter

2013-03-12 Thread Aleksey Gorshkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Gorshkov updated YARN-468:
--

Attachment: YARN-468-trunk.patch

 coverage fix for org.apache.hadoop.yarn.server.webproxy.amfilter 
 -

 Key: YARN-468
 URL: https://issues.apache.org/jira/browse/YARN-468
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
 Attachments: YARN-468-trunk.patch


 coverage fix for org.apache.hadoop.yarn.server.webproxy.amfilter 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-468) coverage fix for org.apache.hadoop.yarn.server.webproxy.amfilter

2013-03-12 Thread Aleksey Gorshkov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Gorshkov updated YARN-468:
--

Description: 
coverage fix for org.apache.hadoop.yarn.server.webproxy.amfilter 

patch YARN-468-trunk.patch for trunk, branch-2, branch-0.23

  was:coverage fix for org.apache.hadoop.yarn.server.webproxy.amfilter 


 coverage fix for org.apache.hadoop.yarn.server.webproxy.amfilter 
 -

 Key: YARN-468
 URL: https://issues.apache.org/jira/browse/YARN-468
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 3.0.0, 0.23.7, 2.0.4-alpha
Reporter: Aleksey Gorshkov
 Attachments: YARN-468-trunk.patch


 coverage fix for org.apache.hadoop.yarn.server.webproxy.amfilter 
 patch YARN-468-trunk.patch for trunk, branch-2, branch-0.23

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology

2013-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600180#comment-13600180
 ] 

Hadoop QA commented on YARN-18:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573355/YARN-18-v4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

  {color:red}-1 one of tests included doesn't have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 9 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/505//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/505//console

This message is automatically generated.

 Make locality in YARN's container assignment and task scheduling pluggable 
 for other deployment topology
 -

 Key: YARN-18
 URL: https://issues.apache.org/jira/browse/YARN-18
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: 
 HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
 MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, 
 MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, 
 MAPREDUCE-4309-v7.patch, YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, 
 YARN-18-v3.2.patch, YARN-18-v3.patch, YARN-18-v4.patch


 There are several classes in YARN’s container assignment and task scheduling 
 algorithms that relate to data locality and were updated to give preference 
 to running a container at locality levels other than node-local and rack-local 
 (such as nodegroup-local). This proposes to make these data structures and 
 algorithms pluggable, e.g. SchedulerNode, RMNodeImpl, etc. The inner class 
 ScheduledRequests was made a package-level class so it would be easier to 
 create a subclass, ScheduledRequestsWithNodeGroup.
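A hypothetical sketch of the subclassing pattern the description refers to; apart from the ScheduledRequests and ScheduledRequestsWithNodeGroup names, the fields and method below are illustrative assumptions rather than the actual patch:

{code}
// Default behaviour: consider node-local, then rack-local placement.
class ScheduledRequests {
  protected String preferredLocation(String host, String rack) {
    return host != null ? host : rack;
  }
}

// Nodegroup-aware variant made possible by lifting ScheduledRequests to
// package level: prefer nodegroup-local placement between node and rack.
class ScheduledRequestsWithNodeGroup extends ScheduledRequests {
  private final String nodeGroup;

  ScheduledRequestsWithNodeGroup(String nodeGroup) {
    this.nodeGroup = nodeGroup;
  }

  @Override
  protected String preferredLocation(String host, String rack) {
    if (host != null) {
      return host;
    }
    return nodeGroup != null ? nodeGroup : rack;
  }
}
{code}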

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-379) yarn [node,application] command print logger info messages

2013-03-12 Thread Abhishek Kapoor (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13599985#comment-13599985
 ] 

Abhishek Kapoor commented on YARN-379:
--

Are we okay with the fix?

Please suggest.

Thanks,
Abhi

 yarn [node,application] command print logger info messages
 --

 Key: YARN-379
 URL: https://issues.apache.org/jira/browse/YARN-379
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.0.3-alpha
Reporter: Thomas Graves
Assignee: Abhishek Kapoor
  Labels: usability
 Attachments: YARN-379.patch


 Running the yarn node and yarn application commands results in annoying 
 INFO log messages being printed:
 $ yarn node -list
 13/02/06 02:36:50 INFO service.AbstractService: 
 Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
 13/02/06 02:36:50 INFO service.AbstractService: 
 Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
 Total Nodes:1
  Node-IdNode-State  Node-Http-Address   
 Health-Status(isNodeHealthy)Running-Containers
 foo:8041RUNNING  foo:8042   true  
  0
 13/02/06 02:36:50 INFO service.AbstractService: 
 Service:org.apache.hadoop.yarn.client.YarnClientImpl is stopped.
 $ yarn application
 13/02/06 02:38:47 INFO service.AbstractService: 
 Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
 13/02/06 02:38:47 INFO service.AbstractService: 
 Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
 Invalid Command Usage : 
 usage: application
  -kill <arg>     Kills the application.
  -list           Lists all the Applications from RM.
  -status <arg>   Prints the status of the application.
 13/02/06 02:38:47 INFO service.AbstractService: 
 Service:org.apache.hadoop.yarn.client.YarnClientImpl is stopped.
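A hedged sketch of one way to quiet this from the CLI side; the logger name is inferred from the messages above and the approach is illustrative, not the attached patch:

{code}
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class QuietYarnCli {
  // Raise the AbstractService logger above INFO before the client starts, so
  // the init/start/stop chatter does not interleave with command output.
  static void quietServiceLogging() {
    Logger.getLogger("org.apache.hadoop.yarn.service.AbstractService")
          .setLevel(Level.WARN);
  }
}
{code}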

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology

2013-03-12 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-18:
---

Attachment: YARN-18-v4.patch

 Make locality in YARN's container assignment and task scheduling pluggable 
 for other deployment topology
 -

 Key: YARN-18
 URL: https://issues.apache.org/jira/browse/YARN-18
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: 
 HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
 MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, 
 MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, 
 MAPREDUCE-4309-v7.patch, YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, 
 YARN-18-v3.2.patch, YARN-18-v3.patch, YARN-18-v4.patch


 There are several classes in YARN’s container assignment and task scheduling 
 algorithms that relate to data locality and were updated to give preference 
 to running a container at locality levels other than node-local and rack-local 
 (such as nodegroup-local). This proposes to make these data structures and 
 algorithms pluggable, e.g. SchedulerNode, RMNodeImpl, etc. The inner class 
 ScheduledRequests was made a package-level class so it would be easier to 
 create a subclass, ScheduledRequestsWithNodeGroup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-460) CS user left in list of active users for the queue even when application finished

2013-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600182#comment-13600182
 ] 

Hadoop QA commented on YARN-460:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573361/YARN-460.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/506//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/506//console

This message is automatically generated.

 CS user left in list of active users for the queue even when application 
 finished
 -

 Key: YARN-460
 URL: https://issues.apache.org/jira/browse/YARN-460
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.7, 2.0.4-alpha
Reporter: Thomas Graves
Assignee: Thomas Graves
Priority: Critical
 Attachments: YARN-460-branch-0.23.patch, YARN-460-branch-0.23.patch, 
 YARN-460.patch, YARN-460.patch, YARN-460.patch


 We have seen a user get left in the queue's list of active users even though 
 the application was removed. This can cause everyone else in the queue to get 
 fewer resources when the minimum-user-limit-percent config is in use.
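A hedged sketch of the invariant the fix needs to restore; the ActiveUsersManager class and method names follow the scheduler code, but treat the surrounding wiring as an illustration rather than the attached patch:

{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.ActiveUsersManager;

public class ActiveUserCleanup {
  // When an application finishes, its user must be deactivated; otherwise the
  // user stays in the active-users set and minimum-user-limit-percent divides
  // the queue capacity among too many users.
  static void onApplicationFinished(ActiveUsersManager activeUsers,
                                    String user, ApplicationId appId) {
    activeUsers.deactivateApplication(user, appId);
  }
}
{code}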

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-460) CS user left in list of active users for the queue even when application finished

2013-03-12 Thread Thomas Graves (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Graves updated YARN-460:
---

Attachment: YARN-460.patch

Trunk and branch-2 patch. Unfortunately I couldn't easily come up with a unit 
test that hits the application-stopped condition (without hitting the null 
check), because the relevant data structures are private.

 CS user left in list of active users for the queue even when application 
 finished
 -

 Key: YARN-460
 URL: https://issues.apache.org/jira/browse/YARN-460
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.7, 2.0.4-alpha
Reporter: Thomas Graves
Assignee: Thomas Graves
Priority: Critical
 Attachments: YARN-460-branch-0.23.patch, YARN-460-branch-0.23.patch, 
 YARN-460.patch, YARN-460.patch, YARN-460.patch


 We have seen a user get left in the queue's list of active users even though 
 the application was removed. This can cause everyone else in the queue to get 
 fewer resources when the minimum-user-limit-percent config is in use.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology

2013-03-12 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600171#comment-13600171
 ] 

Junping Du commented on YARN-18:


Thanks, Luke, for your review and comments! I addressed almost all of your 
points in the v4 patch, except for putting ScheduledRequests into a 
topology-related factory class for YARN: unlike the other objects, this one is 
created within the MRAppMaster rather than the YARN ResourceManager, so it 
lives in a different package. Am I missing something? 

 Make locality in YARN's container assignment and task scheduling pluggable 
 for other deployment topology
 -

 Key: YARN-18
 URL: https://issues.apache.org/jira/browse/YARN-18
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: 
 HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
 MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, 
 MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, 
 MAPREDUCE-4309-v7.patch, YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, 
 YARN-18-v3.2.patch, YARN-18-v3.patch, YARN-18-v4.patch


 There are several classes in YARN’s container assignment and task scheduling 
 algorithms that relate to data locality and were updated to give preference 
 to running a container at locality levels other than node-local and rack-local 
 (such as nodegroup-local). This proposes to make these data structures and 
 algorithms pluggable, e.g. SchedulerNode, RMNodeImpl, etc. The inner class 
 ScheduledRequests was made a package-level class so it would be easier to 
 create a subclass, ScheduledRequestsWithNodeGroup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-200) yarn log does not output all needed information, and is in a binary format

2013-03-12 Thread Ravi Prakash (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravi Prakash reassigned YARN-200:
-

Assignee: Ravi Prakash

 yarn log does not output all needed information, and is in a binary format
 --

 Key: YARN-200
 URL: https://issues.apache.org/jira/browse/YARN-200
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 0.23.5
Reporter: Robert Joseph Evans
Assignee: Ravi Prakash
  Labels: usability

 yarn logs does not output the attempt id, node name, or container id. Missing 
 these makes it very difficult to look through the logs for failed containers 
 and tie them back to actual tasks and task attempts.
 Also, the output currently includes several binary characters. This is fine 
 for machine readability, but hard for humans to read or even to process with 
 standard tools like grep.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-449) MRAppMaster classpath not set properly for unit tests in downstream projects

2013-03-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600169#comment-13600169
 ] 

Ted Yu commented on YARN-449:
-

On flubber, I installed protoc 2.4.1 but couldn't use it:
{code}
$ protoc --version
protoc: error while loading shared libraries: libprotobuf.so.7: cannot open 
shared object file: No such file or directory
{code}
I applied minimr_randomdir-branch2.txt locally and ran the following command:
{code}
mt -Dhadoop.profile=2.0 -Dtest=org.apache.hadoop.hbase.mapreduce.TestTableMapReduce#testMultiRegionTable
{code}
The test passed.

 MRAppMaster classpath not set properly for unit tests in downstream projects
 

 Key: YARN-449
 URL: https://issues.apache.org/jira/browse/YARN-449
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.3-alpha
Reporter: Siddharth Seth
Priority: Blocker
 Attachments: 7904-v5.txt, hbase-7904-v3.txt, 
 hbase-TestHFileOutputFormat-wip.txt, hbase-TestingUtility-wip.txt, 
 minimr_randomdir-branch2.txt


 Post YARN-429, unit tests for HBase continue to fail since the classpath for 
 the MRAppMaster is not being set correctly.
 Reverting YARN-129 may fix this, but I'm not sure that's the correct 
 solution. My guess is, as Alexandro pointed out in YARN-129, maven 
 classloader magic is messing up java.class.path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-460) CS user left in list of active users for the queue even when application finished

2013-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600216#comment-13600216
 ] 

Hadoop QA commented on YARN-460:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573361/YARN-460.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/507//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/507//console

This message is automatically generated.

 CS user left in list of active users for the queue even when application 
 finished
 -

 Key: YARN-460
 URL: https://issues.apache.org/jira/browse/YARN-460
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.7, 2.0.4-alpha
Reporter: Thomas Graves
Assignee: Thomas Graves
Priority: Critical
 Attachments: YARN-460-branch-0.23.patch, YARN-460-branch-0.23.patch, 
 YARN-460.patch, YARN-460.patch, YARN-460.patch


 We have seen a user get left in the queue's list of active users even though 
 the application was removed. This can cause everyone else in the queue to get 
 fewer resources when the minimum-user-limit-percent config is in use.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-460) CS user left in list of active users for the queue even when application finished

2013-03-12 Thread Thomas Graves (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600232#comment-13600232
 ] 

Thomas Graves commented on YARN-460:


Also note that I manually tested this by delaying the kill-container message 
going to the AM and sleeping between when the app is marked done and when it is 
removed from the application list. I was able to reproduce the issue, then saw 
this patch fix it and the AM get the reboot command. I tested both the capacity 
scheduler and FIFO.

 CS user left in list of active users for the queue even when application 
 finished
 -

 Key: YARN-460
 URL: https://issues.apache.org/jira/browse/YARN-460
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 0.23.7, 2.0.4-alpha
Reporter: Thomas Graves
Assignee: Thomas Graves
Priority: Critical
 Attachments: YARN-460-branch-0.23.patch, YARN-460-branch-0.23.patch, 
 YARN-460.patch, YARN-460.patch, YARN-460.patch


 We have seen a user get left in the queue's list of active users even though 
 the application was removed. This can cause everyone else in the queue to get 
 fewer resources when the minimum-user-limit-percent config is in use.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-382) SchedulerUtils improve way normalizeRequest sets the resource capabilities

2013-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600238#comment-13600238
 ] 

Hadoop QA commented on YARN-382:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12572843/YARN-382_1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:red}-1 eclipse:eclipse{color}.  The patch failed to build with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/508//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/508//console

This message is automatically generated.

 SchedulerUtils improve way normalizeRequest sets the resource capabilities
 --

 Key: YARN-382
 URL: https://issues.apache.org/jira/browse/YARN-382
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Thomas Graves
Assignee: Zhijie Shen
 Attachments: YARN-382_1.patch, YARN-382_demo.patch


 In YARN-370, we changed it from setting the capability to directly setting 
 memory and cores:
 -ask.setCapability(normalized);
 +ask.getCapability().setMemory(normalized.getMemory());
 +ask.getCapability().setVirtualCores(normalized.getVirtualCores());
 We did this because it directly sets the values in the original resource 
 object that is passed in when the AM gets allocated; without it, the AM 
 doesn't get the resource normalized correctly in the submission context. See 
 YARN-370 for more details.
 I think we should find a better way of doing this long term: first, so we 
 don't have to keep adding fields there when new resource types are added; and 
 second, because it's a bit confusing as to what it's doing and prone to being 
 accidentally broken again in the future. Something closer to what Arun 
 suggested in YARN-370 would be better, but we need to make sure all the call 
 sites work and get some more testing on it before putting it in. 
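A hedged sketch of the longer-term shape the description asks for: one helper that copies every normalized field back into the caller's Resource, so new resource types are handled in a single place. The helper name is hypothetical, not an existing SchedulerUtils API:

{code}
import org.apache.hadoop.yarn.api.records.Resource;

public class ResourceCopy {
  // Copy all normalized dimensions into the caller's Resource object in place,
  // preserving the "mutate the original object" behaviour YARN-370 relies on.
  static void copyInto(Resource target, Resource normalized) {
    target.setMemory(normalized.getMemory());
    target.setVirtualCores(normalized.getVirtualCores());
    // new resource dimensions would be added here, in one place
  }
}
{code}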

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-99) Jobs fail during resource localization when directories in file cache reach the unix directory limit

2013-03-12 Thread omkar vinit joshi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-99?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600312#comment-13600312
 ] 

omkar vinit joshi commented on YARN-99:
---

I am creating YARN-467 for the public cache issue. The private cache fix will 
be committed here.

 Jobs fail during resource localization when directories in file cache reach 
 the unix directory limit
 -

 Key: YARN-99
 URL: https://issues.apache.org/jira/browse/YARN-99
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0, 2.0.0-alpha
Reporter: Devaraj K
Assignee: Devaraj K

 If we have multiple jobs that use the distributed cache with small files, the 
 directory limit is reached before the cache-size limit, and creating further 
 directories in the file cache fails. The jobs start failing with the 
 exception below.
 {code:xml}
 java.io.IOException: mkdir of 
 /tmp/nm-local-dir/usercache/root/filecache/1701886847734194975 failed
   at org.apache.hadoop.fs.FileSystem.primitiveMkdir(FileSystem.java:909)
   at 
 org.apache.hadoop.fs.DelegateToFileSystem.mkdir(DelegateToFileSystem.java:143)
   at org.apache.hadoop.fs.FilterFs.mkdir(FilterFs.java:189)
   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:706)
   at org.apache.hadoop.fs.FileContext$4.next(FileContext.java:703)
   at 
 org.apache.hadoop.fs.FileContext$FSLinkResolver.resolve(FileContext.java:2325)
   at org.apache.hadoop.fs.FileContext.mkdir(FileContext.java:703)
   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:147)
   at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:49)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {code}
 We should have a mechanism to clean the cache files once a specified number 
 of directories is exceeded, analogous to the cache-size limit.
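A minimal sketch of the mechanism being asked for, with hypothetical names: treat the directory count as a second trigger for the existing size-based cleanup.

{code}
public class CacheCleanupPolicy {
  // Trigger the existing cleanup path when either the byte size or the number
  // of directories created in the file cache crosses its configured limit.
  static boolean shouldCleanCache(long cacheSizeBytes, long maxCacheSizeBytes,
                                  int directoryCount, int maxDirectories) {
    return cacheSizeBytes > maxCacheSizeBytes || directoryCount > maxDirectories;
  }
}
{code}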

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-459) DefaultContainerExecutor doesn't log stderr from container launch

2013-03-12 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza reassigned YARN-459:
---

Assignee: Sandy Ryza

 DefaultContainerExecutor doesn't log stderr from container launch
 -

 Key: YARN-459
 URL: https://issues.apache.org/jira/browse/YARN-459
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.3-alpha, 0.23.7
Reporter: Thomas Graves
Assignee: Sandy Ryza

 The DefaultContainerExecutor does not log stderr or add it to the diagnostics 
 message if something fails during the container launch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-449) MRAppMaster classpath not set properly for unit tests in downstream projects

2013-03-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600416#comment-13600416
 ] 

Ted Yu commented on YARN-449:
-

From 
https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/442/testReport/org.apache.hadoop.hbase.mapreduce/TestRowCounter/testRowCounterNoColumn/
 (hadoop-2.0.2-alpha was used):
{code}
2013-03-12 05:44:18,139 WARN  [DeletionService #1] 
nodemanager.DefaultContainerExecutor(276): delete returned false for path: 
[/home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-2.0.0/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-localDir-nm-1_0/usercache/jenkins/appcache/application_1363067018215_0001/container_1363067018215_0001_01_01]
2013-03-12 05:44:18,145 WARN  [AsyncDispatcher event handler] 
nodemanager.NMAuditLogger(150): USER=jenkins  OPERATION=Container Finished 
- Failed   TARGET=ContainerImplRESULT=FAILURE  DESCRIPTION=Container failed 
with state: LOCALIZATION_FAILEDAPPID=application_1363067018215_0001
CONTAINERID=container_1363067018215_0001_01_01
2013-03-12 05:44:18,141 WARN  [DeletionService #0] 
nodemanager.DefaultContainerExecutor(276): delete returned false for path: 
[/home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-2.0.0/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-localDir-nm-1_1/usercache/jenkins/appcache/application_1363067018215_0001/container_1363067018215_0001_01_01]
2013-03-12 05:44:18,220 WARN  [DeletionService #0] 
nodemanager.DefaultContainerExecutor(276): delete returned false for path: 
[/home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-2.0.0/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-localDir-nm-1_2/usercache/jenkins/appcache/application_1363067018215_0001/container_1363067018215_0001_01_01]
2013-03-12 05:44:18,220 WARN  [DeletionService #0] 
nodemanager.DefaultContainerExecutor(276): delete returned false for path: 
[/home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-2.0.0/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-localDir-nm-1_3/usercache/jenkins/appcache/application_1363067018215_0001/container_1363067018215_0001_01_01]
2013-03-12 05:44:18,865 WARN  [AsyncDispatcher event handler] 
resourcemanager.RMAuditLogger(255): USER=jenkins  OPERATION=Application 
Finished - Failed TARGET=RMAppManager RESULT=FAILURE  DESCRIPTION=App 
failed with state: FAILED   PERMISSIONS=Application 
application_1363067018215_0001 failed 1 times due to AM Container for 
appattempt_1363067018215_0001_01 exited with  exitCode: -1000 due to: 
RemoteTrace: 
java.io.IOException: Unable to rename file: 
[/home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-2.0.0/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-localDir-nm-1_1/usercache/jenkins/filecache/5596410335910248146_tmp/hadoop-262140332608909552.jar.tmp]
 to 
[/home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-2.0.0/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-localDir-nm-1_1/usercache/jenkins/filecache/5596410335910248146_tmp/hadoop-262140332608909552.jar]
at org.apache.hadoop.yarn.util.FSDownload.unpack(FSDownload.java:162)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:205)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:50)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
 at LocalTrace: 
org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: 
Unable to rename file: 
[/home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-2.0.0/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-localDir-nm-1_1/usercache/jenkins/filecache/5596410335910248146_tmp/hadoop-262140332608909552.jar.tmp]
 to 
[/home/jenkins/jenkins-slave/workspace/HBase-TRUNK-on-Hadoop-2.0.0/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster/org.apache.hadoop.mapred.MiniMRCluster-localDir-nm-1_1/usercache/jenkins/filecache/5596410335910248146_tmp/hadoop-262140332608909552.jar]
at 

[jira] [Created] (YARN-469) Make scheduling mode in FS pluggable

2013-03-12 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-469:
-

 Summary: Make scheduling mode in FS pluggable
 Key: YARN-469
 URL: https://issues.apache.org/jira/browse/YARN-469
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.0.3-alpha
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla


Currently, the scheduling mode in FS is limited to Fair and FIFO. The code 
typically has an if condition in multiple places to determine the correct 
course of action.

Making the scheduling mode pluggable helps simplify this, particularly as we 
add new modes (DRF in this case).
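A hypothetical sketch of the pluggable shape being proposed; the interface and class names are illustrative, not the FairScheduler API:

{code}
import java.util.Comparator;

// Hypothetical policy interface: each mode supplies its own ordering of
// runnable applications instead of if/else checks scattered through the code.
interface SchedulingMode {
  Comparator<AppInfo> comparator();
}

class FifoMode implements SchedulingMode {
  public Comparator<AppInfo> comparator() {
    // oldest application first
    return Comparator.comparingLong(a -> a.startTime);
  }
}

class FairMode implements SchedulingMode {
  public Comparator<AppInfo> comparator() {
    // application furthest below its fair share first
    return Comparator.comparingDouble(a -> a.usage / Math.max(1.0, a.fairShare));
  }
}

// Minimal stand-in for the per-application data a real mode would consult.
class AppInfo {
  long startTime;
  double usage;
  double fairShare;
}
{code}

A DRF mode would then slot in as another SchedulingMode implementation rather than another branch in each if condition.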

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-459) DefaultContainerExecutor doesn't log stderr from container launch

2013-03-12 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600457#comment-13600457
 ] 

Sandy Ryza commented on YARN-459:
-

Currently for MR both stderr and stdout are redirected to files, so they 
contain nothing useful. Would it make sense to send the output streams both to 
these files and to the console (using tee)?

 DefaultContainerExecutor doesn't log stderr from container launch
 -

 Key: YARN-459
 URL: https://issues.apache.org/jira/browse/YARN-459
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.3-alpha, 0.23.7
Reporter: Thomas Graves
Assignee: Sandy Ryza

 The DefaultContainerExecutor does not log stderr or add it to the diagnostics 
 message if something fails during the container launch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-459) DefaultContainerExecutor doesn't log stderr from container launch

2013-03-12 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600464#comment-13600464
 ] 

Sandy Ryza commented on YARN-459:
-

Or alternatively, make these files standard for all yarn apps so that the 
container executor can read info from them?

 DefaultContainerExecutor doesn't log stderr from container launch
 -

 Key: YARN-459
 URL: https://issues.apache.org/jira/browse/YARN-459
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 2.0.3-alpha, 0.23.7
Reporter: Thomas Graves
Assignee: Sandy Ryza

 The DefaultContainerExecutor does not log stderr or add it to the diagnostics 
 message if something fails during the container launch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-449) MRAppMaster classpath not set properly for unit tests in downstream projects

2013-03-12 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600578#comment-13600578
 ] 

Siddharth Seth commented on YARN-449:
-

{code}
2013-03-12 18:53:39,275 WARN  [Container Monitor] 
monitor.ContainersMonitorImpl$MonitoringThread(444): Container 
[pid=8438,containerID=container_1363114400920_0001_01_01] is running beyond 
virtual memory limits. Current usage: 217.9 MB of 2 GB physical memory used; 
6.5 GB of 4.2 GB virtual memory used. Killing container.
Dump of the process-tree for container_1363114400920_0001_01_01 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 8438 7023 8438 8438 (bash) 1 0 108650496 310 /bin/bash -c 
/usr/lib/jvm/java-1.6.0-sun-1.6.0.37.x86_64/bin/java 
-Dlog4j.configuration=container-log4j.properties 
-Dyarn.app.mapreduce.container.log.dir=/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster_1035429065/org.apache.hadoop.mapred.MiniMRCluster_1035429065-logDir-nm-1_2/application_1363114400920_0001/container_1363114400920_0001_01_01
 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA  
-Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 
1>/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster_1035429065/org.apache.hadoop.mapred.MiniMRCluster_1035429065-logDir-nm-1_2/application_1363114400920_0001/container_1363114400920_0001_01_01/stdout
 
2>/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster_1035429065/org.apache.hadoop.mapred.MiniMRCluster_1035429065-logDir-nm-1_2/application_1363114400920_0001/container_1363114400920_0001_01_01/stderr
{code}

This is what caused TestRowCounter to fail in the Linux environment. I'm not 
sure why the vmem usage is going that high; the hadoop-1 default config likely 
disables this monitoring.

At this point there seem to be solutions for the original problem the JIRA was 
opened for, and it has really been re-purposed to get the HBase unit tests 
working with Hadoop 2. Changing the title accordingly.
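For the vmem kill above, a hedged sketch of how a test harness could relax the NodeManager's virtual-memory check; the YarnConfiguration keys are the standard memory-monitoring settings, but whether relaxing them is the right call for these HBase tests is an assumption:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RelaxedVmemConf {
  // Configuration for a MiniMRCluster-style test: either disable the virtual
  // memory check entirely or raise the allowed vmem/pmem ratio.
  static Configuration newTestConf() {
    Configuration conf = new YarnConfiguration();
    conf.setBoolean(YarnConfiguration.NM_VMEM_CHECK_ENABLED, false); // yarn.nodemanager.vmem-check-enabled
    conf.setFloat(YarnConfiguration.NM_VMEM_PMEM_RATIO, 8.0f);       // yarn.nodemanager.vmem-pmem-ratio
    return conf;
  }
}
{code}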

 MRAppMaster classpath not set properly for unit tests in downstream projects
 

 Key: YARN-449
 URL: https://issues.apache.org/jira/browse/YARN-449
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.3-alpha
Reporter: Siddharth Seth
Priority: Blocker
 Attachments: 7904-v5.txt, hbase-7904-v3.txt, 
 hbase-TestHFileOutputFormat-wip.txt, hbase-TestingUtility-wip.txt, 
 minimr_randomdir-branch2.txt


 Post YARN-429, unit tests for HBase continue to fail since the classpath for 
 the MRAppMaster is not being set correctly.
 Reverting YARN-129 may fix this, but I'm not sure that's the correct 
 solution. My guess is, as Alexandro pointed out in YARN-129, maven 
 classloader magic is messing up java.class.path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-449) HBase test failures when running against Hadoop 2

2013-03-12 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated YARN-449:


Summary: HBase test failures when running against Hadoop 2  (was: 
MRAppMaster classpath not set properly for unit tests in downstream projects)

 HBase test failures when running against Hadoop 2
 -

 Key: YARN-449
 URL: https://issues.apache.org/jira/browse/YARN-449
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.3-alpha
Reporter: Siddharth Seth
Priority: Blocker
 Attachments: 7904-v5.txt, hbase-7904-v3.txt, 
 hbase-TestHFileOutputFormat-wip.txt, hbase-TestingUtility-wip.txt, 
 minimr_randomdir-branch2.txt


 Post YARN-429, unit tests for HBase continue to fail since the classpath for 
 the MRAppMaster is not being set correctly.
 Reverting YARN-129 may fix this, but I'm not sure that's the correct 
 solution. My guess is, as Alexandro pointed out in YARN-129, maven 
 classloader magic is messing up java.class.path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-71) Ensure/confirm that the NodeManager cleans up local-dirs on restart

2013-03-12 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-71:
--

Attachment: YARN-71.7.patch

1. Reuse the FileContext instance.
2. Put the rename/exception code into a new function.
3. Create a new test file, TestNodeManagerReboot.java, and add a new test case.

 Ensure/confirm that the NodeManager cleans up local-dirs on restart
 ---

 Key: YARN-71
 URL: https://issues.apache.org/jira/browse/YARN-71
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-71.1.patch, YARN-71.2.patch, YARN-71.3.patch, 
 YARN.71.4.patch, YARN-71.5.patch, YARN-71.6.patch, YARN-71.7.patch


 We have to make sure that NodeManagers cleanup their local files on restart.
 It may already be working like that in which case we should have tests 
 validating this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-71) Ensure/confirm that the NodeManager cleans up local-dirs on restart

2013-03-12 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-71:
--

Attachment: (was: YARN-71.7.patch)

 Ensure/confirm that the NodeManager cleans up local-dirs on restart
 ---

 Key: YARN-71
 URL: https://issues.apache.org/jira/browse/YARN-71
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-71.1.patch, YARN-71.2.patch, YARN-71.3.patch, 
 YARN.71.4.patch, YARN-71.5.patch, YARN-71.6.patch, YARN-71.7.patch


 We have to make sure that NodeManagers cleanup their local files on restart.
 It may already be working like that in which case we should have tests 
 validating this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-71) Ensure/confirm that the NodeManager cleans up local-dirs on restart

2013-03-12 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-71:
--

Attachment: YARN-71.7.patch

 Ensure/confirm that the NodeManager cleans up local-dirs on restart
 ---

 Key: YARN-71
 URL: https://issues.apache.org/jira/browse/YARN-71
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-71.1.patch, YARN-71.2.patch, YARN-71.3.patch, 
 YARN.71.4.patch, YARN-71.5.patch, YARN-71.6.patch, YARN-71.7.patch


 We have to make sure that NodeManagers cleanup their local files on restart.
 It may already be working like that in which case we should have tests 
 validating this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-71) Ensure/confirm that the NodeManager cleans up local-dirs on restart

2013-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600599#comment-13600599
 ] 

Hadoop QA commented on YARN-71:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573437/YARN-71.7.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/509//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/509//console

This message is automatically generated.

 Ensure/confirm that the NodeManager cleans up local-dirs on restart
 ---

 Key: YARN-71
 URL: https://issues.apache.org/jira/browse/YARN-71
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
Priority: Critical
 Attachments: YARN-71.1.patch, YARN-71.2.patch, YARN-71.3.patch, 
 YARN.71.4.patch, YARN-71.5.patch, YARN-71.6.patch, YARN-71.7.patch


 We have to make sure that NodeManagers cleanup their local files on restart.
 It may already be working like that in which case we should have tests 
 validating this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-412) FifoScheduler incorrectly checking for node locality

2013-03-12 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-412:
---

Assignee: Roger Hoover

 FifoScheduler incorrectly checking for node locality
 

 Key: YARN-412
 URL: https://issues.apache.org/jira/browse/YARN-412
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Roger Hoover
Assignee: Roger Hoover
Priority: Minor
  Labels: patch
 Attachments: YARN-412.patch, YARN-412.patch, YARN-412.patch


 In the FifoScheduler, the assignNodeLocalContainers method checks whether the 
 data is local to a node by searching for the node's nodeAddress in the 
 set of outstanding requests for the app.  This seems to be incorrect, as it 
 should be checking the hostname instead.  The offending line of code is 455:
 application.getResourceRequest(priority, node.getRMNode().getNodeAddress());
 Requests are keyed by hostname (e.g. host1.foo.com) whereas node addresses 
 are a concatenation of hostname and command port (e.g. host1.foo.com:1234).
 In the CapacityScheduler, it's done using the hostname.  See 
 LeafQueue.assignNodeLocalContainers, line 1129:
 application.getResourceRequest(priority, node.getHostName());
 Note that this bug does not affect the actual scheduling decisions made by 
 the FifoScheduler: even though it incorrectly determines that a request is 
 not local to the node, it will still schedule the request immediately 
 because it is rack-local.  However, this bug may adversely affect the 
 reporting of job status by underreporting the number of tasks that were node 
 local.
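A hedged illustration of the mismatch (not the patch itself): requests are stored under hostnames, while the lookup key the FifoScheduler builds also carries the port.

{code}
public class NodeKeyExample {
  // Strip the command port so the lookup key matches how requests are stored.
  static String hostnameOf(String nodeAddress) {
    int colon = nodeAddress.indexOf(':');
    return colon < 0 ? nodeAddress : nodeAddress.substring(0, colon);
  }

  public static void main(String[] args) {
    // "host1.foo.com:1234" -> "host1.foo.com", the key the request map contains
    System.out.println(hostnameOf("host1.foo.com:1234"));
  }
}
{code}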

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-466) Slave hostname mismatches in ResourceManager/Scheduler

2013-03-12 Thread Bikas Saha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bikas Saha updated YARN-466:


Assignee: Zhijie Shen

 Slave hostname mismatches in ResourceManager/Scheduler
 --

 Key: YARN-466
 URL: https://issues.apache.org/jira/browse/YARN-466
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, scheduler
Reporter: Roger Hoover
Assignee: Zhijie Shen

 The problem is that the ResourceManager learns the hostname of a slave node 
 when the NodeManager registers itself and it seems the node manager is 
 getting the hostname by asking the OS.  When a job is submitted, I think the 
 ApplicationMaster learns the hostname by doing a reverse DNS lookup based on 
 the slaves file.
 Therefore, the ApplicationMaster submits requests for containers using the 
 fully qualified domain name (node1.foo.com) but the scheduler uses the OS 
 hostname (node1) when checking to see if any requests are node-local.  The 
 result is that node-local requests are never found using this method of 
 searching for node-local requests:
 ResourceRequest request = application.getResourceRequest(priority, 
 node.getHostName());
 I think it's unfriendly to ask users to make sure they configure hostnames to 
 match fully qualified domain names. There should be a way for the 
 ApplicationMaster and NodeManager to agree on the hostname.
 Steps to Reproduce:
 1) Configure the OS hostname on slaves to differ from the fully qualified 
 domain name.  For example, if the FQDN for the slave is node1.foo.com, set 
 the hostname on the node to be just node1.
 2) On submitting a job, observe that the AM submits resource requests using 
 the FQDN (e.g. node1.foo.com).  You can add logging to the allocate() 
 method of whatever scheduler you're using:
 for (ResourceRequest req: ask) {
   LOG.debug(String.format("Request %s for %d containers on %s", req, 
 req.getNumContainers(), req.getHostName()));
 }
 3) Observe that when the scheduler checks for node locality (in the handle() 
 method) using FiCaSchedulerNode.getHostName(), the hostname it uses is 
 the one set in the host OS (e.g. node1).  NOTE: if you're using the 
 FifoScheduler, this bug needs to be fixed first 
 (https://issues.apache.org/jira/browse/YARN-412).  
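A hedged sketch of one way for the NodeManager and ApplicationMaster to agree on a name: always resolve to the canonical, fully qualified hostname rather than trusting the OS-configured short name. The Java API below is standard; whether this is the right fix for YARN-466 is left open by the discussion above.

{code}
import java.net.InetAddress;
import java.net.UnknownHostException;

public class CanonicalHostName {
  // Returns the fully qualified name (e.g. node1.foo.com) even when the OS
  // hostname is configured as just the short name (node1).
  static String canonicalHostName() throws UnknownHostException {
    return InetAddress.getLocalHost().getCanonicalHostName();
  }
}
{code}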

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-449) HBase test failures when running against Hadoop 2

2013-03-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13600632#comment-13600632
 ] 

Ted Yu commented on YARN-449:
-

More from TEST-org.apache.hadoop.hbase.mapreduce.TestRowCounter.xml:
{code}
2013-03-12 18:53:39,274 WARN  [Container Monitor] 
monitor.ContainersMonitorImpl(298): Process tree for container: 
container_1363114400920_0001_01_01 has processes older than 1 iteration 
running over the configured limit. Limit=4509715456, current usage = 7007866880
2013-03-12 18:53:39,275 WARN  [Container Monitor] 
monitor.ContainersMonitorImpl$MonitoringThread(444): Container 
[pid=8438,containerID=container_1363114400920_0001_01_01] is running beyond 
virtual memory limits. Current usage: 217.9 MB of 2 GB physical memory used; 
6.5 GB of 4.2 GB virtual memory used. Killing container.
Dump of the process-tree for container_1363114400920_0001_01_01 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) 
SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 8438 7023 8438 8438 (bash) 1 0 108650496 310 /bin/bash -c 
/usr/lib/jvm/java-1.6.0-sun-1.6.0.37.x86_64/bin/java 
-Dlog4j.configuration=container-log4j.properties 
-Dyarn.app.mapreduce.container.log.dir=/homes/hortonzy/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster_1035429065/org.apache.hadoop.mapred.MiniMRCluster_1035429065-logDir-nm-1_2/application_1363114400920_0001/container_1363114400920_0001_01_01
 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA  
-Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 
1>/homes/hortonzy/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster_1035429065/org.apache.hadoop.mapred.MiniMRCluster_1035429065-logDir-nm-1_2/application_1363114400920_0001/container_1363114400920_0001_01_01/stdout
 
2>/homes/hortonzy/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster_1035429065/org.apache.hadoop.mapred.MiniMRCluster_1035429065-logDir-nm-1_2/application_1363114400920_0001/container_1363114400920_0001_01_01/stderr
|- 8461 8438 8438 8438 (java) 688 34 6899216384 55478 
/usr/lib/jvm/java-1.6.0-sun-1.6.0.37.x86_64/bin/java 
-Dlog4j.configuration=container-log4j.properties 
-Dyarn.app.mapreduce.container.log.dir=/homes/hortonzy/trunk/hbase-server/target/org.apache.hadoop.mapred.MiniMRCluster_1035429065/org.apache.hadoop.mapred.MiniMRCluster_1035429065-logDir-nm-1_2/application_1363114400920_0001/container_1363114400920_0001_01_01
 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA 
-Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster
{code}
Note that 1024m was specified for -Xmx
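For reference, the 4.2 GB limit in the log above is the 2 GB container size multiplied by the NodeManager's default virtual-memory ratio of 2.1. A minimal sketch (not part of the HBase test code itself, and assuming the standard Hadoop 2 NodeManager properties are honored by the MiniMRCluster) of how a test configuration could raise or disable that check:
{code}
import org.apache.hadoop.conf.Configuration;

public class RelaxVmemCheck {
  // Sketch only: either disable the virtual-memory check outright, or keep it
  // enabled and raise the vmem-to-pmem ratio above the default of 2.1.
  public static Configuration apply(Configuration conf) {
    conf.setBoolean("yarn.nodemanager.vmem-check-enabled", false); // disable the check, or
    conf.setFloat("yarn.nodemanager.vmem-pmem-ratio", 5.0f);       // allow 5x virtual memory
    return conf;
  }
}
{code}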

 HBase test failures when running against Hadoop 2
 -

 Key: YARN-449
 URL: https://issues.apache.org/jira/browse/YARN-449
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.3-alpha
Reporter: Siddharth Seth
Priority: Blocker
 Attachments: 7904-v5.txt, hbase-7904-v3.txt, 
 hbase-TestHFileOutputFormat-wip.txt, hbase-TestingUtility-wip.txt, 
 minimr_randomdir-branch2.txt


 Post YARN-429, unit tests for HBase continue to fail since the classpath for 
 the MRAppMaster is not being set correctly.
 Reverting YARN-129 may fix this, but I'm not sure that's the correct 
 solution. My guess is, as Alexandro pointed out in YARN-129, maven 
 classloader magic is messing up java.class.path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-449) HBase test failures when running against Hadoop 2

2013-03-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600650#comment-13600650
 ] 

Ted Yu commented on YARN-449:
-

Here is sample content for /proc/PID/stat
{code}
30873 (sshd) S 30869 30869 30869 0 -1 4202816 360 0 0 0 47 56 0 0 20 0 1 0 
741791881 117960704 516 18446744073709551615 1 1 0 0 0 0 0 4096 65536 
18446744073709551615 0 0 17 2 0 0 0 0 0
{code}
Here is the regex used to parse the stat file:
{code}
  private static final Pattern PROCFS_STAT_FILE_FORMAT = Pattern.compile(
    "^([0-9-]+)\\s([^\\s]+)\\s[^\\s]\\s([0-9-]+)\\s([0-9-]+)\\s([0-9-]+)\\s" +
    "([0-9-]+\\s){7}([0-9]+)\\s([0-9]+)\\s([0-9-]+\\s){7}([0-9]+)\\s([0-9]+)" +
    "(\\s[0-9-]+){15}");
{code}
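As a rough illustration of the fields being extracted (this is not Hadoop's ProcfsBasedProcessTree, just a minimal standalone sketch), the monitor ultimately cares about vsize (field 23, in bytes) and rss (field 24, in pages) for each process in the container's tree:
{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ProcStatPeek {
  public static void main(String[] args) throws IOException {
    // Read a single /proc/<pid>/stat line, e.g. the sshd sample above.
    String line = Files.readAllLines(Paths.get("/proc", args[0], "stat")).get(0);
    // Naive split: assumes the command name in field 2, e.g. "(sshd)", has no spaces;
    // the regex above handles that more carefully.
    String[] f = line.trim().split("\\s+");
    long vsizeBytes = Long.parseLong(f[22]); // field 23 of proc(5): virtual memory size
    long rssPages   = Long.parseLong(f[23]); // field 24 of proc(5): resident set size
    System.out.println("vsize=" + vsizeBytes + " bytes, rss=" + rssPages + " pages");
  }
}
{code}
Run against the sample line above, this reports vsize=117960704 bytes and rss=516 pages.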

 HBase test failures when running against Hadoop 2
 -

 Key: YARN-449
 URL: https://issues.apache.org/jira/browse/YARN-449
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.0.3-alpha
Reporter: Siddharth Seth
Priority: Blocker
 Attachments: 7904-v5.txt, hbase-7904-v3.txt, 
 hbase-TestHFileOutputFormat-wip.txt, hbase-TestingUtility-wip.txt, 
 minimr_randomdir-branch2.txt


 Post YARN-429, unit tests for HBase continue to fail since the classpath for 
 the MRAppMaster is not being set correctly.
 Reverting YARN-129 may fix this, but I'm not sure that's the correct 
 solution. My guess is, as Alexandro pointed out in YARN-129, maven 
 classloader magic is messing up java.class.path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-470) Support a way to disable resource monitoring on the NodeManager

2013-03-12 Thread Hitesh Shah (JIRA)
Hitesh Shah created YARN-470:


 Summary: Support a way to disable resource monitoring on the 
NodeManager
 Key: YARN-470
 URL: https://issues.apache.org/jira/browse/YARN-470
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Hitesh Shah


Currently, the memory management monitor's check is disabled when maxMem is set 
to -1. However, maxMem is also sent to the RM when the NM registers with it (to 
define the maximum limit of allocatable resources).

We need an explicit flag to disable monitoring, to avoid the problems caused by 
overloading the max memory value.
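A minimal sketch of what an explicit flag could look like (the property name below is hypothetical, and this is not ContainersMonitorImpl itself; the real name and wiring would be decided in the patch):
{code}
import org.apache.hadoop.conf.Configuration;

public class MonitoringGate {
  // Hypothetical explicit switch, instead of overloading maxMem == -1.
  public static boolean monitoringEnabled(Configuration conf) {
    return conf.getBoolean("yarn.nodemanager.resource-monitor.enabled", true);
  }

  // Today's implicit behavior, shown for contrast: monitoring is off when maxMem is -1,
  // but that same -1 is also what gets advertised to the RM at registration.
  public static boolean legacyMonitoringEnabled(long maxMem) {
    return maxMem != -1;
  }
}
{code}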

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-471) RM does not validate the resource capability of an NM when the NM registers with the RM

2013-03-12 Thread Hitesh Shah (JIRA)
Hitesh Shah created YARN-471:


 Summary: RM does not validate the resource capability of an NM 
when the NM registers with the RM
 Key: YARN-471
 URL: https://issues.apache.org/jira/browse/YARN-471
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Hitesh Shah


Today, an NM can register with the RM with -1 memory and -1 cpu.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (YARN-472) Fix MR Job failed if RM restarted when the job is running

2013-03-12 Thread jian he (JIRA)
jian he created YARN-472:


 Summary: Fix MR Job failed if RM restarted when the job is running
 Key: YARN-472
 URL: https://issues.apache.org/jira/browse/YARN-472
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: jian he
Assignee: jian he


If the RM is restarted while the MR job is running, the job fails because the 
staging directory is cleaned. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-472) MR Job failed if RM restarted when the job is running

2013-03-12 Thread jian he (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jian he updated YARN-472:
-

Summary: MR Job failed if RM restarted when the job is running  (was: Fix 
MR Job failed if RM restarted when the job is running)

 MR Job failed if RM restarted when the job is running
 -

 Key: YARN-472
 URL: https://issues.apache.org/jira/browse/YARN-472
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: jian he
Assignee: jian he

 If the RM is restarted while the MR job is running, the job fails because the 
 staging directory is cleaned. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-466) Slave hostname mismatches in ResourceManager/Scheduler

2013-03-12 Thread Roger Hoover (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600772#comment-13600772
 ] 

Roger Hoover commented on YARN-466:
---

My guess is that the best way to solve this is to change the NodeManager to 
send the [fully qualified domain 
name|http://docs.oracle.com/javase/7/docs/api/java/net/InetAddress.html#getCanonicalHostName()]
 to the ResourceManager when it registers itself.
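A minimal sketch of that suggestion (illustrative only, not the actual NodeManager registration code):
{code}
import java.net.InetAddress;
import java.net.UnknownHostException;

public class NodeName {
  // Resolve the node's fully qualified name once and use it when registering with
  // the ResourceManager, so it matches the FQDN the ApplicationMaster requests
  // (e.g. node1.foo.com rather than the bare OS hostname node1).
  public static String registrationHostName() throws UnknownHostException {
    return InetAddress.getLocalHost().getCanonicalHostName();
  }
}
{code}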

 Slave hostname mismatches in ResourceManager/Scheduler
 --

 Key: YARN-466
 URL: https://issues.apache.org/jira/browse/YARN-466
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, scheduler
Reporter: Roger Hoover
Assignee: Zhijie Shen

 The problem is that the ResourceManager learns the hostname of a slave node 
 when the NodeManager registers itself and it seems the node manager is 
 getting the hostname by asking the OS.  When a job is submitted, I think the 
 ApplicationMaster learns the hostname by doing a reverse DNS lookup based on 
 the slaves file.
 Therefore, the ApplicationMaster submits requests for containers using the 
 fully qualified domain name (node1.foo.com) but the scheduler uses the OS 
 hostname (node1) when checking to see if any requests are node-local.  The 
 result is that node-local requests are never found using this method of 
 searching for node-local requests:
 ResourceRequest request = application.getResourceRequest(priority, 
 node.getHostName());
 I think it's unfriendly to ask users to make sure they configure hostnames to 
 match fully qualified domain names. There should be a way for the 
 ApplicationMaster and NodeManager to agree on the hostname.
 Steps to Reproduce:
 1) Configure the OS hostname on slaves to differ from the fully qualified 
 domain name.  For example, if the FQDN for the slave is node1.foo.com, set 
 the hostname on the node to be just node1.
 2) On submitting a job, observe that the AM submits resource requests using 
 the FQDN (e.g. node1.foo.com).  You can add logging to the allocate() 
 method of whatever scheduler you're using 
 for (ResourceRequest req: ask) {
   LOG.debug(String.format("Request %s for %d containers on %s", req, 
 req.getNumContainers(), req.getHostName()));
 }
 3) Observe that when the scheduler checks for node locality (in the handle() 
 method) using the FiCaSchedulerNode.getHostName(), the hostname it uses is 
 the one set in the host OS (e.g. node1).  NOTE: if you're using 
 FifoScheduler, this bug needs to be fixed first 
 (https://issues.apache.org/jira/browse/YARN-412).  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-471) NM does not validate the resource capabilities before it registers with RM

2013-03-12 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-471:
-

Component/s: (was: resourcemanager)
 nodemanager
Summary: NM does not validate the resource capabilities before it 
registers with RM  (was: RM does not validate the resource capability of an NM 
when the NM registers with the RM)

Because the NM and RM are both trusted components in the system, I think we 
should do this validation on the NM itself.

Modifying the description; please revert it if you disagree.
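An illustrative sketch of the kind of NM-side check being discussed (not the patch itself; the method and message are made up for illustration):
{code}
public class ResourceSanityCheck {
  // Fail fast on the NodeManager before it registers, instead of letting
  // -1 memory / -1 cpu reach the ResourceManager.
  public static void validate(int memoryMb, int vcores) {
    if (memoryMb <= 0 || vcores <= 0) {
      throw new IllegalArgumentException(
          "Invalid NodeManager resources: memory=" + memoryMb + " MB, vcores=" + vcores);
    }
  }
}
{code}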

 NM does not validate the resource capabilities before it registers with RM
 --

 Key: YARN-471
 URL: https://issues.apache.org/jira/browse/YARN-471
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Hitesh Shah

 Today, an NM can register with the RM with -1 memory and -1 cpu.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-470) Support a way to disable resource monitoring on the NodeManager

2013-03-12 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600795#comment-13600795
 ] 

Vinod Kumar Vavilapalli commented on YARN-470:
--

Depending on how you look at it, it's massive; I take the blame :) Good find!

 Support a way to disable resource monitoring on the NodeManager
 ---

 Key: YARN-470
 URL: https://issues.apache.org/jira/browse/YARN-470
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Hitesh Shah

 Currently, the memory management monitor's check is disabled when maxMem is set 
 to -1. However, maxMem is also sent to the RM when the NM registers with it (to 
 define the maximum limit of allocatable resources). 
 We need an explicit flag to disable monitoring, to avoid the problems caused by 
 overloading the max memory value.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology

2013-03-12 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-18:
---

Attachment: YARN-18-v4.1.patch

Addressed the timeout and JavaDoc issues in the v4.1 patch.

 Make locality in YARN's container assignment and task scheduling pluggable 
 for other deployment topology
 -

 Key: YARN-18
 URL: https://issues.apache.org/jira/browse/YARN-18
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: 
 HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
 MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, 
 MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, 
 MAPREDUCE-4309-v7.patch, YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, 
 YARN-18-v3.2.patch, YARN-18-v3.patch, YARN-18-v4.1.patch, YARN-18-v4.patch


 There are several classes in YARN’s container assignment and task scheduling 
 algorithms that relate to data locality which were updated to give preference 
 to running a container at other locality levels besides node-local and 
 rack-local (like nodegroup-local). This proposes to make these data 
 structures/algorithms pluggable, e.g. SchedulerNode, RMNodeImpl, etc. The inner 
 class ScheduledRequests was made a package-level class so it would be easier to 
 create a subclass, ScheduledRequestsWithNodeGroup.
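An illustrative skeleton of the pluggable shape described above (the class names come from the description; the methods shown are hypothetical placeholders, not the signatures in the attached patches):
{code}
// Package-level base class so locality handling can be extended.
class ScheduledRequests {
  protected boolean assignNodeLocal(String host) { return false; } // node-local first
  protected boolean assignRackLocal(String rack) { return false; } // then rack-local
}

// Subclass adding an intermediate locality level between node and rack.
class ScheduledRequestsWithNodeGroup extends ScheduledRequests {
  protected boolean assignNodeGroupLocal(String nodeGroup) { return false; }
}
{code}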

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-198) If we are navigating to NodeManager UI from ResourceManager, then there is no link to navigate back to ResourceManager

2013-03-12 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600800#comment-13600800
 ] 

Vinod Kumar Vavilapalli commented on YARN-198:
--

+1, this looks good, checking it in.

 If we are navigating to NodeManager UI from ResourceManager, then there is no 
 link to navigate back to ResourceManager
 ---

 Key: YARN-198
 URL: https://issues.apache.org/jira/browse/YARN-198
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Ramgopal N
Assignee: jian he
Priority: Minor
  Labels: usability
 Attachments: YARN-198.patch


 If we navigate to the NodeManager by clicking on the node link in the RM, there 
 is no link provided on the NM to navigate back to the RM.
  It would be good to have a link to navigate back to the RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-198) If we are navigating to NodeManager UI from ResourceManager, then there is no link to navigate back to ResourceManager

2013-03-12 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600807#comment-13600807
 ] 

Hudson commented on YARN-198:
-

Integrated in Hadoop-trunk-Commit #3460 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3460/])
YARN-198. Added a link to RM pages from the NodeManager web app. 
Contributed by Jian He. (Revision 1455800)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1455800
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NavBlock.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NodePage.java


 If we are navigating to NodeManager UI from ResourceManager, then there is no 
 link to navigate back to ResourceManager
 ---

 Key: YARN-198
 URL: https://issues.apache.org/jira/browse/YARN-198
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Ramgopal N
Assignee: jian he
Priority: Minor
  Labels: usability
 Fix For: 2.0.5-beta

 Attachments: YARN-198.patch


 If we navigate to the NodeManager by clicking on the node link in the RM, there 
 is no link provided on the NM to navigate back to the RM.
  It would be good to have a link to navigate back to the RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-198) If we are navigating to NodeManager UI from ResourceManager, then there is no link to navigate back to ResourceManager

2013-03-12 Thread jian he (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600810#comment-13600810
 ] 

jian he commented on YARN-198:
--

Thanks, Vinod!

 If we are navigating to NodeManager UI from ResourceManager, then there is no 
 link to navigate back to ResourceManager
 ---

 Key: YARN-198
 URL: https://issues.apache.org/jira/browse/YARN-198
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Ramgopal N
Assignee: jian he
Priority: Minor
  Labels: usability
 Fix For: 2.0.5-beta

 Attachments: YARN-198.patch


 If we navigate to the NodeManager by clicking on the node link in the RM, there 
 is no link provided on the NM to navigate back to the RM.
  It would be good to have a link to navigate back to the RM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-378) ApplicationMaster retry times should be set by Client

2013-03-12 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600820#comment-13600820
 ] 

Vinod Kumar Vavilapalli commented on YARN-378:
--

bq. We should separate the YARN part of it from the mapreduce only changes.
Filed MAPREDUCE-5062: MR AM should read max-retries information from the RM.

 ApplicationMaster retry times should be set by Client
 -

 Key: YARN-378
 URL: https://issues.apache.org/jira/browse/YARN-378
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client, resourcemanager
 Environment: suse
Reporter: xieguiming
Assignee: Zhijie Shen
  Labels: usability
 Attachments: YARN-378_1.patch, YARN-378_2.patch, YARN-378_3.patch, 
 YARN-378_4.patch, YARN-378_5.patch, YARN-378_6.patch


 We should support different ApplicationMaster retry times for different clients 
 or users. That is to say, yarn.resourcemanager.am.max-retries should be settable 
 by the client. 
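A sketch of the client-side intent, assuming a per-application setter along the lines of setMaxAppAttempts on ApplicationSubmissionContext; the exact API is defined by the attached patches, not by this snippet:
{code}
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;

public class ClientRetrySetup {
  // Ask for a per-application retry count; the RM would still cap this by its
  // global yarn.resourcemanager.am.max-retries setting.
  public static void requestRetries(ApplicationSubmissionContext ctx, int attempts) {
    ctx.setMaxAppAttempts(attempts);
  }
}
{code}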

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology

2013-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600824#comment-13600824
 ] 

Hadoop QA commented on YARN-18:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573471/YARN-18-v4.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 1 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/510//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/510//console

This message is automatically generated.

 Make locality in YARN's container assignment and task scheduling pluggable 
 for other deployment topology
 -

 Key: YARN-18
 URL: https://issues.apache.org/jira/browse/YARN-18
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: 
 HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
 MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, 
 MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, 
 MAPREDUCE-4309-v7.patch, YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, 
 YARN-18-v3.2.patch, YARN-18-v3.patch, YARN-18-v4.1.patch, YARN-18-v4.patch


 There are several classes in YARN’s container assignment and task scheduling 
 algorithms that relate to data locality which were updated to give preference 
 to running a container at other locality levels besides node-local and 
 rack-local (like nodegroup-local). This proposes to make these data 
 structures/algorithms pluggable, e.g. SchedulerNode, RMNodeImpl, etc. The inner 
 class ScheduledRequests was made a package-level class so it would be easier to 
 create a subclass, ScheduledRequestsWithNodeGroup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology

2013-03-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600849#comment-13600849
 ] 

Hadoop QA commented on YARN-18:
---

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12573476/YARN-18-v4.2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 tests included appear to have a timeout.{color}

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/511//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/511//console

This message is automatically generated.

 Make locality in YARN's container assignment and task scheduling pluggable 
 for other deployment topology
 -

 Key: YARN-18
 URL: https://issues.apache.org/jira/browse/YARN-18
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: 
 HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
 MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, 
 MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, 
 MAPREDUCE-4309-v7.patch, YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, 
 YARN-18-v3.2.patch, YARN-18-v3.patch, YARN-18-v4.1.patch, YARN-18-v4.2.patch, 
 YARN-18-v4.patch


 There are several classes in YARN’s container assignment and task scheduling 
 algorithms that relate to data locality which were updated to give preference 
 to running a container at other locality levels besides node-local and 
 rack-local (like nodegroup-local). This proposes to make these data 
 structures/algorithms pluggable, e.g. SchedulerNode, RMNodeImpl, etc. The inner 
 class ScheduledRequests was made a package-level class so it would be easier to 
 create a subclass, ScheduledRequestsWithNodeGroup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-381) Improve FS docs

2013-03-12 Thread Vitaly Kruglikov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13600859#comment-13600859
 ] 

Vitaly Kruglikov commented on YARN-381:
---

The file FairScheduler.apt.vm doesn't specify the units for the minResources 
property in queue allocations. Is it in megabytes or in bytes?

 Improve FS docs
 ---

 Key: YARN-381
 URL: https://issues.apache.org/jira/browse/YARN-381
 Project: Hadoop YARN
  Issue Type: Improvement
Affects Versions: 2.0.0-alpha
Reporter: Eli Collins
Priority: Minor

 The MR2 FS docs could use some improvements.
 Configuration:
 - sizebasedweight - what is the size here? Total memory usage?
 Pool properties:
 - minResources - what does min amount of aggregate memory mean given that 
 this is not a reservation?
 - maxResources - is this a hard limit?
 - weight: How is this ratio configured? E.g. the base is 1 and all weights are 
 relative to that?
 - schedulingMode - what is the default? Is fifo pure fifo, e.g. does it wait until 
 all tasks for the job are finished before launching the next job?
 There's no mention of ACLs, even though they're supported. See the CS docs 
 for comparison.
 Also there are a couple of typos worth fixing while we're at it, e.g. finish. 
 apps to run
 Worth keeping in mind that some of these will need to be updated to reflect 
 that resource calculators are now pluggable.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira