[jira] [Commented] (YARN-309) Make RM provide heartbeat interval to NM

2013-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618417#comment-13618417
 ] 

Hadoop QA commented on YARN-309:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12576294/YARN-309-20130331.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/632//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/632//console

This message is automatically generated.

 Make RM provide heartbeat interval to NM
 

 Key: YARN-309
 URL: https://issues.apache.org/jira/browse/YARN-309
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-309.1.patch, YARN-309-20130331.txt, 
 YARN-309.2.patch, YARN-309.3.patch, YARN-309.4.patch, YARN-309.5.patch, 
 YARN-309.6.patch, YARN-309.7.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-475) Remove ApplicationConstants.AM_APP_ATTEMPT_ID_ENV as it is no longer set in an AM's environment

2013-03-31 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618430#comment-13618430
 ] 

Vinod Kumar Vavilapalli commented on YARN-475:
--

+1 this looks good. No tests needed. Checking this in.

 Remove ApplicationConstants.AM_APP_ATTEMPT_ID_ENV as it is no longer set in 
 an AM's environment
 ---

 Key: YARN-475
 URL: https://issues.apache.org/jira/browse/YARN-475
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: YARN-475.1.patch


 AMs are expected to use ApplicationConstants.AM_CONTAINER_ID_ENV and derive 
 the application attempt id from the container id. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (YARN-157) The option shell_command and shell_script have conflict

2013-03-31 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reassigned YARN-157:


Assignee: rainy Yu

Rainy, assigning this to you..

 The option shell_command and shell_script have conflict
 ---

 Key: YARN-157
 URL: https://issues.apache.org/jira/browse/YARN-157
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Affects Versions: 2.0.1-alpha
Reporter: Li Ming
Assignee: rainy Yu
  Labels: patch
 Attachments: hadoop_yarn.patch


 The DistributedShell has an option shell_script to let the user specify a shell 
 script that will be executed in containers. The issue is that the 
 shell_command option is mandatory, so if both options are set, every 
 container execution ends with exitCode=1. This is because DistributedShell 
 executes the shell_command and shell_script together. For example, if 
 shell_command is 'date' then the final command executed in the container is 
 date `ExecShellScript.sh`, so the date command treats the result of 
 ExecShellScript.sh as its parameter, and there will be an error. 
 To solve this, DistributedShell should not use the value of the shell_command 
 option when the shell_script option is set, and the shell_command option also 
 should not be mandatory. 
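 A minimal sketch of the proposed behaviour (method and parameter names here are 
 illustrative, not taken from the attached patch): run the script when one is 
 given, and fall back to shell_command only otherwise.
 {code}
 // Hypothetical helper in the DistributedShell AM that builds the command to run
 // inside a container. Only one of the two options is used at a time.
 private String buildContainerCommand(String shellCommand, String shellScriptPath) {
   if (shellScriptPath != null && !shellScriptPath.isEmpty()) {
     // A script was supplied: execute it directly instead of appending it to
     // shell_command, which is what currently produces exitCode=1.
     return "/bin/sh " + shellScriptPath;
   }
   // No script: shell_command alone is executed, and it need not be mandatory
   // when a script is provided.
   return shellCommand;
 }
 {code}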

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-412) FifoScheduler incorrectly checking for node locality

2013-03-31 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-412:
---

Assignee: Roger Hoover  (was: Arun C Murthy)

 FifoScheduler incorrectly checking for node locality
 

 Key: YARN-412
 URL: https://issues.apache.org/jira/browse/YARN-412
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Roger Hoover
Assignee: Roger Hoover
Priority: Minor
  Labels: patch
 Attachments: YARN-412.patch, YARN-412.patch, YARN-412.patch


 In the FifoScheduler, the assignNodeLocalContainers method is checking if the 
 data is local to a node by searching for the nodeAddress of the node in the 
 set of outstanding requests for the app.  This seems to be incorrect as it 
 should be checking hostname instead.  The offending line of code is 455:
 application.getResourceRequest(priority, node.getRMNode().getNodeAddress());
 Requests are formatted by hostname (e.g. host1.foo.com) whereas node addresses 
 are a concatenation of hostname and command port (e.g. host1.foo.com:1234).
 In the CapacityScheduler, it's done using hostname.  See 
 LeafQueue.assignNodeLocalContainers, line 1129
 application.getResourceRequest(priority, node.getHostName());
 Note that this bug does not affect the actual scheduling decisions made by 
 the FifoScheduler because even though it incorrectly determines that a request 
 is not local to the node, it will still schedule the request immediately 
 because it's rack-local.  However, this bug may be adversely affecting the 
 reporting of job status by underreporting the number of tasks that were node 
 local.
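 For illustration, a sketch of the one-line change being described, assuming the 
 surrounding FifoScheduler code matches the lines quoted above:
 {code}
 // Before (line 455): keyed by node address, e.g. "host1.foo.com:1234", which never
 // matches outstanding requests that are keyed by hostname alone.
 //   application.getResourceRequest(priority, node.getRMNode().getNodeAddress());

 // After: key the lookup by hostname, as LeafQueue.assignNodeLocalContainers does.
 ResourceRequest request =
     application.getResourceRequest(priority, node.getHostName());
 {code}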

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-412) FifoScheduler incorrectly checking for node locality

2013-03-31 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618433#comment-13618433
 ] 

Arun C Murthy commented on YARN-412:


[~theduderog] - No, you deserve credit for finding and fixing this. Thanks!

 FifoScheduler incorrectly checking for node locality
 

 Key: YARN-412
 URL: https://issues.apache.org/jira/browse/YARN-412
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Reporter: Roger Hoover
Assignee: Roger Hoover
Priority: Minor
  Labels: patch
 Attachments: YARN-412.patch, YARN-412.patch, YARN-412.patch


 In the FifoScheduler, the assignNodeLocalContainers method is checking if the 
 data is local to a node by searching for the nodeAddress of the node in the 
 set of outstanding requests for the app.  This seems to be incorrect as it 
 should be checking hostname instead.  The offending line of code is 455:
 application.getResourceRequest(priority, node.getRMNode().getNodeAddress());
 Requests are formatted by hostname (e.g. host1.foo.com) whereas node addresses 
 are a concatenation of hostname and command port (e.g. host1.foo.com:1234).
 In the CapacityScheduler, it's done using hostname.  See 
 LeafQueue.assignNodeLocalContainers, line 1129
 application.getResourceRequest(priority, node.getHostName());
 Note that this bug does not affect the actual scheduling decisions made by 
 the FifoScheduler because even though it incorrectly determines that a request 
 is not local to the node, it will still schedule the request immediately 
 because it's rack-local.  However, this bug may be adversely affecting the 
 reporting of job status by underreporting the number of tasks that were node 
 local.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-291) Dynamic resource configuration on NM

2013-03-31 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618443#comment-13618443
 ] 

Arun C Murthy commented on YARN-291:


I commented on one of the sub-tasks, but I'm not comfortable with JMX-based 
tricks. We fundamentally need an explicit way to inform the RM of changes to a 
node, and that needs to flow down to the schedulers etc.

 Dynamic resource configuration on NM
 

 Key: YARN-291
 URL: https://issues.apache.org/jira/browse/YARN-291
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, scheduler
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: Elastic Resources for YARN-v0.2.pdf, 
 YARN-291-AddClientRMProtocolToSetNodeResource-03.patch, 
 YARN-291-all-v1.patch, YARN-291-core-HeartBeatAndScheduler-01.patch, 
 YARN-291-JMXInterfaceOnNM-02.patch, 
 YARN-291-OnlyUpdateWhenResourceChange-01-fix.patch, 
 YARN-291-YARNClientCommandline-04.patch


 The current Hadoop YARN resource management logic assumes per node resource 
 is static during the lifetime of the NM process. Allowing run-time 
 configuration on per node resource will give us finer granularity of resource 
 elasticity. This allows Hadoop workloads to coexist with other workloads on 
 the same hardware efficiently, whether or not the environment is virtualized. 
 For more background and design details, please refer to HADOOP-9165.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-475) Remove ApplicationConstants.AM_APP_ATTEMPT_ID_ENV as it is no longer set in an AM's environment

2013-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618444#comment-13618444
 ] 

Hudson commented on YARN-475:
-

Integrated in Hadoop-trunk-Commit #3541 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/3541/])
YARN-475. Remove a unused constant in the public API - 
ApplicationConstants.AM_APP_ATTEMPT_ID_ENV. Contributed by Hitesh Shah. 
(Revision 1463033)

 Result = SUCCESS
vinodkv : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1463033
Files : 
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/main/java/org/apache/hadoop/yarn/applications/unmanagedamlauncher/UnmanagedAMLauncher.java


 Remove ApplicationConstants.AM_APP_ATTEMPT_ID_ENV as it is no longer set in 
 an AM's environment
 ---

 Key: YARN-475
 URL: https://issues.apache.org/jira/browse/YARN-475
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Fix For: 2.0.5-beta

 Attachments: YARN-475.1.patch


 AMs are expected to use ApplicationConstants.AM_CONTAINER_ID_ENV and derive 
 the application attempt id from the container id. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-447) applicationComparator improvement for CS

2013-03-31 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-447:
---

Assignee: nemon lou

 applicationComparator improvement for CS
 

 Key: YARN-447
 URL: https://issues.apache.org/jira/browse/YARN-447
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: nemon lou
Assignee: nemon lou
Priority: Minor
 Attachments: YARN-447-trunk.patch, YARN-447-trunk.patch, 
 YARN-447-trunk.patch


 Now the comparison code is:
 return a1.getApplicationId().getId() - a2.getApplicationId().getId();
 It will be replaced with:
 return a1.getApplicationId().compareTo(a2.getApplicationId());
 This brings some benefits:
 1. It leaves the applicationId comparison logic to the ApplicationId class.
 2. In a future HA mode the cluster timestamp may change, and the ApplicationId 
 class already takes care of this condition.
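 A sketch of the comparator after the proposed change (the application type and 
 its accessor are assumed from the snippet above); delegating to 
 ApplicationId.compareTo keeps the ordering stable even when the cluster 
 timestamp differs between application ids:
 {code}
 // Illustrative comparator: order applications purely by ApplicationId,
 // letting ApplicationId.compareTo handle cluster timestamp and sequence id.
 Comparator<SchedulerApp> applicationComparator = new Comparator<SchedulerApp>() {
   @Override
   public int compare(SchedulerApp a1, SchedulerApp a2) {
     return a1.getApplicationId().compareTo(a2.getApplicationId());
   }
 };
 {code}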

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-447) applicationComparator improvement for CS

2013-03-31 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618447#comment-13618447
 ] 

Arun C Murthy commented on YARN-447:


+1, lgtm!

 applicationComparator improvement for CS
 

 Key: YARN-447
 URL: https://issues.apache.org/jira/browse/YARN-447
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: nemon lou
Assignee: nemon lou
Priority: Minor
 Attachments: YARN-447-trunk.patch, YARN-447-trunk.patch, 
 YARN-447-trunk.patch


 Now the comparison code is:
 return a1.getApplicationId().getId() - a2.getApplicationId().getId();
 It will be replaced with:
 return a1.getApplicationId().compareTo(a2.getApplicationId());
 This brings some benefits:
 1. It leaves the applicationId comparison logic to the ApplicationId class.
 2. In a future HA mode the cluster timestamp may change, and the ApplicationId 
 class already takes care of this condition.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-444) Move special container exit codes from YarnConfiguration to API

2013-03-31 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618470#comment-13618470
 ] 

Bikas Saha commented on YARN-444:
-

Patch looks mostly good. I wonder why the plural has been used in the name of 
the class when that's not the common pattern. The comments in 
ContainerStatus.java mix the plural and singular.

 Move special container exit codes from YarnConfiguration to API
 ---

 Key: YARN-444
 URL: https://issues.apache.org/jira/browse/YARN-444
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, applications/distributed-shell
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-444.patch


 YarnConfiguration currently contains the special container exit codes 
 INVALID_CONTAINER_EXIT_STATUS = -1000, ABORTED_CONTAINER_EXIT_STATUS = -100, 
 and DISKS_FAILED = -101.
 These are not really related to configuration, and 
 YarnConfiguration should not become a place to put miscellaneous constants.
 Per discussion on YARN-417, appmaster writers need to be able to provide 
 special handling for them, so it might make sense to move these to their own 
 user-facing class.
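 A minimal sketch of the kind of user-facing constants class being discussed 
 (the class and field names below are illustrative, not the ones in the patch):
 {code}
 package org.apache.hadoop.yarn.api.records;

 // Illustrative only: a dedicated home for the special exit codes so that AM
 // writers can compare ContainerStatus.getExitStatus() against named constants
 // instead of values buried in YarnConfiguration.
 public interface ContainerExitCodes {
   int INVALID = -1000;
   int ABORTED = -100;
   int DISKS_FAILED = -101;
 }
 {code}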

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-392) Make it possible to schedule to specific nodes without dropping locality

2013-03-31 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618475#comment-13618475
 ] 

Bikas Saha commented on YARN-392:
-

How about calling it disableLocalityRelaxation, as that's basically what it is 
doing? When specified on a node it would mean do not relax locality to rack or 
ANY. Potentially, we could also say that when specified on a rack it would 
mean do not relax locality to ANY. Do you think it could be used to specify 
either exact nodes or exact racks? In that case, we would need to check that if 
this flag is set then either nodes or racks, but not both, are specified. 
I am not sure how to make sense of two different asks to the RM (at the same 
priority) that say: 
1) allocate at specific node A (i.e. do not relax locality to rackA), and 
2) allocate at specific rack rackA (i.e. do not relax locality to ANY), where node 
A is contained in rackA.
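As a concrete illustration of the semantics above (the flag name and its setter 
are hypothetical, following the naming suggested in this comment, not an existing 
API; priority and capability are assumed to be defined elsewhere):
{code}
// Ask for one container on a specific node and, hypothetically, forbid the RM
// from relaxing this ask to the node's rack or to ANY.
ResourceRequest nodeLocalAsk = Records.newRecord(ResourceRequest.class);
nodeLocalAsk.setHostName("host1.foo.com");   // a specific node, not a rack or *
nodeLocalAsk.setNumContainers(1);
nodeLocalAsk.setPriority(priority);          // same priority as any companion asks
nodeLocalAsk.setCapability(capability);
// Hypothetical flag discussed above:
// nodeLocalAsk.setDisableLocalityRelaxation(true);
{code}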

 Make it possible to schedule to specific nodes without dropping locality
 

 Key: YARN-392
 URL: https://issues.apache.org/jira/browse/YARN-392
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Sandy Ryza
 Attachments: YARN-392-1.patch, YARN-392.patch


 Currently it's not possible to specify scheduling requests for specific nodes 
 and nowhere else. The RM automatically relaxes locality to rack and * and 
 assigns non-specified machines to the app.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-309) Make RM provide heartbeat interval to NM

2013-03-31 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618486#comment-13618486
 ] 

Xuan Gong commented on YARN-309:


1. Fix the error where the testcase 
org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdater never finishes. 
Because of this, we get a binding error when we run the testcase 
org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerReboot.
2. Add the NodeHeartBeatResponse.setNextHeartBeatInterval() call to every 
overridden NodeHeartbeatResponse nodeHeartbeat() function.
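For context, a sketch of how the NM side can consume the interval carried in the 
heartbeat response (the getter is assumed to mirror the setNextHeartBeatInterval() 
accessor named above; the loop itself is illustrative):
{code}
// Illustrative NM status-updater loop: instead of sleeping for a fixed period,
// wait for whatever interval the RM returned in the last heartbeat response.
long nextHeartBeatInterval = 1000L;  // default until the RM says otherwise
while (!isStopped) {
  try {
    Thread.sleep(nextHeartBeatInterval);
  } catch (InterruptedException e) {
    break;
  }
  NodeHeartbeatResponse response = resourceTracker.nodeHeartbeat(request);
  nextHeartBeatInterval = response.getNextHeartBeatInterval();
}
{code}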

 Make RM provide heartbeat interval to NM
 

 Key: YARN-309
 URL: https://issues.apache.org/jira/browse/YARN-309
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-309.1.patch, YARN-309-20130331.txt, 
 YARN-309.2.patch, YARN-309.3.patch, YARN-309.4.patch, YARN-309.5.patch, 
 YARN-309.6.patch, YARN-309.7.patch, YARN-309.9.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-309) Make RM provide heartbeat interval to NM

2013-03-31 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-309:
---

Attachment: YARN-309.9.patch

 Make RM provide heartbeat interval to NM
 

 Key: YARN-309
 URL: https://issues.apache.org/jira/browse/YARN-309
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-309.1.patch, YARN-309-20130331.txt, 
 YARN-309.2.patch, YARN-309.3.patch, YARN-309.4.patch, YARN-309.5.patch, 
 YARN-309.6.patch, YARN-309.7.patch, YARN-309.9.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-193) Scheduler.normalizeRequest does not account for allocation requests that exceed maximumAllocation limits

2013-03-31 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618487#comment-13618487
 ] 

Bikas Saha commented on YARN-193:
-

This and others like it are backward-incompatible but might be OK since we are 
still in alpha:
{code}
-  public static final int DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_CORES = 32;
+  public static final int DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES = 32;
{code}

It should be disabled. Same for other places.
{code}
+maximum allocation is disable.</description>
{code}

Here and in other places, a LOG in the catch would be good.
Also, I am not warming up to the idea of having to put a try/catch around every 
validate call.
{code}
+  // sanity check
+  try {
+SchedulerUtils.validateResourceRequests(ask,
+rScheduler.getMaximumResourceCapability());
+  } catch (InvalidResourceRequestException e) {
+RPCUtil.getRemoteException(e);
+  }
{code}

Incorrect log message.
{code}
+try {
+  SchedulerUtils.validateResourceRequest(amReq,
+  scheduler.getMaximumResourceCapability());
+} catch (InvalidResourceRequestException e) {
+  LOG.info("RM App submission failed in normalize AM Resource Request "
+  + "for application with id " + applicationId + " : "
+  + e.getMessage());
{code}
Also, in this method, why are we throwing an exception in the inner block and 
catching it in the outer block? Why is the inner try/catch needed (instead of 
catching the exception in the outer catch)?
On the same note, why can this validation not be done in ClientRMService just 
like it's been done in ApplicationMasterService? That maintains symmetry and is 
easier to understand/correlate. It will also work when RMAppManager.handle() is 
not called synchronously from ClientRMService.

Where are we testing that normalize sets the value to the next higher multiple of 
min but not more than the max (for the DRF case)? Or that checking against the max 
is disabled by setting the allowed MAX to -1? I am sorry if I have missed it.
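A sketch of the shape the last two paragraphs suggest: validate once at the 
RPC-facing service and translate the exception there, rather than wrapping every 
validate call in its own try/catch (names follow the snippets quoted above):
{code}
// Illustrative: in the client-facing service, reject the request up front and
// surface the failure to the caller as a remote exception.
try {
  SchedulerUtils.validateResourceRequest(amReq,
      scheduler.getMaximumResourceCapability());
} catch (InvalidResourceRequestException e) {
  LOG.warn("Invalid AM resource request for application " + applicationId, e);
  throw RPCUtil.getRemoteException(e);
}
{code}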


 Scheduler.normalizeRequest does not account for allocation requests that 
 exceed maximumAllocation limits 
 -

 Key: YARN-193
 URL: https://issues.apache.org/jira/browse/YARN-193
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.2-alpha, 3.0.0
Reporter: Hitesh Shah
Assignee: Zhijie Shen
 Attachments: MR-3796.1.patch, MR-3796.2.patch, MR-3796.3.patch, 
 MR-3796.wip.patch, YARN-193.4.patch, YARN-193.5.patch, YARN-193.6.patch, 
 YARN-193.7.patch, YARN-193.8.patch, YARN-193.9.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-309) Make RM provide heartbeat interval to NM

2013-03-31 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618488#comment-13618488
 ] 

Xuan Gong commented on YARN-309:


Set the initial value for nextHeartBeatInterval back to 1000L instead of 0L. Tested 
the patch on a single running cluster.

 Make RM provide heartbeat interval to NM
 

 Key: YARN-309
 URL: https://issues.apache.org/jira/browse/YARN-309
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-309.1.patch, YARN-309-20130331.txt, 
 YARN-309.2.patch, YARN-309.3.patch, YARN-309.4.patch, YARN-309.5.patch, 
 YARN-309.6.patch, YARN-309.7.patch, YARN-309.9.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-382) SchedulerUtils improve way normalizeRequest sets the resource capabilities

2013-03-31 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618489#comment-13618489
 ] 

Bikas Saha commented on YARN-382:
-

Given that YARN-193 is only fixing validation and this copying is still needed 
temporarily, can you please re-post a rebased patch with your original fix? 
Could you also please leave a comment saying this code can be removed once 
YARN-486 is completed, since then there is no need to copy anything from 
Container to CLC.

 SchedulerUtils improve way normalizeRequest sets the resource capabilities
 --

 Key: YARN-382
 URL: https://issues.apache.org/jira/browse/YARN-382
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Thomas Graves
Assignee: Zhijie Shen
 Attachments: YARN-382_1.patch, YARN-382_demo.patch


 In YARN-370, we changed it from setting the capability to directly setting 
 memory and cores:
 -ask.setCapability(normalized);
 +ask.getCapability().setMemory(normalized.getMemory());
 +ask.getCapability().setVirtualCores(normalized.getVirtualCores());
 We did this because it directly sets the values in the original 
 resource object passed in when the AM gets allocated; without it, the AM 
 doesn't get the resource normalized correctly in the submission context. See 
 YARN-370 for more details.
 I think we should find a better way of doing this long term: one, so we don't 
 have to keep adding things there when new resource types are added; two, because 
 it's a bit confusing as to what it's doing and prone to someone accidentally 
 breaking it again in the future.  Something closer to what Arun suggested in 
 YARN-370 would be better, but we need to make sure all the places work and get 
 some more testing on it before putting it in. 
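 The distinction is between replacing the Resource reference held by the request 
 and mutating the shared Resource instance in place; a small illustration using 
 the two lines quoted above:
 {code}
 // Replacing the reference: the ask now points at a new Resource, but any other
 // holder of the original capability object (e.g. the submission context) still
 // sees the un-normalized values.
 ask.setCapability(normalized);

 // Mutating in place: the single shared Resource instance is updated, so every
 // holder of that object observes the normalized memory and vcores.
 ask.getCapability().setMemory(normalized.getMemory());
 ask.getCapability().setVirtualCores(normalized.getVirtualCores());
 {code}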

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-486) Change startContainer NM API to accept Container as a parameter and make ContainerLaunchContext user land

2013-03-31 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618491#comment-13618491
 ] 

Bikas Saha commented on YARN-486:
-

Once this change is made there is no need to copy the amContainer.resource in 
ASC as done in YARN-382. Linking the jiras.

 Change startContainer NM API to accept Container as a parameter and make 
 ContainerLaunchContext user land
 -

 Key: YARN-486
 URL: https://issues.apache.org/jira/browse/YARN-486
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Bikas Saha

 Currently, id, resource request etc need to be copied over from Container to 
 ContainerLaunchContext. This can be brittle. Also it leads to duplication of 
 information (such as Resource from CLC and Resource from Container and 
 Container.tokens). Sending Container directly to startContainer solves these 
 problems. It also makes CLC clean by only having stuff in it that is set by 
 the client/AM.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-193) Scheduler.normalizeRequest does not account for allocation requests that exceed maximumAllocation limits

2013-03-31 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618493#comment-13618493
 ] 

Bikas Saha commented on YARN-193:
-

Also, do we really need to create a new Resource object every time we call 
normalize? This should be a different jira though.

 Scheduler.normalizeRequest does not account for allocation requests that 
 exceed maximumAllocation limits 
 -

 Key: YARN-193
 URL: https://issues.apache.org/jira/browse/YARN-193
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.0.2-alpha, 3.0.0
Reporter: Hitesh Shah
Assignee: Zhijie Shen
 Attachments: MR-3796.1.patch, MR-3796.2.patch, MR-3796.3.patch, 
 MR-3796.wip.patch, YARN-193.4.patch, YARN-193.5.patch, YARN-193.6.patch, 
 YARN-193.7.patch, YARN-193.8.patch, YARN-193.9.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-309) Make RM provide heartbeat interval to NM

2013-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618502#comment-13618502
 ] 

Hadoop QA commented on YARN-309:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12576303/YARN-309.9.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer
  
org.apache.hadoop.yarn.server.resourcemanager.TestResourceTrackerService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/633//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/633//console

This message is automatically generated.

 Make RM provide heartbeat interval to NM
 

 Key: YARN-309
 URL: https://issues.apache.org/jira/browse/YARN-309
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-309.1.patch, YARN-309-20130331.txt, 
 YARN-309.2.patch, YARN-309.3.patch, YARN-309.4.patch, YARN-309.5.patch, 
 YARN-309.6.patch, YARN-309.7.patch, YARN-309.9.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-309) Make RM provide heartbeat interval to NM

2013-03-31 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-309:
---

Attachment: YARN-309.10.patch

 Make RM provide heartbeat interval to NM
 

 Key: YARN-309
 URL: https://issues.apache.org/jira/browse/YARN-309
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-309.10.patch, YARN-309.1.patch, 
 YARN-309-20130331.txt, YARN-309.2.patch, YARN-309.3.patch, YARN-309.4.patch, 
 YARN-309.5.patch, YARN-309.6.patch, YARN-309.7.patch, YARN-309.9.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-309) Make RM provide heartbeat interval to NM

2013-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618519#comment-13618519
 ] 

Hadoop QA commented on YARN-309:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12576309/YARN-309.10.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestContainerLocalizer

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/634//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/634//console

This message is automatically generated.

 Make RM provide heartbeat interval to NM
 

 Key: YARN-309
 URL: https://issues.apache.org/jira/browse/YARN-309
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Xuan Gong
Assignee: Xuan Gong
 Attachments: YARN-309.10.patch, YARN-309.1.patch, 
 YARN-309-20130331.txt, YARN-309.2.patch, YARN-309.3.patch, YARN-309.4.patch, 
 YARN-309.5.patch, YARN-309.6.patch, YARN-309.7.patch, YARN-309.9.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-291) Dynamic resource configuration on NM

2013-03-31 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618522#comment-13618522
 ] 

Luke Lu commented on YARN-291:
--

Resource scheduling is fundamentally centralized at RM. The global resource 
view is currently bootstrapped via the node registration process, which is more 
of a historical artifact based on convenience, since the resource view can also 
be constructed directly on RM via an inventory database. It's a round about, 
inconvenient and inefficient way to (re)construct the resource view by 
modifying per node config explicitly and propagate partial views to RM, if you 
already have an inventory database.

For a practical example: if you have OLTP workload (say HBase) sharing the same 
hardware with YARN and there is a load surge on HBase, we need to stop 
scheduling tasks/containers immediately on relevant (potentially all) nodes. 
The current patch (JMX is just used as a portable protocol for external 
management client to communicate with RM) can take effect immediately most 
efficiently. If we explicitly modify each nodemanager config and let the NM-RM 
protocol propagate the change to the RM, it would waste resources (CPU and 
network bandwidth) to contact (potentially) all the nodemanagers, cause 
unnecessary scheduling delays if the propagation is via the regular heartbeat, 
and/or DDoS the RM if (potentially) all the NMs need to re-register out-of-band.

This is not about JMX-based tricks. This is about changing the global resource view 
directly where the scheduler is, vs the Rube Goldbergish way of changing NM 
config individually and propagating changes to the RM to reconstruct the resource 
view. IMO, the direct way is better because the NM doesn't really care about what 
resources it really has.

 Dynamic resource configuration on NM
 

 Key: YARN-291
 URL: https://issues.apache.org/jira/browse/YARN-291
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager, scheduler
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: Elastic Resources for YARN-v0.2.pdf, 
 YARN-291-AddClientRMProtocolToSetNodeResource-03.patch, 
 YARN-291-all-v1.patch, YARN-291-core-HeartBeatAndScheduler-01.patch, 
 YARN-291-JMXInterfaceOnNM-02.patch, 
 YARN-291-OnlyUpdateWhenResourceChange-01-fix.patch, 
 YARN-291-YARNClientCommandline-04.patch


 The current Hadoop YARN resource management logic assumes per node resource 
 is static during the lifetime of the NM process. Allowing run-time 
 configuration on per node resource will give us finer granularity of resource 
 elasticity. This allows Hadoop workloads to coexist with other workloads on 
 the same hardware efficiently, whether or not the environment is virtualized. 
 For more background and design details, please refer to HADOOP-9165.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-457) Setting updated nodes from null to null causes NPE in AMResponsePBImpl

2013-03-31 Thread Kenji Kikushima (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenji Kikushima updated YARN-457:
-

Attachment: YARN-457.patch

 Setting updated nodes from null to null causes NPE in AMResponsePBImpl
 --

 Key: YARN-457
 URL: https://issues.apache.org/jira/browse/YARN-457
 Project: Hadoop YARN
  Issue Type: Bug
  Components: api
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Priority: Minor
  Labels: Newbie
 Attachments: YARN-457.patch


 {code}
 if (updatedNodes == null) {
   this.updatedNodes.clear();
   return;
 }
 {code}
 If updatedNodes is already null, a NullPointerException is thrown.
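 A minimal null-safe version of the setter, assuming the internal field starts 
 out null as described (the added guard is the only change; the rest of the 
 method is elided):
 {code}
 public void setUpdatedNodes(final List<NodeReport> updatedNodes) {
   if (updatedNodes == null) {
     // Nothing was ever built up, so setting null on top of null is a no-op.
     if (this.updatedNodes != null) {
       this.updatedNodes.clear();
     }
     return;
   }
   // ... existing copy-into-internal-list logic continues here ...
 }
 {code}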

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-18) Make locality in YARN's container assignment and task scheduling pluggable for other deployment topology

2013-03-31 Thread Luke Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-18?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618525#comment-13618525
 ] 

Luke Lu commented on YARN-18:
-

bq. I see a lot of factories and then no changes to actually support 
node-group..

YARN-18 is explicitly for making the topology pluggable, based on previous 
review requests. So there are few node-group-specific changes; those are 
implemented in YARN-19.

bq. we need to abstract the notion of topology in the Scheduler without having 
to make changes every time we need to add another layer in the topology.

This is a goal of the JIRA. You should be able to add another layer without 
having to modify scheduler code after this patch is committed. I'm not sure 
that we could make arbitrary topologies pluggable without scheduler changes. 
However, it should work for a common class of topologies, e.g., hierarchical 
network topologies.


 Make locality in YARN's container assignment and task scheduling pluggable 
 for other deployment topology
 -

 Key: YARN-18
 URL: https://issues.apache.org/jira/browse/YARN-18
 Project: Hadoop YARN
  Issue Type: New Feature
Affects Versions: 2.0.3-alpha
Reporter: Junping Du
Assignee: Junping Du
  Labels: features
 Attachments: 
 HADOOP-8474-ContainerAssignmentTaskScheduling-pluggable.patch, 
 MAPREDUCE-4309.patch, MAPREDUCE-4309-v2.patch, MAPREDUCE-4309-v3.patch, 
 MAPREDUCE-4309-v4.patch, MAPREDUCE-4309-v5.patch, MAPREDUCE-4309-v6.patch, 
 MAPREDUCE-4309-v7.patch, YARN-18.patch, YARN-18-v2.patch, YARN-18-v3.1.patch, 
 YARN-18-v3.2.patch, YARN-18-v3.patch, YARN-18-v4.1.patch, YARN-18-v4.2.patch, 
 YARN-18-v4.3.patch, YARN-18-v4.patch, YARN-18-v5.1.patch, YARN-18-v5.patch


 There are several classes in YARN’s container assignment and task scheduling 
 algorithms that relate to data locality, which were updated to give preference 
 to running a container at other locality levels besides node-local and rack-local 
 (like nodegroup-local). This proposes to make these data structures/algorithms 
 pluggable, e.g. SchedulerNode, RMNodeImpl, etc. The inner class 
 ScheduledRequests was made a package-level class so it would be easier to 
 create a subclass, ScheduledRequestsWithNodeGroup.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira