[jira] [Created] (YARN-1554) add an env variable for the YARN AM classpath

2014-01-02 Thread Steve Loughran (JIRA)
Steve Loughran created YARN-1554:


 Summary: add an env variable for the YARN AM classpath
 Key: YARN-1554
 URL: https://issues.apache.org/jira/browse/YARN-1554
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.2.0
Reporter: Steve Loughran
Priority: Minor


Currently YARN apps set up their classpath via the default value 
{{YarnConfiguration.DEFAULT_YARN_APPLICATION_CLASSPATH}} or an overridden 
property {{yarn.application.classpath}}. 

If you don't have the classpath right, the AM won't start up. This means the 
client needs to be explicitly configured with the CP.

If the node manager exported the classpath property via an env variable 
{{YARN_APPLICATION_CLASSPATH}}, then the classpath could be set up in the AM 
simply by referencing that property, rather than hoping its setting is in sync. 
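
A minimal sketch of the difference this would make on the client side, in plain Java with no Hadoop dependency. The class and method names are hypothetical; only the variable name {{YARN_APPLICATION_CLASSPATH}} comes from the description above. The {{./*}} entry and the {{:}} separator are assumptions for a Unix-style launch environment.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: builds the CLASSPATH value that a
// client would place in the AM's launch environment.
final class AmClasspathSketch {
    // Variable name proposed in this issue; expanded on the node, not the client.
    static final String NODE_VAR = "$YARN_APPLICATION_CLASSPATH";

    static String buildAmClasspath(List<String> configuredEntries, boolean nodeExportsVar) {
        StringBuilder cp = new StringBuilder("./*"); // assumed entry for the application jar
        if (nodeExportsVar) {
            // Proposed: reference whatever the node manager exports.
            cp.append(':').append(NODE_VAR);
        } else {
            // Today: copy the client-side setting and hope it matches the cluster.
            for (String entry : configuredEntries) {
                cp.append(':').append(entry.trim());
            }
        }
        return cp.toString();
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList("$HADOOP_CONF_DIR", "$HADOOP_YARN_HOME/share/hadoop/yarn/lib");
        System.out.println(buildAmClasspath(entries, false));
        System.out.println(buildAmClasspath(entries, true));
    }
}
```

With the variable exported by the node manager, the client no longer needs to carry its own copy of the classpath setting.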



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1554) add an env variable for the YARN AM classpath

2014-01-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860131#comment-13860131
 ] 

Steve Loughran commented on YARN-1554:
--

As noted in {{org.apache.hadoop.yarn.applications.distributedshell.Client}}:

{code}
// At some point we should not be required to add 
// the hadoop specific classpaths to the env. 
// It should be provided out of the box. 
// For now setting all required classpaths including
// the classpath to . for the application jar
{code}



[jira] [Updated] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-01-02 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-1506:
-

Attachment: YARN-1506-v1.patch

Uploaded the first version of the patch, which includes:
- Replace setting resources directly on RMNode/SchedulerNode with event notification.
- Update both the total resource and the available resource on SchedulerNode (fixes a bug 
in the previous patch).
- Remove resourceOption from RMNode, as RMNode doesn't have to be aware of 
overcommitTimeout.
- Other necessary changes in related tests.
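
The shape of this change, replacing a direct setter with an event that the scheduler-side node handles, can be sketched as follows. All types here are hypothetical stand-ins written for illustration, not the classes touched by the patch, and the dispatcher is synchronous for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; not the classes changed by the patch.
final class ResourceEventSketch {

    // Event describing a node's new total capacity.
    static final class NodeResourceUpdateEvent {
        final String nodeId;
        final int newTotalMb;

        NodeResourceUpdateEvent(String nodeId, int newTotalMb) {
            this.nodeId = nodeId;
            this.newTotalMb = newTotalMb;
        }
    }

    interface Handler {
        void handle(NodeResourceUpdateEvent event);
    }

    // Scheduler-side view of a node; it reacts to events instead of exposing a setter.
    static final class SchedulerNodeView implements Handler {
        final String nodeId;
        private int totalMb;
        private int availableMb;

        SchedulerNodeView(String nodeId, int totalMb, int usedMb) {
            this.nodeId = nodeId;
            this.totalMb = totalMb;
            this.availableMb = totalMb - usedMb;
        }

        @Override
        public void handle(NodeResourceUpdateEvent event) {
            if (!nodeId.equals(event.nodeId)) {
                return; // event is for a different node
            }
            int usedMb = totalMb - availableMb;
            // Update both total and available, keeping current usage unchanged.
            totalMb = event.newTotalMb;
            availableMb = event.newTotalMb - usedMb;
        }

        int totalMb() { return totalMb; }
        int availableMb() { return availableMb; }
    }

    // Delivers each event to every registered handler, in registration order.
    static final class Dispatcher {
        private final List<Handler> handlers = new ArrayList<>();

        void register(Handler handler) { handlers.add(handler); }

        void dispatch(NodeResourceUpdateEvent event) {
            for (Handler handler : handlers) {
                handler.handle(event);
            }
        }
    }
}
```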

 Replace set resource change on RMNode/SchedulerNode directly with event 
 notification.
 -

 Key: YARN-1506
 URL: https://issues.apache.org/jira/browse/YARN-1506
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, scheduler
Reporter: Junping Du
Assignee: Junping Du
Priority: Blocker
 Attachments: YARN-1506-v1.patch


 According to Vinod's comments on YARN-312 
 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087),
  we should replace RMNode.setResourceOption() with some resource change event.





[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860253#comment-13860253
 ] 

Hadoop QA commented on YARN-1506:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621074/YARN-1506-v1.patch
  against trunk revision .

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2772//console

This message is automatically generated.



[jira] [Updated] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-01-02 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-1506:
-

Attachment: YARN-1506-v2.patch

This did not run in Jenkins automatically. A manually started build failed, but the error 
message makes no sense. Renaming the patch to v2 and submitting it again.

 Replace set resource change on RMNode/SchedulerNode directly with event 
 notification.
 -

 Key: YARN-1506
 URL: https://issues.apache.org/jira/browse/YARN-1506
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, scheduler
Reporter: Junping Du
Assignee: Junping Du
Priority: Blocker
 Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch


 According to Vinod's comments on YARN-312 
 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087),
  we should replace RMNode.setResourceOption() with some resource change event.





[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860286#comment-13860286
 ] 

Hadoop QA commented on YARN-1506:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621084/YARN-1506-v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 2 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 2 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-tools/hadoop-sls 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2773//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/2773//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2773//console

This message is automatically generated.



[jira] [Updated] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-01-02 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-1506:
-

Attachment: YARN-1506-v3.patch

Fix findbugs warnings and javadoc warnings in v3 patch.

 Replace set resource change on RMNode/SchedulerNode directly with event 
 notification.
 -

 Key: YARN-1506
 URL: https://issues.apache.org/jira/browse/YARN-1506
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, scheduler
Reporter: Junping Du
Assignee: Junping Du
Priority: Blocker
 Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, 
 YARN-1506-v3.patch


 According to Vinod's comments on YARN-312 
 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087),
  we should replace RMNode.setResourceOption() with some resource change event.





[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860326#comment-13860326
 ] 

Hadoop QA commented on YARN-1506:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621095/YARN-1506-v3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-tools/hadoop-sls 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2774//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2774//console

This message is automatically generated.



[jira] [Assigned] (YARN-1138) yarn.application.classpath is set to point to $HADOOP_CONF_DIR etc., which does not work on Windows

2014-01-02 Thread Yingda Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yingda Chen reassigned YARN-1138:
-

Assignee: (was: Yingda Chen)

I am not actively working on YARN for the time being.

 yarn.application.classpath is set to point to $HADOOP_CONF_DIR etc., which 
 does not work on Windows
 ---

 Key: YARN-1138
 URL: https://issues.apache.org/jira/browse/YARN-1138
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Yingda Chen

 yarn-default.xml has the yarn.application.classpath entry set to 
 $HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/,$HADOOP_COMMON_HOME/share/hadoop/common/lib/,$HADOOP_HDFS_HOME/share/hadoop/hdfs/,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/,$HADOOP_YARN_HOME/share/hadoop/yarn/*,$HADOOP_YARN_HOME/share/hadoop/yarn/lib.
  This does not work on Windows and needs to be fixed.
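
The mismatch is that {{$NAME}} is Unix shell syntax, while the Windows command shell expands {{%NAME%}}. A minimal sketch of one possible direction, rewriting the references before the classpath reaches a Windows node, is below. The class and method are hypothetical, and this is not necessarily the fix adopted for this issue.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop.
final class EnvSyntaxSketch {
    // Matches Unix-style references such as $HADOOP_CONF_DIR.
    private static final Pattern UNIX_VAR = Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");

    // Rewrites every $NAME into %NAME%; all other characters are left untouched.
    static String toWindowsSyntax(String classpath) {
        Matcher matcher = UNIX_VAR.matcher(classpath);
        return matcher.replaceAll("%$1%");
    }

    public static void main(String[] args) {
        System.out.println(toWindowsSyntax("$HADOOP_CONF_DIR,$HADOOP_YARN_HOME/share/hadoop/yarn/lib"));
    }
}
```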





[jira] [Updated] (YARN-1553) Do not use HttpConfig.isSecure() in YARN

2014-01-02 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated YARN-1553:
-

Attachment: YARN-1553.001.patch

 Do not use HttpConfig.isSecure() in YARN
 

 Key: YARN-1553
 URL: https://issues.apache.org/jira/browse/YARN-1553
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Haohui Mai
 Attachments: YARN-1553.000.patch, YARN-1553.001.patch


 HDFS-5305 and related JIRAs decided that each individual project will have 
 its own configuration for HTTP policy. {{HttpConfig.isSecure}} is a global 
 static method, which no longer fits that design. The same functionality 
 should be moved into the YARN code base.
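
A minimal sketch of what a per-project lookup could look like in place of a global static check, in plain Java with a map standing in for the configuration object. The class, the enum, and the key name used here are assumptions for illustration and are not taken from the attached patches.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not the code in the attached patches.
final class YarnHttpPolicySketch {
    enum Policy { HTTP_ONLY, HTTPS_ONLY }

    // Assumed key name, used here only for illustration.
    static final String POLICY_KEY = "yarn.http.policy";

    // Reads the policy from the configuration passed in, not from global state.
    static Policy fromConf(Map<String, String> conf) {
        String raw = conf.getOrDefault(POLICY_KEY, Policy.HTTP_ONLY.name());
        return Policy.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    static String schemePrefix(Map<String, String> conf) {
        return fromConf(conf) == Policy.HTTPS_ONLY ? "https://" : "http://";
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(POLICY_KEY, "https_only");
        System.out.println(schemePrefix(conf));
    }
}
```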





[jira] [Commented] (YARN-1554) add an env variable for the YARN AM classpath

2014-01-02 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860661#comment-13860661
 ] 

Sandy Ryza commented on YARN-1554:
--

Is this a duplicate of YARN-973?



[jira] [Assigned] (YARN-1553) Do not use HttpConfig.isSecure() in YARN

2014-01-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reassigned YARN-1553:
-

Assignee: Haohui Mai

 Do not use HttpConfig.isSecure() in YARN
 

 Key: YARN-1553
 URL: https://issues.apache.org/jira/browse/YARN-1553
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: YARN-1553.000.patch, YARN-1553.001.patch


 HDFS-5305 and related JIRAs decided that each individual project will have 
 its own configuration for HTTP policy. {{HttpConfig.isSecure}} is a global 
 static method, which no longer fits that design. The same functionality 
 should be moved into the YARN code base.





[jira] [Commented] (YARN-1549) TestUnmanagedAMLauncher#testDSShell fails in trunk

2014-01-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860696#comment-13860696
 ] 

Hudson commented on YARN-1549:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4949 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4949/])
YARN-1549. Fixed a bug in ResourceManager's ApplicationMasterService that was 
causing unmanaged AMs to not finish correctly. Contributed by haosdent. 
(vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1554886)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/src/test/java/org/apache/hadoop/yarn/applications/unmanagedamlauncher/TestUnmanagedAMLauncher.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java


 TestUnmanagedAMLauncher#testDSShell fails in trunk
 --

 Key: YARN-1549
 URL: https://issues.apache.org/jira/browse/YARN-1549
 Project: Hadoop YARN
  Issue Type: Test
Affects Versions: 2.2.0
Reporter: Ted Yu
Assignee: haosdent
 Fix For: 2.4.0

 Attachments: YARN-1549.1.patch, YARN-1549.patch


 The following error is reproducible:
 {code}
 testDSShell(org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher)
   Time elapsed: 14.911 sec   ERROR!
 java.lang.RuntimeException: Failed to receive final expected state in 
 ApplicationReport, CurrentState=RUNNING, ExpectedStates=FINISHED,FAILED,KILLED
   at 
 org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.monitorApplication(UnmanagedAMLauncher.java:447)
   at 
 org.apache.hadoop.yarn.applications.unmanagedamlauncher.UnmanagedAMLauncher.run(UnmanagedAMLauncher.java:352)
   at 
 org.apache.hadoop.yarn.applications.unmanagedamlauncher.TestUnmanagedAMLauncher.testDSShell(TestUnmanagedAMLauncher.java:147)
 {code}
 See https://builds.apache.org/job/Hadoop-Yarn-trunk/435





[jira] [Commented] (YARN-1493) Schedulers don't recognize apps separately from app-attempts

2014-01-02 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860738#comment-13860738
 ] 

Hudson commented on YARN-1493:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4951 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4951/])
YARN-1493. Changed ResourceManager and Scheduler interfacing to recognize 
app-attempts separately from apps. Contributed by Jian He. (vinodkv: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1554896)
* 
/hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java
* 
/hadoop/common/trunk/hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSCapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppEventType.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptEventType.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/RMAppAttemptImpl.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/attempt/event/RMAppAttemptRejectedEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/ActiveUsersManager.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerAppReport.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerAppUtils.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplication.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerNode.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAddedSchedulerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppAttemptAddedSchedulerEvent.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/event/AppRemovedSchedulerEvent.java
* 

[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860766#comment-13860766
 ] 

Hadoop QA commented on YARN-1506:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621095/YARN-1506-v3.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2775//console

This message is automatically generated.



[jira] [Commented] (YARN-1553) Do not use HttpConfig.isSecure() in YARN

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860768#comment-13860768
 ] 

Hadoop QA commented on YARN-1553:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621133/YARN-1553.001.patch
  against trunk revision .

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2776//console

This message is automatically generated.



[jira] [Commented] (YARN-1297) Miscellaneous Fair Scheduler speedups

2014-01-02 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860792#comment-13860792
 ] 

Karthik Kambatla commented on YARN-1297:


First round of comments:
# Would be nice to see what the gains are corresponding to replacing 
ResourcePBImpl with SimpleResource. If it is not noticeable, it might be better 
to leave it as is.
# At a couple of places, instead of modifying the resource usage of a queue 
this way, it would be better to add a method to FSQueue that does this. 
{code}
+  Resources.addTo(cur.getResourceUsage(), container.getResource());
{code}
# I am surprised that direct comparisons instead of DefaultResourceCalculator make a 
noticeable performance difference. Can we measure the gains due to this change, 
and drop it if there are none?
{code}
-  Resource minShare1 = Resources.min(RESOURCE_CALCULATOR, null,
-  s1.getMinShare(), s1.getDemand());
+  int minShare1 = Math.min(s1.getMinShare().getMemory(),
+  s1.getDemand().getMemory());
{code}
# I am not an expert, but I hear Math#signum is supposed to be optimized for 
performance. Just curious - how much did changing this help? 
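
For reference, the two comparison styles under discussion look roughly like this. This is a minimal sketch with hypothetical names, not the code in the patch, and it makes no claim about which is faster; that is what the measurement requested above would show.

```java
// Hypothetical illustration of the two styles; not the patch itself.
final class CompareSketch {
    // Original style: derive the sign of the difference via Math.signum.
    static int viaSignum(double a, double b) {
        return (int) Math.signum(a - b);
    }

    // Replacement style: direct ifs and elses.
    static int direct(double a, double b) {
        if (a < b) {
            return -1;
        }
        if (a > b) {
            return 1;
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(viaSignum(0.25, 0.75) + " " + direct(0.25, 0.75));
    }
}
```

Both return the same result for ordinary finite inputs, so the change affects only cost, not ordering.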

 Miscellaneous Fair Scheduler speedups
 -

 Key: YARN-1297
 URL: https://issues.apache.org/jira/browse/YARN-1297
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-1297-1.patch, YARN-1297.patch, YARN-1297.patch


 I ran the Fair Scheduler's core scheduling loop through a profiler and 
 identified a bunch of minimally invasive changes that can shave off a few 
 milliseconds.
 The main one is demoting a couple INFO log messages to DEBUG, which brought 
 my benchmark down from 16000 ms to 6000.
 A few others (which had way less of an impact) were
 * Most of the time in comparisons was being spent in Math.signum.  I switched 
 this to direct ifs and elses and it halved the percent of time spent in 
 comparisons.
 * I removed some unnecessary instantiations of Resource objects
 * I made it so that queues' usage wasn't calculated from the applications up 
 each time getResourceUsage was called.





[jira] [Updated] (YARN-1413) [YARN-321] AHS WebUI should server aggregated logs as well

2014-01-02 Thread Mayank Bansal (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Bansal updated YARN-1413:


Attachment: YARN-1413-5.patch

Thanks [~vinodkv] for review.

Updating the patch.

This change is calling AHS server so no nodes will be called.

Thanks,
Mayank

 [YARN-321] AHS WebUI should server aggregated logs as well
 --

 Key: YARN-1413
 URL: https://issues.apache.org/jira/browse/YARN-1413
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: YARN-1413-1.patch, YARN-1413-2.patch, YARN-1413-3.patch, 
 YARN-1413-4.patch, YARN-1413-5.patch








[jira] [Updated] (YARN-1482) WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM

2014-01-02 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1482:


Attachment: YARN-1482.3.patch

 WebApplicationProxy should be always-on w.r.t HA even if it is embedded in 
 the RM
 -

 Key: YARN-1482
 URL: https://issues.apache.org/jira/browse/YARN-1482
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
 Attachments: YARN-1482.1.patch, YARN-1482.2.patch, YARN-1482.3.patch


 This way, even if an RM goes to standby mode, we can effect a redirect to the 
 active. And more importantly, users will not suddenly see all their links 
 stop working.





[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission

2014-01-02 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860887#comment-13860887
 ] 

Xuan Gong commented on YARN-1410:
-

Thanks for the comments, [~bikassaha] and [~kkambatl].

bq. But I think we have a separate createApplication() in order to get an appId 
for which to request RM tokens so that those tokens can be inserted in the 
AppSubmitContext before app submission

It looks like requesting RM tokens does not need an appId. But I agree 
that we still need createApplication(). It gives us a globally unique ID 
that we can use for several things, such as creating the JobId for a 
MapReduce job or using it as part of the path when setting up local resources.

For this ticket, I think the better approach is to ask clients to 
always use submitApplication() (adding comments to the YarnClient API). In 
submitApplication(), we can check whether an appId is provided in the ASC. 
If it is, we can use that appId to submit the application (after checking 
that it was issued by the current active RM). If not, we can request one and 
then do the submission.
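
The check proposed above could be sketched roughly as follows. This is a minimal illustration, not the YARN client code: `SubmitSketch`, `AppId`, `ActiveRm`, and the rule that an id is valid when its cluster timestamp matches the active RM are all assumptions made for the example.

```java
import java.util.Optional;

// Hypothetical, simplified model: none of these types are the real YARN classes.
class SubmitSketch {
    /** Simplified application id: the issuing RM's cluster timestamp plus a sequence number. */
    static final class AppId {
        final long clusterTimestamp;
        final int sequence;

        AppId(long clusterTimestamp, int sequence) {
            this.clusterTimestamp = clusterTimestamp;
            this.sequence = sequence;
        }
    }

    /** Stand-in for the currently active RM. */
    static final class ActiveRm {
        private final long clusterTimestamp;
        private int nextSequence = 1;

        ActiveRm(long clusterTimestamp) {
            this.clusterTimestamp = clusterTimestamp;
        }

        AppId newApplicationId() {
            return new AppId(clusterTimestamp, nextSequence++);
        }

        // Assumption: an id counts as issued by this RM if the timestamps match.
        boolean issued(AppId id) {
            return id.clusterTimestamp == clusterTimestamp;
        }
    }

    /** Returns the id under which the application is submitted. */
    static AppId submitApplication(ActiveRm rm, Optional<AppId> idFromContext) {
        // Reuse the id from the context only if the active RM issued it;
        // otherwise request a fresh id before submitting.
        return idFromContext.filter(rm::issued).orElseGet(rm::newApplicationId);
    }
}
```

With this shape, a client that obtained an id before a failover would get a fresh id from the new active RM instead of a rejected submission.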

 Handle client failover during 2 step client API's like app submission
 -

 Key: YARN-1410
 URL: https://issues.apache.org/jira/browse/YARN-1410
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Xuan Gong
 Attachments: YARN-1410.1.patch


 App submission involves
 1) creating appId
 2) using that appId to submit an ApplicationSubmissionContext to the user.
 The client may have obtained an appId from an RM, the RM may have failed 
 over, and the client may submit the app to the new RM.
 Since the new RM has a different notion of cluster timestamp (used to create 
 app id) the new RM may reject the app submission resulting in unexpected 
 failure on the client side.
 The same may happen for other 2 step client API operations.





[jira] [Commented] (YARN-1413) [YARN-321] AHS WebUI should server aggregated logs as well

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860955#comment-13860955
 ] 

Hadoop QA commented on YARN-1413:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621155/YARN-1413-5.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2778//console

This message is automatically generated.

 [YARN-321] AHS WebUI should server aggregated logs as well
 --

 Key: YARN-1413
 URL: https://issues.apache.org/jira/browse/YARN-1413
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: YARN-1413-1.patch, YARN-1413-2.patch, YARN-1413-3.patch, 
 YARN-1413-4.patch, YARN-1413-5.patch








[jira] [Commented] (YARN-1553) Do not use HttpConfig.isSecure() in YARN

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861028#comment-13861028
 ] 

Hadoop QA commented on YARN-1553:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621133/YARN-1553.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 6 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy:

  
org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps
  
org.apache.hadoop.yarn.server.resourcemanager.recovery.TestZKRMStateStore

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2777//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/2777//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2777//console

This message is automatically generated.

 Do not use HttpConfig.isSecure() in YARN
 

 Key: YARN-1553
 URL: https://issues.apache.org/jira/browse/YARN-1553
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: YARN-1553.000.patch, YARN-1553.001.patch


 HDFS-5305 and related jira decide that each individual project will have 
 their own configuration on http policy. {{HttpConfig.isSecure}} is a global 
 static method which does not fit the design anymore. The same functionality 
 should be moved into the YARN code base.





[jira] [Commented] (YARN-1539) Queue admin ACLs should NOT be similar to submit-acls w.r.t hierarchy.

2014-01-02 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861051#comment-13861051
 ] 

Sandy Ryza commented on YARN-1539:
--

Verified that the Fair Scheduler has the same behavior.

 Queue admin ACLs should NOT be similar to submit-acls w.r.t hierarchy.
 --

 Key: YARN-1539
 URL: https://issues.apache.org/jira/browse/YARN-1539
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli

 Today, Queue admin ACLs are similar to submit-acls w.r.t hierarchy in that if 
 one has to be able to administer a queue, he/she should be an admin of all 
 the queues in the ancestry - an unnecessary burden.
 This was added in YARN-899 and I believe is wrong semantics as well as 
 implementation.





[jira] [Commented] (YARN-1482) WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861075#comment-13861075
 ] 

Hadoop QA commented on YARN-1482:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621164/YARN-1482.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 3 
release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy:

  org.apache.hadoop.yarn.client.api.impl.TestYarnClient

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2779//testReport/
Release audit warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/2779//artifact/trunk/patchprocess/patchReleaseAuditProblems.txt
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2779//console

This message is automatically generated.

 WebApplicationProxy should be always-on w.r.t HA even if it is embedded in 
 the RM
 -

 Key: YARN-1482
 URL: https://issues.apache.org/jira/browse/YARN-1482
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
 Attachments: YARN-1482.1.patch, YARN-1482.2.patch, YARN-1482.3.patch


 This way, even if an RM goes to standby mode, we can effect a redirect to the 
 active. And more importantly, users will not suddenly see all their links 
 stop working.





[jira] [Updated] (YARN-1490) RM should optionally not kill all containers when an ApplicationMaster exits

2014-01-02 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1490:
--

Attachment: YARN-1490.1.patch

 RM should optionally not kill all containers when an ApplicationMaster exits
 

 Key: YARN-1490
 URL: https://issues.apache.org/jira/browse/YARN-1490
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Attachments: YARN-1490.1.patch


 This is needed to enable work-preserving AM restart. Some apps can choose to 
 reconnect with old running containers, some may not want to. This should be 
 an option.





[jira] [Updated] (YARN-1551) Allow user-specified reason for killApplication

2014-01-02 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated YARN-1551:


Attachment: YARN-1551.v03.patch

Fixing broken tests.

 Allow user-specified reason for killApplication
 ---

 Key: YARN-1551
 URL: https://issues.apache.org/jira/browse/YARN-1551
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1551.v01.patch, YARN-1551.v02.patch, 
 YARN-1551.v03.patch


 This completes MAPREDUCE-5648





[jira] [Assigned] (YARN-1490) RM should optionally not kill all containers when an ApplicationMaster exits

2014-01-02 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He reassigned YARN-1490:
-

Assignee: Jian He  (was: Vinod Kumar Vavilapalli)

 RM should optionally not kill all containers when an ApplicationMaster exits
 

 Key: YARN-1490
 URL: https://issues.apache.org/jira/browse/YARN-1490
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Jian He
 Attachments: YARN-1490.1.patch


 This is needed to enable work-preserving AM restart. Some apps can choose to 
 reconnect with old running containers, some may not want to. This should be 
 an option.





[jira] [Commented] (YARN-1490) RM should optionally not kill all containers when an ApplicationMaster exits

2014-01-02 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861093#comment-13861093
 ] 

Jian He commented on YARN-1490:
---

- Create a field in AppSubmissionContext to indicate whether to clean up the 
containers on AM failure.
- Copy the data structures (liveContainers etc.) inside 
SchedulerApplicationAttempt over when a new attempt is recovering the 
failed attempt's scheduler info.
- Similarly, copy the needed data structures (finished containers etc.) inside 
RMAppAttempt over when a new attempt is recovering the failed 
RMAppAttempt info.
- The failed attempt now still receives container events and records 
the finished containers, and the new attempt is created with a reference to 
the objects of the previous attempt.
- The appAttempt data structures inside the schedulers are removed; the current 
attempt is retrieved only via SchedulerApplication.getCurrentAppAttempt.
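
To make the first two bullets concrete, here is a minimal sketch of carrying live containers from a failed attempt to its successor. It is only an illustration: `AttemptRecoverySketch`, `Attempt`, and the `keepContainers` flag are hypothetical names, not the types or fields touched by the patch.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of an application attempt; not the scheduler's real classes.
class AttemptRecoverySketch {
    static final class Attempt {
        final int attemptId;
        final Set<String> liveContainers = new HashSet<>();

        Attempt(int attemptId) {
            this.attemptId = attemptId;
        }
    }

    /**
     * Creates the attempt that follows a failed one. keepContainers stands in for
     * the proposed submission-context field: when true, the live containers are
     * carried over; when false, the new attempt starts with none.
     */
    static Attempt nextAttempt(Attempt failed, boolean keepContainers) {
        Attempt next = new Attempt(failed.attemptId + 1);
        if (keepContainers) {
            next.liveContainers.addAll(failed.liveContainers);
        }
        return next;
    }
}
```

The sketch copies the container set for simplicity; sharing a reference to the previous attempt's objects, as the fourth bullet describes, is a different design choice.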

 RM should optionally not kill all containers when an ApplicationMaster exits
 

 Key: YARN-1490
 URL: https://issues.apache.org/jira/browse/YARN-1490
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Jian He
 Attachments: YARN-1490.1.patch


 This is needed to enable work-preserving AM restart. Some apps can choose to 
 reconnect with old running containers, some may not want to. This should be 
 an option.





[jira] [Commented] (YARN-1038) LocalizationProtocolPBClientImpl RPC failing

2014-01-02 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861094#comment-13861094
 ] 

haosdent commented on YARN-1038:


Complete logs would help us locate the cause of this error.

 LocalizationProtocolPBClientImpl RPC failing
 

 Key: YARN-1038
 URL: https://issues.apache.org/jira/browse/YARN-1038
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.0.0
Reporter: Alejandro Abdelnur
Priority: Blocker

 Trying to run an MR job in trunk is failing with:
 {code}
 2013-08-06 22:24:21,498 WARN org.apache.hadoop.ipc.Client: interrupted 
 waiting to send rpc request to server
 java.lang.InterruptedException
   at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1279)
   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218)
   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
   at 
 org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1019)
   at org.apache.hadoop.ipc.Client.call(Client.java:1372)
   at org.apache.hadoop.ipc.Client.call(Client.java:1352)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
   at com.sun.proxy.$Proxy25.heartbeat(Unknown Source)
   at 
 org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.client.LocalizationProtocolPBClientImpl.heartbeat(LocalizationProtocolPBClientImpl.java:62)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.localizeFiles(ContainerLocalizer.java:250)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer.runLocalization(ContainerLocalizer.java:164)
   at 
 org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.startLocalizer(DefaultContainerExecutor.java:107)
   at 
 org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.run(ResourceLocalizationService.java:977)
 {code}





[jira] [Updated] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-01-02 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-1506:
-

Attachment: YARN-1506-v4.patch

Some new changes in trunk made the v3 patch stale. Re-synced as the v4 patch.

 Replace set resource change on RMNode/SchedulerNode directly with event 
 notification.
 -

 Key: YARN-1506
 URL: https://issues.apache.org/jira/browse/YARN-1506
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, scheduler
Reporter: Junping Du
Assignee: Junping Du
Priority: Blocker
 Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, 
 YARN-1506-v3.patch, YARN-1506-v4.patch


 According to Vinod's comments on YARN-312 
 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087),
  we should replace RMNode.setResourceOption() with some resource change event.
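
 As a rough illustration of the idea (not the patch itself), a direct setter can be replaced by an event that the node handles. Everything in this sketch is hypothetical: `ResourceUpdateEvent`, the synchronous `Dispatcher`, and `Node` are made-up stand-ins, and YARN's own dispatcher and event types are not used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: resource changes arrive as events instead of setter calls.
class ResourceEventSketch {
    static final class ResourceUpdateEvent {
        final String nodeId;
        final int memoryMb;

        ResourceUpdateEvent(String nodeId, int memoryMb) {
            this.nodeId = nodeId;
            this.memoryMb = memoryMb;
        }
    }

    /** Minimal synchronous dispatcher; a real one would likely be asynchronous. */
    static final class Dispatcher {
        private final List<Consumer<ResourceUpdateEvent>> handlers = new ArrayList<>();

        void register(Consumer<ResourceUpdateEvent> handler) {
            handlers.add(handler);
        }

        void dispatch(ResourceUpdateEvent event) {
            for (Consumer<ResourceUpdateEvent> handler : handlers) {
                handler.accept(event);
            }
        }
    }

    static final class Node {
        final String nodeId;
        private int memoryMb;

        Node(String nodeId, int memoryMb, Dispatcher dispatcher) {
            this.nodeId = nodeId;
            this.memoryMb = memoryMb;
            dispatcher.register(this::onResourceUpdate);
        }

        // The node changes its own state in response to the event; no public setter.
        private void onResourceUpdate(ResourceUpdateEvent event) {
            if (event.nodeId.equals(nodeId)) {
                memoryMb = event.memoryMb;
            }
        }

        int memoryMb() {
            return memoryMb;
        }
    }
}
```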





[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861102#comment-13861102
 ] 

Hadoop QA commented on YARN-1506:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621217/YARN-1506-v4.patch
  against trunk revision .

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2782//console

This message is automatically generated.

 Replace set resource change on RMNode/SchedulerNode directly with event 
 notification.
 -

 Key: YARN-1506
 URL: https://issues.apache.org/jira/browse/YARN-1506
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, scheduler
Reporter: Junping Du
Assignee: Junping Du
Priority: Blocker
 Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, 
 YARN-1506-v3.patch, YARN-1506-v4.patch


 According to Vinod's comments on YARN-312 
 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087),
  we should replace RMNode.setResourceOption() with some resource change event.





[jira] [Commented] (YARN-1431) TestWebAppProxyServlet is failing on trunk

2014-01-02 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861107#comment-13861107
 ] 

haosdent commented on YARN-1431:


On my machine, this test passes. The error log from surefire looks normal 
because of this code.
{code:java}
  URL wrongUrl = new URL("http://localhost:9099/proxy/app");
{code}

 TestWebAppProxyServlet is failing on trunk
 --

 Key: YARN-1431
 URL: https://issues.apache.org/jira/browse/YARN-1431
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Omkar Vinit Joshi
Priority: Blocker

 Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.609 sec <<< 
 FAILURE! - in org.apache.hadoop.yarn.server.webproxy.TestWebAppProxyServlet
 testWebAppProxyServerMainMethod(org.apache.hadoop.yarn.server.webproxy.TestWebAppProxyServlet)
   Time elapsed: 5.006 sec  <<< ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.net.Inet4AddressImpl.getHostByAddr(Native Method)
   at java.net.InetAddress$1.getHostByAddr(InetAddress.java:881)
   at java.net.InetAddress.getHostFromNameService(InetAddress.java:560)
   at java.net.InetAddress.getCanonicalHostName(InetAddress.java:531)
   at 
 org.apache.hadoop.security.SecurityUtil.getLocalHostName(SecurityUtil.java:227)
   at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:247)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer.doSecureLogin(WebAppProxyServer.java:72)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer.serviceInit(WebAppProxyServer.java:57)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer.startServer(WebAppProxyServer.java:99)
   at 
 org.apache.hadoop.yarn.server.webproxy.TestWebAppProxyServlet.testWebAppProxyServerMainMethod(TestWebAppProxyServlet.java:187)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1490) RM should optionally not kill all containers when an ApplicationMaster exits

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861108#comment-13861108
 ] 

Hadoop QA commented on YARN-1490:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621206/YARN-1490.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 15 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 4 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-tools/hadoop-sls hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

org.apache.hadoop.yarn.client.api.impl.TestNMClient
org.apache.hadoop.yarn.client.api.impl.TestAMRMClient

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2781//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-YARN-Build/2781//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2781//console

This message is automatically generated.

 RM should optionally not kill all containers when an ApplicationMaster exits
 

 Key: YARN-1490
 URL: https://issues.apache.org/jira/browse/YARN-1490
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Jian He
 Attachments: YARN-1490.1.patch


 This is needed to enable work-preserving AM restart. Some apps can choose to 
 reconnect with old running containers, some may not want to. This should be 
 an option.





[jira] [Assigned] (YARN-1431) TestWebAppProxyServlet is failing on trunk

2014-01-02 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned YARN-1431:
--

Assignee: haosdent

 TestWebAppProxyServlet is failing on trunk
 --

 Key: YARN-1431
 URL: https://issues.apache.org/jira/browse/YARN-1431
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Omkar Vinit Joshi
Assignee: haosdent
Priority: Blocker

 Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.609 sec <<< 
 FAILURE! - in org.apache.hadoop.yarn.server.webproxy.TestWebAppProxyServlet
 testWebAppProxyServerMainMethod(org.apache.hadoop.yarn.server.webproxy.TestWebAppProxyServlet)
   Time elapsed: 5.006 sec  <<< ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.net.Inet4AddressImpl.getHostByAddr(Native Method)
   at java.net.InetAddress$1.getHostByAddr(InetAddress.java:881)
   at java.net.InetAddress.getHostFromNameService(InetAddress.java:560)
   at java.net.InetAddress.getCanonicalHostName(InetAddress.java:531)
   at 
 org.apache.hadoop.security.SecurityUtil.getLocalHostName(SecurityUtil.java:227)
   at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:247)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer.doSecureLogin(WebAppProxyServer.java:72)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer.serviceInit(WebAppProxyServer.java:57)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer.startServer(WebAppProxyServer.java:99)
   at 
 org.apache.hadoop.yarn.server.webproxy.TestWebAppProxyServlet.testWebAppProxyServerMainMethod(TestWebAppProxyServlet.java:187)





[jira] [Commented] (YARN-1431) TestWebAppProxyServlet is failing on trunk

2014-01-02 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861113#comment-13861113
 ] 

haosdent commented on YARN-1431:


The real cause of the timeout exception is this code. It may depend on your 
machine.

{code:java}
  mainServer  = WebAppProxyServer.startServer(conf);
{code}

{code}
java.lang.Exception: test timed out after 5000 milliseconds
at java.net.Inet4AddressImpl.getHostByAddr(Native Method)
{code}
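
The stack trace shows the time being spent under InetAddress.getCanonicalHostName(), which can perform a reverse name lookup. One way to check whether a given machine is affected is to time that call directly. The helper below is a small standalone sketch; the class and method names are made up for this example and are not part of the test.

```java
import java.net.InetAddress;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for measuring how long a canonical host name lookup takes.
class ReverseLookupTimer {
    /** Returns how many milliseconds getCanonicalHostName() took for the address. */
    static long timeCanonicalLookupMillis(InetAddress address) {
        long start = System.nanoTime();
        address.getCanonicalHostName(); // may trigger a reverse name lookup
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws Exception {
        InetAddress local = InetAddress.getLocalHost();
        System.out.println(local + ": canonical lookup took "
                + timeCanonicalLookupMillis(local) + " ms");
    }
}
```

If the reported time approaches the test's 5000 ms limit, the timeout is likely caused by name resolution on that machine rather than by the proxy server code.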

 TestWebAppProxyServlet is failing on trunk
 --

 Key: YARN-1431
 URL: https://issues.apache.org/jira/browse/YARN-1431
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.3.0
Reporter: Omkar Vinit Joshi
Priority: Blocker

 Tests run: 2, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 6.609 sec <<< 
 FAILURE! - in org.apache.hadoop.yarn.server.webproxy.TestWebAppProxyServlet
 testWebAppProxyServerMainMethod(org.apache.hadoop.yarn.server.webproxy.TestWebAppProxyServlet)
   Time elapsed: 5.006 sec  <<< ERROR!
 java.lang.Exception: test timed out after 5000 milliseconds
   at java.net.Inet4AddressImpl.getHostByAddr(Native Method)
   at java.net.InetAddress$1.getHostByAddr(InetAddress.java:881)
   at java.net.InetAddress.getHostFromNameService(InetAddress.java:560)
   at java.net.InetAddress.getCanonicalHostName(InetAddress.java:531)
   at 
 org.apache.hadoop.security.SecurityUtil.getLocalHostName(SecurityUtil.java:227)
   at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:247)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer.doSecureLogin(WebAppProxyServer.java:72)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer.serviceInit(WebAppProxyServer.java:57)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
   at 
 org.apache.hadoop.yarn.server.webproxy.WebAppProxyServer.startServer(WebAppProxyServer.java:99)
   at 
 org.apache.hadoop.yarn.server.webproxy.TestWebAppProxyServlet.testWebAppProxyServerMainMethod(TestWebAppProxyServlet.java:187)





[jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM

2014-01-02 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated YARN-1529:


Attachment: YARN-1529.v02.patch

Moved the YARN changes over from MAPREDUCE-5696.

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch


 Users are often unaware of localization cost that their jobs incur. To 
 measure effectiveness of localization caches it is necessary to expose the 
 overhead in the form of metrics.
 We propose addition of the following metrics to NodeManagerMetrics.
 When a container is about to launch, its set of LocalResources has to be 
 fetched from a central location, typically on HDFS, that results in a number 
 of download requests for the files missing in caches.
 LocalizedFilesMissed: total files (requests) downloaded from DFS. Cache 
 misses.
 LocalizedFilesCached: total localization requests that were served from local 
 caches. Cache hits.
 LocalizedBytesMissed: total bytes downloaded from DFS due to cache misses.
 LocalizedBytesCached: total bytes satisfied from local caches.
 Localized(Files|Bytes)CachedRatio: percentage of localized (files|bytes) that 
 were served out of cache: ratio = 100 * caches / (caches + misses)
 LocalizationDownloadNanos: total elapsed time in nanoseconds for a container 
 to go from ResourceRequestTransition to LocalizedTransition
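
 The ratio formula above can be written as a small helper. This is a sketch only: the class and method names are hypothetical, and both the integer arithmetic and the choice to report 0 when nothing has been localized are assumptions not stated in the description.

```java
// Hypothetical helper; not part of NodeManagerMetrics.
class LocalizationRatioSketch {
    /**
     * ratio = 100 * cached / (cached + missed), using integer division.
     * Assumption: returns 0 when there were no localization requests at all.
     */
    static long cachedRatioPercent(long cached, long missed) {
        long total = cached + missed;
        return total == 0 ? 0 : (100 * cached) / total;
    }
}
```

 For example, 3 cache hits and 1 miss give 100 * 3 / 4 = 75. The same helper applies to either the file counters or the byte counters.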





[jira] [Updated] (YARN-304) RM Tracking Links for purged applications needs a long-term solution

2014-01-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-304:
-

Issue Type: Sub-task  (was: Improvement)
Parent: YARN-321

 RM Tracking Links for purged applications needs a long-term solution
 

 Key: YARN-304
 URL: https://issues.apache.org/jira/browse/YARN-304
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 3.0.0, 0.23.5
Reporter: Derek Dagit

 This JIRA is intended to track a proper long-term fix for the issue described 
 in YARN-285.
 The following is from the original description:
 As applications complete, the RM tracks their IDs in a completed list. This 
 list is routinely truncated to limit the total number of applications 
 remembered by the RM.
 When a user clicks the History link for a job, the browser is redirected to 
 the application's tracking link obtained from the stored application 
 instance. But when the application has been purged from the RM, an error is 
 displayed.
 In very busy clusters the rate at which applications complete can cause 
 applications to be purged from the RM's internal list within hours, which 
 breaks the proxy URLs users have saved for their jobs.
 We would like the RM to provide valid tracking links that persist so that 
 users are not frustrated by broken links.
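
 The purging behaviour described above can be pictured with a bounded map that drops its oldest entry once a limit is exceeded. This is an illustrative sketch, not the RM's data structure; `CompletedAppsSketch`, `CompletedApps`, and `maxCompleted` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a truncated completed-application list.
class CompletedAppsSketch {
    /** Maps application id to tracking link, keeping only the newest entries. */
    static final class CompletedApps extends LinkedHashMap<String, String> {
        private final int maxCompleted;

        CompletedApps(int maxCompleted) {
            this.maxCompleted = maxCompleted;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
            // Called after each insertion; evicts the oldest entry once over the limit.
            return size() > maxCompleted;
        }
    }
}
```

 Once an application has been evicted, a lookup for its tracking link finds nothing, which corresponds to the broken saved URLs the description mentions.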





[jira] [Updated] (YARN-1482) WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM

2014-01-02 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1482:


Attachment: YARN-1482.4.patch

 WebApplicationProxy should be always-on w.r.t HA even if it is embedded in 
 the RM
 -

 Key: YARN-1482
 URL: https://issues.apache.org/jira/browse/YARN-1482
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
 Attachments: YARN-1482.1.patch, YARN-1482.2.patch, YARN-1482.3.patch, 
 YARN-1482.4.patch


 This way, even if an RM goes to standby mode, we can effect a redirect to the 
 active. And more importantly, users will not suddenly see all their links 
 stop working.





[jira] [Commented] (YARN-1482) WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM

2014-01-02 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861150#comment-13861150
 ] 

Xuan Gong commented on YARN-1482:
-

Fixed the -1 on release audit.

The test case failure is unrelated. I will open a ticket to track this test 
case failure.

 WebApplicationProxy should be always-on w.r.t HA even if it is embedded in 
 the RM
 -

 Key: YARN-1482
 URL: https://issues.apache.org/jira/browse/YARN-1482
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
 Attachments: YARN-1482.1.patch, YARN-1482.2.patch, YARN-1482.3.patch, 
 YARN-1482.4.patch


 This way, even if an RM goes to standby mode, we can effect a redirect to the 
 active. And more importantly, users will not suddenly see all their links 
 stop working.





[jira] [Commented] (YARN-1399) Allow users to annotate an application with multiple tags

2014-01-02 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861156#comment-13861156
 ] 

Vinod Kumar Vavilapalli commented on YARN-1399:
---

Agree that oozie should NOT kill apps as privileged user in any case.

bq. I am open to enforcing specifying either a user/queue when searching for a 
tag. However, in principle, this could happen with application-types as well: a 
user could submit a number of random YARN applications with type MAPREDUCE. I 
thought the way we were restricting exposing these (tags/types) was through 
ACLs on a secure cluster.
Agreed about application-types too but two wrong things don't make a ... But I 
see your reasoning. Solving this faking of application-types is hard.

I think that it boils down to
 - Documenting very clearly that types and tags can clash with other users' 
inputs and so need to be used judiciously.
 - Having options when listing apps filtered by application-type or by 
application-tags (app-type is a specific kind of tag, so clearly the latter 
subsumes the former). The important bit is that the default option be as 
restrictive as possible.

Today the getApplications() API returns the list of ALL applications, whether 
accessible or not. That's a bad default that I thought I filed a ticket about. 
By default we should only return the apps of the current user. Then there 
should be options to list all accessible apps (apps with view-acl), and then 
finally all apps across all users. We could follow something similar for this 
JIRA and set it as a precedent for fixing other existing issues with the 
default listing and the listing against app-types?

Maybe we should punt on the last one (listing ALL apps) altogether so that 
users can only obtain lists of apps that they have access to. This does break 
the existing getApplications() API which we can leave as is - in any case, 
information about all the apps that are not accessible is completely blacked 
out, so it just serves as a simple listing.
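
The listing options discussed above could be sketched roughly as follows. This is only an illustration of the "most restrictive default" idea: the class AppListingSketch, the Scope enum, and the App stand-in are hypothetical names and are not part of the YARN API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AppListingSketch {
    /** Hypothetical listing scopes; OWN is the restrictive default. */
    enum Scope { OWN, VIEWABLE }

    /** Hypothetical stand-in for an application report. */
    static class App {
        final String id;
        final String owner;
        final Set<String> viewers; // users granted view access via an ACL

        App(String id, String owner, String... viewers) {
            this.id = id;
            this.owner = owner;
            this.viewers = new HashSet<>(Arrays.asList(viewers));
        }
    }

    /** Default listing: only the caller's own applications. */
    static List<String> listApps(List<App> apps, String caller) {
        return listApps(apps, caller, Scope.OWN);
    }

    /** Listing with an explicit scope; VIEWABLE also includes apps the caller may view. */
    static List<String> listApps(List<App> apps, String caller, Scope scope) {
        List<String> ids = new ArrayList<>();
        for (App app : apps) {
            boolean own = app.owner.equals(caller);
            boolean viewable = own || app.viewers.contains(caller);
            if (scope == Scope.OWN ? own : viewable) {
                ids.add(app.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<App> apps = Arrays.asList(
            new App("app_1", "alice"),
            new App("app_2", "bob", "alice"),
            new App("app_3", "bob"));
        System.out.println(listApps(apps, "alice"));                 // own apps only
        System.out.println(listApps(apps, "alice", Scope.VIEWABLE)); // own + viewable
    }
}
```

In this sketch there is deliberately no scope that returns apps the caller cannot view, which corresponds to punting on the "all apps across all users" option.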

 Allow users to annotate an application with multiple tags
 -

 Key: YARN-1399
 URL: https://issues.apache.org/jira/browse/YARN-1399
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Zhijie Shen
Assignee: Zhijie Shen

 Nowadays, when submitting an application, users can fill the applicationType 
 field to facilitate searching it later. IMHO, it's good to accept multiple 
 tags to allow users to describe their applications in multiple aspects, 
 including the application type. Then, searching by tags may be more efficient 
 for users to reach their desired application collection. It's pretty much 
 like the tag system of online photo/video/music sites and so on.





[jira] [Updated] (YARN-1413) [YARN-321] AHS WebUI should server aggregated logs as well

2014-01-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1413:
--

Attachment: YARN-1413-6.patch

The original patch looked fine. Updating the same patch without the extraneous 
pom.xml changes.

 [YARN-321] AHS WebUI should server aggregated logs as well
 --

 Key: YARN-1413
 URL: https://issues.apache.org/jira/browse/YARN-1413
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Attachments: YARN-1413-1.patch, YARN-1413-2.patch, YARN-1413-3.patch, 
 YARN-1413-4.patch, YARN-1413-5.patch, YARN-1413-6.patch








[jira] [Resolved] (YARN-1413) [YARN-321] AHS WebUI should server aggregated logs as well

2014-01-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-1413.
---

   Resolution: Fixed
Fix Version/s: YARN-321
 Hadoop Flags: Reviewed

Committed this to YARN-321 branch. Thanks Mayank!

It'll be great to have tests for this and more. Will identify areas missing 
test-coverage and file tickets.

 [YARN-321] AHS WebUI should server aggregated logs as well
 --

 Key: YARN-1413
 URL: https://issues.apache.org/jira/browse/YARN-1413
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Zhijie Shen
Assignee: Mayank Bansal
 Fix For: YARN-321

 Attachments: YARN-1413-1.patch, YARN-1413-2.patch, YARN-1413-3.patch, 
 YARN-1413-4.patch, YARN-1413-5.patch, YARN-1413-6.patch








[jira] [Commented] (YARN-1551) Allow user-specified reason for killApplication

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861163#comment-13861163
 ] 

Hadoop QA commented on YARN-1551:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621207/YARN-1551.v03.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
 hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.client.api.impl.TestYarnClient

  The test build failed in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2780//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2780//console

This message is automatically generated.

 Allow user-specified reason for killApplication
 ---

 Key: YARN-1551
 URL: https://issues.apache.org/jira/browse/YARN-1551
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1551.v01.patch, YARN-1551.v02.patch, 
 YARN-1551.v03.patch


 This completes MAPREDUCE-5648





[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861167#comment-13861167
 ] 

Hadoop QA commented on YARN-1529:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621228/YARN-1529.v02.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2783//console

This message is automatically generated.

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch


 Users are often unaware of localization cost that their jobs incur. To 
 measure effectiveness of localization caches it is necessary to expose the 
 overhead in the form of metrics.
 We propose addition of the following metrics to NodeManagerMetrics.
 When a container is about to launch, its set of LocalResources has to be 
 fetched from a central location, typically on HDFS, that results in a number 
 of download requests for the files missing in caches.
 LocalizedFilesMissed: total files (requests) downloaded from DFS. Cache 
 misses.
 LocalizedFilesCached: total localization requests that were served from local 
 caches. Cache hits.
 LocalizedBytesMissed: total bytes downloaded from DFS due to cache misses.
 LocalizedBytesCached: total bytes satisfied from local caches.
 Localized(Files|Bytes)CachedRatio: percentage of localized (files|bytes) that 
 were served out of cache: ratio = 100 * caches / (caches + misses)
 LocalizationDownloadNanos: total elapsed time in nanoseconds for a container 
 to go from ResourceRequestTransition to LocalizedTransition
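
 The ratio formula above can be illustrated with a small sketch. The class and 
 method names here (CachedRatioSketch, cachedRatio) are hypothetical and are not 
 taken from the patch; the zero-total guard is also an assumption, since the 
 description does not say what the ratio should be before any request is seen.

```java
public class CachedRatioSketch {
    /**
     * Percentage of localized files (or bytes) served out of cache:
     * ratio = 100 * cached / (cached + missed).
     */
    public static double cachedRatio(long cached, long missed) {
        long total = cached + missed;
        // Assumption: report 0 when nothing has been localized yet,
        // rather than dividing by zero.
        if (total == 0) {
            return 0.0;
        }
        return 100.0 * cached / total;
    }

    public static void main(String[] args) {
        // 75 cache hits and 25 cache misses.
        System.out.println(cachedRatio(75, 25));
    }
}
```

 The same helper would apply to both the Files and the Bytes variants, since 
 both ratios are defined by the same expression.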





[jira] [Created] (YARN-1555) [YARN-321] Failing tests in org.apache.hadoop.yarn.server.applicationhistoryservice.*

2014-01-02 Thread Vinod Kumar Vavilapalli (JIRA)
Vinod Kumar Vavilapalli created YARN-1555:
-

 Summary: [YARN-321] Failing tests in 
org.apache.hadoop.yarn.server.applicationhistoryservice.*
 Key: YARN-1555
 URL: https://issues.apache.org/jira/browse/YARN-1555
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli


Several tests are failing on the latest YARN-321 branch.





[jira] [Commented] (YARN-1482) WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861171#comment-13861171
 ] 

Hadoop QA commented on YARN-1482:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621230/YARN-1482.4.patch
  against trunk revision .

{color:red}-1 patch{color}.  Trunk compilation may be broken.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2785//console

This message is automatically generated.

 WebApplicationProxy should be always-on w.r.t HA even if it is embedded in 
 the RM
 -

 Key: YARN-1482
 URL: https://issues.apache.org/jira/browse/YARN-1482
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
 Attachments: YARN-1482.1.patch, YARN-1482.2.patch, YARN-1482.3.patch, 
 YARN-1482.4.patch


 This way, even if an RM goes to standby mode, we can effect a redirect to the 
 active. And more importantly, users will not suddenly see all their links 
 stop working.





[jira] [Updated] (YARN-1555) [YARN-321] Failing tests in org.apache.hadoop.yarn.server.applicationhistoryservice.*

2014-01-02 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-1555:
--

Attachment: YARN-1555-20140102.txt

Simple fixes to the three failing tests.

 [YARN-321] Failing tests in 
 org.apache.hadoop.yarn.server.applicationhistoryservice.*
 -

 Key: YARN-1555
 URL: https://issues.apache.org/jira/browse/YARN-1555
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Attachments: YARN-1555-20140102.txt


 Several tests are failing on the latest YARN-321 branch.





[jira] [Commented] (YARN-1555) [YARN-321] Failing tests in org.apache.hadoop.yarn.server.applicationhistoryservice.*

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861178#comment-13861178
 ] 

Hadoop QA commented on YARN-1555:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12621244/YARN-1555-20140102.txt
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2786//console

This message is automatically generated.

 [YARN-321] Failing tests in 
 org.apache.hadoop.yarn.server.applicationhistoryservice.*
 -

 Key: YARN-1555
 URL: https://issues.apache.org/jira/browse/YARN-1555
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
 Attachments: YARN-1555-20140102.txt


 Several tests are failing on the latest YARN-321 branch.





[jira] [Commented] (YARN-1506) Replace set resource change on RMNode/SchedulerNode directly with event notification.

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861186#comment-13861186
 ] 

Hadoop QA commented on YARN-1506:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621217/YARN-1506-v4.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The test build failed in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2784//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2784//console

This message is automatically generated.

 Replace set resource change on RMNode/SchedulerNode directly with event 
 notification.
 -

 Key: YARN-1506
 URL: https://issues.apache.org/jira/browse/YARN-1506
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager, scheduler
Reporter: Junping Du
Assignee: Junping Du
Priority: Blocker
 Attachments: YARN-1506-v1.patch, YARN-1506-v2.patch, 
 YARN-1506-v3.patch, YARN-1506-v4.patch


 According to Vinod's comments on YARN-312 
 (https://issues.apache.org/jira/browse/YARN-312?focusedCommentId=13846087&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13846087),
  we should replace RMNode.setResourceOption() with some resource change event.





[jira] [Commented] (YARN-1551) Allow user-specified reason for killApplication

2014-01-02 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861188#comment-13861188
 ] 

Gera Shegalov commented on YARN-1551:
-

The test failure is unrelated. TestYarnClient.testAMMRTokens fails on trunk 
without the patch as well.

 Allow user-specified reason for killApplication
 ---

 Key: YARN-1551
 URL: https://issues.apache.org/jira/browse/YARN-1551
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1551.v01.patch, YARN-1551.v02.patch, 
 YARN-1551.v03.patch


 This completes MAPREDUCE-5648





[jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM

2014-01-02 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated YARN-1529:


Attachment: (was: YARN-1529.v02.patch)

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch


 Users are often unaware of localization cost that their jobs incur. To 
 measure effectiveness of localization caches it is necessary to expose the 
 overhead in the form of metrics.
 We propose addition of the following metrics to NodeManagerMetrics.
 When a container is about to launch, its set of LocalResources has to be 
 fetched from a central location, typically on HDFS, that results in a number 
 of download requests for the files missing in caches.
 LocalizedFilesMissed: total files (requests) downloaded from DFS. Cache 
 misses.
 LocalizedFilesCached: total localization requests that were served from local 
 caches. Cache hits.
 LocalizedBytesMissed: total bytes downloaded from DFS due to cache misses.
 LocalizedBytesCached: total bytes satisfied from local caches.
 Localized(Files|Bytes)CachedRatio: percentage of localized (files|bytes) that 
 were served out of cache: ratio = 100 * caches / (caches + misses)
 LocalizationDownloadNanos: total elapsed time in nanoseconds for a container 
 to go from ResourceRequestTransition to LocalizedTransition





[jira] [Updated] (YARN-1529) Add Localization overhead metrics to NM

2014-01-02 Thread Gera Shegalov (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gera Shegalov updated YARN-1529:


Attachment: YARN-1529.v02.patch

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch


 Users are often unaware of localization cost that their jobs incur. To 
 measure effectiveness of localization caches it is necessary to expose the 
 overhead in the form of metrics.
 We propose addition of the following metrics to NodeManagerMetrics.
 When a container is about to launch, its set of LocalResources has to be 
 fetched from a central location, typically on HDFS, that results in a number 
 of download requests for the files missing in caches.
 LocalizedFilesMissed: total files (requests) downloaded from DFS. Cache 
 misses.
 LocalizedFilesCached: total localization requests that were served from local 
 caches. Cache hits.
 LocalizedBytesMissed: total bytes downloaded from DFS due to cache misses.
 LocalizedBytesCached: total bytes satisfied from local caches.
 Localized(Files|Bytes)CachedRatio: percentage of localized (files|bytes) that 
 were served out of cache: ratio = 100 * caches / (caches + misses)
 LocalizationDownloadNanos: total elapsed time in nanoseconds for a container 
 to go from ResourceRequestTransition to LocalizedTransition





[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedConatiner call

2014-01-02 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861194#comment-13861194
 ] 

Sunil G commented on YARN-1398:
---

As per YARN-325, this issue was fixed before 2.1.0. But in 2.1.0 we can see 
that ParentQueue.completedContainer is called while holding a lock on the 
LeafQueue.

This can cause the same issue that is mentioned in YARN-325.

Is there any reason why the ParentQueue.completedContainer call was added back 
while holding the lock on the leaf queue?
As per YARN-325, the fix was to remove exactly this call, and this is 
mentioned in the comments there too.

 Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo 
 and completedConatiner call
 ---

 Key: YARN-1398
 URL: https://issues.apache.org/jira/browse/YARN-1398
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Sunil G
Priority: Critical

 getQueueInfo in ParentQueue calls child.getQueueInfo(), which tries to 
 acquire the leaf queue lock while holding the parent queue lock.
 If, at the same time, a completedContainer call has acquired the LeafQueue 
 lock, it will wait on the ParentQueue lock for ParentQueue's 
 completedContainer call.
 The two paths take the locks in opposite order, which can lead to deadlock.
 With JCarder, this is showing as a potential deadlock scenario.
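
 A minimal sketch of one way to avoid the inverted lock order is shown below. 
 It is an assumption-laden illustration, not the CapacityScheduler code: 
 Parent and Leaf are hypothetical stand-ins for ParentQueue and LeafQueue, and 
 the bookkeeping is reduced to simple counters. The leaf updates its own state 
 under its own lock and informs the parent only after releasing it, so no 
 thread ever holds leaf-then-parent while another holds parent-then-leaf.

```java
public class QueueLockOrderSketch {
    /** Hypothetical stand-in for ParentQueue. */
    static class Parent {
        private int completed;
        private boolean calledUnderChildLock;

        synchronized void completedContainer(Leaf child) {
            // Record whether the caller still held the child's monitor.
            calledUnderChildLock |= Thread.holdsLock(child);
            completed++;
        }

        synchronized int getCompleted() {
            return completed;
        }

        synchronized boolean wasCalledUnderChildLock() {
            return calledUnderChildLock;
        }

        /** getQueueInfo-style path: parent lock first, then child lock. */
        synchronized int queueInfo(Leaf child) {
            return child.getRunning();
        }
    }

    /** Hypothetical stand-in for LeafQueue. */
    static class Leaf {
        private final Parent parent;
        private int running;

        Leaf(Parent parent, int running) {
            this.parent = parent;
            this.running = running;
        }

        synchronized int getRunning() {
            return running;
        }

        void completedContainer() {
            synchronized (this) {
                running--; // leaf-local bookkeeping only
            }
            // Inform the parent after the leaf lock is released, so this
            // thread never holds the leaf lock while waiting for the parent.
            parent.completedContainer(this);
        }
    }

    public static void main(String[] args) {
        Parent parent = new Parent();
        Leaf leaf = new Leaf(parent, 2);
        leaf.completedContainer();
        System.out.println(parent.queueInfo(leaf));
        System.out.println(parent.wasCalledUnderChildLock());
    }
}
```

 Whether the real scheduler can move the parent call outside the synchronized 
 block depends on invariants not described in this issue, so this only 
 illustrates the lock-ordering idea.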





[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861211#comment-13861211
 ] 

Hadoop QA commented on YARN-1529:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621247/YARN-1529.v02.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 5 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2787//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2787//console

This message is automatically generated.

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch


 Users are often unaware of localization cost that their jobs incur. To 
 measure effectiveness of localization caches it is necessary to expose the 
 overhead in the form of metrics.
 We propose addition of the following metrics to NodeManagerMetrics.
 When a container is about to launch, its set of LocalResources has to be 
 fetched from a central location, typically on HDFS, that results in a number 
 of download requests for the files missing in caches.
 LocalizedFilesMissed: total files (requests) downloaded from DFS. Cache 
 misses.
 LocalizedFilesCached: total localization requests that were served from local 
 caches. Cache hits.
 LocalizedBytesMissed: total bytes downloaded from DFS due to cache misses.
 LocalizedBytesCached: total bytes satisfied from local caches.
 Localized(Files|Bytes)CachedRatio: percentage of localized (files|bytes) that 
 were served out of cache: ratio = 100 * caches / (caches + misses)
 LocalizationDownloadNanos: total elapsed time in nanoseconds for a container 
 to go from ResourceRequestTransition to LocalizedTransition





[jira] [Commented] (YARN-1398) Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo and completedConatiner call

2014-01-02 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861236#comment-13861236
 ] 

Sunil G commented on YARN-1398:
---

During the YARN-569 defect fix for adding a scheduling policy, the code segment 
below was added back inside the LeafQueue lock section:

  // Inform the parent queue
  getParent().completedContainer(clusterResource, application,
  node, rmContainer, null, event, this);

Please let me know whether this call is really required inside the synchronized 
block of the LeafQueue completedContainer call.

 Deadlock in capacity scheduler leaf queue and parent queue for getQueueInfo 
 and completedConatiner call
 ---

 Key: YARN-1398
 URL: https://issues.apache.org/jira/browse/YARN-1398
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.2.0
Reporter: Sunil G
Priority: Critical

 getQueueInfo in ParentQueue calls child.getQueueInfo(), which tries to 
 acquire the leaf queue lock while holding the parent queue lock.
 If, at the same time, a completedContainer call has acquired the LeafQueue 
 lock, it will wait on the ParentQueue lock for ParentQueue's 
 completedContainer call.
 The two paths take the locks in opposite order, which can lead to deadlock.
 With JCarder, this is showing as a potential deadlock scenario.





[jira] [Updated] (YARN-1482) WebApplicationProxy should be always-on w.r.t HA even if it is embedded in the RM

2014-01-02 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-1482:


Attachment: YARN-1482.4.patch

 WebApplicationProxy should be always-on w.r.t HA even if it is embedded in 
 the RM
 -

 Key: YARN-1482
 URL: https://issues.apache.org/jira/browse/YARN-1482
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong
 Attachments: YARN-1482.1.patch, YARN-1482.2.patch, YARN-1482.3.patch, 
 YARN-1482.4.patch, YARN-1482.4.patch


 This way, even if an RM goes to standby mode, we can effect a redirect to the 
 active. And more importantly, users will not suddenly see all their links 
 stop working.





[jira] [Updated] (YARN-1166) YARN 'appsFailed' metric should be of type 'counter'

2014-01-02 Thread Zhijie Shen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhijie Shen updated YARN-1166:
--

Attachment: YARN-1166.6.patch

Given that YARN-1493 separates app and app-attempt events, I created a new 
patch which not only changes 'appsFailed' from MutableGaugeInt to 
MutableCounterInt, but also binds each change of the counters to the correct 
event:

1. SubmitApp: increment appsSubmitted
2. SubmitAppAttempt: increment appsPending
3. RunAppAttempt: decrement appsPending, increment appsRunning
4. FinishAppAttempt: decrement appsRunning
5. FinishApp: increment appsCompleted/appsKilled/appsFailed

1, 2 and 5 are bound to the app-related events and are always increasing, 
while 3 and 4 are bound to the app-attempt-related events and can increase 
and decrease.
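
The five event bindings listed above could be sketched as follows. This is a simplified illustration using plain int fields; QueueAppMetricsSketch and its method names are hypothetical and do not reproduce QueueMetrics or the MutableCounterInt/MutableGaugeInt classes from the patch.

```java
public class QueueAppMetricsSketch {
    /** Hypothetical final states of an application. */
    enum FinalState { COMPLETED, KILLED, FAILED }

    // Cumulative metrics: only ever incremented in this sketch.
    private int appsSubmitted;
    private int appsCompleted;
    private int appsKilled;
    private int appsFailed;

    // Metrics that move up and down as attempts start and finish.
    private int appsPending;
    private int appsRunning;

    // 1. SubmitApp
    void submitApp() { appsSubmitted++; }

    // 2. SubmitAppAttempt
    void submitAppAttempt() { appsPending++; }

    // 3. RunAppAttempt
    void runAppAttempt() { appsPending--; appsRunning++; }

    // 4. FinishAppAttempt
    void finishAppAttempt() { appsRunning--; }

    // 5. FinishApp
    void finishApp(FinalState state) {
        switch (state) {
            case COMPLETED: appsCompleted++; break;
            case KILLED:    appsKilled++;    break;
            case FAILED:    appsFailed++;    break;
        }
    }

    int getAppsSubmitted() { return appsSubmitted; }
    int getAppsCompleted() { return appsCompleted; }
    int getAppsKilled() { return appsKilled; }
    int getAppsFailed() { return appsFailed; }
    int getAppsPending() { return appsPending; }
    int getAppsRunning() { return appsRunning; }

    public static void main(String[] args) {
        QueueAppMetricsSketch m = new QueueAppMetricsSketch();
        m.submitApp();
        // First attempt runs and finishes; a second attempt is then made.
        m.submitAppAttempt(); m.runAppAttempt(); m.finishAppAttempt();
        m.submitAppAttempt(); m.runAppAttempt(); m.finishAppAttempt();
        m.finishApp(FinalState.FAILED);
        System.out.println(m.getAppsSubmitted() + " submitted, "
            + m.getAppsFailed() + " failed, "
            + m.getAppsRunning() + " running");
    }
}
```

With two attempts of a single application, appsFailed is incremented once because it follows the app-level finish event rather than the per-attempt events.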

 YARN 'appsFailed' metric should be of type 'counter'
 

 Key: YARN-1166
 URL: https://issues.apache.org/jira/browse/YARN-1166
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.1.0-beta
Reporter: Srimanth Gunturi
Assignee: Zhijie Shen
Priority: Blocker
 Attachments: YARN-1166.2.patch, YARN-1166.3.patch, YARN-1166.4.patch, 
 YARN-1166.5.patch, YARN-1166.6.patch, YARN-1166.patch


 Currently in YARN's queue metrics, the cumulative metric 'appsFailed' is of 
 type 'gauge', which means the exact value will be reported. 
 All other cumulative queue metrics (AppsSubmitted, AppsCompleted, AppsKilled) 
 are of type 'counter', meaning Ganglia will use slope to provide deltas 
 between time-points.
 To be consistent, the AppsFailed metric should also be of type 'counter'. 


