[jira] [Updated] (YARN-1696) Document RM HA

2014-03-31 Thread Arun C Murthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated YARN-1696:


Target Version/s: 2.4.1  (was: 2.4.0)

 Document RM HA
 --

 Key: YARN-1696
 URL: https://issues.apache.org/jira/browse/YARN-1696
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Blocker
 Attachments: YARN-1696.2.patch, yarn-1696-1.patch


 Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
 required to call RM HA Stable and ready for public consumption. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1696) Document RM HA

2014-03-31 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955015#comment-13955015
 ] 

Arun C Murthy commented on YARN-1696:
-

[~kasha] - I'm almost done with rc0, so I'm moving this to 2.4.1 - if we need to spin 
an rc1 we can get this in. Otherwise, we can manually put this doc on the site when 
it is ready for 2.4.0. Thanks.

 Document RM HA
 --

 Key: YARN-1696
 URL: https://issues.apache.org/jira/browse/YARN-1696
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Blocker
 Attachments: YARN-1696.2.patch, yarn-1696-1.patch


 Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
 required to call RM HA Stable and ready for public consumption. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1879) Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol

2014-03-31 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated YARN-1879:
-

Attachment: YARN-1879.1.patch

 Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol
 ---

 Key: YARN-1879
 URL: https://issues.apache.org/jira/browse/YARN-1879
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Tsuyoshi OZAWA
Priority: Critical
 Attachments: YARN-1879.1.patch, YARN-1879.1.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1893) Make ApplicationMasterProtocol#allocate AtMostOnce

2014-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955097#comment-13955097
 ] 

Hudson commented on YARN-1893:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #525 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/525/])
YARN-1893. Mark AtMostOnce annotation to ApplicationMasterProtocol#allocate. 
Contributed by Xuan Gong. (jianhe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1583203)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationMasterProtocol.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestApplicationClientProtocolOnHA.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestApplicationMasterServiceOnHA.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestResourceTrackerOnHA.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java


 Make ApplicationMasterProtocol#allocate AtMostOnce
 --

 Key: YARN-1893
 URL: https://issues.apache.org/jira/browse/YARN-1893
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Xuan Gong
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.4.0

 Attachments: YARN-1893.1.patch, YARN-1893.1.patch, YARN-1893.2.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1893) Make ApplicationMasterProtocol#allocate AtMostOnce

2014-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955190#comment-13955190
 ] 

Hudson commented on YARN-1893:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1717 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1717/])
YARN-1893. Mark AtMostOnce annotation to ApplicationMasterProtocol#allocate. 
Contributed by Xuan Gong. (jianhe: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1583203)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationMasterProtocol.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestApplicationClientProtocolOnHA.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestApplicationMasterServiceOnHA.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestResourceTrackerOnHA.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java


 Make ApplicationMasterProtocol#allocate AtMostOnce
 --

 Key: YARN-1893
 URL: https://issues.apache.org/jira/browse/YARN-1893
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Xuan Gong
Assignee: Xuan Gong
Priority: Blocker
 Fix For: 2.4.0

 Attachments: YARN-1893.1.patch, YARN-1893.1.patch, YARN-1893.2.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1696) Document RM HA

2014-03-31 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955331#comment-13955331
 ] 

Karthik Kambatla commented on YARN-1696:


[~acmurthy] - sorry, I was not checking email over the weekend. I can get to 
this today. I was caught up with other things and, given there were other 
blockers, didn't rush on this. 

 Document RM HA
 --

 Key: YARN-1696
 URL: https://issues.apache.org/jira/browse/YARN-1696
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.3.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Blocker
 Attachments: YARN-1696.2.patch, yarn-1696-1.patch


 Add documentation for RM HA. Marking this a blocker for 2.4 as this is 
 required to call RM HA Stable and ready for public consumption. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (YARN-808) ApplicationReport does not clearly tell that the attempt is running or not

2014-03-31 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong resolved YARN-808.


Resolution: Won't Fix

Closing this ticket as Won't Fix: we already have APIs/CLI to get an 
ApplicationAttemptReport, so we do not need to expose it via ApplicationReport.
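
For reference, a minimal sketch (not part of this issue's patch) of fetching per-attempt information through the existing client API; the configuration and the application/attempt IDs below are placeholders:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.ApplicationAttemptReport;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.client.api.YarnClient;

public class AttemptReportSketch {
  public static void main(String[] args) throws Exception {
    YarnClient client = YarnClient.createYarnClient();
    client.init(new Configuration());
    client.start();
    // Placeholder IDs for illustration only.
    ApplicationId appId = ApplicationId.newInstance(1395435468498L, 11);
    ApplicationAttemptId attemptId = ApplicationAttemptId.newInstance(appId, 1);
    ApplicationAttemptReport report = client.getApplicationAttemptReport(attemptId);
    // The attempt state tells the caller whether the attempt is actually running.
    System.out.println("Attempt state: " + report.getYarnApplicationAttemptState());
    client.stop();
  }
}
{code}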

 ApplicationReport does not clearly tell that the attempt is running or not
 --

 Key: YARN-808
 URL: https://issues.apache.org/jira/browse/YARN-808
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Bikas Saha
Assignee: Xuan Gong
 Attachments: YARN-808.1.patch


 When an app attempt fails and is being retried, ApplicationReport immediately 
 gives the new attemptId and non-null values of host etc. There is no way for 
 clients to know whether the attempt is running other than connecting to it and 
 timing out on an invalid host. A solution would be to expose the attempt state 
 or return a null value for host instead of "N/A".



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1763) Handle RM failovers during the submitApplication call.

2014-03-31 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955426#comment-13955426
 ] 

Xuan Gong commented on YARN-1763:
-

already fixed with YARN-1521

 Handle RM failovers during the submitApplication call.
 --

 Key: YARN-1763
 URL: https://issues.apache.org/jira/browse/YARN-1763
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Xuan Gong
Assignee: Xuan Gong





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-808) ApplicationReport does not clearly tell that the attempt is running or not

2014-03-31 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955438#comment-13955438
 ] 

Bikas Saha commented on YARN-808:
-

We should at least change the app report response to include an invalid value for 
host and port when the host and port are not ready. Currently 
we return a value of "N/A" for the host, which is confusing since that non-null 
string could be a valid host. We should return null for the host and some negative 
number for the port.
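
A small sketch of how a client could interpret that convention (null host, negative port) if it were adopted; this is illustrative only and not part of any attached patch:
{code}
import org.apache.hadoop.yarn.api.records.ApplicationReport;

public class AmReadyCheck {
  // Hypothetical helper assuming the convention proposed above: a null host and a
  // negative RPC port mean the new attempt's AM is not reachable yet.
  static boolean amReady(ApplicationReport report) {
    return report.getHost() != null && report.getRpcPort() >= 0;
  }
}
{code}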

 ApplicationReport does not clearly tell that the attempt is running or not
 --

 Key: YARN-808
 URL: https://issues.apache.org/jira/browse/YARN-808
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Bikas Saha
Assignee: Xuan Gong
 Attachments: YARN-808.1.patch


 When an app attempt fails and is being retried, ApplicationReport immediately 
 gives the new attemptId and non-null values of host etc. There is no way for 
 clients to know whether the attempt is running other than connecting to it and 
 timing out on an invalid host. A solution would be to expose the attempt state 
 or return a null value for host instead of "N/A".



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Assigned] (YARN-1892) Excessive logging in RM

2014-03-31 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He reassigned YARN-1892:
-

Assignee: Jian He

 Excessive logging in RM
 ---

 Key: YARN-1892
 URL: https://issues.apache.org/jira/browse/YARN-1892
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siddharth Seth
Assignee: Jian He
Priority: Minor

 Mostly in the CS I believe
 {code}
  INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
  Application application_1395435468498_0011 reserved container 
 container_1395435468498_0011_01_000213 on node host:  #containers=5 
 available=4096 used=20960, currently has 1 at priority 4; currentReservation 
 4096
 {code}
 {code}
 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
 hive2 usedResources: <memory:20480, vCores:5> clusterResources: 
 <memory:81920, vCores:16> currentCapacity 0.25 required <memory:4096, 
 vCores:1> potentialNewCapacity: 0.255 (  max-capacity: 0.25)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1892) Excessive logging in RM

2014-03-31 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-1892:
--

Attachment: YARN-1892.1.patch

Simple patch to clean up some CapacityScheduler logging. These logs become even 
more excessive if async scheduling is enabled, which logs once per 5 ms cycle by default.

Move a few redundant logs regarding container reservation to debug level.
Fix a few log formats and contents.
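
For illustration, the kind of change described above, sketched with placeholder method arguments (not the actual patch):
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class ReservationLogSketch {
  private static final Log LOG = LogFactory.getLog(ReservationLogSketch.class);

  // Demote the per-reservation message to debug level and guard it, so the string
  // is not built on every scheduling cycle when debug logging is off.
  static void logReservation(String appId, String containerId, String node) {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Application " + appId + " reserved container " + containerId
          + " on node " + node);
    }
  }
}
{code}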

 Excessive logging in RM
 ---

 Key: YARN-1892
 URL: https://issues.apache.org/jira/browse/YARN-1892
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siddharth Seth
Assignee: Jian He
Priority: Minor
 Attachments: YARN-1892.1.patch


 Mostly in the CS I believe
 {code}
  INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
  Application application_1395435468498_0011 reserved container 
 container_1395435468498_0011_01_000213 on node host:  #containers=5 
 available=4096 used=20960, currently has 1 at priority 4; currentReservation 
 4096
 {code}
 {code}
 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
 hive2 usedResources: <memory:20480, vCores:5> clusterResources: 
 <memory:81920, vCores:16> currentCapacity 0.25 required <memory:4096, 
 vCores:1> potentialNewCapacity: 0.255 (  max-capacity: 0.25)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1870) FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo()

2014-03-31 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-1870:
--

Assignee: Fengdong Yu

 FileInputStream is not closed in 
 ProcfsBasedProcessTree#constructProcessSMAPInfo()
 --

 Key: YARN-1870
 URL: https://issues.apache.org/jira/browse/YARN-1870
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Fengdong Yu
Priority: Minor
 Attachments: YARN-1870.patch


 {code}
   List<String> lines = IOUtils.readLines(new FileInputStream(file));
 {code}
 FileInputStream is not closed.
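
One possible shape of the fix, sketched here for illustration (the attached patch may differ):
{code}
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.List;
import org.apache.commons.io.IOUtils;

public class SmapReadSketch {
  static List<String> readLines(File file) throws IOException {
    FileInputStream in = new FileInputStream(file);
    try {
      return IOUtils.readLines(in);
    } finally {
      in.close();   // previously the stream was leaked
    }
  }
}
{code}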



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-808) ApplicationReport does not clearly tell that the attempt is running or not

2014-03-31 Thread Zhijie Shen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955513#comment-13955513
 ] 

Zhijie Shen commented on YARN-808:
--

IMHO, it's a different issue. The "N/A" string is not restricted to the host field; 
it also appears in diagnostics, tracking URL, etc., and not only in ApplicationReport 
but also in ApplicationAttemptReport and ContainerReport.

I think the right thing is to decouple the data from the display. These string fields 
should stay null or empty so that callers that fetch the reports programmatically can 
validate them easily, while the web UI and CLI, which are the actual consumers of the 
reports, should check whether these fields are null or empty and display "N/A" when 
necessary.
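
A tiny sketch of the display-side convention described above (illustrative only):
{code}
public class ReportDisplay {
  // The data layer keeps the field null/empty; only the web UI / CLI substitutes "N/A".
  static String forDisplay(String field) {
    return (field == null || field.isEmpty()) ? "N/A" : field;
  }
}
{code}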

 ApplicationReport does not clearly tell that the attempt is running or not
 --

 Key: YARN-808
 URL: https://issues.apache.org/jira/browse/YARN-808
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Bikas Saha
Assignee: Xuan Gong
 Attachments: YARN-808.1.patch


 When an app attempt fails and is being retried, ApplicationReport immediately 
 gives the new attemptId and non-null values of host etc. There is no way for 
 clients to know whether the attempt is running other than connecting to it and 
 timing out on an invalid host. A solution would be to expose the attempt state 
 or return a null value for host instead of "N/A".



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1892) Excessive logging in RM

2014-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955544#comment-13955544
 ] 

Hadoop QA commented on YARN-1892:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12637891/YARN-1892.1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3492//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3492//console

This message is automatically generated.

 Excessive logging in RM
 ---

 Key: YARN-1892
 URL: https://issues.apache.org/jira/browse/YARN-1892
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siddharth Seth
Assignee: Jian He
Priority: Minor
 Attachments: YARN-1892.1.patch


 Mostly in the CS I believe
 {code}
  INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt:
  Application application_1395435468498_0011 reserved container 
 container_1395435468498_0011_01_000213 on node host:  #containers=5 
 available=4096 used=20960, currently has 1 at priority 4; currentReservation 
 4096
 {code}
 {code}
 INFO 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: 
 hive2 usedResources: <memory:20480, vCores:5> clusterResources: 
 <memory:81920, vCores:16> currentCapacity 0.25 required <memory:4096, 
 vCores:1> potentialNewCapacity: 0.255 (  max-capacity: 0.25)
 {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.

2014-03-31 Thread Ming Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated YARN-221:
-

Attachment: YARN-221-trunk-v2.patch

Here is the patch to support log aggregation sampling at the YARN layer. YARN 
applications can choose to override the default behavior. Without any change at the 
MR layer to specify a per-container log aggregation policy, the cluster-level YARN 
log aggregation sampling policy is applied.
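
A rough sketch of what a cluster-level sampling policy could look like; the class and method names here are hypothetical and not taken from the attached patch:
{code}
import java.util.Random;

public class SampleRateLogAggregationPolicy {
  private final Random random = new Random();
  private final float sampleRate;   // e.g. 0.2f aggregates logs for roughly 20% of containers

  public SampleRateLogAggregationPolicy(float sampleRate) {
    this.sampleRate = sampleRate;
  }

  // Applied per container when the application did not specify its own policy.
  public boolean shouldAggregate() {
    return random.nextFloat() < sampleRate;
  }
}
{code}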

 NM should provide a way for AM to tell it not to aggregate logs.
 

 Key: YARN-221
 URL: https://issues.apache.org/jira/browse/YARN-221
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Robert Joseph Evans
Assignee: Chris Trezzo
 Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch


 The NodeManager should provide a way for an AM to tell it that either the 
 logs should not be aggregated, that they should be aggregated with a high 
 priority, or that they should be aggregated but with a lower priority.  The 
 AM should be able to do this in the ContainerLaunchContext to provide a 
 default value, but should also be able to update the value when the container 
 is released.
 This would allow the NM to not aggregate logs in some cases, and avoid 
 connecting to the NN at all.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-85) Allow per job log aggregation configuration

2014-03-31 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955601#comment-13955601
 ] 

Ming Ma commented on YARN-85:
-

Regarding Seth's comment that container exit status is not necessarily an 
indication of whether a task completed successfully, 
https://issues.apache.org/jira/browse/MAPREDUCE-5465 should fix that issue.

 Allow per job log aggregation configuration
 ---

 Key: YARN-85
 URL: https://issues.apache.org/jira/browse/YARN-85
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Siddharth Seth
Assignee: Chris Trezzo
Priority: Critical

 Currently, if log aggregation is enabled for a cluster - logs for all jobs 
 will be aggregated - leading to a whole bunch of files on hdfs which users 
 may not want.
 Users should be able to control this along with the aggregation policy - 
 failed only, all, etc.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.

2014-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955678#comment-13955678
 ] 

Hadoop QA commented on YARN-221:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12637905/YARN-221-trunk-v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3493//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3493//console

This message is automatically generated.

 NM should provide a way for AM to tell it not to aggregate logs.
 

 Key: YARN-221
 URL: https://issues.apache.org/jira/browse/YARN-221
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Reporter: Robert Joseph Evans
Assignee: Chris Trezzo
 Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch


 The NodeManager should provide a way for an AM to tell it that either the 
 logs should not be aggregated, that they should be aggregated with a high 
 priority, or that they should be aggregated but with a lower priority.  The 
 AM should be able to do this in the ContainerLaunchContext to provide a 
 default value, but should also be able to update the value when the container 
 is released.
 This would allow the NM to not aggregate logs in some cases, and avoid 
 connecting to the NN at all.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator

2014-03-31 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955762#comment-13955762
 ] 

Sandy Ryza commented on YARN-1889:
--

+1

 avoid creating new objects on each fair scheduler call to AppSchedulable 
 comparator
 ---

 Key: YARN-1889
 URL: https://issues.apache.org/jira/browse/YARN-1889
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Hong Zhiguo
Priority: Minor
  Labels: reviewed
 Attachments: YARN-1889.patch, YARN-1889.patch


 In fair scheduler, in each scheduling attempt, a full sort is
 performed on List of AppSchedulable, which invokes Comparator.compare
 method many times. Both FairShareComparator and DRFComparator call
 AppSchedulable.getWeights, and AppSchedulable.getPriority.
 A new ResourceWeights object is allocated on each call of getWeights,
 and the same for getPriority. This introduces a lot of pressure to
 GC because these methods are called very very frequently.
 Below test case shows improvement on performance and GC behaviour. The 
 results show that the GC pressure during processing NodeUpdate is reduced by 
 half by this patch.
 The code to show the improvement: (Add it to TestFairScheduler.java)
 import java.lang.management.GarbageCollectorMXBean;
 import java.lang.management.ManagementFactory;
   public void printGCStats() {
     long totalGarbageCollections = 0;
     long garbageCollectionTime = 0;
     for (GarbageCollectorMXBean gc :
         ManagementFactory.getGarbageCollectorMXBeans()) {
       long count = gc.getCollectionCount();
       if (count >= 0) {
         totalGarbageCollections += count;
       }
       long time = gc.getCollectionTime();
       if (time >= 0) {
         garbageCollectionTime += time;
       }
     }
     System.out.println("Total Garbage Collections: "
         + totalGarbageCollections);
     System.out.println("Total Garbage Collection Time (ms): "
         + garbageCollectionTime);
   }
   @Test
   public void testImpactOnGC() throws Exception {
     scheduler.reinitialize(conf, resourceManager.getRMContext());
     // Add nodes
     int numNode = 1;
     for (int i = 0; i < numNode; ++i) {
       String host = String.format("192.1.%d.%d", i / 256, i % 256);
       RMNode node =
           MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), i, host);
       NodeAddedSchedulerEvent nodeEvent = new NodeAddedSchedulerEvent(node);
       scheduler.handle(nodeEvent);
       assertEquals(1024 * 64 * (i + 1),
           scheduler.getClusterCapacity().getMemory());
     }
     assertEquals(numNode, scheduler.getNumClusterNodes());
     assertEquals(1024 * 64 * numNode,
         scheduler.getClusterCapacity().getMemory());
     // add apps, each app has 100 containers.
     int minReqSize =
         FairSchedulerConfiguration.DEFAULT_RM_SCHEDULER_INCREMENT_ALLOCATION_MB;
     int numApp = 8000;
     int priority = 1;
     for (int i = 1; i < numApp + 1; ++i) {
       ApplicationAttemptId attemptId = createAppAttemptId(i, 1);
       AppAddedSchedulerEvent appAddedEvent = new AppAddedSchedulerEvent(
           attemptId.getApplicationId(), "queue1", "user1");
       scheduler.handle(appAddedEvent);
       AppAttemptAddedSchedulerEvent attemptAddedEvent =
           new AppAttemptAddedSchedulerEvent(attemptId, false);
       scheduler.handle(attemptAddedEvent);
       createSchedulingRequestExistingApplication(minReqSize * 2, 1,
           priority, attemptId);
     }
     scheduler.update();
     assertEquals(numApp, scheduler.getQueueManager().getLeafQueue("queue1", true)
         .getRunnableAppSchedulables().size());
     System.out.println("GC stats before NodeUpdate processing:");
     printGCStats();
     int hb_num = 5000;
     long start = System.nanoTime();
     for (int i = 0; i < hb_num; ++i) {
       String host = String.format("192.1.%d.%d", i / 256, i % 256);
       RMNode node =
           MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), 5000, host);
       NodeUpdateSchedulerEvent nodeEvent = new NodeUpdateSchedulerEvent(node);
       scheduler.handle(nodeEvent);
     }
     long end = System.nanoTime();
     System.out.printf("processing time for a NodeUpdate in average: %d us\n",
         (end - start) / (hb_num * 1000));
     System.out.println("GC stats after NodeUpdate processing:");
     printGCStats();
   }



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator

2014-03-31 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1889:
-

Assignee: Hong Zhiguo

 avoid creating new objects on each fair scheduler call to AppSchedulable 
 comparator
 ---

 Key: YARN-1889
 URL: https://issues.apache.org/jira/browse/YARN-1889
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Hong Zhiguo
Assignee: Hong Zhiguo
Priority: Minor
  Labels: reviewed
 Attachments: YARN-1889.patch, YARN-1889.patch


 In fair scheduler, in each scheduling attempt, a full sort is
 performed on List of AppSchedulable, which invokes Comparator.compare
 method many times. Both FairShareComparator and DRFComparator call
 AppSchedulable.getWeights, and AppSchedulable.getPriority.
 A new ResourceWeights object is allocated on each call of getWeights,
 and the same for getPriority. This introduces a lot of pressure to
 GC because these methods are called very very frequently.
 Below test case shows improvement on performance and GC behaviour. The 
 results show that the GC pressure during processing NodeUpdate is reduced by 
 half by this patch.
 The code to show the improvement: (Add it to TestFairScheduler.java)
 import java.lang.management.GarbageCollectorMXBean;
 import java.lang.management.ManagementFactory;
   public void printGCStats() {
     long totalGarbageCollections = 0;
     long garbageCollectionTime = 0;
     for (GarbageCollectorMXBean gc :
         ManagementFactory.getGarbageCollectorMXBeans()) {
       long count = gc.getCollectionCount();
       if (count >= 0) {
         totalGarbageCollections += count;
       }
       long time = gc.getCollectionTime();
       if (time >= 0) {
         garbageCollectionTime += time;
       }
     }
     System.out.println("Total Garbage Collections: "
         + totalGarbageCollections);
     System.out.println("Total Garbage Collection Time (ms): "
         + garbageCollectionTime);
   }
   @Test
   public void testImpactOnGC() throws Exception {
     scheduler.reinitialize(conf, resourceManager.getRMContext());
     // Add nodes
     int numNode = 1;
     for (int i = 0; i < numNode; ++i) {
       String host = String.format("192.1.%d.%d", i / 256, i % 256);
       RMNode node =
           MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), i, host);
       NodeAddedSchedulerEvent nodeEvent = new NodeAddedSchedulerEvent(node);
       scheduler.handle(nodeEvent);
       assertEquals(1024 * 64 * (i + 1),
           scheduler.getClusterCapacity().getMemory());
     }
     assertEquals(numNode, scheduler.getNumClusterNodes());
     assertEquals(1024 * 64 * numNode,
         scheduler.getClusterCapacity().getMemory());
     // add apps, each app has 100 containers.
     int minReqSize =
         FairSchedulerConfiguration.DEFAULT_RM_SCHEDULER_INCREMENT_ALLOCATION_MB;
     int numApp = 8000;
     int priority = 1;
     for (int i = 1; i < numApp + 1; ++i) {
       ApplicationAttemptId attemptId = createAppAttemptId(i, 1);
       AppAddedSchedulerEvent appAddedEvent = new AppAddedSchedulerEvent(
           attemptId.getApplicationId(), "queue1", "user1");
       scheduler.handle(appAddedEvent);
       AppAttemptAddedSchedulerEvent attemptAddedEvent =
           new AppAttemptAddedSchedulerEvent(attemptId, false);
       scheduler.handle(attemptAddedEvent);
       createSchedulingRequestExistingApplication(minReqSize * 2, 1,
           priority, attemptId);
     }
     scheduler.update();
     assertEquals(numApp, scheduler.getQueueManager().getLeafQueue("queue1", true)
         .getRunnableAppSchedulables().size());
     System.out.println("GC stats before NodeUpdate processing:");
     printGCStats();
     int hb_num = 5000;
     long start = System.nanoTime();
     for (int i = 0; i < hb_num; ++i) {
       String host = String.format("192.1.%d.%d", i / 256, i % 256);
       RMNode node =
           MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), 5000, host);
       NodeUpdateSchedulerEvent nodeEvent = new NodeUpdateSchedulerEvent(node);
       scheduler.handle(nodeEvent);
     }
     long end = System.nanoTime();
     System.out.printf("processing time for a NodeUpdate in average: %d us\n",
         (end - start) / (hb_num * 1000));
     System.out.println("GC stats after NodeUpdate processing:");
     printGCStats();
   }



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator

2014-03-31 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-1889:
-

Description: 
In fair scheduler, in each scheduling attempt, a full sort is
performed on List of AppSchedulable, which invokes Comparator.compare
method many times. Both FairShareComparator and DRFComparator call
AppSchedulable.getWeights, and AppSchedulable.getPriority.

A new ResourceWeights object is allocated on each call of getWeights,
and the same for getPriority. This introduces a lot of pressure to
GC because these methods are called very very frequently.

Below test case shows improvement on performance and GC behaviour. The results 
show that the GC pressure during processing NodeUpdate is reduced by half by this 
patch.

The code to show the improvement: (Add it to TestFairScheduler.java)

{code}
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

  public void printGCStats() {
    long totalGarbageCollections = 0;
    long garbageCollectionTime = 0;

    for (GarbageCollectorMXBean gc :
        ManagementFactory.getGarbageCollectorMXBeans()) {
      long count = gc.getCollectionCount();
      if (count >= 0) {
        totalGarbageCollections += count;
      }

      long time = gc.getCollectionTime();
      if (time >= 0) {
        garbageCollectionTime += time;
      }
    }

    System.out.println("Total Garbage Collections: "
        + totalGarbageCollections);
    System.out.println("Total Garbage Collection Time (ms): "
        + garbageCollectionTime);
  }

  @Test
  public void testImpactOnGC() throws Exception {
    scheduler.reinitialize(conf, resourceManager.getRMContext());

    // Add nodes
    int numNode = 1;

    for (int i = 0; i < numNode; ++i) {
      String host = String.format("192.1.%d.%d", i / 256, i % 256);
      RMNode node =
          MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), i, host);
      NodeAddedSchedulerEvent nodeEvent = new NodeAddedSchedulerEvent(node);
      scheduler.handle(nodeEvent);
      assertEquals(1024 * 64 * (i + 1),
          scheduler.getClusterCapacity().getMemory());
    }
    assertEquals(numNode, scheduler.getNumClusterNodes());
    assertEquals(1024 * 64 * numNode,
        scheduler.getClusterCapacity().getMemory());

    // add apps, each app has 100 containers.
    int minReqSize =
        FairSchedulerConfiguration.DEFAULT_RM_SCHEDULER_INCREMENT_ALLOCATION_MB;
    int numApp = 8000;
    int priority = 1;

    for (int i = 1; i < numApp + 1; ++i) {
      ApplicationAttemptId attemptId = createAppAttemptId(i, 1);
      AppAddedSchedulerEvent appAddedEvent = new AppAddedSchedulerEvent(
          attemptId.getApplicationId(), "queue1", "user1");
      scheduler.handle(appAddedEvent);
      AppAttemptAddedSchedulerEvent attemptAddedEvent =
          new AppAttemptAddedSchedulerEvent(attemptId, false);
      scheduler.handle(attemptAddedEvent);
      createSchedulingRequestExistingApplication(minReqSize * 2, 1, priority,
          attemptId);
    }
    scheduler.update();

    assertEquals(numApp, scheduler.getQueueManager().getLeafQueue("queue1", true)
        .getRunnableAppSchedulables().size());

    System.out.println("GC stats before NodeUpdate processing:");
    printGCStats();
    int hb_num = 5000;
    long start = System.nanoTime();
    for (int i = 0; i < hb_num; ++i) {
      String host = String.format("192.1.%d.%d", i / 256, i % 256);
      RMNode node =
          MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), 5000, host);
      NodeUpdateSchedulerEvent nodeEvent = new NodeUpdateSchedulerEvent(node);
      scheduler.handle(nodeEvent);
    }
    long end = System.nanoTime();

    System.out.printf("processing time for a NodeUpdate in average: %d us\n",
        (end - start) / (hb_num * 1000));

    System.out.println("GC stats after NodeUpdate processing:");
    printGCStats();
  }

{code}


  was:
In fair scheduler, in each scheduling attempt, a full sort is
performed on List of AppSchedulable, which invokes Comparator.compare
method many times. Both FairShareComparator and DRFComparator call
AppSchedulable.getWeights, and AppSchedulable.getPriority.

A new ResourceWeights object is allocated on each call of getWeights,
and the same for getPriority. This introduces a lot of pressure to
GC because these methods are called very very frequently.

Below test case shows improvement on performance and GC behaviour. The results 
show that the GC pressure during processing NodeUpdate is reduced by half by this 
patch.

The code to show the improvement: (Add it to TestFairScheduler.java)

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
  public void printGCStats() {
long totalGarbageCollections = 0;
long garbageCollectionTime = 0;

for(GarbageCollectorMXBean gc :
  ManagementFactory.getGarbageCollectorMXBeans()) {
  long 

[jira] [Commented] (YARN-1879) Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol

2014-03-31 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955817#comment-13955817
 ] 

Tsuyoshi OZAWA commented on YARN-1879:
--

I'm adding RetryCache support and tests to 
registerApplicationMaster()/unregisterApplicationMaster(). Please let me know 
if you have a more appropriate idea. 
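
For context, the usual RetryCache pattern used elsewhere in Hadoop, which this work would follow; the cache name and sizing below are illustrative, not from the patch:
{code}
import org.apache.hadoop.ipc.RetryCache;
import org.apache.hadoop.ipc.RetryCache.CacheEntry;

public class RetryCacheSketch {
  // Cache name, heap fraction and expiry are placeholders.
  private final RetryCache retryCache =
      new RetryCache("ApplicationMasterService", 0.03, 10 * 60 * 1000L);

  public void unregisterOnce() {
    CacheEntry entry = RetryCache.waitForCompletion(retryCache);
    if (entry != null && entry.isSuccess()) {
      return;   // a retried call: the work already succeeded, do not repeat it
    }
    boolean success = false;
    try {
      // ... perform the actual unregistration exactly once ...
      success = true;
    } finally {
      RetryCache.setState(entry, success);
    }
  }
}
{code}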

 Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol
 ---

 Key: YARN-1879
 URL: https://issues.apache.org/jira/browse/YARN-1879
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Tsuyoshi OZAWA
Priority: Critical
 Attachments: YARN-1879.1.patch, YARN-1879.1.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (YARN-904) Enable multiple QOP for ResourceManager

2014-03-31 Thread Benoy Antony (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoy Antony resolved YARN-904.
---

Resolution: Duplicate

Resolved via HDFS-5910 and HADOOP-10221.

 Enable multiple QOP for ResourceManager
 ---

 Key: YARN-904
 URL: https://issues.apache.org/jira/browse/YARN-904
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Benoy Antony
 Attachments: yarn-904.patch


 Currently the ResourceManager supports only a single QOP.
 This feature makes the ResourceManager listen on two RPC ports: 
 one RPC port supports only authentication, the other supports privacy.
 Please see HADOOP-9709 for general requirements.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1889) In Fair Scheduler, avoid creating objects on each call to AppSchedulable comparator

2014-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955912#comment-13955912
 ] 

Hudson commented on YARN-1889:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5440 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5440/])
YARN-1889. In Fair Scheduler, avoid creating objects on each call to 
AppSchedulable comparator (Hong Zhiguo via Sandy Ryza) (sandy: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1583491)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/resource/ResourceWeights.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java
* 
/hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
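
The change being integrated avoids per-call allocations in the comparator path; a minimal sketch of that idea follows (the class and field shapes here are assumptions, not the committed code):
{code}
// Illustrative only: reuse a single mutable weights object per app instead of
// allocating a new one on every comparator call during scheduling.
public class CachedWeightsSketch {
  static final class Weights {
    float weight;
  }

  private final Weights cachedWeights = new Weights();

  Weights getWeights(float newWeight) {
    cachedWeights.weight = newWeight;   // update in place, no allocation
    return cachedWeights;
  }
}
{code}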


 In Fair Scheduler, avoid creating objects on each call to AppSchedulable 
 comparator
 ---

 Key: YARN-1889
 URL: https://issues.apache.org/jira/browse/YARN-1889
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Hong Zhiguo
Assignee: Hong Zhiguo
Priority: Minor
  Labels: reviewed
 Fix For: 2.5.0

 Attachments: YARN-1889.patch, YARN-1889.patch


 In fair scheduler, in each scheduling attempt, a full sort is
 performed on List of AppSchedulable, which invokes Comparator.compare
 method many times. Both FairShareComparator and DRFComparator call
 AppSchedulable.getWeights, and AppSchedulable.getPriority.
 A new ResourceWeights object is allocated on each call of getWeights,
 and the same for getPriority. This introduces a lot of pressure to
 GC because these methods are called very very frequently.
 Below test case shows improvement on performance and GC behaviour. The 
 results show that the GC pressure during processing NodeUpdate is reduced by 
 half by this patch.
 The code to show the improvement: (Add it to TestFairScheduler.java)
 {code}
 import java.lang.management.GarbageCollectorMXBean;
 import java.lang.management.ManagementFactory;
   public void printGCStats() {
     long totalGarbageCollections = 0;
     long garbageCollectionTime = 0;
     for (GarbageCollectorMXBean gc :
         ManagementFactory.getGarbageCollectorMXBeans()) {
       long count = gc.getCollectionCount();
       if (count >= 0) {
         totalGarbageCollections += count;
       }
       long time = gc.getCollectionTime();
       if (time >= 0) {
         garbageCollectionTime += time;
       }
     }
     System.out.println("Total Garbage Collections: "
         + totalGarbageCollections);
     System.out.println("Total Garbage Collection Time (ms): "
         + garbageCollectionTime);
   }
   @Test
   public void testImpactOnGC() throws Exception {
     scheduler.reinitialize(conf, resourceManager.getRMContext());
     // Add nodes
     int numNode = 1;
     for (int i = 0; i < numNode; ++i) {
       String host = String.format("192.1.%d.%d", i / 256, i % 256);
       RMNode node =
           MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), i, host);
       NodeAddedSchedulerEvent nodeEvent = new NodeAddedSchedulerEvent(node);
       scheduler.handle(nodeEvent);
       assertEquals(1024 * 64 * (i + 1),
           scheduler.getClusterCapacity().getMemory());
     }
     assertEquals(numNode, scheduler.getNumClusterNodes());
     assertEquals(1024 * 64 * numNode,
         scheduler.getClusterCapacity().getMemory());
     // add apps, each app has 100 containers.
     int minReqSize =
         FairSchedulerConfiguration.DEFAULT_RM_SCHEDULER_INCREMENT_ALLOCATION_MB;
     int numApp = 8000;
     int priority = 1;
     for (int i = 1; i < numApp + 1; ++i) {
       ApplicationAttemptId attemptId = createAppAttemptId(i, 1);
       AppAddedSchedulerEvent appAddedEvent = new AppAddedSchedulerEvent(
           attemptId.getApplicationId(), "queue1", "user1");
       scheduler.handle(appAddedEvent);
       AppAttemptAddedSchedulerEvent attemptAddedEvent =
           new AppAttemptAddedSchedulerEvent(attemptId, false);
       scheduler.handle(attemptAddedEvent);
       createSchedulingRequestExistingApplication(minReqSize * 2, 1,
           priority, attemptId);
     }
     scheduler.update();
     assertEquals(numApp, scheduler.getQueueManager().getLeafQueue("queue1", true)
         .getRunnableAppSchedulables().size());
     System.out.println("GC stats before NodeUpdate processing:");
 

[jira] [Commented] (YARN-1879) Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol

2014-03-31 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955925#comment-13955925
 ] 

Jian He commented on YARN-1879:
---

+1 with the RetryCache approach

 Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol
 ---

 Key: YARN-1879
 URL: https://issues.apache.org/jira/browse/YARN-1879
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Jian He
Assignee: Tsuyoshi OZAWA
Priority: Critical
 Attachments: YARN-1879.1.patch, YARN-1879.1.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1896) For FairScheduler expose MinimumQueueResource of each queue in QueueMetrics

2014-03-31 Thread Siqi Li (JIRA)
Siqi Li created YARN-1896:
-

 Summary: For FairScheduler expose MinimumQueueResource of each 
queue in QueueMetrics
 Key: YARN-1896
 URL: https://issues.apache.org/jira/browse/YARN-1896
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1896) For FairScheduler expose MinimumQueueResource of each queue in QueueMetrics

2014-03-31 Thread Siqi Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siqi Li updated YARN-1896:
--

Attachment: YARN-1896.v1.patch

 For FairScheduler expose MinimumQueueResource of each queue in QueueMetrics
 --

 Key: YARN-1896
 URL: https://issues.apache.org/jira/browse/YARN-1896
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
 Attachments: YARN-1896.v1.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1896) For FairScheduler expose MinimumQueueResource of each queue in QueueMetrics

2014-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955952#comment-13955952
 ] 

Hadoop QA commented on YARN-1896:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12637959/YARN-1896.v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3494//console

This message is automatically generated.

 For FairScheduler expose MinimumQueueResource of each queue in QueueMetrics
 --

 Key: YARN-1896
 URL: https://issues.apache.org/jira/browse/YARN-1896
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siqi Li
 Attachments: YARN-1896.v1.patch






--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions

2014-03-31 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955960#comment-13955960
 ] 

Sandy Ryza commented on YARN-596:
-

Thanks.  The patch is looking almost done.  I like that you replaced the O(n 
log n) sort call in preemptContainer with an O(n) iteration.   Just a few more 
nits:

{code}
+  LOG.debug("Queue " + getName() + " is going to preempt a container " +
+  " from its childQueues.");
{code}
This doesn't make sense in FSLeafQueue, which can't have child queues.

{code}
+// Let the selected queue to preempt
+if (candidateQueue != null) {
+  toBePreempted = candidateQueue.preemptContainer();
+}
{code}
Did you mean "Let the selected queue choose which of its containers to preempt"?

For preemptContainerPreCheck, it would be good to take multiple resources into 
account (using the DefaultResourceCalculator will only apply to memory).  
Resources.fitsIn(getResourceUsage(), getFairShare()) can be used to determine 
whether a Schedulable is safe from preemption.
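
A sketch of that pre-check, assuming it lives on a Schedulable that exposes getResourceUsage() and getFairShare() (illustrative only, not the final patch):
{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

public abstract class PreemptionPreCheckSketch {
  abstract Resource getResourceUsage();
  abstract Resource getFairShare();

  // A schedulable whose usage fits within its fair share (for every resource,
  // not just memory) is considered safe from preemption.
  boolean preemptContainerPreCheck() {
    return !Resources.fitsIn(getResourceUsage(), getFairShare());
  }
}
{code}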

Lastly, can you add a test that makes sure that containers from apps that are 
higher over their fair share get preempted first, even when containers from 
other apps that are over their fair share have lower priorities?

 In fair scheduler, intra-application container priorities affect 
 inter-application preemption decisions
 ---

 Key: YARN-596
 URL: https://issues.apache.org/jira/browse/YARN-596
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scheduler
Affects Versions: 2.0.3-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
 Attachments: YARN-596.patch, YARN-596.patch, YARN-596.patch, 
 YARN-596.patch, YARN-596.patch


 In the fair scheduler, containers are chosen for preemption in the following 
 way:
 All containers for all apps that are in queues that are over their fair share 
 are put in a list.
 The list is sorted in order of the priority that the container was requested 
 in.
 This means that an application can shield itself from preemption by 
 requesting its containers at higher priorities, which doesn't really make 
 sense.
 Also, an application that is not over its fair share, but that is in a queue 
 that is over its fair share, is just as likely to have containers preempted 
 as an application that is over its fair share.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (YARN-1897) Define SignalContainerRequest and SignalContainerResponse

2014-03-31 Thread Ming Ma (JIRA)
Ming Ma created YARN-1897:
-

 Summary: Define SignalContainerRequest and SignalContainerResponse
 Key: YARN-1897
 URL: https://issues.apache.org/jira/browse/YARN-1897
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api
Reporter: Ming Ma


We need to define SignalContainerRequest and SignalContainerResponse first, as 
they are needed by other sub-tasks. SignalContainerRequest should use 
OS-independent commands and provide a way for the application to specify a reason 
for diagnosis. SignalContainerResponse might be empty.
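
A hypothetical shape for these records, purely to illustrate the points above; the actual definitions are the subject of this JIRA, and all names below are assumptions:
{code}
public class SignalContainerSketch {
  // OS-independent commands, per the requirement above.
  enum SignalCommand { OUTPUT_THREAD_DUMP, GRACEFUL_SHUTDOWN, FORCEFUL_SHUTDOWN }

  static final class SignalContainerRequest {
    final String containerId;
    final SignalCommand command;
    final String reason;   // lets the application record a reason for diagnosis

    SignalContainerRequest(String containerId, SignalCommand command, String reason) {
      this.containerId = containerId;
      this.command = command;
      this.reason = reason;
    }
  }

  // SignalContainerResponse might be empty, as noted above.
  static final class SignalContainerResponse { }
}
{code}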



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (YARN-1726) ResourceSchedulerWrapper failed due to the AbstractYarnScheduler introduced in YARN-1041

2014-03-31 Thread Wei Yan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Yan updated YARN-1726:
--

Attachment: YARN-1726.patch

A new patch:
(1) Fix the problem caused by the AbstractYarnScheduler change.
(2) Add test cases for AMSimulator and NMSimulator.
(3) Update TestSLSRunner to catch possible exceptions from child threads during 
the run. The old test case did not work because it could not catch the child 
threads' exceptions.

 ResourceSchedulerWrapper failed due to the AbstractYarnScheduler introduced 
 in YARN-1041
 

 Key: YARN-1726
 URL: https://issues.apache.org/jira/browse/YARN-1726
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Wei Yan
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-1726.patch, YARN-1726.patch


 The YARN scheduler simulator failed when running Fair Scheduler, due to 
 AbstractYarnScheduler introduced in YARN-1041. The ResourceSchedulerWrapper 
 should inherit AbstractYarnScheduler, instead of implementing 
 ResourceScheduler interface directly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1726) ResourceSchedulerWrapper failed due to the AbstractYarnScheduler introduced in YARN-1041

2014-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956107#comment-13956107
 ] 

Hadoop QA commented on YARN-1726:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12637996/YARN-1726.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-tools/hadoop-sls:

  org.apache.hadoop.yarn.sls.appmaster.TestAMSimulator
  org.apache.hadoop.yarn.sls.TestSLSRunner

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/3495//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3495//console

This message is automatically generated.

 ResourceSchedulerWrapper failed due to the AbstractYarnScheduler introduced 
 in YARN-1041
 

 Key: YARN-1726
 URL: https://issues.apache.org/jira/browse/YARN-1726
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Wei Yan
Assignee: Wei Yan
Priority: Minor
 Attachments: YARN-1726.patch, YARN-1726.patch


 The YARN scheduler simulator failed when running Fair Scheduler, due to 
 AbstractYarnScheduler introduced in YARN-1041. The ResourceSchedulerWrapper 
 should inherit AbstractYarnScheduler, instead of implementing 
 ResourceScheduler interface directly.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk

2014-03-31 Thread Hong Zhiguo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956122#comment-13956122
 ] 

Hong Zhiguo commented on YARN-1872:
---

I hit the timeout too, but I can't reproduce it.
Were you able to reproduce the timeout? If so, can you attach the 
TestDistributedShell-output.txt under surefire-reports?

 TestDistributedShell occasionally fails in trunk
 

 Key: YARN-1872
 URL: https://issues.apache.org/jira/browse/YARN-1872
 Project: Hadoop YARN
  Issue Type: Test
Reporter: Ted Yu
 Attachments: TestDistributedShell.out


 From https://builds.apache.org/job/Hadoop-Yarn-trunk/520/console :
 TestDistributedShell#testDSShellWithCustomLogPropertyFile failed and 
 TestDistributedShell#testDSShell timed out.



--
This message was sent by Atlassian JIRA
(v6.2#6252)