[jira] [Commented] (YARN-5050) Code cleanup for TestDistributedShell

2016-05-05 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273647#comment-15273647
 ] 

Li Lu commented on YARN-5050:
-

Cannot reproduce the test failure locally. The test is only for ATS v1.0 and 
should not be affected by the changes in this patch. I'm launching another 
Jenkins run to check. 

> Code cleanup for TestDistributedShell
> -
>
> Key: YARN-5050
> URL: https://issues.apache.org/jira/browse/YARN-5050
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-5050-YARN-2928.001.patch
>
>
> We introduced some small errors after yesterday's rebase. Also, some timeout 
> settings for timeline v2 tests are deprecated since we introduced global 
> timeout settings in YARN-4545. 
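As an aside, a minimal sketch of the global-timeout pattern that YARN-4545
introduced, which is what makes the per-test timeout settings redundant. The
class name and the 90-second value are illustrative assumptions, not the actual
YARN test configuration:

{code}
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.Timeout;

// Sketch only: a single class-wide Timeout rule replaces the per-method
// @Test(timeout = ...) attributes that such a cleanup removes.
public class GlobalTimeoutSketch {
  @Rule
  public Timeout globalTimeout = Timeout.millis(90_000); // illustrative value

  @Test
  public void someTimelineV2Test() throws Exception {
    // test body; no per-test timeout setting needed anymore
  }
}
{code}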



[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long

2016-05-05 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273639#comment-15273639
 ] 

Rohith Sharma K S commented on YARN-4807:
-

[~sunilg] I tried backporting this JIRA; multiple other JIRAs would need to be 
backported first. Some of them are not really required in branch-2.8. I think 
we should let the rebased YARN-4947 patch go in. 

> MockAM#waitForState sleep duration is too long
> --
>
> Key: YARN-4807
> URL: https://issues.apache.org/jira/browse/YARN-4807
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
>Assignee: Yufei Gu
> Fix For: 2.9.0
>
> Attachments: YARN-4807.001.patch, YARN-4807.002.patch, 
> YARN-4807.003.patch, YARN-4807.004.patch, YARN-4807.005.patch, 
> YARN-4807.006.patch, YARN-4807.007.patch, YARN-4807.008.patch, 
> YARN-4807.009.patch, YARN-4807.010.patch, YARN-4807.011.patch, 
> YARN-4807.012.patch, YARN-4807.013.patch, YARN-4807.014.patch, 
> YARN-4807.015.patch
>
>
> MockAM#waitForState sleep duration (500 ms) is too long. Also, there is 
> significant duplication with MockRM#waitForState.
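A self-contained sketch of the kind of fix being discussed: a single shared
polling helper, with a shorter poll interval, that both MockAM#waitForState and
MockRM#waitForState could delegate to. All names and values below are
illustrative assumptions, not the actual patch:

{code}
import java.util.function.Supplier;

public final class WaitUtilSketch {
  private static final long POLL_INTERVAL_MS = 100;    // vs. the 500 ms criticized above
  private static final long DEFAULT_TIMEOUT_MS = 30_000;

  private WaitUtilSketch() {}

  /** Polls {@code condition} until it returns true or the timeout elapses. */
  public static boolean waitFor(Supplier<Boolean> condition)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + DEFAULT_TIMEOUT_MS;
    while (System.currentTimeMillis() < deadline) {
      if (condition.get()) {
        return true;
      }
      Thread.sleep(POLL_INTERVAL_MS);
    }
    return false; // timed out without the condition becoming true
  }
}
{code}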



[jira] [Commented] (YARN-4947) Test timeout is happening for TestRMWebServicesNodes

2016-05-05 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273625#comment-15273625
 ] 

Rohith Sharma K S commented on YARN-4947:
-

Labeled as 2.8-candidate

> Test timeout is happening for TestRMWebServicesNodes
> 
>
> Key: YARN-4947
> URL: https://issues.apache.org/jira/browse/YARN-4947
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: test
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>  Labels: 2.8-candidate
> Fix For: 2.9.0
>
> Attachments: 0001-YARN-4947.patch, 0002-YARN-4947.patch, 
> 0003-YARN-4947.patch, 0004-YARN-4947.patch, 0005-YARN-4947.patch, 
> 0006-YARN-4947-rebase.patch, 0006-YARN-4947.patch, 0007-YARN-4947.patch, 
> YARN-4947-branch-2.8.007.patch
>
>
> Testcase timeout for TestRMWebServicesNodes is happening after YARN-4893 
> [timeout|https://builds.apache.org/job/PreCommit-YARN-Build/11044/testReport/]



[jira] [Updated] (YARN-4947) Test timeout is happening for TestRMWebServicesNodes

2016-05-05 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-4947:

Labels: 2.8-candidate  (was: )

> Test timeout is happening for TestRMWebServicesNodes
> 
>
> Key: YARN-4947
> URL: https://issues.apache.org/jira/browse/YARN-4947
> Project: Hadoop YARN
>  Issue Type: Test
>  Components: test
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>  Labels: 2.8-candidate
> Fix For: 2.9.0
>
> Attachments: 0001-YARN-4947.patch, 0002-YARN-4947.patch, 
> 0003-YARN-4947.patch, 0004-YARN-4947.patch, 0005-YARN-4947.patch, 
> 0006-YARN-4947-rebase.patch, 0006-YARN-4947.patch, 0007-YARN-4947.patch, 
> YARN-4947-branch-2.8.007.patch
>
>
> Testcase timeout for TestRMWebServicesNodes is happening after YARN-4893 
> [timeout|https://builds.apache.org/job/PreCommit-YARN-Build/11044/testReport/]



[jira] [Commented] (YARN-5045) hbase unit tests fail due to dependency issues

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273585#comment-15273585
 ] 

Hadoop QA commented on YARN-5045:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 15 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 12s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
18s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 22s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 6s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
8s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 33s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
51s {color} | {color:green} YARN-2928 passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s 
{color} | {color:blue} Skipped branch modules with no Java source: 
hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
36s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 4s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 1s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 1s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
8s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 55s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s 
{color} | {color:blue} Skipped patch modules with no Java source: 
hadoop-project hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 14s 
{color} | {color:red} 
patch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 no findbugs output file 
(hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/target/findbugsXml.xml)
 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 2s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 8s 
{color} | {color:green} hadoop-project in the patch passed with JDK 

[jira] [Commented] (YARN-5048) DelegationTokenRenewer#skipTokenRenewal may throw NPE

2016-05-05 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273574#comment-15273574
 ] 

Jian He commented on YARN-5048:
---

Those test failures are not related; I've run the tests locally. Thanks.

> DelegationTokenRenewer#skipTokenRenewal may throw NPE 
> --
>
> Key: YARN-5048
> URL: https://issues.apache.org/jira/browse/YARN-5048
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-5048.1.patch
>
>
> {{((Token)token).decodeIdentifier()}} may 
> throw an NPE if the RM does not have the corresponding token kind class.



[jira] [Commented] (YARN-5050) Code cleanup for TestDistributedShell

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273501#comment-15273501
 ] 

Hadoop QA commented on YARN-5050:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
35s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
24s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
44s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
46s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 11m 25s {color} 
| {color:red} hadoop-yarn-applications-distributedshell in the patch failed 
with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 11m 34s {color} 
| {color:red} hadoop-yarn-applications-distributedshell in the patch failed 
with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 43m 36s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | 
hadoop.yarn.applications.distributedshell.TestDistributedShell |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.yarn.applications.distributedshell.TestDistributedShell |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12802575/YARN-5050-YARN-2928.001.patch
 |
| JIRA Issue | YARN-5050 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 49fbaf689b44 

[jira] [Commented] (YARN-5029) RM needs to send update event with YarnApplicationState as Running to ATS/AHS

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273459#comment-15273459
 ] 

Hadoop QA commented on YARN-5029:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
47s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 18s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
29s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 21s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
42s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 44s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common in 
trunk has 3 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
7s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 16s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 25s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: patch 
generated 1 new + 178 unchanged - 27 fixed = 179 total (was 205) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
36s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
59s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s 
{color} | {color:green} hadoop-yarn-server-common in the patch passed with JDK 
v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 56s 
{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the 
patch passed with JDK v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 30m 5s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s 
{color} | {color:green} hadoop-yarn-server-common in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 

[jira] [Updated] (YARN-5050) Code cleanup for TestDistributedShell

2016-05-05 Thread Li Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Lu updated YARN-5050:

Attachment: YARN-5050-YARN-2928.001.patch

Patch 001 to clean up the code and make sure all UTs pass. 

> Code cleanup for TestDistributedShell
> -
>
> Key: YARN-5050
> URL: https://issues.apache.org/jira/browse/YARN-5050
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Li Lu
>Assignee: Li Lu
> Attachments: YARN-5050-YARN-2928.001.patch
>
>
> We introduced some small errors after yesterday's rebase. Also, some timeout 
> settings for timeline v2 tests are deprecated since we introduced global 
> timeout settings in YARN-4545. 



[jira] [Created] (YARN-5050) Code cleanup for TestDistributedShell

2016-05-05 Thread Li Lu (JIRA)
Li Lu created YARN-5050:
---

 Summary: Code cleanup for TestDistributedShell
 Key: YARN-5050
 URL: https://issues.apache.org/jira/browse/YARN-5050
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Li Lu
Assignee: Li Lu


We introduced some small errors after yesterday's rebase. Also, some timeout 
settings for timeline v2 tests are deprecated since we introduced global 
timeout settings in YARN-4545. 



[jira] [Commented] (YARN-5000) [YARN-3368] App attempt page is not loading when timeline server is not started

2016-05-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273408#comment-15273408
 ] 

Wangda Tan commented on YARN-5000:
--

[~sunilg],

I tested the patch; I haven't done deep testing yet. Some comments so far:

1) When I tried the patch and ran the RM without the TS/NM, the application 
stayed in the ACCEPTED state, but the application page showed multiple entries 
for the attempt (see 
[attachment|https://issues.apache.org/jira/secure/attachment/12802571/screenshot-1.png]).
 This only happens when I access the RM-hosted page at localhost:8288, and is 
probably caused by the auto/hash issue you mentioned at: 
https://issues.apache.org/jira/browse/YARN-4515?focusedCommentId=15269292=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15269292.

2) When an application is running (TS not enabled, but NM running), I get the 
following intermittent exception when I keep refreshing the page at 
{{http://localhost:4200/yarnAppAttempt/appattempt_1462494543748_0004_01}} 
(replace with your attempt ID):
{code}
http://localhost:1337/localhost:8088/ws/v1/cluster/apps/application_1462494543748_0004/appattempts/appattempt_1462494543748_0004_01/containers
yarn-container.js:31 
http://localhost:1337/localhost:8188/ws/v1/applicationhistory/apps/applicat…494543748_0004/appattempts/appattempt_1462494543748_0004_01/containers
ember.debug.js:30877 TypeError: Cannot read property '0' of undefined
at Class.error (application.js:11)
at Object.triggerEvent (ember.debug.js:27476)
at Object.trigger (ember.debug.js:51925)
at Object.trigger (ember.debug.js:51739)
at ember.debug.js:51559
at tryCatch (ember.debug.js:52258)
at invokeCallback (ember.debug.js:52273)
at publish (ember.debug.js:52241)
at publishRejection (ember.debug.js:52176)
at ember.debug.js:30835
onerrorDefault @ ember.debug.js:30877
trigger @ ember.debug.js:52928
(anonymous function) @ ember.debug.js:54177
invoke @ ember.debug.js:320
flush @ ember.debug.js:384
flush @ ember.debug.js:185
end @ ember.debug.js:563
run @ ember.debug.js:685
run @ ember.debug.js:20105
hash.success @ rest-adapter.js:742
fire @ jquery.js:3099
fireWith @ jquery.js:3211
done @ jquery.js:8264
(anonymous function) @ jquery.js:8605
jquery.js:8630 GET 
http://localhost:1337/localhost:8188/ws/v1/applicationhistory/apps/applicat…494543748_0004/appattempts/appattempt_1462494543748_0004_01/containers
 502 (Bad Gateway)
{code}

3) (Minor) I get this error in the console when the TS is not enabled:
{{jquery.js:8630 GET 
http://localhost:1337/localhost:8188/ws/v1/applicationhistory/apps/applicat…494543748_0004/appattempts/appattempt_1462494543748_0004_01/containers
 502 (Bad Gateway)}}. We probably should not show such a message.

On the code implementation, some comments so far:
1) Instead of returning an object with id="dummy" and checking for "dummy" 
everywhere, is it possible to return an empty array? That may be clearer.

> [YARN-3368] App attempt page is not loading when timeline server is not 
> started
> ---
>
> Key: YARN-5000
> URL: https://issues.apache.org/jira/browse/YARN-5000
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-5000.patch, 
> AppFinishedAndNoTimelineServer.png, AppRunningAndNoTimelineServer.png, 
> AppRunningAndNoTimelineServer_v2.png, YARN-5000-YARN-3368.1.patch, 
> YARN-5000-YARN-3368.2.patch, YARN-5000-YARN-3368.3.patch, 
> YARN-5000-YARN-3368.4.patch, screenshot-1.png
>
>
> If the timeline server is not started, the app attempt page does not load.
> In the new web-ui, the yarnContainer route is tightly coupled with both the 
> RM and the Timeline server, and if one of the servers is not up, the page 
> will not load. If the timeline server is not up, container information from 
> the RM should be displayed.



[jira] [Updated] (YARN-5000) [YARN-3368] App attempt page is not loading when timeline server is not started

2016-05-05 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-5000:
-
Attachment: screenshot-1.png

> [YARN-3368] App attempt page is not loading when timeline server is not 
> started
> ---
>
> Key: YARN-5000
> URL: https://issues.apache.org/jira/browse/YARN-5000
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-5000.patch, 
> AppFinishedAndNoTimelineServer.png, AppRunningAndNoTimelineServer.png, 
> AppRunningAndNoTimelineServer_v2.png, YARN-5000-YARN-3368.1.patch, 
> YARN-5000-YARN-3368.2.patch, YARN-5000-YARN-3368.3.patch, 
> YARN-5000-YARN-3368.4.patch, screenshot-1.png
>
>
> If the timeline server is not started, the app attempt page does not load.
> In the new web-ui, the yarnContainer route is tightly coupled with both the 
> RM and the Timeline server, and if one of the servers is not up, the page 
> will not load. If the timeline server is not up, container information from 
> the RM should be displayed.



[jira] [Updated] (YARN-5045) hbase unit tests fail due to dependency issues

2016-05-05 Thread Sangjin Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sangjin Lee updated YARN-5045:
--
Attachment: YARN-5045-YARN-2928.poc.patch

Posted the POC patch. With this, I have a clean build, with all the unit tests 
passing in timelineservice as well as timelineservice-hbase-tests. I also 
confirmed that it does not add hadoop-common 2.5.1 to the tarball.

I'm going to kick off Jenkins to see if there are other issues.

> hbase unit tests fail due to dependency issues
> --
>
> Key: YARN-5045
> URL: https://issues.apache.org/jira/browse/YARN-5045
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Blocker
> Attachments: YARN-5045-YARN-2928.poc.patch
>
>
> After the 5/4 rebase, the hbase unit tests in the timeline service project 
> are failing:
> {noformat}
> org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
>   Time elapsed: 5.103 sec  <<< ERROR!
> java.io.IOException: Shutting down
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>   at 
> org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677)
>   at 
> org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546)
>   at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:500)
>   at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:104)
>   at 
> org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345)
>   at org.apache.hadoop.hbase.http.InfoServer.(InfoServer.java:77)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:550)
>   at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:333)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
>   at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139)
>   at 
> org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:217)
>   at 
> org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:153)
>   at 
> org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:213)
>   at 
> org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:93)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:978)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:938)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:812)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:806)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:750)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage.setup(TestTimelineReaderWebServicesHBaseStorage.java:87)
> {noformat}
> The root cause is that the hbase mini server depends on hadoop-common's 
> {{MetricsServlet}}, which has been removed in trunk (HADOOP-12504):
> {noformat}
> Caused by: java.lang.NoClassDefFoundError: 
> org/apache/hadoop/metrics/MetricsServlet
> at 
> org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677)
> at 
> org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546)
> at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:500)
> at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:104)
> at 
> org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345)
> at org.apache.hadoop.hbase.http.InfoServer.(InfoServer.java:77)
> at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697)
> at 
> 

[jira] [Commented] (YARN-4515) [YARN-3368] Support hosting web UI framework inside YARN RM

2016-05-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273390#comment-15273390
 ] 

Wangda Tan commented on YARN-4515:
--

My bad, it was caused by a Chrome plugin I installed: 
Allow-Control-Allow-Origin: *. After disabling the plugin and setting 
localBaseAddress: "localhost:1337", it works for me.

Refreshing the page and going back still seem wrong, which should be addressed 
by your last comment (the auto/hash issue).

> [YARN-3368] Support hosting web UI framework inside YARN RM
> ---
>
> Key: YARN-4515
> URL: https://issues.apache.org/jira/browse/YARN-4515
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Sunil G
> Attachments: 0001-YARN-4515.patch, 
> preliminary-YARN-4515-host_rm_web_ui_v2.patch
>
>
> Currently it can only be launched outside of YARN; we should make it runnable 
> inside YARN for easier testing, and we should have a configuration to 
> enable/disable it.



[jira] [Commented] (YARN-5045) hbase unit tests fail due to dependency issues

2016-05-05 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273383#comment-15273383
 ] 

Sangjin Lee commented on YARN-5045:
---

One point about not including non-test source in the separated project: in the 
non-test part of the code, there isn't a clean separation between the code 
that uses hbase and the rest. Also, even if there were such a clean 
separation, the dependency would need to run strictly *from* the hbase-related 
code to the generic code, or the diamond dependency is reintroduced.

Another point, about option (3): the key is that there must be no binary 
dependency between the hbase-related code and the non-hbase-related code for 
this idea to work.

> hbase unit tests fail due to dependency issues
> --
>
> Key: YARN-5045
> URL: https://issues.apache.org/jira/browse/YARN-5045
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Blocker
>
> After the 5/4 rebase, the hbase unit tests in the timeline service project 
> are failing:
> {noformat}
> org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
>   Time elapsed: 5.103 sec  <<< ERROR!
> java.io.IOException: Shutting down
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>   at 
> org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677)
>   at 
> org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546)
>   at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:500)
>   at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:104)
>   at 
> org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345)
>   at org.apache.hadoop.hbase.http.InfoServer.(InfoServer.java:77)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697)
>   at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.(HRegionServer.java:550)
>   at org.apache.hadoop.hbase.master.HMaster.(HMaster.java:333)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
>   at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139)
>   at 
> org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:217)
>   at 
> org.apache.hadoop.hbase.LocalHBaseCluster.(LocalHBaseCluster.java:153)
>   at 
> org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:213)
>   at 
> org.apache.hadoop.hbase.MiniHBaseCluster.(MiniHBaseCluster.java:93)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:978)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:938)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:812)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:806)
>   at 
> org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:750)
>   at 
> org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage.setup(TestTimelineReaderWebServicesHBaseStorage.java:87)
> {noformat}
> The root cause is that the hbase mini server depends on hadoop-common's 
> {{MetricsServlet}}, which has been removed in trunk (HADOOP-12504):
> {noformat}
> Caused by: java.lang.NoClassDefFoundError: 
> org/apache/hadoop/metrics/MetricsServlet
> at 
> org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677)
> at 
> org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546)
> at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:500)
> at org.apache.hadoop.hbase.http.HttpServer.(HttpServer.java:104)
> at 
> org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345)
> at 

[jira] [Created] (YARN-5049) Extend NMStateStore to save queued container information

2016-05-05 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-5049:


 Summary: Extend NMStateStore to save queued container information
 Key: YARN-5049
 URL: https://issues.apache.org/jira/browse/YARN-5049
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Konstantinos Karanasos
Assignee: Konstantinos Karanasos


This JIRA is about extending the NMStateStore to save queued container 
information whenever a new container is added to the NM queue. 
It also removes the information from the state store when the queued container 
starts its execution.
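A minimal sketch of that store/remove lifecycle. The class and method names 
below are stand-ins for illustration, not the real NMStateStore API proposed 
here:

{code}
import java.util.LinkedHashMap;
import java.util.Map;

public class QueuedContainerStoreSketch {
  // In-memory stand-in for the NM state store's persistent backing.
  private final Map<String, byte[]> queued = new LinkedHashMap<>();

  /** Called when a new container is added to the NM queue. */
  public void storeQueuedContainer(String containerId, byte[] containerInfo) {
    queued.put(containerId, containerInfo);
  }

  /** Called when the queued container starts its execution. */
  public void removeQueuedContainer(String containerId) {
    queued.remove(containerId);
  }
}
{code}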



[jira] [Comment Edited] (YARN-5048) DelegationTokenRenewer#skipTokenRenewal may throw NPE

2016-05-05 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273372#comment-15273372
 ] 

Yongjun Zhang edited comment on YARN-5048 at 5/6/16 12:20 AM:
--

Thanks for the explanation [~jianhe].
I was about to +1, but saw some test failures. Would you please check on them? 
Thanks.




was (Author: yzhangal):
Thanks for the explanation [~jianhe]. I'm +1 on your patch.


> DelegationTokenRenewer#skipTokenRenewal may throw NPE 
> --
>
> Key: YARN-5048
> URL: https://issues.apache.org/jira/browse/YARN-5048
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-5048.1.patch
>
>
> {{((Token)token).decodeIdentifier()}} may 
> throw an NPE if the RM does not have the corresponding token kind class.



[jira] [Commented] (YARN-4515) [YARN-3368] Support hosting web UI framework inside YARN RM

2016-05-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273374#comment-15273374
 ] 

Wangda Tan commented on YARN-4515:
--

BTW: do I need to apply other patches to try this?

> [YARN-3368] Support hosting web UI framework inside YARN RM
> ---
>
> Key: YARN-4515
> URL: https://issues.apache.org/jira/browse/YARN-4515
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Sunil G
> Attachments: 0001-YARN-4515.patch, 
> preliminary-YARN-4515-host_rm_web_ui_v2.patch
>
>
> Currently it can only be launched outside of YARN; we should make it runnable 
> inside YARN for easier testing, and we should have a configuration to 
> enable/disable it.



[jira] [Commented] (YARN-5048) DelegationTokenRenewer#skipTokenRenewal may throw NPE

2016-05-05 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273372#comment-15273372
 ] 

Yongjun Zhang commented on YARN-5048:
-

Thanks for the explanation [~jianhe]. I'm +1 on your patch.


> DelegationTokenRenewer#skipTokenRenewal may throw NPE 
> --
>
> Key: YARN-5048
> URL: https://issues.apache.org/jira/browse/YARN-5048
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-5048.1.patch
>
>
> {{((Token)token).decodeIdentifier()}} may 
> throw an NPE if the RM does not have the corresponding token kind class.



[jira] [Commented] (YARN-4515) [YARN-3368] Support hosting web UI framework inside YARN RM

2016-05-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273373#comment-15273373
 ] 

Wangda Tan commented on YARN-4515:
--

Some issues when I tried to run with the patch.

What I did:
- Copied the js/html to webapps/rm
- Modified {{webapps/rm/config/configs.env}} to make sure {{localBaseAddress: 
"localhost:1337"}}
- Started the RM on localhost
- Ran corsproxy
- Went to localhost:8288

It shows this error:
{code}
XMLHttpRequest cannot load 
http://localhost:1337/localhost:8088/ws/v1/cluster/metrics. The 
'Access-Control-Allow-Origin' header contains multiple values 
'http://evil.com/, *', but only one is allowed. Origin 'http://localhost:8288' 
is therefore not allowed access.
vendor-f2b8b19….js:11 Error: Adapter operation failed
at new Error (native)
at Error.n 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:8:4037)
at Error.e 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:30:2065)
at n.handleResponse 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:30:22139)
at n.c.error 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:30:22648)
at c 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:2:6024)
at Object.fireWith [as rejectWith] 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:2:6836)
at n 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:3:11508)
at XMLHttpRequest. 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:3:17202)
{code}

And if I set localBaseAddress to empty and stop corsproxy, localhost:8288 
shows:
{code}
XMLHttpRequest cannot load http://localhost:8088/ws/v1/cluster/metrics. A 
wildcard '*' cannot be used in the 'Access-Control-Allow-Origin' header when 
the credentials flag is true. Origin 'http://localhost:8288' is therefore not 
allowed access. The credentials mode of an XMLHttpRequest is controlled by the 
withCredentials attribute.
vendor-f2b8b19….js:11 Error: Adapter operation failed
at new Error (native)
at Error.n 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:8:4037)
at Error.e 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:30:2065)
at n.handleResponse 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:30:22139)
at n.c.error 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:30:22648)
at c 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:2:6024)
at Object.fireWith [as rejectWith] 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:2:6836)
at n 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:3:11508)
at XMLHttpRequest. 
(http://localhost:8288/assets/vendor-f2b8b19533691ddb57f3926c35c36bf2.js:3:17202)
{code}

And a couple of quick comments:
1. Could we add a config to start yarn-ui-v2 on demand? (disabled by default)
2. {{LOG.error("Failed to start Yarn web app v2")}} should not swallow the 
exception (see the sketch below).
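A sketch of the fix suggested in point 2: pass the caught exception to the 
logger (and rethrow) instead of swallowing it. Illustrative only, not the 
actual patch:

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class WebAppStartSketch {
  private static final Log LOG = LogFactory.getLog(WebAppStartSketch.class);

  void startWebAppV2(Runnable startup) {
    try {
      startup.run();
    } catch (RuntimeException e) {
      LOG.error("Failed to start Yarn web app v2", e); // keep the stack trace
      throw e; // rethrow so the failure is not silently ignored
    }
  }
}
{code}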

> [YARN-3368] Support hosting web UI framework inside YARN RM
> ---
>
> Key: YARN-4515
> URL: https://issues.apache.org/jira/browse/YARN-4515
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Sunil G
> Attachments: 0001-YARN-4515.patch, 
> preliminary-YARN-4515-host_rm_web_ui_v2.patch
>
>
> Currently it can only be launched outside of YARN; we should make it runnable 
> inside YARN for easier testing, and we should have a configuration to 
> enable/disable it.



[jira] [Commented] (YARN-5048) DelegationTokenRenewer#skipTokenRenewal may throw NPE

2016-05-05 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273367#comment-15273367
 ] 

Jian He commented on YARN-5048:
---

Hi Yongjun

bq.  what circumstance that RM does not have the token kind class?
e.g. If a user passes in an HFTP token, the RM may not have the token kind 
class.
bq. does it mean we should not skip renewing thus RM need to renew; If RM need 
to renew, does it mean RM need to have the token kind class?
Even without the token kind class, the RM may still be able to renew, because 
renewal goes through Token#renew; the token kind class is needed only for the 
decodeIdentifier method. 
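To make the failure mode concrete, a minimal sketch (not the actual YARN-5048 
patch) of the null check this implies, since decodeIdentifier() returns null 
when no class is registered for the token kind:

{code}
import java.io.IOException;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

public class SkipTokenRenewalSketch {
  static boolean skipTokenRenewal(Token<?> token) throws IOException {
    TokenIdentifier identifier = token.decodeIdentifier();
    if (identifier == null) {
      // Unknown token kind: don't skip; renewal via Token#renew still works.
      return false;
    }
    // ... inspect the identifier (e.g. its renewer) to decide whether to skip.
    return false;
  }
}
{code}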

> DelegationTokenRenewer#skipTokenRenewal may throw NPE 
> --
>
> Key: YARN-5048
> URL: https://issues.apache.org/jira/browse/YARN-5048
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-5048.1.patch
>
>
> {{((Token)token).decodeIdentifier()}} may 
> throw an NPE if the RM does not have the corresponding token kind class.



[jira] [Commented] (YARN-5039) Applications ACCEPTED but not starting

2016-05-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273366#comment-15273366
 ] 

Wangda Tan commented on YARN-5039:
--

+ [~djp], since this may be related to graceful node decommissioning.

> Applications ACCEPTED but not starting
> --
>
> Key: YARN-5039
> URL: https://issues.apache.org/jira/browse/YARN-5039
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Miles Crawford
> Attachments: Screen Shot 2016-05-04 at 1.57.19 PM.png, Screen Shot 
> 2016-05-04 at 2.41.22 PM.png, queue-config.log, 
> resource-manager-application-starts.log.gz, 
> yarn-yarn-resourcemanager-ip-10-12-47-144.log.gz
>
>
> Often when we submit applications to an incompletely utilized cluster, they 
> sit, unable to start for no apparent reason.
> There are multiple nodes in the cluster with available resources, but the 
> resourcemanager logs show that scheduling is being skipped. The scheduling is 
> skipped because the application itself has reserved the node? I'm not sure 
> how to interpret this log output:
> {code}
> 2016-05-04 20:19:21,315 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-54.us-west-2.compute.internal:8041
> 2016-05-04 20:19:21,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container  
> application=application_1462291866507_0025 resource= 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used= vCores:33> cluster=
> 2016-05-04 20:19:21,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-54.us-west-2.compute.internal:8041 is reserved by application 
> appattempt_1462291866507_0025_01
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-53.us-west-2.compute.internal:8041
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container  
> application=application_1462291866507_0025 resource= 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used= vCores:33> cluster=
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-53.us-west-2.compute.internal:8041 is reserved by application 
> appattempt_1462291866507_0025_01
> 2016-05-04 20:19:22,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-54.us-west-2.compute.internal:8041
> 2016-05-04 20:19:22,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container  
> application=application_1462291866507_0025 resource= 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used= vCores:33> cluster=
> 2016-05-04 20:19:22,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-54.us-west-2.compute.internal:8041 is reserved by application 
> appattempt_1462291866507_0025_01
> {code}
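For interpreting the log above: once an application reserves a container on a 
node, the CapacityScheduler offers that node only to the reserving attempt 
until the reservation is fulfilled or released, so scheduling for everything 
else on that node is skipped. A toy, self-contained sketch of that rule, with 
hypothetical names rather than the real scheduler code:

{code}
public class ReservationSkipSketch {
  static String schedule(String node, String reservedByAttempt, String candidateApp) {
    if (reservedByAttempt != null) {
      // The node is pinned to the reserving attempt; other apps are skipped.
      return "Skipping scheduling since node " + node
          + " is reserved by application " + reservedByAttempt;
    }
    return "Allocating on " + node + " for " + candidateApp;
  }

  public static void main(String[] args) {
    System.out.println(schedule("ip-10-12-43-54:8041",
        "appattempt_1462291866507_0025_01", "application_1462291866507_0026"));
  }
}
{code}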



[jira] [Commented] (YARN-5039) Applications ACCEPTED but not starting

2016-05-05 Thread Miles Crawford (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273361#comment-15273361
 ] 

Miles Crawford commented on YARN-5039:
--

So we could be shooting ourselves in the foot by scaling our cluster up and 
down as needed?

But this only applies to new jobs not starting; the nodes are used once the 
application does start...

> Applications ACCEPTED but not starting
> --
>
> Key: YARN-5039
> URL: https://issues.apache.org/jira/browse/YARN-5039
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Miles Crawford
> Attachments: Screen Shot 2016-05-04 at 1.57.19 PM.png, Screen Shot 
> 2016-05-04 at 2.41.22 PM.png, queue-config.log, 
> resource-manager-application-starts.log.gz, 
> yarn-yarn-resourcemanager-ip-10-12-47-144.log.gz
>
>
> Often when we submit applications to an incompletely utilized cluster, they 
> sit, unable to start for no apparent reason.
> There are multiple nodes in the cluster with available resources, but the 
> resourcemanager logs show that scheduling is being skipped. The scheduling is 
> skipped because the application itself has reserved the node? I'm not sure 
> how to interpret this log output:
> {code}
> 2016-05-04 20:19:21,315 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-54.us-west-2.compute.internal:8041
> 2016-05-04 20:19:21,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container  
> application=application_1462291866507_0025 resource= 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used= vCores:33> cluster=
> 2016-05-04 20:19:21,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-54.us-west-2.compute.internal:8041 is reserved by application 
> appattempt_1462291866507_0025_01
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-53.us-west-2.compute.internal:8041
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container  
> application=application_1462291866507_0025 resource= 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used= vCores:33> cluster=
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-53.us-west-2.compute.internal:8041 is reserved by application 
> appattempt_1462291866507_0025_01
> 2016-05-04 20:19:22,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-54.us-west-2.compute.internal:8041
> 2016-05-04 20:19:22,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container 
> application=application_1462291866507_0025 resource=<memory:..., vCores:...> 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=<memory:..., vCores:...>, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used=<memory:..., vCores:33> cluster=<memory:..., vCores:...>
> 2016-05-04 20:19:22,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-54.us-west-2.compute.internal:8041 is reserved by application 
> appattempt_1462291866507_0025_01
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2888) Corrective mechanisms for rebalancing NM container queues

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273351#comment-15273351
 ] 

Hadoop QA commented on YARN-2888:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 42s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
1s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 12s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 38s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
43s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 10s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 
1s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 44s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common in 
trunk has 3 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 9s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 24s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
54s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 27s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 3m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 26s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 26s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 43s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 33 new + 
499 unchanged - 6 fixed = 532 total (was 505) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
55s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 37s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 3s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 8s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 28s {color} 
| {color:red} hadoop-yarn-api in the patch failed with JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 28s 
{color} | {color:green} hadoop-yarn-server-common in the patch passed with JDK 
v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 1s 
{color} | {color:green} 

[jira] [Commented] (YARN-5048) DelegationTokenRenewer#skipTokenRenewal may throw NPE

2016-05-05 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273347#comment-15273347
 ] 

Yongjun Zhang commented on YARN-5048:
-

Hi [~jianhe],

Thanks for catching the issue and for the patch!

The patch looks good to me. Some questions:

1. Under what circumstances would the RM not have the token kind class?
2. This method checks whether we can skip renewing a token; if the RM does 
not have the corresponding token kind class, does that mean we should not skip 
renewing, and thus the RM needs to renew?
3. If the RM needs to renew, doesn't that mean the RM needs to have the token 
kind class?
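
For context, a minimal sketch of how the NPE arises and could be guarded 
(this is not the actual patch; the guard shape and method body below are 
assumptions): {{decodeIdentifier()}} returns null when no class is registered 
for the token's kind, so the result must be null-checked before use.

{code}
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.delegation.AbstractDelegationTokenIdentifier;

class SkipRenewalSketch {
  // Sketch only, not the YARN-5048 patch. Token#decodeIdentifier() returns
  // null when the RM has no class registered for the token's kind, so
  // dereferencing the result without a null check throws an NPE.
  static boolean skipTokenRenewal(Token<?> token) throws IOException {
    @SuppressWarnings("unchecked")
    AbstractDelegationTokenIdentifier ident =
        ((Token<AbstractDelegationTokenIdentifier>) token).decodeIdentifier();
    if (ident == null) {
      return false; // unknown token kind: do not skip, let renewal proceed
    }
    Text renewer = ident.getRenewer();
    return renewer != null && renewer.toString().isEmpty();
  }
}
{code}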

Thanks.
 


> DelegationTokenRenewer#skipTokenRenewal may throw NPE 
> --
>
> Key: YARN-5048
> URL: https://issues.apache.org/jira/browse/YARN-5048
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-5048.1.patch
>
>
> {{((Token)token).decodeIdentifier()}} may 
> throw an NPE if the RM does not have the corresponding token kind class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5048) DelegationTokenRenewer#skipTokenRenewal may throw NPE

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273344#comment-15273344
 ] 

Hadoop QA commented on YARN-5048:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 40s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
11s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 8s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 28s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 29m 31s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 30m 37s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 77m 52s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestRMRestart |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestContainerResourceUsage |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  

[jira] [Commented] (YARN-5029) RM needs to send update event with YarnApplicationState as Running to ATS/AHS

2016-05-05 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273335#comment-15273335
 ] 

Xuan Gong commented on YARN-5029:
-

Thanks for the review, Junping.

Uploaded a new patch to fix the checkstyle issue and the testcase failure.

bq. TestSystemMetricsPublisher to check if SystemMetricsPublisher received the 
event. It would be helpful to have an additional check in ATS side to actually 
see event is written in backend - just like what we have now in 
TestDistributedShell.

It actually does that check. After SystemMetricsPublisher sends all the 
events, we make the getEntity call to ATS and check whether we have those 
events as well as the specified data.
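
As a rough illustration of the idea (not the actual patch; the event type 
string and surrounding plumbing are assumptions), publishing an update event 
when the app transitions to RUNNING would look something like this with the 
ATS v1 records API:

{code}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.api.records.timeline.TimelineEvent;

class StateUpdateSketch {
  // Build an update event so AHS/ATS stops showing ACCEPTED once the app runs.
  static TimelineEntity buildUpdateEntity(String appId, long updatedTime) {
    TimelineEntity entity = new TimelineEntity();
    entity.setEntityType("YARN_APPLICATION"); // illustrative entity type
    entity.setEntityId(appId);

    TimelineEvent event = new TimelineEvent();
    event.setEventType("YARN_APPLICATION_STATE_UPDATED"); // hypothetical name
    event.setTimestamp(updatedTime);
    Map<String, Object> info = new HashMap<>();
    info.put("YARN_APPLICATION_STATE", YarnApplicationState.RUNNING.toString());
    event.setEventInfo(info);
    entity.addEvent(event);
    return entity;
  }
}
{code}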

> RM needs to send update event with YarnApplicationState as Running to ATS/AHS
> -
>
> Key: YARN-5029
> URL: https://issues.apache.org/jira/browse/YARN-5029
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Critical
> Attachments: YARN-5029.1.patch, YARN-5029.2.patch
>
>
> Right now, an application in AHS/ATS is always in the ACCEPTED state until the 
> application finishes/fails/is killed. This is because the RM does not send any 
> other YarnApplicationState information, except FINISHED/FAILED/KILLED, to 
> ATS.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5029) RM needs to send update event with YarnApplicationState as Running to ATS/AHS

2016-05-05 Thread Xuan Gong (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuan Gong updated YARN-5029:

Attachment: YARN-5029.2.patch

> RM needs to send update event with YarnApplicationState as Running to ATS/AHS
> -
>
> Key: YARN-5029
> URL: https://issues.apache.org/jira/browse/YARN-5029
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Critical
> Attachments: YARN-5029.1.patch, YARN-5029.2.patch
>
>
> Right now, an application in AHS/ATS is always in the ACCEPTED state until the 
> application finishes/fails/is killed. This is because the RM does not send any 
> other YarnApplicationState information, except FINISHED/FAILED/KILLED, to 
> ATS.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5046) [Umbrella] Refactor scheduler code

2016-05-05 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273324#comment-15273324
 ] 

Ray Chiang commented on YARN-5046:
--

[~leftnoteasy] or [~jianhe], let me know if you think it makes sense to make 
YARN-4433 a sub-task of this JIRA.  I'm also looking at YARN-3890 and YARN-2773 
as possible candidates.

> [Umbrella] Refactor scheduler code
> --
>
> Key: YARN-5046
> URL: https://issues.apache.org/jira/browse/YARN-5046
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: capacity scheduler, fairscheduler, resourcemanager, 
> scheduler
>Affects Versions: 3.0.0
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: technical_debt
>
> At this point in time, there are several places where code common to the 
> schedulers can be moved from one or more of the schedulers into 
> AbstractYARNScheduler or a related interface.
> Creating this umbrella JIRA to track this refactoring.  In general, it is 
> preferable to create a subtask JIRA on a per-method basis.
> This may need some coordination with [YARN-3091  \[Umbrella\] Improve and fix 
> locks of RM scheduler|https://issues.apache.org/jira/browse/YARN-3091].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273289#comment-15273289
 ] 

Hadoop QA commented on YARN-3362:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m 
55s {color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s 
{color} | {color:green} branch-2.7 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} branch-2.7 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
26s {color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s 
{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} branch-2.7 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 1s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 in branch-2.7 has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} branch-2.7 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s 
{color} | {color:green} branch-2.7 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
27s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 23s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 patch generated 51 new + 688 unchanged - 43 fixed = 739 total (was 731) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 3261 line(s) that end in whitespace. Use 
git apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 1m 14s 
{color} | {color:red} The patch has 497 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 49m 28s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 50m 33s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_101. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 2m 20s 
{color} | {color:red} Patch generated 61 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 134m 3s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| 

[jira] [Commented] (YARN-5039) Applications ACCEPTED but not starting

2016-05-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273267#comment-15273267
 ] 

Wangda Tan commented on YARN-5039:
--

It seems some nodes are in the decommissioning state, judging from the 
[log|https://issues.apache.org/jira/secure/attachment/12802294/yarn-yarn-resourcemanager-ip-10-12-47-144.log.gz];
 we probably have some bugs in correctly showing decommissioning nodes and 
related resources on the web UI.

{code}
2016-05-04 21:00:10,182 INFO 
org.apache.hadoop.yarn.server.resourcemanager.DecommissioningNodesWatcher (IPC 
Server handler 44 on 8025): Decommissioning Nodes: 
  ip-10-12-41-126.us-west-2.compute.internal 63833s fresh:41194s containers: 0  READY
  ip-10-12-36-61.us-west-2.compute.internal 62964s fresh:41194s containers: 0  READY
  ip-10-12-46-96.us-west-2.compute.internal 98343s fresh:98343s containers: 0  READY
  ...
{code}

IIRC, the scheduler will not assign containers to decommissioning nodes; that 
could be the reason why your applications stay in the ACCEPTED state.
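
For illustration, the guard in question is of roughly this shape (the method 
and class names here are assumptions, not the actual scheduler code):

{code}
import org.apache.hadoop.yarn.api.records.NodeState;

class AllocationGuardSketch {
  // DECOMMISSIONING nodes are draining: existing containers keep running,
  // but no new containers are placed, so any capacity shown for these nodes
  // in the UI is effectively unusable for new allocations.
  static boolean canAllocateOn(NodeState state) {
    return state == NodeState.RUNNING;
  }
}
{code}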

> Applications ACCEPTED but not starting
> --
>
> Key: YARN-5039
> URL: https://issues.apache.org/jira/browse/YARN-5039
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Miles Crawford
> Attachments: Screen Shot 2016-05-04 at 1.57.19 PM.png, Screen Shot 
> 2016-05-04 at 2.41.22 PM.png, queue-config.log, 
> resource-manager-application-starts.log.gz, 
> yarn-yarn-resourcemanager-ip-10-12-47-144.log.gz
>
>
> Often when we submit applications to an incompletely utilized cluster, they 
> sit, unable to start for no apparent reason.
> There are multiple nodes in the cluster with available resources, but the 
> resourcemanager logs show that scheduling is being skipped. The scheduling is 
> skipped because the application itself has reserved the node? I'm not sure 
> how to interpret this log output:
> {code}
> 2016-05-04 20:19:21,315 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-54.us-west-2.compute.internal:8041
> 2016-05-04 20:19:21,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container 
> application=application_1462291866507_0025 resource=<memory:..., vCores:...> 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=<memory:..., vCores:...>, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used=<memory:..., vCores:33> cluster=<memory:..., vCores:...>
> 2016-05-04 20:19:21,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-54.us-west-2.compute.internal:8041 is reserved by application 
> appattempt_1462291866507_0025_01
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-53.us-west-2.compute.internal:8041
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container 
> application=application_1462291866507_0025 resource=<memory:..., vCores:...> 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=<memory:..., vCores:...>, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used=<memory:..., vCores:33> cluster=<memory:..., vCores:...>
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-53.us-west-2.compute.internal:8041 is reserved by application 
> appattempt_1462291866507_0025_01
> 2016-05-04 20:19:22,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-54.us-west-2.compute.internal:8041
> 2016-05-04 20:19:22,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container 
> application=application_1462291866507_0025 resource=<memory:..., vCores:...> 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> 

[jira] [Updated] (YARN-5039) Applications ACCEPTED but not starting

2016-05-05 Thread Miles Crawford (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miles Crawford updated YARN-5039:
-
Attachment: queue-config.log

Attached the queue config section of the RM logs from startup; it looks like 
the setting is false from the get-go.

> Applications ACCEPTED but not starting
> --
>
> Key: YARN-5039
> URL: https://issues.apache.org/jira/browse/YARN-5039
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Miles Crawford
> Attachments: Screen Shot 2016-05-04 at 1.57.19 PM.png, Screen Shot 
> 2016-05-04 at 2.41.22 PM.png, queue-config.log, 
> resource-manager-application-starts.log.gz, 
> yarn-yarn-resourcemanager-ip-10-12-47-144.log.gz
>
>
> Often when we submit applications to an incompletely utilized cluster, they 
> sit, unable to start for no apparent reason.
> There are multiple nodes in the cluster with available resources, but the 
> resourcemanager logs show that scheduling is being skipped. The scheduling is 
> skipped because the application itself has reserved the node? I'm not sure 
> how to interpret this log output:
> {code}
> 2016-05-04 20:19:21,315 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-54.us-west-2.compute.internal:8041
> 2016-05-04 20:19:21,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container 
> application=application_1462291866507_0025 resource=<memory:..., vCores:...> 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=<memory:..., vCores:...>, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used=<memory:..., vCores:33> cluster=<memory:..., vCores:...>
> 2016-05-04 20:19:21,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-54.us-west-2.compute.internal:8041 is reserved by application 
> appattempt_1462291866507_0025_01
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-53.us-west-2.compute.internal:8041
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container 
> application=application_1462291866507_0025 resource=<memory:..., vCores:...> 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=<memory:..., vCores:...>, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used=<memory:..., vCores:33> cluster=<memory:..., vCores:...>
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-53.us-west-2.compute.internal:8041 is reserved by application 
> appattempt_1462291866507_0025_01
> 2016-05-04 20:19:22,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-54.us-west-2.compute.internal:8041
> 2016-05-04 20:19:22,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container 
> application=application_1462291866507_0025 resource=<memory:..., vCores:...> 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=<memory:..., vCores:...>, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used=<memory:..., vCores:33> cluster=<memory:..., vCores:...>
> 2016-05-04 20:19:22,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-54.us-west-2.compute.internal:8041 is reserved by application 
> appattempt_1462291866507_0025_01
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional 

[jira] [Commented] (YARN-5039) Applications ACCEPTED but not starting

2016-05-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273240#comment-15273240
 ] 

Jason Lowe commented on YARN-5039:
--

Can you also double-check that the startup messages in the RM log show that the 
queues are configured with it off?  The conf screen always loads a fresh 
config, so it doesn't always reflect what is actually being used.  Look for 
"reservationsContinueLooking = " lines in the RM log when it dumps the queue 
config on startup.

I'll also look into reproducing it on our end.  This is a pretty simple setup 
with just two users and two total apps, so hopefully this is straightforward to 
replicate.
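
(The config key below appears elsewhere in this thread; the surrounding code 
is only a sketch of how the flag that the queues dump at startup is wired, 
not the actual CapacityScheduler code.)

{code}
import org.apache.hadoop.conf.Configuration;

class ReservationFlagSketch {
  // "reservationsContinueLooking" in the startup queue dump reflects this
  // boolean. The conf web page reloads a fresh config on each request; the
  // startup log shows what the scheduler actually captured.
  static boolean reservationsContinueLooking(Configuration conf) {
    return conf.getBoolean(
        "yarn.scheduler.capacity.reservations-continue-look-all-nodes", true);
  }
}
{code}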

> Applications ACCEPTED but not starting
> --
>
> Key: YARN-5039
> URL: https://issues.apache.org/jira/browse/YARN-5039
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Miles Crawford
> Attachments: Screen Shot 2016-05-04 at 1.57.19 PM.png, Screen Shot 
> 2016-05-04 at 2.41.22 PM.png, resource-manager-application-starts.log.gz, 
> yarn-yarn-resourcemanager-ip-10-12-47-144.log.gz
>
>
> Often when we submit applications to an incompletely utilized cluster, they 
> sit, unable to start for no apparent reason.
> There are multiple nodes in the cluster with available resources, but the 
> resourcemanager logs show that scheduling is being skipped. The scheduling is 
> skipped because the application itself has reserved the node? I'm not sure 
> how to interpret this log output:
> {code}
> 2016-05-04 20:19:21,315 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-54.us-west-2.compute.internal:8041
> 2016-05-04 20:19:21,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container 
> application=application_1462291866507_0025 resource=<memory:..., vCores:...> 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=<memory:..., vCores:...>, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used=<memory:..., vCores:33> cluster=<memory:..., vCores:...>
> 2016-05-04 20:19:21,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-54.us-west-2.compute.internal:8041 is reserved by application 
> appattempt_1462291866507_0025_01
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-53.us-west-2.compute.internal:8041
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container 
> application=application_1462291866507_0025 resource=<memory:..., vCores:...> 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=<memory:..., vCores:...>, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used=<memory:..., vCores:33> cluster=<memory:..., vCores:...>
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-53.us-west-2.compute.internal:8041 is reserved by application 
> appattempt_1462291866507_0025_01
> 2016-05-04 20:19:22,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-54.us-west-2.compute.internal:8041
> 2016-05-04 20:19:22,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container 
> application=application_1462291866507_0025 resource=<memory:..., vCores:...> 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=<memory:..., vCores:...>, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used=<memory:..., vCores:33> cluster=<memory:..., vCores:...>
> 2016-05-04 20:19:22,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): 

[jira] [Commented] (YARN-5045) hbase unit tests fail due to dependency issues

2016-05-05 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273218#comment-15273218
 ] 

Sangjin Lee commented on YARN-5045:
---

This is basically a *diamond dependency problem*. The timeline service depends 
on HBase 1.0.1, which in turn depends on hadoop-common *2.5.1*. However, the 
timeline service itself depends on hadoop-common *trunk*. In maven, the trunk 
version is chosen because the hadoop dependency versions are managed.

We knew there was a potential risk of this diamond problem when the timeline 
service was developed on trunk (where backward-incompatible changes can 
happen) while we also depend on HBase, which relies on a released version of 
hadoop. This is the first manifestation of the issue.

There are several ways we can address this issue, but there are drawbacks on 
all of the approaches.

(1) we can try to resurrect the metrics v.1 classes (that were removed in 
HADOOP-12504)
We could add back those classes under {{src/test/java}} of the timelineservice 
project. While this might be an easy way to get around this specific problem, 
there are issues. First, it is in poor taste to resurrect code we're getting 
rid of to work around this problem. Also, it does nothing to handle potential 
future issues that may arise. What if classes' shapes change instead of being 
removed?

(2) isolate the hbase unit test code and (en)force the hadoop version that 
HBase relies on
We could try to override the default version of hadoop-common (trunk: 
3.0.0-SNAPSHOT) while running the unit tests. It turns out this requires 
creating a separate project. Prior to maven 3, there used to be a capability of 
choosing a different version for the test scope and test scope only. However, 
maven 3 or later seems to have removed that. Thus, for a given project, there 
can only be one dependency version regardless of scope.

We could try to isolate the HBase-related unit tests into their own project and 
enforce 2.5.1 in that project. This has a decent chance of success.

Still there are caveats. For this to work, the portion of the timeline service 
code that exercises HBase has to ensure it works fine against hadoop-common 
2.5.1 as well as trunk. I'm not sure how best to enforce this state.

Also, I thought of moving the non-test code that relies on HBase into this new 
project as well, but I'm moving away from that idea. The main problem is that 
we need to prevent the hadoop-common 2.5.1 jar from being incorporated into the 
hadoop distribution tarball. If we have the non-test code in here, the worry is 
hadoop-common 2.5.1 may sneak into the tarball.

(3) pull out timeline service entirely out of the hadoop project
This is truly an out-of-the-box idea. Technically it might be feasible to set 
up the HBase-related part of the timeline service as a separate project from 
hadoop. Then the hadoop-proper part of the timeline service would interact with 
the HBase-related part of the timeline service over the wire. That way, hadoop 
can still be shielded from the thorny HBase dependency issues.

While there is some possibility this may work technically, I think there are 
several major issues, ranging from logistics of setting up a separate project 
independent of hadoop, performance implications, etc. I think this is a long 
shot at best.

Out of these options, it seems to me (2) is the least bad option. I am working 
on making that work. I'd love to hear your thoughts if you have better 
suggestions.

> hbase unit tests fail due to dependency issues
> --
>
> Key: YARN-5045
> URL: https://issues.apache.org/jira/browse/YARN-5045
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Sangjin Lee
>Priority: Blocker
>
> After the 5/4 rebase, the hbase unit tests in the timeline service project 
> are failing:
> {noformat}
> org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
>   Time elapsed: 5.103 sec  <<< ERROR!
> java.io.IOException: Shutting down
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>   at 
> org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677)
>   at 
> org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546)
>   at 

[jira] [Commented] (YARN-5044) Add peak memory usage counter for each task

2016-05-05 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273195#comment-15273195
 ] 

Yufei Gu commented on YARN-5044:


[~vvasudev], thanks for pointing that out. But I cannot find these metrics on 
the JHS or RM web UI while a job is running or after it's done. It seems they 
are only used internally.
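
As a rough sketch of the proposal (the *_MAX counter names come from this 
JIRA's description; the sampling code around them is hypothetical), the idea 
is to fold a running maximum into the existing snapshot updates:

{code}
// Hypothetical sampling hook: keep the existing snapshot semantics for
// PHYSICAL_MEMORY_BYTES / VIRTUAL_MEMORY_BYTES and track peaks alongside.
class PeakMemorySketch {
  long peakPhysical;
  long peakVirtual;

  void sample(long physicalNow, long virtualNow) {
    peakPhysical = Math.max(peakPhysical, physicalNow);
    peakVirtual = Math.max(peakVirtual, virtualNow);
    // PHYSICAL_MEMORY_BYTES_MAX / VIRTUAL_MEMORY_BYTES_MAX would be set
    // from peakPhysical / peakVirtual whenever counters are flushed.
  }
}
{code}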

> Add peak memory usage counter for each task
> ---
>
> Key: YARN-5044
> URL: https://issues.apache.org/jira/browse/YARN-5044
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>
> Each task has counters PHYSICAL_MEMORY_BYTES and VIRTUAL_MEMORY_BYTES, which 
> are snapshots of memory usage of that task. They are not sufficient for users 
> to understand peak memory usage by that task, e.g. in order to diagnose task 
> failures, tune job parameters or change application design. This new feature 
> will add two more counters for each task: PHYSICAL_MEMORY_BYTES_MAX and 
> VIRTUAL_MEMORY_BYTES_MAX.
> This JIRA covers the same feature as MAPREDUCE-4710.  I filed this new YARN 
> JIRA since MAPREDUCE-4710 is a pretty old one from the MR 1.x era; it more or 
> less assumes a branch-1 architecture and should be closed at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4766) NM should not aggregate logs older than the retention policy

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273156#comment-15273156
 ] 

Hadoop QA commented on YARN-4766:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 32s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
24s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 48s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 41s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 12s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
30s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 23s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in 
trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
3s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 45s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 40s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 40s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 39s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 2 new + 
122 unchanged - 4 fixed = 124 total (was 126) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
53s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 40s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 31s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with 
JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 44s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 

[jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273116#comment-15273116
 ] 

Hadoop QA commented on YARN-4963:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 47s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 41s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 33s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 38m 52s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.8.0_91. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 38m 16s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 98m 43s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
| JDK v1.7.0_95 Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestRMRestart |
|   | hadoop.yarn.server.resourcemanager.TestContainerResourceUsage |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | 

[jira] [Commented] (YARN-4390) Do surgical preemption based on reserved container in CapacityScheduler

2016-05-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273108#comment-15273108
 ] 

Wangda Tan commented on YARN-4390:
--

Thanks [~jianhe]/[~sunilg]/[~eepayne] for reviewing the patch!

> Do surgical preemption based on reserved container in CapacityScheduler
> ---
>
> Key: YARN-4390
> URL: https://issues.apache.org/jira/browse/YARN-4390
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Affects Versions: 3.0.0, 2.8.0, 2.7.3
>Reporter: Eric Payne
>Assignee: Wangda Tan
> Fix For: 2.9.0
>
> Attachments: QueueNotHittingMax.jpg, YARN-4390-design.1.pdf, 
> YARN-4390-test-results.pdf, YARN-4390.1.patch, YARN-4390.2.patch, 
> YARN-4390.3.branch-2.patch, YARN-4390.3.patch, YARN-4390.4.patch, 
> YARN-4390.5.patch, YARN-4390.6.patch, YARN-4390.7.patch, YARN-4390.8.patch
>
>
> There are multiple reasons why preemption could unnecessarily preempt 
> containers. One is that an app could be requesting a large container (say 
> 8-GB), and the preemption monitor could conceivably preempt multiple 
> containers (say 8, 1-GB containers) in order to fill the large container 
> request. These smaller containers would then be rejected by the requesting AM 
> and potentially given right back to the preempted app.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2888) Corrective mechanisms for rebalancing NM container queues

2016-05-05 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273104#comment-15273104
 ] 

Arun Suresh commented on YARN-2888:
---

Thanks for the review [~kkaranasos].

I agree with most of your comments and I have addressed them in the latest 
patch. For the rest...

bq. Rename ContainerQueuingLimit* to NMQueuingLimit*?
Hmmm... I prefer to keep it as ContainerQueuingLimit, since it is a struct that 
is part of the NM heartbeat response, which establishes the 'NM' aspect of it, 
and 'ContainerQueuing' more explicitly expresses the fact that we are queuing 
containers.

bq. Why is it needed to change the return type of getContainerManager() to 
ContainerManager  ?
With this patch, we need to set the queuing limit etc. on the ContainerManager. 
One option is to introduce the setter method into the Protocol, where I don't 
think it belongs, since it is a property of the ContainerManager entity, not 
the protocol. Another option is to type-cast the return value to 
QueuingContainerManagerImpl, which does not seem clean either. Given all this, 
and considering that we have multiple implementations of the ContainerManager, 
I felt this seemed cleaner.

bq. In pruneOpportunisticContainerQueue(), let's use the same logic/code as in 
the stopContainerInternal()..
I feel this code path is a bit simpler, so I'd prefer to leave it as it is. But 
yes, I have changed the variable names and the method name for better clarity.

In {{QueueLimitCalculator}}
* I've removed the median
* The calculations are now independent of the size of k (rough sketch below)
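
For intuition, a purely illustrative limit calculation of this flavor (the 
class name exists in the patch, but this body is an assumption, not the 
patch's code):

{code}
import java.util.List;

class QueueLimitSketch {
  // Illustrative only: derive a max queue length from the mean and standard
  // deviation of per-node queue lengths reported via NM heartbeats, so the
  // result does not depend on how many nodes (k) are sampled.
  static int computeQueuingLimit(List<Integer> queueLengths, double sigmas) {
    double mean = queueLengths.stream()
        .mapToInt(Integer::intValue).average().orElse(0);
    double variance = queueLengths.stream()
        .mapToDouble(len -> (len - mean) * (len - mean)).average().orElse(0);
    return (int) Math.ceil(mean + sigmas * Math.sqrt(variance));
  }
}
{code}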
 

> Corrective mechanisms for rebalancing NM container queues
> -
>
> Key: YARN-2888
> URL: https://issues.apache.org/jira/browse/YARN-2888
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Arun Suresh
> Attachments: YARN-2888-yarn-2877.001.patch, 
> YARN-2888-yarn-2877.002.patch, YARN-2888.003.patch, YARN-2888.004.patch, 
> YARN-2888.005.patch
>
>
> Bad queuing decisions by the LocalRMs (e.g., due to the distributed nature of 
> the scheduling decisions or due to having a stale image of the system) may 
> lead to an imbalance in the waiting times of the NM container queues. This 
> can in turn have an impact on job execution times and cluster utilization.
> To this end, we introduce corrective mechanisms that may remove (whenever 
> needed) container requests from overloaded queues, adding them to less-loaded 
> ones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5039) Applications ACCEPTED but not starting

2016-05-05 Thread Miles Crawford (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273098#comment-15273098
 ] 

Miles Crawford commented on YARN-5039:
--

Nope, I set reservations-continue-look-all-nodes to false, and verified that 
the config screen showed it:
{code}yarn.scheduler.capacity.reservations-continue-look-all-nodes
false{code}

But I still have exactly the same hangup as before: two apps are schedulable 
and there are four nodes in the cluster with plenty of free resources, but 
scheduling is still skipped because of reservations on busy nodes:
{code}
2016-05-05 21:09:58,380 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
 (ResourceManager Event Processor): Trying to fulfill reservation for 
application application_1462468084916_0085 on node: 
ip-10-12-41-191.us-west-2.compute.internal:8041
2016-05-05 21:09:58,380 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
(ResourceManager Event Processor): Reserved container  
application=application_1462468084916_0085 resource= 
queue=default: capacity=1.0, absoluteCapacity=1.0, 
usedResources=, usedCapacity=0.7126589, 
absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used= cluster=
2016-05-05 21:09:58,380 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
 (ResourceManager Event Processor): Skipping scheduling since node 
ip-10-12-41-191.us-west-2.compute.internal:8041 is reserved by application 
appattempt_1462468084916_0085_01
{code}
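For reference, the property under test is set in capacity-scheduler.xml like 
this:

{code}
<!-- capacity-scheduler.xml -->
<property>
  <name>yarn.scheduler.capacity.reservations-continue-look-all-nodes</name>
  <value>false</value>
</property>
{code}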

> Applications ACCEPTED but not starting
> --
>
> Key: YARN-5039
> URL: https://issues.apache.org/jira/browse/YARN-5039
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Miles Crawford
> Attachments: Screen Shot 2016-05-04 at 1.57.19 PM.png, Screen Shot 
> 2016-05-04 at 2.41.22 PM.png, resource-manager-application-starts.log.gz, 
> yarn-yarn-resourcemanager-ip-10-12-47-144.log.gz
>
>
> Often when we submit applications to an incompletely utilized cluster, they 
> sit, unable to start for no apparent reason.
> There are multiple nodes in the cluster with available resources, but the 
> resourcemanager logs show that scheduling is being skipped. The scheduling is 
> skipped because the application itself has reserved the node? I'm not sure 
> how to interpret this log output:
> {code}
> 2016-05-04 20:19:21,315 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-54.us-west-2.compute.internal:8041
> 2016-05-04 20:19:21,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container  
> application=application_1462291866507_0025 resource= 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used= vCores:33> cluster=
> 2016-05-04 20:19:21,316 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-54.us-west-2.compute.internal:8041 is reserved by application 
> appattempt_1462291866507_0025_01
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Trying to fulfill reservation for 
> application application_1462291866507_0025 on node: 
> ip-10-12-43-53.us-west-2.compute.internal:8041
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue 
> (ResourceManager Event Processor): Reserved container  
> application=application_1462291866507_0025 resource= 
> queue=default: capacity=1.0, absoluteCapacity=1.0, 
> usedResources=, usedCapacity=0.7126589, 
> absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 
> usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used= vCores:33> cluster=
> 2016-05-04 20:19:22,232 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
>  (ResourceManager Event Processor): Skipping scheduling since node 
> ip-10-12-43-53.us-west-2.compute.internal:8041 is reserved by application 
> 

[jira] [Updated] (YARN-5048) DelegationTokenRenewer#skipTokenRenewal may throw NPE

2016-05-05 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-5048:
--
Attachment: YARN-5048.1.patch

> DelegationTokenRenewer#skipTokenRenewal may throw NPE 
> --
>
> Key: YARN-5048
> URL: https://issues.apache.org/jira/browse/YARN-5048
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-5048.1.patch
>
>
> {{((Token)token).decodeIdentifier()}} may 
> throw NPE if RM does not have the corresponding token kind class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5048) DelegationTokenRenewer#skipTokenRenewal may throw NPE

2016-05-05 Thread Jian He (JIRA)
Jian He created YARN-5048:
-

 Summary: DelegationTokenRenewer#skipTokenRenewal may throw NPE 
 Key: YARN-5048
 URL: https://issues.apache.org/jira/browse/YARN-5048
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Jian He
Assignee: Jian He


{{((Token)token).decodeIdentifier()}} may 
throw NPE if RM does not have the corresponding token kind class.
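A defensive shape for the fix could look like the following (a minimal sketch 
with assumed surrounding code, not the attached patch):

{code}
import java.io.IOException;

import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

final class SkipRenewalSketch {
  // Sketch only: guard decodeIdentifier() so a token kind whose identifier
  // class is absent from the RM classpath falls back to normal renewal.
  static boolean skipTokenRenewal(Token<?> token) {
    try {
      TokenIdentifier identifier = token.decodeIdentifier();
      if (identifier == null) {
        return false;                // unknown token kind: do not skip renewal
      }
      // ... inspect the identifier (e.g., its renewer) to decide ...
      return false;
    } catch (IOException | RuntimeException e) {
      return false;                  // on any decode failure, do not skip
    }
  }
}
{code}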



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4994) Use MiniYARNCluster with try-with-resources in tests

2016-05-05 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273076#comment-15273076
 ] 

Andras Bokor commented on YARN-4994:


It seems the test failures are not related.
I thought Hadoop QA runs all the tests on every patch. Actually, it does not, 
so when I changed only some files, the tests in the org.apache.hadoop.yarn 
package did not run. That is why I saw the test failures only with particular 
changes and not with others.
You can check YARN-5034.13.patch on YARN-5034: I changed only a comment and the 
same tests failed. Can I conclude that the test failures are unrelated?
If so, I will make the suggested changes and upload a new patch tomorrow.

> Use MiniYARNCluster with try-with-resources in tests
> 
>
> Key: YARN-4994
> URL: https://issues.apache.org/jira/browse/YARN-4994
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-10287.01.patch, HDFS-10287.02.patch, 
> HDFS-10287.03.patch
>
>
> In tests, MiniYARNCluster is used with the following pattern: in a try 
> block, create a MiniYARNCluster instance, and close it in the finally block.
> [Try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html]
>  is preferred since Java7 instead of the pattern above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5034) Failing tests after using try-with-resources

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273066#comment-15273066
 ] 

Hadoop QA commented on YARN-5034:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
35s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 6s {color} 
| {color:red} hadoop-yarn-client in the patch failed with JDK v1.8.0_91. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 20s {color} 
| {color:red} hadoop-yarn-client in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 146m 2s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMProxy |
|   | hadoop.yarn.client.TestGetGroups |
| JDK v1.8.0_91 Timed out junit tests | 
org.apache.hadoop.yarn.client.cli.TestYarnCLI |
|   | org.apache.hadoop.yarn.client.api.impl.TestYarnClient |
|   | org.apache.hadoop.yarn.client.api.impl.TestAMRMClient |
|   | org.apache.hadoop.yarn.client.api.impl.TestNMClient |
| JDK v1.7.0_95 Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMProxy |
|   | hadoop.yarn.client.TestGetGroups |
| JDK v1.7.0_95 Timed out junit tests | 
org.apache.hadoop.yarn.client.cli.TestYarnCLI |
|   | 

[jira] [Updated] (YARN-2888) Corrective mechanisms for rebalancing NM container queues

2016-05-05 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-2888:
--
Attachment: YARN-2888.005.patch

> Corrective mechanisms for rebalancing NM container queues
> -
>
> Key: YARN-2888
> URL: https://issues.apache.org/jira/browse/YARN-2888
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Arun Suresh
> Attachments: YARN-2888-yarn-2877.001.patch, 
> YARN-2888-yarn-2877.002.patch, YARN-2888.003.patch, YARN-2888.004.patch, 
> YARN-2888.005.patch
>
>
> Bad queuing decisions by the LocalRMs (e.g., due to the distributed nature of 
> the scheduling decisions or due to having a stale image of the system) may 
> lead to an imbalance in the waiting times of the NM container queues. This 
> can in turn have an impact on job execution times and cluster utilization.
> To this end, we introduce corrective mechanisms that may remove (whenever 
> needed) container requests from overloaded queues, adding them to less-loaded 
> ones.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4390) Do surgical preemption based on reserved container in CapacityScheduler

2016-05-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15273017#comment-15273017
 ] 

Hudson commented on YARN-4390:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9724 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9724/])
YARN-4390. Do surgical preemption based on reserved container in (jianhe: rev 
bb62e0592566b2fcae7136b30972aad2d3ac55b0)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerNode.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TempQueuePerPartition.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerPreemption.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicyMockFramework.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerSurgicalPreemption.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerPreemptionTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/FifoCandidatesSelector.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerNode.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicyMockFramework.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicyForNodePartitions.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ReservedContainerCandidatesSelector.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicyForReservedContainers.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmcontainer/RMContainer.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java
* 

[jira] [Updated] (YARN-4766) NM should not aggregate logs older than the retention policy

2016-05-05 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-4766:
-
Attachment: yarn4766.004.patch

Rebased the patch on the latest trunk and fixed the compilation issue.

> NM should not aggregate logs older than the retention policy
> 
>
> Key: YARN-4766
> URL: https://issues.apache.org/jira/browse/YARN-4766
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation, nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: yarn4766.001.patch, yarn4766.002.patch, 
> yarn4766.003.patch, yarn4766.004.patch, yarn4766.004.patch
>
>
> When log aggregation fails on the NM, the information for the attempt is 
> kept in the recovery DB. Log aggregation can fail for multiple reasons, which 
> are often related to HDFS space or permissions.
> On restart the recovery DB is read and if an application attempt needs its 
> logs aggregated, the files are scheduled for aggregation without any checks. 
> The log files could be older than the retention limit in which case we should 
> not aggregate them but immediately mark them for deletion from the local file 
> system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3362) Add node label usage in RM CapacityScheduler web UI

2016-05-05 Thread Eric Payne (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-3362:
-
Attachment: YARN-3362-branch-2.7.003.patch

[~Naganarasimha], uploading YARN-3362-branch-2.7.003.patch.

Most of the checkstyle warnings were from the previous (trunk) patch. However, 
there were a few that I introduced, so I fixed those. I also fixed some of the 
others, including making methods and parameters final and adding minimal 
javadocs.

> Add node label usage in RM CapacityScheduler web UI
> ---
>
> Key: YARN-3362
> URL: https://issues.apache.org/jira/browse/YARN-3362
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacityscheduler, resourcemanager, webapp
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Fix For: 2.8.0
>
> Attachments: 2015.05.06 Folded Queues.png, 2015.05.06 Queue 
> Expanded.png, 2015.05.07_3362_Queue_Hierarchy.png, 
> 2015.05.10_3362_Queue_Hierarchy.png, 2015.05.12_3362_Queue_Hierarchy.png, 
> AppInLabelXnoStatsInSchedPage.png, CSWithLabelsView.png, 
> No-space-between-Active_user_info-and-next-queues.png, Screen Shot 2015-04-29 
> at 11.42.17 AM.png, YARN-3362-branch-2.7.002.patch, 
> YARN-3362-branch-2.7.003.patch, YARN-3362.20150428-3-modified.patch, 
> YARN-3362.20150428-3.patch, YARN-3362.20150506-1.patch, 
> YARN-3362.20150507-1.patch, YARN-3362.20150510-1.patch, 
> YARN-3362.20150511-1.patch, YARN-3362.20150512-1.patch, capacity-scheduler.xml
>
>
> We don't show node label usage in the RM CapacityScheduler web UI now; 
> without this, it is hard for users to understand what happened on nodes that 
> have labels assigned to them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5019) [YARN-3368] Change urls in new YARN ui from camel casing to hyphens

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272967#comment-15272967
 ] 

Hadoop QA commented on YARN-5019:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 43s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
28s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 2m 32s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:f38692c |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12802498/YARN-5019-YARN-3368.1.patch
 |
| JIRA Issue | YARN-5019 |
| Optional Tests |  asflicense  |
| uname | Linux 34a11fba1afa 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-3368 / 2d617a5 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/11347/artifact/patchprocess/whitespace-eol.txt
 |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/11347/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> [YARN-3368] Change urls in new YARN ui from camel casing to hyphens
> ---
>
> Key: YARN-5019
> URL: https://issues.apache.org/jira/browse/YARN-5019
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Varun Vasudev
>Assignee: Sunil G
> Attachments: YARN-5019-YARN-3368.1.patch
>
>
> There are a couple of reasons we should recommend avoiding camel casing in 
> urls -
> 1. Some web servers are case insensitive
> 2. Google suggests using hyphens - 
> https://support.google.com/webmasters/answer/76329



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4896) ProportionalPreemptionPolicy needs to handle AMResourcePercentage per partition

2016-05-05 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272928#comment-15272928
 ] 

Sunil G commented on YARN-4896:
---

Hi [~leftnoteasy], could you please take a look at this patch?

> ProportionalPreemptionPolicy needs to handle AMResourcePercentage per 
> partition
> ---
>
> Key: YARN-4896
> URL: https://issues.apache.org/jira/browse/YARN-4896
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 2.7.2
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-4896.patch, 0002-YARN-4896.patch
>
>
> In PCPP, currently we are using {{getMaxAMResourcePerQueuePercent()}} to get 
> the max AM capacity for queue to save AM Containers from preemption. As we 
> are now supporting MaxAMResourcePerQueuePercent per partition, PCPP also need 
> to handle the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5001) Aggregated Logs root directory is created with wrong group if nonexistent

2016-05-05 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272908#comment-15272908
 ] 

Haibo Chen commented on YARN-5001:
--

The fix will explicitly set the group of /tmp/logs to the primary group of the 
user the NodeManager runs as. This will not work if the user the JHS runs as 
does not belong to the same group as the NodeManager user.
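The approach could be sketched as follows (illustrative only; it assumes the 
NM login user's primary group is the one the JHS can read, which is exactly 
the limitation noted above):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.hadoop.security.UserGroupInformation;

final class RemoteRootLogDirSketch {
  // Sketch only: create the aggregated-log root dir and pin its group
  // explicitly instead of inheriting the HDFS superuser's group.
  static void ensureRootLogDir(Configuration conf, Path root) throws Exception {
    FileSystem fs = root.getFileSystem(conf);
    if (!fs.exists(root)) {
      fs.mkdirs(root, new FsPermission((short) 01777)); // rwxrwxrwxt, like /tmp
      // Assumption: the first group of the NM login user is the readable one.
      String group = UserGroupInformation.getLoginUser().getGroupNames()[0];
      fs.setOwner(root, null, group);  // null user: leave the owner unchanged
    }
  }
}
{code}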

> Aggregated Logs root directory is created with wrong group if nonexistent 
> --
>
> Key: YARN-5001
> URL: https://issues.apache.org/jira/browse/YARN-5001
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: yarn5001.001.patch
>
>
> The directory /tmp/logs, where the aggregated logs go, is supposed to be read 
> by the JHS. But if it is not created beforehand, it will be created by the 
> NodeManager with the group being the superuser group set in HDFS. Files 
> created under this directory will then inherit the supergroup as their group. 
> This leads the JHS to fail to read the container log files under that 
> directory if the JHS is not running as a user that belongs to the superuser 
> group.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable

2016-05-05 Thread Nathan Roberts (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nathan Roberts updated YARN-4963:
-
Attachment: YARN-4963.002.patch

Address checkstyle comment.



> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat 
> configurable
> 
>
> Key: YARN-4963
> URL: https://issues.apache.org/jira/browse/YARN-4963
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacityscheduler
>Affects Versions: 3.0.0, 2.7.2
>Reporter: Nathan Roberts
>Assignee: Nathan Roberts
> Attachments: YARN-4963.001.patch, YARN-4963.002.patch
>
>
> Currently the capacity scheduler allows exactly one OFF_SWITCH assignment 
> per heartbeat. With more and more non-MapReduce workloads coming along, the 
> degree of locality is declining, causing scheduling to be significantly 
> slower. It's still important to limit the number of OFF_SWITCH assignments to 
> avoid densely packing OFF_SWITCH containers onto nodes. 
> The proposal is to add a simple config that makes the number of OFF_SWITCH 
> assignments configurable.
> Will upload a candidate patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking

2016-05-05 Thread Daniel Zhi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272901#comment-15272901
 ] 

Daniel Zhi commented on YARN-4676:
--

To clarify before I make code changes:

1. HostsFileReader currently allows multiple hosts per line. When hosts are 
pure digits, there will be ambiguity with the timeout during interpretation. 
Allowing pure-digit hosts would likely require each such host to start on a 
new line.
2. -1 means an infinite timeout (wait forever until ready). null means no 
override, i.e. use the default timeout.
3. There could be a large number of hosts to be decommissioned, so a single 
line could be huge; grepping for a particular host would then return that huge 
line. A mixed approach could log on a single line for fewer than N hosts and 
on multiple lines otherwise. That said, I am OK with changing to a single line.
4. Simple after 1).
5. Same as 2).
6. OK.
7. How about DEFAULT_NM_EXIT_WAIT_MS = 0, so that it can be customized in 
cases where the delay is preferred?
8. The grace period gives the RM server side a chance to DECOMMISSION the node 
should the timeout be reached. A much smaller period, like 2 seconds, would 
most likely be sufficient, since the NodeManager heartbeats every second, 
during which a DECOMMISSIONING node is re-evaluated and decommissioned if it 
is ready or has timed out.
9. "yarn rmadmin -refreshNodes -g -1" waits forever until the node is ready. 
"yarn rmadmin -refreshNodes -g" uses the default timeout specified by the 
configuration key.
10. Same as 2).
11. OK.
12. See 7).
13. OK.
14. Here is an example of the tabular logging. Keeping a DECOMMISSIONED node a 
little longer prevents it from suddenly disappearing from the list right after 
it is DECOMMISSIONED.
2015-08-14 20:31:00,797 INFO 
org.apache.hadoop.yarn.server.resourcemanager.DecommissioningNodesWatcher (IPC 
Server handler 14 on 9023): Decommissioning Nodes: 
  ip-10-45-166-151.ec2.internal20s fresh:  0s containers:14 
WAIT_CONTAINER timeout:1779s
application_1439334429355_0004 RUNNING MAPREDUCE  7.50%55s
  ip-10-170-95-251.ec2.internal20s fresh:  0s containers:14 
WAIT_CONTAINER timeout:1779s
application_1439334429355_0004 RUNNING MAPREDUCE  7.50%55s
  ip-10-29-137-237.ec2.internal19s fresh:  0s containers:14 
WAIT_CONTAINER timeout:1780s
application_1439334429355_0004 RUNNING MAPREDUCE  7.50%55s
  ip-10-157-4-26.ec2.internal  19s fresh:  0s containers:14 
WAIT_CONTAINER timeout:1780s
application_1439334429355_0004 RUNNING MAPREDUCE  7.50%55s

15. I agree that getDecommissioningStatus suggests the call is read-only. 
Since completed apps need to be taken into account when evaluating the 
readiness of the node, and getDecommissioningStatus is actually a private 
method used internally, it could be changed into a private 
checkDecommissioningStatus(nodeId).

16. readDecommissioningTimeout exists to pick up a new value without 
restarting the RM. It was requested by EMR customers, and I do see the user 
scenarios. It is only invoked when there are DECOMMISSIONED nodes, and only 
once every 20 seconds (the poll period). I would have to maintain a private 
patch or consider other options if the feature were removed.

17. ok
18. The method returns the number of seconds until timeout. I don't mind 
changing the name to getTimeoutTimestampInSec(), but I don't see the reason 
behind it.

19. See the example in 14. This logs once every 20 seconds and was very useful 
during my development and testing of this work. I see more value in leaving it 
at INFO, but as the code becomes mature and stable it may be OK to turn it 
into DEBUG.

20. ok
21. The isValidNode() && isNodeInDecommissioning() condition is just a very 
quick shallow check --- for a DECOMMISSIONING node, although nodesListManager 
would return false for isValidNode() because the node appears in the excluded 
host list, such a node is allowed to continue since it is in the middle of 
DECOMMISSIONING. While processing the heartbeat, decommissioningWatcher is 
updated with the latest container status of the node; later, 
decomWatcher.checkReadyToBeDecommissioned(rmNode.getNodeID()) evaluates its 
readiness and DECOMMISSIONs the node if it is ready (including on timeout).

22. The call simply returns if it is within 20 seconds of the last call. 
Currently it lives inside ResourceTrackerService and uses rmContext. 
Alternatively, DecommissioningNodesWatcher could be constructed with rmContext 
and internally run its own polling thread. Other than not yet being sure of 
the code pattern to use for such an internal thread, that appears to be a 
valid alternative to me.

23. ok
24. ok
25. Instead of disallowing and exiting, an alternative is to allow the 
graceful decommission as usual. There is no difference if the RM does not 
restart during the session. If the RM restarts, currently all excluded nodes 
are decommissioned right away; enhanced support in the future will resume the 
graceful process.


> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> 

[jira] [Updated] (YARN-5019) [YARN-3368] Change urls in new YARN ui from camel casing to hyphens

2016-05-05 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-5019:
--
Attachment: YARN-5019-YARN-3368.1.patch

Attaching an initial version of the patch.

> [YARN-3368] Change urls in new YARN ui from camel casing to hyphens
> ---
>
> Key: YARN-5019
> URL: https://issues.apache.org/jira/browse/YARN-5019
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Varun Vasudev
>Assignee: Sunil G
> Attachments: YARN-5019-YARN-3368.1.patch
>
>
> There are a couple of reasons we should recommend avoiding camel casing in 
> urls -
> 1. Some web servers are case insensitive
> 2. Google suggests using hyphens - 
> https://support.google.com/webmasters/answer/76329



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5001) Aggregated Logs root directory is created with wrong group if nonexistent

2016-05-05 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-5001:
-
Description: The directory /tmp/logs, where the aggregated logs go, is 
supposed to be read by the JHS. But if it is not created beforehand, it will 
be created by the NodeManager with the group being the superuser group set in 
HDFS. Files created under this directory will then inherit the supergroup as 
their group. This leads the JHS to fail to read the container log files under 
that directory if the JHS is not running as a user that belongs to the 
superuser group.  (was: The directory /tmp/logs, where the aggregated logs go, 
is supposed to be read by the JHS. But if it is not created beforehand, it 
will be created by the NodeManager with the group being the superuser group 
set in HDFS. This leads the JHS to fail to read the container log files under 
that directory if the JHS is not running as a user that belongs to the 
superuser group.)

> Aggregated Logs root directory is created with wrong group if nonexistent 
> --
>
> Key: YARN-5001
> URL: https://issues.apache.org/jira/browse/YARN-5001
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: yarn5001.001.patch
>
>
> The directory /tmp/logs, where the aggregated logs go, is supposed to be read 
> by the JHS. But if it is not created beforehand, it will be created by the 
> NodeManager with the group being the superuser group set in HDFS. Files 
> created under this directory will then inherit the supergroup as their group. 
> This leads the JHS to fail to read the container log files under that 
> directory if the JHS is not running as a user that belongs to the superuser 
> group.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5001) Aggregated Logs root directory is created with wrong group if nonexistent

2016-05-05 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-5001:
-
Description: The directory /tmp/logs, where the aggregated logs go, is 
supposed to be read by the JHS. But if it is not created beforehand, it will 
be created by the NodeManager with the group being the superuser group set in 
HDFS. This leads the JHS to fail to read the container log files under that 
directory if the JHS is not running as a user that belongs to the superuser 
group.  (was: Usually, the group owner for /tmp/logs, where the aggregated 
logs go, is "hadoop". Under that dir, you then have /logs// with group 
being "hadoop" all the way down.

If you delete the /tmp/logs dir (when you want to clean up all the logs), the 
directory will be created with a different group, "superuser". The JHS runs as 
the mapred user, who is a member of the hadoop group. With the new group, the 
JHS doesn't have permission to read the logs any more.)

> Aggregated Logs root directory is created with wrong group if nonexistent 
> --
>
> Key: YARN-5001
> URL: https://issues.apache.org/jira/browse/YARN-5001
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: yarn5001.001.patch
>
>
> The directory /tmp/logs, where the aggregated logs go, is supposed to be read 
> by the JHS. But if it is not created beforehand, it will be created by the 
> NodeManager with the group being the superuser group set in HDFS. This leads 
> the JHS to fail to read the container log files under that directory if the 
> JHS is not running as a user that belongs to the superuser group.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5029) RM needs to send update event with YarnApplicationState as Running to ATS/AHS

2016-05-05 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272832#comment-15272832
 ] 

Junping Du commented on YARN-5029:
--

Thanks [~xgong] for reporting the issue and putting up a patch. The patch 
looks good overall. However, I see we have a UT in TestSystemMetricsPublisher 
that checks whether SystemMetricsPublisher received the event. It would be 
helpful to have an additional check on the ATS side to verify the event is 
actually written to the backend, just like what we have now in 
TestDistributedShell.
Also, there are several indentation issues in the v1 patch that we should fix.
The rest looks fine.

> RM needs to send update event with YarnApplicationState as Running to ATS/AHS
> -
>
> Key: YARN-5029
> URL: https://issues.apache.org/jira/browse/YARN-5029
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>Priority: Critical
> Attachments: YARN-5029.1.patch
>
>
> Right now, an application in AHS/ATS is always in the ACCEPTED state until 
> the application finishes, fails, or is killed. This is because the RM did not 
> send any other YarnApplicationState information, except 
> FINISHED/FAILED/KILLED, to the ATS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5034) Failing tests after using try-with-resources

2016-05-05 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated YARN-5034:
---
Attachment: YARN-5034.13.patch

13: Comment change

> Failing tests after using try-with-resources
> 
>
> Key: YARN-5034
> URL: https://issues.apache.org/jira/browse/YARN-5034
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: YARN-5034.01.patch, YARN-5034.02.patch, 
> YARN-5034.03.patch, YARN-5034.04.patch, YARN-5034.05.patch, 
> YARN-5034.06.patch, YARN-5034.07.patch, YARN-5034.08.patch, 
> YARN-5034.09.patch, YARN-5034.10.patch, YARN-5034.11.patch, 
> YARN-5034.12.patch, YARN-5034.13.patch
>
>
> This JIRA is for following up on failing tests. I am not able to reproduce 
> them locally, either on Mac or on CentOS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4994) Use MiniYARNCluster with try-with-resources in tests

2016-05-05 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272781#comment-15272781
 ] 

Andras Bokor commented on YARN-4994:


Meanwhile, I have an idea; I will get back to you.

> Use MiniYARNCluster with try-with-resources in tests
> 
>
> Key: YARN-4994
> URL: https://issues.apache.org/jira/browse/YARN-4994
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-10287.01.patch, HDFS-10287.02.patch, 
> HDFS-10287.03.patch
>
>
> In tests, MiniYARNCluster is used with the following pattern: in a try 
> block, create a MiniYARNCluster instance, and close it in the finally block.
> [Try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html]
>  is preferred since Java7 instead of the pattern above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5002) getApplicationReport call may raise NPE

2016-05-05 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272780#comment-15272780
 ] 

Wangda Tan commented on YARN-5002:
--

[~kasha],

We totally agree that enforcing the same ACL for running and completed 
applications in removed queues is the best solution. However, since we don't 
store ACL info for removed queues today, we cannot retrieve it.

I would like to suggest fixing this NPE issue first; for the longer term, once 
we can store ACL info for removed queues as well, we can enforce the same ACL 
check.

Thoughts?
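The short-term guard might look like the following (a sketch with assumed 
signatures, not the attached patch):

{code}
import org.apache.hadoop.security.UserGroupInformation;
import org.apache.hadoop.yarn.api.records.QueueACL;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueue;

final class QueueAclGuardSketch {
  // Sketch only: null-check the queue before touching its ACLs, since a
  // removed queue has no stored ACL info today.
  static boolean checkAccess(UserGroupInformation user, QueueACL acl,
                             CSQueue queue) {
    if (queue == null) {
      // Queue was removed; no ACLs are available, so fall back to a default
      // decision instead of throwing NPE (assumption: allow, pending the
      // longer-term stored-ACL fix).
      return true;
    }
    return queue.hasAccess(acl, user);
  }
}
{code}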

> getApplicationReport call may raise NPE
> ---
>
> Key: YARN-5002
> URL: https://issues.apache.org/jira/browse/YARN-5002
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Sumana Sathish
>Assignee: Jian He
>Priority: Critical
> Attachments: YARN-5002.1.patch, YARN-5002.2.patch, YARN-5002.3.patch
>
>
> getApplicationReport call may raise NPE
> {code}
> Exception in thread "main" java.lang.NullPointerException: 
> java.lang.NullPointerException
>  
> org.apache.hadoop.yarn.server.resourcemanager.security.QueueACLsManager.checkAccess(QueueACLsManager.java:57)
>  
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.checkAccess(ClientRMService.java:279)
>  
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplications(ClientRMService.java:760)
>  
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplications(ClientRMService.java:682)
>  
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplications(ApplicationClientProtocolPBServiceImpl.java:234)
>  
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:425)
>  
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>  org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>  org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2268)
>  org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2264)
>  java.security.AccessController.doPrivileged(Native Method)
>  javax.security.auth.Subject.doAs(Subject.java:422)
>  
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1708)
>  org.apache.hadoop.ipc.Server$Handler.run(Server.java:2262)
>  sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
>  org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:107)
>  
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplications(ApplicationClientProtocolPBClientImpl.java:254)
>  sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  java.lang.reflect.Method.invoke(Method.java:498)
>  
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
>  
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>  com.sun.proxy.$Proxy18.getApplications(Unknown Source)
>  
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplications(YarnClientImpl.java:479)
>  
> org.apache.hadoop.mapred.ResourceMgrDelegate.getAllJobs(ResourceMgrDelegate.java:135)
>  org.apache.hadoop.mapred.YARNRunner.getAllJobs(YARNRunner.java:167)
>  org.apache.hadoop.mapreduce.Cluster.getAllJobStatuses(Cluster.java:294)
>  org.apache.hadoop.mapreduce.tools.CLI.listJobs(CLI.java:553)
>  org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:338)
>  org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  org.apache.hadoop.mapred.JobClient.main(JobClient.java:1274)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4994) Use MiniYARNCluster with try-with-resources in tests

2016-05-05 Thread Andras Bokor (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272772#comment-15272772
 ] 

Andras Bokor commented on YARN-4994:


Hi [~jzhuge],

In the last two days I tried many different combinations to find out why the 
tests are failing.
In YARN-5034 you can see my latest patch (patch 12): I changed only the stop() 
method to close(), and the tests failed.
It does not make sense to me at all: close() calls stop(), so how can it ruin 
the test results? It does not seem intermittent either; the same tests fail 
every time, and I can also reproduce builds with no errors.
Do you have any idea? It is very strange.
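For context, the pattern under discussion looks like this (a minimal sketch; 
the constructor arguments are illustrative):

{code}
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.MiniYARNCluster;

class TryWithResourcesSketch {
  void run() throws Exception {
    // Service extends Closeable and close() delegates to stop(), so this
    // should behave the same as the try/finally + stop() pattern it replaces.
    try (MiniYARNCluster cluster = new MiniYARNCluster("example", 1, 1, 1)) {
      cluster.init(new YarnConfiguration());
      cluster.start();
      // ... test body ...
    }                                // close() runs automatically here
  }
}
{code}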

> Use MiniYARNCluster with try-with-resources in tests
> 
>
> Key: YARN-4994
> URL: https://issues.apache.org/jira/browse/YARN-4994
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: HDFS-10287.01.patch, HDFS-10287.02.patch, 
> HDFS-10287.03.patch
>
>
> In tests, MiniYARNCluster is used with the following pattern: in a try 
> block, create a MiniYARNCluster instance, and close it in the finally block.
> [Try-with-resources|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html]
>  is preferred since Java7 instead of the pattern above.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5024) TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers random failure

2016-05-05 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272737#comment-15272737
 ] 

Yufei Gu commented on YARN-5024:


Sorry, I don't quite follow you. Could you please elaborate?

> TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers 
> random failure
> ---
>
> Key: YARN-5024
> URL: https://issues.apache.org/jira/browse/YARN-5024
> Project: Hadoop YARN
>  Issue Type: Test
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-5024.patch, 0002-YARN-5024.patch, 
> 0003-YARN-5024.patch
>
>
> Random Testcase failure for 
> {{TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers}}
> {noformat}
> java.lang.AssertionError: Unexcpected MemorySeconds value 
> expected:<-1497214794931> but was:<1913>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestContainerResourceUsage.amRestartTests(TestContainerResourceUsage.java:395)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers(TestContainerResourceUsage.java:252)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5024) TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers random failure

2016-05-05 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272706#comment-15272706
 ] 

Yufei Gu commented on YARN-5024:


Hi [~bibinchundatt], to reproduce this flakiness, you should remove the 
{{Thread.sleep}} calls I manually added to prevent it. For example, this is 
from the {{amRestartTests}} function:

{code}
// TODO explore a better way than sleeping for a while (YARN-4929)
Thread.sleep(1000);
{code}


> TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers 
> random failure
> ---
>
> Key: YARN-5024
> URL: https://issues.apache.org/jira/browse/YARN-5024
> Project: Hadoop YARN
>  Issue Type: Test
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: 0001-YARN-5024.patch, 0002-YARN-5024.patch, 
> 0003-YARN-5024.patch
>
>
> Random Testcase failure for 
> {{TestContainerResourceUsage#testUsageAfterAMRestartWithMultipleContainers}}
> {noformat}
> java.lang.AssertionError: Unexcpected MemorySeconds value 
> expected:<-1497214794931> but was:<1913>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestContainerResourceUsage.amRestartTests(TestContainerResourceUsage.java:395)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestContainerResourceUsage.testUsageAfterAMRestartWithMultipleContainers(TestContainerResourceUsage.java:252)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4984) LogAggregationService shouldn't swallow exception in handling createAppDir() which cause thread leak.

2016-05-05 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272682#comment-15272682
 ] 

Junping Du commented on YARN-4984:
--

Thanks [~leftnoteasy] for the review and commit!

> LogAggregationService shouldn't swallow exception in handling createAppDir() 
> which cause thread leak.
> -
>
> Key: YARN-4984
> URL: https://issues.apache.org/jira/browse/YARN-4984
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 2.7.2
>Reporter: Junping Du
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.8.0
>
> Attachments: YARN-4984-v2.patch, YARN-4984-v3.patch, 
> YARN-4984-v4.patch, YARN-4984.patch
>
>
> Due to YARN-4325, many stale applications still exist in the NM state store 
> and get recovered after NM restart. App initiation then fails due to an 
> invalid token, but the exception is swallowed and an aggregator thread is 
> still created for the invalid app.
> Exception is:
> {noformat}
> 158 2016-04-19 23:38:33,039 ERROR logaggregation.LogAggregationService 
> (LogAggregationService.java:run(300)) - Failed to setup application log 
> directory for application_1448060878692_11842
> 159 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 1380589 for hdfswrite) can't be fo
> und in cache
> 160 at org.apache.hadoop.ipc.Client.call(Client.java:1427)
> 161 at org.apache.hadoop.ipc.Client.call(Client.java:1358)
> 162 at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
> 163 at com.sun.proxy.$Proxy13.getFileInfo(Unknown Source)
> 164 at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
> 165 at sun.reflect.GeneratedMethodAccessor76.invoke(Unknown 
> Source)
> 166 at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 167 at java.lang.reflect.Method.invoke(Method.java:606)
> 168 at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
> 169 at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> 170 at com.sun.proxy.$Proxy14.getFileInfo(Unknown Source)
> 171 at 
> org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116)
> 172 at 
> org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1315)
> 173 at 
> org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1311)
> 174 at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> 175 at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1311)
> 176 at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.checkExists(LogAggregationService.java:248)
> 177 at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.access$100(LogAggregationService.java:67)
> 178 at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:276)
> 179 at java.security.AccessController.doPrivileged(Native Method)
> 180 at javax.security.auth.Subject.doAs(Subject.java:415)
> 181 at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> 182 at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.createAppDir(LogAggregationService.java:261)
> 183 at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:367)
> 184 at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:320)
> 185 at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:447)
> 186 at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
> {noformat}
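
For illustration, a minimal standalone sketch of the fix pattern (the class and helper names here are invented, not the actual LogAggregationService code): fail app initialization when directory setup throws, so no aggregator thread is created for an app whose token is already invalid.

{code}
import java.io.IOException;

// Illustrative sketch only -- not the actual LogAggregationService code.
class AggregatorStarter {
  interface AppDirCreator {
    void createAppDir(String appId) throws IOException;
  }

  private final AppDirCreator dirCreator;

  AggregatorStarter(AppDirCreator dirCreator) {
    this.dirCreator = dirCreator;
  }

  // Returns false (and starts no thread) when app-dir setup fails,
  // instead of swallowing the exception and leaking a thread.
  boolean startAggregator(final String appId) {
    try {
      dirCreator.createAppDir(appId);
    } catch (IOException e) {
      System.err.println("Failed to set up log dir for " + appId + ": " + e);
      return false;
    }
    Thread t = new Thread(new Runnable() {
      @Override public void run() { /* aggregate logs for appId */ }
    }, "LogAggregator-" + appId);
    t.start();
    return true;
  }
}
{code}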



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5044) Add peak memory usage counter for each task

2016-05-05 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272664#comment-15272664
 ] 

Varun Vasudev commented on YARN-5044:
-

[~yufeigu] - does this work for every container? Have you looked at the 
ContainerMetrics class? It should have this information, at least for physical 
memory.

> Add peak memory usage counter for each task
> ---
>
> Key: YARN-5044
> URL: https://issues.apache.org/jira/browse/YARN-5044
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>
> Each task has counters PHYSICAL_MEMORY_BYTES and VIRTUAL_MEMORY_BYTES, which 
> are snapshots of memory usage of that task. They are not sufficient for users 
> to understand peak memory usage by that task, e.g. in order to diagnose task 
> failures, tune job parameters or change application design. This new feature 
> will add two more counters for each task: PHYSICAL_MEMORY_BYTES_MAX and 
> VIRTUAL_MEMORY_BYTES_MAX.
> This JIRA carries over the same feature from MAPREDUCE-4710. I filed this new 
> YARN JIRA since MAPREDUCE-4710 is a pretty old one from the MR 1.x era; it 
> more or less assumes a branch-1 architecture and should be closed at this point.
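
As a hedged illustration of the proposal (PHYSICAL_MEMORY_BYTES_MAX does not exist yet, so this sketch models it as a counter in a made-up group; only TaskCounter.PHYSICAL_MEMORY_BYTES is an existing counter):

{code}
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.TaskCounter;

// Hedged sketch: on every memory sample, update the existing snapshot
// counter and keep a running maximum in the proposed *_MAX counter.
class PeakMemoryCounters {
  void update(Counters counters, long physicalMemoryBytes) {
    // existing snapshot counter
    counters.findCounter(TaskCounter.PHYSICAL_MEMORY_BYTES)
        .setValue(physicalMemoryBytes);
    // proposed counter; group and name are illustrative, not final
    Counter peak =
        counters.findCounter("TaskResourceUsage", "PHYSICAL_MEMORY_BYTES_MAX");
    peak.setValue(Math.max(peak.getValue(), physicalMemoryBytes));
  }
}
{code}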



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-2143) Merge common killContainer logic of Fair/Capacity scheduler into AbstractYarnScheduler

2016-05-05 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang reassigned YARN-2143:


Assignee: Ray Chiang

> Merge common killContainer logic of Fair/Capacity scheduler into 
> AbstractYarnScheduler
> --
>
> Key: YARN-2143
> URL: https://issues.apache.org/jira/browse/YARN-2143
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, scheduler
>Reporter: Wangda Tan
>Assignee: Ray Chiang
>
> Currently, CapacityScheduler has a killContainer API inherited from 
> PreemptableResourceScheduler, while FairScheduler uses warnOrKillContainer to 
> do container preemption. We should merge the common container-kill code into 
> AbstractYarnScheduler.
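
A hedged sketch of the shape such a merge could take; class and method names are approximate, not the actual scheduler APIs:

{code}
// Illustrative only: common kill bookkeeping in the base class, with the
// scheduler-specific policy (immediate kill vs. warn-then-kill) left to
// subclasses.
abstract class AbstractYarnSchedulerSketch {
  // shared path both schedulers would call once the decision is made
  protected final void killContainer(String containerId, String reason) {
    // common bookkeeping: release the allocation, emit a KILL event, log it
    System.out.println("Killing container " + containerId + ": " + reason);
  }

  // CapacityScheduler: kill immediately; FairScheduler: warn first, then
  // call killContainer() after the preemption grace period expires.
  protected abstract void warnOrKillContainer(String containerId);
}
{code}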



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5030) Revisit o.a.h.y.s.rm.scheduler.ResourceUsage

2016-05-05 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang reassigned YARN-5030:


Assignee: Ray Chiang

> Revisit o.a.h.y.s.rm.scheduler.ResourceUsage
> 
>
> Key: YARN-5030
> URL: https://issues.apache.org/jira/browse/YARN-5030
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Karthik Kambatla
>Assignee: Ray Chiang
>  Labels: newbie++
>
> YARN-3092 introduced the ResourceUsage class to track a number of things 
> around resources. The naming is a little ambiguous and hence less conducive 
> to being used elsewhere. 
> # UsageByLabel doesn't need to be label specific. A more descriptive name 
> (TenantResourceTracker) might be more apt - since the class tracks more than 
> just usage. 
> # Accordingly, ResourceUsage itself can be renamed to something more 
> descriptive (and less ambiguous) like LabelWiseTenantResourceTracker or some 
> such. 
> # TenantResourceTracker (previously UsageByLabel) should probably be a class 
> on its own, and the private ResourceType should be part of it instead of the 
> mapping against labels, as sketched below.
> # Ideally, the names would say allocation, to capture allocation instead of 
> usage. 
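
A hedged sketch of the proposed shape; class names follow the suggestions above, but all fields and methods are illustrative only:

{code}
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of the proposed renaming; not committed code.
class TenantResourceTracker {            // was: ResourceUsage.UsageByLabel
  // ResourceType moves in from ResourceUsage, per point 3 above
  enum ResourceType { USED, PENDING, RESERVED }

  private final Map<ResourceType, Long> byType =
      new HashMap<ResourceType, Long>();

  void set(ResourceType type, long value) { byType.put(type, value); }

  long get(ResourceType type) {
    Long v = byType.get(type);
    return v == null ? 0L : v;
  }
}

class LabelWiseTenantResourceTracker {   // was: ResourceUsage
  private final Map<String, TenantResourceTracker> byLabel =
      new HashMap<String, TenantResourceTracker>();

  TenantResourceTracker forLabel(String label) {
    TenantResourceTracker t = byLabel.get(label);
    if (t == null) {
      t = new TenantResourceTracker();
      byLabel.put(label, t);
    }
    return t;
  }
}
{code}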



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5047) Refactor nodeUpdate() from FairScheduler and CapacityScheduler

2016-05-05 Thread Ray Chiang (JIRA)
Ray Chiang created YARN-5047:


 Summary: Refactor nodeUpdate() from FairScheduler and 
CapacityScheduler
 Key: YARN-5047
 URL: https://issues.apache.org/jira/browse/YARN-5047
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: capacityscheduler, fairscheduler, scheduler
Affects Versions: 3.0.0
Reporter: Ray Chiang
Assignee: Ray Chiang


FairScheduler#nodeUpdate() and CapacityScheduler#nodeUpdate() have a lot of 
commonality in their code.  See about refactoring the common parts into 
AbstractYARNScheduler.
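
A hedged sketch of the template-method shape this refactoring could take (names illustrative, not the actual scheduler code):

{code}
// Illustrative only: the shared nodeUpdate() skeleton lives in the base
// class and each scheduler overrides just the placement hook.
abstract class NodeUpdateSketch {
  final void nodeUpdate(String nodeId) {
    // common to Fair/Capacity: process newly launched and completed
    // containers reported in the heartbeat, refresh node liveness
    processHeartbeat(nodeId);
    // scheduler-specific: try to allocate on this node
    attemptScheduling(nodeId);
  }

  private void processHeartbeat(String nodeId) {
    System.out.println("Heartbeat from " + nodeId);
  }

  protected abstract void attemptScheduling(String nodeId);
}
{code}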



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5046) [Umbrella] Refactor scheduler code

2016-05-05 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated YARN-5046:
-
Description: 
At this point in time, there are several places where code common to the 
schedulers can be moved from one or more of the schedulers into 
AbstractYARNScheduler or a related interface.

Creating this umbrella JIRA to track this refactoring.  In general, it is 
preferable to create a subtask JIRA on a per-method basis.

This may need some coordination with [YARN-3091  \[Umbrella\] Improve and fix 
locks of RM scheduler|https://issues.apache.org/jira/browse/YARN-3091].

  was:
At this point in time, there are several places where code common to the 
schedulers can be moved from one or more of the schedulers into 
AbstractYARNScheduler or a related interface.

Creating this umbrella JIRA to track this refactoring.  In general, it is 
preferable to create a subtask JIRA on a per-method basis.


> [Umbrella] Refactor scheduler code
> --
>
> Key: YARN-5046
> URL: https://issues.apache.org/jira/browse/YARN-5046
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: capacity scheduler, fairscheduler, resourcemanager, 
> scheduler
>Affects Versions: 3.0.0
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: technical_debt
>
> At this point in time, there are several places where code common to the 
> schedulers can be moved from one or more of the schedulers into 
> AbstractYARNScheduler or a related interface.
> Creating this umbrella JIRA to track this refactoring.  In general, it is 
> preferable to create a subtask JIRA on a per-method basis.
> This may need some coordination with [YARN-3091  \[Umbrella\] Improve and fix 
> locks of RM scheduler|https://issues.apache.org/jira/browse/YARN-3091].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4809) De-duplicate container completion across schedulers

2016-05-05 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated YARN-4809:
-
Issue Type: Sub-task  (was: Improvement)
Parent: YARN-5046

> De-duplicate container completion across schedulers
> ---
>
> Key: YARN-4809
> URL: https://issues.apache.org/jira/browse/YARN-4809
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Karthik Kambatla
>Assignee: Sunil G
> Attachments: 0001-YARN-4809.patch
>
>
> CapacityScheduler and FairScheduler implement containerCompleted the exact 
> same way. Duplication across the schedulers can be avoided. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5030) Revisit o.a.h.y.s.rm.scheduler.ResourceUsage

2016-05-05 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated YARN-5030:
-
Issue Type: Sub-task  (was: Improvement)
Parent: YARN-5046

> Revisit o.a.h.y.s.rm.scheduler.ResourceUsage
> 
>
> Key: YARN-5030
> URL: https://issues.apache.org/jira/browse/YARN-5030
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Karthik Kambatla
>  Labels: newbie++
>
> YARN-3092 introduced the ResourceUsage class to track a number of things 
> around resources. The naming is a little ambiguous and hence less conducive 
> to being used elsewhere. 
> # UsageByLabel doesn't need to be label specific. A more descriptive name 
> (TenantResourceTracker) might be more apt - since the class tracks more than 
> just usage. 
> # Accordingly, ResourceUsage itself can be renamed to something more 
> descriptive (and less ambiguous) like LabelWiseTenantResourceTracker or some 
> such. 
> # TenantResourceTracker (previously UsageByLabel) should probably be a class 
> on its own, and the private ResourceType should be part of it instead of the 
> mapping against labels.
> # Ideally, the names would say allocation, to capture allocation instead of 
> usage. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-3776) FairScheduler code refactoring to separate out the code paths for assigning a reserved container and a non-reserved container

2016-05-05 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated YARN-3776:
-
Issue Type: Sub-task  (was: Improvement)
Parent: YARN-5046

> FairScheduler code refactoring to separate out the code paths for assigning a 
> reserved container and a non-reserved container
> -
>
> Key: YARN-3776
> URL: https://issues.apache.org/jira/browse/YARN-3776
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Affects Versions: 2.7.0
>Reporter: zhihai xu
>Assignee: zhihai xu
>
> FairScheduler code refactoring, as discussed at YARN-3655, Separate out the 
> code paths for assigning a reserved container and a non-reserved container.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3933) Race condition when calling AbstractYarnScheduler.completedContainer.

2016-05-05 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272636#comment-15272636
 ] 

Sunil G commented on YARN-3933:
---

Hi [~guoshiwei],
Could you please update the patch with the suggested changes?

> Race condition when calling AbstractYarnScheduler.completedContainer.
> -
>
> Key: YARN-3933
> URL: https://issues.apache.org/jira/browse/YARN-3933
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.6.0, 2.7.0, 2.5.2, 2.7.1
>Reporter: Lavkesh Lahngir
>Assignee: Shiwei Guo
> Attachments: YARN-3933.001.patch, YARN-3933.002.patch, 
> YARN-3933.003.patch
>
>
> In our cluster we are seeing available memory and cores go negative. 
> Initial inspection:
> Scenario no. 1: 
> In the capacity scheduler, the method allocateContainersToNode() checks if 
> there are excess reservations of containers for an application; if they are 
> no longer needed it calls queue.completedContainer(), which drives resources 
> negative even though they were never assigned in the first place. 
> I am still looking through the code. Can somebody suggest how to simulate 
> excess container assignments?
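
For illustration, one hedged way to make the completion path idempotent (a sketch of the guard idea only, not the actual YARN-3933 patch):

{code}
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative guard: the second of two racing completedContainer() calls
// becomes a no-op, so queue resources are not decremented twice (one way
// available memory/cores can go negative).
class CompletedContainerGuard {
  private final Set<String> completed = ConcurrentHashMap.newKeySet();

  // true only for the first caller; later calls for the same container
  // must skip the release/metrics path entirely
  boolean tryMarkCompleted(String containerId) {
    return completed.add(containerId);
  }
}
{code}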



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2143) Merge common killContainer logic of Fair/Capacity scheduler into AbstractYarnScheduler

2016-05-05 Thread Ray Chiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ray Chiang updated YARN-2143:
-
Issue Type: Sub-task  (was: Task)
Parent: YARN-5046

> Merge common killContainer logic of Fair/Capacity scheduler into 
> AbstractYarnScheduler
> --
>
> Key: YARN-2143
> URL: https://issues.apache.org/jira/browse/YARN-2143
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, scheduler
>Reporter: Wangda Tan
>
> Currently, CapacityScheduler has a killContainer API inherited from 
> PreemptableResourceScheduler, while FairScheduler uses warnOrKillContainer to 
> do container preemption. We should merge the common container-kill code into 
> AbstractYarnScheduler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5046) [Umbrella] Refactor scheduler code

2016-05-05 Thread Ray Chiang (JIRA)
Ray Chiang created YARN-5046:


 Summary: [Umbrella] Refactor scheduler code
 Key: YARN-5046
 URL: https://issues.apache.org/jira/browse/YARN-5046
 Project: Hadoop YARN
  Issue Type: Task
  Components: capacity scheduler, fairscheduler, resourcemanager, 
scheduler
Affects Versions: 3.0.0
Reporter: Ray Chiang
Assignee: Ray Chiang


At this point in time, there are several places where code common to the 
schedulers can be moved from one or more of the schedulers into 
AbstractYARNScheduler or a related interface.

Creating this umbrella JIRA to track this refactoring.  In general, it is 
preferable to create a subtask JIRA on a per-method basis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5045) hbase unit tests fail due to dependency issues

2016-05-05 Thread Sangjin Lee (JIRA)
Sangjin Lee created YARN-5045:
-

 Summary: hbase unit tests fail due to dependency issues
 Key: YARN-5045
 URL: https://issues.apache.org/jira/browse/YARN-5045
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Sangjin Lee
Priority: Blocker


After the 5/4 rebase, the hbase unit tests in the timeline service project are 
failing:

{noformat}
org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
  Time elapsed: 5.103 sec  <<< ERROR!
java.io.IOException: Shutting down
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at 
org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677)
at 
org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546)
at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:500)
at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:104)
at 
org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345)
at org.apache.hadoop.hbase.http.InfoServer.<init>(InfoServer.java:77)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:550)
at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:333)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
at 
org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139)
at 
org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluster.java:217)
at 
org.apache.hadoop.hbase.LocalHBaseCluster.<init>(LocalHBaseCluster.java:153)
at 
org.apache.hadoop.hbase.MiniHBaseCluster.init(MiniHBaseCluster.java:213)
at 
org.apache.hadoop.hbase.MiniHBaseCluster.<init>(MiniHBaseCluster.java:93)
at 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniHBaseCluster(HBaseTestingUtility.java:978)
at 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:938)
at 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:812)
at 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:806)
at 
org.apache.hadoop.hbase.HBaseTestingUtility.startMiniCluster(HBaseTestingUtility.java:750)
at 
org.apache.hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage.setup(TestTimelineReaderWebServicesHBaseStorage.java:87)
{noformat}

The root cause is that the hbase mini server depends on hadoop common's 
{{MetricsServlet}} which has been removed in the trunk (HADOOP-12504):

{noformat}
Caused by: java.lang.NoClassDefFoundError: 
org/apache/hadoop/metrics/MetricsServlet
at 
org.apache.hadoop.hbase.http.HttpServer.addDefaultServlets(HttpServer.java:677)
at 
org.apache.hadoop.hbase.http.HttpServer.initializeWebServer(HttpServer.java:546)
at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:500)
at org.apache.hadoop.hbase.http.HttpServer.<init>(HttpServer.java:104)
at 
org.apache.hadoop.hbase.http.HttpServer$Builder.build(HttpServer.java:345)
at org.apache.hadoop.hbase.http.InfoServer.<init>(InfoServer.java:77)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.putUpWebUI(HRegionServer.java:1697)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.<init>(HRegionServer.java:550)
at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:333)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
at 
org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMClusterUtil.java:139)
... 26 more
{noformat}
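
A minimal, illustrative probe for the missing class on the test classpath:

{code}
public class MetricsServletProbe {
  public static void main(String[] args) throws ClassNotFoundException {
    // Throws ClassNotFoundException on trunk, where HADOOP-12504 removed
    // the servlet the HBase mini-cluster still expects to find.
    Class.forName("org.apache.hadoop.metrics.MetricsServlet");
    System.out.println("MetricsServlet is on the classpath");
  }
}
{code}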



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (YARN-5044) Add peak memory usage counter for each task

2016-05-05 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-5044:
---
Description: 
Each task has counters PHYSICAL_MEMORY_BYTES and VIRTUAL_MEMORY_BYTES, which 
are snapshots of memory usage of that task. They are not sufficient for users 
to understand peak memory usage by that task, e.g. in order to diagnose task 
failures, tune job parameters or change application design. This new feature 
will add two more counters for each task: PHYSICAL_MEMORY_BYTES_MAX and 
VIRTUAL_MEMORY_BYTES_MAX.

This JIRA carries over the same feature from MAPREDUCE-4710. I filed this new 
YARN JIRA since MAPREDUCE-4710 is a pretty old one from the MR 1.x era; it more 
or less assumes a branch-1 architecture and should be closed at this point.

  was:
Need the same 

Each task has counters PHYSICAL_MEMORY_BYTES and VIRTUAL_MEMORY_BYTES, which 
are snapshots of memory usage of that task. They are not sufficient for users 
to understand peak memory usage by that task, e.g. in order to diagnose task 
failures, tune job parameters or change application design. This new feature 
will add two more counters for each task: PHYSICAL_MEMORY_BYTES_MAX and 
VIRTUAL_MEMORY_BYTES_MAX.



> Add peak memory usage counter for each task
> ---
>
> Key: YARN-5044
> URL: https://issues.apache.org/jira/browse/YARN-5044
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>
> Each task has counters PHYSICAL_MEMORY_BYTES and VIRTUAL_MEMORY_BYTES, which 
> are snapshots of memory usage of that task. They are not sufficient for users 
> to understand peak memory usage by that task, e.g. in order to diagnose task 
> failures, tune job parameters or change application design. This new feature 
> will add two more counters for each task: PHYSICAL_MEMORY_BYTES_MAX and 
> VIRTUAL_MEMORY_BYTES_MAX.
> This JIRA carries over the same feature from MAPREDUCE-4710. I filed this new 
> YARN JIRA since MAPREDUCE-4710 is a pretty old one from the MR 1.x era; it 
> more or less assumes a branch-1 architecture and should be closed at this point.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5034) Failing tests after using try-with-resources

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272620#comment-15272620
 ] 

Hadoop QA commented on YARN-5034:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
40s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
34s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
19s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 14s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
43s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 4s {color} 
| {color:red} hadoop-yarn-client in the patch failed with JDK v1.8.0_91. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 22s {color} 
| {color:red} hadoop-yarn-client in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
21s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 145m 41s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | hadoop.yarn.client.TestGetGroups |
|   | hadoop.yarn.client.api.impl.TestAMRMProxy |
| JDK v1.8.0_91 Timed out junit tests | 
org.apache.hadoop.yarn.client.cli.TestYarnCLI |
|   | org.apache.hadoop.yarn.client.api.impl.TestAMRMClient |
|   | org.apache.hadoop.yarn.client.api.impl.TestYarnClient |
|   | org.apache.hadoop.yarn.client.api.impl.TestNMClient |
| JDK v1.7.0_95 Failed junit tests | hadoop.yarn.client.TestGetGroups |
|   | hadoop.yarn.client.api.impl.TestAMRMProxy |
| JDK v1.7.0_95 Timed out junit tests | 
org.apache.hadoop.yarn.client.cli.TestYarnCLI |
|   | 

[jira] [Updated] (YARN-5044) Add peak memory usage counter for each task

2016-05-05 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-5044:
---
Description: 
Need the same 

Each task has counters PHYSICAL_MEMORY_BYTES and VIRTUAL_MEMORY_BYTES, which 
are snapshots of memory usage of that task. They are not sufficient for users 
to understand peak memory usage by that task, e.g. in order to diagnose task 
failures, tune job parameters or change application design. This new feature 
will add two more counters for each task: PHYSICAL_MEMORY_BYTES_MAX and 
VIRTUAL_MEMORY_BYTES_MAX.


> Add peak memory usage counter for each task
> ---
>
> Key: YARN-5044
> URL: https://issues.apache.org/jira/browse/YARN-5044
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>
> Need the same 
> Each task has counters PHYSICAL_MEMORY_BYTES and VIRTUAL_MEMORY_BYTES, which 
> are snapshots of memory usage of that task. They are not sufficient for users 
> to understand peak memory usage by that task, e.g. in order to diagnose task 
> failures, tune job parameters or change application design. This new feature 
> will add two more counters for each task: PHYSICAL_MEMORY_BYTES_MAX and 
> VIRTUAL_MEMORY_BYTES_MAX.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5044) Add peak memory usage counter for each task

2016-05-05 Thread Yufei Gu (JIRA)
Yufei Gu created YARN-5044:
--

 Summary: Add peak memory usage counter for each task
 Key: YARN-5044
 URL: https://issues.apache.org/jira/browse/YARN-5044
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Reporter: Yufei Gu
Assignee: Yufei Gu






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5040) CPU Isolation with CGroups triggers kernel panics on Centos 7.1/7.2 when yarn.nodemanager.resource.percentage-physical-cpu-limit < 100

2016-05-05 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272587#comment-15272587
 ] 

Varun Vasudev commented on YARN-5040:
-

I think I can reproduce this on a VM. Assigning it to myself.

> CPU Isolation with CGroups triggers kernel panics on Centos 7.1/7.2 when 
> yarn.nodemanager.resource.percentage-physical-cpu-limit < 100
> --
>
> Key: YARN-5040
> URL: https://issues.apache.org/jira/browse/YARN-5040
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Sidharta Seethana
>
> /cc [~vvasudev]
> We have been running some benchmarks internally with resource isolation 
> enabled. We have consistently run into kernel panics when running a large job 
> (a large pi job, terasort). These kernel panics went away when we set 
> yarn.nodemanager.resource.percentage-physical-cpu-limit=100. Anything less 
> than 100 triggers different behavior in YARN's CPU resource handler, which 
> seems to cause these issues. Looking at the kernel crash dumps, the 
> backtraces were different - sometimes pointing to java processes, sometimes 
> not. 
> Kernel versions used: 3.10.0-229.14.1.el7.x86_64 and 
> 3.10.0-327.13.1.el7.x86_64.
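
The reported workaround, expressed as a yarn-site.xml snippet (this simply restores the default value for the property named in the report above):

{code}
<!-- Workaround reported above: leaving the limit at 100 (the default)
     avoids the strict-CPU cgroup path that appears to trigger the panics. -->
<property>
  <name>yarn.nodemanager.resource.percentage-physical-cpu-limit</name>
  <value>100</value>
</property>
{code}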



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5040) CPU Isolation with CGroups triggers kernel panics on Centos 7.1/7.2 when yarn.nodemanager.resource.percentage-physical-cpu-limit < 100

2016-05-05 Thread Varun Vasudev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev reassigned YARN-5040:
---

Assignee: Varun Vasudev

> CPU Isolation with CGroups triggers kernel panics on Centos 7.1/7.2 when 
> yarn.nodemanager.resource.percentage-physical-cpu-limit < 100
> --
>
> Key: YARN-5040
> URL: https://issues.apache.org/jira/browse/YARN-5040
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0
>Reporter: Sidharta Seethana
>Assignee: Varun Vasudev
>
> /cc [~vvasudev]
> We have been running some benchmarks internally with resource isolation 
> enabled. We have consistently run into kernel panics when running a large job 
> (a large pi job, terasort). These kernel panics went away when we set 
> yarn.nodemanager.resource.percentage-physical-cpu-limit=100. Anything less 
> than 100 triggers different behavior in YARN's CPU resource handler, which 
> seems to cause these issues. Looking at the kernel crash dumps, the 
> backtraces were different - sometimes pointing to java processes, sometimes 
> not. 
> Kernel versions used: 3.10.0-229.14.1.el7.x86_64 and 
> 3.10.0-327.13.1.el7.x86_64.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking

2016-05-05 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272554#comment-15272554
 ] 

Varun Vasudev edited comment on YARN-4676 at 5/5/16 4:03 PM:
-

[~rkanter], [~kasha], [~djp] - instead of storing the timeouts in a state 
store, we could also modify the RM-NM protocol to support a delayed shutdown. 
That way when the node is decommissioned gracefully, we tell the NM to shut 
down after the specified timeout. There'll have to be some logic to cancel a 
shutdown for handling re-commissioned nodes, but we won't need to worry about 
updating the RM state store with timeouts/timestamps. It also avoids the clock 
skew issue that Karthik mentioned above. Like Karthik and Robert mentioned, I'm 
fine with handling this in a follow-up JIRA as long as the command exits 
without doing anything if graceful decommission is specified and the cluster 
is set up with work preserving restart.


was (Author: vvasudev):
[~rkanter], [~kasha], [~djp] - instead of storing the timeouts in a state 
store, we could also modify the RM-NM protocol to support a delayed shutdown. 
That way when the node is decommissioned gracefull, we tell the NM to shutdown 
after the specified timeout. There'll have to some logic to cancel a shutdown 
for handling re-commissioned nodes but we won't need to worry about updating 
the RM state store with timeouts/timestamps. It also avoids the clock skew 
issue that Karthik mentioned above. Like Karthik and Robert mentioned, I'm fine 
with handling this in a follow up JIRA as long as the command exits without 
doing anything if graceful decommission is specified and the cluster is setup 
with work preserving restart.

> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> 
>
> Key: YARN-4676
> URL: https://issues.apache.org/jira/browse/YARN-4676
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Daniel Zhi
>Assignee: Daniel Zhi
>  Labels: features
> Attachments: GracefulDecommissionYarnNode.pdf, 
> GracefulDecommissionYarnNode.pdf, YARN-4676.004.patch, YARN-4676.005.patch, 
> YARN-4676.006.patch, YARN-4676.007.patch, YARN-4676.008.patch, 
> YARN-4676.009.patch, YARN-4676.010.patch, YARN-4676.011.patch, 
> YARN-4676.012.patch, YARN-4676.013.patch
>
>
> YARN-4676 implements an automatic, asynchronous and flexible mechanism to 
> gracefully decommission YARN nodes. After the user issues the refreshNodes 
> request, the ResourceManager automatically evaluates the status of all 
> affected nodes to kick off decommission or recommission actions. The RM 
> asynchronously tracks container and application status related to 
> DECOMMISSIONING nodes so it can decommission the nodes immediately after 
> they are ready to be decommissioned. Decommissioning timeouts at 
> individual-node granularity are supported and can be dynamically updated. 
> The mechanism naturally supports multiple independent graceful 
> decommissioning “sessions” where each one involves different sets of nodes 
> with different timeout settings. Such support is ideal and necessary for 
> graceful decommission requests issued by external cluster management 
> software instead of a human.
> DecommissioningNodeWatcher inside ResourceTrackingService tracks 
> DECOMMISSIONING nodes automatically and asynchronously after the 
> client/admin makes the graceful decommission request. It tracks 
> DECOMMISSIONING node status to decide when a node, after all running 
> containers on it have completed, will be transitioned into the 
> DECOMMISSIONED state. NodesListManager detects and handles include and 
> exclude list changes to kick off decommission or recommission as necessary.
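
For context, the admin flow this mechanism assumes, sketched with the rmadmin semantics discussed later in this thread (the timeout value is illustrative):

{noformat}
# add the host (optionally with a per-host timeout) to the exclude file,
# then trigger graceful decommission with an overall timeout in seconds
yarn rmadmin -refreshNodes -g 3600
{noformat}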



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking

2016-05-05 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272554#comment-15272554
 ] 

Varun Vasudev commented on YARN-4676:
-

[~rkanter], [~kasha], [~djp] - instead of storing the timeouts in a state 
store, we could also modify the RM-NM protocol to support a delayed shutdown. 
That way when the node is decommissioned gracefully, we tell the NM to shut 
down after the specified timeout. There'll have to be some logic to cancel a 
shutdown for handling re-commissioned nodes, but we won't need to worry about 
updating the RM state store with timeouts/timestamps. It also avoids the clock 
skew issue that Karthik mentioned above. Like Karthik and Robert mentioned, I'm 
fine with handling this in a follow-up JIRA as long as the command exits 
without doing anything if graceful decommission is specified and the cluster 
is set up with work preserving restart.

> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> 
>
> Key: YARN-4676
> URL: https://issues.apache.org/jira/browse/YARN-4676
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Daniel Zhi
>Assignee: Daniel Zhi
>  Labels: features
> Attachments: GracefulDecommissionYarnNode.pdf, 
> GracefulDecommissionYarnNode.pdf, YARN-4676.004.patch, YARN-4676.005.patch, 
> YARN-4676.006.patch, YARN-4676.007.patch, YARN-4676.008.patch, 
> YARN-4676.009.patch, YARN-4676.010.patch, YARN-4676.011.patch, 
> YARN-4676.012.patch, YARN-4676.013.patch
>
>
> YARN-4676 implements an automatic, asynchronous and flexible mechanism to 
> gracefully decommission YARN nodes. After the user issues the refreshNodes 
> request, the ResourceManager automatically evaluates the status of all 
> affected nodes to kick off decommission or recommission actions. The RM 
> asynchronously tracks container and application status related to 
> DECOMMISSIONING nodes so it can decommission the nodes immediately after 
> they are ready to be decommissioned. Decommissioning timeouts at 
> individual-node granularity are supported and can be dynamically updated. 
> The mechanism naturally supports multiple independent graceful 
> decommissioning “sessions” where each one involves different sets of nodes 
> with different timeout settings. Such support is ideal and necessary for 
> graceful decommission requests issued by external cluster management 
> software instead of a human.
> DecommissioningNodeWatcher inside ResourceTrackingService tracks 
> DECOMMISSIONING nodes automatically and asynchronously after the 
> client/admin makes the graceful decommission request. It tracks 
> DECOMMISSIONING node status to decide when a node, after all running 
> containers on it have completed, will be transitioned into the 
> DECOMMISSIONED state. NodesListManager detects and handles include and 
> exclude list changes to kick off decommission or recommission as necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking

2016-05-05 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272550#comment-15272550
 ] 

Varun Vasudev commented on YARN-4676:
-

Thanks for the patch [~danzhi]. My apologies for coming in late but I have some 
concerns about the patch and the approach.

The code to read the hostnames and timeouts in HostsFileReader is a little 
fragile and may lead to problems.

1.
{code}
+  // look ahead for optional timeout values
+  Integer timeout = null;
+  if (i < nodes.length - 1) {
+timeout = tryParseInteger(nodes[i+1]);
+  }
+  map.put(nodes[i], timeout);
+  // skip the timeout if exist
+  if (timeout != null) {
+i++;
+  }
{code}

This code assumes that the node names are non-numerical - this assumption is 
not correct. As per RFC 1123, you can have hostnames made up entirely of 
digits. It also looks like we decommission nodes based on hostname only, 
whereas it is possible to run multiple nodemanagers on a node - this is 
probably something we can revisit later.
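
One hedged way to make the format unambiguous, as an illustration only (an explicit host:timeout separator; this is not part of the patch):

{code}
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Illustrative parser: all-digit hostnames, which RFC 1123 permits, can
// never be mistaken for a timeout because the timeout must follow a ':'.
class ExcludeEntryParser {
  static Map.Entry<String, Integer> parse(String token) {
    int idx = token.lastIndexOf(':');
    if (idx < 0) {
      // -1 means no per-host timeout was specified
      return new SimpleImmutableEntry<String, Integer>(token, -1);
    }
    return new SimpleImmutableEntry<String, Integer>(
        token.substring(0, idx), Integer.valueOf(token.substring(idx + 1)));
  }
}
{code}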

2.
{code}
+  map.put(nodes[i], timeout);
{code}
{code}
+  private static Integer tryParseInteger(String str) {
+try{
+  int num = Integer.parseInt(str);
+  return num;
+} catch (Exception e) {
+  return null;
+}
+  }
{code}

Is it possible for us to use -1 instead of null to specify that a timeout 
wasn't specified?

3.
{code}
+  private static void prettyLogMap(
+  String type, Map<String, Integer> excludes, String filename) {
+if (excludes.size() == 0) {
+  return;
+}
+StringBuilder sb = new StringBuilder();
+for (Entry<String, Integer> n : excludes.entrySet()) {
+  if (n.getValue() != null) {
+sb.append(String.format("%n  %s : %d", n.getKey(), n.getValue()));
+  } else {
+sb.append(String.format("%n  %s", n.getKey()));
+  }
+}
+LOG.info("List of " + type + " hosts from " + filename + sb.toString());
+  }
{code}

Instead of %n, can we just print all the hosts on a single line so that we can 
use grep to filter out the lines?

4.
{code}
+  @Test
+  public void testHostFileReaderWithTimeout() throws Exception {
{code}

The test needs to be updated to include numeric hostnames.

5.
{code}
+
+@Override
+public Integer getDecommissioningTimeout() {
+  return null;
+}
+  /**
+   * Get the DecommissionTimeout.
+   *
+   * @return decommissionTimeout
+   */
+  public abstract Integer getDecommissionTimeout();
   }
{code}

Similar to above, can we use -1 instead of null?

6.
{code}
+  public static final String DECOMMISSIONING_DEFAULT_TIMEOUT_KEY =
+  RM_PREFIX + "decommissioning.default.timeout";
+  public static final int DEFAULT_DECOMMISSIONING_TIMEOUT = 3600;
{code}

Can you please rename DECOMMISSIONING_DEFAULT_TIMEOUT_KEY to 
RM_NODE_GRACEFUL_DECOMMISSION_TIMEOUT, "decommissioning.default.timeout" to 
"nodemanager-graceful-decommission-timeout-secs" and 
DEFAULT_DECOMMISSIONING_TIMEOUT to DEFAULT_RM_NODE_GRACEFUL_DECOMMISSION_TIMEOUT

7.
{code}
+  public static final String NM_EXIT_WAIT_MS = NM_PREFIX + "exit-wait.ms";
+  public static final long DEFAULT_NM_EXIT_WAIT_MS = 5000;
{code}

I saw your reasoning for this in your earlier comments, but I'm not convinced 
this should be in the YARN nodemanager. This seems like an issue with the EMR 
setup. The change adds a wait time for all shutdowns. Please remove it.

8.
{code}
+// Additional seconds to wait before forcefully decommission nodes.
+// This is usually not needed since RM enforces timeout automatically.
+final int gracePeriod = 20;
{code}
Can you explain why this is needed? And why 20 seconds for the grace period?

9.
{code}
+  if ("-g".equals(args[1]) || "-graceful".equals(args[1])) {
+if (args.length == 3) {
+  int timeout = validateTimeout(args[2]);
+  return refreshNodes(timeout);
+} else {
+  return refreshNodes(true);
+}
+  }
{code}

Just to clarify my understanding here -
yarn rmadmin -refreshNodes -g 1000 will decommission node gracefully up to a 
limit of 1000 seconds after which it will forcefully shut down the nodes
yarn rmadmin -refreshNodes -g -1 will gracefully shutdown the nodes with the 
timeout being the value of 
yarn.resourcemanager.node-graceful-decommission-timeout
yarn rmadmin -refreshNodes -g is the same as "yarn rmadmin -refreshNodes -g -1"
Is my understanding correct?

10.
{code}
+  @Override
+  public synchronized void setDecommissionTimeout(Integer timeout) {
+maybeInitBuilder();
+if (timeout != null) {
+  builder.setDecommissionTimeout(timeout);
+} else {
+  builder.clearDecommissionTimeout();
+}
+  }
+
+  @Override
+  public synchronized Integer getDecommissionTimeout() {
+RefreshNodesRequestProtoOrBuilder p = viaProto ? proto 

[jira] [Commented] (YARN-5000) [YARN-3368] App attempt page is not loading when timeline server is not started

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272526#comment-15272526
 ] 

Hadoop QA commented on YARN-5000:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 37s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 4m 23s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 15s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 19 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
35s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 7m 8s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:f38692c |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12802435/YARN-5000-YARN-3368.4.patch
 |
| JIRA Issue | YARN-5000 |
| Optional Tests |  asflicense  |
| uname | Linux 7750c6d0691d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-3368 / 2d617a5 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/11344/artifact/patchprocess/whitespace-tabs.txt
 |
| modules | C:  hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui   .  U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/11344/console |
| Powered by | Apache Yetus 0.2.0   http://yetus.apache.org |


This message was automatically generated.



> [YARN-3368] App attempt page is not loading when timeline server is not 
> started
> ---
>
> Key: YARN-5000
> URL: https://issues.apache.org/jira/browse/YARN-5000
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-5000.patch, 
> AppFinishedAndNoTimelineServer.png, AppRunningAndNoTimelineServer.png, 
> AppRunningAndNoTimelineServer_v2.png, YARN-5000-YARN-3368.1.patch, 
> YARN-5000-YARN-3368.2.patch, YARN-5000-YARN-3368.3.patch, 
> YARN-5000-YARN-3368.4.patch
>
>
> If the timeline server is not started, the app attempt page does not load.
> In the new web UI, the yarnContainer route is tightly coupled with both the 
> RM and the timeline server, and if one of the servers is not up, the page 
> will not load. If the timeline server is not up, container information from 
> the RM should be displayed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4390) Do surgical preemption based on reserved container in CapacityScheduler

2016-05-05 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272503#comment-15272503
 ] 

Eric Payne commented on YARN-4390:
--

Sorry, [~leftnoteasy], for the delay.

+1. LGTM (+)

> Do surgical preemption based on reserved container in CapacityScheduler
> ---
>
> Key: YARN-4390
> URL: https://issues.apache.org/jira/browse/YARN-4390
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Affects Versions: 3.0.0, 2.8.0, 2.7.3
>Reporter: Eric Payne
>Assignee: Wangda Tan
> Attachments: QueueNotHittingMax.jpg, YARN-4390-design.1.pdf, 
> YARN-4390-test-results.pdf, YARN-4390.1.patch, YARN-4390.2.patch, 
> YARN-4390.3.branch-2.patch, YARN-4390.3.patch, YARN-4390.4.patch, 
> YARN-4390.5.patch, YARN-4390.6.patch, YARN-4390.7.patch, YARN-4390.8.patch
>
>
> There are multiple reasons why preemption could unnecessarily preempt 
> containers. One is that an app could be requesting a large container (say 
> 8-GB), and the preemption monitor could conceivably preempt multiple 
> containers (say 8, 1-GB containers) in order to fill the large container 
> request. These smaller containers would then be rejected by the requesting AM 
> and potentially given right back to the preempted app.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5000) [YARN-3368] App attempt page is not loading when timeline server is not started

2016-05-05 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-5000:
--
Attachment: YARN-5000-YARN-3368.4.patch

Uploading a cleaner patch after changing LICENSE.txt and some error messages. 
[~leftnoteasy], kindly help to check the same.

> [YARN-3368] App attempt page is not loading when timeline server is not 
> started
> ---
>
> Key: YARN-5000
> URL: https://issues.apache.org/jira/browse/YARN-5000
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-5000.patch, 
> AppFinishedAndNoTimelineServer.png, AppRunningAndNoTimelineServer.png, 
> AppRunningAndNoTimelineServer_v2.png, YARN-5000-YARN-3368.1.patch, 
> YARN-5000-YARN-3368.2.patch, YARN-5000-YARN-3368.3.patch, 
> YARN-5000-YARN-3368.4.patch
>
>
> If the timeline server is not started, the app attempt page does not load.
> In the new web UI, the yarnContainer route is tightly coupled with both the 
> RM and the timeline server, and if one of the servers is not up, the page 
> will not load. If the timeline server is not up, container information from 
> the RM should be displayed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list

2016-05-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272455#comment-15272455
 ] 

Hudson commented on YARN-4311:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9722 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9722/])
YARN-4311. Removing nodes from include and exclude lists will not remove 
(jlowe: rev d0da13229cf692579c8c9db47a93f6c6255392c8)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/RMNodeWrapper.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMServerUtils.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesNodes.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockNodes.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/nodemanager/NodeInfo.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNode.java


> Removing nodes from include and exclude lists will not remove them from 
> decommissioned nodes list
> -
>
> Key: YARN-4311
> URL: https://issues.apache.org/jira/browse/YARN-4311
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: YARN-4311-branch-2.7.001.patch, 
> YARN-4311-branch-2.7.002.patch, YARN-4311-branch-2.7.003.patch, 
> YARN-4311-branch-2.7.004.patch, YARN-4311-v1.patch, YARN-4311-v10.patch, 
> YARN-4311-v11.patch, YARN-4311-v11.patch, YARN-4311-v12.patch, 
> YARN-4311-v13.patch, YARN-4311-v13.patch, YARN-4311-v14.patch, 
> YARN-4311-v15.patch, YARN-4311-v16.patch, YARN-4311-v17.patch, 
> YARN-4311-v18.patch, YARN-4311-v2.patch, YARN-4311-v3.patch, 
> YARN-4311-v4.patch, YARN-4311-v5.patch, YARN-4311-v6.patch, 
> YARN-4311-v7.patch, YARN-4311-v8.patch, YARN-4311-v9.patch
>
>
> In order to fully forget about a node, removing the node from the include 
> and exclude lists is not sufficient. The RM still lists it under 
> decommissioned nodes. The tricky part that [~jlowe] pointed out is the case 
> when include lists are not used; in that case we don't want the nodes to 
> fall off if they are not active.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5034) Failing tests after using try-with-resources

2016-05-05 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated YARN-5034:
---
Attachment: YARN-5034.12.patch

Change stop to close in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestHedgingRequestRMFailoverProxyProvider.java
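A minimal sketch of the pattern behind this change, assuming a 
MiniYARNCluster-style test resource (Hadoop's Service interface extends 
Closeable, so it fits try-with-resources); the constructor arguments are 
illustrative:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.MiniYARNCluster;

public class TryWithResourcesSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new YarnConfiguration();
    // try-with-resources calls close() in an implicit finally block, so an
    // explicit shutdown inside the block should also be close(), not stop();
    // close() delegates to stop() and is idempotent, so the second call is
    // a harmless no-op.
    try (MiniYARNCluster cluster = new MiniYARNCluster("sketch", 1, 1, 1)) {
      cluster.init(conf);
      cluster.start();
      // ... exercise the hedging RM failover proxy provider here ...
      cluster.close(); // was cluster.stop() before this change
    }
  }
}
{code}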

> Failing tests after using try-with-resources
> 
>
> Key: YARN-5034
> URL: https://issues.apache.org/jira/browse/YARN-5034
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: YARN-5034.01.patch, YARN-5034.02.patch, 
> YARN-5034.03.patch, YARN-5034.04.patch, YARN-5034.05.patch, 
> YARN-5034.06.patch, YARN-5034.07.patch, YARN-5034.08.patch, 
> YARN-5034.09.patch, YARN-5034.10.patch, YARN-5034.11.patch, YARN-5034.12.patch
>
>
> This JIRA is for following up on the failing tests. I am not able to 
> reproduce them locally on either Mac or CentOS.






[jira] [Commented] (YARN-4311) Removing nodes from include and exclude lists will not remove them from decommissioned nodes list

2016-05-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272341#comment-15272341
 ] 

Jason Lowe commented on YARN-4311:
--

+1 lgtm.  Committing this.


> Removing nodes from include and exclude lists will not remove them from 
> decommissioned nodes list
> -
>
> Key: YARN-4311
> URL: https://issues.apache.org/jira/browse/YARN-4311
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
> Attachments: YARN-4311-branch-2.7.001.patch, 
> YARN-4311-branch-2.7.002.patch, YARN-4311-branch-2.7.003.patch, 
> YARN-4311-branch-2.7.004.patch, YARN-4311-v1.patch, YARN-4311-v10.patch, 
> YARN-4311-v11.patch, YARN-4311-v11.patch, YARN-4311-v12.patch, 
> YARN-4311-v13.patch, YARN-4311-v13.patch, YARN-4311-v14.patch, 
> YARN-4311-v15.patch, YARN-4311-v16.patch, YARN-4311-v17.patch, 
> YARN-4311-v18.patch, YARN-4311-v2.patch, YARN-4311-v3.patch, 
> YARN-4311-v4.patch, YARN-4311-v5.patch, YARN-4311-v6.patch, 
> YARN-4311-v7.patch, YARN-4311-v8.patch, YARN-4311-v9.patch
>
>
> In order to fully forget about a node, removing it from the include and 
> exclude lists is not sufficient: the RM still lists it under decommissioned 
> nodes. The tricky part, which [~jlowe] pointed out, is the case where include 
> lists are not used; in that case we don't want the nodes to fall off the list 
> simply for being inactive.






[jira] [Commented] (YARN-5039) Applications ACCEPTED but not starting

2016-05-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272333#comment-15272333
 ] 

Jason Lowe commented on YARN-5039:
--

Yes, if the problem is indeed the same as that reported in YARN-4610 then 
leaving reservations-continue-looking enabled by default in 2.7.3 or later will 
be fine.
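For reference, a hedged sketch of the knob under discussion; the property name 
is the CapacityScheduler continue-looking setting from YARN-1769 as I recall 
it, not quoted from this thread:

{code}
import org.apache.hadoop.conf.Configuration;

public class ContinueLookingSketch {
  public static Configuration build() {
    Configuration conf = new Configuration();
    // With "continue looking" enabled (the default), the CapacityScheduler
    // keeps examining other nodes even while one node is held by a
    // reservation, which avoids the stall pattern in the logs below.
    conf.setBoolean(
        "yarn.scheduler.capacity.reservations-continue-look-all-nodes", true);
    return conf;
  }
}
{code}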


> Applications ACCEPTED but not starting
> --
>
> Key: YARN-5039
> URL: https://issues.apache.org/jira/browse/YARN-5039
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Miles Crawford
> Attachments: Screen Shot 2016-05-04 at 1.57.19 PM.png, Screen Shot 
> 2016-05-04 at 2.41.22 PM.png, resource-manager-application-starts.log.gz, 
> yarn-yarn-resourcemanager-ip-10-12-47-144.log.gz
>
>
> Often when we submit applications to an incompletely utilized cluster, they 
> sit in the ACCEPTED state, unable to start for no apparent reason.
> There are multiple nodes in the cluster with available resources, but the 
> resourcemanager logs show that scheduling is being skipped. The scheduling is 
> skipped because the application itself has reserved the node? I'm not sure 
> how to interpret this log output:
> {code}
> 2016-05-04 20:19:21,315 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Trying to fulfill reservation for application application_1462291866507_0025 on node: ip-10-12-43-54.us-west-2.compute.internal:8041
> 2016-05-04 20:19:21,316 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue (ResourceManager Event Processor): Reserved container <...> application=application_1462291866507_0025 resource=<...> queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<...>, usedCapacity=0.7126589, absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used=<memory:..., vCores:33> cluster=<...>
> 2016-05-04 20:19:21,316 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Skipping scheduling since node ip-10-12-43-54.us-west-2.compute.internal:8041 is reserved by application appattempt_1462291866507_0025_01
> 2016-05-04 20:19:22,232 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Trying to fulfill reservation for application application_1462291866507_0025 on node: ip-10-12-43-53.us-west-2.compute.internal:8041
> 2016-05-04 20:19:22,232 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue (ResourceManager Event Processor): Reserved container <...> application=application_1462291866507_0025 resource=<...> queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<...>, usedCapacity=0.7126589, absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used=<memory:..., vCores:33> cluster=<...>
> 2016-05-04 20:19:22,232 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Skipping scheduling since node ip-10-12-43-53.us-west-2.compute.internal:8041 is reserved by application appattempt_1462291866507_0025_01
> 2016-05-04 20:19:22,316 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Trying to fulfill reservation for application application_1462291866507_0025 on node: ip-10-12-43-54.us-west-2.compute.internal:8041
> 2016-05-04 20:19:22,316 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue (ResourceManager Event Processor): Reserved container <...> application=application_1462291866507_0025 resource=<...> queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<...>, usedCapacity=0.7126589, absoluteUsedCapacity=0.7126589, numApps=2, numContainers=33 usedCapacity=0.7126589 absoluteUsedCapacity=0.7126589 used=<memory:..., vCores:33> cluster=<...>
> 2016-05-04 20:19:22,316 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (ResourceManager Event Processor): Skipping scheduling since node ip-10-12-43-54.us-west-2.compute.internal:8041 is reserved by application appattempt_1462291866507_0025_01
> {code}




[jira] [Commented] (YARN-5034) Failing tests after using try-with-resources

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272329#comment-15272329
 ] 

Hadoop QA commented on YARN-5034:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
44s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
42s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
57s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 48s 
{color} | {color:green} hadoop-mapreduce-client-app in the patch passed with 
JDK v1.8.0_91. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 18s 
{color} | {color:green} hadoop-mapreduce-client-app in the patch passed with 
JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 29s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:cf2ee45 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12802402/YARN-5034.11.patch |
| JIRA Issue | YARN-5034 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux beb075e946df 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 72b0477 |
| Default Java | 1.7.0_95 |
| Multi-JDK versions |  

[jira] [Updated] (YARN-5034) Failing tests after using try-with-resources

2016-05-05 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated YARN-5034:
---
Attachment: YARN-5034.11.patch

11: 1st again

> Failing tests after using try-with-resources
> 
>
> Key: YARN-5034
> URL: https://issues.apache.org/jira/browse/YARN-5034
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: YARN-5034.01.patch, YARN-5034.02.patch, 
> YARN-5034.03.patch, YARN-5034.04.patch, YARN-5034.05.patch, 
> YARN-5034.06.patch, YARN-5034.07.patch, YARN-5034.08.patch, 
> YARN-5034.09.patch, YARN-5034.10.patch, YARN-5034.11.patch
>
>
> This JIRA is for following up on the failing tests. I am not able to 
> reproduce them locally on either Mac or CentOS.






[jira] [Updated] (YARN-5038) [YARN-3368] Application and Container pages shows wrong values when RM is stopped

2016-05-05 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-5038:
--
Summary: [YARN-3368] Application and Container pages shows wrong values 
when RM is stopped  (was: [YARN-3368] Applications and Container pages shows 
wrong values when RM is stopped)

> [YARN-3368] Application and Container pages shows wrong values when RM is 
> stopped
> -
>
> Key: YARN-5038
> URL: https://issues.apache.org/jira/browse/YARN-5038
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Sunil G
>Assignee: Sunil G
>
> Few minor issues to fix.
> - In Applications page, "Running Container" is shows as -1 when app is 
> finished.
> - In container page, "Finished Time" is showing 1970 as date by default.
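As a hedged illustration of the second symptom (not the actual web UI code): 
an unset finish time defaults to 0 ms since the epoch, which formats as a 
January 1970 date unless guarded:

{code}
import java.util.Date;

public class FinishTimeSketch {
  static String render(long finishTimeMs) {
    // 0 means "not finished yet"; show a placeholder instead of 1970.
    return finishTimeMs <= 0 ? "N/A" : new Date(finishTimeMs).toString();
  }

  public static void main(String[] args) {
    System.out.println(render(0L));             // N/A, not Jan 1 1970
    System.out.println(render(1462291866507L)); // a real finish time
  }
}
{code}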






[jira] [Commented] (YARN-5034) Failing tests after using try-with-resources

2016-05-05 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272249#comment-15272249
 ] 

Hadoop QA commented on YARN-5034:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
35s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
32s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s 
{color} | {color:green} trunk passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 13s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s 
{color} | {color:red} The patch has 3 line(s) that end in whitespace. Use git 
apply --whitespace=fix. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s 
{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 2s {color} 
| {color:red} hadoop-yarn-client in the patch failed with JDK v1.8.0_91. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 18s {color} 
| {color:red} hadoop-yarn-client in the patch failed with JDK v1.7.0_95. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 145m 28s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | hadoop.yarn.client.TestGetGroups |
|   | hadoop.yarn.client.api.impl.TestAMRMProxy |
| JDK v1.8.0_91 Timed out junit tests | 
org.apache.hadoop.yarn.client.cli.TestYarnCLI |
|   | org.apache.hadoop.yarn.client.api.impl.TestAMRMClient |
|   | org.apache.hadoop.yarn.client.api.impl.TestYarnClient |
|   | org.apache.hadoop.yarn.client.api.impl.TestNMClient |
| JDK v1.7.0_95 Failed junit tests | hadoop.yarn.client.TestGetGroups |
|   | hadoop.yarn.client.api.impl.TestAMRMProxy |
| JDK v1.7.0_95 Timed out junit tests | 
org.apache.hadoop.yarn.client.cli.TestYarnCLI |
|   | 

[jira] [Commented] (YARN-5023) TestAMRestart#testShouldNotCountFailureToMaxAttemptRetry random failure

2016-05-05 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272139#comment-15272139
 ] 

sandflee commented on YARN-5023:


I wrote a script to auto-run the test and reproduced it; see YARN-5043. Hope 
it helps. A rough sketch of such a driver follows.
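The attached script is not reproduced in this thread; purely as an 
illustration, a Java driver that reruns one test until it fails (the Maven 
invocation and test selector are assumptions):

{code}
import java.io.IOException;

public class RerunUntilFailure {
  public static void main(String[] args)
      throws IOException, InterruptedException {
    for (int run = 1; run <= 100; run++) {
      Process p = new ProcessBuilder("mvn", "test",
          "-Dtest=TestAMRestart#testRMAppAttemptFailuresValidityInterval")
          .inheritIO()
          .start();
      if (p.waitFor() != 0) {
        System.out.println("Reproduced the failure on run " + run);
        return;
      }
    }
    System.out.println("No failure in 100 runs");
  }
}
{code}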

> TestAMRestart#testShouldNotCountFailureToMaxAttemptRetry random failure
> ---
>
> Key: YARN-5023
> URL: https://issues.apache.org/jira/browse/YARN-5023
> Project: Hadoop YARN
>  Issue Type: Test
>Reporter: Bibin A Chundatt
>Assignee: sandflee
> Attachments: YARN-5023.01.patch
>
>
> https://builds.apache.org/job/PreCommit-YARN-Build/11296/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdk1.8.0_91.txt
> {noformat}
> Tests run: 10, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 96.482 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart
> testShouldNotCountFailureToMaxAttemptRetry(org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart)
>   Time elapsed: 56.467 sec  <<< FAILURE!
> java.lang.AssertionError: Attempt state is not correct (timeout). 
> expected:<...> but was:<...>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:266)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:225)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:207)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForAttemptScheduled(MockRM.java:955)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAM(MockRM.java:942)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.launchAndRegisterAM(MockRM.java:961)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForNewAMToLaunchAndRegister(MockRM.java:295)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart.testShouldNotCountFailureToMaxAttemptRetry(TestAMRestart.java:647)
> {noformat}
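The trace above ends inside MockRM#waitForState; as a hedged sketch (the names 
mirror MockRM, but the body is illustrative), the idiom is a bounded poll, and 
the "timeout" assertion fires when the attempt never reaches the expected 
state before the deadline:

{code}
import org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttempt;
import org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptState;

public class WaitForStateSketch {
  static RMAppAttemptState waitForState(RMAppAttempt attempt,
      RMAppAttemptState expected, long timeoutMs) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    // Poll with a short sleep; the caller asserts equality afterwards and
    // fails with "Attempt state is not correct (timeout)" on expiry.
    while (attempt.getAppAttemptState() != expected
        && System.currentTimeMillis() < deadline) {
      Thread.sleep(100);
    }
    return attempt.getAppAttemptState();
  }
}
{code}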






[jira] [Updated] (YARN-5034) Failing tests after using try-with-resources

2016-05-05 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated YARN-5034:
---
Attachment: YARN-5034.10.patch

10: Refactor testHedgingRequestProxyProvider

> Failing tests after using try-with-resources
> 
>
> Key: YARN-5034
> URL: https://issues.apache.org/jira/browse/YARN-5034
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 2.7.0
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Trivial
> Fix For: 2.7.0
>
> Attachments: YARN-5034.01.patch, YARN-5034.02.patch, 
> YARN-5034.03.patch, YARN-5034.04.patch, YARN-5034.05.patch, 
> YARN-5034.06.patch, YARN-5034.07.patch, YARN-5034.08.patch, 
> YARN-5034.09.patch, YARN-5034.10.patch
>
>
> This JIRA is for following up on the failing tests. I am not able to 
> reproduce them locally on either Mac or CentOS.






[jira] [Updated] (YARN-5043) TestAMRestart.testRMAppAttemptFailuresValidityInterval random fail

2016-05-05 Thread sandflee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sandflee updated YARN-5043:
---
Attachment: TestAMRestart-output.txt

> TestAMRestart.testRMAppAttemptFailuresValidityInterval random fail
> --
>
> Key: YARN-5043
> URL: https://issues.apache.org/jira/browse/YARN-5043
> Project: Hadoop YARN
>  Issue Type: Test
>Reporter: sandflee
> Attachments: TestAMRestart-output.txt
>
>
> {noformat}
> Test set: 
> org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart
> ---
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 31.558 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart
> testRMAppAttemptFailuresValidityInterval(org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart)
>   Time elapsed: 31.509 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<2> but was:<3>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.junit.Assert.assertEquals(Assert.java:542)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart.testRMAppAttemptFailuresValidityInterval(TestAMRestart.java:913)
> {noformat}
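For context on the 2-vs-3 flakiness, a hedged sketch of the API this test 
exercises (the values are illustrative): only AM failures inside the validity 
window count toward max attempts, so timing jitter around the window boundary 
changes the counted total:

{code}
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.util.Records;

public class ValidityIntervalSketch {
  static ApplicationSubmissionContext build() {
    ApplicationSubmissionContext ctx =
        Records.newRecord(ApplicationSubmissionContext.class);
    ctx.setMaxAppAttempts(2);
    // AM failures older than this window are forgotten when counting
    // attempts; a failure landing just outside it shifts the total.
    ctx.setAttemptFailuresValidityInterval(10_000L); // 10 s, illustrative
    return ctx;
  }
}
{code}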






[jira] [Created] (YARN-5043) TestAMRestart.testRMAppAttemptFailuresValidityInterval random fail

2016-05-05 Thread sandflee (JIRA)
sandflee created YARN-5043:
--

 Summary: TestAMRestart.testRMAppAttemptFailuresValidityInterval 
random fail
 Key: YARN-5043
 URL: https://issues.apache.org/jira/browse/YARN-5043
 Project: Hadoop YARN
  Issue Type: Test
Reporter: sandflee


{noformat}
Test set: 
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart
---
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 31.558 sec <<< 
FAILURE! - in 
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart
testRMAppAttemptFailuresValidityInterval(org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart)
  Time elapsed: 31.509 sec  <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<3>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart.testRMAppAttemptFailuresValidityInterval(TestAMRestart.java:913)

{noformat}





