[jira] [Updated] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-12 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9052:
-
Attachment: YARN-9052.testlogs.002.patch

> Replace all MockRM submit method definitions with a builder
> ---
>
> Key: YARN-9052
> URL: https://issues.apache.org/jira/browse/YARN-9052
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: YARN-9052.001.patch, YARN-9052.002.patch, 
> YARN-9052.003.patch, YARN-9052.testlogs.002.patch, 
> YARN-9052.testlogs.002.patch, YARN-9052.testlogs.patch
>
>
> MockRM has 31 definitions of submitApp, most of them with more than an 
> acceptable number of parameters, ranging from 2 to as many as 22, which 
> makes the code very hard to read.
> On top of the readability problem, it's very hard to follow which RMApp will be 
> produced for tests, as they often pass a lot of empty / null values as parameters.
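To illustrate the direction described above, here is a minimal sketch of a submission builder. All class, field, and method names below are invented for this example and are not the actual API introduced by the attached patches; only the idea (named, defaulted parameters instead of 20-argument overloads) is taken from the description.

{code:java}
// Hypothetical sketch only -- not the MockRM API from the attached patches.
public final class AppSubmissionParams {
  private final int memoryMb;
  private final String name;
  private final String user;
  private final String queue;
  private final boolean unmanaged;

  private AppSubmissionParams(Builder b) {
    this.memoryMb = b.memoryMb;
    this.name = b.name;
    this.user = b.user;
    this.queue = b.queue;
    this.unmanaged = b.unmanaged;
  }

  public static final class Builder {
    // Sensible defaults replace the long runs of null / empty arguments in tests.
    private int memoryMb = 1024;
    private String name = "app";
    private String user = "user";
    private String queue = "default";
    private boolean unmanaged = false;

    public Builder memoryMb(int memoryMb) { this.memoryMb = memoryMb; return this; }
    public Builder name(String name) { this.name = name; return this; }
    public Builder user(String user) { this.user = user; return this; }
    public Builder queue(String queue) { this.queue = queue; return this; }
    public Builder unmanaged(boolean unmanaged) { this.unmanaged = unmanaged; return this; }
    public AppSubmissionParams build() { return new AppSubmissionParams(this); }
  }
}

// Usage in a test: only the fields relevant to the test are set explicitly, e.g.
// rm.submitApp(new AppSubmissionParams.Builder().memoryMb(2048).queue("root.test").build());
{code}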






[jira] [Created] (YARN-9971) YARN Native Service HttpProbe logs THIS_HOST in error messages

2019-11-12 Thread Prabhu Joseph (Jira)
Prabhu Joseph created YARN-9971:
---

 Summary: YARN Native Service HttpProbe logs THIS_HOST in error 
messages
 Key: YARN-9971
 URL: https://issues.apache.org/jira/browse/YARN-9971
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-native-services
Affects Versions: 3.3.0
Reporter: Prabhu Joseph
Assignee: Tarun Parimi


YARN Native Service HttpProbe logs THIS_HOST in error messages. The log statement 
uses the raw URL template instead of the URL string in which the ${THIS_HOST} 
placeholder has already been replaced.

{code:java}
2019-11-12 19:25:47,317 [pool-7-thread-1] INFO  probe.HttpProbe - Probe 
http://${THIS_HOST}:18010/master-status failed for IP 172.27.75.198: 
java.net.ConnectException: Connection refused (Connection refused)
{code}
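In essence, the fix is to log the URL after the ${THIS_HOST} token has been substituted with the container IP, instead of the raw template. A minimal, self-contained sketch under that assumption; the method and variable names are illustrative and are not the actual HttpProbe code:

{code:java}
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Illustrative sketch only -- not the actual HttpProbe implementation.
final class ProbeUrlExample {
  /** Resolve the ${THIS_HOST} placeholder first and log the resolved URL on failure. */
  static boolean probe(String urlTemplate, String containerIp, int timeoutMs) {
    String resolved = urlTemplate.replace("${THIS_HOST}", containerIp);
    try {
      HttpURLConnection conn = (HttpURLConnection) new URL(resolved).openConnection();
      conn.setConnectTimeout(timeoutMs);
      conn.connect();
      int code = conn.getResponseCode();
      conn.disconnect();
      return code >= 200 && code < 400;
    } catch (IOException e) {
      // Log the resolved URL, not the template, so the failing endpoint is visible.
      System.err.println("Probe " + resolved + " failed for IP " + containerIp + ": " + e);
      return false;
    }
  }
}
{code}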








[jira] [Commented] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-12 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972752#comment-16972752
 ] 

Hadoop QA commented on YARN-9052:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 31s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 23 new + 35 unchanged - 0 fixed = 58 total (was 35) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 7 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 41s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}100m 31s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}156m 32s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9052 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12985645/YARN-9052.testlogs.002.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ed75b49d463a 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / b83b9ab |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/25145/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/25145/artifact/out/whitespace-eol.txt
 |
| unit | 

[jira] [Commented] (YARN-9923) Detect missing Docker binary or not running Docker daemon

2019-11-12 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972705#comment-16972705
 ] 

Hadoop QA commented on YARN-9923:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
46s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 26 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
7s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  8m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
24m 47s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 
13s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 48s{color} | {color:orange} root: The patch generated 15 new + 603 unchanged 
- 44 fixed = 618 total (was 647) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  8m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  1s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m  
7s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  5m 
30s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
50s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 13s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
54s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  

[jira] [Comment Edited] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-12 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972669#comment-16972669
 ] 

Szilard Nemeth edited comment on YARN-9052 at 11/12/19 6:01 PM:


Thanks [~sunilg]! 
Right now, I'm in the process of adding logging for all parameters so that I 
can be confident my change won't break anything.
The main patch will also contain the same logging, so if something differs 
in the test results, I can see which field was the culprit.
Do you know if I can download the test logs for all test cases from Jenkins?


was (Author: snemeth):
Thanks [~sunilg]! 
Right now, I'm in the process of adding logging for all parameters so that I 
can be confident my change won't break anything.
The main patch will also contain the same logging, so if something differs 
in the test results, I can see which field was the culprit.

> Replace all MockRM submit method definitions with a builder
> ---
>
> Key: YARN-9052
> URL: https://issues.apache.org/jira/browse/YARN-9052
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: YARN-9052.001.patch, YARN-9052.002.patch, 
> YARN-9052.003.patch, YARN-9052.testlogs.002.patch, YARN-9052.testlogs.patch
>
>
> MockRM has 31 definitions of submitApp, most of them with more than an 
> acceptable number of parameters, ranging from 2 to as many as 22, which 
> makes the code very hard to read.
> On top of the readability problem, it's very hard to follow which RMApp will be 
> produced for tests, as they often pass a lot of empty / null values as parameters.






[jira] [Commented] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-12 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972669#comment-16972669
 ] 

Szilard Nemeth commented on YARN-9052:
--

Thanks [~sunilg]! 
Right now, I'm in the process of adding logging for all parameters so that I 
can be confident my change won't break anything.
The main patch will also contain the same logging, so if something differs 
in the test results, I can see which field was the culprit.

> Replace all MockRM submit method definitions with a builder
> ---
>
> Key: YARN-9052
> URL: https://issues.apache.org/jira/browse/YARN-9052
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: YARN-9052.001.patch, YARN-9052.002.patch, 
> YARN-9052.003.patch, YARN-9052.testlogs.002.patch, YARN-9052.testlogs.patch
>
>
> MockRM has 31 definitions of submitApp, most of them with more than an 
> acceptable number of parameters, ranging from 2 to as many as 22, which 
> makes the code very hard to read.
> On top of the readability problem, it's very hard to follow which RMApp will be 
> produced for tests, as they often pass a lot of empty / null values as parameters.






[jira] [Updated] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-12 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9052:
-
Attachment: YARN-9052.testlogs.002.patch

> Replace all MockRM submit method definitions with a builder
> ---
>
> Key: YARN-9052
> URL: https://issues.apache.org/jira/browse/YARN-9052
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: YARN-9052.001.patch, YARN-9052.002.patch, 
> YARN-9052.003.patch, YARN-9052.testlogs.002.patch, YARN-9052.testlogs.patch
>
>
> MockRM has 31 definitions of submitApp, most of them with more than an 
> acceptable number of parameters, ranging from 2 to as many as 22, which 
> makes the code very hard to read.
> On top of the readability problem, it's very hard to follow which RMApp will be 
> produced for tests, as they often pass a lot of empty / null values as parameters.






[jira] [Commented] (YARN-9537) Add configuration to disable AM preemption

2019-11-12 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972667#comment-16972667
 ] 

Hudson commented on YARN-9537:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17632 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17632/])
YARN-9537. Add configuration to disable AM preemption. Contributed by (yufei: 
rev b83b9ab41874646e92eb28b7f9153eaba858f4d0)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFSAppAttempt.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairSchedulerPreemption.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairSchedulerConfiguration.java


> Add configuration to disable AM preemption
> --
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.2.0, 3.1.2
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9537-002.patch, YARN-9537.001.patch, 
> YARN-9537.003.patch, YARN-9537.004.patch, YARN-9537.005.patch, 
> YARN-9537.006.patch
>
>
> In this issue, I will add a configuration option to support disabling AM preemption.
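For illustration, the change boils down to a scheduler-level switch that is consulted before an AM container is considered preemptable. The property key and method below are assumptions made for this sketch and may not match the committed patch exactly:

{code:java}
import org.apache.hadoop.conf.Configuration;

// Hedged sketch: a configuration switch guarding AM-container preemption.
// The property key and all surrounding names are assumptions for illustration.
final class AmPreemptionExample {
  static final String AM_PREEMPTION_KEY = "yarn.scheduler.fair.am.preemption"; // assumed key
  static final boolean AM_PREEMPTION_DEFAULT = true;

  static boolean canBePreempted(Configuration conf, boolean isAmContainer) {
    boolean amPreemptionEnabled = conf.getBoolean(AM_PREEMPTION_KEY, AM_PREEMPTION_DEFAULT);
    if (isAmContainer && !amPreemptionEnabled) {
      return false; // never preempt the AM container when the switch is off
    }
    return true;    // otherwise the normal preemption rules apply
  }
}
{code}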






[jira] [Updated] (YARN-9537) Add configuration to disable AM preemption

2019-11-12 Thread Yufei Gu (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-9537:
---
Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed

> Add configuration to disable AM preemption
> --
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.2.0, 3.1.2
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9537-002.patch, YARN-9537.001.patch, 
> YARN-9537.003.patch, YARN-9537.004.patch, YARN-9537.005.patch, 
> YARN-9537.006.patch
>
>
> In this issue, I will add a configuration option to support disabling AM preemption.






[jira] [Commented] (YARN-9537) Add configuration to disable AM preemption

2019-11-12 Thread Yufei Gu (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972654#comment-16972654
 ] 

Yufei Gu commented on YARN-9537:


Committed to trunk. Thanks for the contribution, [~cane]. Thanks for the 
review, [~adam.antal] and [~snemeth]. 

> Add configuration to disable AM preemption
> --
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 3.2.0, 3.1.2
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
> Attachments: YARN-9537-002.patch, YARN-9537.001.patch, 
> YARN-9537.003.patch, YARN-9537.004.patch, YARN-9537.005.patch, 
> YARN-9537.006.patch
>
>
> In this issue, I will add a configuration option to support disabling AM preemption.






[jira] [Commented] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-12 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972612#comment-16972612
 ] 

Hadoop QA commented on YARN-9052:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
42s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 10s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 7 new + 35 unchanged - 0 fixed = 42 total (was 35) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 5 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  6s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m 23s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}117m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerNodeLabelUpdate
 |
|   | hadoop.yarn.server.resourcemanager.TestNodeBlacklistingOnAMFailures |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestSchedulingRequestContainerAllocationAsync
 |
|   | hadoop.yarn.server.resourcemanager.TestLeaderElectorService |
|   | 
hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesSchedulerActivities |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestSchedulingRequestContainerAllocation
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationPriority |
|   | hadoop.yarn.server.resourcemanager.TestResourceTrackerService |
|   | hadoop.yarn.server.resourcemanager.volume.csi.TestVolumeProcessor |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimitsByPartition
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerAutoQueueCreation
 

[jira] [Comment Edited] (YARN-9964) Queue metrics turn negative when relabeling a node with running containers to default partition

2019-11-12 Thread Manikandan R (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972588#comment-16972588
 ] 

Manikandan R edited comment on YARN-9964 at 11/12/19 4:34 PM:
--

[~jhung] Yes, a similar issue has been captured in YARN-9767 (refer to item #1 there for 
more details). Converting this to a sub-task of YARN-6492 for ease of tracking, as 
discussed in 
https://issues.apache.org/jira/browse/YARN-6492?focusedCommentId=16905471=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16905471


was (Author: maniraj...@gmail.com):
[~jhung] Yes, a similar issue has been captured in YARN-9767 (refer to item #1 there for 
more details).

> Queue metrics turn negative when relabeling a node with running containers to 
> default partition 
> 
>
> Key: YARN-9964
> URL: https://issues.apache.org/jira/browse/YARN-9964
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Priority: Major
>
> YARN-6467 changed the queue metrics logic to update certain metrics only when they 
> belong to the default partition. But if an app runs containers on a labeled node, 
> and the node is then moved to the default partition before the container is 
> released, the container's resources were never added to the queue's allocated 
> resources, yet they are subtracted at release time.
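A simplified, runnable illustration of the asymmetry described above; the class and method names are invented for this sketch and do not reflect the actual QueueMetrics code:

{code:java}
// Hedged sketch: metrics are only updated for the default partition, but the
// partition is evaluated at allocation time and again at release time. If the
// node is relabeled in between, the decrement has no matching increment.
final class PartitionedQueueMetricsExample {
  static final String DEFAULT_PARTITION = "";
  private long allocatedMB = 0;

  void onAllocate(String nodePartitionAtAllocation, long mb) {
    if (DEFAULT_PARTITION.equals(nodePartitionAtAllocation)) {
      allocatedMB += mb; // skipped while the node carries a label
    }
  }

  void onRelease(String nodePartitionAtRelease, long mb) {
    if (DEFAULT_PARTITION.equals(nodePartitionAtRelease)) {
      allocatedMB -= mb; // applied if the node is in the default partition *now*
    }
  }

  public static void main(String[] args) {
    PartitionedQueueMetricsExample m = new PartitionedQueueMetricsExample();
    m.onAllocate("gpu", 4096); // allocated on a node labeled "gpu": no increment
    m.onRelease("", 4096);     // released after the node moved to default: decrement only
    System.out.println("allocatedMB = " + m.allocatedMB); // prints -4096
  }
}
{code}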






[jira] [Updated] (YARN-9964) Queue metrics turn negative when relabeling a node with running containers to default partition

2019-11-12 Thread Manikandan R (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikandan R updated YARN-9964:
---
Parent: YARN-6492
Issue Type: Sub-task  (was: Bug)

> Queue metrics turn negative when relabeling a node with running containers to 
> default partition 
> 
>
> Key: YARN-9964
> URL: https://issues.apache.org/jira/browse/YARN-9964
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Priority: Major
>
> YARN-6467 changed the queue metrics logic to update certain metrics only when they 
> belong to the default partition. But if an app runs containers on a labeled node, 
> and the node is then moved to the default partition before the container is 
> released, the container's resources were never added to the queue's allocated 
> resources, yet they are subtracted at release time.






[jira] [Commented] (YARN-9964) Queue metrics turn negative when relabeling a node with running containers to default partition

2019-11-12 Thread Manikandan R (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972588#comment-16972588
 ] 

Manikandan R commented on YARN-9964:


[~jhung] Yes, a similar issue has been captured in YARN-9767 (refer to item #1 there for 
more details).

> Queue metrics turn negative when relabeling a node with running containers to 
> default partition 
> 
>
> Key: YARN-9964
> URL: https://issues.apache.org/jira/browse/YARN-9964
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Hung
>Priority: Major
>
> YARN-6467 changed the queue metrics logic to update certain metrics only when they 
> belong to the default partition. But if an app runs containers on a labeled node, 
> and the node is then moved to the default partition before the container is 
> released, the container's resources were never added to the queue's allocated 
> resources, yet they are subtracted at release time.






[jira] [Commented] (YARN-9768) RM Renew Delegation token thread should timeout and retry

2019-11-12 Thread Manikandan R (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972585#comment-16972585
 ] 

Manikandan R commented on YARN-9768:


Thank you [~inigoiri] for reviews.

[~bibinchundatt] Can you please take a look?

> RM Renew Delegation token thread should timeout and retry
> -
>
> Key: YARN-9768
> URL: https://issues.apache.org/jira/browse/YARN-9768
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: CR Hota
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9768.001.patch, YARN-9768.002.patch, 
> YARN-9768.003.patch, YARN-9768.004.patch, YARN-9768.005.patch, 
> YARN-9768.006.patch, YARN-9768.007.patch, YARN-9768.008.patch
>
>
> The delegation token renewer thread in the RM (DelegationTokenRenewer.java) renews 
> the HDFS tokens it receives to check their validity and expiration time.
> This call is made to an underlying HDFS NN or Router node (which exposes the same 
> APIs as the HDFS NN). If one of those nodes is bad and the renew call gets stuck, the 
> thread remains stuck indefinitely. The thread should ideally time out the 
> renewToken call and retry from the client's perspective.
>  
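One common way to implement the suggested timeout is to run the renew call on a separate executor and bound the wait on the returned future, retrying on expiry. The sketch below follows that pattern; it is an illustration, not the DelegationTokenRenewer code itself:

{code:java}
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hedged sketch: bound a potentially hanging renew call with a timeout, then retry.
final class TimedRenewExample {
  private static final ExecutorService RENEW_POOL = Executors.newCachedThreadPool();

  static long renewWithTimeout(Callable<Long> renewCall, long timeoutSec, int maxRetries)
      throws Exception {
    for (int attempt = 1; attempt <= maxRetries; attempt++) {
      Future<Long> future = RENEW_POOL.submit(renewCall);
      try {
        return future.get(timeoutSec, TimeUnit.SECONDS); // the new expiration time
      } catch (TimeoutException e) {
        future.cancel(true); // interrupt the stuck renew attempt and try again
      }
    }
    throw new TimeoutException("Token renewal timed out after " + maxRetries + " attempts");
  }
}
{code}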






[jira] [Commented] (YARN-9868) Validate %primary_group queue in CS queue manager

2019-11-12 Thread Manikandan R (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972582#comment-16972582
 ] 

Manikandan R commented on YARN-9868:


[~snemeth] Can you please trigger jenkins and review the patch?

> Validate %primary_group queue in CS queue manager
> -
>
> Key: YARN-9868
> URL: https://issues.apache.org/jira/browse/YARN-9868
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9868.001.patch, YARN-9868.002.patch
>
>
> As part of the %secondary_group mapping, we ensure that the queue resolved from 
> %secondary_group while processing the queue mapping exists, using CSQueueManager. 
> Similarly, we need to do the same for %primary_group.
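Conceptually, the validation is a lookup of the resolved %primary_group queue before the mapping is applied, mirroring the existing %secondary_group handling. The sketch below uses an invented QueueLookup interface in place of the real CSQueueManager API:

{code:java}
import java.io.IOException;

// Hedged sketch: reject a %primary_group mapping whose resolved queue does not exist.
// QueueLookup is an invented stand-in for the capacity scheduler's queue manager.
final class PrimaryGroupMappingExample {
  interface QueueLookup {
    boolean queueExists(String queueName);
  }

  static String resolvePrimaryGroupQueue(String user, String primaryGroup, QueueLookup queues)
      throws IOException {
    if (!queues.queueExists(primaryGroup)) {
      // Fail fast instead of silently placing the application in a non-existent queue.
      throw new IOException("mapping resolves to invalid or non-existent queue '"
          + primaryGroup + "' for user " + user);
    }
    return primaryGroup;
  }
}
{code}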






[jira] [Commented] (YARN-9865) Capacity scheduler: add support for combined %user + %secondary_group mapping

2019-11-12 Thread Manikandan R (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972581#comment-16972581
 ] 

Manikandan R commented on YARN-9865:


Thanks [~snemeth] and [~pbacsko].

Created YARN-9969  and YARN-9970 to address the review comments.

 

> Capacity scheduler: add support for combined %user + %secondary_group mapping
> -
>
> Key: YARN-9865
> URL: https://issues.apache.org/jira/browse/YARN-9865
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9865-005.patch, YARN-9865.001.patch, 
> YARN-9865.002.patch, YARN-9865.003.patch, YARN-9865.004.patch
>
>
> Similar to YARN-9841, but for the secondary group.






[jira] [Updated] (YARN-9970) Refactor TestUserGroupMappingPlacementRule#verifyQueueMapping

2019-11-12 Thread Manikandan R (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikandan R updated YARN-9970:
---
Description: Scope of this Jira is to refactor 
TestUserGroupMappingPlacementRule#verifyQueueMapping and QueueMapping class as 
discussed in 
https://issues.apache.org/jira/browse/YARN-9865?focusedCommentId=16971482=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16971482
  (was: Scope of this Jira is to refactor 
https://issues.apache.org/jira/browse/YARN-9865?focusedCommentId=16971482=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16971482)

> Refactor TestUserGroupMappingPlacementRule#verifyQueueMapping
> -
>
> Key: YARN-9970
> URL: https://issues.apache.org/jira/browse/YARN-9970
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
>
> Scope of this Jira is to refactor 
> TestUserGroupMappingPlacementRule#verifyQueueMapping and QueueMapping class 
> as discussed in 
> https://issues.apache.org/jira/browse/YARN-9865?focusedCommentId=16971482=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16971482






[jira] [Created] (YARN-9970) Refactor TestUserGroupMappingPlacementRule#verifyQueueMapping

2019-11-12 Thread Manikandan R (Jira)
Manikandan R created YARN-9970:
--

 Summary: Refactor 
TestUserGroupMappingPlacementRule#verifyQueueMapping
 Key: YARN-9970
 URL: https://issues.apache.org/jira/browse/YARN-9970
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Manikandan R
Assignee: Manikandan R









[jira] [Updated] (YARN-9970) Refactor TestUserGroupMappingPlacementRule#verifyQueueMapping

2019-11-12 Thread Manikandan R (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikandan R updated YARN-9970:
---
Description: Scope of this Jira is to refactor 
https://issues.apache.org/jira/browse/YARN-9865?focusedCommentId=16971482=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16971482

> Refactor TestUserGroupMappingPlacementRule#verifyQueueMapping
> -
>
> Key: YARN-9970
> URL: https://issues.apache.org/jira/browse/YARN-9970
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
>
> Scope of this Jira is to refactor 
> https://issues.apache.org/jira/browse/YARN-9865?focusedCommentId=16971482=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16971482






[jira] [Updated] (YARN-9969) Improve yarn.scheduler.capacity.queue-mappings documentation

2019-11-12 Thread Manikandan R (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikandan R updated YARN-9969:
---
Description: As discussed in 
https://issues.apache.org/jira/browse/YARN-9865?focusedCommentId=16971482=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16971482,
 scope of this Jira is to improve the documentation of yarn.scheduler.capacity.queue-mappings in 
CapacityScheduler.md.  (was: As discussed in 
https://issues.apache.org/jira/browse/YARN-9865?focusedCommentId=16971482=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16971482,
 scope of this Jira is to improve the )

> Improve yarn.scheduler.capacity.queue-mappings documentation
> 
>
> Key: YARN-9969
> URL: https://issues.apache.org/jira/browse/YARN-9969
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
>
> As discussed in 
> https://issues.apache.org/jira/browse/YARN-9865?focusedCommentId=16971482=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16971482,
>  scope of this Jira is to improve the documentation of yarn.scheduler.capacity.queue-mappings 
> in CapacityScheduler.md.






[jira] [Updated] (YARN-9969) Improve yarn.scheduler.capacity.queue-mappings documentation

2019-11-12 Thread Manikandan R (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikandan R updated YARN-9969:
---
Description: As discussed in 
https://issues.apache.org/jira/browse/YARN-9865?focusedCommentId=16971482=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16971482,
 scope of this Jira is to improve the 

> Improve yarn.scheduler.capacity.queue-mappings documentation
> 
>
> Key: YARN-9969
> URL: https://issues.apache.org/jira/browse/YARN-9969
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
>
> As discussed in 
> https://issues.apache.org/jira/browse/YARN-9865?focusedCommentId=16971482=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16971482,
>  scope of this Jira is to improve the 






[jira] [Commented] (YARN-9912) Support u:user2:%secondary_group queue mapping

2019-11-12 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972575#comment-16972575
 ] 

Hadoop QA commented on YARN-9912:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 12s{color} 
| {color:red} YARN-9912 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-9912 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12985242/YARN-9912.002.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25144/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> Support u:user2:%secondary_group queue mapping
> --
>
> Key: YARN-9912
> URL: https://issues.apache.org/jira/browse/YARN-9912
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9912.001.patch, YARN-9912.002.patch
>
>
> Similar to u:user2:%primary_group mapping, add support for 
> u:user2:%secondary_group queue mapping as well.






[jira] [Created] (YARN-9969) Improve yarn.scheduler.capacity.queue-mappings documentation

2019-11-12 Thread Manikandan R (Jira)
Manikandan R created YARN-9969:
--

 Summary: Improve yarn.scheduler.capacity.queue-mappings 
documentation
 Key: YARN-9969
 URL: https://issues.apache.org/jira/browse/YARN-9969
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Manikandan R
Assignee: Manikandan R









[jira] [Commented] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-12 Thread Sunil G (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972552#comment-16972552
 ] 

Sunil G commented on YARN-9052:
---

Really appreciate [~snemeth]'s efforts here in cleaning up MockRM with a 
builder. Kudos!

I will try to help with reviews etc. on this one.

cc [~leftnoteasy] [~cheersyang] [~rohithsharmaks]

> Replace all MockRM submit method definitions with a builder
> ---
>
> Key: YARN-9052
> URL: https://issues.apache.org/jira/browse/YARN-9052
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: YARN-9052.001.patch, YARN-9052.002.patch, 
> YARN-9052.003.patch, YARN-9052.testlogs.patch
>
>
> MockRM has 31 definitions of submitApp, most of them with more than an 
> acceptable number of parameters, ranging from 2 to as many as 22, which 
> makes the code very hard to read.
> On top of the readability problem, it's very hard to follow which RMApp will be 
> produced for tests, as they often pass a lot of empty / null values as parameters.






[jira] [Updated] (YARN-9052) Replace all MockRM submit method definitions with a builder

2019-11-12 Thread Szilard Nemeth (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9052:
-
Attachment: YARN-9052.testlogs.patch

> Replace all MockRM submit method definitions with a builder
> ---
>
> Key: YARN-9052
> URL: https://issues.apache.org/jira/browse/YARN-9052
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Szilard Nemeth
>Priority: Minor
> Attachments: YARN-9052.001.patch, YARN-9052.002.patch, 
> YARN-9052.003.patch, YARN-9052.testlogs.patch
>
>
> MockRM has 31 definitions of submitApp, most of them with more than an 
> acceptable number of parameters, ranging from 2 to as many as 22, which 
> makes the code very hard to read.
> On top of the readability problem, it's very hard to follow which RMApp will be 
> produced for tests, as they often pass a lot of empty / null values as parameters.






[jira] [Updated] (YARN-9923) Detect missing Docker binary or not running Docker daemon

2019-11-12 Thread Adam Antal (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Antal updated YARN-9923:
-
Attachment: YARN-9923.003.patch

> Detect missing Docker binary or not running Docker daemon
> -
>
> Key: YARN-9923
> URL: https://issues.apache.org/jira/browse/YARN-9923
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-9923.001.patch, YARN-9923.002.patch, 
> YARN-9923.003.patch
>
>
> Currently, if a NodeManager is enabled to allocate Docker containers but the 
> specified binary (docker.binary in container-executor.cfg) is missing, the 
> container allocation fails with the following error message:
> {noformat}
> Container launch fails
> Exit code: 29
> Exception message: Launch container failed
> Shell error output: sh: : No 
> such file or directory
> Could not inspect docker network to get type /usr/bin/docker network inspect 
> host --format='{{.Driver}}'.
> Error constructing docker command, docker error code=-1, error 
> message='Unknown error'
> {noformat}
> I suggest adding a property, say "yarn.nodemanager.runtime.linux.docker.check", 
> with the following options:
> - STARTUP: with this option the NodeManager does not start if the Docker 
> binaries are missing or the Docker daemon is not running (the exception is 
> considered FATAL during startup)
> - RUNTIME: gives a more detailed/user-friendly exception on the 
> NodeManager's side (NM logs) if the Docker binaries are missing or the daemon is 
> not working. This also prevents further Docker container allocation as 
> long as the binaries do not exist or the Docker daemon is not running.
> - NONE (default): preserves the current behaviour, throwing the exception during 
> container allocation and carrying on with the default retry procedure.
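For illustration, the proposal amounts to an enum-valued property plus a check that can run at service start (fatal) or before each Docker launch. Only the property name above comes from the description; everything in the sketch below is invented, and it only covers the missing-binary case:

{code:java}
import java.io.File;

// Hedged sketch of the proposed check levels; not the actual patch.
final class DockerCheckExample {
  enum DockerCheck { NONE, STARTUP, RUNTIME }

  /** STARTUP: refuse to start the NM if the Docker binary is unusable. */
  static void checkAtStartup(DockerCheck mode, String dockerBinaryPath) {
    if (mode != DockerCheck.STARTUP) {
      return; // RUNTIME defers the check to container launch; NONE keeps today's behaviour
    }
    if (!new File(dockerBinaryPath).canExecute()) {
      throw new IllegalStateException(
          "Docker binary " + dockerBinaryPath + " is missing or not executable");
    }
  }
}
{code}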






[jira] [Commented] (YARN-9923) Detect missing Docker binary or not running Docker daemon

2019-11-12 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972480#comment-16972480
 ] 

Adam Antal commented on YARN-9923:
--

Hi [~snemeth]!

Thanks for the thorough review. I found your points reasonable; let me reflect on 
them.

Resolved the following points: 2, 3, 4, 5, 6, 9, 10, 12, 14, 17, 19. Please 
verify whether I have understood your comments and fixed those items properly.

Some extra things: 
1. Will do this after this comment.
7. Partly done; please check the function {{NodeHealthScriptRunner#newInstance}}.
8. I refactored {{TimedHealthReporterService#setHealthStatus}} a bit more; now 
every call sets the {{lastReportedTime}} attribute as well. Also, I don't want 
to change the basic idea behind "being healthy or not", so I think the boolean 
flag is just enough for stating the main information - is the node healthy. If 
not, every other detail should go into the health report, which can be as 
verbose as the writer of the service wants it to be.
11. Cannot be resolved, since managing {{this.task}} is the responsibility of 
the superclass ({{TimedHealthReporterService}}), and I think it should not be 
set directly from there.
13. I am not sure about that, but now it does. See point 8 for further details.
15. Please check whether it makes sense. A new test case has also been added to 
demonstrate this - please take a look at 
{{TestNodeHealthCheckerService#testCustomHealthReporter}}.
16. No, it's not a programming error. This was a way to mock a specific 
service for testing (we first add the mocked service, for example dirsHandler, 
and later, when the class logic would add it again, the mocked one is kept). I 
added debug-level logging there.
18. It's an allMatch requiring all {{HealthReporter}}s to return true; if at 
least one of them returns false, it will fail, and it will also fail in case 
every service returns false. I don't see why we should add a test for this - 
that would test the allMatch function and not the behaviour of the class (see 
the sketch after this comment).
20. The whole function got a bit simpler there - it uses fewer low-level calls. 
Could you please take a look at it?
21. Yes, the tests were not finished; I finished them in patch v3. Could you 
check?

Thanks!
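Regarding point 18, the aggregation in question is essentially the following; {{HealthReporter}} and {{isHealthy}} are assumed names for this sketch and may differ from the patch:

{code:java}
import java.util.List;

// Sketch of the aggregation discussed in point 18: the node is healthy only if
// every reporter says so; a single false (or all of them false) marks it unhealthy.
final class HealthAggregationExample {
  interface HealthReporter {
    boolean isHealthy();
  }

  static boolean isNodeHealthy(List<HealthReporter> reporters) {
    return reporters.stream().allMatch(HealthReporter::isHealthy);
  }
}
{code}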

> Detect missing Docker binary or not running Docker daemon
> -
>
> Key: YARN-9923
> URL: https://issues.apache.org/jira/browse/YARN-9923
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, yarn
>Affects Versions: 3.2.1
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-9923.001.patch, YARN-9923.002.patch
>
>
> Currently, if a NodeManager is enabled to allocate Docker containers but the 
> specified binary (docker.binary in container-executor.cfg) is missing, the 
> container allocation fails with the following error message:
> {noformat}
> Container launch fails
> Exit code: 29
> Exception message: Launch container failed
> Shell error output: sh: : No 
> such file or directory
> Could not inspect docker network to get type /usr/bin/docker network inspect 
> host --format='{{.Driver}}'.
> Error constructing docker command, docker error code=-1, error 
> message='Unknown error'
> {noformat}
> I suggest adding a property, say "yarn.nodemanager.runtime.linux.docker.check", 
> with the following options:
> - STARTUP: with this option the NodeManager does not start if the Docker 
> binaries are missing or the Docker daemon is not running (the exception is 
> considered FATAL during startup)
> - RUNTIME: gives a more detailed/user-friendly exception on the 
> NodeManager's side (NM logs) if the Docker binaries are missing or the daemon is 
> not working. This also prevents further Docker container allocation as 
> long as the binaries do not exist or the Docker daemon is not running.
> - NONE (default): preserves the current behaviour, throwing the exception during 
> container allocation and carrying on with the default retry procedure.






[jira] [Commented] (YARN-9968) Public Localizer is exiting in NodeManager due to NullPointerException

2019-11-12 Thread Tarun Parimi (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972420#comment-16972420
 ] 

Tarun Parimi commented on YARN-9968:


Hi [~snemeth]. Thanks for looking into this.
I have not been able to reproduce the issue so far. It is happening on a heavily 
loaded prod cluster. The cluster is also configured to use 
DefaultContainerExecutor, so localization is done entirely inside the 
NM JVM process.

The NullPointerException occurs in the code below, where tracker.handle() is called. 
It looks like tracker is becoming null for some reason. Doing a null check on 
tracker might be a simple workaround, but understanding how the issue occurred 
might give us a better way to fix it.
{code:java}
 final String diagnostics = "Failed to download resource " +
  assoc.getResource() + " " + e.getCause();
  tracker.handle(new ResourceFailedLocalizationEvent(
  assoc.getResource().getRequest(), diagnostics));
{code}
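A minimal sketch of the null-check workaround mentioned above, written as a continuation of the snippet ({{assoc}}, {{e}}, {{tracker}}, and {{LOG}} come from the surrounding ResourceLocalizationService code); it illustrates the workaround only, not the committed fix:

{code:java}
// Hedged sketch: guard the dispatch so a missing tracker cannot kill the public localizer.
final String diagnostics = "Failed to download resource " +
    assoc.getResource() + " " + e.getCause();
if (tracker != null) {
  tracker.handle(new ResourceFailedLocalizationEvent(
      assoc.getResource().getRequest(), diagnostics));
} else {
  LOG.error("No tracker found for " + assoc.getResource()
      + "; dropping failure event. Diagnostics: " + diagnostics);
}
{code}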

There are also multiple HDFS warnings in the log during localization, just 
before this NullPointerException. So I think those HDFS issues during 
localization are related and are causing the problem in the first place, but I 
haven't completely figured out how.

{code:java}
WARN  impl.BlockReaderFactory 
(BlockReaderFactory.java:getRemoteBlockReaderFromTcp(764)) - I/O error 
constructing remote block reader.
java.io.IOException: Got error, status=ERROR, status message opReadBlock 
BP-290360126-127.0.0.1-1559634768162:blk_3454574939_2740457478 received 
exception java.io.IOException: No data exists for block 
BP-290360126-127.0.0.1-1559634768162:blk_blk_3454574939_2740457478, for 
OP_READ_BLOCK, self=/127.0.0.1:15810, remote=/127.0.0.1:50010, for file 
/tmp/hadoop-yarn/staging/job-user/.staging/job_1571858983080_36874/job.jar, for 
pool BP-290360126-127.0.0.1-1559634768162 block 3814574939_2740867478
at 
org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:134)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:110)
at 
org.apache.hadoop.hdfs.client.impl.BlockReaderRemote.checkSuccess(BlockReaderRemote.java:440)
at 
org.apache.hadoop.hdfs.client.impl.BlockReaderRemote.newBlockReader(BlockReaderRemote.java:408)
at 
org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:853)
at 
org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:749)
at 
org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:379)
at 
org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:641)
at 
org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:572)
at 
org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:754)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:820)
at java.io.DataInputStream.read(DataInputStream.java:149)
at 
org.apache.commons.io.input.ProxyInputStream.read(ProxyInputStream.java:100)
at 
org.apache.commons.io.input.TeeInputStream.read(TeeInputStream.java:129)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at java.io.PushbackInputStream.read(PushbackInputStream.java:186)
at java.util.zip.ZipInputStream.readFully(ZipInputStream.java:403)
at java.util.zip.ZipInputStream.readLOC(ZipInputStream.java:278)
at java.util.zip.ZipInputStream.getNextEntry(ZipInputStream.java:122)
at java.util.jar.JarInputStream.(JarInputStream.java:83)
at java.util.jar.JarInputStream.(JarInputStream.java:62)
at org.apache.hadoop.util.RunJar.unJar(RunJar.java:114)
at org.apache.hadoop.util.RunJar.unJarAndSave(RunJar.java:167)
at org.apache.hadoop.yarn.util.FSDownload.unpack(FSDownload.java:354)
at 
org.apache.hadoop.yarn.util.FSDownload.downloadAndUnpack(FSDownload.java:303)
at 
org.apache.hadoop.yarn.util.FSDownload.verifyAndCopy(FSDownload.java:283)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:67)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:414)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:411)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:411)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ContainerLocalizer$FSDownloadWrapper.doDownloadCall(ContainerLocalizer.java:242)
at 

[jira] [Commented] (YARN-9968) Public Localizer is exiting in NodeManager due to NullPointerException

2019-11-12 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972413#comment-16972413
 ] 

Szilard Nemeth commented on YARN-9968:
--

Hi [~tarunparimi]!
Could you please add reproduction steps? Thanks!

> Public Localizer is exiting in NodeManager due to NullPointerException
> --
>
> Key: YARN-9968
> URL: https://issues.apache.org/jira/browse/YARN-9968
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.1.0
>Reporter: Tarun Parimi
>Assignee: Tarun Parimi
>Priority: Major
>
> The Public Localizer is encountering a NullPointerException and exiting.
> {code:java}
> ERROR localizer.ResourceLocalizationService 
> (ResourceLocalizationService.java:run(995)) - Error: Shutting down
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:981)
> INFO  localizer.ResourceLocalizationService 
> (ResourceLocalizationService.java:run(997)) - Public cache exiting
> {code}
> The NodeManager itself keeps running. Subsequent localization events for 
> containers keep encountering the error below, resulting in failed 
> localization of all new containers. 
> {code:java}
> ERROR localizer.ResourceLocalizationService 
> (ResourceLocalizationService.java:addResource(920)) - Failed to submit rsrc { 
> { hdfs://namespace/raw/user/.staging/job/conf.xml 1572071824603, FILE, null 
> },pending,[(container_e30_1571858463080_48304_01_000134)],12513553420029113,FAILED}
>  for download. Either queue is full or threadpool is shutdown.
> java.util.concurrent.RejectedExecutionException: Task 
> java.util.concurrent.ExecutorCompletionService$QueueingFuture@55c7fa21 
> rejected from 
> org.apache.hadoop.util.concurrent.HadoopThreadPoolExecutor@46067edd[Terminated,
>  pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 
> 382286]
> at 
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
> at 
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
> at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
> at 
> java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:181)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:899)
> {code}
> When this happens, the NodeManager becomes usable only after a restart.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9968) Public Localizer is exiting in NodeManager due to NullPointerException

2019-11-12 Thread Tarun Parimi (Jira)
Tarun Parimi created YARN-9968:
--

 Summary: Public Localizer is exiting in NodeManager due to 
NullPointerException
 Key: YARN-9968
 URL: https://issues.apache.org/jira/browse/YARN-9968
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.1.0
Reporter: Tarun Parimi
Assignee: Tarun Parimi


The Public Localizer is encountering a NullPointerException and exiting.

{code:java}
ERROR localizer.ResourceLocalizationService 
(ResourceLocalizationService.java:run(995)) - Error: Shutting down
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.run(ResourceLocalizationService.java:981)

INFO  localizer.ResourceLocalizationService 
(ResourceLocalizationService.java:run(997)) - Public cache exiting
{code}

The NodeManager itself keeps running. Subsequent localization events for 
containers keep encountering the error below, resulting in failed localization 
of all new containers. 

{code:java}
ERROR localizer.ResourceLocalizationService 
(ResourceLocalizationService.java:addResource(920)) - Failed to submit rsrc { { 
hdfs://namespace/raw/user/.staging/job/conf.xml 1572071824603, FILE, null 
},pending,[(container_e30_1571858463080_48304_01_000134)],12513553420029113,FAILED}
 for download. Either queue is full or threadpool is shutdown.
java.util.concurrent.RejectedExecutionException: Task 
java.util.concurrent.ExecutorCompletionService$QueueingFuture@55c7fa21 rejected 
from 
org.apache.hadoop.util.concurrent.HadoopThreadPoolExecutor@46067edd[Terminated, 
pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 382286]
at 
java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
at 
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
at 
java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
at 
java.util.concurrent.ExecutorCompletionService.submit(ExecutorCompletionService.java:181)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$PublicLocalizer.addResource(ResourceLocalizationService.java:899)
{code}

When this happens, the NodeManager becomes usable only after a restart.
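
For illustration, below is a minimal, self-contained sketch (plain JDK code, not 
the NodeManager's PublicLocalizer) of why every later submission is rejected once 
the backing thread pool has terminated; the class name and the "resource" string 
are made up.

{code:java}
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

public class TerminatedPoolSketch {
  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    ExecutorCompletionService<String> queue = new ExecutorCompletionService<>(pool);

    // Stand-in for the PublicLocalizer thread dying and shutting its pool down.
    pool.shutdownNow();

    try {
      // Every later "download" submission is rejected, matching the
      // "Either queue is full or threadpool is shutdown" log above.
      queue.submit(() -> "hypothetical-resource");
    } catch (RejectedExecutionException e) {
      System.out.println("Rejected after shutdown: " + e);
    }
  }
}
{code}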



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9967) Fix NodeManager failing to start when Hdfs Auxillary Jar is set

2019-11-12 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph reassigned YARN-9967:
---

Assignee: Tarun Parimi

> Fix NodeManager failing to start when Hdfs Auxillary Jar is set
> ---
>
> Key: YARN-9967
> URL: https://issues.apache.org/jira/browse/YARN-9967
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: auxservices, nodemanager
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Tarun Parimi
>Priority: Major
>
> Loading an auxiliary jar from an HDFS location on a NodeManager fails with a 
> ClassNotFoundException.
> {code:java}
> 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: 
> classpath: []
> 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: 
> system classes: [java., javax.accessibility., javax.activation., 
> javax.activity., javax.annotation., javax.annotation.processing., 
> javax.crypto., javax.imageio., javax.jws., javax.lang.model., 
> -javax.management.j2ee., javax.management., javax.naming., javax.net., 
> javax.print., javax.rmi., javax.script., -javax.security.auth.message., 
> javax.security.auth., javax.security.cert., javax.security.sasl., 
> javax.sound., javax.sql., javax.swing., javax.tools., javax.transaction., 
> -javax.xml.registry., -javax.xml.rpc., javax.xml., org.w3c.dom., 
> org.xml.sax., org.apache.commons.logging., org.apache.log4j., 
> -org.apache.hadoop.hbase., org.apache.hadoop., core-default.xml, 
> hdfs-default.xml, mapred-default.xml, yarn-default.xml]
> 2019-11-08 03:59:49,257 INFO org.apache.hadoop.service.AbstractService: 
> Service 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed 
> in state INITED
> java.lang.ClassNotFoundException: org.apache.auxtest.AuxServiceFromHDFS
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:189)
>   at 
> org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:157)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.getInstance(AuxiliaryServiceWithCustomClassLoader.java:169)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:270)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:321)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:478)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:936)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1016)
> {code}
> *Repro:*
> {code:java}
> 1. Prepare a custom auxiliary service jar and place it on hdfs
> [hdfs@yarndocker-1 yarn]$ cat TestShuffleHandler2.java 
> package org;
> import org.apache.hadoop.yarn.server.api.AuxiliaryService;
> import org.apache.hadoop.yarn.server.api.ApplicationInitializationContext;
> import org.apache.hadoop.yarn.server.api.ApplicationTerminationContext;
> import java.nio.ByteBuffer;
> public class TestShuffleHandler2 extends AuxiliaryService {
> public static final String MAPREDUCE_TEST_SHUFFLE_SERVICEID = 
> "test_shuffle2";
> public TestShuffleHandler2() {
>   super("testshuffle2");
> }
> @Override
> public void initializeApplication(ApplicationInitializationContext 
> context) {
> }
> @Override
> public void stopApplication(ApplicationTerminationContext context) {
> }
> @Override
> public synchronized ByteBuffer getMetaData() {
>   return ByteBuffer.allocate(0); 
> }
>   }
>   
> [hdfs@yarndocker-1 yarn]$ javac -d . -cp `hadoop classpath` 
> TestShuffleHandler2.java 
> [hdfs@yarndocker-1 yarn]$ jar cvf auxhdfs.jar org/
> [hdfs@yarndocker-1 mapreduce]$ hadoop fs -mkdir /AUX
> [hdfs@yarndocker-1 

[jira] [Created] (YARN-9967) Fix NodeManager failing to start when Hdfs Auxillary Jar is set

2019-11-12 Thread Prabhu Joseph (Jira)
Prabhu Joseph created YARN-9967:
---

 Summary: Fix NodeManager failing to start when Hdfs Auxillary Jar 
is set
 Key: YARN-9967
 URL: https://issues.apache.org/jira/browse/YARN-9967
 Project: Hadoop YARN
  Issue Type: Bug
  Components: auxservices, nodemanager
Affects Versions: 3.3.0
Reporter: Prabhu Joseph


Loading an auxiliary jar from an HDFS location on a NodeManager fails with a 
ClassNotFoundException.
{code:java}
2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: 
classpath: []
2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: 
system classes: [java., javax.accessibility., javax.activation., 
javax.activity., javax.annotation., javax.annotation.processing., 
javax.crypto., javax.imageio., javax.jws., javax.lang.model., 
-javax.management.j2ee., javax.management., javax.naming., javax.net., 
javax.print., javax.rmi., javax.script., -javax.security.auth.message., 
javax.security.auth., javax.security.cert., javax.security.sasl., javax.sound., 
javax.sql., javax.swing., javax.tools., javax.transaction., 
-javax.xml.registry., -javax.xml.rpc., javax.xml., org.w3c.dom., org.xml.sax., 
org.apache.commons.logging., org.apache.log4j., -org.apache.hadoop.hbase., 
org.apache.hadoop., core-default.xml, hdfs-default.xml, mapred-default.xml, 
yarn-default.xml]
2019-11-08 03:59:49,257 INFO org.apache.hadoop.service.AbstractService: Service 
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed 
in state INITED
java.lang.ClassNotFoundException: org.apache.auxtest.AuxServiceFromHDFS
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at 
org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:189)
at 
org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:157)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.getInstance(AuxiliaryServiceWithCustomClassLoader.java:169)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:270)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at 
org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:321)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at 
org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:478)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:936)
at 
org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1016)
{code}
*Repro:*
{code:java}
1. Prepare a custom auxiliary service jar and place it on hdfs

[hdfs@yarndocker-1 yarn]$ cat TestShuffleHandler2.java 
package org;
import org.apache.hadoop.yarn.server.api.AuxiliaryService;
import org.apache.hadoop.yarn.server.api.ApplicationInitializationContext;
import org.apache.hadoop.yarn.server.api.ApplicationTerminationContext;
import java.nio.ByteBuffer;

public class TestShuffleHandler2 extends AuxiliaryService {
public static final String MAPREDUCE_TEST_SHUFFLE_SERVICEID = 
"test_shuffle2";
public TestShuffleHandler2() {
  super("testshuffle2");
}
@Override
public void initializeApplication(ApplicationInitializationContext context) 
{
}
@Override
public void stopApplication(ApplicationTerminationContext context) {
}
@Override
public synchronized ByteBuffer getMetaData() {
  return ByteBuffer.allocate(0); 
}
  }
  
[hdfs@yarndocker-1 yarn]$ javac -d . -cp `hadoop classpath` 
TestShuffleHandler2.java 
[hdfs@yarndocker-1 yarn]$ jar cvf auxhdfs.jar org/
[hdfs@yarndocker-1 mapreduce]$ hadoop fs -mkdir /AUX
[hdfs@yarndocker-1 mapreduce]$ hadoop fs -put /tmp/auxhdfs.jar /AUX
[hdfs@yarndocker-1 mapreduce]$ hadoop fs -chmod 777 /AUX
[hdfs@yarndocker-1 mapreduce]$ hadoop fs -chmod 600 /AUX/auxhdfs.jar
[hdfs@yarndocker-1 mapreduce]$ hadoop fs -chown -R yarn:hadoop  /AUX

2. Configure YARN NodeManager (yarn-site.xml) to pick from Hdfs

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>auxhdfs</value>
  </property>

  <property>

[jira] [Updated] (YARN-9965) Fix NodeManager failing to start on subsequent times when Hdfs Auxillary Jar is set

2019-11-12 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9965:

Summary: Fix NodeManager failing to start on subsequent times when Hdfs 
Auxillary Jar is set  (was: Fix NodeManager failing to start when Hdfs 
Auxillary Jar is set)

> Fix NodeManager failing to start on subsequent times when Hdfs Auxillary Jar 
> is set
> ---
>
> Key: YARN-9965
> URL: https://issues.apache.org/jira/browse/YARN-9965
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: auxservices, nodemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9965-001.patch
>
>
> Loading an auxiliary jar from an HDFS location on a NodeManager works as 
> expected the first time. A subsequent restart fails with a 
> ClassNotFoundException.
> {code:java}
> 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: 
> classpath: []
> 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: 
> system classes: [java., javax.accessibility., javax.activation., 
> javax.activity., javax.annotation., javax.annotation.processing., 
> javax.crypto., javax.imageio., javax.jws., javax.lang.model., 
> -javax.management.j2ee., javax.management., javax.naming., javax.net., 
> javax.print., javax.rmi., javax.script., -javax.security.auth.message., 
> javax.security.auth., javax.security.cert., javax.security.sasl., 
> javax.sound., javax.sql., javax.swing., javax.tools., javax.transaction., 
> -javax.xml.registry., -javax.xml.rpc., javax.xml., org.w3c.dom., 
> org.xml.sax., org.apache.commons.logging., org.apache.log4j., 
> -org.apache.hadoop.hbase., org.apache.hadoop., core-default.xml, 
> hdfs-default.xml, mapred-default.xml, yarn-default.xml]
> 2019-11-08 03:59:49,257 INFO org.apache.hadoop.service.AbstractService: 
> Service 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed 
> in state INITED
> java.lang.ClassNotFoundException: org.apache.auxtest.AuxServiceFromHDFS
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:189)
>   at 
> org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:157)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.getInstance(AuxiliaryServiceWithCustomClassLoader.java:169)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:270)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:321)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:478)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:936)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1016)
> {code}
>  
> The issue happens when reusing the previously localized auxiliary service jar: 
> on reuse, /* is appended to the localized jar path, which causes the 
> ClassNotFoundException above.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9965) Fix NodeManager failing to start when Hdfs Auxillary Jar is set

2019-11-12 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972364#comment-16972364
 ] 

Prabhu Joseph commented on YARN-9965:
-

That was my mistake initially; I will remember to submit the patch with a test 
case. 

For reference, below is the functional test done to validate the fix.

*Repro*

{code}
1. Prepare a custom auxiliary service jar and place it on hdfs

[hdfs@yarndocker-1 yarn]$ cat TestShuffleHandler2.java 
package org;
import org.apache.hadoop.yarn.server.api.AuxiliaryService;
import org.apache.hadoop.yarn.server.api.ApplicationInitializationContext;
import org.apache.hadoop.yarn.server.api.ApplicationTerminationContext;
import java.nio.ByteBuffer;

public class TestShuffleHandler2 extends AuxiliaryService {
public static final String MAPREDUCE_TEST_SHUFFLE_SERVICEID = 
"test_shuffle2";
public TestShuffleHandler2() {
  super("testshuffle2");
}
@Override
public void initializeApplication(ApplicationInitializationContext context) 
{
}
@Override
public void stopApplication(ApplicationTerminationContext context) {
}
@Override
public synchronized ByteBuffer getMetaData() {
  return ByteBuffer.allocate(0); 
}
  }
  
[hdfs@yarndocker-1 yarn]$ javac -d . -cp `hadoop classpath` 
TestShuffleHandler2.java 
[hdfs@yarndocker-1 yarn]$ jar cvf auxhdfs.jar org/
[hdfs@yarndocker-1 mapreduce]$ hadoop fs -mkdir /AUX
[hdfs@yarndocker-1 mapreduce]$ hadoop fs -put /tmp/auxhdfs.jar /AUX
[hdfs@yarndocker-1 mapreduce]$ hadoop fs -chmod 777 /AUX
[hdfs@yarndocker-1 mapreduce]$ hadoop fs -chmod 600 /AUX/auxhdfs.jar
[hdfs@yarndocker-1 mapreduce]$ hadoop fs -chown -R yarn:hadoop  /AUX

2. Configure YARN NodeManager (yarn-site.xml) to pick from Hdfs

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>auxhdfs</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services.auxhdfs.class</name>
    <value>org.TestShuffleHandler2</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services.auxhdfs.remote-classpath</name>
    <value>/AUX/auxhdfs.jar</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services.auxhdfs.system-classes</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices</value>
  </property>
{code}
  
After the patch, the issue does not happen.
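
To make the /* failure mode concrete, here is a small stand-alone sketch using a 
plain URLClassLoader (not the ApplicationClassLoader wiring inside the 
NodeManager); the localized path and the class name are hypothetical.

{code:java}
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

public class WildcardClasspathSketch {
  public static void main(String[] args) throws Exception {
    // Hypothetical localized location; point it at any real jar to experiment.
    String plainJar = "/tmp/nm-local-dir/filecache/10/auxhdfs.jar";
    String wildcarded = plainJar + "/*";

    for (String path : new String[] {plainJar, wildcarded}) {
      URL url = new File(path).toURI().toURL();
      try (URLClassLoader loader = new URLClassLoader(new URL[] {url})) {
        // "/*" is a java-launcher classpath wildcard; a URLClassLoader takes
        // the URL literally, so the wildcarded entry never yields anything.
        URL found = loader.findResource("org/TestShuffleHandler2.class");
        System.out.println(url + " -> " + (found != null ? "found" : "not found"));
      }
    }
  }
}
{code}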








> Fix NodeManager failing to start when Hdfs Auxillary Jar is set
> ---
>
> Key: YARN-9965
> URL: https://issues.apache.org/jira/browse/YARN-9965
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: auxservices, nodemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9965-001.patch
>
>
> Loading an auxiliary jar from an HDFS location on a NodeManager works as 
> expected the first time. A subsequent restart fails with a 
> ClassNotFoundException.
> {code:java}
> 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: 
> classpath: []
> 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: 
> system classes: [java., javax.accessibility., javax.activation., 
> javax.activity., javax.annotation., javax.annotation.processing., 
> javax.crypto., javax.imageio., javax.jws., javax.lang.model., 
> -javax.management.j2ee., javax.management., javax.naming., javax.net., 
> javax.print., javax.rmi., javax.script., -javax.security.auth.message., 
> javax.security.auth., javax.security.cert., javax.security.sasl., 
> javax.sound., javax.sql., javax.swing., javax.tools., javax.transaction., 
> -javax.xml.registry., -javax.xml.rpc., javax.xml., org.w3c.dom., 
> org.xml.sax., org.apache.commons.logging., org.apache.log4j., 
> -org.apache.hadoop.hbase., org.apache.hadoop., core-default.xml, 
> hdfs-default.xml, mapred-default.xml, yarn-default.xml]
> 2019-11-08 03:59:49,257 INFO org.apache.hadoop.service.AbstractService: 
> Service 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed 
> in state INITED
> java.lang.ClassNotFoundException: org.apache.auxtest.AuxServiceFromHDFS
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:189)
>   at 
> org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:157)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> 

[jira] [Commented] (YARN-9697) Efficient allocation of Opportunistic containers.

2019-11-12 Thread Abhishek Modi (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972296#comment-16972296
 ] 

Abhishek Modi commented on YARN-9697:
-

Thanks [~bibinchundatt] and [~elgoiri] for review. Committed to trunk.

> Efficient allocation of Opportunistic containers.
> -
>
> Key: YARN-9697
> URL: https://issues.apache.org/jira/browse/YARN-9697
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Abhishek Modi
>Assignee: Abhishek Modi
>Priority: Major
> Attachments: YARN-9697.001.patch, YARN-9697.002.patch, 
> YARN-9697.003.patch, YARN-9697.004.patch, YARN-9697.005.patch, 
> YARN-9697.006.patch, YARN-9697.007.patch, YARN-9697.008.patch, 
> YARN-9697.009.patch, YARN-9697.ut.patch, YARN-9697.ut2.patch, 
> YARN-9697.wip1.patch, YARN-9697.wip2.patch
>
>
> In the current implementation, opportunistic containers are allocated based 
> on the queued opportunistic container counts received in node heartbeats. 
> This information becomes stale as soon as more opportunistic containers are 
> allocated on that node.
> Allocation of opportunistic containers happens on the same heartbeat in which 
> the AM asks for the containers. When multiple applications request 
> opportunistic containers, the containers might all land on the same set of 
> nodes, because containers already allocated on a node are not considered 
> while serving requests from different applications. This can lead to uneven 
> allocation of opportunistic containers across the cluster and to increased 
> queuing time 
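
As an aside, here is a toy sketch (not the YARN-9697 patch itself) of the general 
idea of correcting the heartbeat-reported queue length with allocations made 
since that heartbeat; node names and numbers are made up.

{code:java}
import java.util.HashMap;
import java.util.Map;

/** Toy selector: heartbeat-reported queue length plus allocations made since. */
public class StalenessAwareSelectorSketch {
  private final Map<String, Integer> reportedQueueLength = new HashMap<>();
  private final Map<String, Integer> allocatedSinceHeartbeat = new HashMap<>();

  public synchronized void onHeartbeat(String node, int queuedOpportunistic) {
    reportedQueueLength.put(node, queuedOpportunistic);
    allocatedSinceHeartbeat.put(node, 0); // fresh report supersedes local estimate
  }

  public synchronized String allocate() {
    String best = null;
    int bestLoad = Integer.MAX_VALUE;
    for (Map.Entry<String, Integer> e : reportedQueueLength.entrySet()) {
      int load = e.getValue() + allocatedSinceHeartbeat.getOrDefault(e.getKey(), 0);
      if (load < bestLoad) {
        bestLoad = load;
        best = e.getKey();
      }
    }
    if (best != null) {
      allocatedSinceHeartbeat.merge(best, 1, Integer::sum);
    }
    return best;
  }

  public static void main(String[] args) {
    StalenessAwareSelectorSketch s = new StalenessAwareSelectorSketch();
    s.onHeartbeat("nm1", 0);
    s.onHeartbeat("nm2", 3);
    // Without the local counter every request would pick nm1; with it the
    // estimated load on nm1 grows and later requests spread to other nodes.
    for (int i = 0; i < 6; i++) {
      System.out.print(s.allocate() + " ");
    }
    System.out.println();
  }
}
{code}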



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9697) Efficient allocation of Opportunistic containers.

2019-11-12 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972291#comment-16972291
 ] 

Hudson commented on YARN-9697:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17630 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17630/])
YARN-9697. Efficient allocation of Opportunistic containers. Contributed 
(abmodi: rev fb512f50877438acb01fe6b3ec96c12b4db61694)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/scheduler/OpportunisticContainerAllocator.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/scheduler/OpportunisticContainerContext.java
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/scheduler/TestOpportunisticContainerAllocator.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/distributed/TestCentralizedOpportunisticContainerAllocator.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/scheduler/TestDistributedOpportunisticContainerAllocator.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/distributed/CentralizedOpportunisticContainerAllocator.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/distributed/TestNodeQueueLoadMonitor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/OpportunisticContainerAllocatorAMService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/scheduler/DistributedOpportunisticContainerAllocator.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/distributed/NodeQueueLoadMonitor.java


> Efficient allocation of Opportunistic containers.
> -
>
> Key: YARN-9697
> URL: https://issues.apache.org/jira/browse/YARN-9697
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Abhishek Modi
>Assignee: Abhishek Modi
>Priority: Major
> Attachments: YARN-9697.001.patch, YARN-9697.002.patch, 
> YARN-9697.003.patch, YARN-9697.004.patch, YARN-9697.005.patch, 
> YARN-9697.006.patch, YARN-9697.007.patch, YARN-9697.008.patch, 
> YARN-9697.009.patch, YARN-9697.ut.patch, YARN-9697.ut2.patch, 
> YARN-9697.wip1.patch, YARN-9697.wip2.patch
>
>
> In the current implementation, opportunistic containers are allocated based 
> on the queued opportunistic container counts received in node heartbeats. 
> This information becomes stale as soon as more opportunistic containers are 
> allocated on that node.
> Allocation of opportunistic containers happens on the same heartbeat in which 
> the AM asks for the containers. When multiple applications request 
> opportunistic containers, the containers might all land on the same set of 
> nodes, because containers already allocated on a node are not considered 
> while serving requests from different applications. This can lead to uneven 
> allocation of opportunistic containers across the cluster and to increased 
> queuing time 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH

2019-11-12 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972181#comment-16972181
 ] 

Adam Antal commented on YARN-8373:
--

Thanks for the patch [~wilfreds]. Having read through your analysis and the 
resolution, I agree with it. +1 (non-binding).

> RM  Received RMFatalEvent of type CRITICAL_THREAD_CRASH
> ---
>
> Key: YARN-8373
> URL: https://issues.apache.org/jira/browse/YARN-8373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.9.0
>Reporter: Girish Bhat
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: newbie
> Attachments: YARN-8373.001.patch, YARN-8373.002.patch, 
> YARN-8373.003.patch
>
>
>  
>  
> {noformat}
> sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 
> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 
> 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on 
> 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum 
> 0a76a9a32a5257331741f8d5932f183 This command was run using 
> /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat}
> This is for version 2.9.0 
>  
> {noformat}
> 2018-05-25 05:53:12,742 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received 
> RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai
> rSchedulerContinuousScheduling, that exited unexpectedly: 
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,743 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down 
> the resource manager.
> 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: a critical thread, FairSchedulerContinuousScheduling, that exited 
> unexpectedly: java.lang.IllegalArgumentException: Comparison method violates 
> its general contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,772 ERROR 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
>  ExpiredTokenRemover received java.lang.InterruptedException: sleep 
> interrupted{noformat}
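
For context, here is a stand-alone toy reproduction (not FairScheduler code) of 
how sorting on a key that another thread keeps mutating can trigger this TimSort 
error, followed by the usual snapshot-before-sort pattern; all class names and 
numbers are made up, and whether the error fires on a given run is timing 
dependent.

{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;

public class MutatingComparatorSketch {

  static final class Node {
    final AtomicLong availableMB;
    Node(long mb) { this.availableMB = new AtomicLong(mb); }
  }

  static final class Snapshot {
    final Node node;
    final long mb; // stable copy of the sort key
    Snapshot(Node node) { this.node = node; this.mb = node.availableMB.get(); }
  }

  public static void main(String[] args) {
    Random rand = new Random();
    List<Node> nodes = new ArrayList<>();
    for (int i = 0; i < 20000; i++) {
      nodes.add(new Node(rand.nextInt(1000)));
    }

    // Stand-in for heartbeats updating node resources while sorting runs.
    Thread mutator = new Thread(() -> {
      Random r = new Random();
      while (!Thread.currentThread().isInterrupted()) {
        nodes.get(r.nextInt(nodes.size())).availableMB.set(r.nextInt(1000));
      }
    });
    mutator.setDaemon(true);
    mutator.start();

    try {
      for (int i = 0; i < 200; i++) {
        // Unsafe: the sort key changes while TimSort is running, so the
        // comparator can become inconsistent and throw.
        nodes.sort(Comparator.comparingLong((Node n) -> n.availableMB.get()));
      }
      System.out.println("No violation on this run (it is timing dependent).");
    } catch (IllegalArgumentException e) {
      System.out.println("Reproduced: " + e.getMessage());
    } finally {
      mutator.interrupt();
    }

    // Safer pattern: snapshot the key once, then sort the snapshots.
    List<Snapshot> snapshots = new ArrayList<>();
    for (Node n : nodes) {
      snapshots.add(new Snapshot(n));
    }
    snapshots.sort(Comparator.comparingLong((Snapshot sn) -> sn.mb));
    System.out.println("Snapshot sort finished for " + snapshots.size() + " nodes");
  }
}
{code}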



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-9963) Add getIpAndHost to RuncContainerRuntime

2019-11-12 Thread kevin su (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kevin su reassigned YARN-9963:
--

Assignee: kevin su

> Add getIpAndHost to RuncContainerRuntime
> 
>
> Key: YARN-9963
> URL: https://issues.apache.org/jira/browse/YARN-9963
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: kevin su
>Priority: Major
>
> {{RuncContainerRuntime}} does not currently implement this logic, but 
> {{DockerLinuxContainerRuntime}} does.
> See YARN-5430



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org