[jira] [Commented] (YARN-6342) Issues in async API of TimelineClient

2017-03-21 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935667#comment-15935667
 ] 

Rohith Sharma K S commented on YARN-6342:
-

Shall we also fix the draining of all entities that exist in the queue? Maybe, 
instead of hard-coding 2 seconds, it should be taken as configuration? IMO 2 
seconds is too short, and many entities are not published during stop.
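
For illustration, a minimal sketch of that suggestion (the property name is 
hypothetical, and the snippet assumes access to the service Configuration):

{code}
public void stop() {
  LOG.info("Stopping TimelineClient.");
  // shutdown() lets queued entities drain; shutdownNow() discards them.
  executor.shutdown();
  // Hypothetical property; the hard-coded DRAIN_TIME_PERIOD becomes the default.
  long drainMillis = conf.getLong(
      "yarn.timeline-service.client.drain-entities.timeout.ms", 2000L);
  try {
    if (!executor.awaitTermination(drainMillis, TimeUnit.MILLISECONDS)) {
      executor.shutdownNow(); // give up after the configured drain period
    }
  } catch (InterruptedException e) {
    executor.shutdownNow();
    Thread.currentThread().interrupt();
  }
}
{code}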

> Issues in async API of TimelineClient
> -
>
> Key: YARN-6342
> URL: https://issues.apache.org/jira/browse/YARN-6342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Haibo Chen
>  Labels: yarn-5355-merge-blocker
>
> Found these with [~rohithsharma] while browsing the code
> - In stop: it calls shutdownNow, which doesn't wait for pending tasks; should 
> it use shutdown instead?
> {code}
> public void stop() {
>   LOG.info("Stopping TimelineClient.");
>   executor.shutdownNow();
>   try {
> executor.awaitTermination(DRAIN_TIME_PERIOD, TimeUnit.MILLISECONDS);
>   } catch (InterruptedException e) {
> {code}
> - In TimelineClientImpl#createRunnable:
> If any exception happens when publishing one entity 
> (publishWithoutBlockingOnQueue), the thread exits. I think it should make a 
> best effort to continue publishing the timeline entities; one failure should 
> not prevent all follow-up entities from being published.
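
A sketch of the best-effort behavior suggested in the second point (the method 
name comes from the description; the queue and stop flag are assumed, and this 
is not the actual patch):

{code}
// Keep the publishing thread alive across per-entity failures.
while (!stopped) {
  try {
    TimelineEntity entity = entityQueue.take();
    publishWithoutBlockingOnQueue(entity);
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
    break; // stop was requested
  } catch (Exception e) {
    // One bad entity must not kill the thread; keep draining the queue.
    LOG.error("Error while publishing a timeline entity, continuing", e);
  }
}
{code}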






[jira] [Commented] (YARN-6050) AMs can't be scheduled on racks or nodes

2017-03-21 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935642#comment-15935642
 ] 

Robert Kanter commented on YARN-6050:
-

I looked into the {{TestCapacitySchedulerNodeLabelUpdate}} failure.  It looks 
like the problem is that when {{RMServerUtils#getApplicableNodeCountForAM}} 
calls 
{{CommonNodeLabelsManager#getLabelsToNodes(RMNodeLabelsManager.NO_LABEL)}}, it 
always returns an empty list of {{NodeId}} because it's not a real label and it 
looks like there's a bunch of special handling for it.  The original code that 
was here was calling 
{{CommonNodeLabelsManager#getActiveNMCountPerLabel(RMNodeLabelsManager.NO_LABEL)}}
 (from the {{RMNodeLabelsManager}} subclass), which does return the number of 
nodes without a label.  This behavior is inconsistent.

I'll try to figure out a way around this problem or some simple/cheap way to 
determine the {{NodeId}} for unlabeled/default labeled nodes.  But from my 
quick look at the code and the debugger, it doesn't look like this is tracked 
anywhere.
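
Paraphrasing the two calls (illustrative, not the exact code):

{code}
// Returns an empty mapping for NO_LABEL: it is not a real label and is
// special-cased inside CommonNodeLabelsManager.
Map<String, Set<NodeId>> labelsToNodes = labelsManager.getLabelsToNodes(
    Collections.singleton(RMNodeLabelsManager.NO_LABEL));

// Does return the number of unlabeled nodes (RMNodeLabelsManager subclass).
int unlabeledNodeCount =
    labelsManager.getActiveNMCountPerLabel(RMNodeLabelsManager.NO_LABEL);
{code}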

> AMs can't be scheduled on racks or nodes
> 
>
> Key: YARN-6050
> URL: https://issues.apache.org/jira/browse/YARN-6050
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-6050.001.patch, YARN-6050.002.patch, 
> YARN-6050.003.patch, YARN-6050.004.patch, YARN-6050.005.patch, 
> YARN-6050.006.patch, YARN-6050.007.patch, YARN-6050.008.patch, 
> YARN-6050.009.patch, YARN-6050.010.patch, YARN-6050.011.patch, 
> YARN-6050.012.patch
>
>
> Yarn itself supports rack/node aware scheduling for AMs; however, there 
> currently are two problems:
> # To specify hard or soft rack/node requests, you have to specify more than 
> one {{ResourceRequest}}.  For example, if you want to schedule an AM only on 
> "rackA", you have to create two {{ResourceRequest}}, like this:
> {code}
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, 
> true);
> {code}
> The problem is that the Yarn API doesn't actually allow you to specify more 
> than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}.  The 
> current behavior is to either build one from {{getResource}} or directly from 
> {{getAMContainerResourceRequest}}, depending on whether 
> {{getAMContainerResourceRequest}} is null.  We'll need to add a third 
> method, say {{getAMContainerResourceRequests}}, which takes a list of 
> {{ResourceRequest}} so that clients can specify the multiple resource 
> requests.
> # There are some places where things are hardcoded to overwrite what the 
> client specifies.  These are pretty straightforward to fix.
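
For the first point, a hypothetical sketch of how a client would use the 
proposed method (the setter mirrors the getter described above; the final API 
may differ):

{code}
List<ResourceRequest> amRequests = Arrays.asList(
    ResourceRequest.newInstance(PRIORITY, ResourceRequest.ANY, CAPABILITY,
        NUM_CONTAINERS, false),
    ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY,
        NUM_CONTAINERS, true));
// Hypothetical counterpart of getAMContainerResourceRequests:
appSubmissionContext.setAMContainerResourceRequests(amRequests);
{code}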






[jira] [Commented] (YARN-6050) AMs can't be scheduled on racks or nodes

2017-03-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935617#comment-15935617
 ] 

Hadoop QA commented on YARN-6050:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 17 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 21m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
52s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 16m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 
51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 20s{color} | {color:orange} root: The patch generated 10 new + 1665 
unchanged - 5 fixed = 1675 total (was 1670) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 0 new + 880 unchanged - 2 fixed = 880 total (was 882) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} hadoop-mapreduce-client-jobclient in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
39s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
33s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 39m 36s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}102m 
18s{color} | {color:green} hadoop-mapreduce-client-jobclient in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}250m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 

[jira] [Commented] (YARN-6339) Improve performance for createAndGetApplicationReport

2017-03-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935616#comment-15935616
 ] 

Hadoop QA commented on YARN-6339:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
40s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
41s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
59s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 41m 
51s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}120m  1s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6339 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12859809/YARN-6339.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 182c97830a1e 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / f462e1f |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/15352/artifact/patchprocess/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt
 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/15352/artifact/patchprocess/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test 

[jira] [Commented] (YARN-6343) Docker docs MR example is broken

2017-03-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935565#comment-15935565
 ] 

Hadoop QA commented on YARN-6343:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6343 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12859815/YARN_6343_001.patch |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux b5d43acc0330 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / f462e1f |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/15354/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Docker docs MR example is broken
> 
>
> Key: YARN-6343
> URL: https://issues.apache.org/jira/browse/YARN-6343
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0, 3.0.0-alpha3
>Reporter: Daniel Templeton
>Assignee: Prashant Jha
>  Labels: docs, newbie
> Attachments: YARN_6343_001.patch
>
>
> In the example, the -D args come before pi, but it should be the other way 
> around.






[jira] [Updated] (YARN-6343) Docker docs MR example is broken

2017-03-21 Thread Prashant Jha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashant Jha updated YARN-6343:
---
Attachment: YARN_6343_001.patch

Fix command to run the pi app.
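
For reference, the gist of the fix (a sketch, not the exact doc text): the -D 
options are parsed by the pi program itself via GenericOptionsParser, so they 
must follow the program name:

{code}
# Broken (as in the docs): the examples driver treats the first -D as the
# program name and fails.
yarn jar hadoop-mapreduce-examples.jar -Dmapreduce.map.env=... pi 10 100

# Fixed: program name first, then its -D options, then the positional args.
yarn jar hadoop-mapreduce-examples.jar pi -Dmapreduce.map.env=... 10 100
{code}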

> Docker docs MR example is broken
> 
>
> Key: YARN-6343
> URL: https://issues.apache.org/jira/browse/YARN-6343
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0, 3.0.0-alpha3
>Reporter: Daniel Templeton
>Assignee: Prashant Jha
>  Labels: docs, newbie
> Attachments: YARN_6343_001.patch
>
>
> In the example, the -D args come before pi, but it should be the other way 
> around.






[jira] [Commented] (YARN-6046) Documentation correction in YarnApplicationSecurity

2017-03-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935551#comment-15935551
 ] 

Hadoop QA commented on YARN-6046:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 14m 53s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6046 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12859811/YARN-6046.001.patch |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux 8b7d51d9aacd 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 
09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / f462e1f |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/15353/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Documentation correction in YarnApplicationSecurity
> ---
>
> Key: YARN-6046
> URL: https://issues.apache.org/jira/browse/YARN-6046
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Prashant Jha
>Priority: Trivial
>  Labels: newbie
> Attachments: YARN-6046.001.patch
>
>
> A few documentation corrections are required in 
> hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html
> {code}
> 1. Suring AM startup, log in to Kerberos.
> {code}
> {code}
> Don’t. Rely on the lifespan of the 
> {code}
> {code}
> renewed automatically; the AM pushes out 
> {code}
> {code}
> In an insecure cluster, the application will run as the identity of the 
> account of the node manager, typically something such as yarn or mapred. By 
> default, the application will access HDFS as that user, with a different home 
> directory, and with a different user identified in audit logs and on file 
> system owner attributes.
> {code}
> The sentence above needs to be reframed.






[jira] [Commented] (YARN-6326) Shouldn't use AppAttemptIds to fetch applications while AM Simulator tracks app in SLS

2017-03-21 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935534#comment-15935534
 ] 

Yufei Gu commented on YARN-6326:


Looks like we don't need to put this on branch-2, since it is based on 
YARN-1471, which is only available in Hadoop 3. Thanks [~rkanter] for the 
review and commit.

> Shouldn't use AppAttemptIds to fetch applications while AM Simulator tracks 
> app in SLS
> --
>
> Key: YARN-6326
> URL: https://issues.apache.org/jira/browse/YARN-6326
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: YARN-6326.001.patch, YARN-6326.002.patch, 
> YARN-6326.003.patch, YARN-6326.004.patch, YARN-6326.005.patch
>
>
> This causes an NPE. Besides the NPE, the metrics won't reflect the 
> different attempts. We should pass ApplicationId instead of AppAttemptId. The 
> NPE caused by the issue:
> {code}
> 2017-03-13 20:43:39,153 INFO appmaster.AMSimulator: Submit a new application 
> application_1489463017173_0001
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getApplicationAttempt(AbstractYarnScheduler.java:327)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.getSchedulerApp(FairScheduler.java:1028)
>   at 
> org.apache.hadoop.yarn.sls.scheduler.FairSchedulerMetrics.trackApp(FairSchedulerMetrics.java:68)
>   at 
> org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.addTrackedApp(ResourceSchedulerWrapper.java:799)
>   at 
> org.apache.hadoop.yarn.sls.appmaster.AMSimulator.trackApp(AMSimulator.java:338)
>   at 
> org.apache.hadoop.yarn.sls.appmaster.AMSimulator.firstStep(AMSimulator.java:156)
>   at 
> org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:90)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Exception in thread "pool-6-thread-1" java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:105)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}






[jira] [Assigned] (YARN-6343) Docker docs MR example is broken

2017-03-21 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton reassigned YARN-6343:
--

Assignee: Prashant Jha  (was: Daniel Templeton)

> Docker docs MR example is broken
> 
>
> Key: YARN-6343
> URL: https://issues.apache.org/jira/browse/YARN-6343
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0, 3.0.0-alpha3
>Reporter: Daniel Templeton
>Assignee: Prashant Jha
>  Labels: docs, newbie
>
> In the example, the -D args come before pi, but it should be the other way 
> around.






[jira] [Commented] (YARN-5934) Fix TestTimelineWebServices.testPrimaryFilterNumericString

2017-03-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935531#comment-15935531
 ] 

Hudson commented on YARN-5934:
--

ABORTED: Integrated in Jenkins build Hadoop-trunk-Commit #11440 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11440/])
YARN-5934. Fix TestTimelineWebServices.testPrimaryFilterNumericString 
(varunsaxena: rev f462e1ff68d698a669406d967336bb812cf6981b)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/test/java/org/apache/hadoop/yarn/server/timeline/webapp/TestTimelineWebServices.java


> Fix TestTimelineWebServices.testPrimaryFilterNumericString
> --
>
> Key: YARN-5934
> URL: https://issues.apache.org/jira/browse/YARN-5934
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
> Fix For: 3.0.0-alpha3
>
> Attachments: YARN-5934.01.patch
>
>
> {noformat}
> Running org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices
> Tests run: 28, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 43.297 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices
> testPrimaryFilterNumericString(org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices)
>   Time elapsed: 1.209 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<0> but was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices.testPrimaryFilterNumericString(TestTimelineWebServices.java:348)
> {noformat}






[jira] [Updated] (YARN-6046) Documentation correction in YarnApplicationSecurity

2017-03-21 Thread Prashant Jha (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashant Jha updated YARN-6046:
---
Attachment: YARN-6046.001.patch

Fixing doc bug.

> Documentation correction in YarnApplicationSecurity
> ---
>
> Key: YARN-6046
> URL: https://issues.apache.org/jira/browse/YARN-6046
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Prashant Jha
>Priority: Trivial
>  Labels: newbie
> Attachments: YARN-6046.001.patch
>
>
> A few documentation corrections are required in 
> hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html
> {code}
> 1. Suring AM startup, log in to Kerberos.
> {code}
> {code}
> Don’t. Rely on the lifespan of the 
> {code}
> {code}
> renewed automatically; the AM pushes out 
> {code}
> {code}
> In an insecure cluster, the application will run as the identity of the 
> account of the node manager, typically something such as yarn or mapred. By 
> default, the application will access HDFS as that user, with a different home 
> directory, and with a different user identified in audit logs and on file 
> system owner attributes.
> {code}
> The sentence above needs to be reframed.






[jira] [Commented] (YARN-6357) Implement TimelineCollector#putEntitiesAsync

2017-03-21 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935524#comment-15935524
 ] 

Varun Saxena commented on YARN-6357:


By the way, coming to the patch: although almost all the writer implementations 
will have asynchronous writes and some buffering, there is no guarantee of that 
(unlikely, but a writer implementation can choose to persist data to the 
backend immediately and implement flush as a no-op), so let's name the method 
writeTimelineEntities instead of writeTimelineEntitiesAsync.
Also, persistOutstandingTimelineEntities can be renamed to 
persistBufferedTimelineEntities, or simply flushEntities.

Other than this, the patch looks fine to me.

> Implement TimelineCollector#putEntitiesAsync
> 
>
> Key: YARN-6357
> URL: https://issues.apache.org/jira/browse/YARN-6357
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: ATSv2, timelineserver
>Affects Versions: YARN-2928
>Reporter: Joep Rottinghuis
>Assignee: Haibo Chen
>  Labels: yarn-5355-merge-blocker
> Attachments: YARN-6357.01.patch, YARN-6357.02.patch
>
>
> As discovered and discussed in YARN-5269 the 
> TimelineCollector#putEntitiesAsync method is currently not implemented and 
> TimelineCollector#putEntities is asynchronous.
> TimelineV2ClientImpl#putEntities and TimelineV2ClientImpl#putEntitiesAsync 
> both correctly call TimelineEntityDispatcher#dispatchEntities(boolean sync,... 
> with the correct argument. This argument does seem to make it into the 
> params, and on the server side TimelineCollectorWebService#putEntities 
> correctly pulls the async parameter from the rest call. See line 156:
> {code}
> boolean isAsync = async != null && async.trim().equalsIgnoreCase("true");
> {code}
> However, this is where the problem starts. It simply calls 
> TimelineCollector#putEntities and ignores the value of isAsync. It should 
> instead have called TimelineCollector#putEntitiesAsync, which is currently 
> not implemented.
> putEntities should call putEntitiesAsync and then, after that, call 
> writer.flush().
> The fact that we flush on close and flush periodically is more a matter of 
> avoiding data loss: the flush on close covers the case where sync is never 
> called, and the periodic flush guards against data from slow writers being 
> buffered for a long time, which would expose us to loss if the collector 
> crashes with data in its buffers. Size-based flush is a different concern, 
> meant to avoid blowing up the memory footprint.
> The spooling behavior is also somewhat separate.
> We have two separate methods on our API putEntities and putEntitiesAsync and 
> they should have different behavior beyond waiting for the request to be sent.
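
A minimal sketch of the relationship described in the last paragraphs (method 
and writer signatures are illustrative, not the committed patch):

{code}
public void putEntitiesAsync(TimelineEntities entities,
    UserGroupInformation callerUgi) throws IOException {
  // Buffered write; periodic flush and flush-on-close guard against loss.
  writer.write(context, entities, callerUgi);
}

public void putEntities(TimelineEntities entities,
    UserGroupInformation callerUgi) throws IOException {
  putEntitiesAsync(entities, callerUgi);
  // The sync variant returns only after data is pushed to the backend.
  writer.flush();
}
{code}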






[jira] [Updated] (YARN-6339) Improve performance for createAndGetApplicationReport

2017-03-21 Thread yunjiong zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yunjiong zhao updated YARN-6339:

Attachment: YARN-6339.003.patch

[~wangda], good suggestion.
Updated the patch to make logAggregationStatusForAppReport volatile; no changes 
are needed in createAndGetApplicationReport() any more, since it is safe to 
update logAggregationStatusForAppReport inside 
getLogAggregationStatusForAppReport().
Thanks for taking the time to review the patch.

> Improve performance for createAndGetApplicationReport
> -
>
> Key: YARN-6339
> URL: https://issues.apache.org/jira/browse/YARN-6339
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
> Attachments: YARN-6339.001.patch, YARN-6339.002.patch, 
> YARN-6339.003.patch
>
>
> There are two performance issues when calling createAndGetApplicationReport:
> One is inside ProtoUtils.convertFromProtoFormat: replace is too slow for 
> clusters which have more than 3000 nodes; using substring is much better: 
> https://issues.apache.org/jira/browse/YARN-6285?focusedCommentId=15923241&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15923241
> The other is inside getLogAggregationReportsForApp: if some application's 
> LogAggregationStatus is TIME_OUT, every time it is called it creates a 
> HashMap, which produces lots of garbage.
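
The first point in essence (a simplified illustration; PREFIX stands for the 
constant prefix stripped in convertFromProtoFormat):

{code}
// String.replace compiles a literal Pattern on every call (JDK 8), which is
// costly when converting reports for thousands of nodes:
String slow = protoName.replace(PREFIX, "");

// substring is a single array copy, since the prefix is always at index 0:
String fast = protoName.substring(PREFIX.length());
{code}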






[jira] [Commented] (YARN-6372) Add default value for NM disk validator

2017-03-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935506#comment-15935506
 ] 

Hadoop QA commented on YARN-6372:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
30s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m  
7s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m 59s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6372 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12859798/YARN-6372.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 8ee0afb94e31 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / cc938e9 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/15350/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/15350/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add default value for NM disk validator
> ---
>
> Key: YARN-6372
> URL: https://issues.apache.org/jira/browse/YARN-6372
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.7.3, 3.0.0-alpha2
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> 

[jira] [Commented] (YARN-5934) Fix TestTimelineWebServices.testPrimaryFilterNumericString

2017-03-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935496#comment-15935496
 ] 

Hadoop QA commented on YARN-5934:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
42s{color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the 
patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m  2s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5934 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12840386/YARN-5934.01.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 5a153d6273cb 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 
09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / cc938e9 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/15351/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/15351/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Fix TestTimelineWebServices.testPrimaryFilterNumericString
> --
>
> Key: YARN-5934
> URL: https://issues.apache.org/jira/browse/YARN-5934
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
> Attachments: YARN-5934.01.patch
>
>
> {noformat}
> Running 

[jira] [Commented] (YARN-6326) Shouldn't use AppAttemptIds to fetch applications while AM Simulator tracks app in SLS

2017-03-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935490#comment-15935490
 ] 

Hudson commented on YARN-6326:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11439 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11439/])
YARN-6326. Shouldn't use AppAttemptIds to fetch applications while AM (rkanter: 
rev cc938e99ec0904824c8072184eff75619fcaf040)
* (edit) 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/FairSchedulerMetrics.java
* (edit) 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SchedulerWrapper.java
* (edit) 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java
* (edit) 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SchedulerMetrics.java
* (edit) 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/appmaster/AMSimulator.java
* (edit) 
hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/SLSCapacityScheduler.java
* (edit) 
hadoop-tools/hadoop-sls/src/test/java/org/apache/hadoop/yarn/sls/appmaster/TestAMSimulator.java


> Shouldn't use AppAttemptIds to fetch applications while AM Simulator tracks 
> app in SLS
> --
>
> Key: YARN-6326
> URL: https://issues.apache.org/jira/browse/YARN-6326
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: YARN-6326.001.patch, YARN-6326.002.patch, 
> YARN-6326.003.patch, YARN-6326.004.patch, YARN-6326.005.patch
>
>
> This causes an NPE. Besides the NPE, the metrics won't reflect the 
> different attempts. We should pass ApplicationId instead of AppAttemptId. The 
> NPE caused by the issue:
> {code}
> 2017-03-13 20:43:39,153 INFO appmaster.AMSimulator: Submit a new application 
> application_1489463017173_0001
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getApplicationAttempt(AbstractYarnScheduler.java:327)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.getSchedulerApp(FairScheduler.java:1028)
>   at 
> org.apache.hadoop.yarn.sls.scheduler.FairSchedulerMetrics.trackApp(FairSchedulerMetrics.java:68)
>   at 
> org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.addTrackedApp(ResourceSchedulerWrapper.java:799)
>   at 
> org.apache.hadoop.yarn.sls.appmaster.AMSimulator.trackApp(AMSimulator.java:338)
>   at 
> org.apache.hadoop.yarn.sls.appmaster.AMSimulator.firstStep(AMSimulator.java:156)
>   at 
> org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:90)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Exception in thread "pool-6-thread-1" java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:105)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}






[jira] [Assigned] (YARN-6371) NodeHealthCheckerService member vars should be final

2017-03-21 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton reassigned YARN-6371:
--

Assignee: Zola Petkovic  (was: Daniel Templeton)

> NodeHealthCheckerService member vars should be final
> 
>
> Key: YARN-6371
> URL: https://issues.apache.org/jira/browse/YARN-6371
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha2
>Reporter: Daniel Templeton
>Assignee: Zola Petkovic
>Priority: Trivial
>  Labels: newbie
>
> {code}
>   private NodeHealthScriptRunner nodeHealthScriptRunner;
>   private LocalDirsHandlerService dirsHandler;
> {code}
> They can both be final.
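
A sketch of the suggested change (assuming the references can be assigned in 
the constructor rather than later in the service lifecycle):

{code}
private final NodeHealthScriptRunner nodeHealthScriptRunner;
private final LocalDirsHandlerService dirsHandler;

public NodeHealthCheckerService(NodeHealthScriptRunner scriptRunner,
    LocalDirsHandlerService dirsHandler) {
  super(NodeHealthCheckerService.class.getName());
  this.nodeHealthScriptRunner = scriptRunner;
  this.dirsHandler = dirsHandler;
}
{code}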






[jira] [Updated] (YARN-6372) Add default value for NM disk validator

2017-03-21 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-6372:
---
Attachment: YARN-6372.001.patch

> Add default value for NM disk validator
> ---
>
> Key: YARN-6372
> URL: https://issues.apache.org/jira/browse/YARN-6372
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.7.3, 3.0.0-alpha2
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: YARN-6372.001.patch
>
>
> YARN-5137 made DiskChecker pluggable in the NodeManager. We should provide a 
> default value in case the NM doesn't provide the configuration item.
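
A sketch of the kind of default being proposed (the key and default value are 
illustrative, not necessarily what the patch uses):

{code}
// Fall back to the basic DiskChecker-backed validator when the NM
// configuration does not name one.
String validatorName = conf.get(
    "yarn.nodemanager.disk-validator", // hypothetical key
    "basic");                          // hypothetical default
DiskValidator diskValidator = DiskValidatorFactory.getInstance(validatorName);
{code}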






[jira] [Commented] (YARN-6326) Shouldn't use AppAttemptIds to fetch applications while AM Simulator tracks app in SLS

2017-03-21 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935452#comment-15935452
 ] 

Robert Kanter commented on YARN-6326:
-

I've pushed it to trunk, but this doesn't apply cleanly to branch-2.  Can you 
upload a branch-2 version of the patch?

> Shouldn't use AppAttemptIds to fetch applications while AM Simulator tracks 
> app in SLS
> --
>
> Key: YARN-6326
> URL: https://issues.apache.org/jira/browse/YARN-6326
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: YARN-6326.001.patch, YARN-6326.002.patch, 
> YARN-6326.003.patch, YARN-6326.004.patch, YARN-6326.005.patch
>
>
> This causes an NPE. Besides the NPE, the metrics won't reflect the 
> different attempts. We should pass ApplicationId instead of AppAttemptId. The 
> NPE caused by the issue:
> {code}
> 2017-03-13 20:43:39,153 INFO appmaster.AMSimulator: Submit a new application 
> application_1489463017173_0001
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getApplicationAttempt(AbstractYarnScheduler.java:327)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.getSchedulerApp(FairScheduler.java:1028)
>   at 
> org.apache.hadoop.yarn.sls.scheduler.FairSchedulerMetrics.trackApp(FairSchedulerMetrics.java:68)
>   at 
> org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.addTrackedApp(ResourceSchedulerWrapper.java:799)
>   at 
> org.apache.hadoop.yarn.sls.appmaster.AMSimulator.trackApp(AMSimulator.java:338)
>   at 
> org.apache.hadoop.yarn.sls.appmaster.AMSimulator.firstStep(AMSimulator.java:156)
>   at 
> org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:90)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Exception in thread "pool-6-thread-1" java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:105)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}






[jira] [Commented] (YARN-5934) Fix TestTimelineWebServices.testPrimaryFilterNumericString

2017-03-21 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935443#comment-15935443
 ] 

Varun Saxena commented on YARN-5934:


bq. It looks like Jackson1 parses 123abc to 123; on the other hand, Jackson2 
parses 123abc to "123abc". YARN-3723 documented that the Jackson library 
decides how the value is cast, so I think we can remove the tests.
Agree. Will commit it.

> Fix TestTimelineWebServices.testPrimaryFilterNumericString
> --
>
> Key: YARN-5934
> URL: https://issues.apache.org/jira/browse/YARN-5934
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
> Attachments: YARN-5934.01.patch
>
>
> {noformat}
> Running org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices
> Tests run: 28, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 43.297 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices
> testPrimaryFilterNumericString(org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices)
>   Time elapsed: 1.209 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<0> but was:<3>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.timeline.webapp.TestTimelineWebServices.testPrimaryFilterNumericString(TestTimelineWebServices.java:348)
> {noformat}






[jira] [Commented] (YARN-6326) Shouldn't use AppAttemptIds to fetch applications while AM Simulator tracks app in SLS

2017-03-21 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935440#comment-15935440
 ] 

Robert Kanter commented on YARN-6326:
-

+1

> Shouldn't use AppAttemptIds to fetch applications while AM Simulator tracks 
> app in SLS
> --
>
> Key: YARN-6326
> URL: https://issues.apache.org/jira/browse/YARN-6326
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: YARN-6326.001.patch, YARN-6326.002.patch, 
> YARN-6326.003.patch, YARN-6326.004.patch, YARN-6326.005.patch
>
>
> This causes an NPE. Besides the NPE, the metrics won't reflect the 
> different attempts. We should pass ApplicationId instead of AppAttemptId. The 
> NPE caused by the issue:
> {code}
> 2017-03-13 20:43:39,153 INFO appmaster.AMSimulator: Submit a new application 
> application_1489463017173_0001
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getApplicationAttempt(AbstractYarnScheduler.java:327)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.getSchedulerApp(FairScheduler.java:1028)
>   at 
> org.apache.hadoop.yarn.sls.scheduler.FairSchedulerMetrics.trackApp(FairSchedulerMetrics.java:68)
>   at 
> org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.addTrackedApp(ResourceSchedulerWrapper.java:799)
>   at 
> org.apache.hadoop.yarn.sls.appmaster.AMSimulator.trackApp(AMSimulator.java:338)
>   at 
> org.apache.hadoop.yarn.sls.appmaster.AMSimulator.firstStep(AMSimulator.java:156)
>   at 
> org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:90)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Exception in thread "pool-6-thread-1" java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.sls.scheduler.TaskRunner$Task.run(TaskRunner.java:105)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}






[jira] [Created] (YARN-6372) Add default value for NM disk validator

2017-03-21 Thread Yufei Gu (JIRA)
Yufei Gu created YARN-6372:
--

 Summary: Add default value for NM disk validator
 Key: YARN-6372
 URL: https://issues.apache.org/jira/browse/YARN-6372
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: nodemanager
Affects Versions: 3.0.0-alpha2, 2.7.3
Reporter: Yufei Gu
Assignee: Yufei Gu


YARN-5137 made DiskChecker pluggable in the NodeManager. We should provide a default 
value in case the NM doesn't set the configuration item.
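
A minimal sketch of what the default could look like, assuming the {{DiskValidatorFactory}} API introduced with the pluggable checker (the configuration key and default name here are illustrative, not the final choice):

{code}
// Fall back to the basic validator when the configuration is absent.
String validatorName = conf.get(
    "yarn.nodemanager.disk-validator", "basic");
DiskValidator diskValidator = DiskValidatorFactory.getInstance(validatorName);
{code}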






[jira] [Commented] (YARN-5952) Create REST API for changing YARN scheduler configurations

2017-03-21 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935392#comment-15935392
 ] 

Xuan Gong commented on YARN-5952:
-

Thanks for the patch, [~jhung].

The patch looks good to me.


[~leftnoteasy], could you take a look as well?

> Create REST API for changing YARN scheduler configurations
> --
>
> Key: YARN-5952
> URL: https://issues.apache.org/jira/browse/YARN-5952
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-5952.001.patch, YARN-5952.002.patch, 
> YARN-5952-YARN-5734.003.patch, YARN-5952-YARN-5734.004.patch
>
>
> Based on the design in YARN-5734.






[jira] [Commented] (YARN-6284) hasAlreadyRun should be final in ResourceManager.StandByTransitionRunnable

2017-03-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935365#comment-15935365
 ] 

Hudson commented on YARN-6284:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #11438 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11438/])
YARN-6284. hasAlreadyRun should be final in (templedf: rev 
0a05c5c5989edeba2cffe16e80350245778cefce)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java


> hasAlreadyRun should be final in ResourceManager.StandByTransitionRunnable
> --
>
> Key: YARN-6284
> URL: https://issues.apache.org/jira/browse/YARN-6284
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0-alpha2
>Reporter: Daniel Templeton
>Assignee: Laura Adams
>  Labels: newbie
> Attachments: YARN-6284.001.patch
>
>
> {code}
> // The atomic variable to make sure multiple threads with the same 
> runnable
> // run only once.
> private AtomicBoolean hasAlreadyRun = new AtomicBoolean(false);
> {code}
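
For clarity, the committed change amounts to declaring the field final so the reference cannot be reassigned:

{code}
// The atomic variable to make sure multiple threads with the same
// runnable run only once.
private final AtomicBoolean hasAlreadyRun = new AtomicBoolean(false);
{code}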






[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly

2017-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935348#comment-15935348
 ] 

ASF GitHub Bot commented on YARN-6302:
--

Github user szegedim commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/200#discussion_r107277279
  
--- Diff: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java
 ---
@@ -111,6 +113,58 @@
   private LinuxContainerRuntime linuxContainerRuntime;
 
   /**
+   * The container exit code.
+   */
+  public enum LinuxContainerExecutorExitCode {
--- End diff --

Done.


> Fail the node, if Linux Container Executor is not configured properly
> -
>
> Key: YARN-6302
> URL: https://issues.apache.org/jira/browse/YARN-6302
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
>
> We have a cluster that has one node with misconfigured Linux Container 
> Executor. Every time an AM or regular container is launched on the cluster, 
> it will fail. The node will still have resources available, so it keeps 
> failing apps until the administrator notices the issue and decommissions the 
> node. AM Blacklisting only helps if the application is already running.
> As a possible improvement, when the LCE is used on the cluster and a NM gets 
> certain errors back from the LCE, like error 24 configuration not found, we 
> should not try to allocate anything on the node anymore or shut down the node 
> entirely. That kind of problem normally does not fix itself and it means that 
> nothing can really run on that node.
> {code}
> Application application_1488920587909_0010 failed 2 times due to AM Container 
> for appattempt_1488920587909_0010_02 exited with exitCode: -1000
> Failing this attempt.Diagnostics: Application application_1488920587909_0010 
> initialization failed (exitCode=24) with output:
> For more detailed output, check the application tracking page: 
> http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then 
> click on links to logs of each attempt.
> . Failing the application.
> {code}






[jira] [Updated] (YARN-6050) AMs can't be scheduled on racks or nodes

2017-03-21 Thread Robert Kanter (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-6050:

Attachment: YARN-6050.012.patch

Thanks [~kasha].  The 012 patch:
# I removed all Node Label changes.  It now strips out the wildcard-port 
{{NodeId}}s.
# I didn't add a {{getNodeTracker}} method because {{ClusterNodeTracker}} has 
generics and things were getting really ugly with some compiler warnings and 
issues with rawtypes.  Instead, I kept {{getClusterNodeIdsByResourceName}}, which 
returns a {{List<NodeId>}} as before, but I moved it out of {{YarnScheduler}} 
into {{ResourceScheduler}} and instead of implementing it in each of the 
Scheduler subclasses, I only implemented it in {{AbstractYarnScheduler}}.  I 
think this is the cleanest solution.
# I added some comments to {{RMUtils#getApplicableNodeCountForAM}} to make it 
easier to follow.

I've also put it on ReviewBoard if that's easier to look at: 
https://reviews.apache.org/r/57819/

> AMs can't be scheduled on racks or nodes
> 
>
> Key: YARN-6050
> URL: https://issues.apache.org/jira/browse/YARN-6050
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-6050.001.patch, YARN-6050.002.patch, 
> YARN-6050.003.patch, YARN-6050.004.patch, YARN-6050.005.patch, 
> YARN-6050.006.patch, YARN-6050.007.patch, YARN-6050.008.patch, 
> YARN-6050.009.patch, YARN-6050.010.patch, YARN-6050.011.patch, 
> YARN-6050.012.patch
>
>
> Yarn itself supports rack/node aware scheduling for AMs; however, there 
> currently are two problems:
> # To specify hard or soft rack/node requests, you have to specify more than 
> one {{ResourceRequest}}.  For example, if you want to schedule an AM only on 
> "rackA", you have to create two {{ResourceRequest}}, like this:
> {code}
> ResourceRequest.newInstance(PRIORITY, ANY, CAPABILITY, NUM_CONTAINERS, false);
> ResourceRequest.newInstance(PRIORITY, "rackA", CAPABILITY, NUM_CONTAINERS, 
> true);
> {code}
> The problem is that the Yarn API doesn't actually allow you to specify more 
> than one {{ResourceRequest}} in the {{ApplicationSubmissionContext}}.  The 
> current behavior is to either build one from {{getResource}} or directly from 
> {{getAMContainerResourceRequest}}, depending on if 
> {{getAMContainerResourceRequest}} is null or not.  We'll need to add a third 
> method, say {{getAMContainerResourceRequests}}, which takes a list of 
> {{ResourceRequest}} so that clients can specify the multiple resource 
> requests.
> # There are some places where things are hardcoded to overwrite what the 
> client specifies.  These are pretty straightforward to fix.






[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly

2017-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935336#comment-15935336
 ] 

ASF GitHub Bot commented on YARN-6302:
--

Github user szegedim commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/200#discussion_r107274199
  
--- Diff: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/exceptions/ConfigurationException.java
 ---
@@ -19,13 +19,12 @@
 package org.apache.hadoop.yarn.exceptions;
 
 import org.apache.hadoop.classification.InterfaceAudience.Public;
-import org.apache.hadoop.classification.InterfaceStability.Unstable;
 
 /**
- * This exception is thrown on unrecoverable container launch errors.
+ * This exception is thrown on unrecoverable configuration errors.
+ * An example is container launch error due to configuration.
  */
 @Public
--- End diff --

All right.


> Fail the node, if Linux Container Executor is not configured properly
> -
>
> Key: YARN-6302
> URL: https://issues.apache.org/jira/browse/YARN-6302
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
>
> We have a cluster that has one node with misconfigured Linux Container 
> Executor. Every time an AM or regular container is launched on the cluster, 
> it will fail. The node will still have resources available, so it keeps 
> failing apps until the administrator notices the issue and decommissions the 
> node. AM Blacklisting only helps if the application is already running.
> As a possible improvement, when the LCE is used on the cluster and a NM gets 
> certain errors back from the LCE, like error 24 configuration not found, we 
> should not try to allocate anything on the node anymore or shut down the node 
> entirely. That kind of problem normally does not fix itself and it means that 
> nothing can really run on that node.
> {code}
> Application application_1488920587909_0010 failed 2 times due to AM Container 
> for appattempt_1488920587909_0010_02 exited with exitCode: -1000
> Failing this attempt.Diagnostics: Application application_1488920587909_0010 
> initialization failed (exitCode=24) with output:
> For more detailed output, check the application tracking page: 
> http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then 
> click on links to logs of each attempt.
> . Failing the application.
> {code}






[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly

2017-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935337#comment-15935337
 ] 

ASF GitHub Bot commented on YARN-6302:
--

Github user szegedim commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/200#discussion_r107274254
  
--- Diff: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/exceptions/ConfigurationException.java
 ---
@@ -0,0 +1,43 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.yarn.exceptions;
+
+import org.apache.hadoop.classification.InterfaceAudience.Public;
+
+/**
+ * This exception is thrown on unrecoverable configuration errors.
+ * An example is container launch error due to configuration.
+ */
+@Public
--- End diff --

All right.


> Fail the node, if Linux Container Executor is not configured properly
> -
>
> Key: YARN-6302
> URL: https://issues.apache.org/jira/browse/YARN-6302
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
>
> We have a cluster that has one node with misconfigured Linux Container 
> Executor. Every time an AM or regular container is launched on the cluster, 
> it will fail. The node will still have resources available, so it keeps 
> failing apps until the administrator notices the issue and decommissions the 
> node. AM Blacklisting only helps if the application is already running.
> As a possible improvement, when the LCE is used on the cluster and a NM gets 
> certain errors back from the LCE, like error 24 configuration not found, we 
> should not try to allocate anything on the node anymore or shut down the node 
> entirely. That kind of problem normally does not fix itself and it means that 
> nothing can really run on that node.
> {code}
> Application application_1488920587909_0010 failed 2 times due to AM Container 
> for appattempt_1488920587909_0010_02 exited with exitCode: -1000
> Failing this attempt.Diagnostics: Application application_1488920587909_0010 
> initialization failed (exitCode=24) with output:
> For more detailed output, check the application tracking page: 
> http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then 
> click on links to logs of each attempt.
> . Failing the application.
> {code}






[jira] [Created] (YARN-6371) NodeHealthCheckerService member vars should be final

2017-03-21 Thread Daniel Templeton (JIRA)
Daniel Templeton created YARN-6371:
--

 Summary: NodeHealthCheckerService member vars should be final
 Key: YARN-6371
 URL: https://issues.apache.org/jira/browse/YARN-6371
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 3.0.0-alpha2
Reporter: Daniel Templeton
Assignee: Daniel Templeton
Priority: Trivial


{code}
  private NodeHealthScriptRunner nodeHealthScriptRunner;
  private LocalDirsHandlerService dirsHandler;
{code}

They can both be final.






[jira] [Comment Edited] (YARN-6342) Issues in async API of TimelineClient

2017-03-21 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935313#comment-15935313
 ] 

Varun Saxena edited comment on YARN-6342 at 3/21/17 8:53 PM:
-

[~haibochen], correct. FutureTask#run won't throw any exception.

How about fixing the other issue which I pointed out?
While stopping the NM Timeline Publisher, i.e. in NMTimelinePublisher#serviceStop, we 
are not explicitly stopping app-related timeline clients, which means that 
some of the entities lying around in the queue may not be written to the 
collector...
Thoughts?


was (Author: varun_saxena):
[~haibochen], correct. FutureTask#run won't throw any exception.

How about fixing the other issue which I pointed out?
While stopping, i.e. in NMTimelinePublisher#serviceStop, we are not explicitly 
stopping app-related timeline clients, which means that some of the 
entities lying around in the queue may not be written...
Thoughts?

> Issues in async API of TimelineClient
> -
>
> Key: YARN-6342
> URL: https://issues.apache.org/jira/browse/YARN-6342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Haibo Chen
>  Labels: yarn-5355-merge-blocker
>
> Found these with [~rohithsharma] while browsing the code
> - In stop: it calls shutdownNow which doesn't wait for pending tasks; should 
> it use shutdown instead?
> {code}
> public void stop() {
>   LOG.info("Stopping TimelineClient.");
>   executor.shutdownNow();
>   try {
> executor.awaitTermination(DRAIN_TIME_PERIOD, TimeUnit.MILLISECONDS);
>   } catch (InterruptedException e) {
> {code}
> - In TimelineClientImpl#createRunnable:
> If any exception happens when publishing one entity 
> (publishWithoutBlockingOnQueue), the thread exits. I think it should try its 
> best to continue publishing the timeline entities; one failure should 
> not prevent all follow-up entities from being published.






[jira] [Commented] (YARN-6342) Issues in async API of TimelineClient

2017-03-21 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935313#comment-15935313
 ] 

Varun Saxena commented on YARN-6342:


[~haibochen], correct. FutureTask#run won't throw any exception.

How about fixing the other issue which I pointed out?
While stopping, i.e. in NMTimelinePublisher#serviceStop, we are not explicitly 
stopping app-related timeline clients, which means that some of the 
entities lying around in the queue may not be written...
Thoughts?
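
A rough sketch of that idea, with assumed names ({{appToClientMap}} stands in for whatever per-application client registry the publisher keeps; this is the shape of the change, not the actual NMTimelinePublisher code):

{code}
@Override
protected void serviceStop() throws Exception {
  // Stop each per-application TimelineClient first so entities still
  // queued inside the clients get a chance to be drained and published.
  for (TimelineClient client : appToClientMap.values()) {
    client.stop();
  }
  super.serviceStop();
}
{code}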

> Issues in async API of TimelineClient
> -
>
> Key: YARN-6342
> URL: https://issues.apache.org/jira/browse/YARN-6342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Haibo Chen
>  Labels: yarn-5355-merge-blocker
>
> Found these with [~rohithsharma] while browsing the code
> - In stop: it calls shutdownNow which doesn't wait for pending tasks; should 
> it use shutdown instead?
> {code}
> public void stop() {
>   LOG.info("Stopping TimelineClient.");
>   executor.shutdownNow();
>   try {
> executor.awaitTermination(DRAIN_TIME_PERIOD, TimeUnit.MILLISECONDS);
>   } catch (InterruptedException e) {
> {code}
> - In TimelineClientImpl#createRunnable:
> If any exception happens when publishing one entity 
> (publishWithoutBlockingOnQueue), the thread exits. I think it should try its 
> best to continue publishing the timeline entities; one failure should 
> not prevent all follow-up entities from being published.






[jira] [Commented] (YARN-5368) memory leak at timeline server

2017-03-21 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935302#comment-15935302
 ] 

Varun Saxena commented on YARN-5368:


[~jeagles], nice catch.
Closing only the last iterator in the loop must be the reason for the leak. 
The leak we got in the NM, albeit due to our private code, was also due to a 
DBIterator not being closed.

Using the try-with-resources approach for DBIterator should be fine. 
How about using try-with-resources for DBIterator elsewhere in the 
RollingLevelDBTimelineStore class, i.e. where it's not used in a loop, just to 
make the code consistent?
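
A minimal sketch of the pattern, assuming the {{org.iq80.leveldb}} API that the timeline store builds on:

{code}
import java.util.Map;
import org.iq80.leveldb.DB;
import org.iq80.leveldb.DBIterator;

static void scan(DB db) throws Exception {
  // try-with-resources closes the iterator even if processing throws
  // or the loop exits early.
  try (DBIterator iterator = db.iterator()) {
    for (iterator.seekToFirst(); iterator.hasNext(); iterator.next()) {
      Map.Entry<byte[], byte[]> entry = iterator.peekNext();
      // process entry ...
    }
  }
}
{code}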

> memory leak at timeline server
> --
>
> Key: YARN-5368
> URL: https://issues.apache.org/jira/browse/YARN-5368
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.7.1
> Environment: HDP2.4
> CentOS 6.7
> jdk1.8.0_72
>Reporter: Wataru Yukawa
>Assignee: Jonathan Eagles
> Attachments: YARN-5368.1.patch
>
>
> Memory usage of the timeline server machine increases gradually:
> https://gyazo.com/952dad96c77ae053bae2e4d8c8ab0572
> Please check the graph since April.
> According to my investigation, the timeline server used about 25 GB.
> top command result
> {code}
> 90577 yarn  20   0 28.4g  25g  12m S  0.0 40.1   5162:53 
> /usr/java/jdk1.8.0_72/bin/java -Dproc_timelineserver -Xmx1024m 
> -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn ...
> {code}
> ps command result
> {code}
> $ ps ww 90577
>  90577 ?Sl   5162:53 /usr/java/jdk1.8.0_72/bin/java 
> -Dproc_timelineserver -Xmx1024m -Dhdp.version=2.4.0.0-169 
> -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn 
> -Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.log.file=yarn-yarn-timelineserver-myhost.log -Dyarn.home.dir= 
> -Dyarn.id.str=yarn -Dhadoop.root.logger=INFO,EWMA,RFA 
> -Dyarn.root.logger=INFO,EWMA,RFA 
> -Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
>  -Dyarn.policy.file=hadoop-policy.xml 
> -Djava.io.tmpdir=/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir 
> -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn 
> -Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.home.dir=/usr/hdp/current/hadoop-yarn-timelineserver 
> -Dhadoop.home.dir=/usr/hdp/2.4.0.0-169/hadoop 
> -Dhadoop.root.logger=INFO,EWMA,RFA -Dyarn.root.logger=INFO,EWMA,RFA 
> -Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
>  -classpath 
> /usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/lib/*:/usr/hdp/2.4.0.0-169/hadoop/.//*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/./:/usr/hdp/2.4.0.0-169/hadoop-hdfs/lib/*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/.//*:/usr/hdp/2.4.0.0-169/hadoop-yarn/lib/*:/usr/hdp/2.4.0.0-169/hadoop-yarn/.//*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/lib/*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/.//*::/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/current/hadoop-yarn-timelineserver/.//*:/usr/hdp/current/hadoop-yarn-timelineserver/lib/*:/usr/hdp/2.4.0.0-169/hadoop/conf/timelineserver-config/log4j.properties
>  
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
> {code}
>  
> Although I set -Xmx1024m, the actual memory usage is 25 GB.
> After I restarted the timeline server, memory usage of the timeline server machine 
> decreased:
> https://gyazo.com/130600c17a7d41df8606727a859ae7e3
> Now the timeline server uses less than 1 GB of memory.
> top command result
> {code}
>  6163 yarn  20   0 3959m 783m  46m S  0.3  1.2   3:37.60 
> /usr/java/jdk1.8.0_72/bin/java -Dproc_timelineserver -Xmx1024m 
> -Dhdp.version=2.4.0.0-169 ...
> {code}
> I suspect a memory leak in the timeline server.






[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly

2017-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935295#comment-15935295
 ] 

ASF GitHub Bot commented on YARN-6302:
--

Github user templedf commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/200#discussion_r107266403
  
--- Diff: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.h
 ---
@@ -37,8 +37,8 @@ enum command {
 
 enum errorcodes {
   INVALID_ARGUMENT_NUMBER = 1,
-  INVALID_USER_NAME, //2
-  INVALID_COMMAND_PROVIDED, //3
+  //INVALID_USER_NAME 2
--- End diff --

Yeah, I didn't mean it was your fault.  Salvaging this code isn't your 
problem. :)


> Fail the node, if Linux Container Executor is not configured properly
> -
>
> Key: YARN-6302
> URL: https://issues.apache.org/jira/browse/YARN-6302
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
>
> We have a cluster that has one node with misconfigured Linux Container 
> Executor. Every time an AM or regular container is launched on the cluster, 
> it will fail. The node will still have resources available, so it keeps 
> failing apps until the administrator notices the issue and decommissions the 
> node. AM Blacklisting only helps if the application is already running.
> As a possible improvement, when the LCE is used on the cluster and a NM gets 
> certain errors back from the LCE, like error 24 configuration not found, we 
> should not try to allocate anything on the node anymore or shut down the node 
> entirely. That kind of problem normally does not fix itself and it means that 
> nothing can really run on that node.
> {code}
> Application application_1488920587909_0010 failed 2 times due to AM Container 
> for appattempt_1488920587909_0010_02 exited with exitCode: -1000
> Failing this attempt.Diagnostics: Application application_1488920587909_0010 
> initialization failed (exitCode=24) with output:
> For more detailed output, check the application tracking page: 
> http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then 
> click on links to logs of each attempt.
> . Failing the application.
> {code}






[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly

2017-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935293#comment-15935293
 ] 

ASF GitHub Bot commented on YARN-6302:
--

Github user templedf commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/200#discussion_r107266532
  
--- Diff: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/exceptions/ConfigurationException.java
 ---
@@ -0,0 +1,43 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.yarn.exceptions;
+
+import org.apache.hadoop.classification.InterfaceAudience.Public;
+
+/**
+ * This exception is thrown on unrecoverable configuration errors.
+ * An example is container launch error due to configuration.
+ */
+@Public
--- End diff --

Should this be @Evolving?
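
For reference, a sketch of what the suggested annotations would look like (assuming the exception extends {{YarnException}} like its siblings in this package):

{code}
import org.apache.hadoop.classification.InterfaceAudience.Public;
import org.apache.hadoop.classification.InterfaceStability.Evolving;

@Public
@Evolving
public class ConfigurationException extends YarnException {
  public ConfigurationException(String message) {
    super(message);
  }
}
{code}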


> Fail the node, if Linux Container Executor is not configured properly
> -
>
> Key: YARN-6302
> URL: https://issues.apache.org/jira/browse/YARN-6302
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
>
> We have a cluster that has one node with misconfigured Linux Container 
> Executor. Every time an AM or regular container is launched on the cluster, 
> it will fail. The node will still have resources available, so it keeps 
> failing apps until the administrator notices the issue and decommissions the 
> node. AM Blacklisting only helps if the application is already running.
> As a possible improvement, when the LCE is used on the cluster and a NM gets 
> certain errors back from the LCE, like error 24 configuration not found, we 
> should not try to allocate anything on the node anymore or shut down the node 
> entirely. That kind of problem normally does not fix itself and it means that 
> nothing can really run on that node.
> {code}
> Application application_1488920587909_0010 failed 2 times due to AM Container 
> for appattempt_1488920587909_0010_02 exited with exitCode: -1000
> Failing this attempt.Diagnostics: Application application_1488920587909_0010 
> initialization failed (exitCode=24) with output:
> For more detailed output, check the application tracking page: 
> http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then 
> click on links to logs of each attempt.
> . Failing the application.
> {code}






[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly

2017-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935294#comment-15935294
 ] 

ASF GitHub Bot commented on YARN-6302:
--

Github user templedf commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/200#discussion_r107268039
  
--- Diff: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java
 ---
@@ -111,6 +113,58 @@
   private LinuxContainerRuntime linuxContainerRuntime;
 
   /**
+   * The container exit code.
+   */
+  public enum LinuxContainerExecutorExitCode {
--- End diff --

Since this is an inner class of LCE, you can safely drop the LCE from the 
enum name, which will make the subsequent code less messy.


> Fail the node, if Linux Container Executor is not configured properly
> -
>
> Key: YARN-6302
> URL: https://issues.apache.org/jira/browse/YARN-6302
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
>
> We have a cluster that has one node with misconfigured Linux Container 
> Executor. Every time an AM or regular container is launched on the cluster, 
> it will fail. The node will still have resources available, so it keeps 
> failing apps until the administrator notices the issue and decommissions the 
> node. AM Blacklisting only helps if the application is already running.
> As a possible improvement, when the LCE is used on the cluster and a NM gets 
> certain errors back from the LCE, like error 24 configuration not found, we 
> should not try to allocate anything on the node anymore or shut down the node 
> entirely. That kind of problem normally does not fix itself and it means that 
> nothing can really run on that node.
> {code}
> Application application_1488920587909_0010 failed 2 times due to AM Container 
> for appattempt_1488920587909_0010_02 exited with exitCode: -1000
> Failing this attempt.Diagnostics: Application application_1488920587909_0010 
> initialization failed (exitCode=24) with output:
> For more detailed output, check the application tracking page: 
> http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then 
> click on links to logs of each attempt.
> . Failing the application.
> {code}






[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly

2017-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935297#comment-15935297
 ] 

ASF GitHub Bot commented on YARN-6302:
--

Github user templedf commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/200#discussion_r107268430
  
--- Diff: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java
 ---
@@ -525,6 +580,23 @@ public int launchContainer(ContainerStartContext ctx) 
throws IOException {
 logOutput(diagnostics);
 container.handle(new ContainerDiagnosticsUpdateEvent(containerId,
 diagnostics));
+if (exitCode == LinuxContainerExecutorExitCode.
--- End diff --

Right.  Forgot about that.  You'd basically have to recreate the same code 
in the enum to get an instance from an int.

Maybe add an equals() method to the enum that can compare against ints as 
well?  Maybe not worth it.  Just shortening the enum name may be enough...
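
Worth noting that {{Enum#equals}} is final in Java, so the comparison helper would need a different name. A short sketch with illustrative names:

{code}
public enum ExitCode {
  INVALID_CONFIG_FILE(24);

  private final int code;

  ExitCode(int code) {
    this.code = code;
  }

  public int getCode() {
    return code;
  }

  // Enum#equals is final, so a dedicated helper compares against the
  // raw int returned by the container-executor binary.
  public boolean matches(int exitCode) {
    return this.code == exitCode;
  }
}
{code}

Call sites then read {{ExitCode.INVALID_CONFIG_FILE.matches(exitCode)}}, which both shortens the name and avoids reconstructing an enum instance from the int.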


> Fail the node, if Linux Container Executor is not configured properly
> -
>
> Key: YARN-6302
> URL: https://issues.apache.org/jira/browse/YARN-6302
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
>
> We have a cluster that has one node with misconfigured Linux Container 
> Executor. Every time an AM or regular container is launched on the cluster, 
> it will fail. The node will still have resources available, so it keeps 
> failing apps until the administrator notices the issue and decommissions the 
> node. AM Blacklisting only helps if the application is already running.
> As a possible improvement, when the LCE is used on the cluster and a NM gets 
> certain errors back from the LCE, like error 24 configuration not found, we 
> should not try to allocate anything on the node anymore or shut down the node 
> entirely. That kind of problem normally does not fix itself and it means that 
> nothing can really run on that node.
> {code}
> Application application_1488920587909_0010 failed 2 times due to AM Container 
> for appattempt_1488920587909_0010_02 exited with exitCode: -1000
> Failing this attempt.Diagnostics: Application application_1488920587909_0010 
> initialization failed (exitCode=24) with output:
> For more detailed output, check the application tracking page: 
> http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then 
> click on links to logs of each attempt.
> . Failing the application.
> {code}






[jira] [Commented] (YARN-6302) Fail the node, if Linux Container Executor is not configured properly

2017-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935296#comment-15935296
 ] 

ASF GitHub Bot commented on YARN-6302:
--

Github user templedf commented on a diff in the pull request:

https://github.com/apache/hadoop/pull/200#discussion_r107242843
  
--- Diff: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/exceptions/ConfigurationException.java
 ---
@@ -19,13 +19,12 @@
 package org.apache.hadoop.yarn.exceptions;
 
 import org.apache.hadoop.classification.InterfaceAudience.Public;
-import org.apache.hadoop.classification.InterfaceStability.Unstable;
 
 /**
- * This exception is thrown on unrecoverable container launch errors.
+ * This exception is thrown on unrecoverable configuration errors.
+ * An example is container launch error due to configuration.
  */
 @Public
--- End diff --

Maybe make it evolving?


> Fail the node, if Linux Container Executor is not configured properly
> -
>
> Key: YARN-6302
> URL: https://issues.apache.org/jira/browse/YARN-6302
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
>
> We have a cluster that has one node with misconfigured Linux Container 
> Executor. Every time an AM or regular container is launched on the cluster, 
> it will fail. The node will still have resources available, so it keeps 
> failing apps until the administrator notices the issue and decommissions the 
> node. AM Blacklisting only helps if the application is already running.
> As a possible improvement, when the LCE is used on the cluster and a NM gets 
> certain errors back from the LCE, like error 24 configuration not found, we 
> should not try to allocate anything on the node anymore or shut down the node 
> entirely. That kind of problem normally does not fix itself and it means that 
> nothing can really run on that node.
> {code}
> Application application_1488920587909_0010 failed 2 times due to AM Container 
> for appattempt_1488920587909_0010_02 exited with exitCode: -1000
> Failing this attempt.Diagnostics: Application application_1488920587909_0010 
> initialization failed (exitCode=24) with output:
> For more detailed output, check the application tracking page: 
> http://node-1.domain.com:8088/cluster/app/application_1488920587909_0010 Then 
> click on links to logs of each attempt.
> . Failing the application.
> {code}






[jira] [Commented] (YARN-6368) Decommissioning an NM results in a -1 exit code

2017-03-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935277#comment-15935277
 ] 

Hadoop QA commented on YARN-6368:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 15s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 1 new + 73 unchanged - 0 fixed = 74 total (was 73) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 
59s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 49s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6368 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12859780/YARN-6368.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 7c29a656db7a 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / c01d15a |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/15348/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/15348/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/15348/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Decommissioning an NM results in a -1 exit code
> ---
>
> Key: YARN-6368
> URL: 

[jira] [Comment Edited] (YARN-6339) Improve performance for createAndGetApplicationReport

2017-03-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935235#comment-15935235
 ] 

Wangda Tan edited comment on YARN-6339 at 3/21/17 8:18 PM:
---

[~zhaoyunjiong], thanks for the explanation; it generally makes sense to me.

A few more suggestions:

1) Can we make use of {{isLogAggregationFinished}} in 
{{createAndGetApplicationReport}} to avoid the following checks:
{code}
if (LogAggregationStatus.FAILED == logAggregationStatusApp ||
    LogAggregationStatus.SUCCEEDED == logAggregationStatusApp ||
    LogAggregationStatus.TIME_OUT == logAggregationStatusApp) {
  try {
    this.writeLock.lock();
    this.logAggregationStatusForAppReport = logAggregationStatusApp;
  } finally {
    this.writeLock.unlock();
  }
}
{code}

2) Can we make logAggregationStatusForAppReport volatile to avoid the 
additional write lock in {{createAndGetApplicationReport}}?



was (Author: leftnoteasy):
[~zhaoyunjiong], thanks for the explanation; it generally makes sense to me.

A few more suggestions:

1) Can we make use of {{isLogAggregationFinished}} in 
{{createAndGetApplicationReport}} to avoid the following checks:
{code}
if (LogAggregationStatus.FAILED == logAggregationStatusApp ||
    LogAggregationStatus.SUCCEEDED == logAggregationStatusApp ||
    LogAggregationStatus.TIME_OUT == logAggregationStatusApp) {
  try {
    this.writeLock.lock();
    this.logAggregationStatusForAppReport = logAggregationStatusApp;
  } finally {
    this.writeLock.unlock();
  }
}
{code}

2) Can we make logAggregationStatusForAppReport volatile to avoid the 
additional write lock?


> Improve performance for createAndGetApplicationReport
> -
>
> Key: YARN-6339
> URL: https://issues.apache.org/jira/browse/YARN-6339
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
> Attachments: YARN-6339.001.patch, YARN-6339.002.patch
>
>
> There are two performance issues when calling createAndGetApplicationReport:
> One is inside ProtoUtils.convertFromProtoFormat: replace is too slow for 
> clusters which have more than 3000 nodes, and using substring is much better: 
> https://issues.apache.org/jira/browse/YARN-6285?focusedCommentId=15923241=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15923241
> Another one is inside getLogAggregationReportsForApp: if some application's 
> LogAggregationStatus is TIME_OUT, every time it is called it will create a 
> HashMap, which produces lots of garbage.






[jira] [Commented] (YARN-6339) Improve performance for createAndGetApplicationReport

2017-03-21 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935235#comment-15935235
 ] 

Wangda Tan commented on YARN-6339:
--

[~zhaoyunjiong], thanks for the explanation; it generally makes sense to me.

A few more suggestions:

1) Can we make use of {{isLogAggregationFinished}} in 
{{createAndGetApplicationReport}} to avoid the following checks:
{code}
if (LogAggregationStatus.FAILED == logAggregationStatusApp ||
    LogAggregationStatus.SUCCEEDED == logAggregationStatusApp ||
    LogAggregationStatus.TIME_OUT == logAggregationStatusApp) {
  try {
    this.writeLock.lock();
    this.logAggregationStatusForAppReport = logAggregationStatusApp;
  } finally {
    this.writeLock.unlock();
  }
}
{code}

2) Can we make logAggregationStatusForAppReport volatile to avoid the 
additional write lock?
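
A sketch of how the two suggestions combine (shape of the change only, not the final patch; whether {{isLogAggregationFinished}} takes the status as an argument is an assumption):

{code}
// A volatile reference publishes the single-field write safely
// without taking the write lock on the report path.
private volatile LogAggregationStatus logAggregationStatusForAppReport;

...

if (isLogAggregationFinished(logAggregationStatusApp)) {
  this.logAggregationStatusForAppReport = logAggregationStatusApp;
}
{code}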


> Improve performance for createAndGetApplicationReport
> -
>
> Key: YARN-6339
> URL: https://issues.apache.org/jira/browse/YARN-6339
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
> Attachments: YARN-6339.001.patch, YARN-6339.002.patch
>
>
> There are two performance issues when calling createAndGetApplicationReport:
> One is inside ProtoUtils.convertFromProtoFormat: replace is too slow for 
> clusters which have more than 3000 nodes, and using substring is much better: 
> https://issues.apache.org/jira/browse/YARN-6285?focusedCommentId=15923241=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15923241
> Another one is inside getLogAggregationReportsForApp: if some application's 
> LogAggregationStatus is TIME_OUT, every time it is called it will create a 
> HashMap, which produces lots of garbage.
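
For illustration, the difference boils down to this (the constant name is assumed; on JDK 8, {{String#replace(CharSequence, CharSequence)}} compiles a literal {{Pattern}} on every call, while {{substring}} is a single copy):

{code}
private static final String PREFIX = "NODE_STATE_"; // illustrative

// Slow: compiles a literal Pattern and scans the string on every call.
static String viaReplace(String protoName) {
  return protoName.replace(PREFIX, "");
}

// Fast: one array copy, given the proto enum name always starts with PREFIX.
static String viaSubstring(String protoName) {
  return protoName.substring(PREFIX.length());
}
{code}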






[jira] [Commented] (YARN-5600) Allow setting yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2017-03-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935231#comment-15935231
 ] 

Hadoop QA commented on YARN-5600:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
50s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
32s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 12m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 9s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 522 unchanged - 21 fixed = 522 total (was 543) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
37s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
56s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 
28s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
42s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 97m 18s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5600 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12843484/YARN-5600.017.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux fc83739b5187 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 
15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / c01d15a |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/15346/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 

[jira] [Commented] (YARN-3427) Remove deprecated methods from ResourceCalculatorProcessTree

2017-03-21 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935220#comment-15935220
 ] 

Yufei Gu commented on YARN-3427:


+1 (non-binding)

> Remove deprecated methods from ResourceCalculatorProcessTree
> 
>
> Key: YARN-3427
> URL: https://issues.apache.org/jira/browse/YARN-3427
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.7.0
>Reporter: Karthik Kambatla
>Assignee: Miklos Szegedi
>Priority: Blocker
> Attachments: YARN-3427.000.patch
>
>
> In 2.7, we made ResourceCalculatorProcessTree Public and exposed some 
> existing ill-formed methods as deprecated ones for use by Tez.
> We should remove them in 3.0.0, considering that the methods have been 
> deprecated for all the 2.x.y releases in which the class has been marked Public.






[jira] [Commented] (YARN-6370) Properly handle rack requests for non-active subclusters in LocalityMulticastAMRMProxyPolicy

2017-03-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935194#comment-15935194
 ] 

Hadoop QA commented on YARN-6370:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
32s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
48s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} YARN-2915 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
11s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6370 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12859776/YARN-6370-YARN-2915.v1.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 4c28118af2eb 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-2915 / 3778d87 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/15347/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/15347/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Properly handle rack requests for non-active subclusters in 
> LocalityMulticastAMRMProxyPolicy
> 
>
> Key: YARN-6370
> URL: https://issues.apache.org/jira/browse/YARN-6370
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: 

[jira] [Updated] (YARN-6368) Decommissioning an NM results in a -1 exit code

2017-03-21 Thread Miklos Szegedi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-6368:
-
Attachment: YARN-6368.002.patch

Attaching a fix for the build break.
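
For readers following the thread, the change is roughly of the shape sketched 
below. This is a minimal sketch rather than the attached patch; 
{{rmRequestedShutdown}} is an illustrative stand-in for however the NM records 
that the shutdown was RM-initiated.
{code}
} finally {
  if (shouldExitOnShutdownEvent
      && !ShutdownHookManager.get().isShutdownInProgress()) {
    // Exit cleanly when the RM asked the node to shut down (e.g. a
    // decommission); keep the non-zero code for abnormal termination.
    // rmRequestedShutdown is an illustrative name, not the patch's field.
    ExitUtil.terminate(rmRequestedShutdown ? 0 : -1);
  }
}
{code}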

> Decommissioning an NM results in a -1 exit code
> ---
>
> Key: YARN-6368
> URL: https://issues.apache.org/jira/browse/YARN-6368
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
> Attachments: YARN-6368.000.patch, YARN-6368.001.patch, 
> YARN-6368.002.patch
>
>
> In NodeManager.java we should exit normally when the RM shuts down the 
> node:
> {code}
> } finally {
>   if (shouldExitOnShutdownEvent
>   && !ShutdownHookManager.get().isShutdownInProgress()) {
> ExitUtil.terminate(-1);
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6370) Properly handle rack requests for non-active subclusters in LocalityMulticastAMRMProxyPolicy

2017-03-21 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-6370:
---
Attachment: YARN-6370-YARN-2915.v1.patch

> Properly handle rack requests for non-active subclusters in 
> LocalityMulticastAMRMProxyPolicy
> 
>
> Key: YARN-6370
> URL: https://issues.apache.org/jira/browse/YARN-6370
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-6370-YARN-2915.v1.patch
>
>
> When splitting a resource request in LocalityMulticastAMRMProxyPolicy, rack 
> requests for non-active subclusters are reassigned to the home subcluster 
> instead. However, we should not record them in the bookkeeper as node 
> requests, since their ask count should not contribute to the weights that 
> determine how ANY requests are distributed. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6370) Properly handle rack requests for non-active subclusters in LocalityMulticastAMRMProxyPolicy

2017-03-21 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-6370:
---
Description: When splitting a resource request in 
LocalityMulticastAMRMProxyPolicy, rack requests for non-active subclusters are 
reassigned to the home subcluster instead. However, we should not record them 
in the bookkeeper as node requests, since their ask count should not 
contribute to the weights that determine how ANY requests are distributed. 
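
To make the intended rule concrete, here is a minimal sketch; every name in it 
is hypothetical, and it only illustrates the split between forwarding and 
bookkeeping, not the policy's real API.
{code}
// Sketch (all names hypothetical): a rack ask whose target subcluster is not
// active is forwarded to the home subcluster, but deliberately NOT counted as
// a localized ask, so it cannot skew how ANY (*) requests are distributed.
void routeRackAsk(String rackAsk, String targetSubcluster,
    Set<String> activeSubclusters, String homeSubcluster,
    Map<String, Integer> localizedAskCounts,
    Map<String, List<String>> outbound) {
  if (activeSubclusters.contains(targetSubcluster)) {
    outbound.get(targetSubcluster).add(rackAsk);
    // Counted: this ask should influence the ANY-request weights.
    localizedAskCounts.merge(targetSubcluster, 1, Integer::sum);
  } else {
    // Forwarded only: no bookkeeping entry for the home subcluster.
    outbound.get(homeSubcluster).add(rackAsk);
  }
}
{code}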

> Properly handle rack requests for non-active subclusters in 
> LocalityMulticastAMRMProxyPolicy
> 
>
> Key: YARN-6370
> URL: https://issues.apache.org/jira/browse/YARN-6370
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
>
> When splitting resource request in LocalityMulticastAMRMProxyPolicy. For rack 
> requests for non-active subclusters, we will assign them to home subcluster 
> instead. However we should not add them to bookkeeper as node request since 
> their ask number should not contribute to the count that determine how to 
> distribute ANY requests. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-6370) Properly handle rack requests for non-active subclusters in LocalityMulticastAMRMProxyPolicy

2017-03-21 Thread Botong Huang (JIRA)
Botong Huang created YARN-6370:
--

 Summary: Properly handle rack requests for non-active subclusters 
in LocalityMulticastAMRMProxyPolicy
 Key: YARN-6370
 URL: https://issues.apache.org/jira/browse/YARN-6370
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Botong Huang
Assignee: Botong Huang
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6284) hasAlreadyRun should be final in ResourceManager.StandByTransitionRunnable

2017-03-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935134#comment-15935134
 ] 

Hadoop QA commented on YARN-6284:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 39m 50s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 61m 39s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6284 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12859770/YARN-6284.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux fe8962493248 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / c01d15a |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/15345/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/15345/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/15345/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> hasAlreadyRun should be final in ResourceManager.StandByTransitionRunnable
> 

[jira] [Commented] (YARN-6284) hasAlreadyRun should be final in ResourceManager.StandByTransitionRunnable

2017-03-21 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935118#comment-15935118
 ] 

Daniel Templeton commented on YARN-6284:


Thanks for the patch, [~laura.adams]. LGTM, +1. I'll get it committed when I 
get a chance.
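
For context, the whole change is a one-word modifier on the field quoted in 
the description below; since the reference is assigned exactly once and only 
the wrapped boolean value changes, it can safely be declared final:
{code}
// The reference never changes; only the value it wraps does.
private final AtomicBoolean hasAlreadyRun = new AtomicBoolean(false);
{code}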

> hasAlreadyRun should be final in ResourceManager.StandByTransitionRunnable
> --
>
> Key: YARN-6284
> URL: https://issues.apache.org/jira/browse/YARN-6284
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0-alpha2
>Reporter: Daniel Templeton
>Assignee: Laura Adams
>  Labels: newbie
> Attachments: YARN-6284.001.patch
>
>
> {code}
> // The atomic variable to make sure multiple threads with the same 
> runnable
> // run only once.
> private AtomicBoolean hasAlreadyRun = new AtomicBoolean(false);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6368) Decommissioning an NM results in a -1 exit code

2017-03-21 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935111#comment-15935111
 ] 

Haibo Chen commented on YARN-6368:
--

Looks like the last build was bad. [~rkanter] Can you kick it off again?

> Decommissioning an NM results in a -1 exit code
> ---
>
> Key: YARN-6368
> URL: https://issues.apache.org/jira/browse/YARN-6368
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
> Attachments: YARN-6368.000.patch, YARN-6368.001.patch
>
>
> In NodeManager.java we should exit normally when the RM shuts down the 
> node:
> {code}
> } finally {
>   if (shouldExitOnShutdownEvent
>   && !ShutdownHookManager.get().isShutdownInProgress()) {
> ExitUtil.terminate(-1);
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6368) Decommissioning an NM results in a -1 exit code

2017-03-21 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935101#comment-15935101
 ] 

Haibo Chen commented on YARN-6368:
--

Thanks [~miklos.szeg...@cloudera.com] for the update! The patch LGTM. +1 
(non-binding)

> Decommissioning an NM results in a -1 exit code
> ---
>
> Key: YARN-6368
> URL: https://issues.apache.org/jira/browse/YARN-6368
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
> Attachments: YARN-6368.000.patch, YARN-6368.001.patch
>
>
> In NodeManager.java we should exit normally when the RM shuts down the 
> node:
> {code}
> } finally {
>   if (shouldExitOnShutdownEvent
>   && !ShutdownHookManager.get().isShutdownInProgress()) {
> ExitUtil.terminate(-1);
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5600) Allow setting yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2017-03-21 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935084#comment-15935084
 ] 

Miklos Szegedi commented on YARN-5600:
--

[~vvasudev], please review the latest patch. Do you have any other comments on 
this JIRA?

> Allow setting yarn.nodemanager.delete.debug-delay-sec on a per-application 
> basis
> 
>
> Key: YARN-5600
> URL: https://issues.apache.org/jira/browse/YARN-5600
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
>  Labels: oct16-medium
> Attachments: YARN-5600.000.patch, YARN-5600.001.patch, 
> YARN-5600.002.patch, YARN-5600.003.patch, YARN-5600.004.patch, 
> YARN-5600.005.patch, YARN-5600.006.patch, YARN-5600.007.patch, 
> YARN-5600.008.patch, YARN-5600.009.patch, YARN-5600.010.patch, 
> YARN-5600.011.patch, YARN-5600.012.patch, YARN-5600.013.patch, 
> YARN-5600.014.patch, YARN-5600.015.patch, YARN-5600.016.patch, 
> YARN-5600.017.patch
>
>
> To make debugging application launch failures simpler, I'd like to add a 
> parameter to the CLC to allow an application owner to request delayed 
> deletion of the application's launch artifacts.
> This JIRA solves largely the same problem as YARN-5599, but for cases where 
> ATS is not in use, e.g. branch-2.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5368) memory leak at timeline server

2017-03-21 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935074#comment-15935074
 ] 

Jonathan Eagles commented on YARN-5368:
---

[~Naganarasimha], I have posted a patch that addresses the missing 
iterator.close() leak. Let me know if you are okay with the try-with-resources 
approach to take care of this case.
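
For anyone skimming the thread, the fix follows the standard try-with-resources 
pattern around the store's leveldb iterator. The sketch below is illustrative 
rather than the patch itself: it assumes a Closeable iterator such as Hadoop's 
{{LeveldbIterator}}, and {{db}} / {{prefix}} stand in for the store's actual 
fields.
{code}
// A leveldb iterator holds a native handle, so it must be closed on every
// code path. try-with-resources guarantees close() even when the scan throws,
// which is exactly the leak being addressed here.
try (LeveldbIterator iterator = new LeveldbIterator(db)) {
  iterator.seek(prefix);
  while (iterator.hasNext()) {
    Map.Entry<byte[], byte[]> entry = iterator.next();
    // ... decode and collect timeline entities (illustrative) ...
  }
} // iterator.close() runs here, releasing the native handle
{code}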

> memory leak at timeline server
> --
>
> Key: YARN-5368
> URL: https://issues.apache.org/jira/browse/YARN-5368
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.7.1
> Environment: HDP2.4
> CentOS 6.7
> jdk1.8.0_72
>Reporter: Wataru Yukawa
>Assignee: Jonathan Eagles
> Attachments: YARN-5368.1.patch
>
>
> Memory usage of the timeline server machine increases gradually.
> https://gyazo.com/952dad96c77ae053bae2e4d8c8ab0572
> Please check the graph since April.
> According to my investigation, the timeline server used about 25GB.
> top command result
> {code}
> 90577 yarn  20   0 28.4g  25g  12m S  0.0 40.1   5162:53 
> /usr/java/jdk1.8.0_72/bin/java -Dproc_timelineserver -Xmx1024m 
> -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn ...
> {code}
> ps command result
> {code}
> $ ps ww 90577
>  90577 ?Sl   5162:53 /usr/java/jdk1.8.0_72/bin/java 
> -Dproc_timelineserver -Xmx1024m -Dhdp.version=2.4.0.0-169 
> -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn 
> -Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.log.file=yarn-yarn-timelineserver-myhost.log -Dyarn.home.dir= 
> -Dyarn.id.str=yarn -Dhadoop.root.logger=INFO,EWMA,RFA 
> -Dyarn.root.logger=INFO,EWMA,RFA 
> -Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
>  -Dyarn.policy.file=hadoop-policy.xml 
> -Djava.io.tmpdir=/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir 
> -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn 
> -Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.home.dir=/usr/hdp/current/hadoop-yarn-timelineserver 
> -Dhadoop.home.dir=/usr/hdp/2.4.0.0-169/hadoop 
> -Dhadoop.root.logger=INFO,EWMA,RFA -Dyarn.root.logger=INFO,EWMA,RFA 
> -Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
>  -classpath 
> /usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/lib/*:/usr/hdp/2.4.0.0-169/hadoop/.//*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/./:/usr/hdp/2.4.0.0-169/hadoop-hdfs/lib/*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/.//*:/usr/hdp/2.4.0.0-169/hadoop-yarn/lib/*:/usr/hdp/2.4.0.0-169/hadoop-yarn/.//*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/lib/*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/.//*::/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/current/hadoop-yarn-timelineserver/.//*:/usr/hdp/current/hadoop-yarn-timelineserver/lib/*:/usr/hdp/2.4.0.0-169/hadoop/conf/timelineserver-config/log4j.properties
>  
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
> {code}
>  
> Although I set -Xmx1024m, the actual memory usage is 25GB.
> After I restart the timeline server, the machine's memory usage decreases.
> https://gyazo.com/130600c17a7d41df8606727a859ae7e3
> Now the timeline server uses less than 1GB of memory.
> top command result
> {code}
>  6163 yarn  20   0 3959m 783m  46m S  0.3  1.2   3:37.60 
> /usr/java/jdk1.8.0_72/bin/java -Dproc_timelineserver -Xmx1024m 
> -Dhdp.version=2.4.0.0-169 ...
> {code}
> I suspect a memory leak in the timeline server.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-6329) Remove unnecessary TODO comment from AppLogAggregatorImpl.java

2017-03-21 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton reassigned YARN-6329:
--

Assignee: Daniel Templeton

> Remove unnecessary TODO comment from AppLogAggregatorImpl.java
> --
>
> Key: YARN-6329
> URL: https://issues.apache.org/jira/browse/YARN-6329
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.8.0
>Reporter: Akira Ajisaka
>Assignee: Daniel Templeton
>Priority: Minor
>  Labels: newbie
>
> After YARN-3116, this TODO comment is unnecessary.
> {code}
>   // TODO: The condition: containerId.getId() == 1 to determine an AM 
> container
>   // is not always true.
>   private boolean shouldUploadLogs(ContainerLogContext logContext) {
> return logAggPolicy.shouldDoLogAggregation(logContext);
>   }
> {code}
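
Since YARN-3116 moved the decision into the log aggregation policy, the change 
here is purely deleting the stale comment; the cleaned-up method (taken from 
the snippet above, minus the TODO) would read:
{code}
private boolean shouldUploadLogs(ContainerLogContext logContext) {
  return logAggPolicy.shouldDoLogAggregation(logContext);
}
{code}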



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5368) memory leak at timeline server

2017-03-21 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935019#comment-15935019
 ] 

Jonathan Eagles commented on YARN-5368:
---

Unit test failure in TestTimelineWebServices is covered by YARN-5934. 
Checkstyle method length was pre-existing.

> memory leak at timeline server
> --
>
> Key: YARN-5368
> URL: https://issues.apache.org/jira/browse/YARN-5368
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.7.1
> Environment: HDP2.4
> CentOS 6.7
> jdk1.8.0_72
>Reporter: Wataru Yukawa
>Assignee: Jonathan Eagles
> Attachments: YARN-5368.1.patch
>
>
> Memory usage of the timeline server machine increases gradually.
> https://gyazo.com/952dad96c77ae053bae2e4d8c8ab0572
> Please check the graph since April.
> According to my investigation, the timeline server used about 25GB.
> top command result
> {code}
> 90577 yarn  20   0 28.4g  25g  12m S  0.0 40.1   5162:53 
> /usr/java/jdk1.8.0_72/bin/java -Dproc_timelineserver -Xmx1024m 
> -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn ...
> {code}
> ps command result
> {code}
> $ ps ww 90577
>  90577 ?Sl   5162:53 /usr/java/jdk1.8.0_72/bin/java 
> -Dproc_timelineserver -Xmx1024m -Dhdp.version=2.4.0.0-169 
> -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn 
> -Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.log.file=yarn-yarn-timelineserver-myhost.log -Dyarn.home.dir= 
> -Dyarn.id.str=yarn -Dhadoop.root.logger=INFO,EWMA,RFA 
> -Dyarn.root.logger=INFO,EWMA,RFA 
> -Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
>  -Dyarn.policy.file=hadoop-policy.xml 
> -Djava.io.tmpdir=/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir 
> -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn 
> -Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.home.dir=/usr/hdp/current/hadoop-yarn-timelineserver 
> -Dhadoop.home.dir=/usr/hdp/2.4.0.0-169/hadoop 
> -Dhadoop.root.logger=INFO,EWMA,RFA -Dyarn.root.logger=INFO,EWMA,RFA 
> -Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
>  -classpath 
> /usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/lib/*:/usr/hdp/2.4.0.0-169/hadoop/.//*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/./:/usr/hdp/2.4.0.0-169/hadoop-hdfs/lib/*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/.//*:/usr/hdp/2.4.0.0-169/hadoop-yarn/lib/*:/usr/hdp/2.4.0.0-169/hadoop-yarn/.//*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/lib/*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/.//*::/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/current/hadoop-yarn-timelineserver/.//*:/usr/hdp/current/hadoop-yarn-timelineserver/lib/*:/usr/hdp/2.4.0.0-169/hadoop/conf/timelineserver-config/log4j.properties
>  
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
> {code}
>  
> Although I set -Xmx1024m, the actual memory usage is 25GB.
> After I restart the timeline server, the machine's memory usage decreases.
> https://gyazo.com/130600c17a7d41df8606727a859ae7e3
> Now the timeline server uses less than 1GB of memory.
> top command result
> {code}
>  6163 yarn  20   0 3959m 783m  46m S  0.3  1.2   3:37.60 
> /usr/java/jdk1.8.0_72/bin/java -Dproc_timelineserver -Xmx1024m 
> -Dhdp.version=2.4.0.0-169 ...
> {code}
> I suspect a memory leak in the timeline server.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6367) YARN logs CLI needs alway check containerLogsInfo/containerLogInfo before parse the JSON object from NMWebService

2017-03-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15935009#comment-15935009
 ] 

Hudson commented on YARN-6367:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #11437 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11437/])
YARN-6367. YARN logs CLI needs alway check (junping_du: rev 
c01d15ab2731b6710c94ff3bfa37d496a87b0c9f)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/LogsCLI.java


> YARN logs CLI needs alway check containerLogsInfo/containerLogInfo before 
> parse the JSON object from NMWebService
> -
>
> Key: YARN-6367
> URL: https://issues.apache.org/jira/browse/YARN-6367
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Xuan Gong
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: YARN-6367.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6284) hasAlreadyRun should be final in ResourceManager.StandByTransitionRunnable

2017-03-21 Thread Laura Adams (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laura Adams updated YARN-6284:
--
Attachment: YARN-6284.001.patch

> hasAlreadyRun should be final in ResourceManager.StandByTransitionRunnable
> --
>
> Key: YARN-6284
> URL: https://issues.apache.org/jira/browse/YARN-6284
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.0.0-alpha2
>Reporter: Daniel Templeton
>Assignee: Laura Adams
>  Labels: newbie
> Attachments: YARN-6284.001.patch
>
>
> {code}
> // The atomic variable to make sure multiple threads with the same 
> runnable
> // run only once.
> private AtomicBoolean hasAlreadyRun = new AtomicBoolean(false);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6368) Decommissioning an NM results in a -1 exit code

2017-03-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934989#comment-15934989
 ] 

Hadoop QA commented on YARN-6368:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
21s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
25s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 25s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 15s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 1 new + 28 unchanged - 0 fixed = 29 total (was 28) {color} 
|
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
12s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 25s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 19m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-6368 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12859765/YARN-6368.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux dbfd0ad35213 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 2841666 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/15344/artifact/patchprocess/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-YARN-Build/15344/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| javac | 
https://builds.apache.org/job/PreCommit-YARN-Build/15344/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| checkstyle | 

[jira] [Updated] (YARN-6367) YARN logs CLI needs alway check containerLogsInfo/containerLogInfo before parse the JSON object from NMWebService

2017-03-21 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-6367:
-
Fix Version/s: 3.0.0-alpha3

> YARN logs CLI needs alway check containerLogsInfo/containerLogInfo before 
> parse the JSON object from NMWebService
> -
>
> Key: YARN-6367
> URL: https://issues.apache.org/jira/browse/YARN-6367
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Xuan Gong
> Fix For: 2.9.0, 3.0.0-alpha3
>
> Attachments: YARN-6367.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5179) Issue of CPU usage of containers

2017-03-21 Thread Manikandan R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikandan R updated YARN-5179:
---
Attachment: YARN-5179.xls

> Issue of CPU usage of containers
> 
>
> Key: YARN-5179
> URL: https://issues.apache.org/jira/browse/YARN-5179
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
> Environment: Both on Windows and Linux
>Reporter: Zhongkai Mi
> Attachments: YARN-5179.xls
>
>
> // Multiply by 1000 to avoid losing data when converting to int 
>    int milliVcoresUsed = (int) (cpuUsageTotalCoresPercentage * 1000 
>        * maxVCoresAllottedForContainers / nodeCpuPercentageForYARN); 
> This formula will not yield the correct vcore-based CPU usage if vcores != 
> physical cores. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5179) Issue of CPU usage of containers

2017-03-21 Thread Manikandan R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934970#comment-15934970
 ] 

Manikandan R commented on YARN-5179:


Thanks [~miklos.szeg...@cloudera.com] for your comments.

After my previous comment and proposals, I had an offline discussion with 
[~naganarasimha...@apache.org] and was able to correct my understanding of 
vcores in general, especially by playing around with the Default Container 
Executor and the Linux container executor (with cgroups as well). 

Then I ran a test with the stress command (stress --cpu 4) to confirm my 
understanding of vcores under the various possible options, and compiled the 
results in the attached Excel sheet. My box has 2 physical cores and 
resourceCalculatorPlugin.getNumProcessors() is 4. The options are: 

Using Linux container executor - 

1. Physical cpu limit is 100, Vcores is 2, Strict resource usage is false. 
cpuUsagePercentPerCore is 320 & Millivcores is 1600
2. Physical cpu limit is 100, Vcores is 4, Strict resource usage is false. 
cpuUsagePercentPerCore is 360 & Millivcores is 3600
3. Physical cpu limit is 50, Vcores is 2, Strict resource usage is false. 
cpuUsagePercentPerCore is 99 & Millivcores is 990
4. Physical cpu limit is 50, Vcores is 4, Strict resource usage is false. 
cpuUsagePercentPerCore is 98 & Millivcores is 1960

5. Physical cpu limit is 100, Vcores is 2, Strict resource usage is true. 
cpuUsagePercentPerCore is 100 & Millivcores is 500
6. Physical cpu limit is 100, Vcores is 4, Strict resource usage is true. 
cpuUsagePercentPerCore is 50  & Millivcores is 500
7. Physical cpu limit is 50, Vcores is 2, Strict resource usage is true. 
cpuUsagePercentPerCore is 50 & Millivcores is 500
8. Physical cpu limit is 50, Vcores is 4, Strict resource usage is true. 
cpuUsagePercentPerCore is 25 &  Millivcores is 500

Using the Default Container Executor - 

Physical cpu limit is 100, Vcores is 200. cpuUsagePercentPerCore is 380 & 
Millivcores is 19
Physical cpu limit is 50, Vcores is 200. cpuUsagePercentPerCore is 375 & 
Millivcores is 375000

I verified the millivcores output for each option above, and its correctness, 
with [~naganarasimha...@apache.org]. Initially we doubted our own 
understanding only for the case where the physical cpu limit is 50, thinking 
the result should be half of the current millivcores; we were then able to 
reason about it this way: for example, in option 3, maximum utilization can go 
up to 200 because the effective processor budget is 2 (50% of the 4 reported 
by resourceCalculatorPlugin.getNumProcessors()), but we measured a utilization 
of 99. That means half of the vcores should have been utilized, hence 
millivcores should be about 1000. Based on these test results and our 
understanding, we don't see any issue with the current calculation (as 
[~miklos.szeg...@cloudera.com] said in his comments).

Please correct me if you see any gap.
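
To make the arithmetic easy to check, here is option 3 worked through against 
the formula quoted in the description, under one assumption (which fits all 
eight measurements above): cpuUsageTotalCoresPercentage is the per-core 
percentage divided by getNumProcessors().
{code}
// Option 3: physical cpu limit 50%, vcores 2, measured per-core usage 99%.
float cpuUsageTotalCoresPercentage = 99f / 4;   // 24.75 with 4 processors
int milliVcoresUsed =
    (int) (cpuUsageTotalCoresPercentage * 1000 * 2 / 50);   // = 990
// 990 of the 2000 available millivcores, i.e. about half the vcores were
// utilized -- matching the reasoning above. Option 1 checks out the same
// way: (320f / 4) * 1000 * 2 / 100 = 1600.
{code}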

> Issue of CPU usage of containers
> 
>
> Key: YARN-5179
> URL: https://issues.apache.org/jira/browse/YARN-5179
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.0
> Environment: Both on Windows and Linux
>Reporter: Zhongkai Mi
> Attachments: YARN-5179.xls
>
>
> // Multiply by 1000 to avoid losing data when converting to int 
>    int milliVcoresUsed = (int) (cpuUsageTotalCoresPercentage * 1000 
>        * maxVCoresAllottedForContainers / nodeCpuPercentageForYARN); 
> This formula will not yield the correct vcore-based CPU usage if vcores != 
> physical cores. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6368) Decommissioning an NM results in a -1 exit code

2017-03-21 Thread Miklos Szegedi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-6368:
-
Attachment: YARN-6368.001.patch

Thank you, [~haibochen], for the review. I attached an updated patch.

> Decommissioning an NM results in a -1 exit code
> ---
>
> Key: YARN-6368
> URL: https://issues.apache.org/jira/browse/YARN-6368
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Minor
> Attachments: YARN-6368.000.patch, YARN-6368.001.patch
>
>
> In NodeManager.java we should exit normally when the RM shuts down the 
> node:
> {code}
> } finally {
>   if (shouldExitOnShutdownEvent
>   && !ShutdownHookManager.get().isShutdownInProgress()) {
> ExitUtil.terminate(-1);
>   }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6359) TestRM#testApplicationKillAtAcceptedState fails rarely due to race condition

2017-03-21 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934938#comment-15934938
 ] 

Robert Kanter commented on YARN-6359:
-

Test failure unrelated.

> TestRM#testApplicationKillAtAcceptedState fails rarely due to race condition
> 
>
> Key: YARN-6359
> URL: https://issues.apache.org/jira/browse/YARN-6359
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.9.0, 3.0.0-alpha3
>Reporter: Robert Kanter
>Assignee: Robert Kanter
> Attachments: YARN-6359.001.patch, YARN-6359.002.patch
>
>
> We've seen (very rarely) a test failure in 
> {{TestRM#testApplicationKillAtAcceptedState}}
> {noformat}
> java.lang.AssertionError: expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRM.testApplicationKillAtAcceptedState(TestRM.java:645)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5368) memory leak at timeline server

2017-03-21 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934888#comment-15934888
 ] 

Hadoop QA commented on YARN-5368:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 11s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice:
 The patch generated 1 new + 5 unchanged - 1 fixed = 6 total (was 6) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m 46s{color} 
| {color:red} hadoop-yarn-server-applicationhistoryservice in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 21m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.timeline.webapp.TestTimelineWebServices |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5368 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12859755/YARN-5368.1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux e0e79f287c6a 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 2841666 |
| Default Java | 1.8.0_121 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/15343/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/15343/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/15343/testReport/ |
| modules | C: 

[jira] [Updated] (YARN-5368) memory leak at timeline server

2017-03-21 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-5368:
--
Attachment: YARN-5368.1.patch

> memory leak at timeline server
> --
>
> Key: YARN-5368
> URL: https://issues.apache.org/jira/browse/YARN-5368
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.7.1
> Environment: HDP2.4
> CentOS 6.7
> jdk1.8.0_72
>Reporter: Wataru Yukawa
>Assignee: Jonathan Eagles
> Attachments: YARN-5368.1.patch
>
>
> Memory usage of the timeline server machine increases gradually.
> https://gyazo.com/952dad96c77ae053bae2e4d8c8ab0572
> Please check the graph since April.
> According to my investigation, the timeline server used about 25GB.
> top command result
> {code}
> 90577 yarn  20   0 28.4g  25g  12m S  0.0 40.1   5162:53 
> /usr/java/jdk1.8.0_72/bin/java -Dproc_timelineserver -Xmx1024m 
> -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn ...
> {code}
> ps command result
> {code}
> $ ps ww 90577
>  90577 ?Sl   5162:53 /usr/java/jdk1.8.0_72/bin/java 
> -Dproc_timelineserver -Xmx1024m -Dhdp.version=2.4.0.0-169 
> -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn 
> -Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.log.file=yarn-yarn-timelineserver-myhost.log -Dyarn.home.dir= 
> -Dyarn.id.str=yarn -Dhadoop.root.logger=INFO,EWMA,RFA 
> -Dyarn.root.logger=INFO,EWMA,RFA 
> -Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
>  -Dyarn.policy.file=hadoop-policy.xml 
> -Djava.io.tmpdir=/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir 
> -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn 
> -Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.home.dir=/usr/hdp/current/hadoop-yarn-timelineserver 
> -Dhadoop.home.dir=/usr/hdp/2.4.0.0-169/hadoop 
> -Dhadoop.root.logger=INFO,EWMA,RFA -Dyarn.root.logger=INFO,EWMA,RFA 
> -Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
>  -classpath 
> /usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/lib/*:/usr/hdp/2.4.0.0-169/hadoop/.//*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/./:/usr/hdp/2.4.0.0-169/hadoop-hdfs/lib/*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/.//*:/usr/hdp/2.4.0.0-169/hadoop-yarn/lib/*:/usr/hdp/2.4.0.0-169/hadoop-yarn/.//*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/lib/*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/.//*::/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/current/hadoop-yarn-timelineserver/.//*:/usr/hdp/current/hadoop-yarn-timelineserver/lib/*:/usr/hdp/2.4.0.0-169/hadoop/conf/timelineserver-config/log4j.properties
>  
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
> {code}
>  
> Although I set -Xmx1024m, the actual memory usage is 25GB.
> After I restart the timeline server, the machine's memory usage decreases.
> https://gyazo.com/130600c17a7d41df8606727a859ae7e3
> Now the timeline server uses less than 1GB of memory.
> top command result
> {code}
>  6163 yarn  20   0 3959m 783m  46m S  0.3  1.2   3:37.60 
> /usr/java/jdk1.8.0_72/bin/java -Dproc_timelineserver -Xmx1024m 
> -Dhdp.version=2.4.0.0-169 ...
> {code}
> I suspect a memory leak in the timeline server.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5368) memory leak at timeline server

2017-03-21 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles reassigned YARN-5368:
-

Assignee: Jonathan Eagles

> memory leak at timeline server
> --
>
> Key: YARN-5368
> URL: https://issues.apache.org/jira/browse/YARN-5368
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: timelineserver
>Affects Versions: 2.7.1
> Environment: HDP2.4
> CentOS 6.7
> jdk1.8.0_72
>Reporter: Wataru Yukawa
>Assignee: Jonathan Eagles
>
> Memory usage of the timeline server machine increases gradually.
> https://gyazo.com/952dad96c77ae053bae2e4d8c8ab0572
> Please check the graph since April.
> According to my investigation, the timeline server used about 25GB.
> top command result
> {code}
> 90577 yarn  20   0 28.4g  25g  12m S  0.0 40.1   5162:53 
> /usr/java/jdk1.8.0_72/bin/java -Dproc_timelineserver -Xmx1024m 
> -Dhdp.version=2.4.0.0-169 -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn ...
> {code}
> ps command result
> {code}
> $ ps ww 90577
>  90577 ?Sl   5162:53 /usr/java/jdk1.8.0_72/bin/java 
> -Dproc_timelineserver -Xmx1024m -Dhdp.version=2.4.0.0-169 
> -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn 
> -Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.log.file=yarn-yarn-timelineserver-myhost.log -Dyarn.home.dir= 
> -Dyarn.id.str=yarn -Dhadoop.root.logger=INFO,EWMA,RFA 
> -Dyarn.root.logger=INFO,EWMA,RFA 
> -Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
>  -Dyarn.policy.file=hadoop-policy.xml 
> -Djava.io.tmpdir=/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir 
> -Dhadoop.log.dir=/var/log/hadoop-yarn/yarn 
> -Dyarn.log.dir=/var/log/hadoop-yarn/yarn 
> -Dhadoop.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.log.file=yarn-yarn-timelineserver-myhost.log 
> -Dyarn.home.dir=/usr/hdp/current/hadoop-yarn-timelineserver 
> -Dhadoop.home.dir=/usr/hdp/2.4.0.0-169/hadoop 
> -Dhadoop.root.logger=INFO,EWMA,RFA -Dyarn.root.logger=INFO,EWMA,RFA 
> -Djava.library.path=:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir:/usr/hdp/2.4.0.0-169/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.0.0-169/hadoop/lib/native:/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir
>  -classpath 
> /usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/conf:/usr/hdp/2.4.0.0-169/hadoop/lib/*:/usr/hdp/2.4.0.0-169/hadoop/.//*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/./:/usr/hdp/2.4.0.0-169/hadoop-hdfs/lib/*:/usr/hdp/2.4.0.0-169/hadoop-hdfs/.//*:/usr/hdp/2.4.0.0-169/hadoop-yarn/lib/*:/usr/hdp/2.4.0.0-169/hadoop-yarn/.//*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/lib/*:/usr/hdp/2.4.0.0-169/hadoop-mapreduce/.//*::/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/2.4.0.0-169/tez/*:/usr/hdp/2.4.0.0-169/tez/lib/*:/usr/hdp/2.4.0.0-169/tez/conf:/usr/hdp/current/hadoop-yarn-timelineserver/.//*:/usr/hdp/current/hadoop-yarn-timelineserver/lib/*:/usr/hdp/2.4.0.0-169/hadoop/conf/timelineserver-config/log4j.properties
>  
> org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
> {code}
>  
> Although I set -Xmx1024m, the actual memory usage is 25GB.
> After I restart the timeline server, the machine's memory usage decreases.
> https://gyazo.com/130600c17a7d41df8606727a859ae7e3
> Now the timeline server uses less than 1GB of memory.
> top command result
> {code}
>  6163 yarn  20   0 3959m 783m  46m S  0.3  1.2   3:37.60 
> /usr/java/jdk1.8.0_72/bin/java -Dproc_timelineserver -Xmx1024m 
> -Dhdp.version=2.4.0.0-169 ...
> {code}
> I suspect a memory leak in the timeline server.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org