[jira] [Commented] (YARN-8351) RM is flooded with node attributes manager logs

2018-05-23 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488448#comment-16488448
 ] 

genericqa commented on YARN-8351:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} YARN-3409 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 
26s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} YARN-3409 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} YARN-3409 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 43s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 68m 
44s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}126m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8351 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12924872/YARN-8351-YARN-3409.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 1ce6bd5d826b 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | YARN-3409 / 4cf0d40 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20851/testReport/ |
| Max. process+thread count | 827 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20851/console |
| 

[jira] [Commented] (YARN-8346) Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"

2018-05-23 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488439#comment-16488439
 ] 

Rohith Sharma K S commented on YARN-8346:
-

Thanks [~jlowe] for the quick turnaround. I verified the patch in a cluster and 
it is working as expected.

I am +1 for the patch.

> Upgrading to 3.1 kills running containers with error "Opportunistic container 
> queue is full"
> 
>
> Key: YARN-8346
> URL: https://issues.apache.org/jira/browse/YARN-8346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.0, 3.0.2
>Reporter: Rohith Sharma K S
>Assignee: Jason Lowe
>Priority: Blocker
> Attachments: YARN-8346.001.patch
>
>
> It is observed that while rolling upgrade from the 2.8.4 to the 3.1 release, 
> all the running containers are killed and a second attempt is launched for 
> that application. The diagnostics message is "Opportunistic container queue is 
> full", which is given as the reason the containers were killed. 
> In the NM log, I see the following entries after a container is recovered:
> {noformat}
> 2018-05-23 17:18:50,655 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler:
>  Opportunistic container [container_e06_1527075664705_0001_01_01] will 
> not be queued at the NM since max queue length [0] has been reached
> {noformat}
> The following steps were executed for the rolling upgrade:
> # Install a 2.8.4 cluster and launch an MR job with the distributed cache enabled.
> # Stop the 2.8.4 RM. Start the 3.1.0 RM with the same configuration.
> # Stop the 2.8.4 NMs batch by batch. Start the 3.1.0 NMs batch by batch. 
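For context on the NM log line quoted above: the rejection comes from the NM-side limit on queued opportunistic containers, which is 0 by default. The standalone sketch below illustrates that behavior; the class, field, and method names are hypothetical and this is not the actual ContainerScheduler code.

{code:java}
// Standalone illustration (not the actual ContainerScheduler code) of why a
// max opportunistic queue length of 0 rejects every opportunistic container.
import java.util.ArrayDeque;
import java.util.Deque;

public class OppQueueSketch {
  private final Deque<String> queued = new ArrayDeque<>();
  private final int maxOppQueueLength; // hypothetical field; the NM default is 0

  OppQueueSketch(int maxOppQueueLength) {
    this.maxOppQueueLength = maxOppQueueLength;
  }

  boolean tryQueue(String containerId) {
    if (queued.size() >= maxOppQueueLength) {
      System.out.println("Opportunistic container [" + containerId
          + "] will not be queued at the NM since max queue length ["
          + maxOppQueueLength + "] has been reached");
      return false; // the container is subsequently killed
    }
    queued.add(containerId);
    return true;
  }

  public static void main(String[] args) {
    // With a limit of 0, even the first container is rejected, which matches
    // the log line seen for the recovered container in the description.
    new OppQueueSketch(0).tryQueue("container_e06_1527075664705_0001_01_01");
  }
}
{code}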






[jira] [Commented] (YARN-8351) RM is flooded with node attributes manager logs

2018-05-23 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488371#comment-16488371
 ] 

Weiwei Yang commented on YARN-8351:
---

Thanks [~sunilg] for the quick response!

> RM is flooded with node attributes manager logs
> ---
>
> Key: YARN-8351
> URL: https://issues.apache.org/jira/browse/YARN-8351
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8351-YARN-3409.001.patch, YARN-8351.001.patch
>
>
> When distributed node attributes are enabled, the RM updates these attributes 
> on each NM heartbeat interval, and each time it logs a line like
> {noformat}
> REPLACE attributes on nodes: NM="xxx", attributes=""
> {noformat}
> This should be logged at DEBUG level.






[jira] [Commented] (YARN-8351) RM is flooded with node attributes manager logs

2018-05-23 Thread Sunil Govindan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488368#comment-16488368
 ] 

Sunil Govindan commented on YARN-8351:
--

Patch looks straightforward. Committing shortly, pending Jenkins.

> RM is flooded with node attributes manager logs
> ---
>
> Key: YARN-8351
> URL: https://issues.apache.org/jira/browse/YARN-8351
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8351-YARN-3409.001.patch, YARN-8351.001.patch
>
>
> When distributed node attributes are enabled, the RM updates these attributes 
> on each NM heartbeat interval, and each time it logs a line like
> {noformat}
> REPLACE attributes on nodes: NM="xxx", attributes=""
> {noformat}
> This should be logged at DEBUG level.






[jira] [Updated] (YARN-8352) AM should retry on a different node after the previous application attempt fail

2018-05-23 Thread Zhizhen Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhizhen Hou updated YARN-8352:
--
Description: I submitted a job to YARN, and both times the AM was allocated on 
the same node.  After the first allocate call to the scheduler, subsequent calls 
should include the blacklisted node list, but currently the blacklisted node 
list is always null.  (was: I submit a job to the yarn, and both two times the 
AM is allocated on the same node.  After the first allocate call to scheduler, 
the follows call should include the black node list, but now the black node 
list is a constant null.)

> AM should retry on a different node after the previous application attempt 
> fail
> ---
>
> Key: YARN-8352
> URL: https://issues.apache.org/jira/browse/YARN-8352
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 2.7.5
>Reporter: Zhizhen Hou
>Priority: Major
>
> I submitted a job to YARN, and both times the AM was allocated on the same 
> node.  After the first allocate call to the scheduler, subsequent calls 
> should include the blacklisted node list, but currently the blacklisted node 
> list is always null.






[jira] [Updated] (YARN-8351) RM is flooded with node attributes manager logs

2018-05-23 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8351:
--
Attachment: YARN-8351-YARN-3409.001.patch

> RM is flooded with node attributes manager logs
> ---
>
> Key: YARN-8351
> URL: https://issues.apache.org/jira/browse/YARN-8351
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8351-YARN-3409.001.patch, YARN-8351.001.patch
>
>
> When distributed node attributes are enabled, the RM updates these attributes 
> on each NM heartbeat interval, and each time it logs a line like
> {noformat}
> REPLACE attributes on nodes: NM="xxx", attributes=""
> {noformat}
> This should be logged at DEBUG level.






[jira] [Commented] (YARN-8351) RM is flooded with node attributes manager logs

2018-05-23 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488362#comment-16488362
 ] 

genericqa commented on YARN-8351:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} YARN-8351 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-8351 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12924871/YARN-8351.001.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20850/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> RM is flooded with node attributes manager logs
> ---
>
> Key: YARN-8351
> URL: https://issues.apache.org/jira/browse/YARN-8351
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8351.001.patch
>
>
> When distributed node attributes are enabled, the RM updates these attributes 
> on each NM heartbeat interval, and each time it logs a line like
> {noformat}
> REPLACE attributes on nodes: NM="xxx", attributes=""
> {noformat}
> This should be logged at DEBUG level.






[jira] [Updated] (YARN-8351) RM is flooded with node attributes manager logs

2018-05-23 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated YARN-8351:
--
Attachment: YARN-8351.001.patch

> RM is flooded with node attributes manager logs
> ---
>
> Key: YARN-8351
> URL: https://issues.apache.org/jira/browse/YARN-8351
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Major
> Attachments: YARN-8351.001.patch
>
>
> When distributed node attributes are enabled, the RM updates these attributes 
> on each NM heartbeat interval, and each time it logs a line like
> {noformat}
> REPLACE attributes on nodes: NM="xxx", attributes=""
> {noformat}
> This should be logged at DEBUG level.






[jira] [Created] (YARN-8352) AM should retry on a different node after the previous application attempt fail

2018-05-23 Thread Zhizhen Hou (JIRA)
Zhizhen Hou created YARN-8352:
-

 Summary: AM should retry on a different node after the previous 
application attempt fail
 Key: YARN-8352
 URL: https://issues.apache.org/jira/browse/YARN-8352
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 2.7.5
Reporter: Zhizhen Hou


I submitted a job to YARN, and both times the AM was allocated on the same 
node.  After the first allocate call to the scheduler, subsequent calls should 
include the blacklisted node list, but currently the blacklisted node list is 
always null.
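The JIRA asks the RM to blacklist the node where the previous attempt ran when placing the next AM attempt. For comparison, the sketch below shows how an application can itself add a node to the blacklist carried by its allocate calls via the public AMRMClient API; the hostname is a placeholder, and this is not the RM-side change requested here.

{code:java}
// Sketch using the public AMRMClient API; assumes it runs inside an AM that
// can reach the RM. "badnode.example.com" is a placeholder hostname.
import java.util.Collections;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class BlacklistSketch {
  public static void main(String[] args) {
    AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
    rmClient.init(new YarnConfiguration());
    rmClient.start();

    // Subsequent allocate() calls will carry this blacklist addition, so the
    // scheduler avoids placing new containers on the listed host.
    rmClient.updateBlacklist(
        Collections.singletonList("badnode.example.com"),
        Collections.<String>emptyList());

    rmClient.stop();
  }
}
{code}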






[jira] [Created] (YARN-8351) RM is flooded with node attributes manager logs

2018-05-23 Thread Weiwei Yang (JIRA)
Weiwei Yang created YARN-8351:
-

 Summary: RM is flooded with node attributes manager logs
 Key: YARN-8351
 URL: https://issues.apache.org/jira/browse/YARN-8351
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Weiwei Yang
Assignee: Weiwei Yang


When distributed node attributes are enabled, the RM updates these attributes on 
each NM heartbeat interval, and each time it logs a line like

{noformat}
REPLACE attributes on nodes: NM="xxx", attributes=""
{noformat}

This should be logged at DEBUG level.
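A minimal sketch of the requested change, assuming an SLF4J-style logger; the class and method names are illustrative, not the actual node attributes manager code.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class AttributeUpdateLogSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(AttributeUpdateLogSketch.class);

  void logReplacedAttributes(String nmHost, String attributes) {
    // DEBUG instead of INFO keeps steady-state RM logs quiet; with
    // parameterized logging the message is only built when DEBUG is enabled.
    LOG.debug("REPLACE attributes on nodes: NM={}, attributes={}",
        nmHost, attributes);
  }
}
{code}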






[jira] [Commented] (YARN-5764) NUMA awareness support for launching containers

2018-05-23 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488321#comment-16488321
 ] 

Weiwei Yang commented on YARN-5764:
---

Hi [~devaraj.k], [~miklos.szeg...@cloudera.com], could you update the fix 
version for this JIRA? 

> NUMA awareness support for launching containers
> ---
>
> Key: YARN-5764
> URL: https://issues.apache.org/jira/browse/YARN-5764
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, yarn
>Reporter: Olasoji
>Assignee: Devaraj K
>Priority: Major
> Attachments: NUMA Awareness for YARN Containers.pdf, NUMA Performance 
> Results.pdf, YARN-5764-v0.patch, YARN-5764-v1.patch, YARN-5764-v10.patch, 
> YARN-5764-v11.patch, YARN-5764-v2.patch, YARN-5764-v3.patch, 
> YARN-5764-v4.patch, YARN-5764-v5.patch, YARN-5764-v6.patch, 
> YARN-5764-v7.patch, YARN-5764-v8.patch, YARN-5764-v9.patch
>
>
> The purpose of this feature is to improve Hadoop performance by minimizing 
> costly remote memory accesses on non-SMP systems. YARN containers, on launch, 
> will be pinned to a specific NUMA node and all subsequent memory allocations 
> will be served by the same node, reducing remote memory accesses. The current 
> default behavior is to spread memory across all NUMA nodes.
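A conceptual sketch of the pinning described above: launching a process under {{numactl}} so that both CPU and memory come from a single NUMA node. The node index and the wrapped command are placeholders; this is not the actual NodeManager launch path.

{code:java}
import java.io.IOException;
import java.util.Arrays;

public class NumaPinSketch {
  public static void main(String[] args) throws IOException, InterruptedException {
    int numaNode = 0; // hypothetical node assigned to this container
    // Bind both CPU and memory allocations to the chosen NUMA node.
    Process p = new ProcessBuilder(Arrays.asList(
        "numactl", "--cpunodebind=" + numaNode, "--membind=" + numaNode,
        "sleep", "1"))
        .inheritIO()
        .start();
    System.exit(p.waitFor());
  }
}
{code}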






[jira] [Commented] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services

2018-05-23 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488320#comment-16488320
 ] 

Eric Yang commented on YARN-7530:
-

+1 for branch-3.1 change.

> hadoop-yarn-services-api should be part of hadoop-yarn-services
> ---
>
> Key: YARN-7530
> URL: https://issues.apache.org/jira/browse/YARN-7530
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Chandni Singh
>Priority: Blocker
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-7530-branch-3.1.001.patch, YARN-7530.001.patch, 
> YARN-7530.002.patch
>
>
> Hadoop-yarn-services-api is currently a parallel project to the 
> hadoop-yarn-services project.  It would be better if hadoop-yarn-services-api 
> were part of hadoop-yarn-services for correctness.






[jira] [Commented] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-23 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488291#comment-16488291
 ] 

genericqa commented on YARN-6677:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 43s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
5s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 23s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 17 new + 145 unchanged - 0 fixed = 162 total (was 145) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 36s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 36m 25s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 96m 25s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.containermanager.TestContainerManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-6677 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12924847/YARN-6677.00.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ff06c4489b33 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d996479 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/20848/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| unit | 

[jira] [Commented] (YARN-8333) Load balance YARN services using RegistryDNS multiple A records

2018-05-23 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488283#comment-16488283
 ] 

genericqa commented on YARN-8333:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 23s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 11s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry: The patch generated 7 new 
+ 8 unchanged - 0 fixed = 15 total (was 8) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 46s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
55s{color} | {color:green} hadoop-yarn-registry in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 48m 35s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8333 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12924854/YARN-8333.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux a01231175909 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / d996479 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/20849/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-registry.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20849/testReport/ |
| Max. process+thread count | 442 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-registry |
| Console output | 

[jira] [Updated] (YARN-8350) NPE in service AM related to placement policy

2018-05-23 Thread Billie Rinaldi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billie Rinaldi updated YARN-8350:
-
Description: 
It seems like this NPE is happening in a service with more than one component 
when one component has a placement policy and the other does not. It causes the 
AM to crash.
{noformat}
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.service.component.Component.requestContainers(Component.java:644)
at 
org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:310)
at 
org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:293)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
at 
org.apache.hadoop.yarn.service.component.Component.handle(Component.java:919)
at 
org.apache.hadoop.yarn.service.ServiceScheduler.serviceStart(ServiceScheduler.java:344)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
at 
org.apache.hadoop.yarn.service.ServiceMaster.lambda$serviceStart$0(ServiceMaster.java:253)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
at 
org.apache.hadoop.yarn.service.ServiceMaster.serviceStart(ServiceMaster.java:251)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at 
org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:317)
{noformat}

  was:
It seems like this NPE is happening in a service with more than one component 
when one component has a placement policy and the other does not. It causes the 
AM to crash. See 
https://github.com/hortonworks/hadoop/blob/3c66d40e26bc2d0e17a6e1869201021a8c2f6df1/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/component/Component.java
{noformat}
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.service.component.Component.requestContainers(Component.java:644)
at 
org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:310)
at 
org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:293)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
at 
org.apache.hadoop.yarn.service.component.Component.handle(Component.java:919)
at 
org.apache.hadoop.yarn.service.ServiceScheduler.serviceStart(ServiceScheduler.java:344)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
at 
org.apache.hadoop.yarn.service.ServiceMaster.lambda$serviceStart$0(ServiceMaster.java:253)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
at 
org.apache.hadoop.yarn.service.ServiceMaster.serviceStart(ServiceMaster.java:251)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at 
org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:317)
{noformat}


> NPE in service AM related to placement policy
> -
>
> Key: YARN-8350
> URL: https://issues.apache.org/jira/browse/YARN-8350
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Gour Saha
>Priority: Critical
>
> It seems like this NPE is happening in a service with more than one component 
> when one component has a placement policy and the other does not. It causes 
> the AM to 

[jira] [Commented] (YARN-4677) RMNodeResourceUpdateEvent update from scheduler can lead to race condition

2018-05-23 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488271#comment-16488271
 ] 

Robert Kanter commented on YARN-4677:
-

Thanks [~wilfreds] for the trunk patch and [~gphillips] for the branch-2 patch.

The trunk patch looks fine, but a couple of things on the branch-2 patch:
 # Instead of calling {{getSchedulerNode}} and {{getNode}} again later on in 
{{nodeUpdate}}, we should simply use the {{schedulerNode}} we're now getting 
(see the sketch below).
 # The comment about the TODO can be removed now.
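A generic illustration of suggestion #1, with hypothetical names rather than the actual scheduler code: look the node up once and reuse the result.

{code:java}
public class NodeUpdateSketch {
  interface SchedulerNode { String getNodeName(); }
  interface NodeLookup { SchedulerNode getSchedulerNode(String nodeId); }

  static void nodeUpdate(NodeLookup lookup, String nodeId) {
    // Fetch the SchedulerNode once instead of calling getSchedulerNode()/
    // getNode() again later in the method.
    SchedulerNode schedulerNode = lookup.getSchedulerNode(nodeId);
    if (schedulerNode == null) {
      return; // node already removed, e.g. finished decommissioning
    }
    // ... subsequent logic reuses schedulerNode ...
    System.out.println("Updating " + schedulerNode.getNodeName());
  }
}
{code}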

> RMNodeResourceUpdateEvent update from scheduler can lead to race condition
> --
>
> Key: YARN-4677
> URL: https://issues.apache.org/jira/browse/YARN-4677
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: graceful, resourcemanager, scheduler
>Affects Versions: 2.7.1
>Reporter: Brook Zhou
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-4677-branch-2.001.patch, 
> YARN-4677-branch-2.002.patch, YARN-4677.01.patch
>
>
> When a node is in the decommissioning state, there is a time window between 
> completedContainer() and the RMNodeResourceUpdateEvent being handled in 
> scheduler.nodeUpdate (YARN-3223). 
> So if a scheduling effort happens within this window, a new container could 
> still get allocated on this node. The even worse case is if a scheduling 
> effort happens after the RMNodeResourceUpdateEvent is sent out but before it 
> is propagated to the SchedulerNode - then the total resource is lower than 
> the used resource and the available resource is a negative value. 






[jira] [Commented] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative

2018-05-23 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488266#comment-16488266
 ] 

genericqa commented on YARN-8292:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 45s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
42s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
26s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
37s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 6 new + 97 unchanged - 0 fixed = 103 total (was 97) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
39s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 41s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
17s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 36s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 90m 10s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8292 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12924843/YARN-8292.008.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 21ea8b19a4a8 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-23 Thread Hsin-Liang Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488263#comment-16488263
 ] 

Hsin-Liang Huang commented on YARN-8326:


Here is more detailed information from the node manager logs comparing Hadoop 
3.0 and 2.6.  Both are running on a 4-node cluster with 3 data nodes, with the 
same machine power/CPU/memory and the same type of job.   I picked only one 
node to compare the container lifecycle. 

*1. On 3.0.*  When I request 8 containers to run on the 3 data nodes, I pick 
the second node and examine the log.

This job used 2 containers on this node:

Container *container_e04_1527109836290_0004_01_02* of application 
application_1527109836290_0004 (going from "container succeeded" to "Stopping 
container", i.e. from the blue to the red line, took about *4 seconds*):

 

152231 2018-05-23 15:04:45,541 INFO  containermanager.ContainerManagerImpl 
(ContainerManagerImpl.java:startContainerInternal(1059)) - Start request for 
container_e04_1527109836290_0004_01_02 by user hlhuang

152232 2018-05-23 15:04:45,657 INFO  containermanager.ContainerManagerImpl 
(ContainerManagerImpl.java:startContainerInternal(1127)) - Creating a new 
application reference for app application_1527109836290_0004

152233 2018-05-23 15:04:45,658 INFO  application.ApplicationImpl 
(ApplicationImpl.java:handle(632)) - Application application_1527109836290_0004 
transitioned from NEW to INITING

152234 2018-05-23 15:04:45,658 INFO  application.ApplicationImpl 
(ApplicationImpl.java:transition(446)) - Adding 
container_e04_1527109836290_0004_01_02 to application 
application_1527109836290_0004

152235 2018-05-23 15:04:45,658 INFO  application.ApplicationImpl 
(ApplicationImpl.java:handle(632)) - Application application_1527109836290_0004 
transitioned from INITING to RUNNING

152236 2018-05-23 15:04:45,659 INFO  container.ContainerImpl 
(ContainerImpl.java:handle(2108)) - Container 
container_e04_1527109836290_0004_01_02 transitioned from NEW to SCHEDULED

152237 2018-05-23 15:04:45,659 INFO  containermanager.AuxServices 
(AuxServices.java:handle(220)) - Got event CONTAINER_INIT for appId 
application_1527109836290_0004

152238 2018-05-23 15:04:45,659 INFO  yarn.YarnShuffleService 
(YarnShuffleService.java:initializeContainer(289)) - Initializing container 
container_e04_1527109836290_0004_01_02

152239 2018-05-23 15:04:45,660 INFO  scheduler.ContainerScheduler 
(ContainerScheduler.java:startContainer(503)) - Starting container 
[container_e04_1527109836290_0004_01_02]

152246 2018-05-23 15:04:45,965 INFO  container.ContainerImpl 
(ContainerImpl.java:handle(2108)) - Container 
container_e04_1527109836290_0004_01_02 transitioned from SCHEDULED to 
RUNNING

152247 2018-05-23 15:04:45,965 INFO  monitor.ContainersMonitorImpl 
(ContainersMonitorImpl.java:onStartMonitoringContainer(941)) - Starting 
resource-monitoring for container_e04_1527109836290_0004_01_02

{color:#205081}152250 2018-05-23 15:04:46,002 INFO  launcher.ContainerLaunch 
(ContainerLaunch.java:handleContainerExitCode(512)) - Container 
container_e04_1527109836290_0004_01_02 succeeded{color}

 

152251 2018-05-23 15:04:46,003 INFO  container.ContainerImpl 
(ContainerImpl.java:handle(2108)) - Container 
container_e04_1527109836290_0004_01_02 transitioned from RUNNING to 
EXITED_WITH_SUCCESS

152252 2018-05-23 15:04:46,003 INFO  launcher.ContainerLaunch 
(ContainerLaunch.java:cleanupContainer(668)) - Cleaning up container 
container_e04_1527109836290_0004_01_02

152254 2018-05-23 15:04:48,132 INFO  nodemanager.LinuxContainerExecutor 
(LinuxContainerExecutor.java:deleteAsUser(794)) - Deleting absolute path : 
/hadoop/yarn/local/usercache/hlhuang/appcache/application_1527109836290_0004/container_e04_1527109836290_0004_01_02

152256 2018-05-23 15:04:48,133 INFO  container.ContainerImpl 
(ContainerImpl.java:handle(2108)) - Container 
container_e04_1527109836290_0004_01_02 transitioned from 
EXITED_WITH_SUCCESS to DONE

152258 2018-05-23 15:04:49,171 INFO  nodemanager.NodeStatusUpdaterImpl 
(NodeStatusUpdaterImpl.java:removeOrTrackCompletedContainersFromContext(682)) - 
Removed completed containers from NM context: 
[container_e04_1527109836290_0004_01_02]

152260 2018-05-23 15:04:50,289 INFO  application.ApplicationImpl 
(ApplicationImpl.java:transition(489)) - Removing 
container_e04_1527109836290_0004_01_02 from application 
application_1527109836290_0004

{color:#d04437}152261 2018-05-23 15:04:50,290 INFO  
monitor.ContainersMonitorImpl 
(ContainersMonitorImpl.java:onStopMonitoringContainer(932)) - Stopping 
resource-monitoring for container_e04_1527109836290_0004_01_02{color}

152263 2018-05-23 15:04:50,290 INFO  yarn.YarnShuffleService 
(YarnShuffleService.java:stopContainer(295)) - Stopping container 
container_e04_1527109836290_0004_01_02

152262 2018-05-23 15:04:50,290 INFO  containermanager.AuxServices 

[jira] [Commented] (YARN-8350) NPE in service AM related to placement policy

2018-05-23 Thread Billie Rinaldi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488242#comment-16488242
 ] 

Billie Rinaldi commented on YARN-8350:
--

I also tried adding a placement policy with an empty constraints array to the 
component that previously had no placement policy, and that resulted in a 
different NPE.
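A hypothetical defensive check illustrating the failure mode described above (a component whose placement policy, or its constraints list, is null or empty); the types and names below are simplified stand-ins, not the actual Component.requestContainers() code.

{code:java}
import java.util.Collections;
import java.util.List;

public class PlacementPolicySketch {
  static class PlacementPolicy {
    List<String> constraints; // simplified stand-in for the real constraint type
  }

  static List<String> constraintsOf(PlacementPolicy policy) {
    if (policy == null || policy.constraints == null
        || policy.constraints.isEmpty()) {
      // Fall back to unconstrained requests instead of dereferencing null.
      return Collections.emptyList();
    }
    return policy.constraints;
  }

  public static void main(String[] args) {
    System.out.println(constraintsOf(null));                  // []
    System.out.println(constraintsOf(new PlacementPolicy())); // []
  }
}
{code}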

> NPE in service AM related to placement policy
> -
>
> Key: YARN-8350
> URL: https://issues.apache.org/jira/browse/YARN-8350
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Billie Rinaldi
>Assignee: Gour Saha
>Priority: Critical
>
> It seems like this NPE is happening in a service with more than one component 
> when one component has a placement policy and the other does not. It causes 
> the AM to crash. See 
> https://github.com/hortonworks/hadoop/blob/3c66d40e26bc2d0e17a6e1869201021a8c2f6df1/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/component/Component.java
> {noformat}
> java.lang.NullPointerException
> at 
> org.apache.hadoop.yarn.service.component.Component.requestContainers(Component.java:644)
> at 
> org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:310)
> at 
> org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:293)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
> at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
> at 
> org.apache.hadoop.yarn.service.component.Component.handle(Component.java:919)
> at 
> org.apache.hadoop.yarn.service.ServiceScheduler.serviceStart(ServiceScheduler.java:344)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> at 
> org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
> at 
> org.apache.hadoop.yarn.service.ServiceMaster.lambda$serviceStart$0(ServiceMaster.java:253)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
> at 
> org.apache.hadoop.yarn.service.ServiceMaster.serviceStart(ServiceMaster.java:251)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> at 
> org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:317)
> {noformat}






[jira] [Commented] (YARN-8333) Load balance YARN services using RegistryDNS multiple A records

2018-05-23 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488241#comment-16488241
 ] 

Eric Yang commented on YARN-8333:
-

Patch 001 adds a multi-A record per component.

> Load balance YARN services using RegistryDNS multiple A records
> ---
>
> Key: YARN-8333
> URL: https://issues.apache.org/jira/browse/YARN-8333
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8333.001.patch
>
>
> For scaling stateless containers, it would be great to support DNS round 
> robin for fault tolerance and load balancing.  The current DNS record format 
> for RegistryDNS is 
> [container-instance].[application-name].[username].[domain].  For example:
> {code}
> appcatalog-0.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog-1.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog-2.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog-3.appname.hbase.ycluster. IN A 123.123.123.123
> {code}
> It would be nice to add a multi-A record that contains all IP addresses of the 
> same component in addition to the instance-based records.  For example:
> {code}
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.123
> {code}
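With such a multi-A record in place, a client can load-balance simply by resolving all addresses for the component name and picking one. A small sketch follows; the hostname is the example from this JIRA, not a real deployment.

{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.ThreadLocalRandom;

public class MultiARecordSketch {
  public static void main(String[] args) throws UnknownHostException {
    // Resolves every A record behind the component name, then picks one at
    // random for simple client-side round robin.
    InetAddress[] addrs =
        InetAddress.getAllByName("appcatalog.appname.hbase.ycluster");
    InetAddress pick = addrs[ThreadLocalRandom.current().nextInt(addrs.length)];
    System.out.println("Resolved " + addrs.length + " addresses, using " + pick);
  }
}
{code}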






[jira] [Updated] (YARN-8333) Load balance YARN services using RegistryDNS multiple A records

2018-05-23 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-8333:

Attachment: YARN-8333.001.patch

> Load balance YARN services using RegistryDNS multiple A records
> ---
>
> Key: YARN-8333
> URL: https://issues.apache.org/jira/browse/YARN-8333
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-8333.001.patch
>
>
> For scaling stateless containers, it would be great to support DNS round 
> robin for fault tolerance and load balancing.  The current DNS record format 
> for RegistryDNS is 
> [container-instance].[application-name].[username].[domain].  For example:
> {code}
> appcatalog-0.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog-1.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog-2.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog-3.appname.hbase.ycluster. IN A 123.123.123.123
> {code}
> It would be nice to add a multi-A record that contains all IP addresses of the 
> same component in addition to the instance-based records.  For example:
> {code}
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.123
> {code}






[jira] [Created] (YARN-8350) NPE in service AM related to placement policy

2018-05-23 Thread Billie Rinaldi (JIRA)
Billie Rinaldi created YARN-8350:


 Summary: NPE in service AM related to placement policy
 Key: YARN-8350
 URL: https://issues.apache.org/jira/browse/YARN-8350
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Billie Rinaldi
Assignee: Gour Saha


It seems like this NPE is happening in a service with more than one component 
when one component has a placement policy and the other does not. It causes the 
AM to crash. See 
https://github.com/hortonworks/hadoop/blob/3c66d40e26bc2d0e17a6e1869201021a8c2f6df1/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/component/Component.java
{noformat}
java.lang.NullPointerException
at 
org.apache.hadoop.yarn.service.component.Component.requestContainers(Component.java:644)
at 
org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:310)
at 
org.apache.hadoop.yarn.service.component.Component$FlexComponentTransition.transition(Component.java:293)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at 
org.apache.hadoop.yarn.state.StateMachineFactory.access$500(StateMachineFactory.java:46)
at 
org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:487)
at 
org.apache.hadoop.yarn.service.component.Component.handle(Component.java:919)
at 
org.apache.hadoop.yarn.service.ServiceScheduler.serviceStart(ServiceScheduler.java:344)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
at 
org.apache.hadoop.yarn.service.ServiceMaster.lambda$serviceStart$0(ServiceMaster.java:253)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
at 
org.apache.hadoop.yarn.service.ServiceMaster.serviceStart(ServiceMaster.java:251)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at 
org.apache.hadoop.yarn.service.ServiceMaster.main(ServiceMaster.java:317)
{noformat}






[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488224#comment-16488224
 ] 

Hudson commented on YARN-4599:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14277 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14277/])
YARN-4599. Set OOM control for memory cgroups. (Miklos Szegedi via Haibo Chen) 
(haibochen: rev d9964799544eefcf424fcc178d987525f5356cdf)
* (edit) .gitignore
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TestCGroupElasticMemoryController.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsHandlerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsHandler.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/test/oom_listener_test_main.cc
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/executor/ContainerSignalContext.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/DummyRunnableWithContext.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/impl/oom_listener.c
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/impl/oom_listener_main.c
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TestCGroupsMemoryResourceHandlerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitorResourceChange.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/oom-listener/impl/oom_listener.h
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupsMemoryResourceHandlerImpl.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/CGroupElasticMemoryController.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/TestDefaultOOMHandler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/CMakeLists.txt
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerCGroupsMemory.md
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/DefaultOOMHandler.java


> Set OOM control for memory cgroups
> --
>
> Key: YARN-4599
> URL: https://issues.apache.org/jira/browse/YARN-4599
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Karthik Kambatla
>Assignee: Miklos Szegedi
>Priority: Major
>  Labels: oct16-medium
> Fix For: 3.2.0
>
> Attachments: Elastic Memory Control in YARN.pdf, YARN-4599.000.patch, 
> YARN-4599.001.patch, YARN-4599.002.patch, YARN-4599.003.patch, 
> YARN-4599.004.patch, YARN-4599.005.patch, YARN-4599.006.patch, 
> YARN-4599.007.patch, YARN-4599.008.patch, YARN-4599.009.patch, 
> 

[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-23 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488222#comment-16488222
 ] 

Eric Yang commented on YARN-8342:
-

We have the following options:

1.  Allow an exemption to bind-mount launch-container.sh for untrusted YARN mode, 
and do not drop the launch_command.
2.  Change the name docker.privileged-containers.registries back to 
docker.trusted.registries.  Images outside of the trusted registries are disallowed.
3.  Add an error message to indicate that untrusted YARN mode without a launch 
command is not supported.

Option 1 requires RHEL 7.5+ to be completely immune to the security hole.  Options 2 
and 3 are safe, but it would be hard for users to understand that the problem stems 
from Hadoop implementation limitations.

I am in favor of implementing option 1.  Thoughts?
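
As a rough sketch of the option-2 style check, the snippet below (hypothetical 
helper, class and registry names; not the actual container-executor or Hadoop 
code) shows how an image could be matched against a trusted-registry list; under 
option 2 an image failing such a check would be rejected outright instead of 
having its launch_command dropped:

{code}
// Hypothetical sketch only: the class, method and registry names below are
// illustrative and are not part of the Hadoop code base.
import java.util.Arrays;
import java.util.List;

public class TrustedRegistryCheck {

  private static final List<String> TRUSTED_REGISTRIES =
      Arrays.asList("local", "registry.example.com");

  static boolean isTrusted(String image) {
    // Assume images of the form "registry/repository:tag"; an image without a
    // registry prefix is treated as untrusted in this sketch.
    int slash = image.indexOf('/');
    if (slash < 0) {
      return false;
    }
    return TRUSTED_REGISTRIES.contains(image.substring(0, slash));
  }

  public static void main(String[] args) {
    System.out.println(isTrusted("registry.example.com/centos:7")); // true
    System.out.println(isTrusted("random.example.org/app:latest")); // false
  }
}
{code}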

> Using docker image from a non-privileged registry, the launch_command is not 
> honored
> 
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Priority: Critical
>  Labels: Docker
>
> During testing of the Docker feature, I found that if a container comes from a 
> non-privileged docker registry, the specified launch command will be ignored. 
> The container will succeed without any log, which is very confusing to end users. 
> This behavior is also inconsistent with containers from privileged docker 
> registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-23 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6677:
-
Summary: Preempt opportunistic containers when root container cgroup goes 
over memory limit  (was: Preempt all opportunistic containers when root 
container cgroup goes over memory limit)

> Preempt opportunistic containers when root container cgroup goes over memory 
> limit
> --
>
> Key: YARN-6677
> URL: https://issues.apache.org/jira/browse/YARN-6677
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Miklos Szegedi
>Priority: Major
> Attachments: YARN-6677.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6677) Preempt opportunistic containers when root container cgroup goes over memory limit

2018-05-23 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-6677:
-
Attachment: YARN-6677.00.patch

> Preempt opportunistic containers when root container cgroup goes over memory 
> limit
> --
>
> Key: YARN-6677
> URL: https://issues.apache.org/jira/browse/YARN-6677
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha3
>Reporter: Haibo Chen
>Assignee: Miklos Szegedi
>Priority: Major
> Attachments: YARN-6677.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-23 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488210#comment-16488210
 ] 

Haibo Chen commented on YARN-4599:
--

Thanks [~sandflee] for the initial proposal, [~miklos.szeg...@cloudera.com] for 
the patch and everyone else for the discussion! I have now committed the patch 
to trunk!

> Set OOM control for memory cgroups
> --
>
> Key: YARN-4599
> URL: https://issues.apache.org/jira/browse/YARN-4599
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Karthik Kambatla
>Assignee: Miklos Szegedi
>Priority: Major
>  Labels: oct16-medium
> Fix For: 3.2.0
>
> Attachments: Elastic Memory Control in YARN.pdf, YARN-4599.000.patch, 
> YARN-4599.001.patch, YARN-4599.002.patch, YARN-4599.003.patch, 
> YARN-4599.004.patch, YARN-4599.005.patch, YARN-4599.006.patch, 
> YARN-4599.007.patch, YARN-4599.008.patch, YARN-4599.009.patch, 
> YARN-4599.010.patch, YARN-4599.011.patch, YARN-4599.012.patch, 
> YARN-4599.013.patch, YARN-4599.014.patch, YARN-4599.015.patch, 
> YARN-4599.016.patch, YARN-4599.sandflee.patch, yarn-4599-not-so-useful.patch
>
>
> YARN-1856 adds memory cgroups enforcing support. We should also explicitly 
> set OOM control so that containers are not killed as soon as they go over 
> their usage. Today, one could set the swappiness to control this, but 
> clusters with swap turned off exist.
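
To make the mechanism in the description above concrete, here is a minimal 
sketch (assuming the cgroup v1 filesystem and an illustrative 
/sys/fs/cgroup/memory/hadoop-yarn path; this is not the code added by the patch) 
of toggling the per-cgroup OOM killer:

{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class OomControlSketch {
  public static void main(String[] args) throws IOException {
    // Illustrative cgroup v1 path; the real NodeManager hierarchy is configurable.
    // Writing "1" to memory.oom_control disables the kernel OOM killer for the
    // cgroup, so tasks that exceed the limit are paused rather than killed
    // immediately, leaving the decision to a user-space monitor.
    Files.write(
        Paths.get("/sys/fs/cgroup/memory/hadoop-yarn/memory.oom_control"),
        "1".getBytes(StandardCharsets.UTF_8));
  }
}
{code}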



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative

2018-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488201#comment-16488201
 ] 

Wangda Tan commented on YARN-8292:
--

Updated (008) patch.

> Fix the dominant resource preemption cannot happen when some of the resource 
> vector becomes negative
> 
>
> Key: YARN-8292
> URL: https://issues.apache.org/jira/browse/YARN-8292
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8292.001.patch, YARN-8292.002.patch, 
> YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch, 
> YARN-8292.006.patch, YARN-8292.007.patch, YARN-8292.008.patch
>
>
> This is an example of the problem: 
>   
> {code}
> //   guaranteed,  max,used,   pending
> "root(=[30:18:6  30:18:6 12:12:6 1:1:1]);" + //root
> "-a(=[10:6:2 10:6:2  6:6:3   0:0:0]);" + // a
> "-b(=[10:6:2 10:6:2  6:6:3   0:0:0]);" + // b
> "-c(=[10:6:2 10:6:2  0:0:0   1:1:1])"; // c
> {code}
> There are 3 resource types. The total resource of the cluster is 30:18:6.
> For both a and b, there are 3 containers running; each container is 2:2:1.
> Queue c uses 0 resources and has 1:1:1 pending.
> Under the existing logic, preemption cannot happen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative

2018-05-23 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488200#comment-16488200
 ] 

Wangda Tan commented on YARN-8292:
--

Thanks [~jlowe], I addressed all comments. TestPreemptionForQueueWithPriorities 
is a flaky test which only fails in some cases (I tried 10+ times and it failed 
once). I updated the test case a bit to make it more stable and deterministic.

> Fix the dominant resource preemption cannot happen when some of the resource 
> vector becomes negative
> 
>
> Key: YARN-8292
> URL: https://issues.apache.org/jira/browse/YARN-8292
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8292.001.patch, YARN-8292.002.patch, 
> YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch, 
> YARN-8292.006.patch, YARN-8292.007.patch, YARN-8292.008.patch
>
>
> This is an example of the problem: 
>   
> {code}
> //   guaranteed,  max,used,   pending
> "root(=[30:18:6  30:18:6 12:12:6 1:1:1]);" + //root
> "-a(=[10:6:2 10:6:2  6:6:3   0:0:0]);" + // a
> "-b(=[10:6:2 10:6:2  6:6:3   0:0:0]);" + // b
> "-c(=[10:6:2 10:6:2  0:0:0   1:1:1])"; // c
> {code}
> There are 3 resource types. The total resource of the cluster is 30:18:6.
> For both a and b, there are 3 containers running; each container is 2:2:1.
> Queue c uses 0 resources and has 1:1:1 pending.
> Under the existing logic, preemption cannot happen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative

2018-05-23 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-8292:
-
Attachment: YARN-8292.008.patch

> Fix the dominant resource preemption cannot happen when some of the resource 
> vector becomes negative
> 
>
> Key: YARN-8292
> URL: https://issues.apache.org/jira/browse/YARN-8292
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8292.001.patch, YARN-8292.002.patch, 
> YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch, 
> YARN-8292.006.patch, YARN-8292.007.patch, YARN-8292.008.patch
>
>
> This is an example of the problem: 
>   
> {code}
> //   guaranteed,  max,used,   pending
> "root(=[30:18:6  30:18:6 12:12:6 1:1:1]);" + //root
> "-a(=[10:6:2 10:6:2  6:6:3   0:0:0]);" + // a
> "-b(=[10:6:2 10:6:2  6:6:3   0:0:0]);" + // b
> "-c(=[10:6:2 10:6:2  0:0:0   1:1:1])"; // c
> {code}
> There are 3 resource types. The total resource of the cluster is 30:18:6.
> For both a and b, there are 3 containers running; each container is 2:2:1.
> Queue c uses 0 resources and has 1:1:1 pending.
> Under the existing logic, preemption cannot happen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8327) Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on Windows

2018-05-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488198#comment-16488198
 ] 

Hudson commented on YARN-8327:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14276 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14276/])
YARN-8327. Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on (inigoiri: 
rev f09dc73001fd5f3319765fa997f4b0ca9e8f2aff)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/logaggregation/TestAggregatedLogFormat.java


> Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on Windows
> --
>
> Key: YARN-8327
> URL: https://issues.apache.org/jira/browse/YARN-8327
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.3
>
> Attachments: YARN-8327.v1.patch, YARN-8327.v2.patch, 
> image-2018-05-18-16-52-08-250.png, image-2018-05-21-09-05-49-550.png
>
>
> TestAggregatedLogFormat#testReadAcontainerLogs1 fails on Windows because of 
> the line separator.
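
For context, the usual culprit in this kind of failure is a hard-coded "\n" 
being compared against text that uses the platform line separator; a minimal 
illustration (not the actual test code) is:

{code}
public class LineSeparatorDemo {
  public static void main(String[] args) {
    // On Windows System.lineSeparator() is "\r\n", so comparing against a
    // hard-coded "\n" fails even though the logical content is the same.
    String expected = "line1\nline2";
    String actual = "line1" + System.lineSeparator() + "line2";
    System.out.println(expected.equals(actual));
    // Normalizing the separator makes the comparison platform independent.
    System.out.println(expected.equals(actual.replace(System.lineSeparator(), "\n")));
  }
}
{code}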



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7899) [AMRMProxy] Stateful FederationInterceptor for pending requests

2018-05-23 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488179#comment-16488179
 ] 

genericqa commented on YARN-7899:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
53s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
25s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 20s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 14 new + 16 unchanged - 0 fixed = 30 total (was 16) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
14s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
13s{color} | {color:green} hadoop-yarn-server-common in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
59s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}115m 21s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-7899 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12924828/YARN-7899.v2.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 789936ec75bf 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / cddbbe5 |
| 

[jira] [Commented] (YARN-8327) Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on Windows

2018-05-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/YARN-8327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488159#comment-16488159
 ] 

Íñigo Goiri commented on YARN-8327:
---

+1 on  [^YARN-8327.v2.patch].
Committing.

> Fix TestAggregatedLogFormat#testReadAcontainerLogs1 on Windows
> --
>
> Key: YARN-8327
> URL: https://issues.apache.org/jira/browse/YARN-8327
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8327.v1.patch, YARN-8327.v2.patch, 
> image-2018-05-18-16-52-08-250.png, image-2018-05-21-09-05-49-550.png
>
>
> TestAggregatedLogFormat#testReadAcontainerLogs1 fails on Windows because of 
> the line separator.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests to fix NPE failures

2018-05-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488111#comment-16488111
 ] 

Hudson commented on YARN-8348:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14275 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14275/])
YARN-8348. Incorrect and missing AfterClass in HBase-tests to fix NPE 
(inigoiri: rev d72615611cfa6bd82756270d4b10136ec1e56741)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineStorageEntities.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestHBaseStorageFlowRun.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineStorageApps.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineStorageDomain.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestHBaseStorageFlowActivity.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/flow/TestHBaseStorageFlowRunCompaction.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests/src/test/java/org/apache/hadoop/yarn/server/timelineservice/storage/TestHBaseTimelineStorageSchema.java


> Incorrect and missing AfterClass in HBase-tests to fix NPE failures
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in 
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
>  for 2 reasons: 
>  * incorrect afterClass;
>  * not defined KeyProviderTokenIssuer.
> While on Windows, they fail for the previous 2 reasons plus a missing 
> afterClass.
> This JIRA fixes the NPE failures in HBase-tests and reduces the failed tests 
> in Linux.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services

2018-05-23 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488100#comment-16488100
 ] 

genericqa commented on YARN-7530:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 
28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-3.1 Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
15s{color} | {color:red} root in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-yarn-services-api in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m  
8s{color} | {color:red} hadoop-yarn-services-api in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  0m 
44s{color} | {color:red} branch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m  
8s{color} | {color:red} hadoop-yarn-services-api in branch-3.1 failed. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
40s{color} | {color:red} hadoop-yarn-services-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn-services-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 10s{color} 
| {color:red} hadoop-yarn-services-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn-services-api in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 19s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
15s{color} | {color:red} hadoop-yarn-services-api in the patch failed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 13s{color} 
| {color:red} hadoop-yarn-services-api in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | YARN-7530 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12924835/YARN-7530-branch-3.1.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  xml  |
| uname | Linux 8bc14eca71b4 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.1 / 61b5b2f |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/20846/artifact/out/branch-mvninstall-root.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-YARN-Build/20846/artifact/out/branch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-api.txt
 |
| mvnsite | 
https://builds.apache.org/job/PreCommit-YARN-Build/20846/artifact/out/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-services_hadoop-yarn-services-api.txt
 |
| javadoc | 

[jira] [Commented] (YARN-8292) Fix the dominant resource preemption cannot happen when some of the resource vector becomes negative

2018-05-23 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488092#comment-16488092
 ] 

Jason Lowe commented on YARN-8292:
--

Thanks for updating the patch!  The TestPreemptionForQueueWithPriorities 
failure appears to be related.

Nit: Using the new isAnyMajorResourceAboveZero method will be a bit more 
readable and more efficient than the fitsIn check against none since fitsIn 
does unnecessary unit conversion checks.

What is the point of the new static methods added to Resources?  It's more 
succinct to call the ResourceCalculator method directly, e.g.: 
rc.isAnyMajorResourceZeroOrNegative(resource) instead of 
Resources.isAnyMajorResourceZeroOrNegative(rc, resource).

It would be good to clean up the whitespace nit.  Speaking of whitespace, one of 
the checkstyle errors was caused by a whitespace-only formatting change in this 
patch (the for loop in computeFixpointAllocation).
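
To illustrate the semantics being discussed, here is a plain-Java stand-in (not 
the Hadoop ResourceCalculator/Resources API): a check like 
isAnyMajorResourceZeroOrNegative simply tests whether any component of the 
resource vector is non-positive.

{code}
import java.util.Arrays;

public class ResourceVectorCheck {

  // Plain-Java stand-in for the check discussed above: true if any component
  // of the resource vector (e.g. memory:vcores:gpus) is zero or negative.
  static boolean isAnyMajorResourceZeroOrNegative(long[] vector) {
    return Arrays.stream(vector).anyMatch(v -> v <= 0);
  }

  public static void main(String[] args) {
    System.out.println(isAnyMajorResourceZeroOrNegative(new long[] {30, 18, 6})); // false
    System.out.println(isAnyMajorResourceZeroOrNegative(new long[] {1, 0, -2}));  // true
  }
}
{code}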


> Fix the dominant resource preemption cannot happen when some of the resource 
> vector becomes negative
> 
>
> Key: YARN-8292
> URL: https://issues.apache.org/jira/browse/YARN-8292
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Wangda Tan
>Priority: Critical
> Attachments: YARN-8292.001.patch, YARN-8292.002.patch, 
> YARN-8292.003.patch, YARN-8292.004.patch, YARN-8292.005.patch, 
> YARN-8292.006.patch, YARN-8292.007.patch
>
>
> This is an example of the problem: 
>   
> {code}
> //   guaranteed,  max,used,   pending
> "root(=[30:18:6  30:18:6 12:12:6 1:1:1]);" + //root
> "-a(=[10:6:2 10:6:2  6:6:3   0:0:0]);" + // a
> "-b(=[10:6:2 10:6:2  6:6:3   0:0:0]);" + // b
> "-c(=[10:6:2 10:6:2  0:0:0   1:1:1])"; // c
> {code}
> There are 3 resource types. The total resource of the cluster is 30:18:6.
> For both a and b, there are 3 containers running; each container is 2:2:1.
> Queue c uses 0 resources and has 1:1:1 pending.
> Under the existing logic, preemption cannot happen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8343) YARN should have ability to run images only from a whitelist docker registries

2018-05-23 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488084#comment-16488084
 ] 

Eric Badger commented on YARN-8343:
---

I think we need to rework privileged/non-privileged containers as is. I agree 
that there is value in a mechanism that only allows images from certain 
registries to run at all. This could manifest as a whitelist, as a flag to only 
accept images from the privileged-registries list, or as something else we 
design that makes this all less confusing.

> YARN should have ability to run images only from a whitelist docker registries
> --
>
> Key: YARN-8343
> URL: https://issues.apache.org/jira/browse/YARN-8343
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Priority: Critical
>  Labels: Docker
>
> This is a superset of docker.privileged-containers.registries, admin can 
> specify a whitelist and all images from non-privileged-container.registries 
> will be rejected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8348) Incorrect and missing AfterClass in HBase-tests to fix NPE failures

2018-05-23 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-8348:
--
Description: 
HBase tests are failing in 
[linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
 for 2 reasons: 
 * incorrect afterClass;
 * not defined KeyProviderTokenIssuer.

While on Windows, they fail for the previous 2 reasons plus a missing 
afterClass.

This JIRA fixes the NPE failures in HBase-tests and reduces the failed tests in 
Linux.

  was:
HBase tests are failing in 
[linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
 for 2 reasons: 
 * incorrect afterClass;
 * not defined KeyProviderTokenIssuer.

While in windows are failing for the previous 2 reasons plus * missing 
afterClass.

This Jira tracks the effort to fix the NPE failures in HBase-tests and reduces 
the failed tests in Linux.


> Incorrect and missing AfterClass in HBase-tests to fix NPE failures
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in 
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
>  for 2 reasons: 
>  * incorrect afterClass;
>  * not defined KeyProviderTokenIssuer.
> While on Windows, they fail for the previous 2 reasons plus a missing 
> afterClass.
> This JIRA fixes the NPE failures in HBase-tests and reduces the failed tests 
> in Linux.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8344) Missing nm.stop() in TestNodeManagerResync to fix testKillContainersOnResync

2018-05-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488079#comment-16488079
 ] 

Hudson commented on YARN-8344:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14274 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14274/])
YARN-8344. Missing nm.stop() in TestNodeManagerResync to fix (inigoiri: rev 
e99e5bf104e9664bc1b43a2639d87355d47a77e2)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java


> Missing nm.stop() in TestNodeManagerResync to fix testKillContainersOnResync
> 
>
> Key: YARN-8344
> URL: https://issues.apache.org/jira/browse/YARN-8344
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Fix For: 2.10.0, 3.2.0, 3.1.1, 2.9.2, 3.0.3
>
> Attachments: YARN-8344.v1.patch, YARN-8344.v2.patch
>
>
> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync 
> on Windows.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-23 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488073#comment-16488073
 ] 

Eric Badger commented on YARN-8342:
---

{quote}Looks like the name {{docker.privileged-containers.registries}} is very 
misleading. It doesn't apply only for Docker Privileged Containers, right? If 
so, we should fix this name.
{quote}
I 100% agree with this. 

bq. With YARN-7654 changes to use execvp, this concern has been nullified. It 
is safe to preserve launch command even for untrusted images.
If we're going to allow random (untrusted) images to execute, then the command 
with which they start doesn't really matter, user-specified or image-supplied. 
The image could start with any CMD, so we already have to assume that it's 
untrusted/possibly malicious code that is executing right off the bat. I don't 
see any added risk here by letting the user define what they want to run.

> Using docker image from a non-privileged registry, the launch_command is not 
> honored
> 
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Priority: Critical
>  Labels: Docker
>
> During testing of the Docker feature, I found that if a container comes from a 
> non-privileged docker registry, the specified launch command will be ignored. 
> The container will succeed without any log, which is very confusing to end users. 
> This behavior is also inconsistent with containers from privileged docker 
> registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8348) Incorrect and missing AfterClass in HBase-tests to fix NPE failures

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-8348:
---
Description: 
HBase tests are failing in 
[linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
 for 2 reasons: 
 * incorrect afterClass;
 * not defined KeyProviderTokenIssuer.

While on Windows, they fail for the previous 2 reasons plus a missing 
afterClass.

This Jira tracks the effort to fix the NPE failures in HBase-tests and reduces 
the failed tests in Linux.

  was:
HBase tests are failing in 
[linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
 for 2 reasons: 
 * incorrect afterClass;
 * not defined KeyProviderTokenIssuer.

While in windows are failing for the previous 2 reasons plus * missing 
afterClass.

This Jira tracks the effort to fix part of HBase-tests and reduces the failed 
tests in Linux.


> Incorrect and missing AfterClass in HBase-tests to fix NPE failures
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in 
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
>  for 2 reasons: 
>  * incorrect afterClass;
>  * not defined KeyProviderTokenIssuer.
> While on Windows, they fail for the previous 2 reasons plus a missing 
> afterClass.
> This Jira tracks the effort to fix the NPE failures in HBase-tests and 
> reduces the failed tests in Linux.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8348) Incorrect and missing AfterClass in HBase-tests to fix NPE failures

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-8348:
---
Summary: Incorrect and missing AfterClass in HBase-tests to fix NPE 
failures  (was: Incorrect and missing AfterClass in HBase-tests)

> Incorrect and missing AfterClass in HBase-tests to fix NPE failures
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in 
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
>  for 2 reasons: 
>  * incorrect afterClass;
>  * not defined KeyProviderTokenIssuer.
> While on Windows, they fail for the previous 2 reasons plus a missing 
> afterClass.
> This Jira tracks the effort to fix part of HBase-tests and reduces the failed 
> tests in Linux.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6919) Add default volume mount list

2018-05-23 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488071#comment-16488071
 ] 

genericqa commented on YARN-6919:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m  
9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
11s{color} | {color:red} root in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn in branch-3.1 failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 7s{color} | {color:green} branch-3.1 passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
11s{color} | {color:red} hadoop-yarn-api in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-yarn-common in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-3.1 failed. 
{color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  0m 
46s{color} | {color:red} branch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-yarn-api in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-yarn-common in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-3.1 failed. 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn-api in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-yarn-common in branch-3.1 failed. {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn-server-nodemanager in branch-3.1 failed. 
{color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m  
8s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 10s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 8s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
10s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m  
9s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  0m  
9s{color} | {color:red} patch has errors when building and testing our client 
artifacts. 

[jira] [Updated] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services

2018-05-23 Thread Chandni Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandni Singh updated YARN-7530:

Attachment: YARN-7530-branch-3.1.001.patch

> hadoop-yarn-services-api should be part of hadoop-yarn-services
> ---
>
> Key: YARN-7530
> URL: https://issues.apache.org/jira/browse/YARN-7530
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Chandni Singh
>Priority: Blocker
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-7530-branch-3.1.001.patch, YARN-7530.001.patch, 
> YARN-7530.002.patch
>
>
> Hadoop-yarn-services-api is currently a parallel project to 
> hadoop-yarn-services project.  It would be better if hadoop-yarn-services-api 
> is part of hadoop-yarn-services for correctness.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests

2018-05-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488066#comment-16488066
 ] 

Íñigo Goiri commented on YARN-8348:
---

Do you mind updating the description to make it clear that we leave 
KeyProviderTokenIssuer open but fix the NPEs?

> Incorrect and missing AfterClass in HBase-tests
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in 
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
>  for 2 reasons: 
>  * incorrect afterClass;
>  * not defined KeyProviderTokenIssuer.
> While on Windows, they fail for the previous 2 reasons plus a missing 
> afterClass.
> This Jira tracks the effort to fix part of HBase-tests and reduces the failed 
> tests in Linux.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-23 Thread Hsin-Liang Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488013#comment-16488013
 ] 

Hsin-Liang Huang edited comment on YARN-8326 at 5/23/18 9:19 PM:
-

Hi  [~eyang]

   I ran the sample job,

{color:#14892c}time hadoop jar 
/usr/hdp/3.0.0.0-829/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.0.0.3.0.0.0-829.jar
 Client -classpath simple-yarn-app-1.1.0.jar -cmd "java 
com.hortonworks.simpleyarnapp.ApplicationMaster /bin/date 8"{color}

with the changed settings, and it still took 15 seconds, compared to 6 or 7 
seconds in the 2.6 environment.  So I am not sure these two monitoring settings 
play a significant performance role here. The major issue could still be that 
exiting a container is much slower in the 3.0 environment than in 2.6.  Can 
someone from the YARN team look into this? This is a general YARN application 
performance issue in 3.0. 

 


was (Author: hlhu...@us.ibm.com):
Hi  [~eyang]

   I ran the sample job, with the changed settings, it still ran 15 seconds 
compared to 6 or 7 seconds in 2.6 environment.  So I am not sure if the 
significant performance role that these two  monitoring setting would play in 
this. The major issue could still be in the exiting container that in 3.0 
environment is much slower than 2.6 environment.  Can someone from yarn team 
look into this? This is a general yarn application performance issue in 3.0. 

 

> Yarn 3.0 seems runs slower than Yarn 2.6
> 
>
> Key: YARN-8326
> URL: https://issues.apache.org/jira/browse/YARN-8326
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
> Environment: This is the yarn-site.xml for 3.0. 
>  
> 
> 
>  hadoop.registry.dns.bind-port
>  5353
>  
> 
>  hadoop.registry.dns.domain-name
>  hwx.site
>  
> 
>  hadoop.registry.dns.enabled
>  true
>  
> 
>  hadoop.registry.dns.zone-mask
>  255.255.255.0
>  
> 
>  hadoop.registry.dns.zone-subnet
>  172.17.0.0
>  
> 
>  manage.include.files
>  false
>  
> 
>  yarn.acl.enable
>  false
>  
> 
>  yarn.admin.acl
>  yarn
>  
> 
>  yarn.client.nodemanager-connect.max-wait-ms
>  6
>  
> 
>  yarn.client.nodemanager-connect.retry-interval-ms
>  1
>  
> 
>  yarn.http.policy
>  HTTP_ONLY
>  
> 
>  yarn.log-aggregation-enable
>  false
>  
> 
>  yarn.log-aggregation.retain-seconds
>  2592000
>  
> 
>  yarn.log.server.url
>  
> [http://xx:19888/jobhistory/logs|http://whiny2.fyre.ibm.com:19888/jobhistory/logs]
>  
> 
>  yarn.log.server.web-service.url
>  
> [http://xx:8188/ws/v1/applicationhistory|http://whiny2.fyre.ibm.com:8188/ws/v1/applicationhistory]
>  
> 
>  yarn.node-labels.enabled
>  false
>  
> 
>  yarn.node-labels.fs-store.retry-policy-spec
>  2000, 500
>  
> 
>  yarn.node-labels.fs-store.root-dir
>  /system/yarn/node-labels
>  
> 
>  yarn.nodemanager.address
>  0.0.0.0:45454
>  
> 
>  yarn.nodemanager.admin-env
>  MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX
>  
> 
>  yarn.nodemanager.aux-services
>  mapreduce_shuffle,spark2_shuffle,timeline_collector
>  
> 
>  yarn.nodemanager.aux-services.mapreduce_shuffle.class
>  org.apache.hadoop.mapred.ShuffleHandler
>  
> 
>  yarn.nodemanager.aux-services.spark2_shuffle.class
>  org.apache.spark.network.yarn.YarnShuffleService
>  
> 
>  yarn.nodemanager.aux-services.spark2_shuffle.classpath
>  /usr/spark2/aux/*
>  
> 
>  yarn.nodemanager.aux-services.spark_shuffle.class
>  org.apache.spark.network.yarn.YarnShuffleService
>  
> 
>  yarn.nodemanager.aux-services.timeline_collector.class
>  
> org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService
>  
> 
>  yarn.nodemanager.bind-host
>  0.0.0.0
>  
> 
>  yarn.nodemanager.container-executor.class
>  
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
>  
> 
>  yarn.nodemanager.container-metrics.unregister-delay-ms
>  6
>  
> 
>  yarn.nodemanager.container-monitor.interval-ms
>  3000
>  
> 
>  yarn.nodemanager.delete.debug-delay-sec
>  0
>  
> 
>  
> yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage
>  90
>  
> 
>  yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb
>  1000
>  
> 
>  yarn.nodemanager.disk-health-checker.min-healthy-disks
>  0.25
>  
> 
>  yarn.nodemanager.health-checker.interval-ms
>  135000
>  
> 
>  yarn.nodemanager.health-checker.script.timeout-ms
>  6
>  
> 
>  
> yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
>  false
>  
> 
>  yarn.nodemanager.linux-container-executor.group
>  hadoop
>  
> 
>  
> yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users
>  false
>  
> 
>  yarn.nodemanager.local-dirs
>  /hadoop/yarn/local
>  
> 
>  yarn.nodemanager.log-aggregation.compression-type
>  gz
>  
> 
>  

[jira] [Issue Comment Deleted] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-23 Thread Hsin-Liang Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hsin-Liang Huang updated YARN-8326:
---
Comment: was deleted

(was: HI Eric, 

   I tried the suggestion and changed the setting.  The result on running 

{color:#14892c}time hadoop jar 
/usr/hdp/3.0.0.0-829/hadoop-yarn/hadoop-yarn-applications-unmanaged-am-launcher-3.0.0.3.0.0.0-829.jar
 Client -classpath simple-yarn-app-1.1.0.jar -cmd "java 
com.hortonworks.simpleyarnapp.ApplicationMaster /bin/date 8"{color}

 is 20s, 15s and 15s (I ran it 3 times).   It didn't get better if it's not 
worse.  (It was 14, 15 seconds before). )

> Yarn 3.0 seems runs slower than Yarn 2.6
> 
>
> Key: YARN-8326
> URL: https://issues.apache.org/jira/browse/YARN-8326
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
> Environment: This is the yarn-site.xml for 3.0. 
>  
> hadoop.registry.dns.bind-port = 5353
> hadoop.registry.dns.domain-name = hwx.site
> hadoop.registry.dns.enabled = true
> hadoop.registry.dns.zone-mask = 255.255.255.0
> hadoop.registry.dns.zone-subnet = 172.17.0.0
> manage.include.files = false
> yarn.acl.enable = false
> yarn.admin.acl = yarn
> yarn.client.nodemanager-connect.max-wait-ms = 6
> yarn.client.nodemanager-connect.retry-interval-ms = 1
> yarn.http.policy = HTTP_ONLY
> yarn.log-aggregation-enable = false
> yarn.log-aggregation.retain-seconds = 2592000
> yarn.log.server.url = [http://xx:19888/jobhistory/logs|http://whiny2.fyre.ibm.com:19888/jobhistory/logs]
> yarn.log.server.web-service.url = [http://xx:8188/ws/v1/applicationhistory|http://whiny2.fyre.ibm.com:8188/ws/v1/applicationhistory]
> yarn.node-labels.enabled = false
> yarn.node-labels.fs-store.retry-policy-spec = 2000, 500
> yarn.node-labels.fs-store.root-dir = /system/yarn/node-labels
> yarn.nodemanager.address = 0.0.0.0:45454
> yarn.nodemanager.admin-env = MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX
> yarn.nodemanager.aux-services = mapreduce_shuffle,spark2_shuffle,timeline_collector
> yarn.nodemanager.aux-services.mapreduce_shuffle.class = org.apache.hadoop.mapred.ShuffleHandler
> yarn.nodemanager.aux-services.spark2_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.spark2_shuffle.classpath = /usr/spark2/aux/*
> yarn.nodemanager.aux-services.spark_shuffle.class = org.apache.spark.network.yarn.YarnShuffleService
> yarn.nodemanager.aux-services.timeline_collector.class = org.apache.hadoop.yarn.server.timelineservice.collector.PerNodeTimelineCollectorsAuxService
> yarn.nodemanager.bind-host = 0.0.0.0
> yarn.nodemanager.container-executor.class = org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
> yarn.nodemanager.container-metrics.unregister-delay-ms = 6
> yarn.nodemanager.container-monitor.interval-ms = 3000
> yarn.nodemanager.delete.debug-delay-sec = 0
> yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage = 90
> yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb = 1000
> yarn.nodemanager.disk-health-checker.min-healthy-disks = 0.25
> yarn.nodemanager.health-checker.interval-ms = 135000
> yarn.nodemanager.health-checker.script.timeout-ms = 6
> yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage = false
> yarn.nodemanager.linux-container-executor.group = hadoop
> yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users = false
> yarn.nodemanager.local-dirs = /hadoop/yarn/local
> yarn.nodemanager.log-aggregation.compression-type = gz
> yarn.nodemanager.log-aggregation.debug-enabled = false
> yarn.nodemanager.log-aggregation.num-log-files-per-app = 30
> yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds = 3600
> yarn.nodemanager.log-dirs = /hadoop/yarn/log
> yarn.nodemanager.log.retain-seconds = 604800
> yarn.nodemanager.pmem-check-enabled = false
> yarn.nodemanager.recovery.dir = /var/log/hadoop-yarn/nodemanager/recovery-state
> yarn.nodemanager.recovery.enabled = true
> yarn.nodemanager.recovery.supervised = true
> yarn.nodemanager.remote-app-log-dir = /app-logs
> yarn.nodemanager.remote-app-log-dir-suffix = logs
> yarn.nodemanager.resource-plugins = 
> yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices = auto
>  

[jira] [Comment Edited] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-23 Thread Hsin-Liang Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488013#comment-16488013
 ] 

Hsin-Liang Huang edited comment on YARN-8326 at 5/23/18 9:15 PM:
-

Hi  [~eyang]

   I ran the sample job with the changed settings, and it still ran in 15 seconds 
compared to 6 or 7 seconds in the 2.6 environment.  So I am not sure that these 
two monitoring settings play a significant role in the performance.  The major 
issue could still be container exit, which in the 3.0 environment is much slower 
than in 2.6.  Can someone from the YARN team look into this?  This is a general 
YARN application performance issue in 3.0. 

 


was (Author: hlhu...@us.ibm.com):
Hi  [~eyang]

   Here is another update.  The simple job that I ran with the suggested setting 
changes did improve.  However, our unit test cases still ran 14 hours compared to 
7 hours in the 2.6 environment, and another sample job with the changed settings 
still ran 15 seconds compared to 6 or 7 seconds in 2.6.  So I think the monitoring 
settings affect the performance only a little; the major issue could still be 
container exit, which in the 3.0 environment is much slower than in 2.6.  Is 
anyone looking into this area?   Thanks!

 

> Yarn 3.0 seems runs slower than Yarn 2.6
> 
>
> Key: YARN-8326
> URL: https://issues.apache.org/jira/browse/YARN-8326
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
> Environment: This is the yarn-site.xml for 3.0. 
>  
> [yarn-site.xml property listing omitted; identical to the listing quoted in full 
> above, truncated here at yarn.nodemanager.log-aggregation.debug-enabled] 
>  

[jira] [Updated] (YARN-8344) Missing nm.stop() in TestNodeManagerResync to fix testKillContainersOnResync

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-8344:
---
Summary: Missing nm.stop() in TestNodeManagerResync to fix 
testKillContainersOnResync  (was: Missing nm.close() in TestNodeManagerResync 
to fix testKillContainersOnResync)

> Missing nm.stop() in TestNodeManagerResync to fix testKillContainersOnResync
> 
>
> Key: YARN-8344
> URL: https://issues.apache.org/jira/browse/YARN-8344
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8344.v1.patch, YARN-8344.v2.patch
>
>
> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync 
> on Windows.
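For illustration, this is the kind of per-test cleanup the summary refers to: a minimal 
JUnit 4 sketch, not the actual TestNodeManagerResync code, with a hypothetical class and 
field name, that stops the NodeManager started by a test so its ports, threads and local 
directories are released before the next test (which matters in particular on Windows).

{code}
import org.apache.hadoop.yarn.server.nodemanager.NodeManager;
import org.junit.After;

// Hypothetical example: "nm" stands for the NodeManager a test has started.
public class NodeManagerCleanupExample {
  private NodeManager nm;

  @After
  public void tearDown() {
    if (nm != null) {
      // Stop the NodeManager service so sockets, threads and local dirs
      // are released instead of leaking into the following tests.
      nm.stop();
      nm = null;
    }
  }
}
{code}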



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-23 Thread Hsin-Liang Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hsin-Liang Huang updated YARN-8326:
---
Comment: was deleted

(was: [~eyang]   This afternoon I tried the command and the performance was 
dramatically improved.  It used to run in 8 seconds; now it runs in 3 seconds 
consistently.  I then compared with the other 3.0 cluster, where I didn't make 
the property changes you suggested, and it still ran 8 seconds consistently.   
I am going to run our test cases to see if the performance is also improved 
there. )

> Yarn 3.0 seems runs slower than Yarn 2.6
> 
>
> Key: YARN-8326
> URL: https://issues.apache.org/jira/browse/YARN-8326
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
> Environment: This is the yarn-site.xml for 3.0. 
>  
> [yarn-site.xml property listing omitted; identical to the listing quoted in full 
> above, plus one additional property at the end: 
> yarn.nodemanager.resource-plugins.gpu.docker-plugin = nvidia-docker-v1] 
>  

[jira] [Commented] (YARN-6919) Add default volume mount list

2018-05-23 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488049#comment-16488049
 ] 

Eric Badger commented on YARN-6919:
---

Hey [~shaneku...@gmail.com], I think this should go into 3.1 as well, so I just 
put up a patch. Note, however, that YARN-7530 has broken branch-3.1 compilation 
from a clean .m2 repo. I'm not sure what genericqa will do.

> Add default volume mount list
> -
>
> Key: YARN-6919
> URL: https://issues.apache.org/jira/browse/YARN-6919
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
>  Labels: Docker
> Attachments: YARN-6919-branch-3.1.002.patch, YARN-6919.001.patch, 
> YARN-6919.002.patch
>
>
> Piggybacking on YARN-5534, we should create a default list that bind mounts 
> selected volumes into all docker containers. This list will be empty by 
> default 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6919) Add default volume mount list

2018-05-23 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-6919:
--
Attachment: YARN-6919-branch-3.1.002.patch

> Add default volume mount list
> -
>
> Key: YARN-6919
> URL: https://issues.apache.org/jira/browse/YARN-6919
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
>  Labels: Docker
> Attachments: YARN-6919-branch-3.1.002.patch, YARN-6919.001.patch, 
> YARN-6919.002.patch
>
>
> Piggybacking on YARN-5534, we should create a default list that bind mounts 
> selected volumes into all docker containers. This list will be empty by 
> default 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7899) [AMRMProxy] Stateful FederationInterceptor for pending requests

2018-05-23 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-7899:
---
Attachment: YARN-7899.v2.patch

> [AMRMProxy] Stateful FederationInterceptor for pending requests
> ---
>
> Key: YARN-7899
> URL: https://issues.apache.org/jira/browse/YARN-7899
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Major
> Attachments: YARN-7899.v1.patch, YARN-7899.v2.patch
>
>
> Today FederationInterceptor (in AMRMProxy for YARN Federation) is stateless 
> in terms of pending (outstanding) requests. Whenever the AM issues new requests, 
> FI simply splits and sends them to the sub-cluster YarnRMs and forgets about them. 
> This JIRA attempts to make FI stateful so that it remembers the pending 
> requests in all relevant sub-clusters. This has two major benefits: 
> 1. It is a prerequisite for FI being able to cancel a pending request in one 
> sub-cluster and re-send it to other sub-clusters. This is needed for load 
> balancing and to fully comply with the relax-locality fallback-to-ANY 
> semantics. When we send a request to one sub-cluster, we have effectively 
> restricted the allocation for this request to that sub-cluster 
> rather than allowing it anywhere. If the cluster capacity for this app in that 
> sub-cluster is full, or its YarnRM is overloaded and slow, the request can be stuck 
> there for a long time even if there is free capacity in other sub-clusters. 
> We need FI to remember and adjust the pending requests on the fly. 
> 2. It makes pending-request recovery easier when a YarnRM fails over. Today, 
> whenever one sub-cluster RM fails over, in order to recover the lost pending 
> requests for that sub-cluster, 
> we have to propagate the ApplicationMasterNotRegisteredException from the 
> YarnRM back to the AM, triggering a full pending resend from the AM. This resend 
> contains the pending requests not only for the failing-over sub-cluster, but for 
> all sub-clusters. Since our 
> split-merge (AMRMProxyPolicy) does not guarantee idempotency, the same 
> request we sent to sub-cluster-1 earlier might be resent to sub-cluster-2. If 
> both of these YarnRMs have not failed over, they will both allocate for this 
> request, leading to over-allocation. These full pending resends also 
> put unnecessary load on every YarnRM in the cluster every time one YarnRM 
> fails over. With a stateful FederationInterceptor, since we remember the pending 
> requests we have sent out earlier, we can shield the 
> ApplicationMasterNotRegisteredException from the AM and resend the pending 
> requests only to the failed-over YarnRM. This eliminates the over-allocation and 
> minimizes the recovery overhead. 
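As a rough illustration of the bookkeeping described above (a sketch only, not the 
FederationInterceptor implementation; every class and method name here is hypothetical), 
a stateful interceptor could track the outstanding asks per sub-cluster so that they can 
be adjusted on the fly or selectively resent when a single sub-cluster RM fails over.

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.yarn.api.records.ResourceRequest;

// Hypothetical sketch: remember which pending ResourceRequests were sent to
// which sub-cluster, so that on a failover of one sub-cluster RM only that
// sub-cluster's pending asks need to be resent, instead of a full resend
// of everything from the AM.
public class PendingAskBook {
  private final Map<String, List<ResourceRequest>> pendingBySubCluster =
      new HashMap<>();

  public synchronized void recordSent(String subClusterId,
      List<ResourceRequest> asks) {
    pendingBySubCluster
        .computeIfAbsent(subClusterId, k -> new ArrayList<>())
        .addAll(asks);
  }

  // Called when containers are allocated or asks are cancelled/re-routed.
  public synchronized void removeSatisfied(String subClusterId,
      List<ResourceRequest> satisfied) {
    List<ResourceRequest> pending = pendingBySubCluster.get(subClusterId);
    if (pending != null) {
      pending.removeAll(satisfied);
    }
  }

  // On failover of one sub-cluster RM, resend only its pending asks.
  public synchronized List<ResourceRequest> pendingFor(String subClusterId) {
    return new ArrayList<>(
        pendingBySubCluster.getOrDefault(subClusterId, new ArrayList<>()));
  }
}
{code}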



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services

2018-05-23 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-7530:
-
Fix Version/s: 3.2.0

> hadoop-yarn-services-api should be part of hadoop-yarn-services
> ---
>
> Key: YARN-7530
> URL: https://issues.apache.org/jira/browse/YARN-7530
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Chandni Singh
>Priority: Blocker
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-7530.001.patch, YARN-7530.002.patch
>
>
> Hadoop-yarn-services-api is currently a parallel project to 
> hadoop-yarn-services project.  It would be better if hadoop-yarn-services-api 
> is part of hadoop-yarn-services for correctness.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services

2018-05-23 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-7530:
-
Priority: Blocker  (was: Trivial)

> hadoop-yarn-services-api should be part of hadoop-yarn-services
> ---
>
> Key: YARN-7530
> URL: https://issues.apache.org/jira/browse/YARN-7530
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Chandni Singh
>Priority: Blocker
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-7530.001.patch, YARN-7530.002.patch
>
>
> Hadoop-yarn-services-api is currently a parallel project to 
> hadoop-yarn-services project.  It would be better if hadoop-yarn-services-api 
> is part of hadoop-yarn-services for correctness.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services

2018-05-23 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-7530:
-
Fix Version/s: (was: 3.2.0)

> hadoop-yarn-services-api should be part of hadoop-yarn-services
> ---
>
> Key: YARN-7530
> URL: https://issues.apache.org/jira/browse/YARN-7530
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Chandni Singh
>Priority: Blocker
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-7530.001.patch, YARN-7530.002.patch
>
>
> Hadoop-yarn-services-api is currently a parallel project to 
> hadoop-yarn-services project.  It would be better if hadoop-yarn-services-api 
> is part of hadoop-yarn-services for correctness.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8326) Yarn 3.0 seems runs slower than Yarn 2.6

2018-05-23 Thread Hsin-Liang Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488013#comment-16488013
 ] 

Hsin-Liang Huang commented on YARN-8326:


Hi  [~eyang]

   Here is another update.  The simple job that I ran with the suggested setting 
changes did improve.  However, our unit test cases still ran 14 hours compared to 
7 hours in the 2.6 environment, and another sample job with the changed settings 
still ran 15 seconds compared to 6 or 7 seconds in 2.6.  So I think the monitoring 
settings affect the performance only a little; the major issue could still be 
container exit, which in the 3.0 environment is much slower than in 2.6.  Is 
anyone looking into this area?   Thanks!

 

> Yarn 3.0 seems runs slower than Yarn 2.6
> 
>
> Key: YARN-8326
> URL: https://issues.apache.org/jira/browse/YARN-8326
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0
> Environment: This is the yarn-site.xml for 3.0. 
>  
> [yarn-site.xml property listing omitted; identical to the listing quoted in full 
> above, truncated here at yarn.nodemanager.remote-app-log-dir = /app-logs] 
>  

[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488002#comment-16488002
 ] 

Giovanni Matteo Fumarola commented on YARN-8348:


Thanks [~elgoiri] for the review. I will open a follow-up Jira for 
KeyProviderTokenIssuer. As I said before, this patch brings the failed tests 
from 21 down to 16 on Linux.

As with HDFS-13558, which you and [~huanbang1993] fixed by closing the cluster, the 
patch will fix the failures on Windows for TestHBaseTimelineStorageDomain and 
TestHBaseTimelineStorageSchema.

> Incorrect and missing AfterClass in HBase-tests
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in 
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
>  for 2 reasons: 
>  * incorrect afterClass;
>  * KeyProviderTokenIssuer not defined.
> On Windows they are failing for the previous 2 reasons plus:
>  * missing afterClass.
> This Jira tracks the effort to fix part of the HBase tests and reduce the number 
> of failed tests in Linux.
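For context, the usual JUnit pattern these tests rely on is an @AfterClass that shuts 
down whatever the @BeforeClass started. A minimal sketch, assuming the tests use 
HBaseTestingUtility as is typical for HBase-backed storage tests; the class name below 
is hypothetical and is not one of the tests listed above.

{code}
import org.apache.hadoop.hbase.HBaseTestingUtility;
import org.junit.AfterClass;
import org.junit.BeforeClass;

// Hypothetical example: every @BeforeClass that starts a mini cluster needs a
// matching @AfterClass that shuts it down, otherwise ports, threads and (on
// Windows) file handles leak into the tests that run afterwards.
public class HBaseStorageTestExample {
  private static HBaseTestingUtility util;

  @BeforeClass
  public static void setupBeforeClass() throws Exception {
    util = new HBaseTestingUtility();
    util.startMiniCluster();
  }

  @AfterClass
  public static void tearDownAfterClass() throws Exception {
    if (util != null) {
      util.shutdownMiniCluster();
    }
  }
}
{code}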



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-23 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487992#comment-16487992
 ] 

Shane Kumpf commented on YARN-8342:
---

This sounds like a reasonable proposal. In cases where the current behavior is 
desired, the user can set "launch_command" to an empty string I guess?

To be clear, there is no replacement with an "empty bash". The current 
"untrusted" mode leaves it up to the Docker image to specify the 
ENTRYPOINT/CMD. Nothing is overwritten by YARN in this "untrusted" mode. It is 
very common for images to use "bash" as the CMD. When an image does this and 
YARN runs in this "untrusted" mode, a non-interactive "bash" shell starts in 
the container and immediately exits with success. YARN reports that the 
container ran successfully, but this is confusing to the user because the code 
they expected to run did not run. The launch script depends on mounts, and 
"untrusted" mode strips all mounts, meaning we flat out can't use a 
launch_script in this mode as we would in "trusted" mode. Allowing the 
"launch_command" supplied by the user, without embedding that "launch_command" 
in the launch script, seems like a viable way to support both. Confused yet? :)

> Using docker image from a non-privileged registry, the launch_command is not 
> honored
> 
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Priority: Critical
>  Labels: Docker
>
> During testing of the Docker feature, I found that if a container comes from a 
> non-privileged docker registry, the specified launch command will be ignored. 
> The container will succeed without producing any log, which is very confusing to 
> end users, and this behavior is inconsistent with containers from privileged 
> docker registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync

2018-05-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487987#comment-16487987
 ] 

Íñigo Goiri commented on YARN-8344:
---

+1 on  [^YARN-8344.v2.patch].
We still need to figure out the proper fix for the path length issue on Windows.
[~giovanni.fumarola], please link this JIRA when you open the Windows fix.

> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync
> -
>
> Key: YARN-8344
> URL: https://issues.apache.org/jira/browse/YARN-8344
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8344.v1.patch, YARN-8344.v2.patch
>
>
> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync 
> on Windows.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests

2018-05-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487984#comment-16487984
 ] 

Íñigo Goiri commented on YARN-8348:
---

The good news is that the NPE is gone.
However, the original NoClassDefFoundError surfaces clearly now.
I'm fine committing this as is, but I'd like a follow-up JIRA on why 
KeyProviderTokenIssuer is not found.

> Incorrect and missing AfterClass in HBase-tests
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in 
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
>  for 2 reasons: 
>  * incorrect afterClass;
>  * KeyProviderTokenIssuer not defined.
> On Windows they are failing for the previous 2 reasons plus:
>  * missing afterClass.
> This Jira tracks the effort to fix part of the HBase tests and reduce the number 
> of failed tests in Linux.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-8333) Load balance YARN services using RegistryDNS multiple A records

2018-05-23 Thread Eric Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang reassigned YARN-8333:
---

Assignee: Eric Yang

> Load balance YARN services using RegistryDNS multiple A records
> ---
>
> Key: YARN-8333
> URL: https://issues.apache.org/jira/browse/YARN-8333
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>
> For scaling stateless containers, it would be great to support DNS round 
> robin for fault tolerance and load balancing.  The current DNS record format 
> for RegistryDNS is 
> [container-instance].[application-name].[username].[domain].  For example:
> {code}
> appcatalog-0.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog-1.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog-2.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog-3.appname.hbase.ycluster. IN A 123.123.123.123
> {code}
> It would be nice to add a multi-A record that contains all IP addresses of the 
> same component, in addition to the instance-based records.  For example:
> {code}
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.120
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.121
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.122
> appcatalog.appname.hbase.ycluster. IN A 123.123.123.123
> {code}
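As a usage sketch under the assumption that such a multi-A record exists, a client could 
resolve the component name with plain DNS and pick any of the returned addresses, which 
gives simple client-side round robin. The hostname is taken from the example above; the 
class itself is illustrative and not part of any YARN client API.

{code}
import java.net.InetAddress;
import java.util.concurrent.ThreadLocalRandom;

// Resolve all A records for the component name and pick one at random.
// Relies only on standard DNS resolution against RegistryDNS.
public class RegistryDnsRoundRobin {
  public static void main(String[] args) throws Exception {
    InetAddress[] addrs =
        InetAddress.getAllByName("appcatalog.appname.hbase.ycluster");
    InetAddress chosen =
        addrs[ThreadLocalRandom.current().nextInt(addrs.length)];
    System.out.println("Connecting to " + chosen.getHostAddress());
  }
}
{code}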



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8342) Using docker image from a non-privileged registry, the launch_command is not honored

2018-05-23 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487952#comment-16487952
 ] 

Eric Yang commented on YARN-8342:
-

[~vinodkv] The launch command was dropped in YARN-7516 due to concerns that shell 
expansion could cause the commands to run as the root user via popen.  With the 
YARN-7654 change to use execvp, this concern has been nullified.  It is safe to 
preserve the launch command even for untrusted images.

[~shaneku...@gmail.com] [~ebadger] [~jlowe] Do you agree with this change?

> Using docker image from a non-privileged registry, the launch_command is not 
> honored
> 
>
> Key: YARN-8342
> URL: https://issues.apache.org/jira/browse/YARN-8342
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Priority: Critical
>  Labels: Docker
>
> During testing of the Docker feature, I found that if a container comes from a 
> non-privileged docker registry, the specified launch command will be ignored. 
> The container will succeed without producing any log, which is very confusing to 
> end users, and this behavior is inconsistent with containers from privileged 
> docker registries.
> cc: [~eyang], [~shaneku...@gmail.com], [~ebadger], [~jlowe]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests

2018-05-23 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487951#comment-16487951
 ] 

genericqa commented on YARN-8348:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m  7s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 45s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase-tests
 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 27s{color} 
| {color:red} hadoop-yarn-server-timelineservice-hbase-tests in the patch 
failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 46m  1s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageEntities |
|   | hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageSchema 
|
|   | hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps |
|   | 
hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
 |
|   | 
hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity |
|   | hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageDomain 
|
|   | 
hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRunCompaction
 |
|   | hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce 

[jira] [Assigned] (YARN-8349) Remove YARN registry entries when a service is killed by the RM

2018-05-23 Thread Shane Kumpf (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf reassigned YARN-8349:
-

Assignee: Billie Rinaldi

> Remove YARN registry entries when a service is killed by the RM
> ---
>
> Key: YARN-8349
> URL: https://issues.apache.org/jira/browse/YARN-8349
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Affects Versions: 3.2.0, 3.1.1
>Reporter: Shane Kumpf
>Assignee: Billie Rinaldi
>Priority: Major
>
> As the title states, when a service is killed by the RM (for exceeding its 
> lifetime for example), the YARN registry entries should be cleaned up.
> Without cleanup, DNS can contain multiple hostnames for a single IP address 
> in the case where IPs are reused. This impacts reverse lookups, which breaks 
> services, such as kerberos, that depend on those lookups.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-8349) Remove YARN registry entries when a service is killed by the RM

2018-05-23 Thread Shane Kumpf (JIRA)
Shane Kumpf created YARN-8349:
-

 Summary: Remove YARN registry entries when a service is killed by 
the RM
 Key: YARN-8349
 URL: https://issues.apache.org/jira/browse/YARN-8349
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: yarn-native-services
Affects Versions: 3.2.0, 3.1.1
Reporter: Shane Kumpf


As the title states, when a service is killed by the RM (for exceeding its 
lifetime for example), the YARN registry entries should be cleaned up.

Without cleanup, DNS can contain multiple hostnames for a single IP address in 
the case where IPs are reused. This impacts reverse lookups, which breaks 
services, such as kerberos, that depend on those lookups.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-8334) [GPG] Fix potential connection leak in GPGUtils

2018-05-23 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang resolved YARN-8334.

Resolution: Fixed

> [GPG] Fix potential connection leak in GPGUtils
> ---
>
> Key: YARN-8334
> URL: https://issues.apache.org/jira/browse/YARN-8334
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Minor
> Attachments: YARN-8334-YARN-7402.v1.patch, 
> YARN-8334-YARN-7402.v2.patch
>
>
> Missing ClientResponse.close and Client.destroy can lead to a connection leak.
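For context, a minimal sketch of the close/destroy pattern this issue is about, assuming 
the Jersey 1.x client API (com.sun.jersey) used elsewhere in Hadoop; the class name, URL 
handling and media type below are illustrative only, not the GPGUtils code.

{code}
import com.sun.jersey.api.client.Client;
import com.sun.jersey.api.client.ClientResponse;

// Always close the ClientResponse and destroy the Client, even on error paths,
// otherwise the underlying HTTP connections are leaked.
public class WebServiceCallExample {
  public static String fetch(String url) {
    Client client = Client.create();
    ClientResponse response = null;
    try {
      response = client.resource(url)
          .accept("application/json")
          .get(ClientResponse.class);
      return response.getEntity(String.class);
    } finally {
      if (response != null) {
        response.close();
      }
      client.destroy();
    }
  }
}
{code}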



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8334) [GPG] Fix potential connection leak in GPGUtils

2018-05-23 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487945#comment-16487945
 ] 

Botong Huang commented on YARN-8334:


Committed to YARN-7402 as db183f2ea. Thanks [~giovanni.fumarola] for the patch 
and [~elgoiri] for the review! 

> [GPG] Fix potential connection leak in GPGUtils
> ---
>
> Key: YARN-8334
> URL: https://issues.apache.org/jira/browse/YARN-8334
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Minor
> Attachments: YARN-8334-YARN-7402.v1.patch, 
> YARN-8334-YARN-7402.v2.patch
>
>
> Missing ClientResponse.close and Client.destroy can lead to a connection leak.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Reopened] (YARN-7530) hadoop-yarn-services-api should be part of hadoop-yarn-services

2018-05-23 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger reopened YARN-7530:
---

This change breaks branch-3.1 compilation if the .m2 directory is cleaned.

{noformat}
[ERROR] [ERROR] Some problems were encountered while processing the POMs:
[WARNING] 'parent.relativePath' of POM 
org.apache.hadoop:hadoop-yarn-services-api:[unknown-version] 
(/Users/ebadger/apachehadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/pom.xml)
 points at org.apache.hadoop:hadoop-yarn-services instead of 
org.apache.hadoop:hadoop-yarn-applications, please verify your project 
structure @ line 19, column 11
[FATAL] Non-resolvable parent POM for 
org.apache.hadoop:hadoop-yarn-services-api:[unknown-version]: Could not find 
artifact org.apache.hadoop:hadoop-yarn-applications:pom:3.1.1-SNAPSHOT and 
'parent.relativePath' points at wrong local POM @ line 19, column 11
[WARNING] 'build.plugins.plugin.version' for 
org.apache.maven.plugins:maven-gpg-plugin is missing. @ line 133, column 15
 @
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]
[ERROR]   The project 
org.apache.hadoop:hadoop-yarn-services-api:[unknown-version] 
(/Users/ebadger/apachehadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/pom.xml)
 has 1 error
[ERROR] Non-resolvable parent POM for 
org.apache.hadoop:hadoop-yarn-services-api:[unknown-version]: Could not find 
artifact org.apache.hadoop:hadoop-yarn-applications:pom:3.1.1-SNAPSHOT and 
'parent.relativePath' points at wrong local POM @ line 19, column 11 -> [Help 2]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
[ERROR] [Help 2] 
http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableModelException
{noformat}


Here's the difference between branch-3.1 and trunk. The parent artifactId was 
updated correctly in trunk, but not in branch-3.1:
{noformat}
diff --git 
a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/pom.xml
 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/pom.xml
index 45168a9fbc4..d45da093102 100644
--- 
a/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/pom.xml
+++ 
b/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-api/pom.xml
@@ -18,8 +18,8 @@
   <modelVersion>4.0.0</modelVersion>
   <parent>
     <groupId>org.apache.hadoop</groupId>
-    <artifactId>hadoop-yarn-services</artifactId>
-    <version>3.2.0-SNAPSHOT</version>
+    <artifactId>hadoop-yarn-applications</artifactId>
+    <version>3.1.1-SNAPSHOT</version>
   </parent>
   <artifactId>hadoop-yarn-services-api</artifactId>
   <name>Apache Hadoop YARN Services API</name>
{noformat}

> hadoop-yarn-services-api should be part of hadoop-yarn-services
> ---
>
> Key: YARN-7530
> URL: https://issues.apache.org/jira/browse/YARN-7530
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-native-services
>Affects Versions: 3.1.0
>Reporter: Eric Yang
>Assignee: Chandni Singh
>Priority: Trivial
> Fix For: 3.2.0, 3.1.1
>
> Attachments: YARN-7530.001.patch, YARN-7530.002.patch
>
>
> Hadoop-yarn-services-api is currently a parallel project to 
> hadoop-yarn-services project.  It would be better if hadoop-yarn-services-api 
> is part of hadoop-yarn-services for correctness.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8334) [] Fix potential connection leak in GPGUtils

2018-05-23 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-8334:
---
Summary: [] Fix potential connection leak in GPGUtils  (was: Fix potential 
connection leak in GPGUtils)

> [] Fix potential connection leak in GPGUtils
> 
>
> Key: YARN-8334
> URL: https://issues.apache.org/jira/browse/YARN-8334
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Minor
> Attachments: YARN-8334-YARN-7402.v1.patch, 
> YARN-8334-YARN-7402.v2.patch
>
>
> Missing ClientResponse.close and Client.destroy can lead to a connection leak.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-8334) [GPG] Fix potential connection leak in GPGUtils

2018-05-23 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-8334:
---
Summary: [GPG] Fix potential connection leak in GPGUtils  (was: [] Fix 
potential connection leak in GPGUtils)

> [GPG] Fix potential connection leak in GPGUtils
> ---
>
> Key: YARN-8334
> URL: https://issues.apache.org/jira/browse/YARN-8334
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Minor
> Attachments: YARN-8334-YARN-7402.v1.patch, 
> YARN-8334-YARN-7402.v2.patch
>
>
> Missing ClientResponse.close and Client.destroy can lead to a connection leak.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4781) Support intra-queue preemption for fairness ordering policy.

2018-05-23 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487902#comment-16487902
 ] 

Eric Payne commented on YARN-4781:
--

bq.  FairOrdering policy could be used with weights?
[~sunilg], the fair ordering preemption will generally select the 
smaller-weighted users first even when those containers are older. It's a 
hierarchy of priority ordering, though, and it does still try to be "fair," so 
you could have a situation where the youngest containers are selected even 
though they are owned by a more heavily-weighted user.
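To illustrate the idea only (this is not the CapacityScheduler implementation; every 
name below is hypothetical), ranking users by their usage relative to their weight shows 
why smaller-weighted users tend to be selected first as preemption candidates: with the 
same absolute usage, a smaller weight makes a user look further over its fair share.

{code}
import java.util.Comparator;
import java.util.List;

// Conceptual sketch: order users so that the one most over its weighted fair
// share (highest usage per unit of weight) is preempted from first.
public class WeightedFairPreemptionOrder {
  public static class UserUsage {
    final String user;
    final double usedMemoryMb;
    final double weight; // larger weight => larger entitled share

    UserUsage(String user, double usedMemoryMb, double weight) {
      this.user = user;
      this.usedMemoryMb = usedMemoryMb;
      this.weight = weight;
    }

    double weightedUsage() {
      return usedMemoryMb / weight;
    }
  }

  public static void sortForPreemption(List<UserUsage> users) {
    users.sort(
        Comparator.<UserUsage>comparingDouble(UserUsage::weightedUsage)
            .reversed());
  }
}
{code}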

> Support intra-queue preemption for fairness ordering policy.
> 
>
> Key: YARN-4781
> URL: https://issues.apache.org/jira/browse/YARN-4781
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Wangda Tan
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-4781.001.patch, YARN-4781.002.patch, 
> YARN-4781.003.patch, YARN-4781.004.patch, YARN-4781.005.patch
>
>
> We introduced the fairness queue ordering policy in YARN-3319, which lets large 
> applications make progress without starving small applications. However, if a 
> large application takes the queue's resources and the containers of the large 
> app have long lifespans, small applications could still wait a long time for 
> resources and SLAs cannot be guaranteed.
> Instead of waiting for applications to release resources on their own, we need 
> to preempt resources of queues with the fairness policy enabled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync

2018-05-23 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487894#comment-16487894
 ] 

genericqa commented on YARN-8344:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 20s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 0 new + 29 unchanged - 2 fixed = 29 total (was 31) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 45s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
56s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 76m 25s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8344 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12924796/YARN-8344.v2.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux db23130e69a9 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 51ce02b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20841/testReport/ |
| Max. process+thread count | 306 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20841/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |

[jira] [Commented] (YARN-8336) Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils

2018-05-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487895#comment-16487895
 ] 

Hudson commented on YARN-8336:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14272 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/14272/])
YARN-8336. Fix potential connection leak in SchedConfCLI and (inigoiri: rev 
e30938af1270e079587e7bc06b755f9e93e660a5)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/SchedConfCLI.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/util/YarnWebServiceUtils.java


> Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils
> -
>
> Key: YARN-8336
> URL: https://issues.apache.org/jira/browse/YARN-8336
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: YARN-8336.v1.patch, YARN-8336.v2.patch
>
>
> Missing ClientResponse.close and Client.destroy can lead to a connection leak.
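
For reference, a minimal sketch of the close-everything pattern that avoids this
kind of leak with the Jersey 1.x client these classes use; the class name and
URL handling are placeholders, not the committed patch:

{code:java}
import com.sun.jersey.api.client.Client;
import com.sun.jersey.api.client.ClientResponse;

public class JerseyCloseSketch {
  public static String getJson(String url) {
    Client client = Client.create();
    ClientResponse response = null;
    try {
      response = client.resource(url)
          .accept("application/json")
          .get(ClientResponse.class);
      return response.getEntity(String.class);
    } finally {
      // Close the response and destroy the client so the underlying HTTP
      // connection is released even when an exception is thrown.
      if (response != null) {
        response.close();
      }
      client.destroy();
    }
  }
}
{code}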






[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests

2018-05-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487886#comment-16487886
 ] 

Íñigo Goiri commented on YARN-8348:
---

Technically the null check in AfterClass shouldn't be needed, as a failure in 
BeforeClass should trigger the error everywhere else.
In any case, it is good not to get an NPE if the BeforeClass fails.
So in the output we went from a double NoClassDefFound+NPE to just 
NoClassDefFound.
I think this is an improvement, but we still need to figure out the reason for 
the NoClassDefFound (probably in a separate JIRA).

The real fix here is the one in TestHBaseTimelineStorageDomain, which currently 
leaves the mini cluster open.
[^YARN-8348.v1.patch] LGTM.
Let's wait for Yetus.
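
For context, a minimal sketch of the BeforeClass/AfterClass null-guard pattern
being discussed; the class and field names are illustrative and not the actual
timeline-service tests:

{code:java}
import org.apache.hadoop.hbase.HBaseTestingUtility;
import org.junit.AfterClass;
import org.junit.BeforeClass;

public class HBaseStorageTestSketch {
  private static HBaseTestingUtility util;

  @BeforeClass
  public static void setupBeforeClass() throws Exception {
    // If HBaseTestingUtility cannot even be loaded (NoClassDefFoundError),
    // the assignment below never happens and 'util' stays null.
    util = new HBaseTestingUtility();
    util.startMiniCluster();
  }

  @AfterClass
  public static void tearDownAfterClass() throws Exception {
    // Guard against BeforeClass failing before the mini cluster came up, so
    // the report shows only the original failure instead of an extra NPE.
    if (util != null) {
      util.shutdownMiniCluster();
    }
  }
}
{code}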

> Incorrect and missing AfterClass in HBase-tests
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in 
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
>  for 2 reasons: 
>  * incorrect afterClass;
>  * not defined KeyProviderTokenIssuer.
> On Windows they are failing for the previous 2 reasons plus a missing 
> afterClass.
> This Jira tracks the effort to fix part of the HBase tests and reduce the 
> number of failed tests in Linux.






[jira] [Commented] (YARN-8346) Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"

2018-05-23 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487887#comment-16487887
 ] 

genericqa commented on YARN-8346:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 30s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  
6s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 61m 51s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8346 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12924799/YARN-8346.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 2a58b0d4306d 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 51ce02b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/20842/testReport/ |
| Max. process+thread count | 303 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/20842/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Upgrading to 3.1 kills running containers with error "Opportunistic container 
> queue is full"
> 

[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487861#comment-16487861
 ] 

Giovanni Matteo Fumarola commented on YARN-8348:


[^YARN-8348.v1.patch] brings the number of failed tests down from 21 to 16.

[Link to failed 
tests|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]

> Incorrect and missing AfterClass in HBase-tests
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in 
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
>  for 2 reasons: 
>  * incorrect afterClass;
>  * not defined KeyProviderTokenIssuer.
> On Windows they are failing for the previous 2 reasons plus a missing 
> afterClass.
> This Jira tracks the effort to fix part of the HBase tests and reduce the 
> number of failed tests in Linux.






[jira] [Updated] (YARN-8348) Incorrect and missing AfterClass in HBase-tests

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-8348:
---
Description: 
HBase tests are failing in 
[linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
 for 2 reasons: 
 * incorrect afterClass;
 * not defined KeyProviderTokenIssuer.

On Windows they are failing for the previous 2 reasons plus a missing 
afterClass.

This Jira tracks the effort to fix part of the HBase tests and reduce the 
number of failed tests in Linux.

  was:
HBase tests are failing in 
[linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
 for 2 reasons: 
* incorrect 


> Incorrect and missing AfterClass in HBase-tests
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in 
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
>  for 2 reasons: 
>  * incorrect afterClass;
>  * not defined KeyProviderTokenIssuer.
> On Windows they are failing for the previous 2 reasons plus a missing 
> afterClass.
> This Jira tracks the effort to fix part of the HBase tests and reduce the 
> number of failed tests in Linux.






[jira] [Updated] (YARN-8348) Incorrect and missing AfterClass in HBase-tests

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-8348:
---
Description: 
HBase tests are failing in 
[linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
 for 2 reasons: 
* incorrect 

> Incorrect and missing AfterClass in HBase-tests
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8348.v1.patch
>
>
> HBase tests are failing in 
> [linux|https://builds.apache.org/view/H-L/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86/789/testReport/]
>  for 2 reasons: 
> * incorrect 






[jira] [Commented] (YARN-8348) Incorrect and missing AfterClass in HBase-tests

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487848#comment-16487848
 ] 

Giovanni Matteo Fumarola commented on YARN-8348:


Before my patch:
[ERROR] Errors: 
[ERROR] 
TestTimelineReaderWebServicesHBaseStorage.setupBeforeClass:79->AbstractTimelineReaderHBaseTestBase.setup:60
 » NoClassDefFound
[ERROR] 
org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps.org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageApps
[ERROR] Run 1: TestHBaseTimelineStorageApps.setupBeforeClass:97 » 
NoClassDefFound org/apache/...
[ERROR] Run 2: TestHBaseTimelineStorageApps.tearDownAfterClass:1939 NullPointer
[INFO] 
[ERROR] TestHBaseTimelineStorageDomain.setupBeforeClass:51 » NoClassDefFound 
org/apach...
[ERROR] 
org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageEntities.org.apache.hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineStorageEntities
[ERROR] Run 1: TestHBaseTimelineStorageEntities.setupBeforeClass:110 » 
NoClassDefFound org/ap...
[ERROR] Run 2: TestHBaseTimelineStorageEntities.tearDownAfterClass:1882 
NullPointer
[INFO] 
[ERROR] TestHBaseTimelineStorageSchema.setupBeforeClass:49 » NoClassDefFound 
org/apach...
[ERROR] 
org.apache.hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity.org.apache.hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity
[ERROR] Run 1: TestHBaseStorageFlowActivity.setupBeforeClass:71 » 
NoClassDefFound org/apache/...
[ERROR] Run 2: TestHBaseStorageFlowActivity.tearDownAfterClass:495 NullPointer
[INFO] 
[ERROR] 
org.apache.hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun.org.apache.hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRun
[ERROR] Run 1: TestHBaseStorageFlowRun.setupBeforeClass:83 » NoClassDefFound 
org/apache/hadoo...
[ERROR] Run 2: TestHBaseStorageFlowRun.tearDownAfterClass:1078 NullPointer
[INFO] 
[ERROR] 
org.apache.hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRunCompaction.org.apache.hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowRunCompaction
[ERROR] Run 1: TestHBaseStorageFlowRunCompaction.setupBeforeClass:82 » 
NoClassDefFound org/ap...
[ERROR] Run 2: TestHBaseStorageFlowRunCompaction.tearDownAfterClass:853 
NullPointer

After my patch:
[ERROR] Errors: 
[ERROR] 
TestTimelineReaderWebServicesHBaseStorage.setupBeforeClass:79->AbstractTimelineReaderHBaseTestBase.setup:60
 » NoClassDefFound
[ERROR] TestHBaseTimelineStorageApps.setupBeforeClass:97 » NoClassDefFound 
org/apache/...
[ERROR] TestHBaseTimelineStorageDomain.setupBeforeClass:52 » NoClassDefFound 
org/apach...
[ERROR] TestHBaseTimelineStorageEntities.setupBeforeClass:110 » NoClassDefFound 
org/ap...
[ERROR] TestHBaseTimelineStorageSchema.setupBeforeClass:50 » NoClassDefFound 
org/apach...
[ERROR] TestHBaseStorageFlowActivity.setupBeforeClass:71 » NoClassDefFound 
org/apache/...
[ERROR] TestHBaseStorageFlowRun.setupBeforeClass:83 » NoClassDefFound 
org/apache/hadoo...
[ERROR] TestHBaseStorageFlowRunCompaction.setupBeforeClass:82 » NoClassDefFound 
org/ap...

> Incorrect and missing AfterClass in HBase-tests
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8348.v1.patch
>
>







[jira] [Assigned] (YARN-8348) Incorrect and missing AfterClass in HBase-tests

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola reassigned YARN-8348:
--

Assignee: Giovanni Matteo Fumarola

> Incorrect and missing AfterClass in HBase-tests
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8348.v1.patch
>
>







[jira] [Updated] (YARN-8348) Incorrect and missing AfterClass in HBase-tests

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-8348:
---
Attachment: YARN-8348.v1.patch

> Incorrect and missing AfterClass in HBase-tests
> ---
>
> Key: YARN-8348
> URL: https://issues.apache.org/jira/browse/YARN-8348
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8348.v1.patch
>
>







[jira] [Comment Edited] (YARN-8336) Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils

2018-05-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/YARN-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487775#comment-16487775
 ] 

Íñigo Goiri edited comment on YARN-8336 at 5/23/18 6:54 PM:


Both 
[TestLogsCLI|https://builds.apache.org/job/PreCommit-YARN-Build/20832/testReport/org.apache.hadoop.yarn.client.cli/TestLogsCLI/]
 and 
[TestSchedConfCLI|https://builds.apache.org/job/PreCommit-YARN-Build/20832/testReport/org.apache.hadoop.yarn.client.cli/TestSchedConfCLI/]
 pass.
+1
Committing to trunk.


was (Author: elgoiri):
Both 
[TestLogsCLI|https://builds.apache.org/job/PreCommit-YARN-Build/20832/testReport/org.apache.hadoop.yarn.client.cli/TestLogsCLI/]
 and 
[TestSchedConfCLI|https://builds.apache.org/job/PreCommit-YARN-Build/20832/testReport/org.apache.hadoop.yarn.client.cli/TestSchedConfCLI/]
 pass.
+1
Feel free to commit.

> Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils
> -
>
> Key: YARN-8336
> URL: https://issues.apache.org/jira/browse/YARN-8336
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8336.v1.patch, YARN-8336.v2.patch
>
>
> Missing ClientResponse.close and Client.destroy can lead to a connection leak.






[jira] [Created] (YARN-8348) Incorrect and missing AfterClass in HBase-tests

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)
Giovanni Matteo Fumarola created YARN-8348:
--

 Summary: Incorrect and missing AfterClass in HBase-tests
 Key: YARN-8348
 URL: https://issues.apache.org/jira/browse/YARN-8348
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Giovanni Matteo Fumarola









[jira] [Commented] (YARN-8108) RM metrics rest API throws GSSException in kerberized environment

2018-05-23 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487825#comment-16487825
 ] 

Eric Yang commented on YARN-8108:
-

[~yzhangal] My preference is to fix this in the 3.0.3 release.  If consensus is 
not reached, the release manager can push this out of the 3.0.3 release and 
note it as a known issue in the release notes.  I am fine with the plan.

> RM metrics rest API throws GSSException in kerberized environment
> -
>
> Key: YARN-8108
> URL: https://issues.apache.org/jira/browse/YARN-8108
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Kshitij Badani
>Assignee: Eric Yang
>Priority: Blocker
> Attachments: YARN-8108.001.patch
>
>
> Test is trying to pull up metrics data from SHS after kiniting as 'test_user'
> It is throwing GSSException as follows
> {code:java}
> b2b460b80713|RUNNING: curl --silent -k -X GET -D 
> /hwqe/hadoopqe/artifacts/tmp-94845 --negotiate -u : 
> http://rm_host:8088/proxy/application_1518674952153_0070/metrics/json2018-02-15
>  07:15:48,757|INFO|MainThread|machine.py:194 - 
> run()||GUID=fc5a3266-28f8-4eed-bae2-b2b460b80713|Exit Code: 0
> 2018-02-15 07:15:48,758|INFO|MainThread|spark.py:1757 - 
> getMetricsJsonData()|metrics:
> 
> 
> 
> Error 403 GSSException: Failure unspecified at GSS-API level 
> (Mechanism level: Request is a replay (34))
> 
> HTTP ERROR 403
> Problem accessing /proxy/application_1518674952153_0070/metrics/json. 
> Reason:
>  GSSException: Failure unspecified at GSS-API level (Mechanism level: 
> Request is a replay (34))
> 
> 
> {code}
> Root cause: the proxy server on the RM can't be supported for a 
> Kerberos-enabled cluster because AuthenticationFilter is applied twice in the 
> Hadoop code (once in HttpServer2 for the RM, and another instance from 
> AmFilterInitializer for the proxy server). This will require code changes to 
> the hadoop-yarn-server-web-proxy project.






[jira] [Commented] (YARN-8346) Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"

2018-05-23 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487810#comment-16487810
 ] 

Yongjun Zhang commented on YARN-8346:
-

Thanks a lot for the quick turnaround [~jlowe] and [~kkaranasos].


> Upgrading to 3.1 kills running containers with error "Opportunistic container 
> queue is full"
> 
>
> Key: YARN-8346
> URL: https://issues.apache.org/jira/browse/YARN-8346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.0, 3.0.2
>Reporter: Rohith Sharma K S
>Assignee: Jason Lowe
>Priority: Blocker
> Attachments: YARN-8346.001.patch
>
>
> It is observed that during a rolling upgrade from the 2.8.4 to the 3.1 release, 
> all the running containers are killed and a second attempt is launched for that 
> application. The diagnostics message is "Opportunistic container queue is 
> full", which is the reason the containers are killed. 
> In the NM log, I see the entries below after a container is recovered.
> {noformat}
> 2018-05-23 17:18:50,655 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler:
>  Opportunistic container [container_e06_1527075664705_0001_01_01] will 
> not be queued at the NMsince max queue length [0] has been reached
> {noformat}
> The following steps were executed for the rolling upgrade:
> # Install 2.8.4 cluster and launch a MR job with distributed cache enabled.
> # Stop 2.8.4 RM. Start 3.1.0 RM with same configuration.
> # Stop 2.8.4 NM batch by batch. Start 3.1.0 NM batch by batch. 






[jira] [Commented] (YARN-8108) RM metrics rest API throws GSSException in kerberized environment

2018-05-23 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487806#comment-16487806
 ] 

Yongjun Zhang commented on YARN-8108:
-

Hi [~eyang],

It seems the issue also exists in the 3.0.2 release. The above discussion 
indicates that it might take some time for the solution to converge. Should we 
drop 3.0.3 from the target releases and list this jira as a known issue for 
3.0.3, or should we fix this issue in 3.0.3?

Thanks.


> RM metrics rest API throws GSSException in kerberized environment
> -
>
> Key: YARN-8108
> URL: https://issues.apache.org/jira/browse/YARN-8108
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Kshitij Badani
>Assignee: Eric Yang
>Priority: Blocker
> Attachments: YARN-8108.001.patch
>
>
> Test is trying to pull up metrics data from SHS after kiniting as 'test_user'
> It is throwing GSSException as follows
> {code:java}
> b2b460b80713|RUNNING: curl --silent -k -X GET -D 
> /hwqe/hadoopqe/artifacts/tmp-94845 --negotiate -u : 
> http://rm_host:8088/proxy/application_1518674952153_0070/metrics/json2018-02-15
>  07:15:48,757|INFO|MainThread|machine.py:194 - 
> run()||GUID=fc5a3266-28f8-4eed-bae2-b2b460b80713|Exit Code: 0
> 2018-02-15 07:15:48,758|INFO|MainThread|spark.py:1757 - 
> getMetricsJsonData()|metrics:
> 
> 
> 
> Error 403 GSSException: Failure unspecified at GSS-API level 
> (Mechanism level: Request is a replay (34))
> 
> HTTP ERROR 403
> Problem accessing /proxy/application_1518674952153_0070/metrics/json. 
> Reason:
>  GSSException: Failure unspecified at GSS-API level (Mechanism level: 
> Request is a replay (34))
> 
> 
> {code}
> Root cause: the proxy server on the RM can't be supported for a 
> Kerberos-enabled cluster because AuthenticationFilter is applied twice in the 
> Hadoop code (once in HttpServer2 for the RM, and another instance from 
> AmFilterInitializer for the proxy server). This will require code changes to 
> the hadoop-yarn-server-web-proxy project.






[jira] [Commented] (YARN-8346) Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"

2018-05-23 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487803#comment-16487803
 ] 

Konstantinos Karanasos commented on YARN-8346:
--

Thanks for the patch, [~jlowe].

Indeed you are right –  the problem is the lack of execution type. The queue 
size should remain 0 given that opportunistic containers are disabled in this 
case.

+1 for the patch.
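
For reference, a minimal sketch of the defaulting idea described above; the
class and method names are hypothetical and the actual patch may be structured
differently:

{code:java}
import org.apache.hadoop.yarn.api.records.ExecutionType;

public class RecoveredExecutionTypeSketch {
  /**
   * A container recovered from a pre-3.x NM state store has no execution type
   * recorded. Treating it as GUARANTEED keeps it out of the opportunistic
   * queue, whose maximum length is 0 when opportunistic containers are
   * disabled.
   */
  public static ExecutionType effectiveType(ExecutionType recovered) {
    return recovered != null ? recovered : ExecutionType.GUARANTEED;
  }
}
{code}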

> Upgrading to 3.1 kills running containers with error "Opportunistic container 
> queue is full"
> 
>
> Key: YARN-8346
> URL: https://issues.apache.org/jira/browse/YARN-8346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.0, 3.0.2
>Reporter: Rohith Sharma K S
>Assignee: Jason Lowe
>Priority: Blocker
> Attachments: YARN-8346.001.patch
>
>
> It is observed that during a rolling upgrade from the 2.8.4 to the 3.1 release, 
> all the running containers are killed and a second attempt is launched for that 
> application. The diagnostics message is "Opportunistic container queue is 
> full", which is the reason the containers are killed. 
> In the NM log, I see the entries below after a container is recovered.
> {noformat}
> 2018-05-23 17:18:50,655 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler:
>  Opportunistic container [container_e06_1527075664705_0001_01_01] will 
> not be queued at the NMsince max queue length [0] has been reached
> {noformat}
> The following steps were executed for the rolling upgrade:
> # Install 2.8.4 cluster and launch a MR job with distributed cache enabled.
> # Stop 2.8.4 RM. Start 3.1.0 RM with same configuration.
> # Stop 2.8.4 NM batch by batch. Start 3.1.0 NM batch by batch. 






[jira] [Comment Edited] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487778#comment-16487778
 ] 

Giovanni Matteo Fumarola edited comment on YARN-8344 at 5/23/18 6:22 PM:
-

Attached v2 with the fix for the checkstyle warning.

If any test in this class fails, all the other tests will fail (same behavior on 
Windows and Linux).

testContainerResourceIncreaseIsSynchronizedWithRMResync fails on Windows due to 
the length of the log directory. This patch will fix testKillContainersOnResync.

java.io.IOException: Cannot launch container using script at path 
F:/short/hadoop-trunk-win/s/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerResync/nm0/usercache/nobody/appcache/application_0_/container_0__01_00/default_container_executor.cmd,
 because it exceeds the maximum supported path length of 260 characters. 
Consider configuring shorter directories in yarn.nodemanager.local-dirs.

I saw a bunch of tests failing on Windows for this reason. I will open a Jira 
to track this fix.
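
For reference, a minimal sketch of the close-in-finally pattern the patch title
refers to; the configuration and the body of the try block are placeholders
rather than the actual TestNodeManagerResync code:

{code:java}
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.server.nodemanager.NodeManager;

public class NodeManagerCloseSketch {
  public static void runAndClose() throws Exception {
    NodeManager nm = new NodeManager();
    try {
      nm.init(new YarnConfiguration());
      nm.start();
      // ... exercise resync / kill-containers behaviour here ...
    } finally {
      // Service#close() stops the NM so later tests in the class do not
      // inherit its ports, threads, or local directories.
      nm.close();
    }
  }
}
{code}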


was (Author: giovanni.fumarola):
Attached v2 with the fix for Check style warning.

If any test in this class fails all the other tests will fail (same behavior in 
Windows or Linux).

testContainerResourceIncreaseIsSynchronizedWithRMResync fails in Windows - 
still figuring out the root cause. This patch will fix 
testKillContainersOnResync

> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync
> -
>
> Key: YARN-8344
> URL: https://issues.apache.org/jira/browse/YARN-8344
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8344.v1.patch, YARN-8344.v2.patch
>
>
> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync 
> on Windows.






[jira] [Updated] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-8344:
---
Summary: Missing nm.close() in TestNodeManagerResync to fix 
testKillContainersOnResync  (was: Missing nm.close() in TestNodeManagerResync 
to fix testKillContainersOnResync on Windows)

> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync
> -
>
> Key: YARN-8344
> URL: https://issues.apache.org/jira/browse/YARN-8344
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8344.v1.patch, YARN-8344.v2.patch
>
>
> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync 
> on Windows.






[jira] [Updated] (YARN-8346) Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"

2018-05-23 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-8346:
-
Attachment: YARN-8346.001.patch

> Upgrading to 3.1 kills running containers with error "Opportunistic container 
> queue is full"
> 
>
> Key: YARN-8346
> URL: https://issues.apache.org/jira/browse/YARN-8346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.0, 3.0.2
>Reporter: Rohith Sharma K S
>Assignee: Jason Lowe
>Priority: Blocker
> Attachments: YARN-8346.001.patch
>
>
> It is observed that during a rolling upgrade from the 2.8.4 to the 3.1 release, 
> all the running containers are killed and a second attempt is launched for that 
> application. The diagnostics message is "Opportunistic container queue is 
> full", which is the reason the containers are killed. 
> In the NM log, I see the entries below after a container is recovered.
> {noformat}
> 2018-05-23 17:18:50,655 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler:
>  Opportunistic container [container_e06_1527075664705_0001_01_01] will 
> not be queued at the NMsince max queue length [0] has been reached
> {noformat}
> The following steps were executed for the rolling upgrade:
> # Install 2.8.4 cluster and launch a MR job with distributed cache enabled.
> # Stop 2.8.4 RM. Start 3.1.0 RM with same configuration.
> # Stop 2.8.4 NM batch by batch. Start 3.1.0 NM batch by batch. 






[jira] [Commented] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync on Windows

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487778#comment-16487778
 ] 

Giovanni Matteo Fumarola commented on YARN-8344:


Attached v2 with the fix for the checkstyle warning.

If any test in this class fails, all the other tests will fail (same behavior on 
Windows and Linux).

testContainerResourceIncreaseIsSynchronizedWithRMResync fails on Windows; I am 
still figuring out the root cause. This patch will fix 
testKillContainersOnResync.

> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync 
> on Windows
> 
>
> Key: YARN-8344
> URL: https://issues.apache.org/jira/browse/YARN-8344
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8344.v1.patch, YARN-8344.v2.patch
>
>
> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync 
> on Windows.






[jira] [Commented] (YARN-8336) Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils

2018-05-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/YARN-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487775#comment-16487775
 ] 

Íñigo Goiri commented on YARN-8336:
---

Both 
[TestLogsCLI|https://builds.apache.org/job/PreCommit-YARN-Build/20832/testReport/org.apache.hadoop.yarn.client.cli/TestLogsCLI/]
 and 
[TestSchedConfCLI|https://builds.apache.org/job/PreCommit-YARN-Build/20832/testReport/org.apache.hadoop.yarn.client.cli/TestSchedConfCLI/]
 pass.
+1
Feel free to commit.

> Fix potential connection leak in SchedConfCLI and YarnWebServiceUtils
> -
>
> Key: YARN-8336
> URL: https://issues.apache.org/jira/browse/YARN-8336
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8336.v1.patch, YARN-8336.v2.patch
>
>
> Missing ClientResponse.close and Client.destroy can lead to a connection leak.






[jira] [Comment Edited] (YARN-8334) Fix potential connection leak in GPGUtils

2018-05-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487766#comment-16487766
 ] 

Íñigo Goiri edited comment on YARN-8334 at 5/23/18 6:06 PM:


The TestPolicyGenerator unit test runs 
[here|https://builds.apache.org/job/PreCommit-YARN-Build/20833/testReport/org.apache.hadoop.yarn.server.globalpolicygenerator.policygenerator/TestPolicyGenerator/].
+1
Feel free to commit to the branch.


was (Author: elgoiri):
The TestPolicyGenerator unit test runs 
[here|https://builds.apache.org/job/PreCommit-YARN-Build/20833/testReport/org.apache.hadoop.yarn.server.globalpolicygenerator.policygenerator/TestPolicyGenerator/].
+1
Committing.

> Fix potential connection leak in GPGUtils
> -
>
> Key: YARN-8334
> URL: https://issues.apache.org/jira/browse/YARN-8334
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Minor
> Attachments: YARN-8334-YARN-7402.v1.patch, 
> YARN-8334-YARN-7402.v2.patch
>
>
> Missing ClientResponse.close and Client.destroy can lead to a connection leak.






[jira] [Commented] (YARN-8334) Fix potential connection leak in GPGUtils

2018-05-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/YARN-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487766#comment-16487766
 ] 

Íñigo Goiri commented on YARN-8334:
---

The TestPolicyGenerator unit test runs 
[here|https://builds.apache.org/job/PreCommit-YARN-Build/20833/testReport/org.apache.hadoop.yarn.server.globalpolicygenerator.policygenerator/TestPolicyGenerator/].
+1
Committing.

> Fix potential connection leak in GPGUtils
> -
>
> Key: YARN-8334
> URL: https://issues.apache.org/jira/browse/YARN-8334
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Minor
> Attachments: YARN-8334-YARN-7402.v1.patch, 
> YARN-8334-YARN-7402.v2.patch
>
>
> Missing ClientResponse.close and Client.destroy can lead to a connection leak.






[jira] [Updated] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync on Windows

2018-05-23 Thread Giovanni Matteo Fumarola (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giovanni Matteo Fumarola updated YARN-8344:
---
Attachment: YARN-8344.v2.patch

> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync 
> on Windows
> 
>
> Key: YARN-8344
> URL: https://issues.apache.org/jira/browse/YARN-8344
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8344.v1.patch, YARN-8344.v2.patch
>
>
> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync 
> on Windows.






[jira] [Commented] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync on Windows

2018-05-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487762#comment-16487762
 ] 

Íñigo Goiri commented on YARN-8344:
---

Why does this fail on Windows specifically?

> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync 
> on Windows
> 
>
> Key: YARN-8344
> URL: https://issues.apache.org/jira/browse/YARN-8344
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8344.v1.patch
>
>
> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync 
> on Windows.






[jira] [Updated] (YARN-8344) Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync on Windows

2018-05-23 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/YARN-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated YARN-8344:
--
Description: Missing nm.close() in TestNodeManagerResync to fix 
testKillContainersOnResync on Windows.

> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync 
> on Windows
> 
>
> Key: YARN-8344
> URL: https://issues.apache.org/jira/browse/YARN-8344
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Giovanni Matteo Fumarola
>Assignee: Giovanni Matteo Fumarola
>Priority: Major
> Attachments: YARN-8344.v1.patch
>
>
> Missing nm.close() in TestNodeManagerResync to fix testKillContainersOnResync 
> on Windows.






[jira] [Assigned] (YARN-8346) Upgrading to 3.1 kills running containers with error "Opportunistic container queue is full"

2018-05-23 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-8346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe reassigned YARN-8346:


Assignee: Jason Lowe

> Upgrading to 3.1 kills running containers with error "Opportunistic container 
> queue is full"
> 
>
> Key: YARN-8346
> URL: https://issues.apache.org/jira/browse/YARN-8346
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.1.0, 3.0.2
>Reporter: Rohith Sharma K S
>Assignee: Jason Lowe
>Priority: Blocker
>
> It is observed that during a rolling upgrade from the 2.8.4 to the 3.1 release, 
> all the running containers are killed and a second attempt is launched for that 
> application. The diagnostics message is "Opportunistic container queue is 
> full", which is the reason the containers are killed. 
> In the NM log, I see the entries below after a container is recovered.
> {noformat}
> 2018-05-23 17:18:50,655 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.scheduler.ContainerScheduler:
>  Opportunistic container [container_e06_1527075664705_0001_01_01] will 
> not be queued at the NMsince max queue length [0] has been reached
> {noformat}
> The following steps were executed for the rolling upgrade:
> # Install 2.8.4 cluster and launch a MR job with distributed cache enabled.
> # Stop 2.8.4 RM. Start 3.1.0 RM with same configuration.
> # Stop 2.8.4 NM batch by batch. Start 3.1.0 NM batch by batch. 






[jira] [Commented] (YARN-4599) Set OOM control for memory cgroups

2018-05-23 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487749#comment-16487749
 ] 

Haibo Chen commented on YARN-4599:
--

+1 on the latest patch. Will check it in later today if no objections

> Set OOM control for memory cgroups
> --
>
> Key: YARN-4599
> URL: https://issues.apache.org/jira/browse/YARN-4599
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Affects Versions: 2.9.0
>Reporter: Karthik Kambatla
>Assignee: Miklos Szegedi
>Priority: Major
>  Labels: oct16-medium
> Attachments: Elastic Memory Control in YARN.pdf, YARN-4599.000.patch, 
> YARN-4599.001.patch, YARN-4599.002.patch, YARN-4599.003.patch, 
> YARN-4599.004.patch, YARN-4599.005.patch, YARN-4599.006.patch, 
> YARN-4599.007.patch, YARN-4599.008.patch, YARN-4599.009.patch, 
> YARN-4599.010.patch, YARN-4599.011.patch, YARN-4599.012.patch, 
> YARN-4599.013.patch, YARN-4599.014.patch, YARN-4599.015.patch, 
> YARN-4599.016.patch, YARN-4599.sandflee.patch, yarn-4599-not-so-useful.patch
>
>
> YARN-1856 adds memory cgroup enforcement support. We should also explicitly 
> set OOM control so that containers are not killed as soon as they go over 
> their usage. Today, one could set the swappiness to control this, but 
> clusters with swap turned off exist.
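
For illustration, a minimal sketch of what setting the cgroup v1 OOM control
amounts to; the cgroup mount point and relative path are assumptions, and this
is not the committed patch:

{code:java}
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class OomControlSketch {
  /**
   * Writes "1" to memory.oom_control for a container's memory cgroup, which
   * sets oom_kill_disable: tasks that exceed their limit are paused instead of
   * being killed immediately, leaving the NM free to decide what to reclaim.
   */
  public static void disableOomKiller(String cgroupRelativePath) throws Exception {
    Path oomControl = Paths.get("/sys/fs/cgroup/memory",
        cgroupRelativePath, "memory.oom_control");
    Files.write(oomControl, "1".getBytes(StandardCharsets.UTF_8));
  }
}
{code}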





