[jira] [Commented] (YARN-6492) Generate queue metrics for each partition

2019-08-16 Thread Manikandan R (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909600#comment-16909600
 ] 

Manikandan R commented on YARN-6492:


[~eepayne] The observations mentioned earlier are important ones that came up 
as part of iterative development. I think this whole PartitionQueueMetrics 
feature won't be in a usable state without these fixes.

At the same time, I am totally OK with having separate JIRAs for ease of 
tracking, assuming that we would mark this whole feature as complete only 
after the new JIRA covering these issues has been fixed.

Regarding the structure: yes, we would like to stay in sync with the UI, REST 
API, etc., as discussed earlier in this JIRA.

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> YARN-6492.004.patch, YARN-6492.005.WIP.patch, partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object, which captures metrics either for the 
> default partition or across all partitions. (After YARN-6467 it will be for 
> the default partition only.)
> But having per-partition metrics would be very useful.
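For illustration only, a minimal sketch of the idea rather than the actual
patch: keep one metrics object per (partition, queue) pair, with the default
partition keyed by the empty string. QueueMetrics is the real ResourceManager
class; the holder below and its stub field are assumptions made to keep the
sketch self-contained.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: one metrics object per (partition, queue) pair.
// QueueMetricsStub stands in for the real QueueMetrics class.
public class PartitionQueueMetricsSketch {

  static class QueueMetricsStub {
    long allocatedMB;
    void incrAllocatedMB(long mb) { allocatedMB += mb; }
  }

  // Key is "partition/queue"; the default partition uses the empty string.
  private final Map<String, QueueMetricsStub> metrics =
      new ConcurrentHashMap<>();

  QueueMetricsStub forPartition(String partition, String queue) {
    return metrics.computeIfAbsent(partition + "/" + queue,
        k -> new QueueMetricsStub());
  }

  public static void main(String[] args) {
    PartitionQueueMetricsSketch s = new PartitionQueueMetricsSketch();
    s.forPartition("", "root.a").incrAllocatedMB(1024);  // default partition
    s.forPartition("x", "root.a").incrAllocatedMB(2048); // partition "x"
    System.out.println(s.metrics.keySet()); // e.g. [/root.a, x/root.a]
  }
}
{code}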






[jira] [Commented] (YARN-9728)  ResourceManager REST API can produce an illegal xml response

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909556#comment-16909556
 ] 

Hadoop QA commented on YARN-9728:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
45s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m  2s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
49s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  7s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 1 new + 256 unchanged - 0 fixed = 257 total (was 256) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 33s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
23s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 48s{color} 
| {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
38s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 79m 
31s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
42s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}170m 38s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
|  |  Useless condition: < 65536 (0x10000) because variable type is char  At 
AppInfo.java:[line 376] |
| 
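For context on the FindBugs warning above: a Java char is an unsigned 16-bit
value, so it can never reach 65536 (0x10000), and the flagged comparison is
always true. A minimal standalone illustration (not the actual AppInfo.java
code):

{code:java}
public class CharRangeDemo {
  public static void main(String[] args) {
    char c = Character.MAX_VALUE;   // '\uFFFF' == 65535, the largest char
    System.out.println((int) c);    // 65535
    System.out.println(c < 65536);  // always true for any char value,
                                    // which is exactly what FindBugs flags
  }
}
{code}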

[jira] [Commented] (YARN-9677) Make FpgaDevice and GpuDevice classes more similar to each other

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909549#comment-16909549
 ] 

Hadoop QA commented on YARN-9677:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
12s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 4 new + 74 unchanged - 2 fixed = 78 total (was 76) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 34s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 22m 
58s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 85m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9677 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977849/YARN-9677.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 07832c5157bb 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 
10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8d754c2 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/24589/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24589/testReport/ |
| Max. process+thread count | 323 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 

[jira] [Updated] (YARN-9677) Make FpgaDevice and GpuDevice classes more similar to each other

2019-08-16 Thread kevin su (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kevin su updated YARN-9677:
---
Attachment: YARN-9677.001.patch

> Make FpgaDevice and GpuDevice classes more similar to each other
> 
>
> Key: YARN-9677
> URL: https://issues.apache.org/jira/browse/YARN-9677
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: kevin su
>Priority: Major
>  Labels: newbie, newbie++
> Attachments: YARN-9677.001.patch
>
>
> org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.fpga.FpgaResourceAllocator.FpgaDevice
>  is an inner class of FpgaResourceAllocator.
> It is used not only by its parent class but by other classes as well, so the 
> purpose of the inner class is lost; keeping it nested does not really make 
> sense.
> We also have 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuDevice,
>  which is a similar class, but for GPU devices.
> What we could do here is make FpgaDevice a standalone (top-level) class and 
> harmonize the packages of these 2 classes, meaning they should be "closer" to 
> each other in terms of packaging.
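Purely as a shape sketch of that refactoring, with illustrative fields (the
real FpgaDevice members differ): a standalone, top-level FpgaDevice that could
live in a package parallel to GpuDevice's.

{code:java}
// Hypothetical: a standalone FpgaDevice, no longer nested inside
// FpgaResourceAllocator, placed near GpuDevice in the package tree
// (e.g. ...containermanager.resourceplugin.fpga). Fields are illustrative.
import java.io.Serializable;
import java.util.Objects;

public class FpgaDevice implements Serializable, Comparable<FpgaDevice> {
  private final int major;
  private final int minor;

  public FpgaDevice(int major, int minor) {
    this.major = major;
    this.minor = minor;
  }

  @Override
  public int compareTo(FpgaDevice other) {
    int c = Integer.compare(major, other.major);
    return c != 0 ? c : Integer.compare(minor, other.minor);
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof FpgaDevice)) return false;
    FpgaDevice d = (FpgaDevice) o;
    return major == d.major && minor == d.minor;
  }

  @Override
  public int hashCode() {
    return Objects.hash(major, minor);
  }
}
{code}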






[jira] [Commented] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909468#comment-16909468
 ] 

Hadoop QA commented on YARN-9217:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  3m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
30s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
54s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
52s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 58s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
23s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
43s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
18s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  0s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 1 new + 224 unchanged - 2 fixed = 225 total (was 226) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 28s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
39s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
19s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
57s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
32s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}105m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:63396be |
| JIRA Issue | YARN-9217 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977802/YARN-9217.branch-3.2.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux b0cf36296635 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 

[jira] [Commented] (YARN-9509) Capped cpu usage with cgroup strict-resource-usage based on a mulitplier

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909345#comment-16909345
 ] 

Hadoop QA commented on YARN-9509:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 49s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
47s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
19s{color} | {color:blue} Used deprecated FindBugs config; considering 
switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
44s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 
40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  6s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 5 new + 220 unchanged - 0 fixed = 225 total (was 220) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
4s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
48s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  3m 43s{color} 
| {color:red} hadoop-yarn-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 10s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.api.impl.TestTimelineClientV2Impl |
|   | 

[jira] [Updated] (YARN-9728)  ResourceManager REST API can produce an illegal xml response

2019-08-16 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9728:

Description: 
When a Spark job throws an exception with a message containing a character 
outside the range supported by XML 1.0, the application fails and the stack 
trace is stored in the {{diagnostics}} field. So far, so good.

But an issue occurs when we try to get the application information through the 
ResourceManager REST API: the XML response will contain the illegal XML 1.0 
character and will be invalid.

 *+Examples of illegal characters in XML 1.0:+* 
 * {{\u0000}}
 * {{\u0001}}
 * {{\u0002}}
 * {{\u0003}}
 * {{\u0004}}

_For more information about supported characters:_
 [https://www.w3.org/TR/xml/#charsets]

*+Example of an illegal response from the Resource Manager API:+* 
{code:xml}
<app>
  <id>application_1326821518301_0005</id>
  <user>user1</user>
  <name>job</name>
  <queue>a1</queue>
  <state>FINISHED</state>
  <finalStatus>FAILED</finalStatus>
  <progress>100.0</progress>
  <trackingUI>History</trackingUI>
  <trackingUrl>http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5</trackingUrl>
  <diagnostics>Exception in thread "main" java.lang.Exception: \u0001
        at com.JobWithSpecialCharMain.main(JobWithSpecialCharMain.java:6)
  </diagnostics>
  [...]
</app>
{code}
 

*+Example of job to reproduce:+*
{code:java}
public class JobWithSpecialCharMain {

 public static void main(String[] args) throws Exception {
  throw new Exception("\u0001");
 }

}
{code}

javac -d . JobWithSpecialCharMain.java
jar cvf repro.jar com/
spark-submit --class com.JobWithSpecialCharMain --master yarn-cluster repro.jar



!IllegalResponseChrome.png!

  was:
When a Spark job throws an exception with a message containing a character 
outside the range supported by XML 1.0, the application fails and the stack 
trace is stored in the {{diagnostics}} field. So far, so good.

But an issue occurs when we try to get the application information through the 
ResourceManager REST API: the XML response will contain the illegal XML 1.0 
character and will be invalid.

 *+Examples of illegal characters in XML 1.0:+* 
 * {{\u0000}}
 * {{\u0001}}
 * {{\u0002}}
 * {{\u0003}}
 * {{\u0004}}

_For more information about supported characters:_
 [https://www.w3.org/TR/xml/#charsets]

*+Example of an illegal response from the Resource Manager API:+* 
{code:xml}
<app>
  <id>application_1326821518301_0005</id>
  <user>user1</user>
  <name>job</name>
  <queue>a1</queue>
  <state>FINISHED</state>
  <finalStatus>FAILED</finalStatus>
  <progress>100.0</progress>
  <trackingUI>History</trackingUI>
  <trackingUrl>http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5</trackingUrl>
  <diagnostics>Exception in thread "main" java.lang.Exception: \u0001
        at com.JobWithSpecialCharMain.main(JobWithSpecialCharMain.java:6)
  </diagnostics>
  [...]
</app>
{code}
 

*+Example of job to reproduce:+*
{code:java}
public class JobWithSpecialCharMain {

 public static void main(String[] args) throws Exception {
  throw new Exception("\u0001");
 }

}
{code}
!IllegalResponseChrome.png!


>  ResourceManager REST API can produce an illegal xml response
> -
>
> Key: YARN-9728
> URL: https://issues.apache.org/jira/browse/YARN-9728
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api, resourcemanager
>Affects Versions: 2.7.3
>Reporter: Thomas
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: IllegalResponseChrome.png
>
>
> When a Spark job throws an exception with a message containing a character 
> outside the range supported by XML 1.0, the application fails and the stack 
> trace is stored in the {{diagnostics}} field. So far, so good.
> But an issue occurs when we try to get the application information through 
> the ResourceManager REST API: the XML response will contain the illegal XML 
> 1.0 character and will be invalid.
>  *+Examples of illegal characters in XML 1.0:+* 
>  * {{\u0000}}
>  * {{\u0001}}
>  * {{\u0002}}
>  * {{\u0003}}
>  * {{\u0004}}
> _For more information about supported characters:_
>  [https://www.w3.org/TR/xml/#charsets]
> *+Example of an illegal response from the Resource Manager API:+* 
> {code:xml}
> <app>
>   <id>application_1326821518301_0005</id>
>   <user>user1</user>
>   <name>job</name>
>   <queue>a1</queue>
>   <state>FINISHED</state>
>   <finalStatus>FAILED</finalStatus>
>   <progress>100.0</progress>
>   <trackingUI>History</trackingUI>
>   <trackingUrl>http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5</trackingUrl>
>   <diagnostics>Exception in thread "main" java.lang.Exception: \u0001
>   at com.JobWithSpecialCharMain.main(JobWithSpecialCharMain.java:6)
>   </diagnostics>
>   [...]
> </app>
> {code}
>  
> *+Example of job to reproduce:+*
> {code:java}
> public class JobWithSpecialCharMain {
>  public static void main(String[] args) throws Exception {
>   throw new Exception("\u0001");
>  }
> }
> {code}
> javac -d . JobWithSpecialCharMain.java
> jar cvf repro.jar com/
> spark-submit --class com.JobWithSpecialCharMain --master yarn-cluster 
> repro.jar
> !IllegalResponseChrome.png!
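The attached patch is the authoritative fix; purely as a sketch of the general
technique, here is a hypothetical helper that replaces characters falling
outside the XML 1.0 ranges (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD,
#x10000-#x10FFFF, per the w3.org link above) before text such as
{{diagnostics}} is serialized. The class and method names are assumptions.

{code:java}
public final class XmlSanitizer {
  /** Replace characters that are illegal in XML 1.0 with U+FFFD. */
  public static String stripIllegalXmlChars(String s) {
    if (s == null) {
      return null;
    }
    StringBuilder sb = new StringBuilder(s.length());
    for (int i = 0; i < s.length(); ) {
      int cp = s.codePointAt(i);
      boolean legal =
          cp == 0x9 || cp == 0xA || cp == 0xD
          || (cp >= 0x20 && cp <= 0xD7FF)
          || (cp >= 0xE000 && cp <= 0xFFFD)
          || (cp >= 0x10000 && cp <= 0x10FFFF);
      sb.appendCodePoint(legal ? cp : 0xFFFD);
      i += Character.charCount(cp);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // "\u0001" is one of the illegal characters from the repro job above.
    System.out.println(stripIllegalXmlChars("Exception: \u0001 boom"));
  }
}
{code}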





[jira] [Updated] (YARN-9728)  ResourceManager REST API can produce an illegal xml response

2019-08-16 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9728:

Attachment: YARN-9728-001.patch

>  ResourceManager REST API can produce an illegal xml response
> -
>
> Key: YARN-9728
> URL: https://issues.apache.org/jira/browse/YARN-9728
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: api, resourcemanager
>Affects Versions: 2.7.3
>Reporter: Thomas
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: IllegalResponseChrome.png, YARN-9728-001.patch
>
>
> When a Spark job throws an exception with a message containing a character 
> outside the range supported by XML 1.0, the application fails and the stack 
> trace is stored in the {{diagnostics}} field. So far, so good.
> But an issue occurs when we try to get the application information through 
> the ResourceManager REST API: the XML response will contain the illegal XML 
> 1.0 character and will be invalid.
>  *+Examples of illegal characters in XML 1.0:+* 
>  * {{\u0000}}
>  * {{\u0001}}
>  * {{\u0002}}
>  * {{\u0003}}
>  * {{\u0004}}
> _For more information about supported characters:_
>  [https://www.w3.org/TR/xml/#charsets]
> *+Example of an illegal response from the Resource Manager API:+* 
> {code:xml}
> <app>
>   <id>application_1326821518301_0005</id>
>   <user>user1</user>
>   <name>job</name>
>   <queue>a1</queue>
>   <state>FINISHED</state>
>   <finalStatus>FAILED</finalStatus>
>   <progress>100.0</progress>
>   <trackingUI>History</trackingUI>
>   <trackingUrl>http://host.domain.com:8088/proxy/application_1326821518301_0005/jobhistory/job/job_1326821518301_5_5</trackingUrl>
>   <diagnostics>Exception in thread "main" java.lang.Exception: \u0001
>   at com.JobWithSpecialCharMain.main(JobWithSpecialCharMain.java:6)
>   </diagnostics>
>   [...]
> </app>
> {code}
>  
> *+Example of job to reproduce:+*
> {code:java}
> public class JobWithSpecialCharMain {
>  public static void main(String[] args) throws Exception {
>   throw new Exception("\u0001");
>  }
> }
> {code}
> javac -d . JobWithSpecialCharMain.java
> jar cvf repro.jar com/
> spark-submit --class com.JobWithSpecialCharMain --master yarn-cluster 
> repro.jar
> !IllegalResponseChrome.png!






[jira] [Commented] (YARN-9009) Fix flaky test TestEntityGroupFSTimelineStore.testCleanLogs

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909326#comment-16909326
 ] 

Hadoop QA commented on YARN-9009:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m 12s{color} 
| {color:red} https://github.com/apache/hadoop/pull/438 does not apply to 
trunk. Rebase required? Wrong Branch? See 
https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| GITHUB PR | https://github.com/apache/hadoop/pull/438 |
| JIRA Issue | YARN-9009 |
| Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-438/5/console |
| versions | git=2.7.4 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |


This message was automatically generated.



> Fix flaky test TestEntityGroupFSTimelineStore.testCleanLogs
> ---
>
> Key: YARN-9009
> URL: https://issues.apache.org/jira/browse/YARN-9009
> Project: Hadoop YARN
>  Issue Type: Bug
> Environment: Ubuntu 18.04
> java version "1.8.0_181"
> Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
>  
> Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 
> 2018-06-17T13:33:14-05:00)
>Reporter: OrDTesters
>Assignee: OrDTesters
>Priority: Minor
> Fix For: 3.0.4, 3.1.2, 3.3.0, 3.2.1
>
> Attachments: YARN-9009-trunk-001.patch
>
>
> In TestEntityGroupFSTimelineStore, testCleanLogs fails when run after 
> testMoveToDone.
> testCleanLogs fails because testMoveToDone moves a file into the same 
> directory that testCleanLogs cleans, causing testCleanLogs to clean 3 files 
> instead of the 2 it expects.
> To fix the failure, we can delete the file after it is moved by 
> testMoveToDone.
> Pull request link: [https://github.com/apache/hadoop/pull/438]
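The linked pull request is the actual fix; as a sketch of the idea only
(isolating test order by cleaning up what a test moved), with hypothetical
names and JUnit 4 annotations:

{code:java}
import org.junit.After;
import org.junit.Test;

// Hypothetical shape of the fix: whatever testMoveToDone moves into the
// shared "done" directory is removed again afterwards, so a later
// testCleanLogs only sees the files it created itself.
public class TestCleanupSketch {

  private java.nio.file.Path movedFile;  // set by testMoveToDone

  @Test
  public void testMoveToDone() throws Exception {
    // ... move the file into the shared done directory ...
    // movedFile = <destination path>;
  }

  @After
  public void cleanupMovedFile() throws Exception {
    if (movedFile != null) {
      java.nio.file.Files.deleteIfExists(movedFile);
      movedFile = null;
    }
  }
}
{code}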






[jira] [Commented] (YARN-9468) Fix inaccurate documentations in Placement Constraints

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909322#comment-16909322
 ] 

Hadoop QA commented on YARN-9468:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
33m 38s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  4s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 50s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-717/6/artifact/out/Dockerfile
 |
| GITHUB PR | https://github.com/apache/hadoop/pull/717 |
| JIRA Issue | YARN-9468 |
| Optional Tests | dupname asflicense mvnsite |
| uname | Linux 5177e514a60f 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 
10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / e356e4f |
| Max. process+thread count | 308 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-717/6/console |
| versions | git=2.7.4 maven=3.3.9 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |


This message was automatically generated.



> Fix inaccurate documentations in Placement Constraints
> --
>
> Key: YARN-9468
> URL: https://issues.apache.org/jira/browse/YARN-9468
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.2.0
>Reporter: hunshenshi
>Assignee: hunshenshi
>Priority: Major
>
> Documentation: Placement Constraints
> *First* 
> {code:java}
> zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3
> {code}
>  * place 5 containers with tag “hbase” with affinity to a rack on which 
> containers with tag “zk” are running (i.e., an “hbase” container 
> should{color:#ff0000} not{color} be placed at a rack where a “zk” container 
> is running, given that “zk” is the TargetTag of the second constraint);
> The _*not*_ word in brackets should be deleted.
>  
> *Second*
> {code:java}
> PlacementSpec => "" | KeyVal;PlacementSpec
> {code}
> The semicolon should be replaced by a colon, as shown below.
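For reference, the production as it should read after the proposed fix (same
grammar, colon instead of semicolon), together with an example spec in that
form taken from the text above:

{code:java}
PlacementSpec => "" | KeyVal:PlacementSpec
// e.g. zk=3,NOTIN,NODE,zk:hbase=5,IN,RACK,zk:spark=7,CARDINALITY,NODE,hbase,1,3
{code}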






[jira] [Assigned] (YARN-9756) Create metric that sums total memory/vcores preempted per round

2019-08-16 Thread Eric Payne (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne reassigned YARN-9756:


Assignee: Eric Payne

> Create metric that sums total memory/vcores preempted per round
> ---
>
> Key: YARN-9756
> URL: https://issues.apache.org/jira/browse/YARN-9756
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Affects Versions: 3.2.0, 2.9.2, 3.0.3, 2.8.5, 3.1.2
>Reporter: Eric Payne
>Assignee: Eric Payne
>Priority: Major
>







[jira] [Commented] (YARN-9579) the property of sharedcache in mapred-default.xml

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909290#comment-16909290
 ] 

Hadoop QA commented on YARN-9579:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
45s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
29m 27s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 43s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  5m 
21s{color} | {color:green} hadoop-mapreduce-client-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 51m 34s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-848/7/artifact/out/Dockerfile
 |
| GITHUB PR | https://github.com/apache/hadoop/pull/848 |
| JIRA Issue | YARN-9579 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite 
unit shadedclient xml |
| uname | Linux 80e03ac9b1de 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / e356e4f |
| Default Java | 1.8.0_222 |
|  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-848/7/testReport/ |
| Max. process+thread count | 1641 (vs. ulimit of 5500) |
| modules | C: 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core 
U: 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core |
| Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-848/7/console |
| versions | git=2.7.4 maven=3.3.9 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |


This message was automatically generated.



> the property of sharedcache in mapred-default.xml
> -
>
> Key: YARN-9579
> URL: https://issues.apache.org/jira/browse/YARN-9579
> Project: Hadoop 

[jira] [Commented] (YARN-9479) Change String.equals to Objects.equals(String,String) to avoid possible NullPointerException

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909259#comment-16909259
 ] 

Hadoop QA commented on YARN-9479:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  9s{color} 
| {color:red} https://github.com/apache/hadoop/pull/738 does not apply to 
trunk. Rebase required? Wrong Branch? See 
https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| GITHUB PR | https://github.com/apache/hadoop/pull/738 |
| JIRA Issue | YARN-9479 |
| Console output | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-738/8/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |


This message was automatically generated.



> Change String.equals to Objects.equals(String,String) to avoid possible 
> NullPointerException
> 
>
> Key: YARN-9479
> URL: https://issues.apache.org/jira/browse/YARN-9479
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: bd2019us
>Priority: Major
> Attachments: 1.patch
>
>
> Hello,
> I found that the String "queueName" may carry a risk of 
> NullPointerException, since it is used immediately after initialization and 
> there is no null check. One recommended API is Objects.equals(String, 
> String), which avoids this exception.
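To illustrate the difference (a generic example, not the patched YARN code):

{code:java}
import java.util.Objects;

public class NullSafeEquals {
  public static void main(String[] args) {
    String queueName = null;  // e.g. not yet resolved

    // Null-safe: false when exactly one side is null, true when both are.
    System.out.println(Objects.equals(queueName, "default"));  // false
    System.out.println(Objects.equals(null, null));            // true

    // Throws NullPointerException because the receiver is null:
    // System.out.println(queueName.equals("default"));
  }
}
{code}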






[jira] [Created] (YARN-9756) Create metric that sums total memory/vcores preempted per round

2019-08-16 Thread Eric Payne (JIRA)
Eric Payne created YARN-9756:


 Summary: Create metric that sums total memory/vcores preempted per 
round
 Key: YARN-9756
 URL: https://issues.apache.org/jira/browse/YARN-9756
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacity scheduler
Affects Versions: 3.1.2, 2.8.5, 3.0.3, 2.9.2, 3.2.0
Reporter: Eric Payne









[jira] [Commented] (YARN-9755) RM fails to start with FileSystemBasedConfigurationProvider

2019-08-16 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909204#comment-16909204
 ] 

Prabhu Joseph commented on YARN-9755:
-

[~eyang] Can you review this JIRA when you get time? It fixes the RM failing 
to start when FileSystemBasedConfigurationProvider is configured. Thanks.

> RM fails to start with FileSystemBasedConfigurationProvider
> ---
>
> Key: YARN-9755
> URL: https://issues.apache.org/jira/browse/YARN-9755
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9755-001.patch
>
>
> RM fails to start with below exception when 
> FileSystemBasedConfigurationProvider is used.
> *Exception:*
> {code}
> 2019-08-16 12:05:33,802 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
> ResourceManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: 
> java.io.IOException: Filesystem closed
> at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:109)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:868)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1281)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(ResourceManager.java:1312)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1335)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1328)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1328)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1379)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1567)
> Caused by: java.io.IOException: java.io.IOException: Filesystem closed
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:64)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:346)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:445)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> ... 14 more
> Caused by: java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:475)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1682)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1598)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1701)
> at 
> org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider.getConfigurationInputStream(FileSystemBasedConfigurationProvider.java:62)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:56)
> {code}
> FileSystemBasedConfigurationProvider uses the cached FileSystem instance, 
> which causes the issue; a possible workaround is sketched after the configs 
> below.
> *Configs:*
> {code}
> yarn.resourcemanager.configuration.provider-class = 
> org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider
> yarn.resourcemanager.configuration.file-system-based-store = /yarn/conf
> [yarn@yarndocker-1 yarn]$ hadoop fs -ls /yarn/conf
> -rw-r--r--   3 yarn supergroup   4138 
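The attached patch is the authoritative fix; as a hedged sketch of one common
way around a closed cached FileSystem, a fresh instance can be obtained per
read via FileSystem.newInstance (which bypasses the JVM-wide cache). The class
and method names below are assumptions.

{code:java}
import java.io.InputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UncachedConfRead {
  // FileSystem.newInstance() bypasses the JVM-wide FileSystem cache, so a
  // DFSClient closed elsewhere cannot break this reader. Alternatively,
  // setting fs.hdfs.impl.disable.cache=true in the Configuration disables
  // caching for hdfs:// URIs.
  static InputStream open(URI store, Path file, Configuration conf)
      throws Exception {
    FileSystem fs = FileSystem.newInstance(store, conf);
    return fs.open(file);  // caller closes the stream (and fs) when done
  }
}
{code}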

[jira] [Commented] (YARN-9755) RM fails to start with FileSystemBasedConfigurationProvider

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909128#comment-16909128
 ] 

Hadoop QA commented on YARN-9755:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
44s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 14s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 50s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
36s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 53m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9755 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977799/YARN-9755-001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 08c2353965ad 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9b8359b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24586/testReport/ |
| Max. process+thread count | 413 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24586/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> RM fails to start 

[jira] [Updated] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing

2019-08-16 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9217:
---
Attachment: YARN-9217.branch-3.2.001.patch

> Nodemanager will fail to start if GPU is misconfigured on the node or GPU 
> drivers missing
> -
>
> Key: YARN-9217
> URL: https://issues.apache.org/jira/browse/YARN-9217
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9217.001.patch, YARN-9217.002.patch, 
> YARN-9217.003.patch, YARN-9217.004.patch, YARN-9217.005.patch, 
> YARN-9217.006.patch, YARN-9217.007.patch, YARN-9217.008.patch, 
> YARN-9217.009.patch, YARN-9217.010.patch, YARN-9217.011.patch, 
> YARN-9217.012.patch, YARN-9217.branch-3.2.001.patch
>
>
> Nodemanager will not start:
> 1. If autodiscovery is enabled:
>  * If the nvidia-smi path is misconfigured or the file does not exist
>  * If 0 GPUs are found
>  * If the file exists but does not point to an nvidia-smi binary
>  * If the binary is OK but an IOException occurs
> 2. If the manually configured GPU devices are misconfigured:
>  * Any index:minor number format failure will cause a problem
>  * 0 configured devices will cause a problem
>  * NumberFormatException is not handled
> It would be better to warn about the misconfiguration, report 0 available 
> GPUs, and let the node keep working and run non-GPU jobs.
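For illustration, here is a minimal sketch of the behavior proposed above: 
warn on misconfiguration, report 0 GPUs, and keep the NodeManager running. 
All names in it are illustrative and not taken from the attached patches.

{code:java}
import java.io.IOException;
import java.util.Collections;
import java.util.List;

// Hedged sketch, not the actual patch: degrade gracefully on GPU
// misconfiguration instead of aborting NodeManager startup.
class GpuDiscoverySketch {
  List<String> discoverGpus() {
    try {
      List<String> devices = runNvidiaSmiAndParse(); // may fail on bad config
      if (devices.isEmpty()) {
        System.err.println("WARN: 0 GPUs found; running non-GPU jobs only.");
      }
      return devices;
    } catch (IOException | NumberFormatException e) {
      // Instead of failing NM startup, log a warning and report no GPUs.
      System.err.println("WARN: GPU discovery failed (" + e.getMessage()
          + "); assuming 0 GPUs.");
      return Collections.emptyList();
    }
  }

  private List<String> runNvidiaSmiAndParse() throws IOException {
    // Placeholder for invoking nvidia-smi and parsing its output.
    throw new IOException("nvidia-smi not found");
  }
}
{code}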



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909076#comment-16909076
 ] 

Hadoop QA commented on YARN-9217:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  5m 
16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 28s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m  
0s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  8s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 1 new + 225 unchanged - 2 fixed = 226 total (was 227) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 38s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
57s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
44s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m  
2s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}111m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-9217 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977794/YARN-9217.012.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 7bce830ab4ae 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 

[jira] [Updated] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9100:
-
Hadoop Flags: Reviewed

> Add tests for GpuResourceAllocator and do minor code cleanup
> 
>
> Key: YARN-9100
> URL: https://issues.apache.org/jira/browse/YARN-9100
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: YARN-9100-004.patch, YARN-9100-005.patch, 
> YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, 
> YARN-9100-009.patch, YARN-9100.001.patch, YARN-9100.002.patch, 
> YARN-9100.003.patch, YARN-9100.branch-3.1.001.patch, 
> YARN-9100.branch-3.1.002.patch, YARN-9100.branch-3.2.001.patch, 
> YARN-9100.branch-3.2.002.patch
>
>
> Add tests for GpuResourceAllocator and do minor code cleanup:
> - Improved log and exception messages
> - Added some new debug logs
> - Some methods were named like *Copy because they return copies of internal 
> data structures. The word "copy" is just noise in their names, so they have 
> been renamed. Additionally, the returned copies were made immutable.
> - The waiting loop in the assignGpus method was extracted into a new class, 
> RetryCommand. 
> Some more words about the new class RetryCommand: 
> There are similar waiting loops elsewhere in the code, in AMRMClient, 
> AMRMClientAsync and even in GenericTestUtils (see the waitFor method). 
> RetryCommand could be a future replacement for this duplicated code, as it 
> solves the waiting-loop problem in a generic way.
> The only downside of using RetryCommand in GpuResourceAllocator 
> (startGpuAssignmentLoop) is the ugly exception handling, but that is solely 
> due to how Java deals with checked exceptions vs. lambdas. If there's a 
> cleaner way to handle the exceptions, I'm open to suggestions.
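To make the idea concrete, here is a minimal sketch of such a generic waiting 
loop (an assumed shape only; this is not the RetryCommand class from the 
patch):

{code:java}
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hedged sketch of a generic retry/wait loop in the spirit of RetryCommand.
final class RetrySketch {
  static <T> T retryUntil(Callable<T> action, Predicate<T> done,
      long intervalMs, long timeoutMs) throws Exception {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (true) {
      T result = action.call();   // Callable may throw a checked exception
      if (done.test(result)) {
        return result;
      }
      if (System.currentTimeMillis() >= deadline) {
        throw new IllegalStateException("timed out waiting for condition");
      }
      Thread.sleep(intervalMs);   // back off before the next attempt
    }
  }
}
{code}

The "ugly exception handling" mentioned above follows from exactly this 
shape: Callable#call declares throws Exception, which a lambda cannot rethrow 
as a checked exception transparently, so callers end up wrapping and 
unwrapping it.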



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909050#comment-16909050
 ] 

Szilard Nemeth commented on YARN-9100:
--

Thanks for the latest patches [~pbacsko]!
Committed them to 3.2 / 3.1 branches.

> Add tests for GpuResourceAllocator and do minor code cleanup
> 
>
> Key: YARN-9100
> URL: https://issues.apache.org/jira/browse/YARN-9100
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9100-004.patch, YARN-9100-005.patch, 
> YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, 
> YARN-9100-009.patch, YARN-9100.001.patch, YARN-9100.002.patch, 
> YARN-9100.003.patch, YARN-9100.branch-3.1.001.patch, 
> YARN-9100.branch-3.1.002.patch, YARN-9100.branch-3.2.001.patch, 
> YARN-9100.branch-3.2.002.patch
>
>
> Add tests for GpuResourceAllocator and do minor code cleanup:
> - Improved log and exception messages
> - Added some new debug logs
> - Some methods were named like *Copy because they return copies of internal 
> data structures. The word "copy" is just noise in their names, so they have 
> been renamed. Additionally, the returned copies were made immutable.
> - The waiting loop in the assignGpus method was extracted into a new class, 
> RetryCommand. 
> Some more words about the new class RetryCommand: 
> There are similar waiting loops elsewhere in the code, in AMRMClient, 
> AMRMClientAsync and even in GenericTestUtils (see the waitFor method). 
> RetryCommand could be a future replacement for this duplicated code, as it 
> solves the waiting-loop problem in a generic way.
> The only downside of using RetryCommand in GpuResourceAllocator 
> (startGpuAssignmentLoop) is the ugly exception handling, but that is solely 
> due to how Java deals with checked exceptions vs. lambdas. If there's a 
> cleaner way to handle the exceptions, I'm open to suggestions.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9748) Allow capacity-scheduler configuration on HDFS

2019-08-16 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909049#comment-16909049
 ] 

Prabhu Joseph commented on YARN-9748:
-

[~cane] FileSystemBasedConfigurationProvider already allows the 
capacity-scheduler configuration to be stored on HDFS. Can you check whether 
that serves this purpose? 
{code:java}
yarn.resourcemanager.configuration.provider-class = 
org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider

yarn.resourcemanager.configuration.file-system-based-store=/yarn/conf

[yarn@yarndocker-1 yarn]$ hadoop fs -ls /yarn/conf
-rw-r--r--   3 yarn supergroup   4138 2019-08-16 13:09 
/yarn/conf/capacity-scheduler.xml
-rw-r--r--   3 yarn supergroup    494 2019-08-16 11:41 
/yarn/conf/core-site.xml
-rw-r--r--   3 yarn supergroup  11392 2019-08-16 11:52 
/yarn/conf/hadoop-policy.xml
-rw-r--r--   3 yarn supergroup  11492 2019-08-16 11:41 
/yarn/conf/yarn-site.xml
{code}
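For completeness, the usual workflow with this provider would be to update 
the file in the HDFS store and then ask the RM to reload it. The commands 
below are standard Hadoop CLI; the path is the one from the listing above:

{code}
# Upload the edited scheduler config to the configured store, overwriting
# the old copy, then tell the active RM to re-read it.
hadoop fs -put -f capacity-scheduler.xml /yarn/conf/capacity-scheduler.xml
yarn rmadmin -refreshQueues
{code}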

> Allow capacity-scheduler configuration on HDFS
> --
>
> Key: YARN-9748
> URL: https://issues.apache.org/jira/browse/YARN-9748
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler, capacityscheduler
>Affects Versions: 3.1.2
>Reporter: zhoukang
>Assignee: Prabhu Joseph
>Priority: Major
>
> Improvement:
> Support automatic reload of the capacity-scheduler configuration from HDFS



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9752) Add support for allocation id in SLS.

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909047#comment-16909047
 ] 

Hadoop QA commented on YARN-9752:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 19s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 20s{color} | {color:orange} hadoop-tools/hadoop-sls: The patch generated 1 
new + 44 unchanged - 2 fixed = 45 total (was 46) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 57s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 10m 49s{color} 
| {color:red} hadoop-sls in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 73m 20s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.sls.appmaster.TestAMSimulator |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-9752 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977786/YARN-9752.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 20c534f260cb 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 
08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9b8359b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/24583/artifact/out/diff-checkstyle-hadoop-tools_hadoop-sls.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/24583/artifact/out/patch-unit-hadoop-tools_hadoop-sls.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24583/testReport/ |
| Max. process+thread count | 439 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-sls U: 

[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909044#comment-16909044
 ] 

Hadoop QA commented on YARN-9100:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 0s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  2s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 0 new + 9 unchanged - 6 fixed = 9 total (was 15) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 31s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
43s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
21s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 69m 46s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:63396be |
| JIRA Issue | YARN-9100 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977787/YARN-9100.branch-3.2.002.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 4f50a46f6a30 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.2 / df61637 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24585/testReport/ |
| asflicense | 
https://builds.apache.org/job/PreCommit-YARN-Build/24585/artifact/out/patch-asflicense-problems.txt
 |
| Max. process+thread count | 413 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 

[jira] [Updated] (YARN-9755) RM fails to start with FileSystemBasedConfigurationProvider

2019-08-16 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9755:

Attachment: YARN-9755-001.patch

> RM fails to start with FileSystemBasedConfigurationProvider
> ---
>
> Key: YARN-9755
> URL: https://issues.apache.org/jira/browse/YARN-9755
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9755-001.patch
>
>
> RM fails to start with below exception when 
> FileSystemBasedConfigurationProvider is used.
> *Exception:*
> {code}
> 2019-08-16 12:05:33,802 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
> ResourceManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: 
> java.io.IOException: Filesystem closed
> at 
> org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
> at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:109)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:868)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1281)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(ResourceManager.java:1312)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1335)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1328)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1328)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1379)
> at 
> org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1567)
> Caused by: java.io.IOException: java.io.IOException: Filesystem closed
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:64)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:346)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:445)
> at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
> ... 14 more
> Caused by: java.io.IOException: Filesystem closed
> at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:475)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1682)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1598)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1701)
> at 
> org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider.getConfigurationInputStream(FileSystemBasedConfigurationProvider.java:62)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:56)
> {code}
> FileSystemBasedConfigurationProvider uses the cached FileSystem, which 
> causes the issue.
> *Configs:*
> {code}
> yarn.resourcemanager.configuration.provider-class=org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider
> yarn.resourcemanager.configuration.file-system-based-store=/yarn/conf
> [yarn@yarndocker-1 yarn]$ hadoop fs -ls /yarn/conf
> -rw-r--r--   3 yarn supergroup   4138 2019-08-16 13:09 
> /yarn/conf/capacity-scheduler.xml
> -rw-r--r--   3 yarn supergroup    494 2019-08-16 11:41 
> /yarn/conf/core-site.xml
> -rw-r--r--   3 

[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909040#comment-16909040
 ] 

Hadoop QA commented on YARN-9100:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
41s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 14s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 0 new + 9 unchanged - 6 fixed = 9 total (was 15) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m  
1s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
37s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 68m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:63396be |
| JIRA Issue | YARN-9100 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977787/YARN-9100.branch-3.2.002.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 0c2b596601c3 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.2 / df61637 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24582/testReport/ |
| asflicense | 
https://builds.apache.org/job/PreCommit-YARN-Build/24582/artifact/out/patch-asflicense-problems.txt
 |
| Max. process+thread count | 412 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 

[jira] [Commented] (YARN-2599) Standby RM should also expose some jmx and metrics

2019-08-16 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909039#comment-16909039
 ] 

Sunil Govindan commented on YARN-2599:
--

Thanks [~cheersyang] for the review. Thanks [~rohithsharma] for the original 
patch. I just rebased it.

I will commit later today if there are no objections.

 

> Standby RM should also expose some jmx and metrics
> --
>
> Key: YARN-2599
> URL: https://issues.apache.org/jira/browse/YARN-2599
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.5.1, 2.7.3, 3.0.0-alpha1
>Reporter: Karthik Kambatla
>Assignee: Rohith Sharma K S
>Priority: Major
> Attachments: YARN-2599.002.patch, YARN-2599.patch
>
>
> YARN-1898 redirects jmx and metrics to the Active. As discussed there, we 
> need to separate out metrics displayed so the Standby RM can also be 
> monitored. 
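Once the standby serves its own metrics, it can be scraped directly rather 
than being redirected to the active, e.g. (hostname and port below are 
illustrative):

{code}
# Query the standby RM's JMX servlet; the qry parameter filters MBeans.
curl -s 'http://standby-rm-host:8088/jmx?qry=Hadoop:service=ResourceManager,name=*'
{code}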



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9755) RM fails to start with FileSystemBasedConfigurationProvider

2019-08-16 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created YARN-9755:
---

 Summary: RM fails to start with 
FileSystemBasedConfigurationProvider
 Key: YARN-9755
 URL: https://issues.apache.org/jira/browse/YARN-9755
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 3.3.0
Reporter: Prabhu Joseph
Assignee: Prabhu Joseph


RM fails to start with below exception when 
FileSystemBasedConfigurationProvider is used.

{code}
yarn.resourcemanager.configuration.provider-class=org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider
yarn.resourcemanager.configuration.file-system-based-store=/yarn/conf

[yarn@yarndocker-1 yarn]$ hadoop fs -ls /yarn/conf
-rw-r--r--   3 yarn supergroup   4138 2019-08-16 13:09 
/yarn/conf/capacity-scheduler.xml
-rw-r--r--   3 yarn supergroup    494 2019-08-16 11:41 
/yarn/conf/core-site.xml
-rw-r--r--   3 yarn supergroup  11392 2019-08-16 11:52 
/yarn/conf/hadoop-policy.xml
-rw-r--r--   3 yarn supergroup  11492 2019-08-16 11:41 
/yarn/conf/yarn-site.xml

{code}


*Exception:*

{code}
2019-08-16 12:05:33,802 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
ResourceManager
org.apache.hadoop.service.ServiceStateException: java.io.IOException: 
java.io.IOException: Filesystem closed
at 
org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
at 
org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:109)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:868)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1281)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(ResourceManager.java:1312)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1335)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1328)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1328)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1379)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1567)
Caused by: java.io.IOException: java.io.IOException: Filesystem closed
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:64)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:346)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:445)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
... 14 more
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:475)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1682)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1598)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1701)
at 
org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider.getConfigurationInputStream(FileSystemBasedConfigurationProvider.java:62)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:56)
{code}

FileSystemBasedConfigurationProvider uses the cached FileSystem, which causes 
the issue.
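A plausible direction for the fix (an assumption on my part, not necessarily 
what the attached patch does) is to bypass the process-wide FileSystem cache, 
so that another component closing the shared instance cannot break the 
provider:

{code:java}
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hedged sketch: FileSystem.get() returns a cached, shared instance;
// FileSystem.newInstance() bypasses the cache and returns a private
// instance owned by this caller.
public class ConfProviderSketch {
  public static FileSystem openStoreFs(Configuration conf, Path store)
      throws IOException {
    URI uri = store.toUri();
    return FileSystem.newInstance(uri, conf);
  }
}
{code}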



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9755) RM fails to start with FileSystemBasedConfigurationProvider

2019-08-16 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9755:

Description: 
RM fails to start with below exception when 
FileSystemBasedConfigurationProvider is used.

*Exception:*

{code}
2019-08-16 12:05:33,802 ERROR 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
ResourceManager
org.apache.hadoop.service.ServiceStateException: java.io.IOException: 
java.io.IOException: Filesystem closed
at 
org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
at 
org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:109)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:868)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1281)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.reinitialize(ResourceManager.java:1312)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1335)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1328)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1328)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1379)
at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1567)
Caused by: java.io.IOException: java.io.IOException: Filesystem closed
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:64)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:346)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:445)
at 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
... 14 more
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:475)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1682)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1586)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1583)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1598)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1701)
at 
org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider.getConfigurationInputStream(FileSystemBasedConfigurationProvider.java:62)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.conf.FileBasedCSConfigurationProvider.loadConfiguration(FileBasedCSConfigurationProvider.java:56)
{code}

FileSystemBasedConfigurationProvider uses the cached FileSystem, which causes 
the issue.


*Configs:*

{code}
yarn.resourcemanager.configuration.provider-class=org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider
yarn.resourcemanager.configuration.file-system-based-store=/yarn/conf

[yarn@yarndocker-1 yarn]$ hadoop fs -ls /yarn/conf
-rw-r--r--   3 yarn supergroup   4138 2019-08-16 13:09 
/yarn/conf/capacity-scheduler.xml
-rw-r--r--   3 yarn supergroup    494 2019-08-16 11:41 
/yarn/conf/core-site.xml
-rw-r--r--   3 yarn supergroup  11392 2019-08-16 11:52 
/yarn/conf/hadoop-policy.xml
-rw-r--r--   3 yarn supergroup  11492 2019-08-16 11:41 
/yarn/conf/yarn-site.xml

{code}

  was:
RM fails to start with below exception when 
FileSystemBasedConfigurationProvider is used.

{code}
yarn.resourcemanager.configuration.provider-class=org.apache.hadoop.yarn.FileSystemBasedConfigurationProvider
yarn.resourcemanager.configuration.file-system-based-store=/yarn/conf

[yarn@yarndocker-1 yarn]$ hadoop fs -ls /yarn/conf
-rw-r--r--   3 yarn supergroup   4138 2019-08-16 13:09 
/yarn/conf/capacity-scheduler.xml
-rw-r--r--   3 yarn supergroup    494 2019-08-16 11:41 

[jira] [Commented] (YARN-2599) Standby RM should also expose some jmx and metrics

2019-08-16 Thread Weiwei Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909036#comment-16909036
 ] 

Weiwei Yang commented on YARN-2599:
---

LGTM, that minor checkstyle issue can be fixed during commit. +1.

> Standby RM should also expose some jmx and metrics
> --
>
> Key: YARN-2599
> URL: https://issues.apache.org/jira/browse/YARN-2599
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.5.1, 2.7.3, 3.0.0-alpha1
>Reporter: Karthik Kambatla
>Assignee: Rohith Sharma K S
>Priority: Major
> Attachments: YARN-2599.002.patch, YARN-2599.patch
>
>
> YARN-1898 redirects jmx and metrics to the Active. As discussed there, we 
> need to separate out metrics displayed so the Standby RM can also be 
> monitored. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909003#comment-16909003
 ] 

Hadoop QA commented on YARN-9100:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 
53s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
51s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
13s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} branch-3.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 0 new + 12 unchanged - 6 fixed = 12 total (was 18) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 27s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
28s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}103m 47s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:080e9d0f9b3 |
| JIRA Issue | YARN-9100 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/1298/YARN-9100.branch-3.1.002.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux da2d92c5d249 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 
08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.1 / 0a379e9 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24580/testReport/ |
| Max. process+thread count | 307 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 

[jira] [Updated] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing

2019-08-16 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9217:
---
Attachment: YARN-9217.012.patch

> Nodemanager will fail to start if GPU is misconfigured on the node or GPU 
> drivers missing
> -
>
> Key: YARN-9217
> URL: https://issues.apache.org/jira/browse/YARN-9217
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9217.001.patch, YARN-9217.002.patch, 
> YARN-9217.003.patch, YARN-9217.004.patch, YARN-9217.005.patch, 
> YARN-9217.006.patch, YARN-9217.007.patch, YARN-9217.008.patch, 
> YARN-9217.009.patch, YARN-9217.010.patch, YARN-9217.011.patch, 
> YARN-9217.012.patch
>
>
> Nodemanager will not start:
> 1. If autodiscovery is enabled:
>  * If the nvidia-smi path is misconfigured or the file does not exist
>  * If 0 GPUs are found
>  * If the file exists but does not point to an nvidia-smi binary
>  * If the binary is OK but an IOException occurs
> 2. If the manually configured GPU devices are misconfigured:
>  * Any index:minor number format failure will cause a problem
>  * 0 configured devices will cause a problem
>  * NumberFormatException is not handled
> It would be better to warn about the misconfiguration, report 0 available 
> GPUs, and let the node keep working and run non-GPU jobs.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909000#comment-16909000
 ] 

Hadoop QA commented on YARN-9100:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
59s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 38s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
51s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} branch-3.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 0 new + 12 unchanged - 6 fixed = 12 total (was 18) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 23s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
23s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 83m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:080e9d0 |
| JIRA Issue | YARN-9100 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/1298/YARN-9100.branch-3.1.002.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 21553a34cf2c 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.1 / 0a379e9 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24581/testReport/ |
| Max. process+thread count | 448 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 

[jira] [Commented] (YARN-2599) Standby RM should also expose some jmx and metrics

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908989#comment-16908989
 ] 

Hadoop QA commented on YARN-2599:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  
0s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 51s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
25s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 17s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 1 new + 13 unchanged - 0 fixed = 14 total (was 13) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m 28s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 91m 
11s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 25m 
49s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}207m 36s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.0 Server=19.03.0 Image:yetus/hadoop:bdbca0e53b4 |
| JIRA Issue | YARN-2599 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977752/YARN-2599.002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ca12a538a784 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 
22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2216ec5 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 

[jira] [Created] (YARN-9754) Add support for arbitrary DAG AM Simulator.

2019-08-16 Thread Abhishek Modi (JIRA)
Abhishek Modi created YARN-9754:
---

 Summary: Add support for arbitrary DAG AM Simulator.
 Key: YARN-9754
 URL: https://issues.apache.org/jira/browse/YARN-9754
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Abhishek Modi
Assignee: Abhishek Modi


Currently, all map containers are requested as soon as the Application Master 
comes up, and then all reducer containers are requested. This doesn't give the 
flexibility to simulate the behavior of a DAG, where varying numbers of 
containers would be requested at different times.
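
One way to model this, as a minimal sketch only (the DagStage class and its 
fields below are hypothetical, not an existing SLS API): each stage declares 
its upstream dependencies and becomes eligible to request containers only once 
those finish.

{code:java}
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a DAG stage schedule for an AM simulator:
// a stage requests its containers only after its upstream stages finish,
// instead of requesting all maps and then all reducers up front.
class DagStage {
  final String name;
  final int numContainers;          // containers to request for this stage
  final List<DagStage> upstream;    // stages that must finish first

  DagStage(String name, int numContainers, List<DagStage> upstream) {
    this.name = name;
    this.numContainers = numContainers;
    this.upstream = upstream;
  }

  // The simulator would poll this as stages complete.
  boolean isReadyToRequest(Set<DagStage> finishedStages) {
    return finishedStages.containsAll(upstream);
  }
}
{code}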



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-9734) LogAggregationIndexedFileController fails to upload logs in rolling fashion

2019-08-16 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph resolved YARN-9734.
-
Resolution: Not A Problem

> LogAggregationIndexedFileController fails to upload logs in rolling fashion
> ---
>
> Key: YARN-9734
> URL: https://issues.apache.org/jira/browse/YARN-9734
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> LogAggregationIndexedFileController fails to upload logs in rolling fashion.
> *Configs:*
> {code}
> yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds = 60
> yarn.nodemanager.log-aggregation.debug-enabled = true
> yarn.log-aggregation.file-formats=IFile
> yarn.log-aggregation.file-controller.IFile.class=org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController
> {code}
> *Initialize writer fails with below error:*
> {code}
> 2019-08-09 07:46:12,411 ERROR 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl:
>  Cannot create writer for app application_1565102314214_0007. Skip log upload 
> this time.
> java.io.IOException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.RecoveryInProgressException):
>  Failed to APPEND_FILE 
> /app-logs/ambari-qa/bucket-logs-ifile/0007/application_1565102314214_0007/yarnDocker-1_45454_1565335809907
>  for DFSClient_NONMAPREDUCE_-1185242013_202 on 172.26.86.24 because lease 
> recovery is in progress. Try again later.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2697)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:125)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2745)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:823)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:500)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:529)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1001)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:929)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2920)
> at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:227)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:312)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:482)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:449)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:295)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)  
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.RecoveryInProgressException):
>  Failed to APPEND_FILE 
> /app-logs/ambari-qa/bucket-logs-ifile/0007/application_1565102314214_0007/yarnDocker-1_45454_1565335809907
>  for DFSClient_NONMAPREDUCE_-1185242013_202 on 172.26.86.24 because lease 
> recovery is in progress. Try again later.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2697)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:125)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2745)
> at 
> 

[jira] [Commented] (YARN-9734) LogAggregationIndexedFileController fails to upload logs in rolling fashion

2019-08-16 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908977#comment-16908977
 ] 

Prabhu Joseph commented on YARN-9734:
-

It works fine after setting 
dfs.client.block.write.replace-datanode-on-failure.policy = NEVER. It looks 
like this config is needed for a small cluster, as per the HDFS config doc.

{code}
When the cluster size is extremely small, e.g. 3 nodes or less, cluster
administrators may want to set the policy to NEVER in the default
configuration file or disable this feature.  Otherwise, users may
experience an unusually high rate of pipeline failures since it is
impossible to find new datanodes for replacement.
{code}
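
For reference, a minimal sketch of how the setting could be applied in 
hdfs-site.xml on the client side (the property name is the standard HDFS one; 
the file layout is the usual Hadoop convention):

{code:xml}
<!-- hdfs-site.xml (client side); relevant on very small clusters -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>NEVER</value>
</property>
{code}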


> LogAggregationIndexedFileController fails to upload logs in rolling fashion
> ---
>
> Key: YARN-9734
> URL: https://issues.apache.org/jira/browse/YARN-9734
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> LogAggregationIndexedFileController fails to upload logs in rolling fashion.
> *Configs:*
> {code}
> yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds = 60
> yarn.nodemanager.log-aggregation.debug-enabled = true
> yarn.log-aggregation.file-formats=IFile
> yarn.log-aggregation.file-controller.IFile.class=org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController
> {code}
> *Initialize writer fails with below error:*
> {code}
> 2019-08-09 07:46:12,411 ERROR 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl:
>  Cannot create writer for app application_1565102314214_0007. Skip log upload 
> this time.
> java.io.IOException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.RecoveryInProgressException):
>  Failed to APPEND_FILE 
> /app-logs/ambari-qa/bucket-logs-ifile/0007/application_1565102314214_0007/yarnDocker-1_45454_1565335809907
>  for DFSClient_NONMAPREDUCE_-1185242013_202 on 172.26.86.24 because lease 
> recovery is in progress. Try again later.
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2697)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirAppendOp.appendFile(FSDirAppendOp.java:125)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2745)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:823)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:500)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:529)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1001)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:929)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2920)
> at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.ifile.LogAggregationIndexedFileController.initializeWriter(LogAggregationIndexedFileController.java:227)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:312)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:482)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:449)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$1.run(LogAggregationService.java:295)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)  
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.RecoveryInProgressException):
>  Failed to APPEND_FILE 
> 

[jira] [Updated] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9100:
---
Attachment: YARN-9100.branch-3.2.002.patch

> Add tests for GpuResourceAllocator and do minor code cleanup
> 
>
> Key: YARN-9100
> URL: https://issues.apache.org/jira/browse/YARN-9100
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9100-004.patch, YARN-9100-005.patch, 
> YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, 
> YARN-9100-009.patch, YARN-9100.001.patch, YARN-9100.002.patch, 
> YARN-9100.003.patch, YARN-9100.branch-3.1.001.patch, 
> YARN-9100.branch-3.1.002.patch, YARN-9100.branch-3.2.001.patch, 
> YARN-9100.branch-3.2.002.patch
>
>
> Add tests for GpuResourceAllocator and do minor code cleanup
> - Improved log and exception messages
> - Added some new debug logs
> - Some methods are named like *Copy; they return copies of internal data 
> structures. The word "copy" is just noise in their names, so they have been 
> renamed. Additionally, the copied data structures were modified to be 
> immutable.
> - The waiting loop in the assignGpus method was extracted into a new class, 
> RetryCommand. 
> Some more words about the new class RetryCommand: 
> There are some similar waiting loops in the code in AMRMClient, 
> AMRMClientAsync and even in GenericTestUtils (see the waitFor method). 
> RetryCommand could be a future replacement for this duplicated code, as it 
> solves the waiting-loop problem in a generic way.
> The only downside of using RetryCommand in GpuResourceAllocator 
> (startGpuAssignmentLoop) is the ugly exception handling part, but that's 
> solely because of how Java deals with checked exceptions vs. lambdas. If 
> there's a cleaner way to solve the exception handling, I'm open to any 
> suggestions.
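
A minimal sketch of what such a generic waiting loop could look like (the 
class and method names below are illustrative only, not the actual 
RetryCommand API):

{code:java}
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Illustrative sketch of a generic retry/wait loop; not the actual RetryCommand.
final class RetrySketch {
  static <T> T retryUntil(Callable<T> action, Predicate<T> done,
      long intervalMs, long timeoutMs) throws Exception {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (true) {
      // Callable#call throws Exception, which is exactly where the
      // checked-exceptions-vs-lambdas awkwardness comes from.
      T result = action.call();
      if (done.test(result)) {
        return result;
      }
      if (System.currentTimeMillis() >= deadline) {
        throw new IllegalStateException("Timed out waiting for condition");
      }
      Thread.sleep(intervalMs);
    }
  }
}
{code}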



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8586) Extract log aggregation related fields and methods from RMAppImpl

2019-08-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908965#comment-16908965
 ] 

Hudson commented on YARN-8586:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17136 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17136/])
YARN-8586. Extract log aggregation related fields and methods from (snemeth: 
rev 4456ea67b949553b85e101e866b4b3f4b335f1f0)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppImpl.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmapp/RMAppLogAggregation.java


> Extract log aggregation related fields and methods from RMAppImpl
> -
>
> Key: YARN-8586
> URL: https://issues.apache.org/jira/browse/YARN-8586
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: YARN-8586-branch-3.1.001.patch, YARN-8586.001.patch, 
> YARN-8586.002.patch, YARN-8586.002.patch, YARN-8586.003.patch, 
> YARN-8586.004.patch, YARN-8586.branch-3.2.001.patch
>
>
> Given that RMAppImpl is already above 2000 lines and very complex, as a very 
> simple and straightforward step, all log-aggregation-related fields and 
> methods could be extracted to a new class.
> The clients of RMAppImpl may access the same methods, and RMAppImpl would 
> delegate all those calls to the newly introduced class.
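
The delegation pattern in question, as a minimal sketch (class and method 
names are simplified; this is not the actual RMAppImpl/RMAppLogAggregation 
code):

{code:java}
// Illustrative delegation sketch: the owner keeps its public API and
// forwards log-aggregation calls to the extracted class.
class LogAggregationStateSketch {
  private volatile boolean logAggregationEnabled;

  boolean isLogAggregationEnabled() {
    return logAggregationEnabled;
  }

  void setLogAggregationEnabled(boolean enabled) {
    logAggregationEnabled = enabled;
  }
}

class AppImplSketch {
  private final LogAggregationStateSketch logAggregation =
      new LogAggregationStateSketch();

  // Existing clients keep calling the same method on the app object.
  boolean isLogAggregationEnabled() {
    return logAggregation.isLogAggregationEnabled();
  }
}
{code}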



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9461) TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken fails with HTTP 400

2019-08-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908966#comment-16908966
 ] 

Hudson commented on YARN-9461:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17136 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17136/])
YARN-9461. (snemeth: rev 9b8359bb085b29b868ec9e704bf655b58d3e36a7)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesDelegationTokenAuthentication.java


> TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken 
> fails with HTTP 400
> ---
>
> Key: YARN-9461
> URL: https://issues.apache.org/jira/browse/YARN-9461
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: YARN-9461-001.patch
>
>
> The test 
> {{TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken}}
>  sometimes fails with the following error:
> {noformat}
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8088/ws/v1/cluster/delegation-token
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.cancelDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:462)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:283)
> {noformat}
> The problem is that for whatever reason, Jetty seems to execute the token 
> cancellation REST call twice. First we get HTTP 200 OK, but the second 
> request fails with HTTP 400 Bad Request.
> The {{MockRM}} instance is static. Something in this class could be the 
> problem; it turned out that using separate {{MockRM}} instances solves the 
> flakiness.
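
A minimal sketch of the per-test-instance pattern (the test class and setup 
below are illustrative, not the exact fix):

{code:java}
import org.apache.hadoop.yarn.server.resourcemanager.MockRM;
import org.junit.After;
import org.junit.Before;

// Sketch: a fresh MockRM per test instead of a shared static instance,
// so no request/token state leaks between test methods.
public class PerTestMockRmSketch {
  private MockRM rm;   // instance field, not static

  @Before
  public void setUp() throws Exception {
    rm = new MockRM();
    rm.start();
  }

  @After
  public void tearDown() {
    if (rm != null) {
      rm.stop();       // no state carries over to the next test
    }
  }
}
{code}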



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9461) TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken fails with HTTP 400

2019-08-16 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9461:
-
Fix Version/s: 3.3.0

> TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken 
> fails with HTTP 400
> ---
>
> Key: YARN-9461
> URL: https://issues.apache.org/jira/browse/YARN-9461
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: YARN-9461-001.patch
>
>
> The test 
> {{TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken}}
>  sometimes fails with the following error:
> {noformat}
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8088/ws/v1/cluster/delegation-token
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.cancelDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:462)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:283)
> {noformat}
> The problem is that for whatever reason, Jetty seems to execute the token 
> cancellation REST call twice. First we get HTTP 200 OK, but the second 
> request fails with HTTP 400 Bad Request.
> The {{MockRM}} instance is static. Something in this class could be the 
> problem; it turned out that using separate {{MockRM}} instances solves the 
> flakiness.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9461) TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken fails with HTTP 400

2019-08-16 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908949#comment-16908949
 ] 

Szilard Nemeth commented on YARN-9461:
--

Hi [~pbacsko]!
Thanks for this patch, committed to trunk!

> TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken 
> fails with HTTP 400
> ---
>
> Key: YARN-9461
> URL: https://issues.apache.org/jira/browse/YARN-9461
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Attachments: YARN-9461-001.patch
>
>
> The test 
> {{TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken}}
>  sometimes fails with the following error:
> {noformat}
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8088/ws/v1/cluster/delegation-token
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.cancelDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:462)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:283)
> {noformat}
> The problem is that for whatever reason, Jetty seems to execute the token 
> cancellation REST call twice. First we get HTTP 200 OK, but the second 
> request fails with HTTP 400 Bad Request.
> The {{MockRM}} instance is static. Something in this class could be the 
> problem; it turned out that using separate {{MockRM}} instances solves the 
> flakiness.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9461) TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken fails with HTTP 400

2019-08-16 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908948#comment-16908948
 ] 

Szilard Nemeth commented on YARN-9461:
--

Hi [~pbacsko]!
Patch looks good, committing shortly!

> TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken 
> fails with HTTP 400
> ---
>
> Key: YARN-9461
> URL: https://issues.apache.org/jira/browse/YARN-9461
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, test
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Minor
> Attachments: YARN-9461-001.patch
>
>
> The test 
> {{TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken}}
>  sometimes fails with the following error:
> {noformat}
> java.io.IOException: Server returned HTTP response code: 400 for URL: 
> http://localhost:8088/ws/v1/cluster/delegation-token
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.cancelDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:462)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesDelegationTokenAuthentication.testCancelledDelegationToken(TestRMWebServicesDelegationTokenAuthentication.java:283)
> {noformat}
> The problem is that for whatever reason, Jetty seems to execute the token 
> cancellation REST call twice. First we get HTTP 200 OK, but the second 
> request fails with HTTP 400 Bad Request.
> The {{MockRM}} instance is static. Something in this class could be the 
> problem; it turned out that using separate {{MockRM}} instances solves the 
> flakiness.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8586) Extract log aggregation related fields and methods from RMAppImpl

2019-08-16 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908944#comment-16908944
 ] 

Szilard Nemeth commented on YARN-8586:
--

Hi [~pbacsko]!
+1 on the latest patches, committed to trunk, branch-3.2 and branch-3.1
Thanks for your patch!
Thanks [~bsteinbach] for the review!

> Extract log aggregation related fields and methods from RMAppImpl
> -
>
> Key: YARN-8586
> URL: https://issues.apache.org/jira/browse/YARN-8586
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-8586-branch-3.1.001.patch, YARN-8586.001.patch, 
> YARN-8586.002.patch, YARN-8586.002.patch, YARN-8586.003.patch, 
> YARN-8586.004.patch, YARN-8586.branch-3.2.001.patch
>
>
> Given that RMAppImpl is already above 2000 lines and very complex, as a very 
> simple and straightforward step, all log-aggregation-related fields and 
> methods could be extracted to a new class.
> The clients of RMAppImpl may access the same methods, and RMAppImpl would 
> delegate all those calls to the newly introduced class.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9738) Remove lock on ClusterNodeTracker#getNodeReport as it blocks application submission

2019-08-16 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908908#comment-16908908
 ] 

Bibin A Chundatt commented on YARN-9738:


[~sunilg]

Only the nodes map is currently changed to a ConcurrentHashMap, which will 
give bucket-level locking.
Also, the read lock is removed only for getNodeReport invoked from 
ClientRMService, NodeInfo and ApplicationMasterService; an explicit null check 
is handled too.

Patch looks good to me. Can we go ahead?
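
For reference, a minimal sketch of the locking change (field and method names 
are simplified; this is not the exact ClusterNodeTracker code):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: ConcurrentHashMap locks at bucket/bin granularity, so point reads
// no longer queue behind a global ReentrantReadWriteLock read lock.
class NodeTrackerSketch<K, N> {
  private final Map<K, N> nodes = new ConcurrentHashMap<>();

  N getNode(K nodeId) {
    // No explicit lock for a single get(); callers must null-check,
    // since the node may be removed concurrently.
    return nodes.get(nodeId);
  }

  void addNode(K nodeId, N node) {
    nodes.put(nodeId, node);
  }
}
{code}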



> Remove lock on ClusterNodeTracker#getNodeReport as it blocks application 
> submission
> ---
>
> Key: YARN-9738
> URL: https://issues.apache.org/jira/browse/YARN-9738
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9738-001.patch, YARN-9738-002.patch
>
>
> *Env :*
> Server OS :- UBUNTU
> No. of Cluster Node:- 9120 NMs
> Env Mode:- [Secure / Non secure]Secure
> *Preconditions:*
> ~9120 NMs were running
> ~1250 applications were in running state
> 35K applications were in pending state
> *Test Steps:*
> 1. Submit applications from 5 clients, each client with 2 threads, across a 
> total of 10 queues
> 2. As application submission increases (each distributed-shell application 
> will call getClusterNodes)
> *ClientRMService#getClusterNodes tries to get 
> ClusterNodeTracker#getNodeReport, where the nodes map is locked.*
> {quote}
> "IPC Server handler 36 on 45022" #246 daemon prio=5 os_prio=0 
> tid=0x7f75095de000 nid=0x1949c waiting on condition [0x7f74cff78000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x7f759f6d8858> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.getNodeReport(ClusterNodeTracker.java:123)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getNodeReport(AbstractYarnScheduler.java:449)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.createNodeReports(ClientRMService.java:1067)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getClusterNodes(ClientRMService.java:992)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getClusterNodes(ApplicationClientProtocolPBServiceImpl.java:313)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:589)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:530)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:863)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2792)
> {quote}
> *Instead we can make nodes as concurrentHashMap and remove readlock*



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Peter Bacsko (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908905#comment-16908905
 ] 

Peter Bacsko commented on YARN-9100:


Branch-3.1 failures are due to compilation errors. Fixed in branch-3.1-v2.

> Add tests for GpuResourceAllocator and do minor code cleanup
> 
>
> Key: YARN-9100
> URL: https://issues.apache.org/jira/browse/YARN-9100
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9100-004.patch, YARN-9100-005.patch, 
> YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, 
> YARN-9100-009.patch, YARN-9100.001.patch, YARN-9100.002.patch, 
> YARN-9100.003.patch, YARN-9100.branch-3.1.001.patch, 
> YARN-9100.branch-3.1.002.patch, YARN-9100.branch-3.2.001.patch
>
>
> Add tests for GpuResourceAllocator and do minor code cleanup
> - Improved log and exception messages
> - Added some new debug logs
> - Some methods are named like *Copy; they return copies of internal data 
> structures. The word "copy" is just noise in their names, so they have been 
> renamed. Additionally, the copied data structures were modified to be 
> immutable.
> - The waiting loop in the assignGpus method was extracted into a new class, 
> RetryCommand. 
> Some more words about the new class RetryCommand: 
> There are some similar waiting loops in the code in AMRMClient, 
> AMRMClientAsync and even in GenericTestUtils (see the waitFor method). 
> RetryCommand could be a future replacement for this duplicated code, as it 
> solves the waiting-loop problem in a generic way.
> The only downside of using RetryCommand in GpuResourceAllocator 
> (startGpuAssignmentLoop) is the ugly exception handling part, but that's 
> solely because of how Java deals with checked exceptions vs. lambdas. If 
> there's a cleaner way to solve the exception handling, I'm open to any 
> suggestions.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Peter Bacsko (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9100:
---
Attachment: YARN-9100.branch-3.1.002.patch

> Add tests for GpuResourceAllocator and do minor code cleanup
> 
>
> Key: YARN-9100
> URL: https://issues.apache.org/jira/browse/YARN-9100
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9100-004.patch, YARN-9100-005.patch, 
> YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, 
> YARN-9100-009.patch, YARN-9100.001.patch, YARN-9100.002.patch, 
> YARN-9100.003.patch, YARN-9100.branch-3.1.001.patch, 
> YARN-9100.branch-3.1.002.patch, YARN-9100.branch-3.2.001.patch
>
>
> Add tests for GpuResourceAllocator and do minor code cleanup
> - Improved log and exception messages
> - Added some new debug logs
> - Some methods are named like *Copy; they return copies of internal data 
> structures. The word "copy" is just noise in their names, so they have been 
> renamed. Additionally, the copied data structures were modified to be 
> immutable.
> - The waiting loop in the assignGpus method was extracted into a new class, 
> RetryCommand. 
> Some more words about the new class RetryCommand: 
> There are some similar waiting loops in the code in AMRMClient, 
> AMRMClientAsync and even in GenericTestUtils (see the waitFor method). 
> RetryCommand could be a future replacement for this duplicated code, as it 
> solves the waiting-loop problem in a generic way.
> The only downside of using RetryCommand in GpuResourceAllocator 
> (startGpuAssignmentLoop) is the ugly exception handling part, but that's 
> solely because of how Java deals with checked exceptions vs. lambdas. If 
> there's a cleaner way to solve the exception handling, I'm open to any 
> suggestions.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908887#comment-16908887
 ] 

Hadoop QA commented on YARN-9100:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
47s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 33s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
1s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} branch-3.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
28s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
53s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 53s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 0 new + 12 unchanged - 6 fixed = 12 total (was 18) {color} 
|
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
30s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  3m 
54s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 56s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 56m 23s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:080e9d0 |
| JIRA Issue | YARN-9100 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977757/YARN-9100.branch-3.1.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 7b8e1acc8246 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.1 / 9411437 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/24579/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| compile | 

[jira] [Commented] (YARN-6539) Create SecureLogin inside Router

2019-08-16 Thread Xie YiFan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908885#comment-16908885
 ] 

Xie YiFan commented on YARN-6539:
-

Hi [~yzzjjyy], you should set hadoop.security.authorization to false in 
core-site.xml.
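
That is, something like the following in core-site.xml (the property name is 
the standard Hadoop one; whether disabling it is appropriate depends on the 
cluster's security requirements):

{code:xml}
<!-- core-site.xml: disables service-level authorization checks -->
<property>
  <name>hadoop.security.authorization</name>
  <value>false</value>
</property>
{code}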

> Create SecureLogin inside Router
> 
>
> Key: YARN-6539
> URL: https://issues.apache.org/jira/browse/YARN-6539
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Assignee: Xie YiFan
>Priority: Minor
> Attachments: YARN-6359_1.patch, YARN-6359_2.patch, YARN-6539_3.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5857) TestLogAggregationService.testFixedSizeThreadPool fails intermittently on trunk

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908882#comment-16908882
 ] 

Hadoop QA commented on YARN-5857:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 36s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 55s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 
20s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 18s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:bdbca0e |
| JIRA Issue | YARN-5857 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977754/YARN-5857-002.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 0f31a2378bd1 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2216ec5 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_212 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/24578/testReport/ |
| Max. process+thread count | 441 (vs. ulimit of 1) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/24578/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> TestLogAggregationService.testFixedSizeThreadPool fails 

[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908878#comment-16908878
 ] 

Hadoop QA commented on YARN-9100:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 36m  
2s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
 7s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  8s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
56s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} branch-3.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
27s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
51s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 51s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 0 new + 12 unchanged - 6 fixed = 12 total (was 18) {color} 
|
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
30s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  4m 
14s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 54s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 87m 12s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.1 Server=19.03.1 Image:yetus/hadoop:080e9d0 |
| JIRA Issue | YARN-9100 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12977757/YARN-9100.branch-3.1.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f57b1c774ca5 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.1 / 9411437 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/24576/artifact/out/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| compile | 

[jira] [Commented] (YARN-6539) Create SecureLogin inside Router

2019-08-16 Thread qiwei huang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908849#comment-16908849
 ] 

qiwei huang commented on YARN-6539:
---

Hello, I merged this patch, but I got the following error on the client: 
"Protocol interface org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB is 
not known." I think there should be an ACL refresh before the server starts 
in order to resolve this problem. I also want to ask whether there is any 
progress on secure login in AMRMProxy: I can use the Router to forward 
requests from the client to the RM, but AMRMProxy cannot.

> Create SecureLogin inside Router
> 
>
> Key: YARN-6539
> URL: https://issues.apache.org/jira/browse/YARN-6539
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Assignee: Xie YiFan
>Priority: Minor
> Attachments: YARN-6359_1.patch, YARN-6359_2.patch, YARN-6539_3.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9217) Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing

2019-08-16 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908822#comment-16908822
 ] 

Szilard Nemeth commented on YARN-9217:
--

Hi [~pbacsko]!
Some comments on the latest patch: 
1. 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuDiscoverer#getGpuDeviceInformation:
 You added IOException to the javadoc and to the signature of the method, but 
this kind of exception is apparently never thrown. Could you please verify this? 
2. There's a message: 

{code:java}
LOG.warn("Cannot properly parse GPU device numbers: {}", device);
{code}
I think this should be rephrased to "Cannot parse...", as the word "properly" 
is superfluous.

3. 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuDiscoverer#parseGpuDevice:
 Parameters "device" and "allowedDevicesStr" are unused. Please check this 
code part!
4. In 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuResourcePlugin#getNMResourceInfo:
 the catch blocks contain the same code for YarnException and IOException. 
Please use a multi-catch (see the sketch after this list).
5. For 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.NvidiaBinaryHelper#MAX_EXEC_TIMEOUT_MS:
 Please use javadoc instead of a simple comment above it.
6. 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.NvidiaBinaryHelper#getGpuDeviceInformation
 can be package-private instead of public.
7. In 
org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.NvidiaBinaryHelper#getGpuDeviceInformation:
 We could use the if (pathOfGpuBinary == null) form for the null check.
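
For point 4, a minimal multi-catch sketch (the surrounding method and 
interface are hypothetical, only there to make the snippet self-contained):

{code:java}
import java.io.IOException;

import org.apache.hadoop.yarn.exceptions.YarnException;

// Sketch for point 4: one multi-catch handler replaces two identical
// catch blocks (the fetch() call and the fallback are illustrative).
class MultiCatchSketch {
  interface GpuInfoSource {
    Object fetch() throws YarnException, IOException;
  }

  Object getResourceInfoSafely(GpuInfoSource source) {
    try {
      return source.fetch();
    } catch (YarnException | IOException e) {
      return null; // same fallback for both exception types
    }
  }
}
{code}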


> Nodemanager will fail to start if GPU is misconfigured on the node or GPU 
> drivers missing
> -
>
> Key: YARN-9217
> URL: https://issues.apache.org/jira/browse/YARN-9217
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.0.0, 3.1.0
>Reporter: Antal Bálint Steinbach
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9217.001.patch, YARN-9217.002.patch, 
> YARN-9217.003.patch, YARN-9217.004.patch, YARN-9217.005.patch, 
> YARN-9217.006.patch, YARN-9217.007.patch, YARN-9217.008.patch, 
> YARN-9217.009.patch, YARN-9217.010.patch, YARN-9217.011.patch
>
>
> The NodeManager will not start:
> 1. If auto-discovery is enabled:
>  * if the nvidia-smi path is misconfigured or the file does not exist
>  * if 0 GPUs are found
>  * if the file exists but does not point to an nvidia-smi binary
>  * if the binary is fine but an IOException occurs
> 2. If the manually configured GPU devices are misconfigured:
>  * any index:minor number format failure will cause a problem
>  * 0 configured devices will cause a problem
>  * an unhandled NumberFormatException is thrown
> It would be better to log warnings about the configuration, set 0 available 
> GPUs, and let the node keep running non-GPU jobs (see the sketch below).
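
The suggested graceful degradation could look roughly like this (a minimal, 
hypothetical sketch; the method and variable names are assumptions, not 
actual NodeManager code):

{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hedged sketch: on a GPU configuration/parsing failure, warn and fall
// back to 0 usable GPUs instead of failing NodeManager startup.
class GpuConfigSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(GpuConfigSketch.class);

  List<String> parseAllowedGpuDevices(String allowedDevicesStr) {
    try {
      List<String> devices = new ArrayList<>();
      for (String item : allowedDevicesStr.split(",")) {
        String[] parts = item.trim().split(":");
        if (parts.length != 2) {
          throw new NumberFormatException("Illegal index:minor pair: " + item);
        }
        Integer.parseInt(parts[0]);  // throws NumberFormatException if bad
        Integer.parseInt(parts[1]);
        devices.add(item.trim());
      }
      return devices;
    } catch (NumberFormatException e) {
      LOG.warn("Invalid GPU configuration '{}'; assuming 0 usable GPUs so "
          + "the node can still run non-GPU jobs", allowedDevicesStr, e);
      return Collections.emptyList();
    }
  }
}
{code}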



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9749) TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk

2019-08-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908815#comment-16908815
 ] 

Hudson commented on YARN-9749:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17135 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17135/])
YARN-9749. TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk. 
(snemeth: rev 2a05e0ff3b5ab3be8654e9e96c6556865ef26096)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java


> TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk
> 
>
> Key: YARN-9749
> URL: https://issues.apache.org/jira/browse/YARN-9749
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation, test
>Reporter: Peter Bacsko
>Assignee: Adam Antal
>Priority: Major
> Fix For: 3.3.0, 3.2.1
>
> Attachments: YARN-9749.001.patch
>
>
> TestAppLogAggregatorImpl#testDFSQuotaExceeded currently fails on trunk. It 
> was most likely introduced by YARN-9676 (after resetting HEAD to the 
> previous commit, the test passes again).
> {noformat}
> [INFO] Running 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.781 
> s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl
> [ERROR] 
> testDFSQuotaExceeded(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl)
>   Time elapsed: 2.361 s  <<< FAILURE!
> java.lang.AssertionError: The set of paths for deletion are not the same as 
> expected: actual size: 0 vs expected size: 1
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.verifyFilesToDelete(TestAppLogAggregatorImpl.java:344)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.access$000(TestAppLogAggregatorImpl.java:82)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:330)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:319)
>   at 
> org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:39)
>   at 
> org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:96)
>   at 
> org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29)
>   at 
> org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:35)
>   at 
> org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:61)
>   at 
> org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:49)
>   at 
> org.mockito.internal.creation.bytebuddy.MockMethodInterceptor$DispatcherDefaultingToRealMethod.interceptSuperCallable(MockMethodInterceptor.java:108)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.DeletionService$MockitoMock$1879282050.delete(Unknown
>  Source)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregationPostCleanUp(AppLogAggregatorImpl.java:556)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:476)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.testDFSQuotaExceeded(TestAppLogAggregatorImpl.java:469)
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908816#comment-16908816
 ] 

Hudson commented on YARN-9100:
--

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17135 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17135/])
YARN-9100. Add tests for GpuResourceAllocator and do minor code cleanup. 
(snemeth: rev 2216ec54e58e24ff09620fc2efa2f1733391d0c3)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/gpu/GpuResourceAllocator.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/gpu/TestGpuResourceHandlerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/gpu/GpuResourceHandlerImpl.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/resources/gpu/TestGpuResourceAllocator.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/resourceplugin/gpu/GpuResourcePlugin.java


> Add tests for GpuResourceAllocator and do minor code cleanup
> 
>
> Key: YARN-9100
> URL: https://issues.apache.org/jira/browse/YARN-9100
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9100-004.patch, YARN-9100-005.patch, 
> YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, 
> YARN-9100-009.patch, YARN-9100.001.patch, YARN-9100.002.patch, 
> YARN-9100.003.patch, YARN-9100.branch-3.1.001.patch, 
> YARN-9100.branch-3.2.001.patch
>
>
> Add tests for GpuResourceAllocator and do minor code cleanup
> - Improved log and exception messages
> - Added some new debug logs
> - Some methods are named like *Copy; these return copies of internal data 
> structures. The word "copy" is just noise in their names, so they have been 
> renamed. Additionally, the copied data structures were made immutable.
> - The waiting loop in the assignGpus method was decoupled into a new class, 
> RetryCommand. 
> A few more words about the new class RetryCommand: 
> There are similar waiting loops in AMRMClient, AMRMClientAsync, and even in 
> GenericTestUtils (see the waitFor method). RetryCommand could eventually 
> replace this duplicated code, as it solves the waiting-loop problem in a 
> generic way.
> The only downside of using RetryCommand in GpuResourceAllocator 
> (startGpuAssignmentLoop) is the ugly exception handling, but that's solely 
> because of how Java deals with checked exceptions vs. lambdas. If there's a 
> cleaner way to handle the exceptions, I'm open to suggestions.
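
The generic waiting loop described above might look roughly like this (a 
minimal sketch; RetryCommand's actual API may well differ):

{code:java}
import java.util.concurrent.TimeoutException;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Minimal sketch of a generic retry/wait helper in the spirit of the
// RetryCommand described above; the real class's API may differ.
public final class RetrySketch {
  private RetrySketch() {
  }

  public static <T> T retryUntil(Supplier<T> action, Predicate<T> isDone,
      long maxWaitMs, long intervalMs)
      throws TimeoutException, InterruptedException {
    long deadline = System.currentTimeMillis() + maxWaitMs;
    do {
      T result = action.get();        // run one attempt
      if (isDone.test(result)) {
        return result;                // condition met, stop waiting
      }
      Thread.sleep(intervalMs);       // back off before the next attempt
    } while (System.currentTimeMillis() < deadline);
    throw new TimeoutException("Condition not met within " + maxWaitMs + " ms");
  }
}
{code}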



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9100:
-
Fix Version/s: 3.3.0

> Add tests for GpuResourceAllocator and do minor code cleanup
> 
>
> Key: YARN-9100
> URL: https://issues.apache.org/jira/browse/YARN-9100
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9100-004.patch, YARN-9100-005.patch, 
> YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, 
> YARN-9100-009.patch, YARN-9100.001.patch, YARN-9100.002.patch, 
> YARN-9100.003.patch, YARN-9100.branch-3.1.001.patch, 
> YARN-9100.branch-3.2.001.patch
>
>
> Add tests for GpuResourceAllocator and do minor code cleanup
> - Improved log and exception messages
> - Added some new debug logs
> - Some methods are named like *Copy; these return copies of internal data 
> structures. The word "copy" is just noise in their names, so they have been 
> renamed. Additionally, the copied data structures were made immutable.
> - The waiting loop in the assignGpus method was decoupled into a new class, 
> RetryCommand. 
> A few more words about the new class RetryCommand: 
> There are similar waiting loops in AMRMClient, AMRMClientAsync, and even in 
> GenericTestUtils (see the waitFor method). RetryCommand could eventually 
> replace this duplicated code, as it solves the waiting-loop problem in a 
> generic way.
> The only downside of using RetryCommand in GpuResourceAllocator 
> (startGpuAssignmentLoop) is the ugly exception handling, but that's solely 
> because of how Java deals with checked exceptions vs. lambdas. If there's a 
> cleaner way to handle the exceptions, I'm open to suggestions.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908809#comment-16908809
 ] 

Szilard Nemeth commented on YARN-9100:
--

Hi [~pbacsko]!
Never mind, I resolved the conflicts for 3.2 / 3.1 and uploaded new patches. 

> Add tests for GpuResourceAllocator and do minor code cleanup
> 
>
> Key: YARN-9100
> URL: https://issues.apache.org/jira/browse/YARN-9100
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9100-004.patch, YARN-9100-005.patch, 
> YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, 
> YARN-9100-009.patch, YARN-9100.001.patch, YARN-9100.002.patch, 
> YARN-9100.003.patch, YARN-9100.branch-3.1.001.patch, 
> YARN-9100.branch-3.2.001.patch
>
>
> Add tests for GpuResourceAllocator and do minor code cleanup
> - Improved log and exception messages
> - Added some new debug logs
> - Some methods are named like *Copy; these return copies of internal data 
> structures. The word "copy" is just noise in their names, so they have been 
> renamed. Additionally, the copied data structures were made immutable.
> - The waiting loop in the assignGpus method was decoupled into a new class, 
> RetryCommand. 
> A few more words about the new class RetryCommand: 
> There are similar waiting loops in AMRMClient, AMRMClientAsync, and even in 
> GenericTestUtils (see the waitFor method). RetryCommand could eventually 
> replace this duplicated code, as it solves the waiting-loop problem in a 
> generic way.
> The only downside of using RetryCommand in GpuResourceAllocator 
> (startGpuAssignmentLoop) is the ugly exception handling, but that's solely 
> because of how Java deals with checked exceptions vs. lambdas. If there's a 
> cleaner way to handle the exceptions, I'm open to suggestions.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9100:
-
Attachment: YARN-9100.branch-3.1.001.patch

> Add tests for GpuResourceAllocator and do minor code cleanup
> 
>
> Key: YARN-9100
> URL: https://issues.apache.org/jira/browse/YARN-9100
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9100-004.patch, YARN-9100-005.patch, 
> YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, 
> YARN-9100-009.patch, YARN-9100.001.patch, YARN-9100.002.patch, 
> YARN-9100.003.patch, YARN-9100.branch-3.1.001.patch, 
> YARN-9100.branch-3.2.001.patch
>
>
> Add tests for GpuResourceAllocator and do minor code cleanup
> - Improved log and exception messages
> - Added some new debug logs
> - Some methods are named like *Copy; these return copies of internal data 
> structures. The word "copy" is just noise in their names, so they have been 
> renamed. Additionally, the copied data structures were made immutable.
> - The waiting loop in the assignGpus method was decoupled into a new class, 
> RetryCommand. 
> A few more words about the new class RetryCommand: 
> There are similar waiting loops in AMRMClient, AMRMClientAsync, and even in 
> GenericTestUtils (see the waitFor method). RetryCommand could eventually 
> replace this duplicated code, as it solves the waiting-loop problem in a 
> generic way.
> The only downside of using RetryCommand in GpuResourceAllocator 
> (startGpuAssignmentLoop) is the ugly exception handling, but that's solely 
> because of how Java deals with checked exceptions vs. lambdas. If there's a 
> cleaner way to handle the exceptions, I'm open to suggestions.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Szilard Nemeth (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szilard Nemeth updated YARN-9100:
-
Attachment: YARN-9100.branch-3.2.001.patch

> Add tests for GpuResourceAllocator and do minor code cleanup
> 
>
> Key: YARN-9100
> URL: https://issues.apache.org/jira/browse/YARN-9100
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9100-004.patch, YARN-9100-005.patch, 
> YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, 
> YARN-9100-009.patch, YARN-9100.001.patch, YARN-9100.002.patch, 
> YARN-9100.003.patch, YARN-9100.branch-3.2.001.patch
>
>
> Add tests for GpuResourceAllocator and do minor code cleanup
> - Improved log and exception messages
> - Added some new debug logs
> - Some methods are named like *Copy; these return copies of internal data 
> structures. The word "copy" is just noise in their names, so they have been 
> renamed. Additionally, the copied data structures were made immutable.
> - The waiting loop in the assignGpus method was decoupled into a new class, 
> RetryCommand. 
> A few more words about the new class RetryCommand: 
> There are similar waiting loops in AMRMClient, AMRMClientAsync, and even in 
> GenericTestUtils (see the waitFor method). RetryCommand could eventually 
> replace this duplicated code, as it solves the waiting-loop problem in a 
> generic way.
> The only downside of using RetryCommand in GpuResourceAllocator 
> (startGpuAssignmentLoop) is the ugly exception handling, but that's solely 
> because of how Java deals with checked exceptions vs. lambdas. If there's a 
> cleaner way to handle the exceptions, I'm open to suggestions.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9100) Add tests for GpuResourceAllocator and do minor code cleanup

2019-08-16 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908802#comment-16908802
 ] 

Szilard Nemeth commented on YARN-9100:
--

Hi [~pbacsko]!
+1 on the latest patch!
Committing this to trunk shortly!
Are you planning to add branch-3.2 / branch-3.1 patches as well? 

> Add tests for GpuResourceAllocator and do minor code cleanup
> 
>
> Key: YARN-9100
> URL: https://issues.apache.org/jira/browse/YARN-9100
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Szilard Nemeth
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9100-004.patch, YARN-9100-005.patch, 
> YARN-9100-006.patch, YARN-9100-007.patch, YARN-9100-008.patch, 
> YARN-9100-009.patch, YARN-9100.001.patch, YARN-9100.002.patch, 
> YARN-9100.003.patch
>
>
> Add tests for GpuResourceAllocator and do minor code cleanup
> - Improved log and exception messages
> - Added some new debug logs
> - Some methods are named like *Copy; these return copies of internal data 
> structures. The word "copy" is just noise in their names, so they have been 
> renamed. Additionally, the copied data structures were made immutable.
> - The waiting loop in the assignGpus method was decoupled into a new class, 
> RetryCommand. 
> A few more words about the new class RetryCommand: 
> There are similar waiting loops in AMRMClient, AMRMClientAsync, and even in 
> GenericTestUtils (see the waitFor method). RetryCommand could eventually 
> replace this duplicated code, as it solves the waiting-loop problem in a 
> generic way.
> The only downside of using RetryCommand in GpuResourceAllocator 
> (startGpuAssignmentLoop) is the ugly exception handling, but that's solely 
> because of how Java deals with checked exceptions vs. lambdas. If there's a 
> cleaner way to handle the exceptions, I'm open to suggestions.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-9750) TestAppLogAggregatorImpl.verifyFilesToDelete fails

2019-08-16 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-9750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph resolved YARN-9750.
-
Resolution: Duplicate

> TestAppLogAggregatorImpl.verifyFilesToDelete fails
> --
>
> Key: YARN-9750
> URL: https://issues.apache.org/jira/browse/YARN-9750
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation, test
>Affects Versions: 3.3.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
>
> *TestAppLogAggregatorImpl.verifyFilesToDelete fails*
> {code}
> [ERROR] 
> testDFSQuotaExceeded(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl)
>   Time elapsed: 2.252 s  <<< FAILURE!
> java.lang.AssertionError: The set of paths for deletion are not the same as 
> expected: actual size: 0 vs expected size: 1
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.verifyFilesToDelete(TestAppLogAggregatorImpl.java:344)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.access$000(TestAppLogAggregatorImpl.java:82)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:330)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:319)
>   at 
> org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:39)
>   at 
> org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:96)
>   at 
> org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29)
>   at 
> org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:35)
>   at 
> org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:61)
>   at 
> org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:49)
>   at 
> org.mockito.internal.creation.bytebuddy.MockMethodInterceptor$DispatcherDefaultingToRealMethod.interceptSuperCallable(MockMethodInterceptor.java:108)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.DeletionService$MockitoMock$1136724178.delete(Unknown
>  Source)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregationPostCleanUp(AppLogAggregatorImpl.java:556)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:476)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.testDFSQuotaExceeded(TestAppLogAggregatorImpl.java:469)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> 

[jira] [Commented] (YARN-9749) TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk

2019-08-16 Thread Szilard Nemeth (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908792#comment-16908792
 ] 

Szilard Nemeth commented on YARN-9749:
--

Thanks [~adam.antal] for this fix, committed to trunk and to branch-3.2.
Thanks [~pbacsko] for the review!

> TestAppLogAggregatorImpl#testDFSQuotaExceeded fails on trunk
> 
>
> Key: YARN-9749
> URL: https://issues.apache.org/jira/browse/YARN-9749
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation, test
>Reporter: Peter Bacsko
>Assignee: Adam Antal
>Priority: Major
> Attachments: YARN-9749.001.patch
>
>
> TestAppLogAggregatorImpl#testDFSQuotaExceeded currently fails on trunk. It 
> was most likely introduced by YARN-9676 (after resetting HEAD to the 
> previous commit, the test passes again).
> {noformat}
> [INFO] Running 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl
> [ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.781 
> s <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl
> [ERROR] 
> testDFSQuotaExceeded(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl)
>   Time elapsed: 2.361 s  <<< FAILURE!
> java.lang.AssertionError: The set of paths for deletion are not the same as 
> expected: actual size: 0 vs expected size: 1
>   at org.junit.Assert.fail(Assert.java:88)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.verifyFilesToDelete(TestAppLogAggregatorImpl.java:344)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.access$000(TestAppLogAggregatorImpl.java:82)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:330)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl$1.answer(TestAppLogAggregatorImpl.java:319)
>   at 
> org.mockito.internal.stubbing.StubbedInvocationMatcher.answer(StubbedInvocationMatcher.java:39)
>   at 
> org.mockito.internal.handler.MockHandlerImpl.handle(MockHandlerImpl.java:96)
>   at 
> org.mockito.internal.handler.NullResultGuardian.handle(NullResultGuardian.java:29)
>   at 
> org.mockito.internal.handler.InvocationNotifierHandler.handle(InvocationNotifierHandler.java:35)
>   at 
> org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:61)
>   at 
> org.mockito.internal.creation.bytebuddy.MockMethodInterceptor.doIntercept(MockMethodInterceptor.java:49)
>   at 
> org.mockito.internal.creation.bytebuddy.MockMethodInterceptor$DispatcherDefaultingToRealMethod.interceptSuperCallable(MockMethodInterceptor.java:108)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.DeletionService$MockitoMock$1879282050.delete(Unknown
>  Source)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregationPostCleanUp(AppLogAggregatorImpl.java:556)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:476)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestAppLogAggregatorImpl.testDFSQuotaExceeded(TestAppLogAggregatorImpl.java:469)
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5857) TestLogAggregationService.testFixedSizeThreadPool fails intermittently on trunk

2019-08-16 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908791#comment-16908791
 ] 

Bibin A Chundatt commented on YARN-5857:


Thank you [~BilwaST] for updating the patch.

+1 LGTM. Will wait for [~adam.antal]'s comments too.

> TestLogAggregationService.testFixedSizeThreadPool fails intermittently on 
> trunk
> ---
>
> Key: YARN-5857
> URL: https://issues.apache.org/jira/browse/YARN-5857
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Varun Saxena
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-5857-001.patch, YARN-5857-002.patch, 
> testFixedSizeThreadPool failure reproduction
>
>
> {noformat}
> testFixedSizeThreadPool(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService)
>   Time elapsed: 0.11 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<3> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testFixedSizeThreadPool(TestLogAggregationService.java:1139)
> {noformat}
> Refer to https://builds.apache.org/job/PreCommit-YARN-Build/13829/testReport/



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5857) TestLogAggregationService.testFixedSizeThreadPool fails intermittently on trunk

2019-08-16 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-5857:

Attachment: YARN-5857-002.patch

> TestLogAggregationService.testFixedSizeThreadPool fails intermittently on 
> trunk
> ---
>
> Key: YARN-5857
> URL: https://issues.apache.org/jira/browse/YARN-5857
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Varun Saxena
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-5857-001.patch, YARN-5857-002.patch, 
> testFixedSizeThreadPool failure reproduction
>
>
> {noformat}
> testFixedSizeThreadPool(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService)
>   Time elapsed: 0.11 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<3> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testFixedSizeThreadPool(TestLogAggregationService.java:1139)
> {noformat}
> Refer to https://builds.apache.org/job/PreCommit-YARN-Build/13829/testReport/



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2599) Standby RM should also expose some jmx and metrics

2019-08-16 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-2599:
-
Release Note: YARN /jmx URL endpoints will be accessible per ResourceManager 
process. Hence there will no longer be any redirection to the active 
ResourceManager while accessing /jmx endpoints.
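
For example, once this change is in, each RM process should serve /jmx 
itself, standby included (the host names below are placeholders; 8088 is the 
default RM web port):

{noformat}
curl http://rm1.example.com:8088/jmx
curl http://rm2.example.com:8088/jmx
{noformat}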

> Standby RM should also expose some jmx and metrics
> --
>
> Key: YARN-2599
> URL: https://issues.apache.org/jira/browse/YARN-2599
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.5.1, 2.7.3, 3.0.0-alpha1
>Reporter: Karthik Kambatla
>Assignee: Rohith Sharma K S
>Priority: Major
> Attachments: YARN-2599.002.patch, YARN-2599.patch
>
>
> YARN-1898 redirects jmx and metrics to the Active. As discussed there, we 
> need to separate out metrics displayed so the Standby RM can also be 
> monitored. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5857) TestLogAggregationService.testFixedSizeThreadPool fails intermittently on trunk

2019-08-16 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-5857:

Attachment: (was: YARN-5857-002.patch)

> TestLogAggregationService.testFixedSizeThreadPool fails intermittently on 
> trunk
> ---
>
> Key: YARN-5857
> URL: https://issues.apache.org/jira/browse/YARN-5857
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Varun Saxena
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-5857-001.patch, testFixedSizeThreadPool failure 
> reproduction
>
>
> {noformat}
> testFixedSizeThreadPool(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService)
>   Time elapsed: 0.11 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<3> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testFixedSizeThreadPool(TestLogAggregationService.java:1139)
> {noformat}
> Refer to https://builds.apache.org/job/PreCommit-YARN-Build/13829/testReport/



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2599) Standby RM should also expose some jmx and metrics

2019-08-16 Thread Sunil Govindan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908779#comment-16908779
 ] 

Sunil Govindan commented on YARN-2599:
--

cc [~cheersyang] A new patch has been added.

> Standby RM should also expose some jmx and metrics
> --
>
> Key: YARN-2599
> URL: https://issues.apache.org/jira/browse/YARN-2599
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.5.1, 2.7.3, 3.0.0-alpha1
>Reporter: Karthik Kambatla
>Assignee: Rohith Sharma K S
>Priority: Major
> Attachments: YARN-2599.002.patch, YARN-2599.patch
>
>
> YARN-1898 redirects jmx and metrics to the Active. As discussed there, we 
> need to separate out metrics displayed so the Standby RM can also be 
> monitored. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2599) Standby RM should also expose some jmx and metrics

2019-08-16 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-2599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan updated YARN-2599:
-
Attachment: YARN-2599.002.patch

> Standby RM should also expose some jmx and metrics
> --
>
> Key: YARN-2599
> URL: https://issues.apache.org/jira/browse/YARN-2599
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Affects Versions: 2.5.1, 2.7.3, 3.0.0-alpha1
>Reporter: Karthik Kambatla
>Assignee: Rohith Sharma K S
>Priority: Major
> Attachments: YARN-2599.002.patch, YARN-2599.patch
>
>
> YARN-1898 redirects jmx and metrics to the Active. As discussed there, we 
> need to separate out metrics displayed so the Standby RM can also be 
> monitored. 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5857) TestLogAggregationService.testFixedSizeThreadPool fails intermittently on trunk

2019-08-16 Thread Bilwa S T (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16908771#comment-16908771
 ] 

Bilwa S T commented on YARN-5857:
-

Hi [~adam.antal], latch.countDown() should be called before tryLock, as 
tryLock retries for 35 seconds. 
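
A hedged illustration of that ordering (the variable names are my 
assumptions based on the test, not the actual patch code):

{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Signal the waiting test thread before blocking in tryLock; otherwise
// the signal can be delayed by tryLock's full 35-second retry window.
class AggregationOrderingSketch {
  void aggregatorStub(CountDownLatch latch, ReentrantLock lock)
      throws InterruptedException {
    latch.countDown();                    // let the test proceed first...
    lock.tryLock(35, TimeUnit.SECONDS);   // ...then block in the long wait
  }
}
{code}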

> TestLogAggregationService.testFixedSizeThreadPool fails intermittently on 
> trunk
> ---
>
> Key: YARN-5857
> URL: https://issues.apache.org/jira/browse/YARN-5857
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Varun Saxena
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-5857-001.patch, YARN-5857-002.patch, 
> testFixedSizeThreadPool failure reproduction
>
>
> {noformat}
> testFixedSizeThreadPool(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService)
>   Time elapsed: 0.11 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<3> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testFixedSizeThreadPool(TestLogAggregationService.java:1139)
> {noformat}
> Refer to https://builds.apache.org/job/PreCommit-YARN-Build/13829/testReport/



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5857) TestLogAggregationService.testFixedSizeThreadPool fails intermittently on trunk

2019-08-16 Thread Bilwa S T (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-5857:

Attachment: YARN-5857-002.patch

> TestLogAggregationService.testFixedSizeThreadPool fails intermittently on 
> trunk
> ---
>
> Key: YARN-5857
> URL: https://issues.apache.org/jira/browse/YARN-5857
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Varun Saxena
>Assignee: Bilwa S T
>Priority: Minor
> Attachments: YARN-5857-001.patch, YARN-5857-002.patch, 
> testFixedSizeThreadPool failure reproduction
>
>
> {noformat}
> testFixedSizeThreadPool(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService)
>   Time elapsed: 0.11 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<3> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.junit.Assert.assertEquals(Assert.java:542)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testFixedSizeThreadPool(TestLogAggregationService.java:1139)
> {noformat}
> Refer to https://builds.apache.org/job/PreCommit-YARN-Build/13829/testReport/



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org