[jira] [Commented] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-04-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076920#comment-17076920
 ] 

Hadoop QA commented on YARN-10209:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} docker {color} | {color:blue}  0m 
10s{color} | {color:blue} Dockerfile 
'/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/sourcedir/dev-support/docker/Dockerfile'
 not found, falling back to built-in. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 12m  
1s{color} | {color:red} Docker failed to build yetus/hadoop:date2020-04-07. 
{color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-10209 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12999193/YARN-10209.branch-2.6.0.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25818/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 2.6.0
>
> Attachments: YARN-10209.001.patch, YARN-10209.branch-2.6.0.patch
>
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. Making the Timeline Client initialization 
> conditional would let the distributed shell run at least with the Timeline 
> Service disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-04-06 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10209:
-
Attachment: YARN-10209.branch-2.6.0.patch

> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 2.6.0
>
> Attachments: YARN-10209.001.patch, YARN-10209.branch-2.6.0.patch
>
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. Making the Timeline Client initialization 
> conditional would let the distributed shell run at least with the Timeline 
> Service disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10208) Add metric in CapacityScheduler for evaluating the time difference between node heartbeats

2020-04-06 Thread Bibin Chundatt (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076865#comment-17076865
 ] 

Bibin Chundatt commented on YARN-10208:
---

Thank you [~adam.antal] for the additional review. I will wait for a day before 
committing.

> Add metric in CapacityScheduler for evaluating the time difference between 
> node heartbeats
> --
>
> Key: YARN-10208
> URL: https://issues.apache.org/jira/browse/YARN-10208
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Pranjal Protim Borah
>Assignee: Pranjal Protim Borah
>Priority: Minor
> Attachments: YARN-10208.001.patch, YARN-10208.002.patch, 
> YARN-10208.003.patch, YARN-10208.004.patch, YARN-10208.005.patch
>
>
> Add a metric measuring the average time interval between node heartbeats in 
> the capacity scheduler on node update events.
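A minimal sketch of the kind of metric described in the quoted summary above 
(class and method names here are hypothetical, not taken from the attached 
patches):
{code:java}
// Hypothetical sketch: maintain a running mean of the interval between
// consecutive node heartbeats, updated on each node-update event.
class HeartbeatIntervalMetric {
  private long lastHeartbeatMs = -1;
  private long sampleCount = 0;
  private double meanIntervalMs = 0.0;

  // Invoked from the scheduler's NODE_UPDATE handling.
  synchronized void onNodeUpdate(long nowMs) {
    if (lastHeartbeatMs >= 0) {
      long intervalMs = nowMs - lastHeartbeatMs;
      sampleCount++;
      // Incremental mean: no need to store every sample.
      meanIntervalMs += (intervalMs - meanIntervalMs) / sampleCount;
    }
    lastHeartbeatMs = nowMs;
  }

  synchronized double getMeanIntervalMs() {
    return meanIntervalMs;
  }
}
{code}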



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-2710) RM HA tests failed intermittently on trunk

2020-04-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076819#comment-17076819
 ] 

Hadoop QA commented on YARN-2710:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m 
32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
50s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 28s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
39s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
20s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 14s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The patch generated 1 new + 
13 unchanged - 1 fixed = 14 total (was 14) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 32s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 81m 22s{color} 
| {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}157m  7s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.TestApplicationClientProtocolOnHA |
|   | hadoop.yarn.client.cli.TestSchedConfCLI |
|   | hadoop.yarn.client.TestResourceTrackerOnHA |
|   | hadoop.yarn.client.TestApplicationMasterServiceProtocolOnHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.8 Server=19.03.8 Image:yetus/hadoop:11aff6c269f |
| JIRA Issue | YARN-2710 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12999176/YARN-2710-branch-3.2.003.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 15b489061366 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.2 / 11aff6c |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/25817/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
| unit | 

[jira] [Commented] (YARN-10219) YARN service placement constraints is broken

2020-04-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076785#comment-17076785
 ] 

Hadoop QA commented on YARN-10219:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  
9s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
20m 15s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 
21s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 57s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 4 new + 40 unchanged - 0 fixed = 44 total (was 40) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
32s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
22s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
48s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}119m  8s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.8 Server=19.03.8 Image:yetus/hadoop:e6455cc864d |
| JIRA Issue | YARN-10219 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12999174/YARN-10219.003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| 

[jira] [Updated] (YARN-2710) RM HA tests failed intermittently on trunk

2020-04-06 Thread Eric Badger (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-2710:
--
Attachment: (was: YARN-2710-branch-3.2.003.patch)

> RM HA tests failed intermittently on trunk
> --
>
> Key: YARN-2710
> URL: https://issues.apache.org/jira/browse/YARN-2710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
> Environment: Java 8, jenkins
>Reporter: Wangda Tan
>Assignee: Ahmed Hussein
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: TestResourceTrackerOnHA-output.2.txt, 
> YARN-2710-branch-2.10.001.patch, YARN-2710-branch-2.10.002.patch, 
> YARN-2710-branch-2.10.003.patch, YARN-2710-branch-3.2.003.patch, 
> YARN-2710.001.patch, YARN-2710.002.patch, YARN-2710.003.patch, 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt
>
>
> Failures like the following can happen in TestApplicationClientProtocolOnHA, 
> TestResourceTrackerOnHA, etc.
> {code}
> org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
> testGetApplicationAttemptsOnHA(org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA)
>   Time elapsed: 9.491 sec  <<< ERROR!
> java.net.ConnectException: Call From asf905.gq1.ygridcore.net/67.195.81.149 
> to asf905.gq1.ygridcore.net:28032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
>   at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1438)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy17.getApplicationAttempts(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationAttempts(ApplicationClientProtocolPBClientImpl.java:372)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
>   at com.sun.proxy.$Proxy18.getApplicationAttempts(Unknown Source)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationAttempts(YarnClientImpl.java:583)
>   at 
> org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA.testGetApplicationAttemptsOnHA(TestApplicationClientProtocolOnHA.java:137)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2710) RM HA tests failed intermittently on trunk

2020-04-06 Thread Eric Badger (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-2710:
--
Attachment: YARN-2710-branch-3.2.003.patch

> RM HA tests failed intermittently on trunk
> --
>
> Key: YARN-2710
> URL: https://issues.apache.org/jira/browse/YARN-2710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
> Environment: Java 8, jenkins
>Reporter: Wangda Tan
>Assignee: Ahmed Hussein
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: TestResourceTrackerOnHA-output.2.txt, 
> YARN-2710-branch-2.10.001.patch, YARN-2710-branch-2.10.002.patch, 
> YARN-2710-branch-2.10.003.patch, YARN-2710-branch-3.2.003.patch, 
> YARN-2710-branch-3.2.003.patch, YARN-2710.001.patch, YARN-2710.002.patch, 
> YARN-2710.003.patch, 
> org.apache.hadoop.yarn.client.TestResourceTrackerOnHA-output.txt
>
>
> Failures like the following can happen in TestApplicationClientProtocolOnHA, 
> TestResourceTrackerOnHA, etc.
> {code}
> org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA
> testGetApplicationAttemptsOnHA(org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA)
>   Time elapsed: 9.491 sec  <<< ERROR!
> java.net.ConnectException: Call From asf905.gq1.ygridcore.net/67.195.81.149 
> to asf905.gq1.ygridcore.net:28032 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
>   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>   at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>   at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
>   at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
>   at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
>   at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1438)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy17.getApplicationAttempts(Unknown Source)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationAttempts(ApplicationClientProtocolPBClientImpl.java:372)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
>   at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:101)
>   at com.sun.proxy.$Proxy18.getApplicationAttempts(Unknown Source)
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationAttempts(YarnClientImpl.java:583)
>   at 
> org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA.testGetApplicationAttemptsOnHA(TestApplicationClientProtocolOnHA.java:137)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10219) YARN service placement constraints is broken

2020-04-06 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076724#comment-17076724
 ] 

Eric Yang commented on YARN-10219:
--

I am unsure why the unit test failed for patch 002; it only fixed the 
checkstyle issue from patch 001. The exact anti-affinity test passes in my 
cluster environment, and I am unable to get it to fail locally. I suspect the 
dynamic detection of the number of vcores to use per node manager in the 
Jenkins environment differs from my laptop. My laptop is saturated at 4 CPU 
cores, which may prevent additional containers from starting and allow the 
test case to pass. I resubmitted patch 002 as patch 003 for a retest. If this 
fails again, I will add a vcore restriction to this test case (see the sketch 
below) to prevent failures on more powerful hardware.
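
A minimal sketch of the kind of vcore restriction meant here, using the 
standard NodeManager resource setting (this exact change is not part of any 
attached patch):
{code:java}
// Hypothetical test setup: cap the vcores each NodeManager advertises so
// the scheduler saturates on large Jenkins hosts the same way it does on
// a 4-core laptop. YarnConfiguration.NM_VCORES maps to
// yarn.nodemanager.resource.cpu-vcores.
YarnConfiguration conf = new YarnConfiguration();
conf.setInt(YarnConfiguration.NM_VCORES, 4);
{code}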

> YARN service placement constraints is broken
> 
>
> Key: YARN-10219
> URL: https://issues.apache.org/jira/browse/YARN-10219
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.2.0, 3.1.1, 3.1.2, 3.3.0, 3.2.1, 3.1.3
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-10219.001.patch, YARN-10219.002.patch, 
> YARN-10219.003.patch
>
>
> YARN service placement constraints do not work with node labels or node 
> attributes. Example of placement constraints: 
> {code} 
>   "placement_policy": {
> "constraints": [
>   {
> "type": "AFFINITY",
> "scope": "NODE",
> "node_attributes": {
>   "label":["genfile"]
> },
> "target_tags": [
>   "ping"
> ] 
>   }
> ]
>   },
> {code}
> Node attribute added: 
> {code} ./bin/yarn nodeattributes -add "host-3.example.com:label=genfile" 
> {code} 
> Scheduling activities shows: 
> {code}  Node does not match partition or placement constraints, 
> unsatisfied PC expression="in,node,ping", target-type=ALLOCATION_TAG 
> 
>  1
>  host-3.example.com:45454{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10219) YARN service placement constraints is broken

2020-04-06 Thread Eric Yang (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Yang updated YARN-10219:
-
Attachment: YARN-10219.003.patch

> YARN service placement constraints is broken
> 
>
> Key: YARN-10219
> URL: https://issues.apache.org/jira/browse/YARN-10219
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.2.0, 3.1.1, 3.1.2, 3.3.0, 3.2.1, 3.1.3
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-10219.001.patch, YARN-10219.002.patch, 
> YARN-10219.003.patch
>
>
> YARN service placement constraints do not work with node labels or node 
> attributes. Example of placement constraints: 
> {code} 
>   "placement_policy": {
> "constraints": [
>   {
> "type": "AFFINITY",
> "scope": "NODE",
> "node_attributes": {
>   "label":["genfile"]
> },
> "target_tags": [
>   "ping"
> ] 
>   }
> ]
>   },
> {code}
> Node attribute added: 
> {code} ./bin/yarn nodeattributes -add "host-3.example.com:label=genfile" 
> {code} 
> Scheduling activities shows: 
> {code}  Node does not match partition or placement constraints, 
> unsatisfied PC expression="in,node,ping", target-type=ALLOCATION_TAG 
> 
>  1
>  host-3.example.com:45454{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-04-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076532#comment-17076532
 ] 

Hadoop QA commented on YARN-10209:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} YARN-10209 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-10209 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12999089/YARN-10209.001.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/25815/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 2.6.0
>
> Attachments: YARN-10209.001.patch
>
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. Making the Timeline Client initialization 
> conditional would let the distributed shell run at least with the Timeline 
> Service disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-04-06 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076530#comment-17076530
 ] 

Bilwa S T commented on YARN-10209:
--

Thanks [~bteke] for the clarification. I have uploaded a patch. Please take a 
look.

> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
> Fix For: 2.6.0
>
> Attachments: YARN-10209.001.patch
>
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. Making the Timeline Client initialization 
> conditional would let the distributed shell run at least with the Timeline 
> Service disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-04-06 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10209:
-
Attachment: YARN-10209.001.patch

> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10209.001.patch
>
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. Making the Timeline Client initialization 
> conditional would let the distributed shell run at least with the Timeline 
> Service disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-10151) Disable Capacity Scheduler's move app between queue functionality

2020-04-06 Thread Wangda Tan (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan resolved YARN-10151.
---
Resolution: Won't Fix

Thanks, folks, for commenting about YARN-9838. I don't think we need this 
change now, given that we already have a fix for the reported issue.

> Disable Capacity Scheduler's move app between queue functionality
> -
>
> Key: YARN-10151
> URL: https://issues.apache.org/jira/browse/YARN-10151
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Priority: Critical
>
> Saw this happen in many clusters: the Capacity Scheduler cannot work correctly 
> with the move-app-between-queues feature. It causes weird JMX issues, 
> resource accounting issues, etc. In a lot of cases it leaves the RM 
> completely hung with negative available resources; nothing can be 
> allocated after that. We should turn off the CapacityScheduler's 
> move-app-between-queues feature. (see: 
> {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler#moveApplication}}
>  )



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-04-06 Thread Benjamin Teke (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076416#comment-17076416
 ] 

Benjamin Teke commented on YARN-10209:
--

Hi [~BilwaST],

Thanks for looking at the issue. Sorry, I wasn't clear in the description. The 
ApplicationMaster instantiates the TimelineClient unconditionally. In 
ApplicationMaster#init:
{code:java}
timelineClient = TimelineClient.createTimelineClient();
timelineClient.init(conf);
timelineClient.start();
{code}
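
A conditional initialization, mirroring the guard already used on the 
publishing path, would be a minimal sketch of the proposed fix (the actual 
patch may differ):
{code:java}
// Hedged sketch, not the committed patch: only start the TimelineClient
// when the timeline service is enabled in the configuration.
if (conf.getBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED,
    YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED)) {
  timelineClient = TimelineClient.createTimelineClient();
  timelineClient.init(conf);
  timelineClient.start();
} else {
  timelineClient = null;
  LOG.warn("Timeline service is not enabled; skipping TimelineClient"
      + " initialization");
}
{code}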

> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. Making the Timeline Client initialization 
> conditional would let the distributed shell run at least with the Timeline 
> Service disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10216) Utility to dynamically reload Configuration on the disk

2020-04-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/YARN-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri reassigned YARN-10216:
--

Assignee: Cyrus Jackson

> Utility to dynamically reload Configuration on the disk
> ---
>
> Key: YARN-10216
> URL: https://issues.apache.org/jira/browse/YARN-10216
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Cyrus Jackson
>Assignee: Cyrus Jackson
>Priority: Major
> Attachments: image-2020-04-06-09-50-51-948.png
>
>
> There should be a way to dynamically reload the configuration properties from 
> the disk. The purpose of this feature is to let individual classes that are 
> interested in observing configuration changes be notified when the 
> conf is reloaded from the disk. This is similar to what HBase has done.
> *Class Diagram*
>   !image-2020-04-06-09-50-51-948.png!
>  
> *APPROACH DETAILS* 
> The approach is an adaptation of the HBase Online Configuration. In this 
> case, the configuration file is monitored for any changes on the disk. If the 
> file has changed, the properties of the Configuration are reloaded and all the 
> observers are notified. 
> The classes that implement the observers update the necessary values if 
> required.
>  
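
A minimal sketch of the observer approach described in the quoted description 
above (all class and method names here are hypothetical, not taken from the 
proposal):
{code:java}
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;
import org.apache.hadoop.conf.Configuration;

// Observers implement this to react to a reloaded Configuration.
interface ConfigurationObserver {
  void onConfigurationChange(Configuration newConf);
}

class ReloadingConfigurationManager {
  private final Set<ConfigurationObserver> observers =
      new CopyOnWriteArraySet<>();

  void register(ConfigurationObserver observer) {
    observers.add(observer);
  }

  // Called by a file-watcher thread once it detects that the
  // configuration file changed on disk.
  void notifyOnReload() {
    Configuration newConf = new Configuration(false);
    newConf.addResource("yarn-site.xml"); // re-read from disk
    for (ConfigurationObserver o : observers) {
      o.onConfigurationChange(newConf);
    }
  }
}
{code}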



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-04-06 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076285#comment-17076285
 ] 

Bilwa S T commented on YARN-10209:
--

Hi [~bteke], thanks for reporting this issue. I checked the 2.6.0 
DistributedShell class, and I can see that the TimelineClient is created only 
if the Timeline Service is enabled: 
{code:java}
TimelineClient timelineClient = null;
if (conf.getBoolean(YarnConfiguration.TIMELINE_SERVICE_ENABLED,
YarnConfiguration.DEFAULT_TIMELINE_SERVICE_ENABLED)) {
  timelineClient = TimelineClient.createTimelineClient();
  timelineClient.init(conf);
  timelineClient.start();
} else {
  LOG.warn("Cannot put the domain " + domainId +
  " because the timeline service is not enabled");
  return;
}
{code}

> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. Making the Timeline Client initialization 
> conditional would let the distributed shell run at least with the Timeline 
> Service disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10209) DistributedShell should initialize TimelineClient conditionally

2020-04-06 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10209:


Assignee: Bilwa S T

> DistributedShell should initialize TimelineClient conditionally
> ---
>
> Key: YARN-10209
> URL: https://issues.apache.org/jira/browse/YARN-10209
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Benjamin Teke
>Assignee: Bilwa S T
>Priority: Major
>
> YarnConfiguration was changed along with the introduction of newer Timeline 
> Service versions to include configuration about the version in use. In Hadoop 
> 2.6.0 the distributed shell instantiates the Timeline Client whether or not it 
> is enabled in the configuration. Running this distributed shell on newer 
> Hadoop versions (where the new Timeline Service is available) causes an 
> exception, because the bundled YarnConfiguration doesn't have the necessary 
> version configuration property. Making the Timeline Client initialization 
> conditional would let the distributed shell run at least with the Timeline 
> Service disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10219) YARN service placement constraints is broken

2020-04-06 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076269#comment-17076269
 ] 

Prabhu Joseph commented on YARN-10219:
--

Thanks [~eyang] for the patch. I have started reviewing it and will update.

> YARN service placement constraints is broken
> 
>
> Key: YARN-10219
> URL: https://issues.apache.org/jira/browse/YARN-10219
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.2.0, 3.1.1, 3.1.2, 3.3.0, 3.2.1, 3.1.3
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
> Attachments: YARN-10219.001.patch, YARN-10219.002.patch
>
>
> YARN service placement constraints do not work with node labels or node 
> attributes. Example of placement constraints: 
> {code} 
>   "placement_policy": {
> "constraints": [
>   {
> "type": "AFFINITY",
> "scope": "NODE",
> "node_attributes": {
>   "label":["genfile"]
> },
> "target_tags": [
>   "ping"
> ] 
>   }
> ]
>   },
> {code}
> Node attribute added: 
> {code} ./bin/yarn nodeattributes -add "host-3.example.com:label=genfile" 
> {code} 
> Scheduling activities shows: 
> {code}  Node does not match partition or placement constraints, 
> unsatisfied PC expression="in,node,ping", target-type=ALLOCATION_TAG 
> 
>  1
>  host-3.example.com:45454{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10215) Endpoint for obtaining direct URL for the logs

2020-04-06 Thread Andras Gyori (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076083#comment-17076083
 ] 

Andras Gyori commented on YARN-10215:
-

Thanks [~adam.antal] for the fast review.
The checkstyle issues have been resolved, apart from a too-many-arguments 
issue that I am afraid is out of scope for this patch. The findbugs issue was 
not introduced by this patch, hence I would keep it as is. The unnecessary log 
entries have been removed. 

> Endpoint for obtaining direct URL for the logs
> --
>
> Key: YARN-10215
> URL: https://issues.apache.org/jira/browse/YARN-10215
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Andras Gyori
>Priority: Major
> Attachments: YARN-10025.001.patch, YARN-10025.002.patch, 
> YARN-10025.003.patch
>
>
> If CORS protected UIs are set up, there is an issue when the browser tries to 
> access the logs of a running container in the RM web UIv2.
> Assuming ATS is not up, the browser follows the following call chain:
> - Tries to access ATS, it fails, falls back to JHS
> - From RM the browser received basic app info, we know that the application 
> is running
> - From the JHS we got the list of containers and their log files.
> - When we try to access a specific log file, the JHS redirects the request to 
> the NM's UI (on the node where the container is running). This redirect is 
> performed by the browser automatically. In this setup the host is considered 
> protected information, thus the browser omits the "Origin" header from 
> the request when this redirect is done. The browser then denies access to the 
> NodeManager's web UI due to the CORS header set up for the NM, because the 
> Origin is null in the redirect request. 
> - Finally, a "Logs are unavailable" message is shown in the RM web UIv2 due to 
> the CORS violation.
> We should fix this. As an approach we can expose another endpoint which only 
> returns the URL of the NodeManager, which we should call directly from the 
> UIv2 in order to receive the log (a sketch follows below). This adds a bit of 
> complexity, but will enable users to keep the CORS-protected setup.
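
A minimal sketch of the proposed endpoint (the path, JSON field name, and 
lookup helper are all hypothetical, not the committed API):
{code:java}
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/ws/v1/history")
public class LogLocationResource {

  // Instead of redirecting to the NodeManager, return its log URL so the
  // UIv2 can issue the request itself with a proper Origin header.
  @GET
  @Path("/containers/{containerid}/logs/{filename}/location")
  @Produces(MediaType.APPLICATION_JSON)
  public Response getLogLocation(
      @PathParam("containerid") String containerId,
      @PathParam("filename") String filename) {
    // Assumed helper for illustration only, not a real Hadoop API.
    String nmWebAddress = lookupNodeManagerWebAddress(containerId);
    String url = nmWebAddress + "/node/containerlogs/" + containerId
        + "/" + filename;
    return Response.ok("{\"logUrl\":\"" + url + "\"}").build();
  }

  private String lookupNodeManagerWebAddress(String containerId) {
    throw new UnsupportedOperationException("sketch only");
  }
}
{code}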



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10207) CLOSE_WAIT socket connection leaks during rendering of (corrupted) aggregated logs on the JobHistoryServer Web UI

2020-04-06 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076078#comment-17076078
 ] 

Adam Antal commented on YARN-10207:
---

+1 (non-binding). Thanks for the patch [~sahuja].

> CLOSE_WAIT socket connection leaks during rendering of (corrupted) aggregated 
> logs on the JobHistoryServer Web UI
> -
>
> Key: YARN-10207
> URL: https://issues.apache.org/jira/browse/YARN-10207
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Major
> Attachments: YARN-10207.001.patch, YARN-10207.002.patch, 
> YARN-10207.003.patch, YARN-10207.004.patch
>
>
> File descriptor leaks are observed coming from the JobHistoryServer process 
> while it tries to render a "corrupted" aggregated log on the JHS Web UI.
> The issue was reproduced using the following steps:
> # Ran a sample Hadoop MR Pi job, it had the id - 
> application_1582676649923_0026.
> # Copied an aggregated log file from HDFS to local FS:
> {code}
> hdfs dfs -get 
> /tmp/logs/systest/logs/application_1582676649923_0026/_8041
> {code}
> # Updated the TFile metadata at the bottom of this file with some junk to 
> corrupt the file:
> *Before:*
> {code}
>   
> ^@^GVERSION*(^@_1582676649923_0026_01_03^F^Dnone^A^Pª5²ª5²^C^Qdata:BCFile.index^Dnoneª5þ^M^M^Pdata:TFile.index^Dnoneª5È66^Odata:TFile.meta^Dnoneª5Â^F^F^@^@^@^@^@^B6^K^@^A^@^@Ñ^QÓh<91>µ×¶9ßA@<92>ºáP
> {code}
> *After:*
> {code}
>   
> ^@^GVERSION*(^@_1582676649923_0026_01_03^F^Dnone^A^Pª5²ª5²^C^Qdata:BCFile.index^Dnoneª5þ^M^M^Pdata:TFile.index^Dnoneª5È66^Odata:TFile.meta^Dnoneª5Â^F^F^@^@^@^@^@^B6^K^@^A^@^@Ñ^QÓh<91>µ×¶9ßA@<92>ºáPblah
> {code}
> Notice "blah" (junk) added at the very end.
> # Remove the existing aggregated log file that will need to be replaced by 
> our modified copy from step 3 (otherwise HDFS will refuse to place a file 
> whose name already exists):
> {code}
> hdfs dfs -rm -r -f 
> /tmp/logs/systest/logs/application_1582676649923_0026/_8041
> {code}
> # Upload the corrupted aggregated file back to HDFS:
> {code}
> hdfs dfs -put _8041 
> /tmp/logs/systest/logs/application_1582676649923_0026
> {code}
> # Visit HistoryServer Web UI
> # Click on job_1582676649923_0026
> # Click on "logs" link against the AM (assuming the AM ran on nm_hostname)
> # Review the JHS logs, following exception will be seen:
> {code}
>   2020-03-24 20:03:48,484 ERROR org.apache.hadoop.yarn.webapp.View: Error 
> getting logs for job_1582676649923_0026
>   java.io.IOException: Not a valid BCFile.
>   at 
> org.apache.hadoop.io.file.tfile.BCFile$Magic.readAndVerify(BCFile.java:927)
>   at 
> org.apache.hadoop.io.file.tfile.BCFile$Reader.(BCFile.java:628)
>   at 
> org.apache.hadoop.io.file.tfile.TFile$Reader.(TFile.java:804)
>   at 
> org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogReader.(AggregatedLogFormat.java:588)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.tfile.TFileAggregatedLogsBlock.render(TFileAggregatedLogsBlock.java:111)
>   at 
> org.apache.hadoop.yarn.logaggregation.filecontroller.tfile.LogAggregationTFileController.renderAggregatedLogsBlock(LogAggregationTFileController.java:341)
>   at 
> org.apache.hadoop.yarn.webapp.log.AggregatedLogsBlock.render(AggregatedLogsBlock.java:117)
>   at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69)
>   at 
> org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79)
>   at org.apache.hadoop.yarn.webapp.View.render(View.java:235)
>   at 
> org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49)
>   at 
> org.apache.hadoop.yarn.webapp.hamlet2.HamletImpl$EImp._v(HamletImpl.java:117)
>   at 
> org.apache.hadoop.yarn.webapp.hamlet2.Hamlet$TD.__(Hamlet.java:848)
>   at 
> org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71)
>   at 
> org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82)
>   at 
> org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212)
>   at 
> org.apache.hadoop.mapreduce.v2.hs.webapp.HsController.logs(HsController.java:202)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at