[jira] [Commented] (YARN-10228) Yarn Service fails if am java opts contains ZK authentication file path

2020-05-19 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111783#comment-17111783
 ] 

Bilwa S T commented on YARN-10228:
--

Thanks [~eyang] for reviewing.

> Yarn Service fails if am java opts contains ZK authentication file path
> ---
>
> Key: YARN-10228
> URL: https://issues.apache.org/jira/browse/YARN-10228
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10228.001.patch
>
>
> If I configure
> {code:java}
> yarn.service.am.java.opts=-Xmx768m -Djava.security.auth.login.config=/opt/hadoop/etc/jaas-zk.conf
> {code}
> an invalid character error is printed.
> This is due to the JVM opts validation added in YARN-9718.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2020-05-19 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111778#comment-17111778
 ] 

Akira Ajisaka commented on YARN-9606:
-

+1

bq. But still this old issue is coming. Not sure why

No problem. This error is reported under trunk Compile Tests (i.e. before the 
patch is applied). Hadoop precommit jobs first run some checks on trunk, before 
applying the patch, so that they can calculate the diffs. That is why the 
following output shows "1 fixed".
{noformat}
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common 
generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1)
{noformat}
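
The arithmetic behind that summary line can be sketched as a set difference between the warnings seen before and after the patch. This is only an illustration, not the precommit tooling's actual implementation; the class name, method name and warning identifier below are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

public class WarningDiff {
    /** Builds a summary in the style of the report line quoted above. */
    static String summarize(Set<String> trunk, Set<String> patched) {
        Set<String> added = new HashSet<>(patched);
        added.removeAll(trunk);          // warnings present only after the patch
        Set<String> fixed = new HashSet<>(trunk);
        fixed.removeAll(patched);        // warnings present only before the patch
        Set<String> unchanged = new HashSet<>(trunk);
        unchanged.retainAll(patched);    // warnings present in both runs
        return String.format(
            "generated %d new + %d unchanged - %d fixed = %d total (was %d)",
            added.size(), unchanged.size(), fixed.size(),
            patched.size(), trunk.size());
    }

    public static void main(String[] args) {
        // Hypothetical identifier standing in for the one extant trunk warning.
        Set<String> trunk = new HashSet<>();
        trunk.add("extant-warning-in-hadoop-yarn-server-common");
        Set<String> patched = new HashSet<>(); // the patch removes it
        System.out.println(summarize(trunk, patched));
    }
}
```

With one warning on trunk and none after the patch, the sketch reports 1 fixed and 0 total, matching the shape of the line above.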

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, 
> YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch
>
>
> Yarn logs fails for running containers
> {quote}
> Unable to fetch log files list
> Exception in thread "main" java.io.IOException: com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLHandshakeException: Error while authenticating with endpoint: [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
> {quote}






[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2020-05-19 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111766#comment-17111766
 ] 

Bilwa S T commented on YARN-9606:
-

cc [~aajisaka]

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, 
> YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch
>
>
> Yarn logs fails for running containers
> {quote}
> Unable to fetch log files list
> Exception in thread "main" java.io.IOException: com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLHandshakeException: Error while authenticating with endpoint: [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
> {quote}






[jira] [Commented] (YARN-10228) Yarn Service fails if am java opts contains ZK authentication file path

2020-05-19 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111396#comment-17111396
 ] 

Hadoop QA commented on YARN-10228:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 18s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 19s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 53s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m  7s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 32s{color} | {color:green} hadoop-yarn-services-core in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 80m  2s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-YARN-Build/26044/artifact/out/Dockerfile |
| JIRA Issue | YARN-10228 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13003419/YARN-10228.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 7e2f66591687 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / d4e36409d40 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/26044/testReport/ |
| Max. process+thread count | 777 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 U: 

[jira] [Commented] (YARN-10228) Yarn Service fails if am java opts contains ZK authentication file path

2020-05-19 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111369#comment-17111369
 ] 

Eric Yang commented on YARN-10228:
--

[~BilwaST] Thank you for the patch.  +1 LGTM, pending Jenkins reports.

> Yarn Service fails if am java opts contains ZK authentication file path
> ---
>
> Key: YARN-10228
> URL: https://issues.apache.org/jira/browse/YARN-10228
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10228.001.patch
>
>
> If I configure
> {code:java}
> yarn.service.am.java.opts=-Xmx768m -Djava.security.auth.login.config=/opt/hadoop/etc/jaas-zk.conf
> {code}
> an invalid character error is printed.
> This is due to the JVM opts validation added in YARN-9718.






[jira] [Commented] (YARN-10228) Yarn Service fails if am java opts contains ZK authentication file path

2020-05-19 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111332#comment-17111332
 ] 

Bilwa S T commented on YARN-10228:
--

Thanks [~eyang] for the suggestion. I have uploaded a patch for it. Please check.

> Yarn Service fails if am java opts contains ZK authentication file path
> ---
>
> Key: YARN-10228
> URL: https://issues.apache.org/jira/browse/YARN-10228
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10228.001.patch
>
>
> If I configure
> {code:java}
> yarn.service.am.java.opts=-Xmx768m -Djava.security.auth.login.config=/opt/hadoop/etc/jaas-zk.conf
> {code}
> an invalid character error is printed.
> This is due to the JVM opts validation added in YARN-9718.






[jira] [Updated] (YARN-10228) Yarn Service fails if am java opts contains ZK authentication file path

2020-05-19 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-10228:
-
Attachment: YARN-10228.001.patch

> Yarn Service fails if am java opts contains ZK authentication file path
> ---
>
> Key: YARN-10228
> URL: https://issues.apache.org/jira/browse/YARN-10228
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10228.001.patch
>
>
> If I configure
> {code:java}
> yarn.service.am.java.opts=-Xmx768m -Djava.security.auth.login.config=/opt/hadoop/etc/jaas-zk.conf
> {code}
> an invalid character error is printed.
> This is due to the JVM opts validation added in YARN-9718.






[jira] [Commented] (YARN-10228) Yarn Service fails if am java opts contains ZK authentication file path

2020-05-19 Thread Eric Yang (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111307#comment-17111307
 ] 

Eric Yang commented on YARN-10228:
--

[~BilwaST] I think excessive config management for validating a single 
character is not good usability design and is prone to more mistakes. The "/" 
character is mostly safe, unless file permissions on the file system are 
incorrect. After more thought, I am comfortable allowing the "/" character.
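
To illustrate the effect of allowing "/", here is a minimal sketch of an allow-list check on the JVM opts string. Both patterns are hypothetical and only for illustration; the actual pattern added in YARN-9718 is not shown in this thread, and the class and method names are made up.

```java
import java.util.regex.Pattern;

public class JvmOptsCheck {
    // Hypothetical allow-lists for illustration only; the actual pattern
    // introduced by YARN-9718 is not shown in this thread.
    static final Pattern WITHOUT_SLASH = Pattern.compile("[A-Za-z0-9_:=.\\-\\s]*");
    static final Pattern WITH_SLASH = Pattern.compile("[A-Za-z0-9_:=./\\-\\s]*");

    /** Returns true only if every character of opts is in the allow-list. */
    static boolean isValid(Pattern allowed, String opts) {
        return allowed.matcher(opts).matches();
    }

    public static void main(String[] args) {
        String opts = "-Xmx768m "
            + "-Djava.security.auth.login.config=/opt/hadoop/etc/jaas-zk.conf";
        System.out.println(isValid(WITHOUT_SLASH, opts)); // false: '/' is rejected
        System.out.println(isValid(WITH_SLASH, opts));    // true: '/' is accepted
    }
}
```

In this sketch a single pattern change is enough to accept file paths, with no extra configuration switch, and characters such as ";" are still rejected.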

> Yarn Service fails if am java opts contains ZK authentication file path
> ---
>
> Key: YARN-10228
> URL: https://issues.apache.org/jira/browse/YARN-10228
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>
> If I configure
> {code:java}
> yarn.service.am.java.opts=-Xmx768m -Djava.security.auth.login.config=/opt/hadoop/etc/jaas-zk.conf
> {code}
> an invalid character error is printed.
> This is due to the JVM opts validation added in YARN-9718.






[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2020-05-19 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111260#comment-17111260
 ] 

Bilwa S T commented on YARN-9606:
-

I have fixed the findbugs issue. But still this old issue is coming. Not sure why

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, 
> YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch
>
>
> Yarn logs fails for running containers
> {quote}
> Unable to fetch log files list
> Exception in thread "main" java.io.IOException: com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLHandshakeException: Error while authenticating with endpoint: [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
> {quote}






[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2020-05-19 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111254#comment-17111254
 ] 

Hadoop QA commented on YARN-9606:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m  4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 22m 44s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 35s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 58s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 30s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common in trunk has 1 extant findbugs warnings. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 45s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 131 unchanged - 2 fixed = 131 total (was 133) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  9s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 45s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 16s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  4s{color} | {color:green} hadoop-yarn-server-router in the patch passed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 41s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  0s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit 

[jira] [Commented] (YARN-10228) Yarn Service fails if am java opts contains ZK authentication file path

2020-05-19 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111251#comment-17111251
 ] 

Bilwa S T commented on YARN-10228:
--

Thank you [~eyang] for looking into this issue.
Can we add a new configuration option that lets us skip this validation? In 
that case we don't have to remove "/" from the pattern. What do you think?

> Yarn Service fails if am java opts contains ZK authentication file path
> ---
>
> Key: YARN-10228
> URL: https://issues.apache.org/jira/browse/YARN-10228
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
>
> If I configure
> {code:java}
> yarn.service.am.java.opts=-Xmx768m -Djava.security.auth.login.config=/opt/hadoop/etc/jaas-zk.conf
> {code}
> an invalid character error is printed.
> This is due to the JVM opts validation added in YARN-9718.






[jira] [Resolved] (YARN-10271) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shilongfei resolved YARN-10271.
---
Resolution: Duplicate

> Shell#runCommand() executes a shell script and gets stuck when reading stdout 
> and stderr
> 
>
> Key: YARN-10271
> URL: https://issues.apache.org/jira/browse/YARN-10271
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: shilongfei
>Priority: Major
>
> When using Shell to execute a shell script, it occasionally gets stuck 
> reading the input and error streams. I have encountered this situation 
> three times; I will describe the three situations in the comments.






[jira] [Updated] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2020-05-19 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9606:

Attachment: YARN-9606.008.patch

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, 
> YARN-9606.006.patch, YARN-9606.007.patch, YARN-9606.008.patch
>
>
> Yarn logs fails for running containers
> {quote}
> Unable to fetch log files list
> Exception in thread "main" java.io.IOException: com.sun.jersey.api.client.ClientHandlerException: javax.net.ssl.SSLHandshakeException: Error while authenticating with endpoint: [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
> {quote}






[jira] [Assigned] (YARN-10279) Avoid unnecessary QueueMappingEntity creations

2020-05-19 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10279:


Assignee: (was: Bilwa S T)

> Avoid unnecessary QueueMappingEntity creations
> --
>
> Key: YARN-10279
> URL: https://issues.apache.org/jira/browse/YARN-10279
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Priority: Minor
>
> In the CapacityScheduler (CS) UserGroupMappingPlacementRule and 
> AppNameMappingPlacementRule classes we create new instances of the 
> QueueMappingEntity class. In some cases we simply copy the instance we 
> already received, which is unnecessary since the class is immutable.
> This is just a minor improvement and probably doesn't have much impact, but 
> the copies still put some unnecessary load on GC.
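
A minimal sketch of the point above, assuming a simplified immutable class. The `Mapping` type and its fields are hypothetical stand-ins and do not reflect the real QueueMappingEntity API.

```java
public class ImmutableReuse {
    /** Hypothetical immutable stand-in; not the real QueueMappingEntity. */
    static final class Mapping {
        private final String source;
        private final String queue;

        Mapping(String source, String queue) {
            this.source = source;
            this.queue = queue;
        }

        String getSource() { return source; }
        String getQueue() { return queue; }
    }

    // Copying allocates a second object that carries exactly the same state.
    static Mapping copy(Mapping m) {
        return new Mapping(m.getSource(), m.getQueue());
    }

    // Reuse is safe because no field can change after construction.
    static Mapping reuse(Mapping m) {
        return m;
    }

    public static void main(String[] args) {
        Mapping received = new Mapping("app1", "root.default");
        System.out.println(copy(received) == received);  // false: new allocation
        System.out.println(reuse(received) == received); // true: same instance
    }
}
```

Because the fields are final and there are no setters, handing out the received instance has the same observable behaviour as handing out a copy, without the extra allocation.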






[jira] [Assigned] (YARN-10274) Merge QueueMapping and QueueMappingEntity

2020-05-19 Thread Gergely Pollak (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Pollak reassigned YARN-10274:
-

Assignee: Gergely Pollak  (was: Bilwa S T)

> Merge QueueMapping and QueueMappingEntity
> -
>
> Key: YARN-10274
> URL: https://issues.apache.org/jira/browse/YARN-10274
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: yarn
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
>
> The role, usage and internal behaviour of these classes are almost identical, 
> so it makes no sense to keep both of them. One is used by UserGroup 
> placement rule definitions, the other by Application placement rules.






[jira] [Assigned] (YARN-10281) Redundant QueuePath usage in UserGroupMappingPlacementRule and AppNameMappingPlacementRule

2020-05-19 Thread Gergely Pollak (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Pollak reassigned YARN-10281:
-

Assignee: Gergely Pollak

> Redundant QueuePath usage in UserGroupMappingPlacementRule and 
> AppNameMappingPlacementRule
> --
>
> Key: YARN-10281
> URL: https://issues.apache.org/jira/browse/YARN-10281
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
>
> We use the QueuePath and QueueMapping (or QueueMappingEntity) objects in the 
> aforementioned classes. These technically store the same kind of 
> information, yet we keep converting between them. Let's examine whether we 
> can use only the QueueMapping(Entity) instead, since it holds more information.






[jira] [Assigned] (YARN-10280) Find a better way to pass queue manager for UserGroupMappingPlacementRule and AppNameMappingPlacementRule

2020-05-19 Thread Gergely Pollak (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Pollak reassigned YARN-10280:
-

Assignee: Gergely Pollak

> Find a better way to pass queue manager for UserGroupMappingPlacementRule and 
> AppNameMappingPlacementRule
> -
>
> Key: YARN-10280
> URL: https://issues.apache.org/jira/browse/YARN-10280
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
>
> Since YARN-9789 (Allow multiple leaf queues with the same name in 
> CapacityScheduler) we need to use full queue paths internally. However, for 
> backwards compatibility we still allow users to reference queues by their 
> leaf queue name. This means we need to look up queues in the queue manager 
> by their short name to get the full path of the queue, which is why we need 
> the queue manager in certain methods in these classes.
> Currently the queue manager instance is passed via method arguments wherever 
> necessary, but since the placement rule classes depend on the queue manager, 
> it should be passed once and stored as a field.
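
A minimal sketch of the proposed shape, where the rule receives the queue manager once in its constructor and keeps it as a field. All class and method names here are hypothetical and do not mirror the real CapacityScheduler classes; treating any name containing "." as a full path is also just an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

public class PlacementRuleSketch {
    /** Hypothetical queue manager: resolves a leaf queue name to its full path. */
    static final class QueueManager {
        private final Map<String, String> fullPathByShortName = new HashMap<>();

        void register(String shortName, String fullPath) {
            fullPathByShortName.put(shortName, fullPath);
        }

        String resolve(String name) {
            // Assumption of this sketch: a name containing '.' is already a full path.
            if (name.contains(".")) {
                return name;
            }
            return fullPathByShortName.get(name);
        }
    }

    /** Hypothetical rule that stores its dependency instead of taking it per call. */
    static final class Rule {
        private final QueueManager queueManager;

        Rule(QueueManager queueManager) {
            this.queueManager = queueManager;
        }

        String place(String queueName) {
            return queueManager.resolve(queueName);
        }
    }

    public static void main(String[] args) {
        QueueManager qm = new QueueManager();
        qm.register("default", "root.default");
        Rule rule = new Rule(qm);
        System.out.println(rule.place("default"));      // root.default
        System.out.println(rule.place("root.default")); // root.default
    }
}
```

With the dependency held as a field, method signatures such as `place` no longer need a queue manager parameter threaded through every call.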






[jira] [Created] (YARN-10281) Redundant QueuePath usage in UserGroupMappingPlacementRule and AppNameMappingPlacementRule

2020-05-19 Thread Gergely Pollak (Jira)
Gergely Pollak created YARN-10281:
-

 Summary: Redundant QueuePath usage in 
UserGroupMappingPlacementRule and AppNameMappingPlacementRule
 Key: YARN-10281
 URL: https://issues.apache.org/jira/browse/YARN-10281
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Gergely Pollak


UserGroupMappingPlacementRule and AppNameMappingPlacementRule use both 
QueuePath and QueueMapping (or QueueMappingEntity) objects. These store 
essentially the same kind of information, yet we keep converting between them. 
Let's examine whether we can use only QueueMapping(Entity) instead, since it 
holds more information.






[jira] [Assigned] (YARN-10279) Avoid unnecessary QueueMappingEntity creations

2020-05-19 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10279:


Assignee: Bilwa S T

> Avoid unnecessary QueueMappingEntity creations
> --
>
> Key: YARN-10279
> URL: https://issues.apache.org/jira/browse/YARN-10279
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Assignee: Bilwa S T
>Priority: Minor
>
> In the CS UserGroupMappingPlacementRule and AppNameMappingPlacementRule 
> classes we create new instances of the QueueMappingEntity class. In some 
> cases we simply copy the instance we already received, which is unnecessary 
> since the class is immutable.
> This is just a minor improvement and probably doesn't have much impact, but 
> the copies still put some unnecessary load on GC.






[jira] [Created] (YARN-10280) Find a better way to pass queue manager for UserGroupMappingPlacementRule and AppNameMappingPlacementRule

2020-05-19 Thread Gergely Pollak (Jira)
Gergely Pollak created YARN-10280:
-

 Summary: Find a better way to pass queue manager for 
UserGroupMappingPlacementRule and AppNameMappingPlacementRule
 Key: YARN-10280
 URL: https://issues.apache.org/jira/browse/YARN-10280
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Gergely Pollak


Since YARN-9879 (Allow multiple leaf queues with the same name in 
CapacityScheduler) we need to use full queue paths internally. However, for 
backwards compatibility we still allow users to reference queues by their leaf 
queue name, so we need to look up queues in the queue manager by their short 
name to get the full path of the queue. This is why certain methods in these 
classes need the queue manager.

Currently the queue manager instance is passed via method arguments wherever 
necessary, but since the placement rule classes depend on the queue manager, it 
should be passed once and stored as a field.
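
The idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the CapacityScheduler code: QueueManagerSketch, PlacementRuleSketch, and their methods are hypothetical names invented for the example, and the lookup is reduced to a simple map from leaf name to full path.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, minimal queue manager: maps leaf queue names to full paths. */
final class QueueManagerSketch {
  private final Map<String, String> fullPathByShortName = new HashMap<>();

  /** Registers a queue by its full path, e.g. "root.a.b" under leaf name "b". */
  void addQueue(String fullPath) {
    String shortName = fullPath.substring(fullPath.lastIndexOf('.') + 1);
    fullPathByShortName.put(shortName, fullPath);
  }

  /** Returns the full path for a leaf name, or null if the queue is unknown. */
  String getFullPath(String shortName) {
    return fullPathByShortName.get(shortName);
  }
}

/** Hypothetical placement rule that keeps the queue manager as a field. */
final class PlacementRuleSketch {
  private final QueueManagerSketch queueManager;

  /** The dependency is passed once, instead of through every method call. */
  PlacementRuleSketch(QueueManagerSketch queueManager) {
    this.queueManager = queueManager;
  }

  /** Accepts either a full path or a leaf name and returns the full path. */
  String resolve(String queueName) {
    if (queueName.contains(".")) {
      return queueName; // already a full path
    }
    return queueManager.getFullPath(queueName);
  }
}
```

With the field in place, methods such as resolve no longer need a queue manager parameter, which is the simplification the issue asks for.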






[jira] [Created] (YARN-10279) Avoid unnecessary QueueMappingEntity creations

2020-05-19 Thread Gergely Pollak (Jira)
Gergely Pollak created YARN-10279:
-

 Summary: Avoid unnecessary QueueMappingEntity creations
 Key: YARN-10279
 URL: https://issues.apache.org/jira/browse/YARN-10279
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Gergely Pollak


In the CS UserGroupMappingPlacementRule and AppNameMappingPlacementRule classes 
we create new instances of the QueueMappingEntity class. In some cases we simply 
copy the instance we already received, which is unnecessary since the class is 
immutable.

This is just a minor improvement and probably doesn't have much impact, but the 
copies still put some unnecessary load on GC.
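
The point about immutability can be illustrated with a small sketch. MappingEntity below is a hypothetical immutable class standing in for QueueMappingEntity; its fields and both helper methods are assumptions made for the example, not the real API.

```java
/** Hypothetical immutable stand-in for QueueMappingEntity. */
final class MappingEntity {
  private final String source;
  private final String queue;

  MappingEntity(String source, String queue) {
    this.source = source;
    this.queue = queue;
  }

  String getSource() {
    return source;
  }

  String getQueue() {
    return queue;
  }
}

final class MappingReuseSketch {
  /** Copying: allocates a second object with identical content. */
  static MappingEntity copyOf(MappingEntity mapping) {
    return new MappingEntity(mapping.getSource(), mapping.getQueue());
  }

  /** Reusing: safe because nobody can modify an immutable instance. */
  static MappingEntity reuse(MappingEntity mapping) {
    return mapping;
  }
}
```

Both methods hand back an object with the same content; only copyOf creates garbage for the collector.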






[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2020-05-19 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111066#comment-17111066
 ] 

Hadoop QA commented on YARN-9606:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  4s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 19m 58s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 29s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  0m 47s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 16s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common in trunk has 1 extant findbugs warnings. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m  0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 59s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  1m 27s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 1 new + 131 unchanged - 2 fixed = 132 total (was 133) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 34s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 45s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  2s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 25s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  0s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 53s{color} | {color:green} hadoop-yarn-server-router in the patch passed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m  7s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  1s{color} | {color:green} 

[jira] [Assigned] (YARN-10275) CapacityScheduler QueuePath object should be able to parse paths

2020-05-19 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10275:


Assignee: Bilwa S T

> CapacityScheduler QueuePath object should be able to parse paths
> 
>
> Key: YARN-10275
> URL: https://issues.apache.org/jira/browse/YARN-10275
> Project: Hadoop YARN
>  Issue Type: Task
>Reporter: Gergely Pollak
>Assignee: Bilwa S T
>Priority: Major
>
> Currently QueuePlacementRuleUtils has an extractQueuePath method, which is 
> used to split full paths into parent path + leaf queue name. All instances 
> of QueuePath are created via this method, which suggests this behaviour 
> should be part of the QueuePath object.
> We should create a constructor which implements this logic, and remove the 
> extractQueuePath method.






[jira] [Created] (YARN-10278) CapacityScheduler test framework ProportionalCapacityPreemptionPolicyMockFramework need some review

2020-05-19 Thread Gergely Pollak (Jira)
Gergely Pollak created YARN-10278:
-

 Summary: CapacityScheduler test framework 
ProportionalCapacityPreemptionPolicyMockFramework need some review
 Key: YARN-10278
 URL: https://issues.apache.org/jira/browse/YARN-10278
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Gergely Pollak


This test framework class mocks a bit too heavily: it simulates CS internal 
behaviour with mock methods beyond the point where it is reasonably 
maintainable, so any internal change in CS is a major headache.

A lot of tests depend on this class, so we should approach it carefully, but I 
think it's worth examining whether this class can be made a bit more resilient 
to changes and easier to maintain, or at least better documented.






[jira] [Created] (YARN-10277) CapacityScheduler test TestUserGroupMappingPlacementRule should build proper hierarchy

2020-05-19 Thread Gergely Pollak (Jira)
Gergely Pollak created YARN-10277:
-

 Summary: CapacityScheduler test TestUserGroupMappingPlacementRule 
should build proper hierarchy
 Key: YARN-10277
 URL: https://issues.apache.org/jira/browse/YARN-10277
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Gergely Pollak


Since the CapacityScheduler internal implementation depends more and more on 
queues being hierarchical, the test is getting really hard to maintain. A lot 
of test cases were failing because they used non-existent queues. The older 
placement rule solution ignored missing parents, but since the leaf queue 
change in CS we must be able to get a full path for any queue, because all 
queues are referenced by their full path.

This test should reflect this: instead of creating and expecting the existence 
of fictional queues, it should create a proper queue hierarchy, with a better 
way to describe it.

Currently we set up a bunch of Mockito "when" statements to simulate the queue 
behavior, but this is a hassle to maintain, and it is easy to miss a few 
methods.






[jira] [Assigned] (YARN-10274) Merge QueueMapping and QueueMappingEntity

2020-05-19 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T reassigned YARN-10274:


Assignee: Bilwa S T

> Merge QueueMapping and QueueMappingEntity
> -
>
> Key: YARN-10274
> URL: https://issues.apache.org/jira/browse/YARN-10274
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: yarn
>Reporter: Gergely Pollak
>Assignee: Bilwa S T
>Priority: Major
>
> The role, usage and internal behaviour of these classes are almost identical, 
> so it makes no sense to keep both of them. One is used by UserGroup 
> placement rule definitions, the other by Application placement rules.






[jira] [Created] (YARN-10276) Check and improve memory footprint of CapacityScheduler CSQueueStore

2020-05-19 Thread Gergely Pollak (Jira)
Gergely Pollak created YARN-10276:
-

 Summary: Check and improve memory footprint of CapacityScheduler 
CSQueueStore
 Key: YARN-10276
 URL: https://issues.apache.org/jira/browse/YARN-10276
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Gergely Pollak


The class creates a lot of Set instances, which might have a somewhat bigger 
memory overhead than necessary. This might not be a critical issue, but let's 
examine whether we can or should create a more memory-efficient solution while 
keeping the performance.






[jira] [Created] (YARN-10275) CapacityScheduler QueuePath object should be able to parse paths

2020-05-19 Thread Gergely Pollak (Jira)
Gergely Pollak created YARN-10275:
-

 Summary: CapacityScheduler QueuePath object should be able to 
parse paths
 Key: YARN-10275
 URL: https://issues.apache.org/jira/browse/YARN-10275
 Project: Hadoop YARN
  Issue Type: Task
Reporter: Gergely Pollak


Currently QueuePlacementRuleUtils has an extractQueuePath method, which is used 
to split full paths into parent path + leaf queue name. All instances of 
QueuePath are created via this method, which suggests this behaviour should be 
part of the QueuePath object.

We should create a constructor which implements this logic, and remove the 
extractQueuePath method.
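
As a rough illustration of the proposal, the splitting logic could live in a constructor along these lines. This is a minimal sketch, not the actual CapacityScheduler class: the name QueuePathSketch, its accessors, and the handling of a path without a parent part are assumptions made for the example.

```java
/** Hypothetical stand-in for QueuePath; names and behaviour are assumptions. */
final class QueuePathSketch {
  private final String parentQueue; // null when the path has no parent part
  private final String leafQueue;

  /** Splits a full path such as "root.a.b" into parent "root.a" and leaf "b". */
  QueuePathSketch(String fullPath) {
    int lastDot = fullPath.lastIndexOf('.');
    if (lastDot < 0) {
      // No separator: treat the whole string as the leaf queue name.
      this.parentQueue = null;
      this.leafQueue = fullPath;
    } else {
      this.parentQueue = fullPath.substring(0, lastDot);
      this.leafQueue = fullPath.substring(lastDot + 1);
    }
  }

  boolean hasParentQueue() {
    return parentQueue != null;
  }

  String getParentQueue() {
    return parentQueue;
  }

  String getLeafQueue() {
    return leafQueue;
  }
}
```

Callers would then write new QueuePathSketch(path) directly instead of going through a separate utility method.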






[jira] [Created] (YARN-10274) Merge QueueMapping and QueueMappingEntity

2020-05-19 Thread Gergely Pollak (Jira)
Gergely Pollak created YARN-10274:
-

 Summary: Merge QueueMapping and QueueMappingEntity
 Key: YARN-10274
 URL: https://issues.apache.org/jira/browse/YARN-10274
 Project: Hadoop YARN
  Issue Type: Task
  Components: yarn
Reporter: Gergely Pollak


The role, usage and internal behaviour of these classes are almost identical, 
so it makes no sense to keep both of them. One is used by UserGroup placement 
rule definitions, the other by Application placement rules.






[jira] [Updated] (YARN-10273) [Umbrella] Followup tasks for CapacityScheduler leaf queue changes (YARN-9879)

2020-05-19 Thread Gergely Pollak (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gergely Pollak updated YARN-10273:
--
Summary: [Umbrella] Followup tasks for CapacityScheduler leaf queue changes 
(YARN-9879)  (was: [Umbrella] Followup changes for CapacityScheduler leaf queue 
changes (YARN-9879))

> [Umbrella] Followup tasks for CapacityScheduler leaf queue changes (YARN-9879)
> --
>
> Key: YARN-10273
> URL: https://issues.apache.org/jira/browse/YARN-10273
> Project: Hadoop YARN
>  Issue Type: Task
> Environment: YARN-9879 (Allow multiple leaf queues with the same 
> name) introduced a lot of core changes to the internal behaviour of the 
> capacity scheduler. During the implementation we encountered some places 
> which require additional attention but are outside the scope of YARN-9879; 
> this umbrella contains those issues. 
>Reporter: Gergely Pollak
>Priority: Major
>







[jira] [Created] (YARN-10273) [Umbrella] Followup changes for CapacityScheduler leaf queue changes (YARN-9879)

2020-05-19 Thread Gergely Pollak (Jira)
Gergely Pollak created YARN-10273:
-

 Summary: [Umbrella] Followup changes for CapacityScheduler leaf 
queue changes (YARN-9879)
 Key: YARN-10273
 URL: https://issues.apache.org/jira/browse/YARN-10273
 Project: Hadoop YARN
  Issue Type: Task
 Environment: YARN-9879 (Allow multiple leaf queues with the same name) 
introduced a lot of core changes to the internal behaviour of the capacity 
scheduler. During the implementation we encountered some places which require 
additional attention but are outside the scope of YARN-9879; this umbrella 
contains those issues. 
Reporter: Gergely Pollak









[jira] [Comment Edited] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111012#comment-17111012
 ] 

shilongfei edited comment on YARN-10272 at 5/19/20, 9:42 AM:
-

*the third time, version:2.6.0*

This time it happened when the NM restarted and the container needed to be 
recovered after the restart. After analysis, I suspect it may be stuck in the 
isContainerAlive() check in ContainerExecutor.reacquireContainer(), where a 
kill -0 $pid command was executed. I use DefaultContainerExecutor. However, 
the scene was not preserved this time, so this is not entirely certain.

!image-2020-05-19-17-27-02-496.png!

!image-2020-05-19-17-27-35-374.png!


was (Author: shilongfei):
*the third time, version:2.6.0*

This time it happened when the NM restarted, and the container needs to be 
recover after the NM restart. After analysis, it is suspected that it may be 
stuck in the check of isContainerAlive() in 
ContainerExector.reacquireContainer(), where a kill -0 $ pid command was 
executed, I use DefaultContainerExecutor.

!image-2020-05-19-17-27-02-496.png!

!image-2020-05-19-17-27-35-374.png!

> Shell#runCommand() executes a shell script and gets stuck when reading stdout 
> and stderr
> 
>
> Key: YARN-10272
> URL: https://issues.apache.org/jira/browse/YARN-10272
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.1.0
>Reporter: shilongfei
>Priority: Major
> Attachments: image-2020-04-02-18-54-13-112.png, 
> image-2020-04-02-18-58-39-977.png, image-2020-04-02-19-00-01-387.png, 
> image-2020-05-11-14-53-09-751.png, image-2020-05-19-17-27-02-496.png, 
> image-2020-05-19-17-27-35-374.png
>
>
> When using Shell to execute a shell script, it occasionally gets stuck at 
> reading input and error streams. I have encountered this situation three 
> times, I will write the three situations in the comments.






[jira] [Updated] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shilongfei updated YARN-10272:
--
Description: When using Shell to execute a shell script, it occasionally 
gets stuck at reading input and error streams. I have encountered this 
situation three times, I will write the three situations in the comments.  
(was: When using Shell to execute a shell script, it occasionally gets stuck at 
reading input, input and error streams. I have encountered this situation three 
times, I will write the three situations in the comments.)

> Shell#runCommand() executes a shell script and gets stuck when reading stdout 
> and stderr
> 
>
> Key: YARN-10272
> URL: https://issues.apache.org/jira/browse/YARN-10272
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.1.0
>Reporter: shilongfei
>Priority: Major
> Attachments: image-2020-04-02-18-54-13-112.png, 
> image-2020-04-02-18-58-39-977.png, image-2020-04-02-19-00-01-387.png, 
> image-2020-05-11-14-53-09-751.png, image-2020-05-19-17-27-02-496.png, 
> image-2020-05-19-17-27-35-374.png
>
>
> When using Shell to execute a shell script, it occasionally gets stuck at 
> reading input and error streams. I have encountered this situation three 
> times, I will write the three situations in the comments.






[jira] [Comment Edited] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111012#comment-17111012
 ] 

shilongfei edited comment on YARN-10272 at 5/19/20, 9:29 AM:
-

*the third time, version:2.6.0*

This time it happened when the NM restarted and the container needed to be 
recovered after the restart. After analysis, I suspect it may be stuck in the 
isContainerAlive() check in ContainerExecutor.reacquireContainer(), where a 
kill -0 $pid command was executed. I use DefaultContainerExecutor.

!image-2020-05-19-17-27-02-496.png!

!image-2020-05-19-17-27-35-374.png!


was (Author: shilongfei):
the third time, version:2.6.0

This time it happened when the NM restarted, and the container needs to be 
recover after the NM restart. After analysis, it is suspected that it may be 
stuck in the check of isContainerAlive() in 
ContainerExector.reacquireContainer(), where a kill -0 $ pid command was 
executed, I use DefaultContainerExecutor.

!image-2020-05-19-17-27-02-496.png!

!image-2020-05-19-17-27-35-374.png!

> Shell#runCommand() executes a shell script and gets stuck when reading stdout 
> and stderr
> 
>
> Key: YARN-10272
> URL: https://issues.apache.org/jira/browse/YARN-10272
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.1.0
>Reporter: shilongfei
>Priority: Major
> Attachments: image-2020-04-02-18-54-13-112.png, 
> image-2020-04-02-18-58-39-977.png, image-2020-04-02-19-00-01-387.png, 
> image-2020-05-11-14-53-09-751.png, image-2020-05-19-17-27-02-496.png, 
> image-2020-05-19-17-27-35-374.png
>
>
> When using Shell to execute a shell script, it occasionally gets stuck at 
> reading input, input and error streams. I have encountered this situation 
> three times, I will write the three situations in the comments.






[jira] [Commented] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111012#comment-17111012
 ] 

shilongfei commented on YARN-10272:
---

the third time, version:2.6.0

This time it happened when the NM restarted and the container needed to be 
recovered after the restart. After analysis, I suspect it may be stuck in the 
isContainerAlive() check in ContainerExecutor.reacquireContainer(), where a 
kill -0 $pid command was executed. I use DefaultContainerExecutor.

!image-2020-05-19-17-27-02-496.png!

!image-2020-05-19-17-27-35-374.png!

> Shell#runCommand() executes a shell script and gets stuck when reading stdout 
> and stderr
> 
>
> Key: YARN-10272
> URL: https://issues.apache.org/jira/browse/YARN-10272
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.1.0
>Reporter: shilongfei
>Priority: Major
> Attachments: image-2020-04-02-18-54-13-112.png, 
> image-2020-04-02-18-58-39-977.png, image-2020-04-02-19-00-01-387.png, 
> image-2020-05-11-14-53-09-751.png, image-2020-05-19-17-27-02-496.png, 
> image-2020-05-19-17-27-35-374.png
>
>
> When using Shell to execute a shell script, it occasionally gets stuck at 
> reading input, input and error streams. I have encountered this situation 
> three times, I will write the three situations in the comments.






[jira] [Updated] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shilongfei updated YARN-10272:
--
Attachment: image-2020-05-19-17-27-35-374.png

> Shell#runCommand() executes a shell script and gets stuck when reading stdout 
> and stderr
> 
>
> Key: YARN-10272
> URL: https://issues.apache.org/jira/browse/YARN-10272
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.1.0
>Reporter: shilongfei
>Priority: Major
> Attachments: image-2020-04-02-18-54-13-112.png, 
> image-2020-04-02-18-58-39-977.png, image-2020-04-02-19-00-01-387.png, 
> image-2020-05-11-14-53-09-751.png, image-2020-05-19-17-27-02-496.png, 
> image-2020-05-19-17-27-35-374.png
>
>
> When using Shell to execute a shell script, it occasionally gets stuck at 
> reading input, input and error streams. I have encountered this situation 
> three times, I will write the three situations in the comments.






[jira] [Updated] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shilongfei updated YARN-10272:
--
Attachment: image-2020-05-19-17-27-02-496.png

> Shell#runCommand() executes a shell script and gets stuck when reading stdout 
> and stderr
> 
>
> Key: YARN-10272
> URL: https://issues.apache.org/jira/browse/YARN-10272
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.1.0
>Reporter: shilongfei
>Priority: Major
> Attachments: image-2020-04-02-18-54-13-112.png, 
> image-2020-04-02-18-58-39-977.png, image-2020-04-02-19-00-01-387.png, 
> image-2020-05-11-14-53-09-751.png, image-2020-05-19-17-27-02-496.png, 
> image-2020-05-19-17-27-35-374.png
>
>
> When using Shell to execute a shell script, it occasionally gets stuck at 
> reading input, input and error streams. I have encountered this situation 
> three times, I will write the three situations in the comments.






[jira] [Comment Edited] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110990#comment-17110990
 ] 

shilongfei edited comment on YARN-10272 at 5/19/20, 9:09 AM:
-

*The second time, version: 3.1.0*

The initial symptom is the same as above, but this time the jstack output is 
different, as shown below. The ContainersLauncher.runPreKillContainerScript() 
method is our own customization: it runs a script before the container exits 
to do some work (such as saving a jstack of the container), and it uses Shell 
to execute the script.
{code:java}
"NM ContainerManager dispatcher" #193 prio=5 os_prio=0 tid=0x7fa79b0cc800 nid=0x493c in Object.wait() [0x7fa5a9ac5000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1252)
- locked <0xe65dc540> (a org.apache.hadoop.util.Shell$1)
at java.lang.Thread.join(Thread.java:1326)
at org.apache.hadoop.util.Shell.joinThread(Shell.java:1057)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:1037)
at org.apache.hadoop.util.Shell.run(Shell.java:902)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.runPreKillContainerScript(ContainersLauncher.java:266)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:162)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:66)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:198)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
at java.lang.Thread.run(Thread.java:748)
{code}
The Shell.joinThread() method joins errThread, and errThread is stuck reading 
the error stream:
{code:java}
"Thread-430" #768 prio=5 os_prio=0 tid=0x7fa5541ef800 nid=0x57a7 runnable [0x7fa39dcf8000]
   java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:255)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
- locked <0xeb7fe618> (a java.lang.UNIXProcess$ProcessPipeInputStream)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
- locked <0xeb8cd168> (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:161)
at java.io.BufferedReader.readLine(BufferedReader.java:324)
- locked <0xeb8cd168> (a java.io.InputStreamReader)
at java.io.BufferedReader.readLine(BufferedReader.java:389)
at org.apache.hadoop.util.Shell$1.run(Shell.java:970)
{code}
!image-2020-05-11-14-53-09-751.png!


was (Author: shilongfei):
*The second time, version: 3.1.0*

The initial symptom is the same as above, but this time the jstack output is 
different; it is shown below. ContainersLauncher.runPreKillContainerScript() is 
a method we added ourselves. It runs a script before the container exits (for 
example, to save a jstack of the container) and uses Shell to execute the 
script.
{code:java}
"NM ContainerManager dispatcher" #193 prio=5 os_prio=0 tid=0x7fa79b0cc800 
nid=0x493c in Object.wait() [0x7fa5a9ac5000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1252)
- locked <0xe65dc540> (a org.apache.hadoop.util.Shell$1)
at java.lang.Thread.join(Thread.java:1326)
at org.apache.hadoop.util.Shell.joinThread(Shell.java:1057)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:1037)
at org.apache.hadoop.util.Shell.run(Shell.java:902)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.runPreKillContainerScript(ContainersLauncher.java:266)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:162)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:66)
at 

[jira] [Comment Edited] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110981#comment-17110981
 ] 

shilongfei edited comment on YARN-10272 at 5/19/20, 9:08 AM:
-

*The first time, version: 3.1.0*

At first, I found that some containers allocated on one NodeManager could not 
be scheduled. The NodeManager log showed many events piling up in the event 
queue, so I captured the NodeManager's jstack and a heap dump. The jstack shows 
that the ContainerManager dispatcher is waiting for <0xb67c2590>, but the 
jstack does not show which thread holds <0xb67c2590>:
{code:java}
"NM ContainerManager dispatcher" #166 prio=5 os_prio=0 tid=0x7ff692e72800 
nid=0x76e2 waiting on condition [0x7ff4a3ffe000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0xb67c2590> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at 
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:833)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:180)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:66)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:198)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
at java.lang.Thread.run(Thread.java:748){code}
Searching the heap dump for <0xb67c2590> shows that the lock is held in 
ContainerLaunch.launchContainer(). The thread stack shows that it is stuck in 
the parseExecResult() method.

!image-2020-04-02-18-54-13-112.png!

!image-2020-04-02-18-58-39-977.png!

!image-2020-04-02-19-00-01-387.png!
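
As a side note on why the holder of <0xb67c2590> may be missing from the dump: plain jstack output lists monitor owners, while owners of java.util.concurrent locks are only listed (under "Locked ownable synchronizers") when jstack is run with -l. The same information is available in-process through ThreadMXBean. The sketch below is only an illustration using JDK classes; ParkedLockOwner, ownerOfBlocker, demo, and the thread names are hypothetical and are not part of Hadoop.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper: reports which thread owns the lock another thread is parked on. */
public class ParkedLockOwner {

  /** Returns the name of the thread owning the lock that {@code waiter} is blocked on, or null. */
  public static String ownerOfBlocker(Thread waiter) {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    // Request both monitor and ownable-synchronizer information.
    for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
      if (info.getThreadId() == waiter.getId()) {
        return info.getLockOwnerName();
      }
    }
    return null;
  }

  /** Simulates a launcher holding a ReentrantLock while a dispatcher parks on it. */
  public static String demo() {
    ReentrantLock lock = new ReentrantLock();
    CountDownLatch held = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    Thread holder = new Thread(() -> {
      lock.lock();
      try {
        held.countDown();
        release.await(); // stands in for the blocked parseExecResult() call
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        lock.unlock();
      }
    }, "launch-sim");
    Thread waiter = new Thread(() -> {
      lock.lock();
      lock.unlock();
    }, "dispatcher-sim");
    holder.setDaemon(true);
    waiter.setDaemon(true);
    holder.start();
    try {
      held.await();
      waiter.start();
      // Wait (bounded) until the waiter is actually parked on the lock.
      for (int i = 0; i < 500 && waiter.getState() != Thread.State.WAITING; i++) {
        Thread.sleep(10);
      }
      return ownerOfBlocker(waiter);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    } finally {
      release.countDown();
    }
  }

  public static void main(String[] args) {
    System.out.println("lock owner: " + demo());
  }
}
```

On a live NodeManager the equivalent check would probably be a second dump taken with jstack -l, which avoids searching the heap for the lock object.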


was (Author: shilongfei):
*The first time*

At first, I found that some containers allocated on one NodeManager could not 
be scheduled. The NodeManager log showed many events piling up in the event 
queue, so I captured the NodeManager's jstack and a heap dump. The jstack shows 
that the ContainerManager dispatcher is waiting for <0xb67c2590>, but the 
jstack does not show which thread holds <0xb67c2590>:
{code:java}
"NM ContainerManager dispatcher" #166 prio=5 os_prio=0 tid=0x7ff692e72800 
nid=0x76e2 waiting on condition [0x7ff4a3ffe000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0xb67c2590> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at 
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:833)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:180)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:66)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:198)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
at java.lang.Thread.run(Thread.java:748){code}
Searching the heap dump for <0xb67c2590> shows that the lock is held in 
ContainerLaunch.launchContainer(). The thread stack shows that it is stuck in 
the parseExecResult() method.

!image-2020-04-02-18-54-13-112.png!

!image-2020-04-02-18-58-39-977.png!

!image-2020-04-02-19-00-01-387.png!

> Shell#runCommand() 

[jira] [Comment Edited] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110990#comment-17110990
 ] 

shilongfei edited comment on YARN-10272 at 5/19/20, 9:08 AM:
-

*The second time, version: 3.1.0*

The initial symptom is the same as above, but this time the jstack output is 
different; it is shown below. ContainersLauncher.runPreKillContainerScript() is 
a method we added ourselves. It runs a script before the container exits (for 
example, to save a jstack of the container) and uses Shell to execute the 
script.
{code:java}
"NM ContainerManager dispatcher" #193 prio=5 os_prio=0 tid=0x7fa79b0cc800 
nid=0x493c in Object.wait() [0x7fa5a9ac5000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1252)
- locked <0xe65dc540> (a org.apache.hadoop.util.Shell$1)
at java.lang.Thread.join(Thread.java:1326)
at org.apache.hadoop.util.Shell.joinThread(Shell.java:1057)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:1037)
at org.apache.hadoop.util.Shell.run(Shell.java:902)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.runPreKillContainerScript(ContainersLauncher.java:266)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:162)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:66)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:198)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
at java.lang.Thread.run(Thread.java:748)
{code}
Shell.joinThread() joins errThread, and errThread is stuck reading the error 
stream:
{code:java}
"Thread-430" #768 prio=5 os_prio=0 tid=0x7fa5541ef800 nid=0x57a7 runnable 
[0x7fa39dcf8000]
   java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:255)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
- locked <0xeb7fe618> (a 
java.lang.UNIXProcess$ProcessPipeInputStream)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
- locked <0xeb8cd168> (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:161)
at java.io.BufferedReader.readLine(BufferedReader.java:324)
- locked <0xeb8cd168> (a java.io.InputStreamReader)
at java.io.BufferedReader.readLine(BufferedReader.java:389)
at org.apache.hadoop.util.Shell$1.run(Shell.java:970){code}
!image-2020-05-11-14-53-09-751.png!


was (Author: shilongfei):
*The second time*

The initial symptom is the same as above, but this time the jstack output is 
different; it is shown below. ContainersLauncher.runPreKillContainerScript() is 
a method we added ourselves. It runs a script before the container exits (for 
example, to save a jstack of the container) and uses Shell to execute the 
script.
{code:java}
"NM ContainerManager dispatcher" #193 prio=5 os_prio=0 tid=0x7fa79b0cc800 
nid=0x493c in Object.wait() [0x7fa5a9ac5000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1252)
- locked <0xe65dc540> (a org.apache.hadoop.util.Shell$1)
at java.lang.Thread.join(Thread.java:1326)
at org.apache.hadoop.util.Shell.joinThread(Shell.java:1057)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:1037)
at org.apache.hadoop.util.Shell.run(Shell.java:902)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.runPreKillContainerScript(ContainersLauncher.java:266)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:162)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:66)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:198)

[jira] [Updated] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shilongfei updated YARN-10272:
--
Attachment: image-2020-05-11-14-53-09-751.png

> Shell#runCommand() executes a shell script and gets stuck when reading stdout 
> and stderr
> 
>
> Key: YARN-10272
> URL: https://issues.apache.org/jira/browse/YARN-10272
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.1.0
>Reporter: shilongfei
>Priority: Major
> Attachments: image-2020-04-02-18-54-13-112.png, 
> image-2020-04-02-18-58-39-977.png, image-2020-04-02-19-00-01-387.png, 
> image-2020-05-11-14-53-09-751.png
>
>
> When Shell is used to execute a shell script, it occasionally gets stuck 
> reading the process's input and error streams (its stdout and stderr). I have 
> encountered this three times and will describe each occurrence in the comments.






[jira] [Commented] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110990#comment-17110990
 ] 

shilongfei commented on YARN-10272:
---

*The second time*

The initial symptom is the same as above, but this time the jstack output is 
different; it is shown below. ContainersLauncher.runPreKillContainerScript() is 
a method we added ourselves. It runs a script before the container exits (for 
example, to save a jstack of the container) and uses Shell to execute the 
script.
{code:java}
"NM ContainerManager dispatcher" #193 prio=5 os_prio=0 tid=0x7fa79b0cc800 
nid=0x493c in Object.wait() [0x7fa5a9ac5000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1252)
- locked <0xe65dc540> (a org.apache.hadoop.util.Shell$1)
at java.lang.Thread.join(Thread.java:1326)
at org.apache.hadoop.util.Shell.joinThread(Shell.java:1057)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:1037)
at org.apache.hadoop.util.Shell.run(Shell.java:902)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1227)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.runPreKillContainerScript(ContainersLauncher.java:266)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:162)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:66)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:198)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
at java.lang.Thread.run(Thread.java:748)
{code}
Shell.joinThread() joins errThread, and errThread is stuck reading the error 
stream:
{code:java}
"Thread-430" #768 prio=5 os_prio=0 tid=0x7fa5541ef800 nid=0x57a7 runnable 
[0x7fa39dcf8000]
   java.lang.Thread.State: RUNNABLE
at java.io.FileInputStream.readBytes(Native Method)
at java.io.FileInputStream.read(FileInputStream.java:255)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
- locked <0xeb7fe618> (a 
java.lang.UNIXProcess$ProcessPipeInputStream)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
- locked <0xeb8cd168> (a java.io.InputStreamReader)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:161)
at java.io.BufferedReader.readLine(BufferedReader.java:324)
- locked <0xeb8cd168> (a java.io.InputStreamReader)
at java.io.BufferedReader.readLine(BufferedReader.java:389)
at org.apache.hadoop.util.Shell$1.run(Shell.java:970){code}
!image-2020-05-11-14-53-09-751.png!
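
For illustration only (this is not the Hadoop code): the first trace above shows the dispatcher in an unbounded Thread.join() on the stderr reader thread, so it waits for as long as the reader is blocked. A minimal sketch of a bounded join follows, assuming only JDK classes. BoundedStreamDrain, Result, and NeverClosedStream are hypothetical names, and NeverClosedStream merely stands in for a pipe whose write end is never closed.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

/** Hypothetical sketch: drain a stream on a helper thread, but never wait for it forever. */
public class BoundedStreamDrain {

  /** Outcome of one drain attempt. */
  public static final class Result {
    public final boolean completed;   // true if the reader reached end of stream in time
    public final List<String> lines;  // lines read so far
    Result(boolean completed, List<String> lines) {
      this.completed = completed;
      this.lines = lines;
    }
  }

  /** Stand-in for a pipe that never reports end of stream. */
  public static final class NeverClosedStream extends InputStream {
    private final CountDownLatch forever = new CountDownLatch(1);
    @Override
    public int read() throws IOException {
      try {
        forever.await(); // blocks until interrupted
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IOException("interrupted", e);
      }
      return -1;
    }
  }

  /** Reads lines on a daemon thread and joins it for at most timeoutMillis (must be > 0). */
  public static Result drain(InputStream in, long timeoutMillis) {
    List<String> lines = new CopyOnWriteArrayList<>();
    Thread reader = new Thread(() -> {
      try (BufferedReader br =
          new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
        String line;
        while ((line = br.readLine()) != null) {
          lines.add(line);
        }
      } catch (IOException ignored) {
        // the reader simply stops on an I/O error
      }
    }, "stderr-drain");
    reader.setDaemon(true);
    reader.start();
    try {
      reader.join(timeoutMillis); // bounded, unlike a bare join()
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    boolean completed = !reader.isAlive();
    if (!completed) {
      reader.interrupt(); // best effort; see the note below the code
    }
    return new Result(completed, new ArrayList<>(lines));
  }

  public static void main(String[] args) {
    byte[] data = "line1\nline2\n".getBytes(StandardCharsets.UTF_8);
    System.out.println(drain(new ByteArrayInputStream(data), 2000).completed);
    System.out.println(drain(new NeverClosedStream(), 200).completed);
  }
}
```

A read blocked inside a real process pipe generally does not react to Thread.interrupt(), so a caller that hits the timeout would probably still need to destroy the process or otherwise release the pipe. The sketch only shows how the joining thread can avoid waiting forever.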

> Shell#runCommand() executes a shell script and gets stuck when reading stdout 
> and stderr
> 
>
> Key: YARN-10272
> URL: https://issues.apache.org/jira/browse/YARN-10272
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.1.0
>Reporter: shilongfei
>Priority: Major
> Attachments: image-2020-04-02-18-54-13-112.png, 
> image-2020-04-02-18-58-39-977.png, image-2020-04-02-19-00-01-387.png, 
> image-2020-05-11-14-53-09-751.png
>
>
> When Shell is used to execute a shell script, it occasionally gets stuck 
> reading the process's input and error streams (its stdout and stderr). I have 
> encountered this three times and will describe each occurrence in the comments.






[jira] [Updated] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shilongfei updated YARN-10272:
--
Attachment: image-2020-04-02-19-00-01-387.png

> Shell#runCommand() executes a shell script and gets stuck when reading stdout 
> and stderr
> 
>
> Key: YARN-10272
> URL: https://issues.apache.org/jira/browse/YARN-10272
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.1.0
>Reporter: shilongfei
>Priority: Major
> Attachments: image-2020-04-02-18-54-13-112.png, 
> image-2020-04-02-18-58-39-977.png, image-2020-04-02-19-00-01-387.png
>
>
> When Shell is used to execute a shell script, it occasionally gets stuck 
> reading the process's input and error streams (its stdout and stderr). I have 
> encountered this three times and will describe each occurrence in the comments.






[jira] [Comment Edited] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110981#comment-17110981
 ] 

shilongfei edited comment on YARN-10272 at 5/19/20, 9:01 AM:
-

*The first time*

At first, I found that some containers allocated on one NodeManager could not 
be scheduled. The NodeManager log showed many events piling up in the event 
queue, so I captured the NodeManager's jstack and a heap dump. The jstack shows 
that the ContainerManager dispatcher is waiting for <0xb67c2590>, but the 
jstack does not show which thread holds <0xb67c2590>:
{code:java}
"NM ContainerManager dispatcher" #166 prio=5 os_prio=0 tid=0x7ff692e72800 
nid=0x76e2 waiting on condition [0x7ff4a3ffe000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0xb67c2590> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at 
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:833)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:180)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:66)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:198)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
at java.lang.Thread.run(Thread.java:748){code}
Searching the heap dump for <0xb67c2590> shows that the lock is held in 
ContainerLaunch.launchContainer(). The thread stack shows that it is stuck in 
the parseExecResult() method.

!image-2020-04-02-18-54-13-112.png!

!image-2020-04-02-18-58-39-977.png!

!image-2020-04-02-19-00-01-387.png!


was (Author: shilongfei):
*The first time*

At first, I found that some containers allocated on one NodeManager could not 
be scheduled. The NodeManager log showed many events piling up in the event 
queue, so I captured the NodeManager's jstack and a heap dump. The jstack shows 
that the ContainerManager dispatcher is waiting for <0xb67c2590>, but the 
jstack does not show which thread holds <0xb67c2590>:
{code:java}
"NM ContainerManager dispatcher" #166 prio=5 os_prio=0 tid=0x7ff692e72800 
nid=0x76e2 waiting on condition [0x7ff4a3ffe000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0xb67c2590> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at 
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:833)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:180)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:66)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:198)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
at java.lang.Thread.run(Thread.java:748){code}
Searching the heap dump for <0xb67c2590> shows that the lock is held in 
ContainerLaunch.launchContainer(). The thread stack shows that it is stuck in 
the parseExecResult() method.

!image-2020-04-02-18-54-13-112.png!

!image-2020-04-02-18-58-39-977.png!

> Shell#runCommand() executes a shell script and gets stuck when reading 

[jira] [Commented] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110981#comment-17110981
 ] 

shilongfei commented on YARN-10272:
---

*The first time*

At first, I found that some containers allocated on one NodeManager could not 
be scheduled. The NodeManager log showed many events piling up in the event 
queue, so I captured the NodeManager's jstack and a heap dump. The jstack shows 
that the ContainerManager dispatcher is waiting for <0xb67c2590>, but the 
jstack does not show which thread holds <0xb67c2590>:
{code:java}
"NM ContainerManager dispatcher" #166 prio=5 os_prio=0 tid=0x7ff692e72800 
nid=0x76e2 waiting on condition [0x7ff4a3ffe000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0xb67c2590> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
at 
java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:833)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:180)
at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:66)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:198)
at 
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
at java.lang.Thread.run(Thread.java:748){code}
Searching the heap dump for <0xb67c2590> shows that the lock is held in 
ContainerLaunch.launchContainer(). The thread stack shows that it is stuck in 
the parseExecResult() method.

!image-2020-04-02-18-54-13-112.png!

!image-2020-04-02-18-58-39-977.png!
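
To make the failure mode concrete: the dispatcher calls ReentrantLock.lock() with no timeout, so once the launching thread blocks while holding the lock, every later event queues up behind it. The sketch below shows one possible mitigation, a timed tryLock(), using only JDK classes. It is an assumption for discussion, not a proposed patch; LockTimeoutSketch, runWithLock, and holdLockInBackground are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: give up on a lock after a timeout instead of parking forever. */
public class LockTimeoutSketch {

  /**
   * Runs {@code action} under {@code lock} if the lock can be acquired within the timeout.
   * Returns false (without running the action) when the lock stays busy or the wait is interrupted.
   */
  public static boolean runWithLock(ReentrantLock lock, long timeoutMillis, Runnable action) {
    boolean acquired = false;
    try {
      acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
      if (!acquired) {
        return false; // the caller can log, retry later, or escalate
      }
      action.run();
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    } finally {
      if (acquired) {
        lock.unlock();
      }
    }
  }

  /**
   * Simulates a launcher thread that takes the lock and then blocks, for example on I/O.
   * Returns once the lock is held; counting down the returned latch releases it.
   */
  public static CountDownLatch holdLockInBackground(ReentrantLock lock) {
    CountDownLatch held = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);
    Thread t = new Thread(() -> {
      lock.lock();
      try {
        held.countDown();
        release.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        lock.unlock();
      }
    }, "launcher-sim");
    t.setDaemon(true);
    t.start();
    try {
      held.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return release;
  }

  public static void main(String[] args) {
    ReentrantLock lock = new ReentrantLock();
    CountDownLatch release = holdLockInBackground(lock);
    System.out.println("cleanup ran: " + runWithLock(lock, 100, () -> { }));
    release.countDown();
  }
}
```

Whether skipping or deferring the cleanup is acceptable depends on the caller; the point is only that the dispatcher thread stays free to process other events.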

> Shell#runCommand() executes a shell script and gets stuck when reading stdout 
> and stderr
> 
>
> Key: YARN-10272
> URL: https://issues.apache.org/jira/browse/YARN-10272
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.1.0
>Reporter: shilongfei
>Priority: Major
> Attachments: image-2020-04-02-18-54-13-112.png, 
> image-2020-04-02-18-58-39-977.png
>
>
> When Shell is used to execute a shell script, it occasionally gets stuck 
> reading the process's input and error streams (its stdout and stderr). I have 
> encountered this three times and will describe each occurrence in the comments.






[jira] [Updated] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shilongfei updated YARN-10272:
--
Attachment: image-2020-04-02-18-58-39-977.png
image-2020-04-02-18-54-13-112.png

> Shell#runCommand() executes a shell script and gets stuck when reading stdout 
> and stderr
> 
>
> Key: YARN-10272
> URL: https://issues.apache.org/jira/browse/YARN-10272
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.1.0
>Reporter: shilongfei
>Priority: Major
> Attachments: image-2020-04-02-18-54-13-112.png, 
> image-2020-04-02-18-58-39-977.png
>
>
> When Shell is used to execute a shell script, it occasionally gets stuck 
> reading the process's input and error streams (its stdout and stderr). I have 
> encountered this three times and will describe each occurrence in the comments.






[jira] [Updated] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shilongfei updated YARN-10272:
--
Affects Version/s: 2.6.0
   3.1.0

> Shell#runCommand() executes a shell script and gets stuck when reading stdout 
> and stderr
> 
>
> Key: YARN-10272
> URL: https://issues.apache.org/jira/browse/YARN-10272
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0, 3.1.0
>Reporter: shilongfei
>Priority: Major
>
> When Shell is used to execute a shell script, it occasionally gets stuck 
> reading the process's input and error streams (its stdout and stderr). I have 
> encountered this three times and will describe each occurrence in the comments.






[jira] [Created] (YARN-10271) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)
shilongfei created YARN-10271:
-

 Summary: Shell#runCommand() executes a shell script and gets stuck 
when reading stdout and stderr
 Key: YARN-10271
 URL: https://issues.apache.org/jira/browse/YARN-10271
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: shilongfei


When Shell is used to execute a shell script, it occasionally gets stuck 
reading the process's input and error streams (its stdout and stderr). I have 
encountered this three times and will describe each occurrence in the comments.






[jira] [Created] (YARN-10272) Shell#runCommand() executes a shell script and gets stuck when reading stdout and stderr

2020-05-19 Thread shilongfei (Jira)
shilongfei created YARN-10272:
-

 Summary: Shell#runCommand() executes a shell script and gets stuck 
when reading stdout and stderr
 Key: YARN-10272
 URL: https://issues.apache.org/jira/browse/YARN-10272
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: shilongfei


When Shell is used to execute a shell script, it occasionally gets stuck 
reading the process's input and error streams (its stdout and stderr). I have 
encountered this three times and will describe each occurrence in the comments.






[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2020-05-19 Thread Bilwa S T (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110963#comment-17110963
 ] 

Bilwa S T commented on YARN-9606:
-

[~prabhujoseph] thanks for the suggestion. I have updated it in the v7 patch.

> Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient 
> --
>
> Key: YARN-9606
> URL: https://issues.apache.org/jira/browse/YARN-9606
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-9606-001.patch, YARN-9606-002.patch, 
> YARN-9606.003.patch, YARN-9606.004.patch, YARN-9606.005.patch, 
> YARN-9606.006.patch, YARN-9606.007.patch
>
>
> Yarn logs fails for running containers
>
> {quote}
>
>  Unable to fetch log files list
>  Exception in thread "main" java.io.IOException: 
> com.sun.jersey.api.client.ClientHandlerException: 
> javax.net.ssl.SSLHandshakeException: Error while authenticating with 
> endpoint: 
> [https://vm2:65321/ws/v1/node/containers/container_e05_1559802125016_0001_01_08/logs]
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getContainerLogFiles(LogsCLI.java:543)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedContainerLogFiles(LogsCLI.java:1338)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.getMatchedOptionForRunningApp(LogsCLI.java:1514)
>  at 
> org.apache.hadoop.yarn.client.cli.LogsCLI.fetchContainerLogs(LogsCLI.java:1052)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.runCommand(LogsCLI.java:367)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:152)
>  at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:399)
>  {quote}






[jira] [Updated] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2020-05-19 Thread Bilwa S T (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bilwa S T updated YARN-9606:

Attachment: YARN-9606.007.patch







[jira] [Commented] (YARN-9606) Set sslfactory for AuthenticatedURL() while creating LogsCLI#webServiceClient

2020-05-19 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110880#comment-17110880
 ] 

Prabhu Joseph commented on YARN-9606:
-

[~BilwaST] I suggest moving the existing WebServiceClient.java from 
hadoop-yarn-server-common into hadoop-yarn-common instead of creating one more 
WebServiceClient.java in hadoop-yarn-client. Thanks.




