[jira] [Commented] (YARN-9965) Fix NodeManager failing to start on subsequent times when Hdfs Auxillary Jar is set

2019-11-18 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977163#comment-16977163
 ] 

Prabhu Joseph commented on YARN-9965:
-

Thanks [~abmodi].

> Fix NodeManager failing to start on subsequent times when Hdfs Auxillary Jar 
> is set
> ---
>
> Key: YARN-9965
> URL: https://issues.apache.org/jira/browse/YARN-9965
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: auxservices, nodemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9965-001.patch, YARN-9965-addendum-01.patch
>
>
> Loading an auxiliary jar from an HDFS location on a NodeManager works as 
> expected the first time. Subsequent restarts fail with a 
> ClassNotFoundException:
> {code:java}
> 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: 
> classpath: []
> 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: 
> system classes: [java., javax.accessibility., javax.activation., 
> javax.activity., javax.annotation., javax.annotation.processing., 
> javax.crypto., javax.imageio., javax.jws., javax.lang.model., 
> -javax.management.j2ee., javax.management., javax.naming., javax.net., 
> javax.print., javax.rmi., javax.script., -javax.security.auth.message., 
> javax.security.auth., javax.security.cert., javax.security.sasl., 
> javax.sound., javax.sql., javax.swing., javax.tools., javax.transaction., 
> -javax.xml.registry., -javax.xml.rpc., javax.xml., org.w3c.dom., 
> org.xml.sax., org.apache.commons.logging., org.apache.log4j., 
> -org.apache.hadoop.hbase., org.apache.hadoop., core-default.xml, 
> hdfs-default.xml, mapred-default.xml, yarn-default.xml]
> 2019-11-08 03:59:49,257 INFO org.apache.hadoop.service.AbstractService: 
> Service 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed 
> in state INITED
> java.lang.ClassNotFoundException: org.apache.auxtest.AuxServiceFromHDFS
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:189)
>   at 
> org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:157)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.getInstance(AuxiliaryServiceWithCustomClassLoader.java:169)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:270)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:321)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:478)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:936)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1016)
> {code}
>  
> The issue happens when the previously localized auxiliary service jar is 
> reused. On reuse, the localized jar file path is appended with /*, which 
> breaks class loading.
>  
>  
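
To make the failure mode concrete, here is a minimal, hypothetical sketch of 
the classpath bug (variable names are illustrative, not the actual AuxServices 
code). A /* wildcard is only meaningful on a directory; appending it to the 
already-localized jar *file* yields a path that matches nothing, so the custom 
classloader starts with an empty classpath (note "classpath: []" in the log 
above) and the service class cannot be resolved.

{code:java}
// Hypothetical sketch; names are illustrative, not the real AuxServices code.
Path localJar = new Path("/nm-local-dir/aux-service/AuxServiceFromHDFS.jar");

// Buggy reuse path: "/*" appended to a plain jar file matches nothing, so the
// ApplicationClassLoader is constructed with an empty classpath.
String buggyClassPath = localJar + "/*";

// Fixed behavior: point the classloader at the jar itself (append "/*" only
// when the localized path is a directory).
String fixedClassPath = localJar.toString();
{code}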



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9986) signalToContainer REST API does not work even if requested by the app owner

2019-11-18 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977161#comment-16977161
 ] 

Prabhu Joseph commented on YARN-9986:
-

[~kyungwan nam] Thanks for the patch. The patch looks good. Can you include a 
test case as well?

> signalToContainer REST API does not work even if requested by the app owner
> ---
>
> Key: YARN-9986
> URL: https://issues.apache.org/jira/browse/YARN-9986
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: restapi
>Reporter: kyungwan nam
>Assignee: kyungwan nam
>Priority: Major
> Attachments: YARN-9986.001.patch
>
>
> The signalToContainer REST API introduced in YARN-8693 does not work even 
> when requested by the app owner. 
> It works only when requested by an admin user:
> {code}
> $ kinit kwnam
> Password for kw...@test.org:
> $ curl  -H 'Content-Type: application/json' --negotiate -u : -X POST 
> https://rm002.test.org:8088/ws/v1/cluster/containers/container_e58_1573625560605_29927_01_01/signal/GRACEFUL_SHUTDOWN
> {"RemoteException":{"exception":"ForbiddenException","message":"java.lang.Exception:
>  Only admins can carry out this 
> operation.","javaClassName":"org.apache.hadoop.yarn.webapp.ForbiddenException"}}$
> $ kinit admin
> Password for ad...@test.org:
> $
> $ curl  -H 'Content-Type: application/json' --negotiate -u : -X POST 
> https://rm002.test.org:8088/ws/v1/cluster/containers/container_e58_1573625560605_29927_01_01/signal/GRACEFUL_SHUTDOWN
> $
> {code}
> In contrast, the app owner can signal the container using the command line, 
> as shown below.
> {code}
> $ kinit kwnam
> Password for kw...@test.org:
> $ yarn container -signal container_e58_1573625560605_29927_01_02  
> GRACEFUL_SHUTDOWN
> Signalling container container_e58_1573625560605_29927_01_02
> 2019-11-19 09:12:29,797 INFO impl.YarnClientImpl: Signalling container 
> container_e58_1573625560605_29927_01_02 with command GRACEFUL_SHUTDOWN
> 2019-11-19 09:12:29,920 INFO client.ConfiguredRMFailoverProxyProvider: 
> Failing over to rm2
> $
> {code}
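
A minimal sketch of the kind of server-side check that would address this (an 
assumption for illustration, not the actual YARN-9986 patch): authorize the 
caller against the application's MODIFY_APP ACL, as the RPC path behind 
{{yarn container -signal}} already does, instead of requiring admin rights. 
The {{aclsManager}}, {{adminAcl}}, and surrounding wiring below are assumed.

{code:java}
// Hedged sketch, not the committed patch. ApplicationACLsManager#checkAccess
// and AccessControlList#isUserAllowed are existing Hadoop APIs; the wiring of
// aclsManager, adminAcl, app, appId, and containerId is assumed here.
UserGroupInformation callerUGI =
    UserGroupInformation.createRemoteUser(callerUserName);
boolean isAdmin = adminAcl.isUserAllowed(callerUGI);
boolean ownerAllowed = aclsManager.checkAccess(callerUGI,
    ApplicationAccessType.MODIFY_APP, app.getUser(), appId);
if (!isAdmin && !ownerAllowed) {
  throw new ForbiddenException(
      "User " + callerUserName + " cannot signal " + containerId);
}
{code}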



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9987) Upgrade bower to 1.8.8

2019-11-18 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977153#comment-16977153
 ] 

Akira Ajisaka commented on YARN-9987:
-

If a committer merges the pull request, dependabot will become the contributor 
of record for this issue. Is that okay?

> Upgrade bower to 1.8.8
> --
>
> Key: YARN-9987
> URL: https://issues.apache.org/jira/browse/YARN-9987
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn-ui-v2
>Reporter: Akira Ajisaka
>Priority: Major
>
> Merge https://github.com/apache/hadoop/pull/1683 to fix some vulnerabilities.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9987) Upgrade bower to 1.8.8

2019-11-18 Thread Akira Ajisaka (Jira)
Akira Ajisaka created YARN-9987:
---

 Summary: Upgrade bower to 1.8.8
 Key: YARN-9987
 URL: https://issues.apache.org/jira/browse/YARN-9987
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-ui-v2
Reporter: Akira Ajisaka


Merge https://github.com/apache/hadoop/pull/1683 to fix some vulnerabilities.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9986) signalToContainer REST API does not work even if requested by the app owner

2019-11-18 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977105#comment-16977105
 ] 

Hadoop QA commented on YARN-9986:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  8s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m  1s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}141m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestLeaderElectorService |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9986 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12986178/YARN-9986.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c5d52e3d5d46 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 0e22e9a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/25193/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25193/testReport/ |
| Max. process+thread count | 814 (vs. ulimit of 5500) |
| modules | C: 

[jira] [Commented] (YARN-9965) Fix NodeManager failing to start on subsequent times when Hdfs Auxillary Jar is set

2019-11-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977082#comment-16977082
 ] 

Hudson commented on YARN-9965:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17661 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17661/])
YARN-9965. Fix NodeManager failing to start on subsequent times when (abmodi: 
rev dc3f4fc2f44c22300cd0b4832469b8cd59a1f228)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestAuxServices.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java


> Fix NodeManager failing to start on subsequent times when Hdfs Auxillary Jar 
> is set
> ---
>
> Key: YARN-9965
> URL: https://issues.apache.org/jira/browse/YARN-9965
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: auxservices, nodemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9965-001.patch, YARN-9965-addendum-01.patch
>
>
> Loading an auxiliary jar from an HDFS location on a NodeManager works as 
> expected the first time. Subsequent restarts fail with a 
> ClassNotFoundException:
> {code:java}
> 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: 
> classpath: []
> 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: 
> system classes: [java., javax.accessibility., javax.activation., 
> javax.activity., javax.annotation., javax.annotation.processing., 
> javax.crypto., javax.imageio., javax.jws., javax.lang.model., 
> -javax.management.j2ee., javax.management., javax.naming., javax.net., 
> javax.print., javax.rmi., javax.script., -javax.security.auth.message., 
> javax.security.auth., javax.security.cert., javax.security.sasl., 
> javax.sound., javax.sql., javax.swing., javax.tools., javax.transaction., 
> -javax.xml.registry., -javax.xml.rpc., javax.xml., org.w3c.dom., 
> org.xml.sax., org.apache.commons.logging., org.apache.log4j., 
> -org.apache.hadoop.hbase., org.apache.hadoop., core-default.xml, 
> hdfs-default.xml, mapred-default.xml, yarn-default.xml]
> 2019-11-08 03:59:49,257 INFO org.apache.hadoop.service.AbstractService: 
> Service 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed 
> in state INITED
> java.lang.ClassNotFoundException: org.apache.auxtest.AuxServiceFromHDFS
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:189)
>   at 
> org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:157)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.getInstance(AuxiliaryServiceWithCustomClassLoader.java:169)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:270)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:321)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:478)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:936)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1016)
> {code}
>  
> The issue happens when the previously localized auxiliary service jar is 
> reused. On reuse, the localized jar file path is appended with /*, which 
> breaks class loading.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: 

[jira] [Commented] (YARN-9965) Fix NodeManager failing to start on subsequent times when Hdfs Auxillary Jar is set

2019-11-18 Thread Abhishek Modi (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977073#comment-16977073
 ] 

Abhishek Modi commented on YARN-9965:
-

Thanks [~prabhujoseph]. The latest addendum patch looks good to me. Committed 
it to trunk.

> Fix NodeManager failing to start on subsequent times when Hdfs Auxillary Jar 
> is set
> ---
>
> Key: YARN-9965
> URL: https://issues.apache.org/jira/browse/YARN-9965
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: auxservices, nodemanager
>Affects Versions: 3.2.0
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9965-001.patch, YARN-9965-addendum-01.patch
>
>
> Loading an auxiliary jar from an HDFS location on a NodeManager works as 
> expected the first time. Subsequent restarts fail with a 
> ClassNotFoundException:
> {code:java}
> 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: 
> classpath: []
> 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: 
> system classes: [java., javax.accessibility., javax.activation., 
> javax.activity., javax.annotation., javax.annotation.processing., 
> javax.crypto., javax.imageio., javax.jws., javax.lang.model., 
> -javax.management.j2ee., javax.management., javax.naming., javax.net., 
> javax.print., javax.rmi., javax.script., -javax.security.auth.message., 
> javax.security.auth., javax.security.cert., javax.security.sasl., 
> javax.sound., javax.sql., javax.swing., javax.tools., javax.transaction., 
> -javax.xml.registry., -javax.xml.rpc., javax.xml., org.w3c.dom., 
> org.xml.sax., org.apache.commons.logging., org.apache.log4j., 
> -org.apache.hadoop.hbase., org.apache.hadoop., core-default.xml, 
> hdfs-default.xml, mapred-default.xml, yarn-default.xml]
> 2019-11-08 03:59:49,257 INFO org.apache.hadoop.service.AbstractService: 
> Service 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed 
> in state INITED
> java.lang.ClassNotFoundException: org.apache.auxtest.AuxServiceFromHDFS
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at 
> org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:189)
>   at 
> org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:157)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.getInstance(AuxiliaryServiceWithCustomClassLoader.java:169)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:270)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:321)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:478)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:936)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1016)
> {code}
>  
> The issue happens when the previously localized auxiliary service jar is 
> reused. On reuse, the localized jar file path is appended with /*, which 
> breaks class loading.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6492) Generate queue metrics for each partition

2019-11-18 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977070#comment-16977070
 ] 

Jonathan Hung commented on YARN-6492:
-

[~maniraj...@gmail.com] thanks for working on this feature. I'm still going 
through the patch, but I have some comments:
 * in pSourceName, how come we split partition by Q_SPLITTER? I think we don't 
need to do any splitting here (there should only be one partition)
 * I don't see PartitionQueueMetrics#forQueue invoked anywhere except a test 
case, unless I missed it? Do we need this? The metrics registration seems to 
happen in QueueMetrics#getPartitionQueueMetrics. Also, how come we call 
getQueueMetrics().put(partition, metrics)? I think it should be keyed by 
partition + queueName
 * Do we need a separate getPartitionMetrics? Can we track a partition's 
metrics via that partition + root queue?
 * For setAvailableResourcesToUser - how come we add this bit?
{noformat}
 if (parent != null) {
 parent.setAvailableResourcesToUser(partition, user, limit);
}{noformat}

 * It seems that in some methods, e.g. incrPendingResources, 
decrPendingResources, etc., we need to add some metrics inheritance from child 
queue to parent queue (e.g., this is what's done for the default partition):
{noformat}
_decrPendingResources(containers, res);
QueueMetrics userMetrics = getUserMetrics(user);
if (userMetrics != null) {
  userMetrics.decrPendingResources(partition, user, containers, res);
}
if (parent != null) {
  parent.decrPendingResources(partition, user, containers, res);
} {noformat}
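
On the keying point above, a hedged sketch of registration keyed by partition 
plus queue name (the PartitionQueueMetrics constructor shown is an assumption; 
the WIP patch may differ):

{code:java}
// Illustrative only: key the cache by partition *and* queue so the same queue
// under different partitions gets distinct metrics objects.
String key = partition + "." + queueName;   // e.g. "x.root.a"
QueueMetrics metrics = getQueueMetrics().get(key);
if (metrics == null) {
  metrics = new PartitionQueueMetrics(metricsSystem, queueName, parentQueue,
      enableUserMetrics, conf, partition);  // assumed constructor shape
  getQueueMetrics().put(key, metrics);
}
{code}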

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, 
> partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object, which captures metrics either in the 
> default partition or across all partitions (after YARN-6467 it will be the 
> default partition).
> But having the per-partition metrics would be very useful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-6492) Generate queue metrics for each partition

2019-11-18 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977070#comment-16977070
 ] 

Jonathan Hung edited comment on YARN-6492 at 11/19/19 2:32 AM:
---

[~maniraj...@gmail.com] thanks for working on this feature. I'm still going 
through the patch, but I have some comments:
 * in pSourceName, how come we split partition by Q_SPLITTER? I think we don't 
need to do any splitting here (there should only be one partition)
 * I don't see PartitionQueueMetrics#forQueue invoked anywhere except a test 
case, unless I missed it? Do we need this? The metrics registration seems to 
happen in QueueMetrics#getPartitionQueueMetrics. Also, how come we call 
getQueueMetrics().put(partition, metrics)? I think it should be keyed by 
partition + queueName
 * Do we need a separate getPartitionMetrics? Can we track a partition's 
metrics via that partition + root queue?
 * For setAvailableResourcesToUser - how come we add this bit?
{noformat}
 if (parent != null) {
 parent.setAvailableResourcesToUser(partition, user, limit);
}{noformat}

 * It seems that in some methods, e.g. incrPendingResources, 
decrPendingResources, etc., we need to add some metrics inheritance from child 
queue to parent queue (e.g., this is what's done for the default partition):
{noformat}
_decrPendingResources(containers, res);
QueueMetrics userMetrics = getUserMetrics(user);
if (userMetrics != null) {
  userMetrics.decrPendingResources(partition, user, containers, res);
}
if (parent != null) {
  parent.decrPendingResources(partition, user, containers, res);
} {noformat}
Maybe we need to override the {{parent}} field in QueueMetrics inside 
PartitionQueueMetrics?


was (Author: jhung):
[~maniraj...@gmail.com] thanks for working on this feature. I'm still going 
through the patch, but I have some comments:
 * in pSourceName, how come we split partition by Q_SPLITTER? I think we don't 
need to do any splitting here (there should only be one partition)
 * I don't see PartitionQueueMetrics#forQueue invoked anywhere except a test 
case, unless I missed it? Do we need this? The metrics registration seems to 
happen in QueueMetrics#getPartitionQueueMetrics. Also, how come we call 
getQueueMetrics().put(partition, metrics)? I think it should be keyed by 
partition + queueName
 * Do we need a separate getPartitionMetrics? Can we track a partition's 
metrics via that partition + root queue?
 * For setAvailableResourcesToUser - how come we add this bit?
{noformat}
 if (parent != null) {
 parent.setAvailableResourcesToUser(partition, user, limit);
}{noformat}

 * It seems that in some methods, e.g. incrPendingResources, 
decrPendingResources, etc., we need to add some metrics inheritance from child 
queue to parent queue (e.g., this is what's done for the default partition):
{noformat}
_decrPendingResources(containers, res);
QueueMetrics userMetrics = getUserMetrics(user);
if (userMetrics != null) {
  userMetrics.decrPendingResources(partition, user, containers, res);
}
if (parent != null) {
  parent.decrPendingResources(partition, user, containers, res);
} {noformat}

> Generate queue metrics for each partition
> -
>
> Key: YARN-6492
> URL: https://issues.apache.org/jira/browse/YARN-6492
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler
>Reporter: Jonathan Hung
>Assignee: Manikandan R
>Priority: Major
> Attachments: PartitionQueueMetrics_default_partition.txt, 
> PartitionQueueMetrics_x_partition.txt, PartitionQueueMetrics_y_partition.txt, 
> YARN-6492.001.patch, YARN-6492.002.patch, YARN-6492.003.patch, 
> YARN-6492.004.patch, YARN-6492.005.WIP.patch, YARN-6492.006.WIP.patch, 
> partition_metrics.txt
>
>
> We are interested in having queue metrics for all partitions. Right now each 
> queue has one QueueMetrics object, which captures metrics either in the 
> default partition or across all partitions (after YARN-6467 it will be the 
> default partition).
> But having the per-partition metrics would be very useful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9986) signalToContainer REST API does not work even if requested by the app owner

2019-11-18 Thread kyungwan nam (Jira)
kyungwan nam created YARN-9986:
--

 Summary: signalToContainer REST API does not work even if 
requested by the app owner
 Key: YARN-9986
 URL: https://issues.apache.org/jira/browse/YARN-9986
 Project: Hadoop YARN
  Issue Type: Bug
  Components: restapi
Reporter: kyungwan nam
Assignee: kyungwan nam


The signalToContainer REST API introduced in YARN-8693 does not work even 
when requested by the app owner. 
It works only when requested by an admin user:

{code}
$ kinit kwnam
Password for kw...@test.org:
$ curl  -H 'Content-Type: application/json' --negotiate -u : -X POST 
https://rm002.test.org:8088/ws/v1/cluster/containers/container_e58_1573625560605_29927_01_01/signal/GRACEFUL_SHUTDOWN
{"RemoteException":{"exception":"ForbiddenException","message":"java.lang.Exception:
 Only admins can carry out this 
operation.","javaClassName":"org.apache.hadoop.yarn.webapp.ForbiddenException"}}$
$ kinit admin
Password for ad...@test.org:
$
$ curl  -H 'Content-Type: application/json' --negotiate -u : -X POST 
https://rm002.test.org:8088/ws/v1/cluster/containers/container_e58_1573625560605_29927_01_01/signal/GRACEFUL_SHUTDOWN
$
{code}

In contrast, the app owner can signal the container using the command line, as 
shown below.

{code}
$ kinit kwnam
Password for kw...@test.org:
$ yarn container -signal container_e58_1573625560605_29927_01_02  
GRACEFUL_SHUTDOWN
Signalling container container_e58_1573625560605_29927_01_02
2019-11-19 09:12:29,797 INFO impl.YarnClientImpl: Signalling container 
container_e58_1573625560605_29927_01_02 with command GRACEFUL_SHUTDOWN
2019-11-19 09:12:29,920 INFO client.ConfiguredRMFailoverProxyProvider: Failing 
over to rm2
$
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH

2019-11-18 Thread Wilfred Spiegelenburg (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976997#comment-16976997
 ] 

Wilfred Spiegelenburg commented on YARN-8373:
-

{quote}[~wilfreds] could we add a test to check whether reading from the nodes 
doesn't cause any problems, or are any existing tests already covering this?
{quote}
The existing code already checks this and has a unit test. The set is a copy 
of the nodes, and YARN-3675 covers the read path in its tests.
{quote}On another note, does continuousSchedulingAttempt need the write lock 
by any chance?
{quote}
The write lock is taken in {{attemptScheduling}} a little later. Here a read 
lock is enough, since we are not making any changes, and it impacts scheduling 
less. The method that does the real scheduling, {{attemptScheduling}}, must 
hold the write lock and already takes it. It also re-checks that the node 
still exists, since it might have been removed between building the list and 
actually allocating (your concern above).
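
A hedged sketch of the locking pattern described above (shapes are 
illustrative, not the exact FairScheduler code):

{code:java}
// Illustrative sketch of the read/write lock split discussed above.
List<FSSchedulerNode> nodes;
readLock.lock();
try {
  // Read lock only: sortedNodeList() sorts a copy of the node set, so no
  // scheduler state is mutated here.
  nodes = nodeTracker.sortedNodeList(nodeAvailableResourceComparator);
} finally {
  readLock.unlock();
}
for (FSSchedulerNode node : nodes) {
  // attemptScheduling() takes the write lock itself and re-checks that the
  // node still exists before allocating, which covers the removal race.
  attemptScheduling(node);
}
{code}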

> RM  Received RMFatalEvent of type CRITICAL_THREAD_CRASH
> ---
>
> Key: YARN-8373
> URL: https://issues.apache.org/jira/browse/YARN-8373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.9.0
>Reporter: Girish Bhat
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: newbie
> Attachments: YARN-8373.001.patch, YARN-8373.002.patch, 
> YARN-8373.003.patch, YARN-8373.004.patch, YARN-8373.005.patch
>
>
>  
>  
> {noformat}
> sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 
> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 
> 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on 
> 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum 
> 0a76a9a32a5257331741f8d5932f183 This command was run using 
> /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat}
> This is for version 2.9.0 
>  
> {noformat}
> 2018-05-25 05:53:12,742 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received 
> RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai
> rSchedulerContinuousScheduling, that exited unexpectedly: 
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,743 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down 
> the resource manager.
> 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: a critical thread, FairSchedulerContinuousScheduling, that exited 
> unexpectedly: java.lang.IllegalArgumentException: Comparison method violates 
> its general contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,772 ERROR 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
>  ExpiredTokenRemover received java.lang.InterruptedException: sleep 
> interrupted{noformat}
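
For context on the "Comparison method violates its general contract!" error (a 
minimal illustration of the failure class, not the FairScheduler code): 
TimSort throws this when a comparator's answers change mid-sort, which can 
happen when the sort key is read live from state another thread is mutating.

{code:java}
// If another thread updates a node's unallocated resource while
// Collections.sort() runs, compare() can report a < b at one point and
// b < a later, breaking the total-order contract TimSort relies on and
// producing the IllegalArgumentException above.
Comparator<FSSchedulerNode> byAvailableMemory = (n1, n2) ->
    Long.compare(n2.getUnallocatedResource().getMemorySize(),
                 n1.getUnallocatedResource().getMemorySize());
// Mitigation pattern: snapshot each node's sort key once before sorting.
{code}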



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9562) Add Java changes for the new RuncContainerRuntime

2019-11-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976943#comment-16976943
 ] 

Hudson commented on YARN-9562:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17659 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17659/])
YARN-9562. Add Java changes for the new RuncContainerRuntime. (ebadger: rev 
0e22e9ab83438af37d821cb2f96e31f9a19ace2c)
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/TestRuncContainerRuntime.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/runc/package-info.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/TestImageTagToManifestPlugin.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/TestDockerContainerRuntime.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/runtime/ContainerRuntimeConstants.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/runc/ImageManifest.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/runc/RuncManifestToResourcesPlugin.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/LinuxContainerRuntime.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/LinuxContainerRuntimeConstants.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/OCIContainerRuntime.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/runc/RuncContainerExecutorConfig.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/runc/HdfsManifestToResourcesPlugin.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DelegatingLinuxContainerRuntime.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/privileged/PrivilegedOperation.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/TestHdfsManifestToResourcesPlugin.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DockerLinuxContainerRuntime.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java
* (edit) hadoop-project/src/site/site.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* (edit) 

[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime

2019-11-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976937#comment-16976937
 ] 

Hudson commented on YARN-9561:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17658 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17658/])
YARN-9561. Add C changes for the new RuncContainerRuntime. Contributed 
(ebadger: rev 289bbca8709e17f962f608154d0bfbd617a69297)
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/runc/runc_write_config.h
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/utils/test-string-utils.cc
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/file-utils.h
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/cJSON/cJSON.h
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/util.c
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/string-utils.h
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/runc/runc_base_ctx.c
* (edit) LICENSE.txt
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/cJSON/cJSON.c
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/file-utils.c
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.h
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/runc/runc_base_ctx.h
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/runc/runc_launch_cmd.c
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/runc/runc.h
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/runc/runc.c
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/runc/runc_launch_cmd.h
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/runc/runc_config.h
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/runc/runc_reap.c
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/runc/runc_reap.h
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/util.h
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test_main.cc
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/string-utils.c
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/runc/runc_write_config.c
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/CMakeLists.txt
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/utils/test_runc_util.cc


> Add C changes for the new RuncContainerRuntime
> --
>
> Key: YARN-9561
> URL: https://issues.apache.org/jira/browse/YARN-9561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9561.001.patch, YARN-9561.002.patch, 
> YARN-9561.003.patch, YARN-9561.004.patch, YARN-9561.005.patch, 
> YARN-9561.006.patch, YARN-9561.007.patch, YARN-9561.008.patch, 
> YARN-9561.009.patch, 

[jira] [Updated] (YARN-9561) Add C changes for the new RuncContainerRuntime

2019-11-18 Thread Eric Badger (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-9561:
--
Fix Version/s: 3.3.0

Thanks to [~Jim_Brennan], [~eyang], [~ccondit], [~shaneku...@gmail.com], and 
[~jlowe] for the reviews and comments!

I committed this to trunk!

> Add C changes for the new RuncContainerRuntime
> --
>
> Key: YARN-9561
> URL: https://issues.apache.org/jira/browse/YARN-9561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9561.001.patch, YARN-9561.002.patch, 
> YARN-9561.003.patch, YARN-9561.004.patch, YARN-9561.005.patch, 
> YARN-9561.006.patch, YARN-9561.007.patch, YARN-9561.008.patch, 
> YARN-9561.009.patch, YARN-9561.010.patch, YARN-9561.011.patch, 
> YARN-9561.012.patch, YARN-9561.013.patch, YARN-9561.014.patch
>
>
> This JIRA will be used to add the C changes to the container-executor native 
> binary that are necessary for the new RuncContainerRuntime. There should be 
> no changes to existing code paths. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9562) Add Java changes for the new RuncContainerRuntime

2019-11-18 Thread Eric Badger (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-9562:
--
Fix Version/s: 3.3.0

Thanks to [~Jim_Brennan], [~eyang], [~ccondit], [~shaneku...@gmail.com], and 
[~jlowe] for the reviews and comments!

I committed this to trunk!

> Add Java changes for the new RuncContainerRuntime
> -
>
> Key: YARN-9562
> URL: https://issues.apache.org/jira/browse/YARN-9562
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Badger
>Assignee: Eric Badger
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: YARN-9562.001.patch, YARN-9562.002.patch, 
> YARN-9562.003.patch, YARN-9562.004.patch, YARN-9562.005.patch, 
> YARN-9562.006.patch, YARN-9562.007.patch, YARN-9562.008.patch, 
> YARN-9562.009.patch, YARN-9562.010.patch, YARN-9562.011.patch, 
> YARN-9562.012.patch, YARN-9562.013.patch, YARN-9562.014.patch, 
> YARN-9562.015.patch
>
>
> This JIRA will be used to add the Java changes for the new 
> RuncContainerRuntime. This will work off of YARN-9560 to use much of the 
> existing DockerLinuxContainerRuntime code once it is moved up into an 
> abstract class that can be extended. 
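
A hedged sketch of the class layout this describes (shapes and member names 
are assumptions; see YARN-9560 and the committed file list for the real 
structure): the shared logic moves up into the abstract 
{{OCIContainerRuntime}}, which both runtimes extend.

{code:java}
// Illustrative shape only; members are assumptions, not the exact API.
public abstract class OCIContainerRuntime implements LinuxContainerRuntime {
  // Shared launch/signal/environment plumbing lives here.
}

public class DockerLinuxContainerRuntime extends OCIContainerRuntime { /* ... */ }

public class RuncContainerRuntime extends OCIContainerRuntime { /* ... */ }
{code}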



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9011) Race condition during decommissioning

2019-11-18 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976799#comment-16976799
 ] 

Hadoop QA commented on YARN-9011:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  8m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.2 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
16s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
26s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
35s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
15s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
9s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m  1s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
0s{color} | {color:green} branch-3.2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
49s{color} | {color:green} branch-3.2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 40s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
41s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 70m 20s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}187m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisherForV2 |
|   | 
hadoop.yarn.server.resourcemanager.metrics.TestCombinedSystemMetricsPublisher |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:63396beab41 |
| JIRA Issue | YARN-9011 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12986126/YARN-9011-branch-3.2.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c632febb0e47 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.2 / 

[jira] [Commented] (YARN-9975) Support proxy acl user for CapacityScheduler

2019-11-18 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976796#comment-16976796
 ] 

Eric Payne commented on YARN-9975:
--

[~cane], does this JIRA have the same requirements outlined in YARN-1115?

> Support proxy acl user for CapacityScheduler
> 
>
> Key: YARN-9975
> URL: https://issues.apache.org/jira/browse/YARN-9975
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: zhoukang
>Assignee: zhoukang
>Priority: Major
>
> As commented in https://issues.apache.org/jira/browse/YARN-9698, I will open 
> a new JIRA for the proxy user feature. 
> The background is that we have a long-running SQL thriftserver for many users:
> {quote}{{user -> sql proxy -> sql thriftserver}}{quote}
> But we do not have keytabs for all users on the 'sql proxy'. We just use a 
> super user like 'sql_prc' to submit the 'sql thriftserver' application. To 
> support this, the scheduler should be changed to support a proxy user ACL.
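
A hedged sketch of the proxy-user pattern described above, using standard UGI 
APIs (the scheduler-side ACL change this JIRA proposes is not shown; the 
principal, keytab path, and client wiring are examples):

{code:java}
// The gateway holds only the super user's keytab and submits on behalf of
// the end user via a proxy UGI.
UserGroupInformation superUser = UserGroupInformation
    .loginUserFromKeytabAndReturnUGI("sql_prc@EXAMPLE.COM",
        "/etc/security/keytabs/sql_prc.keytab");
UserGroupInformation proxyUgi =
    UserGroupInformation.createProxyUser(endUserName, superUser);
ApplicationId appId = proxyUgi.doAs(
    (PrivilegedExceptionAction<ApplicationId>) () ->
        yarnClient.submitApplication(appContext));
{code}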



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9011) Race condition during decommissioning

2019-11-18 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976782#comment-16976782
 ] 

Hadoop QA commented on YARN-9011:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
57s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  6s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 
12s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 84m 
57s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}205m 44s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9011 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12985199/YARN-9011-009.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux bebc4d6a66bf 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 7f81172 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25189/testReport/ |
| Max. process+thread count | 1348 (vs. ulimit of 5500) |
| modules 

[jira] [Commented] (YARN-9912) Capacity scheduler: support u:user2:%secondary_group queue mapping

2019-11-18 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976760#comment-16976760
 ] 

Hadoop QA commented on YARN-9912:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 29 unchanged - 1 fixed = 30 total (was 30) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 16s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 81m  8s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}133m 45s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.placement.TestPlacementManager |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueManagementDynamicEditPolicy
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerQueueMappingFactory
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerAutoQueueCreation
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9912 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12986119/YARN-9912-004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 30ec5c16dd2c 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 
16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2764236 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 

[jira] [Commented] (YARN-9912) Capacity scheduler: support u:user2:%secondary_group queue mapping

2019-11-18 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976723#comment-16976723
 ] 

Hadoop QA commented on YARN-9912:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  0s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 29 unchanged - 1 fixed = 30 total (was 30) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 44s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 81m 34s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}133m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerQueueMappingFactory
 |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerAutoQueueCreation
 |
|   | hadoop.yarn.server.resourcemanager.placement.TestPlacementManager |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueManagementDynamicEditPolicy
 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9912 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12986119/YARN-9912-004.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 1f003fe6701c 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 
11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 7f81172 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 

[jira] [Commented] (YARN-9868) Validate %primary_group queue in CS queue manager

2019-11-18 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976676#comment-16976676
 ] 

Hadoop QA commented on YARN-9868:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 29s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 44 unchanged - 1 fixed = 45 total (was 45) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 26s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 88m  
6s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}144m 33s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9868 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12986115/YARN-9868-003.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux eb354839708a 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 
05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 34cb595 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/25188/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/25188/testReport/ |
| Max. process+thread count | 828 (vs. ulimit of 5500) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 

[jira] [Updated] (YARN-9011) Race condition during decommissioning

2019-11-18 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9011:
---
Attachment: YARN-9011-branch-3.2.001.patch

> Race condition during decommissioning
> -
>
> Key: YARN-9011
> URL: https://issues.apache.org/jira/browse/YARN-9011
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.1.1
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9011-001.patch, YARN-9011-002.patch, 
> YARN-9011-003.patch, YARN-9011-004.patch, YARN-9011-005.patch, 
> YARN-9011-006.patch, YARN-9011-007.patch, YARN-9011-008.patch, 
> YARN-9011-009.patch, YARN-9011-branch-3.2.001.patch
>
>
> During internal testing, we found a nasty race condition which occurs during 
> decommissioning.
> Node manager, incorrect behaviour:
> {noformat}
> 2018-06-18 21:00:17,634 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received 
> SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting 
> down.
> 2018-06-18 21:00:17,634 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from 
> ResourceManager: Disallowed NodeManager nodeId: node-6.hostname.com:8041 
> hostname:node-6.hostname.com
> {noformat}
> Node manager, expected behaviour:
> {noformat}
> 2018-06-18 21:07:37,377 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received 
> SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting 
> down.
> 2018-06-18 21:07:37,377 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from 
> ResourceManager: DECOMMISSIONING  node-6.hostname.com:8041 is ready to be 
> decommissioned
> {noformat}
> Note the two different messages from the RM ("Disallowed NodeManager" vs 
> "DECOMMISSIONING"). The problem is that {{ResourceTrackerService}} can see an 
> inconsistent state of nodes while they're being updated:
> {noformat}
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: hostsReader 
> include:{172.26.12.198,node-7.hostname.com,node-2.hostname.com,node-5.hostname.com,172.26.8.205,node-8.hostname.com,172.26.23.76,172.26.22.223,node-6.hostname.com,172.26.9.218,node-4.hostname.com,node-3.hostname.com,172.26.13.167,node-9.hostname.com,172.26.21.221,172.26.10.219}
>  exclude:{node-6.hostname.com}
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: Gracefully 
> decommission node node-6.hostname.com:8041 with state RUNNING
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: 
> Disallowed NodeManager nodeId: node-6.hostname.com:8041 node: 
> node-6.hostname.com
> 2018-06-18 21:00:17,576 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Put Node 
> node-6.hostname.com:8041 in DECOMMISSIONING.
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn 
> IP=172.26.22.115OPERATION=refreshNodes  TARGET=AdminService 
> RESULT=SUCCESS
> 2018-06-18 21:00:17,577 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Preserve 
> original total capability: 
> 2018-06-18 21:00:17,577 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: 
> node-6.hostname.com:8041 Node Transitioned from RUNNING to DECOMMISSIONING
> {noformat}
> When the decommissioning succeeds, there is no output logged from 
> {{ResourceTrackerService}}.
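A condensed sketch of the check-then-act shape of this race (class, field and message strings below are illustrative, not the real code; the actual interaction is between NodesListManager refreshing the host lists and ResourceTrackerService handling a heartbeat in between):

{code:java}
import java.util.Set;

// The refresher updates two pieces of state in separate steps: first the
// exclude list, then the node's DECOMMISSIONING marker.
class HostLists {
  volatile Set<String> excludes;          // step 1: updated by refreshNodes
  volatile boolean decommissioningMarked; // step 2: set by a later event
}

class HeartbeatHandler {
  String handle(HostLists lists, String host) {
    if (!lists.excludes.contains(host)) {
      return "OK";                        // normal heartbeat
    }
    if (lists.decommissioningMarked) {
      return "DECOMMISSIONING " + host;   // expected graceful path
    }
    // Heartbeat observed between step 1 and step 2: the node is already
    // excluded but not yet marked DECOMMISSIONING, so it is treated as
    // plainly disallowed and told to SHUTDOWN -- the bug reported here.
    return "Disallowed NodeManager nodeId: " + host;
  }
}
{code}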



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9011) Race condition during decommissioning

2019-11-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976645#comment-16976645
 ] 

Hudson commented on YARN-9011:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17657 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17657/])
YARN-9011. Race condition during decommissioning. Contributed by Peter 
(snemeth: rev 27642367ef3409a9ca93747c6c2cc279c087a4c0)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/HostsFileReader.java
* (edit) 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/util/TestHostsFileReader.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/NodesListManager.java


> Race condition during decommissioning
> -
>
> Key: YARN-9011
> URL: https://issues.apache.org/jira/browse/YARN-9011
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.1.1
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9011-001.patch, YARN-9011-002.patch, 
> YARN-9011-003.patch, YARN-9011-004.patch, YARN-9011-005.patch, 
> YARN-9011-006.patch, YARN-9011-007.patch, YARN-9011-008.patch, 
> YARN-9011-009.patch
>
>
> During internal testing, we found a nasty race condition which occurs during 
> decommissioning.
> Node manager, incorrect behaviour:
> {noformat}
> 2018-06-18 21:00:17,634 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received 
> SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting 
> down.
> 2018-06-18 21:00:17,634 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from 
> ResourceManager: Disallowed NodeManager nodeId: node-6.hostname.com:8041 
> hostname:node-6.hostname.com
> {noformat}
> Node manager, expected behaviour:
> {noformat}
> 2018-06-18 21:07:37,377 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received 
> SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting 
> down.
> 2018-06-18 21:07:37,377 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from 
> ResourceManager: DECOMMISSIONING  node-6.hostname.com:8041 is ready to be 
> decommissioned
> {noformat}
> Note the two different messages from the RM ("Disallowed NodeManager" vs 
> "DECOMMISSIONING"). The problem is that {{ResourceTrackerService}} can see an 
> inconsistent state of nodes while they're being updated:
> {noformat}
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: hostsReader 
> include:{172.26.12.198,node-7.hostname.com,node-2.hostname.com,node-5.hostname.com,172.26.8.205,node-8.hostname.com,172.26.23.76,172.26.22.223,node-6.hostname.com,172.26.9.218,node-4.hostname.com,node-3.hostname.com,172.26.13.167,node-9.hostname.com,172.26.21.221,172.26.10.219}
>  exclude:{node-6.hostname.com}
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: Gracefully 
> decommission node node-6.hostname.com:8041 with state RUNNING
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: 
> Disallowed NodeManager nodeId: node-6.hostname.com:8041 node: 
> node-6.hostname.com
> 2018-06-18 21:00:17,576 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Put Node 
> node-6.hostname.com:8041 in DECOMMISSIONING.
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn 
> IP=172.26.22.115OPERATION=refreshNodes  TARGET=AdminService 
> RESULT=SUCCESS
> 2018-06-18 21:00:17,577 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Preserve 
> original total capability: 
> 2018-06-18 21:00:17,577 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: 
> node-6.hostname.com:8041 Node Transitioned from RUNNING to DECOMMISSIONING
> {noformat}
> When the decommissioning succeeds, there is no output logged from 
> {{ResourceTrackerService}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9011) Race condition during decommissioning

2019-11-18 Thread Szilard Nemeth (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976635#comment-16976635
 ] 

Szilard Nemeth commented on YARN-9011:
--

Thanks again [~pbacsko]!
Committed to trunk!
Thanks [~adam.antal] for the review!

Could you please take care of the backport patch to at least branch-3.2?

Thanks!

> Race condition during decommissioning
> -
>
> Key: YARN-9011
> URL: https://issues.apache.org/jira/browse/YARN-9011
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 3.1.1
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Attachments: YARN-9011-001.patch, YARN-9011-002.patch, 
> YARN-9011-003.patch, YARN-9011-004.patch, YARN-9011-005.patch, 
> YARN-9011-006.patch, YARN-9011-007.patch, YARN-9011-008.patch, 
> YARN-9011-009.patch
>
>
> During internal testing, we found a nasty race condition which occurs during 
> decommissioning.
> Node manager, incorrect behaviour:
> {noformat}
> 2018-06-18 21:00:17,634 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received 
> SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting 
> down.
> 2018-06-18 21:00:17,634 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from 
> ResourceManager: Disallowed NodeManager nodeId: node-6.hostname.com:8041 
> hostname:node-6.hostname.com
> {noformat}
> Node manager, expected behaviour:
> {noformat}
> 2018-06-18 21:07:37,377 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received 
> SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting 
> down.
> 2018-06-18 21:07:37,377 WARN 
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from 
> ResourceManager: DECOMMISSIONING  node-6.hostname.com:8041 is ready to be 
> decommissioned
> {noformat}
> Note the two different messages from the RM ("Disallowed NodeManager" vs 
> "DECOMMISSIONING"). The problem is that {{ResourceTrackerService}} can see an 
> inconsistent state of nodes while they're being updated:
> {noformat}
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: hostsReader 
> include:{172.26.12.198,node-7.hostname.com,node-2.hostname.com,node-5.hostname.com,172.26.8.205,node-8.hostname.com,172.26.23.76,172.26.22.223,node-6.hostname.com,172.26.9.218,node-4.hostname.com,node-3.hostname.com,172.26.13.167,node-9.hostname.com,172.26.21.221,172.26.10.219}
>  exclude:{node-6.hostname.com}
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: Gracefully 
> decommission node node-6.hostname.com:8041 with state RUNNING
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: 
> Disallowed NodeManager nodeId: node-6.hostname.com:8041 node: 
> node-6.hostname.com
> 2018-06-18 21:00:17,576 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Put Node 
> node-6.hostname.com:8041 in DECOMMISSIONING.
> 2018-06-18 21:00:17,575 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn 
> IP=172.26.22.115OPERATION=refreshNodes  TARGET=AdminService 
> RESULT=SUCCESS
> 2018-06-18 21:00:17,577 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Preserve 
> original total capability: 
> 2018-06-18 21:00:17,577 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: 
> node-6.hostname.com:8041 Node Transitioned from RUNNING to DECOMMISSIONING
> {noformat}
> When the decommissioning succeeds, there is no output logged from 
> {{ResourceTrackerService}}.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH

2019-11-18 Thread Sunil G (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976615#comment-16976615
 ] 

Sunil G commented on YARN-8373:
---

[~wilfreds] could we add a test to check whether reading from the nodes doesn't 
cause any problems, or are any existing tests already covering the same?

On another note, does continuousSchedulingAttempt need the writeLock by any 
chance?

> RM  Received RMFatalEvent of type CRITICAL_THREAD_CRASH
> ---
>
> Key: YARN-8373
> URL: https://issues.apache.org/jira/browse/YARN-8373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.9.0
>Reporter: Girish Bhat
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: newbie
> Attachments: YARN-8373.001.patch, YARN-8373.002.patch, 
> YARN-8373.003.patch, YARN-8373.004.patch, YARN-8373.005.patch
>
>
>  
>  
> {noformat}
> sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 
> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 
> 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on 
> 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum 
> 0a76a9a32a5257331741f8d5932f183 This command was run using 
> /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat}
> This is for version 2.9.0 
>  
> {noformat}
> 2018-05-25 05:53:12,742 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received 
> RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai
> rSchedulerContinuousScheduling, that exited unexpectedly: 
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,743 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down 
> the resource manager.
> 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: a critical thread, FairSchedulerContinuousScheduling, that exited 
> unexpectedly: java.lang.IllegalArgumentException: Comparison method violates 
> its general contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,772 ERROR 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
>  ExpiredTokenRemover received java.lang.InterruptedException: sleep 
> interrupted{noformat}
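For illustration, the "Comparison method violates its general contract!" failure is the classic symptom of sorting with a comparator whose key can change mid-sort, which is plausible here since node resources are updated by heartbeats while FairSchedulerContinuousScheduling sorts the node list. A self-contained sketch of the hazard and of the snapshot-based way out (all names hypothetical):

{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class ComparatorRaceSketch {
  static class Node {
    volatile long unallocatedMb; // mutated by heartbeat threads
    Node(long mb) { this.unallocatedMb = mb; }
  }

  public static void main(String[] args) {
    List<Node> nodes = new ArrayList<>();
    for (int i = 0; i < 100; i++) {
      nodes.add(new Node(i));
    }

    // Unsafe: this comparator re-reads a field that another thread may
    // change between compare() calls, so TimSort can observe orderings
    // that contradict each other and throw IllegalArgumentException.
    Comparator<Node> live = Comparator.comparingLong(n -> n.unallocatedMb);

    // Safe: snapshot the sort key once, then sort on the frozen copy.
    Map<Node, Long> snapshot = new IdentityHashMap<>();
    for (Node n : nodes) {
      snapshot.put(n, n.unallocatedMb);
    }
    nodes.sort(Comparator.comparingLong(snapshot::get));
  }
}
{code}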



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-9912) Capacity scheduler: support u:user2:%secondary_group queue mapping

2019-11-18 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976599#comment-16976599
 ] 

Peter Bacsko edited comment on YARN-9912 at 11/18/19 2:32 PM:
--

[~maniraj...@gmail.com] the patch cannot be applied to trunk because it depends on 
a piece of code which hasn't been committed yet. So I added the missing testcase, 
but it's likely that an extra rebase will be necessary as soon as YARN-9866 
appears on trunk. I also didn't include the changes to CapacityScheduler.md 
because that will change in YARN-9969. Sometimes these cross-dependencies are 
annoying, but we have to live with them.


was (Author: pbacsko):
[~maniraj...@gmail.com] the patch cannot be applied to trunk because it depends on 
a piece of code which hasn't been committed yet. So I added the missing testcase, 
but it's likely that an extra rebase will be necessary as soon as YARN-9866 
appears on trunk. I also didn't include the changes to CapacityScheduler.md 
because that will change in YARN-9969.

> Capacity scheduler: support u:user2:%secondary_group queue mapping
> --
>
> Key: YARN-9912
> URL: https://issues.apache.org/jira/browse/YARN-9912
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler, capacityscheduler
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9912-004.patch, YARN-9912.001.patch, 
> YARN-9912.002.patch, YARN-9912.003.patch
>
>
> Similar to u:user2:%primary_group mapping, add support for 
> u:user2:%secondary_group queue mapping as well.
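For reference, the intended configuration would look like the capacity-scheduler.xml entry below; this is a sketch that assumes the new placeholder follows the same queue-mappings syntax as the existing %primary_group form:

{noformat}
<property>
  <name>yarn.scheduler.capacity.queue-mappings</name>
  <!-- applications submitted by user2 go to the queue named after
       user2's secondary group -->
  <value>u:user2:%secondary_group</value>
</property>
{noformat}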



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9912) Capacity scheduler: support u:user2:%secondary_group queue mapping

2019-11-18 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976599#comment-16976599
 ] 

Peter Bacsko commented on YARN-9912:


[~maniraj...@gmail.com] the patch cannot be applied to trunk because it depends on 
a piece of code which hasn't been committed yet. So I added the missing testcase, 
but it's likely that an extra rebase will be necessary as soon as YARN-9866 
appears on trunk. I also didn't include the changes to CapacityScheduler.md 
because that will change in YARN-9969.

> Capacity scheduler: support u:user2:%secondary_group queue mapping
> --
>
> Key: YARN-9912
> URL: https://issues.apache.org/jira/browse/YARN-9912
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler, capacityscheduler
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9912-004.patch, YARN-9912.001.patch, 
> YARN-9912.002.patch, YARN-9912.003.patch
>
>
> Similar to u:user2:%primary_group mapping, add support for 
> u:user2:%secondary_group queue mapping as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9912) Capacity scheduler: support u:user2:%secondary_group queue mapping

2019-11-18 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9912:
---
Attachment: YARN-9912-004.patch

> Capacity scheduler: support u:user2:%secondary_group queue mapping
> --
>
> Key: YARN-9912
> URL: https://issues.apache.org/jira/browse/YARN-9912
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler, capacityscheduler
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9912-004.patch, YARN-9912.001.patch, 
> YARN-9912.002.patch, YARN-9912.003.patch
>
>
> Similar to u:user2:%primary_group mapping, add support for 
> u:user2:%secondary_group queue mapping as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9868) Validate %primary_group queue in CS queue manager

2019-11-18 Thread Peter Bacsko (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-9868:
---
Attachment: YARN-9868-003.patch

> Validate %primary_group queue in CS queue manager
> -
>
> Key: YARN-9868
> URL: https://issues.apache.org/jira/browse/YARN-9868
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9868-003.patch, YARN-9868.001.patch, 
> YARN-9868.002.patch
>
>
> As part of the %secondary_group mapping, we ensure that the queue resolved by 
> %secondary_group exists, by checking it against CSQueueManager while processing 
> the queue mapping. Similarly, we will need to do the same for %primary_group.
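A hedged sketch of the validation being asked for (Groups#getGroups and the queue lookup on CSQueueManager are real Hadoop APIs, but the wiring below is illustrative only):

{code:java}
import java.io.IOException;
import java.util.List;
import java.util.function.Function;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.Groups;

public class PrimaryGroupMappingSketch {
  /**
   * Resolves %primary_group for a user and returns the candidate queue
   * name, or null if the queue is unknown. queueLookup stands in for
   * CSQueueManager#getQueue; a null result means the scheduler has no
   * such queue.
   */
  static String resolvePrimaryGroupQueue(Configuration conf, String user,
      Function<String, Object> queueLookup) throws IOException {
    List<String> groups =
        Groups.getUserToGroupsMappingService(conf).getGroups(user);
    String mappedQueue = groups.get(0); // primary group is listed first
    return queueLookup.apply(mappedQueue) == null ? null : mappedQueue;
  }
}
{code}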



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7411) Inter-Queue preemption's computeFixpointAllocation need to handle absolute resources while computing normalizedGuarantee

2019-11-18 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-7411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976574#comment-16976574
 ] 

Eric Payne commented on YARN-7411:
--

[~sunilg] and [~leftnoteasy],

Rather than complicate this by backporting YARN-7483, I would like to just 
backport YARN-7411 without the above removal of code from 
{{ResourcePBImpl#initResources}}. Do you have any objections?

> Inter-Queue preemption's computeFixpointAllocation need to handle absolute 
> resources while computing normalizedGuarantee
> 
>
> Key: YARN-7411
> URL: https://issues.apache.org/jira/browse/YARN-7411
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: YARN-5881
>Reporter: Sunil G
>Assignee: Sunil G
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: YARN-7411-YARN-5881.004.patch, 
> YARN-7411-YARN-5881.005.patch, YARN-7411.001.patch, 
> YARN-7441.YARN-5881.002.patch, YARN-7441.YARN-5881.003.patch
>
>
> {{normalizedGuarantee}} is computed based on queue's capacity. This has to be 
> updated correctly when CS starts to accept queue's capacity in terms of 
> absolute resource.
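As a reading aid, {{normalizedGuarantee}} is each queue's guaranteed share divided by the sum of all guarantees; with absolute resources the division has to happen on resource amounts per resource type rather than on percentages. A minimal numeric sketch under that assumption:

{code:java}
import java.util.Arrays;

public class NormalizedGuaranteeSketch {
  public static void main(String[] args) {
    // Absolute guarantees per queue for one resource type (memory in MB).
    double[] guaranteed = {4096, 8192, 4096};
    double total = Arrays.stream(guaranteed).sum();
    double[] normalized = Arrays.stream(guaranteed)
        .map(g -> total == 0 ? 0 : g / total)
        .toArray();
    System.out.println(Arrays.toString(normalized)); // [0.25, 0.5, 0.25]
    // Repeat per resource type (memory, vcores, ...) for absolute mode.
  }
}
{code}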



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9969) Improve yarn.scheduler.capacity.queue-mappings documentation

2019-11-18 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976536#comment-16976536
 ] 

Peter Bacsko commented on YARN-9969:


Thanks for the patch [~maniraj...@gmail.com].

Please correct the following:

{noformat}
 is mapped to queue name same as 
...
 is mapped to queue name same as 
...
 is mapped to 
...
  is mapped to 
{noformat}

There are tags between {{}} and {{}}; I think we should just drop the "<" ">" 
characters.

> Improve yarn.scheduler.capacity.queue-mappings documentation
> 
>
> Key: YARN-9969
> URL: https://issues.apache.org/jira/browse/YARN-9969
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Manikandan R
>Assignee: Manikandan R
>Priority: Major
> Attachments: YARN-9969.001.patch, YARN-9969.002.patch
>
>
> As discussed in 
> https://issues.apache.org/jira/browse/YARN-9865?focusedCommentId=16971482=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16971482,
>  scope of this Jira is to improve the yarn.scheduler.capacity.queue-mappings 
> in CapacityScheduler.md.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-9985) YARN: Unsupported "transitionToObserver" option displaying for rmadmin command

2019-11-18 Thread Souryakanta Dwivedy (Jira)
Souryakanta Dwivedy created YARN-9985:
-

 Summary: YARN: Unsupported "transitionToObserver" option 
displaying for rmadmin command
 Key: YARN-9985
 URL: https://issues.apache.org/jira/browse/YARN-9985
 Project: Hadoop YARN
  Issue Type: Bug
  Components: RM, yarn
Affects Versions: 3.2.1
Reporter: Souryakanta Dwivedy
 Attachments: image-2019-11-18-18-31-17-755.png, 
image-2019-11-18-18-35-54-688.png

Unsupported "transitionToObserver" option displaying for rmadmin command

Check the options for the yarn rmadmin command.
It displays the "-transitionToObserver " option, which is not supported 
by the yarn rmadmin command; this is wrong behavior.
But if you check yarn rmadmin -help, it does not display any 
"-transitionToObserver " option.

 

!image-2019-11-18-18-31-17-755.png!

 

==

install/hadoop/resourcemanager/bin> ./yarn rmadmin -help
rmadmin is the command to execute YARN administrative commands.
The full syntax is:

yarn rmadmin [-refreshQueues] [-refreshNodes [-g|graceful [timeout in seconds] 
-client|server]] [-refreshNodesResources] 
[-refreshSuperUserGroupsConfiguration] [-refreshUserToGroupsMappings] 
[-refreshAdminAcls] [-refreshServiceAcl] [-getGroup [username]] 
[-addToClusterNodeLabels 
<"label1(exclusive=true),label2(exclusive=false),label3">] 
[-removeFromClusterNodeLabels ] [-replaceLabelsOnNode 
<"node1[:port]=label1,label2 node2[:port]=label1"> [-failOnUnknownNodes]] 
[-directlyAccessNodeLabelStore] [-refreshClusterMaxPriority] 
[-updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout]) or 
-updateNodeResource [NodeID] [ResourceTypes] ([OvercommitTimeout])] 
*{color:#FF}[-transitionToActive [--forceactive] ]{color} 
{color:#FF}[-transitionToStandby ]{color}* [-getServiceState 
] [-getAllServiceState] [-checkHealth ] [-help [cmd]]

-refreshQueues: Reload the queues' acls, states and scheduler specific 
properties.
 ResourceManager will reload the mapred-queues configuration file.
 -refreshNodes [-g|graceful [timeout in seconds] -client|server]: Refresh the 
hosts information at the ResourceManager. Here [-g|graceful [timeout in 
seconds] -client|server] is optional, if we specify the timeout then 
ResourceManager will wait for timeout before marking the NodeManager as 
decommissioned. The -client|server indicates if the timeout tracking should be 
handled by the client or the ResourceManager. The client-side tracking is 
blocking, while the server-side tracking is not. Omitting the timeout, or a 
timeout of -1, indicates an infinite timeout. Known Issue: the server-side 
tracking will immediately decommission if an RM HA failover occurs.
 -refreshNodesResources: Refresh resources of NodeManagers at the 
ResourceManager.
 -refreshSuperUserGroupsConfiguration: Refresh superuser proxy groups mappings
 -refreshUserToGroupsMappings: Refresh user-to-groups mappings
 -refreshAdminAcls: Refresh acls for administration of ResourceManager
 -refreshServiceAcl: Reload the service-level authorization policy file.
 ResourceManager will reload the authorization policy file.
 -getGroups [username]: Get the groups which given user belongs to.
 -addToClusterNodeLabels 
<"label1(exclusive=true),label2(exclusive=false),label3">: add to cluster node 
labels. Default exclusivity is true
 -removeFromClusterNodeLabels  (label splitted by ","): 
remove from cluster node labels
 -replaceLabelsOnNode <"node1[:port]=label1,label2 node2[:port]=label1,label2"> 
[-failOnUnknownNodes] : replace labels on nodes (please note that we do not 
support specifying multiple labels on a single host for now.)
 [-failOnUnknownNodes] is optional, when we set this option, it will fail if 
specified nodes are unknown.
 -directlyAccessNodeLabelStore: This is DEPRECATED, will be removed in future 
releases. Directly access node label store, with this option, all node label 
related operations will not connect RM. Instead, they will access/modify stored 
node labels directly. By default, it is false (access via RM). AND PLEASE NOTE: 
if you configured yarn.node-labels.fs-store.root-dir to a local directory 
(instead of NFS or HDFS), this option will only work when the command run on 
the machine where RM is running.
 -refreshClusterMaxPriority: Refresh cluster max priority
 -updateNodeResource [NodeID] [MemSize] [vCores] ([OvercommitTimeout])
 or
 [NodeID] [resourcetypes] ([OvercommitTimeout]). : Update resource on specific 
node.
 -transitionToActive [--forceactive] : Transitions the service into 
Active state
 -transitionToStandby : Transitions the service into Standby state
 -getServiceState : Returns the state of the service
 -getAllServiceState: Returns the state of all the services
 -checkHealth : Requests that the service perform a health check.
The HAAdmin tool will exit with a non-zero exit code
if the check fails.
 -help [cmd]: Displays help for 

[jira] [Commented] (YARN-9984) FSPreemptionThread crash with NullPointerException

2019-11-18 Thread Sunil G (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976438#comment-16976438
 ] 

Sunil G commented on YARN-9984:
---

This is a straightforward one. I think a UT may be tougher on this.

Since it's a null check, let's get this in. +1

I will commit this later today if there are no objections.

> FSPreemptionThread crash with NullPointerException
> --
>
> Key: YARN-9984
> URL: https://issues.apache.org/jira/browse/YARN-9984
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 3.0.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Major
> Attachments: YARN-9984.001.patch
>
>
> When an application is unregistered there is a chance that there are still 
> containers running on a node for that application. In all cases we handle the 
> application missing from the RM gracefully (log a message and continue) 
> except for the FS pre-emption thread.
> In case the application is removed but some containers are still linked to a 
> node, the FSPreemptionThread will crash with an NPE when it tries to retrieve 
> the application id for the attempt:
> {code:java}
> FSAppAttempt app =
> scheduler.getSchedulerApp(container.getApplicationAttemptId());
> ApplicationId appId = app.getApplicationId();{code}
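A minimal sketch of the kind of guard this implies (logging, control flow and the surrounding loop are hypothetical; the actual patch may differ):

{code:java}
// Inside the preemption scan: skip containers whose application has
// already been unregistered instead of dereferencing a null attempt.
FSAppAttempt app =
    scheduler.getSchedulerApp(container.getApplicationAttemptId());
if (app == null) {
  LOG.debug("Skipping container " + container.getContainerId()
      + ": application attempt " + container.getApplicationAttemptId()
      + " is no longer registered with the scheduler");
  continue; // consistent with the graceful handling used elsewhere
}
ApplicationId appId = app.getApplicationId();
{code}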



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9982) Fix Container API example link in NodeManager REST API doc

2019-11-18 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976414#comment-16976414
 ] 

Hudson commented on YARN-9982:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17651 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/17651/])
YARN-9982. Fix Container API example link in NodeManager REST API doc. 
(prabhujoseph: rev bd454348b0af2f7ed7c7c9a2dfaed4ddbb41)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/NodeManagerRest.md


> Fix Container API example link in NodeManager REST API doc
> --
>
> Key: YARN-9982
> URL: https://issues.apache.org/jira/browse/YARN-9982
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Charan Hebri
>Assignee: Charan Hebri
>Priority: Trivial
> Fix For: 3.3.0
>
> Attachments: YARN-9982.001.patch
>
>
> In 
> [https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html#Container_API]
> {noformat}
> GET 
> http://nm-http-address:port/ws/v1/nodes/containers/container_1326121700862_0007_01_01{noformat}
> should be changed to,
> {noformat}
> GET 
> http://nm-http-address:port/ws/v1/node/containers/container_1326121700862_0007_01_01{noformat}
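For reference, a minimal client against the corrected path (host and container id are placeholders taken from the doc example; 8042 is only the default NM web port):

{code:java}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class NmContainerInfoSketch {
  public static void main(String[] args) throws Exception {
    // Note the corrected segment: "node", not "nodes".
    URL url = new URL("http://nm-http-address:8042/ws/v1/node/containers/"
        + "container_1326121700862_0007_01_01");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestProperty("Accept", "application/json");
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line); // container status as JSON
      }
    }
  }
}
{code}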



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9982) Fix Container API example link in NodeManager REST API doc

2019-11-18 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976396#comment-16976396
 ] 

Prabhu Joseph commented on YARN-9982:
-

Thank you [~charanh] for fixing this issue. I have committed 
[^YARN-9982.001.patch] to trunk.

> Fix Container API example link in NodeManager REST API doc
> --
>
> Key: YARN-9982
> URL: https://issues.apache.org/jira/browse/YARN-9982
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Charan Hebri
>Assignee: Charan Hebri
>Priority: Trivial
> Attachments: YARN-9982.001.patch
>
>
> In 
> [https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html#Container_API]
> {noformat}
> GET 
> http://nm-http-address:port/ws/v1/nodes/containers/container_1326121700862_0007_01_01{noformat}
> should be changed to,
> {noformat}
> GET 
> http://nm-http-address:port/ws/v1/node/containers/container_1326121700862_0007_01_01{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9982) Fix Container API example link in NodeManager REST API doc

2019-11-18 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9982:

Fix Version/s: 3.3.0

> Fix Container API example link in NodeManager REST API doc
> --
>
> Key: YARN-9982
> URL: https://issues.apache.org/jira/browse/YARN-9982
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
>Reporter: Charan Hebri
>Assignee: Charan Hebri
>Priority: Trivial
> Fix For: 3.3.0
>
> Attachments: YARN-9982.001.patch
>
>
> In 
> [https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html#Container_API]
> {noformat}
> GET 
> http://nm-http-address:port/ws/v1/nodes/containers/container_1326121700862_0007_01_01{noformat}
> should be changed to,
> {noformat}
> GET 
> http://nm-http-address:port/ws/v1/node/containers/container_1326121700862_0007_01_01{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9836) General usability improvements in showSimulationTrace.html

2019-11-18 Thread Adam Antal (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976395#comment-16976395
 ] 

Adam Antal commented on YARN-9836:
--

[~snemeth] could you please backport the patches to branch-3.1 and branch-3.2?

> General usability improvements in showSimulationTrace.html
> --
>
> Key: YARN-9836
> URL: https://issues.apache.org/jira/browse/YARN-9836
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler-load-simulator
>Affects Versions: 3.3.0
>Reporter: Adam Antal
>Assignee: Adam Antal
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: YARN-9836.001.patch, YARN-9836.002.patch, 
> YARN-9836.003.patch, YARN-9836.branch-3.1.001.patch, 
> YARN-9836.branch-3.1.001.patch, YARN-9836.branch-3.2.001.patch, 
> YARN-9836.branch-3.2.001.patch
>
>
> There are some small usability improvements that can be made for the offline 
> analysis page (showSimulationTrace.html):
> - empty divs can be hidden while no data is displayed
> - the site can be refactored to be responsive given that bootstrap is already 
> available as third party library
> - there's no proper error handling in the site (e.g. a JSON is malformed and 
> similar cases) which is really a big problem
> - there's no indentation in the raw html file which makes supportability even 
> worse



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-9900) Revert to previous state when Invalid Config is applied and Refresh Support in SchedulerConfig Format

2019-11-18 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated YARN-9900:

Fix Version/s: 3.1.4
   3.2.2
   3.3.0

> Revert to previous state when Invalid Config is applied and Refresh Support 
> in SchedulerConfig Format
> -
>
> Key: YARN-9900
> URL: https://issues.apache.org/jira/browse/YARN-9900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Affects Versions: 3.3.0, 3.2.2, 3.1.4
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.4
>
> Attachments: YARN-9900-001.patch, YARN-9900-002.patch, 
> YARN-9900-003.patch, YARN-9900-branch-3.1.001.patch
>
>
> The Format Scheduler Config option has to revert to the previous scheduler 
> configuration when the capacity-scheduler.xml contents are invalid, and a 
> refresh has to be done after the format so that the RM need not be restarted.
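A minimal sketch of the intended revert-on-failure behavior; the holder class and the validate/refreshScheduler methods below are illustrative placeholders, not the actual CapacityScheduler internals:

{code:java}
import org.apache.hadoop.conf.Configuration;

// Illustrative sketch: keep the last known-good scheduler configuration
// and fall back to it when a newly formatted/loaded capacity-scheduler.xml
// fails validation, refreshing in place so the RM need not restart.
public class SchedulerConfigHolder {
  private Configuration lastGood;

  public synchronized void applyFormatted(Configuration candidate) {
    try {
      validate(candidate);          // throws on malformed contents
      lastGood = candidate;         // accept the new configuration
      refreshScheduler(candidate);  // refresh in place, no RM restart
    } catch (Exception e) {
      System.err.println("Invalid scheduler config, reverting: "
          + e.getMessage());
      refreshScheduler(lastGood);   // restore the previous state
    }
  }

  private void validate(Configuration conf) {
    // placeholder: e.g. check that queue capacities are well-formed and
    // throw IllegalArgumentException otherwise
  }

  private void refreshScheduler(Configuration conf) {
    // placeholder: reinitialize queues from conf
  }
}
{code}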






[jira] [Closed] (YARN-9900) Revert to previous state when Invalid Config is applied and Refresh Support in SchedulerConfig Format

2019-11-18 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph closed YARN-9900.
---

> Revert to previous state when Invalid Config is applied and Refresh Support 
> in SchedulerConfig Format
> -
>
> Key: YARN-9900
> URL: https://issues.apache.org/jira/browse/YARN-9900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Affects Versions: 3.3.0, 3.2.2, 3.1.4
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.4
>
> Attachments: YARN-9900-001.patch, YARN-9900-002.patch, 
> YARN-9900-003.patch, YARN-9900-branch-3.1.001.patch
>
>
> The Format Scheduler Config option has to revert to the previous scheduler 
> configuration when the capacity-scheduler.xml contents are invalid, and a 
> refresh has to be done after the format so that the RM need not be restarted.






[jira] [Commented] (YARN-9900) Revert to previous state when Invalid Config is applied and Refresh Support in SchedulerConfig Format

2019-11-18 Thread Prabhu Joseph (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976391#comment-16976391
 ] 

Prabhu Joseph commented on YARN-9900:
-

Backported the patch to branch-3.1.

> Revert to previous state when Invalid Config is applied and Refresh Support 
> in SchedulerConfig Format
> -
>
> Key: YARN-9900
> URL: https://issues.apache.org/jira/browse/YARN-9900
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Affects Versions: 3.3.0, 3.2.2, 3.1.4
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
>Priority: Major
> Attachments: YARN-9900-001.patch, YARN-9900-002.patch, 
> YARN-9900-003.patch, YARN-9900-branch-3.1.001.patch
>
>
> The Format Scheduler Config option has to revert to the previous scheduler 
> configuration when the capacity-scheduler.xml contents are invalid, and a 
> refresh has to be done after the format so that the RM need not be restarted.






[jira] [Commented] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH

2019-11-18 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976389#comment-16976389
 ] 

Peter Bacsko commented on YARN-8373:


Thanks [~wilfreds], now it looks solid. +1 (non-binding) again from me.

> RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH
> ---
>
> Key: YARN-8373
> URL: https://issues.apache.org/jira/browse/YARN-8373
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.9.0
>Reporter: Girish Bhat
>Assignee: Wilfred Spiegelenburg
>Priority: Major
>  Labels: newbie
> Attachments: YARN-8373.001.patch, YARN-8373.002.patch, 
> YARN-8373.003.patch, YARN-8373.004.patch, YARN-8373.005.patch
>
>
>  
>  
> {noformat}
> sudo -u yarn /usr/local/hadoop/latest/bin/yarn version
> Hadoop 2.9.0
> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 756ebc8394e473ac25feac05fa493f6d612e6c50
> Compiled by arsuresh on 2017-11-13T23:15Z
> Compiled with protoc 2.5.0
> From source with checksum 0a76a9a32a5257331741f8d5932f183
> This command was run using /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat}
> This is for version 2.9.0 
>  
> {noformat}
> 2018-05-25 05:53:12,742 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received 
> RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai
> rSchedulerContinuousScheduling, that exited unexpectedly: 
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,743 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down 
> the resource manager.
> 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
> status 1: a critical thread, FairSchedulerContinuousScheduling, that exited 
> unexpectedly: java.lang.IllegalArgumentException: Comparison method violates 
> its general contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,772 ERROR 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
>  ExpiredTokenRemover received java.lang.InterruptedException: sleep 
> interrupted{noformat}
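For context, TimSort throws "Comparison method violates its general contract!" when the comparator's ordering is inconsistent. A common cause, and the likely one here, is that the compared values mutate while the sort is running, as a node's available resources can during continuous scheduling. Below is a minimal sketch of the failure mode and the usual snapshot-before-sort fix; the names are illustrative, not the actual FairScheduler code:

{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch: if the value a comparator reads can change while TimSort runs,
// the observed ordering becomes inconsistent and TimSort throws
// IllegalArgumentException. Not the actual FairScheduler code.
public class SortSnapshotDemo {

  static class Node {
    volatile long availableMemory; // mutated concurrently in the real system
  }

  // Unsafe: re-reads live, mutable state on every comparison.
  static final Comparator<Node> LIVE =
      Comparator.comparingLong(n -> n.availableMemory);

  // The usual fix: freeze the sort key once, then sort the frozen copies,
  // so every comparison sees the same value for a given node.
  static List<Node> sortSafely(List<Node> nodes) {
    List<long[]> keyed = new ArrayList<>();        // {frozenKey, index}
    for (int i = 0; i < nodes.size(); i++) {
      keyed.add(new long[] {nodes.get(i).availableMemory, i});
    }
    keyed.sort(Comparator.comparingLong(p -> p[0])); // keys cannot move
    List<Node> sorted = new ArrayList<>();
    for (long[] k : keyed) {
      sorted.add(nodes.get((int) k[1]));
    }
    return sorted;
  }
}
{code}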


