[jira] [Commented] (YARN-9697) Efficient allocation of Opportunistic containers.
[ https://issues.apache.org/jira/browse/YARN-9697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972097#comment-16972097 ]

Bibin Chundatt commented on YARN-9697:

Thank you [~abmodi]
Overall, the patch looks good to me.

> Efficient allocation of Opportunistic containers.
>
> Key: YARN-9697
> URL: https://issues.apache.org/jira/browse/YARN-9697
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Abhishek Modi
> Assignee: Abhishek Modi
> Priority: Major
> Attachments: YARN-9697.001.patch, YARN-9697.002.patch, YARN-9697.003.patch, YARN-9697.004.patch, YARN-9697.005.patch, YARN-9697.006.patch, YARN-9697.007.patch, YARN-9697.008.patch, YARN-9697.009.patch, YARN-9697.ut.patch, YARN-9697.ut2.patch, YARN-9697.wip1.patch, YARN-9697.wip2.patch
>
> In the current implementation, opportunistic containers are allocated based
> on the number of queued opportunistic containers reported in the node
> heartbeat. This information becomes stale as soon as more opportunistic
> containers are allocated on that node.
> Allocation of opportunistic containers happens on the same heartbeat in which
> the AM asks for the containers. When multiple applications request
> Opportunistic containers, containers might get allocated on the same set of
> nodes, because containers already allocated on a node are not considered while
> serving requests from different applications. This can lead to uneven
> allocation of Opportunistic containers across the cluster, leading to
> increased queuing time.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
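The staleness problem described in the YARN-9697 summary can be illustrated with a minimal sketch. It is not taken from any of the attached patches: the class `OpportunisticNodePicker` and its methods are hypothetical, and it assumes, as a simplification, that each heartbeat's queue length already includes everything allocated before it. The idea is only to add a per-node counter of allocations made since the last heartbeat to the reported queue length before choosing a node.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not the YARN-9697 patch. */
final class OpportunisticNodePicker {
  // Queue length last reported by each node's heartbeat.
  private final Map<String, Integer> reportedQueueLength = new HashMap<>();
  // Containers handed to each node since that heartbeat (assumed bookkeeping).
  private final Map<String, Integer> allocatedSinceHeartbeat = new HashMap<>();

  /** Records a heartbeat and resets the node's local allocation counter. */
  void onHeartbeat(String node, int queuedContainers) {
    reportedQueueLength.put(node, queuedContainers);
    allocatedSinceHeartbeat.put(node, 0);
  }

  /** Picks the node with the lowest estimated load; ties go to the smaller name. */
  String allocateOne() {
    String best = null;
    int bestLoad = 0;
    for (Map.Entry<String, Integer> e : reportedQueueLength.entrySet()) {
      String node = e.getKey();
      int load = e.getValue() + allocatedSinceHeartbeat.getOrDefault(node, 0);
      if (best == null || load < bestLoad
          || (load == bestLoad && node.compareTo(best) < 0)) {
        best = node;
        bestLoad = load;
      }
    }
    if (best != null) {
      // Remember the allocation so the next request sees the updated load.
      allocatedSinceHeartbeat.merge(best, 1, Integer::sum);
    }
    return best;
  }

  public static void main(String[] args) {
    OpportunisticNodePicker picker = new OpportunisticNodePicker();
    picker.onHeartbeat("n1", 0);
    picker.onHeartbeat("n2", 0);
    for (int i = 0; i < 4; i++) {
      System.out.println(picker.allocateOne());
    }
  }
}
```

With only the heartbeat value, every request between two heartbeats would see the same "least loaded" node; with the extra counter, consecutive requests alternate between the two idle nodes in this example.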
[jira] [Commented] (YARN-9965) Fix NodeManager failing to start when Hdfs Auxillary Jar is set
[ https://issues.apache.org/jira/browse/YARN-9965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972086#comment-16972086 ]

Abhishek Modi commented on YARN-9965:

[~vinodkv] - I think that's a mistake on my end; I should have insisted on a UT before committing it. Since this was a very minor fix and I had also tested it, I went ahead with the commit. Should I create a separate Jira for writing a UT for this, or should we add an addendum patch here with only the UT?

> Fix NodeManager failing to start when Hdfs Auxillary Jar is set
>
> Key: YARN-9965
> URL: https://issues.apache.org/jira/browse/YARN-9965
> Project: Hadoop YARN
> Issue Type: Bug
> Components: auxservices, nodemanager
> Affects Versions: 3.2.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
> Fix For: 3.3.0
> Attachments: YARN-9965-001.patch
>
> Loading an auxiliary jar from an HDFS location on a node manager works as
> expected the first time. The subsequent restart fails with
> ClassNotFoundException:
> {code:java}
> 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: classpath: []
> 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: system classes: [java., javax.accessibility., javax.activation., javax.activity., javax.annotation., javax.annotation.processing., javax.crypto., javax.imageio., javax.jws., javax.lang.model., -javax.management.j2ee., javax.management., javax.naming., javax.net., javax.print., javax.rmi., javax.script., -javax.security.auth.message., javax.security.auth., javax.security.cert., javax.security.sasl., javax.sound., javax.sql., javax.swing., javax.tools., javax.transaction., -javax.xml.registry., -javax.xml.rpc., javax.xml., org.w3c.dom., org.xml.sax., org.apache.commons.logging., org.apache.log4j., -org.apache.hadoop.hbase., org.apache.hadoop., core-default.xml, hdfs-default.xml, mapred-default.xml, yarn-default.xml]
> 2019-11-08 03:59:49,257 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed in state INITED
> java.lang.ClassNotFoundException: org.apache.auxtest.AuxServiceFromHDFS
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:189)
>   at org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:157)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.getInstance(AuxiliaryServiceWithCustomClassLoader.java:169)
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:270)
>   at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:321)
>   at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>   at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:478)
>   at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:936)
>   at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1016)
> {code}
>
> The issue happens when reusing the previously localized auxiliary service jar.
> On reuse, /* is appended to the localized jar file path, which causes the
> failure.
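To make the failure mode in the YARN-9965 description concrete, here is a small sketch of the distinction it turns on. This is not the code from YARN-9965-001.patch; the class `AuxClasspathEntry` and its method are hypothetical. It assumes only the usual Java classpath convention that a `dir/*` wildcard expands to the jars inside a directory, so appending `/*` to a path that already names a jar file matches nothing, which would be consistent with the empty `classpath: []` in the log.

```java
/** Hypothetical illustration; not the NodeManager's actual code. */
final class AuxClasspathEntry {
  private AuxClasspathEntry() {}

  /**
   * Builds a classpath entry for a localized resource.
   * A directory gets the wildcard suffix; a jar file is used as-is.
   */
  static String toClasspathEntry(String localizedPath, boolean isDirectory) {
    if (isDirectory) {
      return localizedPath.endsWith("/")
          ? localizedPath + "*"
          : localizedPath + "/*";
    }
    // Appending "/*" here would yield "aux.jar/*", which matches no jars.
    return localizedPath;
  }

  public static void main(String[] args) {
    // The paths below are made up for the example.
    System.out.println(toClasspathEntry("/local/aux/lib", true));
    System.out.println(toClasspathEntry("/local/aux/aux-service.jar", false));
  }
}
```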
[jira] [Commented] (YARN-9537) Add configuration to disable AM preemption
[ https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972075#comment-16972075 ]

Yufei Gu commented on YARN-9537:

[~cane], Thanks for the patch. +1 for patch 006. Will commit later.

> Add configuration to disable AM preemption
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: fairscheduler
> Affects Versions: 3.2.0, 3.1.2
> Reporter: zhoukang
> Assignee: zhoukang
> Priority: Major
> Attachments: YARN-9537-002.patch, YARN-9537.001.patch, YARN-9537.003.patch, YARN-9537.004.patch, YARN-9537.005.patch, YARN-9537.006.patch
>
> In this issue, I will add a configuration to support disabling AM preemption.
[jira] [Commented] (YARN-9965) Fix NodeManager failing to start when Hdfs Auxillary Jar is set
[ https://issues.apache.org/jira/browse/YARN-9965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972063#comment-16972063 ]

Vinod Kumar Vavilapalli commented on YARN-9965:

bq. -1 test4tests 0m 0s The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

There's no rationale given from the contributor or committer as to why we have not modified any unit tests. Why is that?

> Fix NodeManager failing to start when Hdfs Auxillary Jar is set
>
> Key: YARN-9965
> URL: https://issues.apache.org/jira/browse/YARN-9965
> Project: Hadoop YARN
> Issue Type: Bug
> Components: auxservices, nodemanager
> Affects Versions: 3.2.0
> Reporter: Prabhu Joseph
> Assignee: Prabhu Joseph
> Priority: Major
> Fix For: 3.3.0
> Attachments: YARN-9965-001.patch
[jira] [Commented] (YARN-9537) Add configuration to disable AM preemption
[ https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16972057#comment-16972057 ]

Hadoop QA commented on YARN-9537:

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 57s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 29s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 2 new + 18 unchanged - 0 fixed = 20 total (was 18) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 84m 40s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}141m 15s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9537 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985575/YARN-9537.006.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 8d7e652828d0 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 30b93f9 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/25141/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25141/testReport/ |
| Max. process+thread count | 814 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U:
[jira] [Updated] (YARN-9537) Add configuration to disable AM preemption
[ https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhoukang updated YARN-9537:
    Attachment: YARN-9537.006.patch

> Add configuration to disable AM preemption
>
> Key: YARN-9537
> URL: https://issues.apache.org/jira/browse/YARN-9537
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: fairscheduler
> Affects Versions: 3.2.0, 3.1.2
> Reporter: zhoukang
> Assignee: zhoukang
> Priority: Major
> Attachments: YARN-9537-002.patch, YARN-9537.001.patch, YARN-9537.003.patch, YARN-9537.004.patch, YARN-9537.005.patch, YARN-9537.006.patch
>
> In this issue, I will add a configuration to support disabling AM preemption.
[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971972#comment-16971972 ]

Hadoop QA commented on YARN-9561:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 34s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 14m 52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 63m 29s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red} 15m 35s{color} | {color:red} root generated 4 new + 23 unchanged - 3 fixed = 27 total (was 26) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 15m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 14m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}157m 54s{color} | {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 1m 3s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}287m 5s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
| | hadoop.hdfs.TestMaintenanceState |
| | hadoop.yarn.server.webproxy.TestWebAppProxyServlet |
| | hadoop.yarn.server.webproxy.amfilter.TestAmFilter |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9561 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985549/YARN-9561.012.patch |
| Optional Tests | dupname asflicense compile cc mvnsite javac unit |
| uname | Linux bc21c94d0655 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 30b93f9 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| cc | https://builds.apache.org/job/PreCommit-YARN-Build/25140/artifact/out/diff-compile-cc-root.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/25140/artifact/out/patch-unit-root.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25140/testReport/ |
| Max. process+thread count | 3171 (vs. ulimit of 5500) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager . U: . |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25140/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> Add C changes for the new RuncContainerRuntime
>
> Key:
[jira] [Commented] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH
[ https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971887#comment-16971887 ]

Wilfred Spiegelenburg commented on YARN-8373:

See my comment:
{quote}
The one inside the scheduler prevents updates of nodes to flow through as much as possible.
{quote}
There is no guarantee that there will not be a change. The only way we can prevent all changes is by locking down all nodes in the list, which we do not want. Putting the scheduler lock in place will quiet things down but does not guarantee it.

The PriorityQueue does not avoid change. It handles the fact that objects can change while in the list.

> RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH
>
> Key: YARN-8373
> URL: https://issues.apache.org/jira/browse/YARN-8373
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Affects Versions: 2.9.0
> Reporter: Girish Bhat
> Assignee: Wilfred Spiegelenburg
> Priority: Major
> Labels: newbie
> Attachments: YARN-8373.001.patch, YARN-8373.002.patch, YARN-8373.003.patch
>
> {noformat}
> sudo -u yarn /usr/local/hadoop/latest/bin/yarn version
> Hadoop 2.9.0
> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 756ebc8394e473ac25feac05fa493f6d612e6c50
> Compiled by arsuresh on 2017-11-13T23:15Z
> Compiled with protoc 2.5.0
> From source with checksum 0a76a9a32a5257331741f8d5932f183
> This command was run using /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat}
> This is for version 2.9.0
>
> {noformat}
> 2018-05-25 05:53:12,742 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, FairSchedulerContinuousScheduling, that exited unexpectedly: java.lang.IllegalArgumentException: Comparison method violates its general contract!
>   at java.util.TimSort.mergeHi(TimSort.java:899)
>   at java.util.TimSort.mergeAt(TimSort.java:516)
>   at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
>   at java.util.TimSort.sort(TimSort.java:254)
>   at java.util.Arrays.sort(Arrays.java:1512)
>   at java.util.ArrayList.sort(ArrayList.java:1454)
>   at java.util.Collections.sort(Collections.java:175)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,743 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down the resource manager.
> 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: a critical thread, FairSchedulerContinuousScheduling, that exited unexpectedly: java.lang.IllegalArgumentException: Comparison method violates its general contract!
>   at java.util.TimSort.mergeHi(TimSort.java:899)
>   at java.util.TimSort.mergeAt(TimSort.java:516)
>   at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
>   at java.util.TimSort.sort(TimSort.java:254)
>   at java.util.Arrays.sort(Arrays.java:1512)
>   at java.util.ArrayList.sort(ArrayList.java:1454)
>   at java.util.Collections.sort(Collections.java:175)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
>   at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,772 ERROR org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted{noformat}
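The "Comparison method violates its general contract!" error in the trace above is what TimSort reports when comparison results are inconsistent within one sort, which can happen if the values being compared change while the sort runs. One general way to keep a sort self-consistent, sketched below, is to read each mutable value once and sort on that snapshot. This is only an illustration of the technique and is not claimed to be what the YARN-8373 patches or the PriorityQueue approach mentioned in the comment do; `Node` and `SnapshotSort` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of sorting on a snapshot of mutable keys. */
final class SnapshotSort {
  /** Stand-in for a node whose available memory may change concurrently. */
  static final class Node {
    final String name;
    final AtomicInteger availableMb;

    Node(String name, int availableMb) {
      this.name = name;
      this.availableMb = new AtomicInteger(availableMb);
    }
  }

  /** Immutable pair of a node and the key value read once before sorting. */
  private static final class Keyed {
    final Node node;
    final int key;

    Keyed(Node node, int key) {
      this.node = node;
      this.key = key;
    }
  }

  /** Returns the nodes ordered by available memory, largest first. */
  static List<Node> sortByAvailableDescending(List<Node> nodes) {
    List<Keyed> keyed = new ArrayList<>(nodes.size());
    for (Node n : nodes) {
      // Read the mutable value exactly once; later updates cannot affect the sort.
      keyed.add(new Keyed(n, n.availableMb.get()));
    }
    keyed.sort(Comparator.comparingInt((Keyed k) -> k.key).reversed());
    List<Node> sorted = new ArrayList<>(keyed.size());
    for (Keyed k : keyed) {
      sorted.add(k.node);
    }
    return sorted;
  }

  public static void main(String[] args) {
    List<Node> nodes = new ArrayList<>();
    nodes.add(new Node("a", 100));
    nodes.add(new Node("b", 300));
    nodes.add(new Node("c", 200));
    for (Node n : sortByAvailableDescending(nodes)) {
      System.out.println(n.name);
    }
  }
}
```

The trade-off is that the resulting order reflects the values at snapshot time, so it can be slightly out of date by the time it is used; it cannot, however, make the comparator contradict itself mid-sort.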
[jira] [Updated] (YARN-9561) Add C changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated YARN-9561:
    Attachment: YARN-9561.012.patch

> Add C changes for the new RuncContainerRuntime
>
> Key: YARN-9561
> URL: https://issues.apache.org/jira/browse/YARN-9561
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Badger
> Assignee: Eric Badger
> Priority: Major
> Attachments: YARN-9561.001.patch, YARN-9561.002.patch, YARN-9561.003.patch, YARN-9561.004.patch, YARN-9561.005.patch, YARN-9561.006.patch, YARN-9561.007.patch, YARN-9561.008.patch, YARN-9561.009.patch, YARN-9561.010.patch, YARN-9561.011.patch, YARN-9561.012.patch
>
> This JIRA will be used to add the C changes to the container-executor native
> binary that are necessary for the new RuncContainerRuntime. There should be
> no changes to existing code paths.
[jira] [Commented] (YARN-9561) Add C changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971846#comment-16971846 ]

Eric Badger commented on YARN-9561:

[~eyang], looks like there was a bug there because I wasn't explicitly setting up the container-executor.cfg file in the cetest framework before setting the nm user. That should be fixed in patch 012, which I have attached.

> Add C changes for the new RuncContainerRuntime
>
> Key: YARN-9561
> URL: https://issues.apache.org/jira/browse/YARN-9561
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Badger
> Assignee: Eric Badger
> Priority: Major
> Attachments: YARN-9561.001.patch, YARN-9561.002.patch, YARN-9561.003.patch, YARN-9561.004.patch, YARN-9561.005.patch, YARN-9561.006.patch, YARN-9561.007.patch, YARN-9561.008.patch, YARN-9561.009.patch, YARN-9561.010.patch, YARN-9561.011.patch, YARN-9561.012.patch
>
> This JIRA will be used to add the C changes to the container-executor native
> binary that are necessary for the new RuncContainerRuntime. There should be
> no changes to existing code paths.
[jira] [Commented] (YARN-9562) Add Java changes for the new RuncContainerRuntime
[ https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971833#comment-16971833 ]

Eric Badger commented on YARN-9562:

{noformat}
Error while deleting /tmp/hadoop-yarn/nm-local-dir/usercache/hadoopuser/appcache/application_1573397883403_0002/container_e02_1573397883403_0002_01_02: 39 (Directory not empty)
{noformat}
So the container-executor is failing to delete this directory. But I'm really confused about the output that it is spitting out.
{noformat}
failed to rmdir application_1573397883403_0002: Permission denied
failed to rmdir appcache: Permission denied
failed to rmdir filecache: Permission denied
failed to rmdir hadoopuser: Permission denied
failed to rmdir usercache: Permission denied
failed to rmdir filecache: Permission denied
failed to rmdir nm-local-dir: Permission denied
failed to rmdir hadoop-yarn: Directory not empty
failed to rmdir private_slash_tmp: Directory not empty
{noformat}
Why is it trying to delete all of these directories? Do _all_ of these directories exist underneath {{/tmp/hadoop-yarn/nm-local-dir/usercache/hadoopuser/appcache/application_1573397883403_0002/container_e02_1573397883403_0002_01_02}}? If not, why is it trying to delete them?

[~shaneku...@gmail.com], could you send me a recursive ls of the directories prior to them being deleted? I am unable to reproduce this failure and I'm having trouble trying to explain how this could happen.

One idea that may explain it is that I always have to blow away my hadoop dirs when switching between local user mode and non local user mode. Since this code was previously only using local user mode, it might have created some of these directories previously with the wrong ownership and now they can't be deleted. I don't _think_ that's what is happening here, but it's worth a look.

> Add Java changes for the new RuncContainerRuntime
>
> Key: YARN-9562
> URL: https://issues.apache.org/jira/browse/YARN-9562
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Badger
> Assignee: Eric Badger
> Priority: Major
> Attachments: YARN-9562.001.patch, YARN-9562.002.patch, YARN-9562.003.patch, YARN-9562.004.patch, YARN-9562.005.patch, YARN-9562.006.patch, YARN-9562.007.patch, YARN-9562.008.patch, YARN-9562.009.patch, YARN-9562.010.patch, YARN-9562.011.patch, YARN-9562.012.patch, YARN-9562.013.patch, YARN-9562.014.patch
>
> This JIRA will be used to add the Java changes for the new
> RuncContainerRuntime. This will work off of YARN-9560 to use much of the
> existing DockerLinuxContainerRuntime code once it is moved up into an
> abstract class that can be extended.
[jira] [Comment Edited] (YARN-9894) CapacitySchedulerPerf test for measuring hundreds of apps in a large number of queues.
[ https://issues.apache.org/jira/browse/YARN-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971809#comment-16971809 ] Jonathan Hung edited comment on YARN-9894 at 11/11/19 7:24 PM: --- Thanks [~epayne] for working on this - * for {noformat} int numQueues = 40; if (Integer.getInteger("NumberOfQueues") != null) { numQueues = Integer.getInteger("NumberOfQueues"); } int pctActiveQueues = 100; if (Integer.getInteger("PercentActiveQueues") != null) { pctActiveQueues = Integer.getInteger("PercentActiveQueues"); } int appCount = 100; if (Integer.getInteger("NumberOfApplications") != null) { appCount = Integer.getInteger("NumberOfApplications"); }{noformat} can we change this to: {{Integer.getInteger("NumberOfQueues", 40);}}, etc.? Also, thinking out loud: there seems to be quite a bit of duplicate code. Any chance we can refactor it? was (Author: jhung): Thanks [[~epayne] for working on this - * for {noformat} if (Integer.getInteger("NumberOfQueues") != null) { numQueues = Integer.getInteger("NumberOfQueues"); } int pctActiveQueues = 100; if (Integer.getInteger("PercentActiveQueues") != null) { pctActiveQueues = Integer.getInteger("PercentActiveQueues"); } int appCount = 100; if (Integer.getInteger("NumberOfApplications") != null) { appCount = Integer.getInteger("NumberOfApplications"); }{noformat} > CapacitySchedulerPerf test for measuring hundreds of apps in a large number > of queues. > -- > > Key: YARN-9894 > URL: https://issues.apache.org/jira/browse/YARN-9894 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler, test >Affects Versions: 2.9.2, 2.8.5, 3.2.1, 3.1.3 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Major > Attachments: YARN-9894.001.patch > > > I have developed a unit test based on the existing TestCapacitySchedulerPerf > tests that will measure the performance of a configurable number of apps in a > configurable number of queues. 
It will also test the performance of a cluster > that has many queues but only a portion of them are active. > {code:title=For example:} > $ mvn test > -Dtest=TestCapacitySchedulerPerf#testUserLimitThroughputWithManyQueues \ > -DRunCapacitySchedulerPerfTests=true > -DNumberOfQueues=100 \ > -DNumberOfApplications=200 \ > -DPercentActiveQueues=100 > {code} > - Parameters: > -- RunCapacitySchedulerPerfTests=true: > Needed in order to trigger the test > -- NumberOfQueues > Configurable number of queues > -- NumberOfApplications > Total number of apps to run in the whole cluster, distributed evenly across > all queues > -- PercentActiveQueues > Percentage of the queues that contain active applications
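The review suggestion in this message (replace each null-check block with the two-argument form of Integer.getInteger) can be sketched as follows. This is a minimal sketch and not the patch itself: the class name PerfTestParams and the three accessor methods are hypothetical, while the property names and the defaults (40, 100, 100) are the ones quoted in the comment. Integer.getInteger(String, int) returns the default when the system property is unset or is not a valid integer.

```java
final class PerfTestParams {
    // Hypothetical helper class; only the property names and defaults come from the thread.

    /** Number of queues, from -DNumberOfQueues, defaulting to 40. */
    static int numQueues() {
        return Integer.getInteger("NumberOfQueues", 40);
    }

    /** Percentage of active queues, from -DPercentActiveQueues, defaulting to 100. */
    static int pctActiveQueues() {
        return Integer.getInteger("PercentActiveQueues", 100);
    }

    /** Total number of applications, from -DNumberOfApplications, defaulting to 100. */
    static int appCount() {
        return Integer.getInteger("NumberOfApplications", 100);
    }

    public static void main(String[] args) {
        System.out.println("queues=" + numQueues()
            + " pctActive=" + pctActiveQueues()
            + " apps=" + appCount());
    }
}
```

Each property is looked up once instead of twice, and the default sits next to the property name, which also removes most of the duplicated if-blocks the comment mentions.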
[jira] [Comment Edited] (YARN-9894) CapacitySchedulerPerf test for measuring hundreds of apps in a large number of queues.
[ https://issues.apache.org/jira/browse/YARN-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971809#comment-16971809 ] Jonathan Hung edited comment on YARN-9894 at 11/11/19 7:24 PM: --- Thanks [~epayne] for working on this - * for {noformat} if (Integer.getInteger("NumberOfQueues") != null) { numQueues = Integer.getInteger("NumberOfQueues"); } int pctActiveQueues = 100; if (Integer.getInteger("PercentActiveQueues") != null) { pctActiveQueues = Integer.getInteger("PercentActiveQueues"); } int appCount = 100; if (Integer.getInteger("NumberOfApplications") != null) { appCount = Integer.getInteger("NumberOfApplications"); }{noformat} was (Author: jhung): Thanks [[~epayne] for working on this - * for {noformat} if (Integer.getInteger("NumberOfQueues") != null) { numQueues = Integer.getInteger("NumberOfQueues");}282int pctActiveQueues = 100;283if (Integer.getInteger("PercentActiveQueues") != null) {284 pctActiveQueues = Integer.getInteger("PercentActiveQueues");285 }286int appCount = 100;287if (Integer.getInteger("NumberOfApplications") != null) {288 appCount = Integer.getInteger("NumberOfApplications");289}{noformat} > CapacitySchedulerPerf test for measuring hundreds of apps in a large number > of queues. > -- > > Key: YARN-9894 > URL: https://issues.apache.org/jira/browse/YARN-9894 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler, test >Affects Versions: 2.9.2, 2.8.5, 3.2.1, 3.1.3 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Major > Attachments: YARN-9894.001.patch > > > I have developed a unit test based on the existing TestCapacitySchedulerPerf > tests that will measure the performance of a configurable number of apps in a > configurable number of queues. It will also test the performance of a cluster > that has many queues but only a portion of them are active. 
> {code:title=For example:} > $ mvn test > -Dtest=TestCapacitySchedulerPerf#testUserLimitThroughputWithManyQueues \ > -DRunCapacitySchedulerPerfTests=true > -DNumberOfQueues=100 \ > -DNumberOfApplications=200 \ > -DPercentActiveQueues=100 > {code} > - Parameters: > -- RunCapacitySchedulerPerfTests=true: > Needed in order to trigger the test > -- NumberOfQueues > Configurable number of queues > -- NumberOfApplications > Total number of apps to run in the whole cluster, distributed evenly across > all queues > -- PercentActiveQueues > Percentage of the queues that contain active applications
[jira] [Commented] (YARN-9894) CapacitySchedulerPerf test for measuring hundreds of apps in a large number of queues.
[ https://issues.apache.org/jira/browse/YARN-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971809#comment-16971809 ] Jonathan Hung commented on YARN-9894: - Thanks [~epayne] for working on this - * for {noformat} if (Integer.getInteger("NumberOfQueues") != null) { numQueues = Integer.getInteger("NumberOfQueues"); } int pctActiveQueues = 100; if (Integer.getInteger("PercentActiveQueues") != null) { pctActiveQueues = Integer.getInteger("PercentActiveQueues"); } int appCount = 100; if (Integer.getInteger("NumberOfApplications") != null) { appCount = Integer.getInteger("NumberOfApplications"); }{noformat} > CapacitySchedulerPerf test for measuring hundreds of apps in a large number > of queues. > -- > > Key: YARN-9894 > URL: https://issues.apache.org/jira/browse/YARN-9894 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler, test >Affects Versions: 2.9.2, 2.8.5, 3.2.1, 3.1.3 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Major > Attachments: YARN-9894.001.patch > > > I have developed a unit test based on the existing TestCapacitySchedulerPerf > tests that will measure the performance of a configurable number of apps in a > configurable number of queues. It will also test the performance of a cluster > that has many queues but only a portion of them are active. 
> {code:title=For example:} > $ mvn test > -Dtest=TestCapacitySchedulerPerf#testUserLimitThroughputWithManyQueues \ > -DRunCapacitySchedulerPerfTests=true > -DNumberOfQueues=100 \ > -DNumberOfApplications=200 \ > -DPercentActiveQueues=100 > {code} > - Parameters: > -- RunCapacitySchedulerPerfTests=true: > Needed in order to trigger the test > -- NumberOfQueues > Configurable number of queues > -- NumberOfApplications > Total number of apps to run in the whole cluster, distributed evenly across > all queues > -- PercentActiveQueues > Percentage of the queues that contain active applications
[jira] [Commented] (YARN-7721) TestContinuousScheduling fails sporadically with NPE
[ https://issues.apache.org/jira/browse/YARN-7721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971683#comment-16971683 ] Hadoop QA commented on YARN-7721: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 44s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 2 unchanged - 1 fixed = 2 total (was 3) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 47s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 85m 28s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}143m 40s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | YARN-7721 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12945331/YARN-7721.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 088b986d9526 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 30b93f9 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25139/testReport/ | | Max. process+thread count | 833 (vs. ulimit of 5500) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25139/console | | Powered by | Apache
[jira] [Commented] (YARN-9697) Efficient allocation of Opportunistic containers.
[ https://issues.apache.org/jira/browse/YARN-9697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971665#comment-16971665 ] Hadoop QA commented on YARN-9697: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 40s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 4 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 8s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 1s{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 0 new + 9 unchanged - 4 fixed = 9 total (was 13) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 47s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 25s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 85m 48s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}156m 10s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | YARN-9697 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985512/YARN-9697.009.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 38263877a18a 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 30b93f9 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | Test Results |
[jira] [Comment Edited] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH
[ https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971655#comment-16971655 ] kailiu_dev edited comment on YARN-8373 at 11/11/19 2:57 PM: OK, as you say: "The one inside the scheduler prevents updates of nodes", so I understand that while sortedNodeList runs, other actions that update node resources will not begin. *But you added this:* Nodes can change while being sorted. Using a standard sort will fail + * without locking each node, the PriorityQueue handles this without locks. So my other question is: why use a PriorityQueue to avoid changes? was (Author: kailiu_dev): ok ,as you say: "The one inside the scheduler prevents updates of nodes", so I understand when sortedNodeList other action of uodating node resource will not begin. *but you add this:* Nodes can change while being sorted. Using a standard sort will fail + * without locking each node, the PriorityQueue handles this without locks. so my other question is : why use PriorityQueue to avoid change ? 
> RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH > --- > > Key: YARN-8373 > URL: https://issues.apache.org/jira/browse/YARN-8373 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 2.9.0 >Reporter: Girish Bhat >Assignee: Wilfred Spiegelenburg >Priority: Major > Labels: newbie > Attachments: YARN-8373.001.patch, YARN-8373.002.patch, > YARN-8373.003.patch > > > > > {noformat} > sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 > Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r > 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on > 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum > 0a76a9a32a5257331741f8d5932f183 This command was run using > /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat} > This is for version 2.9.0 > > {noformat} > 2018-05-25 05:53:12,742 ERROR > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received > RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, > FairSchedulerContinuousScheduling, that exited unexpectedly: > java.lang.IllegalArgumentException: Comparison method violates its general > contract! 
> at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) > 2018-05-25 05:53:12,743 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down > the resource manager. > 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with > status 1: a critical thread, FairSchedulerContinuousScheduling, that exited > unexpectedly: java.lang.IllegalArgumentException: Comparison method violates > its general contract! 
> at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) > 2018-05-25 05:53:12,772 ERROR > org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: > ExpiredTokenRemover received java.lang.InterruptedException: sleep > interrupted{noformat}
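The stack trace in this issue shows TimSort rejecting a comparator while the node list is sorted, and the quoted patch comment attributes this to nodes changing during the sort. As a general illustration of that failure mode only, the sketch below sorts on values that are read once up front, so the comparator never observes a change. It is not the approach in the YARN-8373 patches (which, per the quoted comment, use a PriorityQueue), and the names SnapshotSort, Node, Keyed and sortedByAvailable are hypothetical stand-ins rather than YARN classes. It also assumes the list itself is not structurally modified while it is copied.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class SnapshotSort {
    /** Hypothetical stand-in for a scheduler node whose free capacity can change at any time. */
    static final class Node {
        final String name;
        final AtomicInteger available;

        Node(String name, int available) {
            this.name = name;
            this.available = new AtomicInteger(available);
        }
    }

    /** A node paired with the capacity value read once, before sorting starts. */
    static final class Keyed {
        final Node node;
        final int key;

        Keyed(Node node, int key) {
            this.node = node;
            this.key = key;
        }
    }

    /** Returns the nodes ordered by most available capacity first, as of the snapshot. */
    static List<Node> sortedByAvailable(List<Node> nodes) {
        List<Keyed> keyed = new ArrayList<>(nodes.size());
        for (Node n : nodes) {
            // Read each mutable value exactly once.
            keyed.add(new Keyed(n, n.available.get()));
        }
        // The comparator only sees the frozen keys, so it stays consistent for the whole sort.
        keyed.sort(Comparator.comparingInt((Keyed k) -> k.key).reversed());
        List<Node> sorted = new ArrayList<>(keyed.size());
        for (Keyed k : keyed) {
            sorted.add(k.node);
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node("a", 1));
        nodes.add(new Node("b", 5));
        nodes.add(new Node("c", 3));
        for (Node n : sortedByAvailable(nodes)) {
            System.out.println(n.name);
        }
    }
}
```

The trade-off is that the resulting order can be slightly stale by the time it is used, which matches the remark elsewhere in the thread that node changes after sorting are handled while iterating over the returned list.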
[jira] [Commented] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH
[ https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971655#comment-16971655 ] kailiu_dev commented on YARN-8373: -- OK, as you say: "The one inside the scheduler prevents updates of nodes", so I understand that while sortedNodeList runs, other actions that update node resources will not begin. *But you added this:* Nodes can change while being sorted. Using a standard sort will fail + * without locking each node, the PriorityQueue handles this without locks. So my other question is: why use a PriorityQueue to avoid changes? > RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH > --- > > Key: YARN-8373 > URL: https://issues.apache.org/jira/browse/YARN-8373 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 2.9.0 >Reporter: Girish Bhat >Assignee: Wilfred Spiegelenburg >Priority: Major > Labels: newbie > Attachments: YARN-8373.001.patch, YARN-8373.002.patch, > YARN-8373.003.patch > > > > > {noformat} > sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 > Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r > 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on > 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum > 0a76a9a32a5257331741f8d5932f183 This command was run using > /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat} > This is for version 2.9.0 > > {noformat} > 2018-05-25 05:53:12,742 ERROR > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received > RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, > FairSchedulerContinuousScheduling, that exited unexpectedly: > java.lang.IllegalArgumentException: Comparison method violates its general > contract! 
> at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) > 2018-05-25 05:53:12,743 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down > the resource manager. > 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with > status 1: a critical thread, FairSchedulerContinuousScheduling, that exited > unexpectedly: java.lang.IllegalArgumentException: Comparison method violates > its general contract! 
> at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) > 2018-05-25 05:53:12,772 ERROR > org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: > ExpiredTokenRemover received java.lang.InterruptedException: sleep > interrupted{noformat}
[jira] [Commented] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH
[ https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971642#comment-16971642 ] Wilfred Spiegelenburg commented on YARN-8373: - The one inside the scheduler prevents updates of nodes to flow through as much as possible. Node changes are linked to allocating or finishing containers, scheduling actions. These are prevented as much as possible with this lock. The one in the node tracker prevents adding or deleting nodes while it creates the list. However a node can still be removed after sorting and that is handled later while iterating over the list of nodes that is returned. The two locks do different things. > RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH > --- > > Key: YARN-8373 > URL: https://issues.apache.org/jira/browse/YARN-8373 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 2.9.0 >Reporter: Girish Bhat >Assignee: Wilfred Spiegelenburg >Priority: Major > Labels: newbie > Attachments: YARN-8373.001.patch, YARN-8373.002.patch, > YARN-8373.003.patch > > > > > {noformat} > sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 > Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r > 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on > 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum > 0a76a9a32a5257331741f8d5932f183 This command was run using > /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat} > This is for version 2.9.0 > > {noformat} > 2018-05-25 05:53:12,742 ERROR > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received > RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, > FairSchedulerContinuousScheduling, that exited unexpectedly: > java.lang.IllegalArgumentException: Comparison method violates its general > contract! 
> at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) > 2018-05-25 05:53:12,743 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down > the resource manager. > 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with > status 1: a critical thread, FairSchedulerContinuousScheduling, that exited > unexpectedly: java.lang.IllegalArgumentException: Comparison method violates > its general contract! 
> at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) > 2018-05-25 05:53:12,772 ERROR > org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: > ExpiredTokenRemover received java.lang.InterruptedException: sleep > interrupted{noformat}
[jira] [Commented] (YARN-9836) General usability improvements in showSimulationTrace.html
[ https://issues.apache.org/jira/browse/YARN-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971569#comment-16971569 ]

Hadoop QA commented on YARN-9836:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 17m 0s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || branch-3.1 Compile Tests ||
| +1 | mvninstall | 23m 29s | branch-3.1 passed |
| +1 | compile | 0m 26s | branch-3.1 passed |
| +1 | mvnsite | 0m 31s | branch-3.1 passed |
| +1 | shadedclient | 35m 19s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 23s | branch-3.1 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 27s | the patch passed |
| +1 | compile | 0m 21s | the patch passed |
| +1 | javac | 0m 21s | the patch passed |
| +1 | mvnsite | 0m 24s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 9s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 19s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 10m 20s | hadoop-sls in the patch passed. |
| +1 | asflicense | 0m 28s | The patch does not generate ASF License warnings. |
| | | 78m 24s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:080e9d0f9b3 |
| JIRA Issue | YARN-9836 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985516/YARN-9836.branch-3.1.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient |
| uname | Linux a83d12d73022 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.1 / 9179046 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25138/testReport/ |
| Max. process+thread count | 447 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-sls U: hadoop-tools/hadoop-sls |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25138/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> General usability improvements in showSimulationTrace.html
> --
>
> Key: YARN-9836
> URL: https://issues.apache.org/jira/browse/YARN-9836
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: scheduler-load-simulator
> Affects Versions: 3.3.0
> Reporter: Adam Antal
> Assignee: Adam Antal
> Priority: Minor
> Fix For: 3.3.0
>
> Attachments: YARN-9836.001.patch, YARN-9836.002.patch, YARN-9836.003.patch,
[jira] [Commented] (YARN-9836) General usability improvements in showSimulationTrace.html
[ https://issues.apache.org/jira/browse/YARN-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971561#comment-16971561 ]

Hadoop QA commented on YARN-9836:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 18m 0s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || branch-3.1 Compile Tests ||
| +1 | mvninstall | 17m 2s | branch-3.1 passed |
| +1 | compile | 0m 26s | branch-3.1 passed |
| +1 | mvnsite | 0m 29s | branch-3.1 passed |
| +1 | shadedclient | 28m 54s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 22s | branch-3.1 passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 28s | the patch passed |
| +1 | compile | 0m 20s | the patch passed |
| +1 | javac | 0m 20s | the patch passed |
| +1 | mvnsite | 0m 26s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 5s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 19s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 10m 21s | hadoop-sls in the patch passed. |
| +1 | asflicense | 0m 27s | The patch does not generate ASF License warnings. |
| | | 73m 54s | |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:080e9d0f9b3 |
| JIRA Issue | YARN-9836 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985516/YARN-9836.branch-3.1.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient |
| uname | Linux 4721def97e24 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.1 / 9179046 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25137/testReport/ |
| Max. process+thread count | 439 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-sls U: hadoop-tools/hadoop-sls |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/25137/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.

> General usability improvements in showSimulationTrace.html
> --
>
> Key: YARN-9836
> URL: https://issues.apache.org/jira/browse/YARN-9836
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: scheduler-load-simulator
> Affects Versions: 3.3.0
> Reporter: Adam Antal
> Assignee: Adam Antal
> Priority: Minor
> Fix For: 3.3.0
>
> Attachments: YARN-9836.001.patch, YARN-9836.002.patch, YARN-9836.003.patch,
[jira] [Comment Edited] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH
[ https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971523#comment-16971523 ] kailiu_dev edited comment on YARN-8373 at 11/11/19 1:26 PM: Dear [~wilfreds], why do you take readLock.lock() around the code below in your patch? I know that sortedNodeList already takes the readLock internally, so the sort is already protected from node changes: node add, node remove, and node resource change all take the writeLock, so they cannot run at the same time. readLock.lock(); try { nodeIdList = nodeTracker.sortedNodeList(nodeAvailableResourceComparator); } finally { readLock.unlock(); } was (Author: kailiu_dev): Dear [~wilfreds], why do you take readLock.lock() around the code below in your patch? I know that sortedNodeList already takes the readLock internally, so the sort is already protected from node changes: node add, node remove, and node resource change all take the writeLock, so they cannot run at the same time. readLock.lock(); + try{ nodeIdList = nodeTracker.sortedNodeList(nodeAvailableResourceComparator); + } finally { + readLock.unlock(); } > RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH > --- > > Key: YARN-8373 > URL: https://issues.apache.org/jira/browse/YARN-8373 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 2.9.0 >Reporter: Girish Bhat >Assignee: Wilfred Spiegelenburg >Priority: Major > Labels: newbie > Attachments: YARN-8373.001.patch, YARN-8373.002.patch, > YARN-8373.003.patch > > > > > {noformat} > sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 > Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r > 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on > 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum > 0a76a9a32a5257331741f8d5932f183 This command was run using > /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat} > This is for version 2.9.0 > > 
{noformat} > 2018-05-25 05:53:12,742 ERROR > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received > RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai > rSchedulerContinuousScheduling, that exited unexpectedly: > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) > 2018-05-25 05:53:12,743 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down > the resource manager. > 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with > status 1: a critical thread, FairSchedulerContinuousScheduling, that exited > unexpectedly: java.lang.IllegalArgumentException: Comparison method violates > its general contract! 
> at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) > 2018-05-25 05:53:12,772 ERROR > org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: > ExpiredTokenRemover received java.lang.InterruptedException: sleep > interrupted{noformat}
[jira] [Comment Edited] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH
[ https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971523#comment-16971523 ] kailiu_dev edited comment on YARN-8373 at 11/11/19 1:24 PM: Dear [~wilfreds], why do you take readLock.lock() around the code below in your patch? I know that sortedNodeList already takes the readLock internally, so the sort is already protected from node changes: node add, node remove, and node resource change all take the writeLock, so they cannot run at the same time. readLock.lock(); + try{ nodeIdList = nodeTracker.sortedNodeList(nodeAvailableResourceComparator); + } finally { + readLock.unlock(); } was (Author: kailiu_dev): Dear [~wilfreds], why do you take readLock.lock() around the code below in your patch? I know that sortedNodeList already takes the readLock internally, so the sort is already protected from node changes: node add, node remove, and node resource change all take the writeLock, so they cannot run at the same time. readLock.lock(); + try { nodeIdList = nodeTracker.sortedNodeList(nodeAvailableResourceComparator); + } finally { + readLock.unlock(); } > RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH > --- > > Key: YARN-8373 > URL: https://issues.apache.org/jira/browse/YARN-8373 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 2.9.0 >Reporter: Girish Bhat >Assignee: Wilfred Spiegelenburg >Priority: Major > Labels: newbie > Attachments: YARN-8373.001.patch, YARN-8373.002.patch, > YARN-8373.003.patch > > > > > {noformat} > sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 > Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r > 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on > 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum > 0a76a9a32a5257331741f8d5932f183 This command was run using > /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat} > This is for version 2.9.0 > 
> {noformat} > 2018-05-25 05:53:12,742 ERROR > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received > RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai > rSchedulerContinuousScheduling, that exited unexpectedly: > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) > 2018-05-25 05:53:12,743 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down > the resource manager. > 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with > status 1: a critical thread, FairSchedulerContinuousScheduling, that exited > unexpectedly: java.lang.IllegalArgumentException: Comparison method violates > its general contract! 
> at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) > 2018-05-25 05:53:12,772 ERROR > org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: > ExpiredTokenRemover received java.lang.InterruptedException: sleep > interrupted{noformat}
[jira] [Comment Edited] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH
[ https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971523#comment-16971523 ] kailiu_dev edited comment on YARN-8373 at 11/11/19 1:23 PM: Dear [~wilfreds], why do you take readLock.lock() around the code below in your patch? I know that sortedNodeList already takes the readLock internally, so the sort is already protected from node changes: node add, node remove, and node resource change all take the writeLock, so they cannot run at the same time. readLock.lock(); + try { nodeIdList = nodeTracker.sortedNodeList(nodeAvailableResourceComparator); + } finally { + readLock.unlock(); } was (Author: kailiu_dev): Dear [~wilfreds], why do you take readLock.lock() around the code below in your patch? I know that sortedNodeList already takes the readLock internally, so the sort is already protected from node changes: node add, node remove, and node resource change all take the writeLock, so they cannot run at the same time. readLock.lock(); + try { nodeIdList = nodeTracker.sortedNodeList(nodeAvailableResourceComparator); + } finally { + readLock.unlock(); } > RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH > --- > > Key: YARN-8373 > URL: https://issues.apache.org/jira/browse/YARN-8373 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 2.9.0 >Reporter: Girish Bhat >Assignee: Wilfred Spiegelenburg >Priority: Major > Labels: newbie > Attachments: YARN-8373.001.patch, YARN-8373.002.patch, > YARN-8373.003.patch > > > > > {noformat} > sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 > Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r > 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on > 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum > 0a76a9a32a5257331741f8d5932f183 This command was run using > 
/usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat} > This is for version 2.9.0 > > {noformat} > 2018-05-25 05:53:12,742 ERROR > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received > RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai > rSchedulerContinuousScheduling, that exited unexpectedly: > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) > 2018-05-25 05:53:12,743 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down > the resource manager. > 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with > status 1: a critical thread, FairSchedulerContinuousScheduling, that exited > unexpectedly: java.lang.IllegalArgumentException: Comparison method violates > its general contract! 
> at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) > 2018-05-25 05:53:12,772 ERROR > org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: > ExpiredTokenRemover received java.lang.InterruptedException: sleep > interrupted{noformat}
[jira] [Commented] (YARN-8373) RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH
[ https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971523#comment-16971523 ] kailiu_dev commented on YARN-8373: -- Dear [~wilfreds], why do you take readLock.lock() around the code below in your patch? I know that sortedNodeList already takes the readLock internally, so the sort is already protected from node changes: node add, node remove, and node resource change all take the writeLock, so they cannot run at the same time. readLock.lock(); + try { nodeIdList = nodeTracker.sortedNodeList(nodeAvailableResourceComparator); + } finally { + readLock.unlock(); } > RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH > --- > > Key: YARN-8373 > URL: https://issues.apache.org/jira/browse/YARN-8373 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 2.9.0 >Reporter: Girish Bhat >Assignee: Wilfred Spiegelenburg >Priority: Major > Labels: newbie > Attachments: YARN-8373.001.patch, YARN-8373.002.patch, > YARN-8373.003.patch > > > > > {noformat} > sudo -u yarn /usr/local/hadoop/latest/bin/yarn version Hadoop 2.9.0 > Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r > 756ebc8394e473ac25feac05fa493f6d612e6c50 Compiled by arsuresh on > 2017-11-13T23:15Z Compiled with protoc 2.5.0 From source with checksum > 0a76a9a32a5257331741f8d5932f183 This command was run using > /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat} > This is for version 2.9.0 > > {noformat} > 2018-05-25 05:53:12,742 ERROR > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received > RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread, Fai > rSchedulerContinuousScheduling, that exited unexpectedly: > java.lang.IllegalArgumentException: Comparison method violates its general > contract! 
> at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) > 2018-05-25 05:53:12,743 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down > the resource manager. > 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with > status 1: a critical thread, FairSchedulerContinuousScheduling, that exited > unexpectedly: java.lang.IllegalArgumentException: Comparison method violates > its general contract! 
> at java.util.TimSort.mergeHi(TimSort.java:899) > at java.util.TimSort.mergeAt(TimSort.java:516) > at java.util.TimSort.mergeForceCollapse(TimSort.java:457) > at java.util.TimSort.sort(TimSort.java:254) > at java.util.Arrays.sort(Arrays.java:1512) > at java.util.ArrayList.sort(ArrayList.java:1454) > at java.util.Collections.sort(Collections.java:175) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296) > 2018-05-25 05:53:12,772 ERROR > org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: > ExpiredTokenRemover received java.lang.InterruptedException: sleep > interrupted{noformat}
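A note on the YARN-8373 discussion: the stack trace shows TimSort rejecting the comparator from inside ClusterNodeTracker.sortedNodeList. In general, one way this exception can arise is that the values a comparator reads change while the sort is still running, which is consistent with the question about where the lock is held. The sketch below illustrates a different way to get a stable ordering and is not the approach taken in the YARN-8373 patches: copy the sort keys while holding the read lock, then sort the immutable copies. Every class, field, and method name here (SnapshotSortSketch, Node, NodeSnapshot, availableMb, sortedNodeIds) is hypothetical and is not part of the Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a node tracker; not the real ClusterNodeTracker. */
final class SnapshotSortSketch {

  /** A node whose available memory may be updated by another thread. */
  static final class Node {
    final String id;
    volatile long availableMb;

    Node(String id, long availableMb) {
      this.id = id;
      this.availableMb = availableMb;
    }
  }

  /** Immutable copy of the fields the comparator reads. */
  static final class NodeSnapshot {
    final String id;
    final long availableMb;

    NodeSnapshot(Node node) {
      this.id = node.id;
      this.availableMb = node.availableMb;
    }
  }

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final List<Node> nodes = new ArrayList<>();

  void addNode(Node node) {
    lock.writeLock().lock();
    try {
      nodes.add(node);
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Returns node ids ordered by available memory, largest first, ties by id. */
  List<String> sortedNodeIds() {
    List<NodeSnapshot> snapshot = new ArrayList<>();
    // Copy the sort keys while holding the read lock.
    lock.readLock().lock();
    try {
      for (Node node : nodes) {
        snapshot.add(new NodeSnapshot(node));
      }
    } finally {
      lock.readLock().unlock();
    }
    // Sort the copies; their keys can no longer change during the sort.
    snapshot.sort((a, b) -> {
      int byMemory = Long.compare(b.availableMb, a.availableMb);
      return byMemory != 0 ? byMemory : a.id.compareTo(b.id);
    });
    List<String> ids = new ArrayList<>();
    for (NodeSnapshot s : snapshot) {
      ids.add(s.id);
    }
    return ids;
  }
}
```

The trade-off in this sketch is an extra allocation per node on every sort in exchange for a comparator that only ever sees fixed values, and the lock is held only for the copy rather than for the whole sort.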
[jira] [Commented] (YARN-9290) Invalid SchedulingRequest not rejected in Scheduler PlacementConstraintsHandler
[ https://issues.apache.org/jira/browse/YARN-9290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971512#comment-16971512 ] Szilard Nemeth commented on YARN-9290: -- Hi [~prabhujoseph]! Thanks for this patch. Thanks also for the detailed design described in [comment| https://issues.apache.org/jira/browse/YARN-9290?focusedCommentId=16907508=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16907508]: Looks good overall, I only have one comment. In TestSchedulingRequestContainerAllocation#testInvalidSchedulingRequest there is a while loop that can become infinite if the size of response.getRejectedSchedulingRequests() never becomes 1. Can you do something other than waiting in a while loop with sleeps for the number of rejected requests to become 1? > Invalid SchedulingRequest not rejected in Scheduler > PlacementConstraintsHandler > > > Key: YARN-9290 > URL: https://issues.apache.org/jira/browse/YARN-9290 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Attachments: YARN-9290-001.patch, YARN-9290-002.patch, > YARN-9290-003.patch, YARN-9290-004.patch, YARN-9290-005.patch, > YARN-9290-006.patch, YARN-9290-007.patch > > > SchedulingRequest with Invalid namespace is not rejected in Scheduler > PlacementConstraintsHandler. RM keeps on trying to allocateOnNode with > logging the exception. This is rejected in case of placement-processor > handler. 
> {code} > 2019-02-08 16:51:27,548 WARN > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator: > Failed to query node cardinality: > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.InvalidAllocationTagsQueryException: > Invalid namespace prefix: notselfi, valid values are: > all,not-self,app-id,app-tag,self > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.TargetApplicationsNamespace.fromString(TargetApplicationsNamespace.java:277) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.TargetApplicationsNamespace.parse(TargetApplicationsNamespace.java:234) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.AllocationTags.createAllocationTags(AllocationTags.java:93) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfySingleConstraintExpression(PlacementConstraintsUtil.java:78) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfySingleConstraint(PlacementConstraintsUtil.java:240) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyConstraints(PlacementConstraintsUtil.java:321) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyAndConstraint(PlacementConstraintsUtil.java:272) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyConstraints(PlacementConstraintsUtil.java:324) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.PlacementConstraintsUtil.canSatisfyConstraints(PlacementConstraintsUtil.java:365) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.checkCardinalityAndPending(SingleConstraintAppPlacementAllocator.java:355) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.placement.SingleConstraintAppPlacementAllocator.precheckNode(SingleConstraintAppPlacementAllocator.java:395) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.precheckNode(AppSchedulingInfo.java:779) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.preCheckForNodeCandidateSet(RegularContainerAllocator.java:145) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.allocate(RegularContainerAllocator.java:837) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator.assignContainers(RegularContainerAllocator.java:890) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.ContainerAllocator.assignContainers(ContainerAllocator.java:54) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp.assignContainers(FiCaSchedulerApp.java:977) > at >
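On the YARN-9290 review comment about the unbounded while loop in testInvalidSchedulingRequest: one common alternative is to poll with a deadline so the test fails instead of hanging. Hadoop's test code has its own helper for this kind of polling (GenericTestUtils.waitFor); the sketch below does not use it and depends only on the JDK. The class name BoundedWaitSketch, the method waitFor, and its parameters are hypothetical and chosen for illustration only.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical test helper; the class and method names are illustrative. */
final class BoundedWaitSketch {

  /**
   * Polls the condition until it returns true, sleeping between attempts.
   * Throws IllegalStateException if the timeout elapses first.
   */
  static void waitFor(BooleanSupplier condition, long pollMillis, long timeoutMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (!condition.getAsBoolean()) {
      // Compare via subtraction so the check is safe against nanoTime overflow.
      if (System.nanoTime() - deadline >= 0) {
        throw new IllegalStateException(
            "Condition not met within " + timeoutMillis + " ms");
      }
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and stop waiting.
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting", e);
      }
    }
  }
}
```

In a test like the one discussed, the condition would presumably re-issue the allocate call and check whether the number of rejected scheduling requests has reached 1; how that call is written depends on the test's own setup and is not shown here.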
[jira] [Assigned] (YARN-9966) Code duplication in UserGroupMappingPlacementRule
[ https://issues.apache.org/jira/browse/YARN-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kevin su reassigned YARN-9966: -- Assignee: kevin su > Code duplication in UserGroupMappingPlacementRule > - > > Key: YARN-9966 > URL: https://issues.apache.org/jira/browse/YARN-9966 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Assignee: kevin su >Priority: Major > Labels: newbie, newbie++ > > The methods > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule#validateParentQueue > and > org.apache.hadoop.yarn.server.resourcemanager.placement.QueuePlacementRuleUtils#validateQueueMappingUnderParentQueue > are exactly the same. > In these 2 classes, we also have a duplicate method named "extractQueuePath". > We need to extract these to a common method and delete one of these dupes.
[jira] [Commented] (YARN-9927) RM multi-thread event processing mechanism
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971506#comment-16971506 ] Hadoop QA commented on YARN-9927: -
| (x) *{color:red}-1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 50s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 47s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 2s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 33s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 23s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 63 new + 274 unchanged - 0 fixed = 337 total (was 274) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 46s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 30s{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 55s{color} | {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 53s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 19s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 41s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}182m 43s{color} | {color:black} {color} |
\\ \\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager |
| | Nullcheck of nodeId at line 1202 of value previously dereferenced in org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$NodeEventMultiDispatcher.handle(RMNodeEvent) At ResourceManager.java:1202 of value previously dereferenced in
[jira] [Commented] (YARN-9836) General usability improvements in showSimulationTrace.html
[ https://issues.apache.org/jira/browse/YARN-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971495#comment-16971495 ] Szilard Nemeth commented on YARN-9836: -- Re-attached branch-3.1 patch to retrigger Jenkins. > General usability improvements in showSimulationTrace.html > -- > > Key: YARN-9836 > URL: https://issues.apache.org/jira/browse/YARN-9836 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler-load-simulator >Affects Versions: 3.3.0 >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-9836.001.patch, YARN-9836.002.patch, > YARN-9836.003.patch, YARN-9836.branch-3.1.001.patch, > YARN-9836.branch-3.1.001.patch, YARN-9836.branch-3.2.001.patch > > > There are some small usability improvements that can be made for the offline > analysis page (showSimulationTrace.html): > - empty divs can be hidden while no data is displayed > - the site can be refactored to be responsive, given that bootstrap is already > available as a third-party library > - there's no proper error handling in the site (e.g. when a JSON file is malformed, and > similar cases), which is a big problem > - there's no indentation in the raw html file, which makes supportability even > worse
[jira] [Updated] (YARN-9836) General usability improvements in showSimulationTrace.html
[ https://issues.apache.org/jira/browse/YARN-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-9836: - Attachment: YARN-9836.branch-3.1.001.patch > General usability improvements in showSimulationTrace.html > -- > > Key: YARN-9836 > URL: https://issues.apache.org/jira/browse/YARN-9836 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler-load-simulator >Affects Versions: 3.3.0 >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Minor > Fix For: 3.3.0 > > Attachments: YARN-9836.001.patch, YARN-9836.002.patch, > YARN-9836.003.patch, YARN-9836.branch-3.1.001.patch, > YARN-9836.branch-3.1.001.patch, YARN-9836.branch-3.2.001.patch > > > There are some small usability improvements that can be made for the offline > analysis page (showSimulationTrace.html): > - empty divs can be hidden while no data is displayed > - the site can be refactored to be responsive, given that bootstrap is already > available as a third-party library > - there's no proper error handling in the site (e.g. when a JSON file is malformed, and > similar cases), which is a big problem > - there's no indentation in the raw html file, which makes supportability even > worse
[jira] [Commented] (YARN-9011) Race condition during decommissioning
[ https://issues.apache.org/jira/browse/YARN-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971492#comment-16971492 ] Szilard Nemeth commented on YARN-9011: -- Hi [~pbacsko]! Thanks for this patch, +1. Would commit this soon if no objections. [~bibinchundatt], [~tangzhankun]: Any more comments? Thanks! > Race condition during decommissioning > - > > Key: YARN-9011 > URL: https://issues.apache.org/jira/browse/YARN-9011 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.1.1 >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-9011-001.patch, YARN-9011-002.patch, > YARN-9011-003.patch, YARN-9011-004.patch, YARN-9011-005.patch, > YARN-9011-006.patch, YARN-9011-007.patch, YARN-9011-008.patch, > YARN-9011-009.patch > > > During internal testing, we found a nasty race condition which occurs during > decommissioning. > Node manager, incorrect behaviour: > {noformat} > 2018-06-18 21:00:17,634 WARN > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received > SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting > down. > 2018-06-18 21:00:17,634 WARN > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from > ResourceManager: Disallowed NodeManager nodeId: node-6.hostname.com:8041 > hostname:node-6.hostname.com > {noformat} > Node manager, expected behaviour: > {noformat} > 2018-06-18 21:07:37,377 WARN > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received > SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting > down. > 2018-06-18 21:07:37,377 WARN > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from > ResourceManager: DECOMMISSIONING node-6.hostname.com:8041 is ready to be > decommissioned > {noformat} > Note the two different messages from the RM ("Disallowed NodeManager" vs > "DECOMMISSIONING"). 
The problem is that {{ResourceTrackerService}} can see an > inconsistent state of nodes while they're being updated: > {noformat} > 2018-06-18 21:00:17,575 INFO > org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: hostsReader > include:{172.26.12.198,node-7.hostname.com,node-2.hostname.com,node-5.hostname.com,172.26.8.205,node-8.hostname.com,172.26.23.76,172.26.22.223,node-6.hostname.com,172.26.9.218,node-4.hostname.com,node-3.hostname.com,172.26.13.167,node-9.hostname.com,172.26.21.221,172.26.10.219} > exclude:{node-6.hostname.com} > 2018-06-18 21:00:17,575 INFO > org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: Gracefully > decommission node node-6.hostname.com:8041 with state RUNNING > 2018-06-18 21:00:17,575 INFO > org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: > Disallowed NodeManager nodeId: node-6.hostname.com:8041 node: > node-6.hostname.com > 2018-06-18 21:00:17,576 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Put Node > node-6.hostname.com:8041 in DECOMMISSIONING. > 2018-06-18 21:00:17,575 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn > IP=172.26.22.115OPERATION=refreshNodes TARGET=AdminService > RESULT=SUCCESS > 2018-06-18 21:00:17,577 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Preserve > original total capability: > 2018-06-18 21:00:17,577 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: > node-6.hostname.com:8041 Node Transitioned from RUNNING to DECOMMISSIONING > {noformat} > When the decommissioning succeeds, there is no output logged from > {{ResourceTrackerService}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
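One general way to avoid this kind of inconsistent view is to make the exclude-list refresh and the node state change appear as a single step to the heartbeat path. The sketch below is not the YARN-9011 patch and does not use the real NodesListManager or ResourceTrackerService APIs; `NodeTracker`, `register`, `refreshExcludes`, and `onHeartbeat` are hypothetical names chosen for illustration. It assumes a read-write lock is acceptable on the heartbeat path.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical tracker; not the real NodesListManager or ResourceTrackerService. */
final class NodeTracker {
  enum State { RUNNING, DECOMMISSIONING }

  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Set<String> excluded = new HashSet<>();
  private final Map<String, State> nodes = new HashMap<>();

  void register(String host) {
    lock.writeLock().lock();
    try {
      nodes.put(host, State.RUNNING);
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Updates the exclude list and the affected node states under one write lock. */
  void refreshExcludes(Set<String> newExcludes) {
    lock.writeLock().lock();
    try {
      excluded.clear();
      excluded.addAll(newExcludes);
      for (String host : newExcludes) {
        if (nodes.get(host) == State.RUNNING) {
          nodes.put(host, State.DECOMMISSIONING);
        }
      }
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Heartbeat path: reads both structures under the read lock. */
  String onHeartbeat(String host) {
    lock.readLock().lock();
    try {
      State state = nodes.get(host);
      if (state == State.DECOMMISSIONING) {
        return "DECOMMISSIONING";
      }
      if (state == null || excluded.contains(host)) {
        return "DISALLOWED";
      }
      return "OK";
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

Because a heartbeat can only run before or after the whole refresh, it never observes a host that is already excluded but still marked RUNNING, which is the window visible in the log above.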
[jira] [Commented] (YARN-9865) Capacity scheduler: add support for combined %user + %secondary_group mapping
[ https://issues.apache.org/jira/browse/YARN-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971491#comment-16971491 ] Hudson commented on YARN-9865: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17627 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17627/]) YARN-9865. Capacity scheduler: add support for combined %user + (snemeth: rev 30b93f914b7015d4567e199c51a2ebe727fee320) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerAutoCreatedQueueBase.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/placement/TestUserGroupMappingPlacementRule.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerQueueMappingFactory.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/placement/UserGroupMappingPlacementRule.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/CapacityScheduler.md > Capacity scheduler: add support for combined %user + %secondary_group mapping > - > > Key: YARN-9865 > URL: https://issues.apache.org/jira/browse/YARN-9865 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9865-005.patch, YARN-9865.001.patch, > YARN-9865.002.patch, YARN-9865.003.patch, YARN-9865.004.patch > > > Similar to YARN-9841, but for secondary group.
[jira] [Commented] (YARN-9865) Capacity scheduler: add support for combined %user + %secondary_group mapping
[ https://issues.apache.org/jira/browse/YARN-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971482#comment-16971482 ] Szilard Nemeth commented on YARN-9865: -- Hi [~maniraj...@gmail.com]! Patch looks good, +1, committed to trunk! Some comments: 1. Can you please file a follow-up jira to fix the bloat in the yarn.scheduler.capacity.queue-mappings entry in CapacityScheduler.md? There are too many values in the "value" tag. I propose putting the examples and their descriptions on separate lines in the "description" tag instead. 2. Another follow-up jira candidate: In TestUserGroupMappingPlacementRule, verifyQueueMapping takes many parameters. For clarity, we could refactor it into a builder-style invocation, so that the parameters are named at the call site. Within the scope of this jira, one could also refactor QueueMapping and introduce a builder for this class. > Capacity scheduler: add support for combined %user + %secondary_group mapping > - > > Key: YARN-9865 > URL: https://issues.apache.org/jira/browse/YARN-9865 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-9865-005.patch, YARN-9865.001.patch, > YARN-9865.002.patch, YARN-9865.003.patch, YARN-9865.004.patch > > > Similar to YARN-9841, but for secondary group.
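The builder-style invocation suggested in point 2 could look roughly like the sketch below. All names here (`QueueMappingTestData`, its fields, and the builder methods) are hypothetical and are not taken from the patch or from TestUserGroupMappingPlacementRule; the default value for `inputQueue` is likewise an assumption. The only point is that each argument is named at the call site.

```java
/** Hypothetical holder for the arguments of a test helper such as verifyQueueMapping. */
final class QueueMappingTestData {
  final String inputUser;
  final String inputQueue;
  final String expectedQueue;
  final boolean overwrite;

  private QueueMappingTestData(Builder b) {
    this.inputUser = b.inputUser;
    this.inputQueue = b.inputQueue;
    this.expectedQueue = b.expectedQueue;
    this.overwrite = b.overwrite;
  }

  static Builder builder() {
    return new Builder();
  }

  static final class Builder {
    private String inputUser;
    private String inputQueue = "default"; // assumed default, for illustration only
    private String expectedQueue;
    private boolean overwrite;

    Builder inputUser(String value) {
      this.inputUser = value;
      return this;
    }

    Builder inputQueue(String value) {
      this.inputQueue = value;
      return this;
    }

    Builder expectedQueue(String value) {
      this.expectedQueue = value;
      return this;
    }

    Builder overwrite(boolean value) {
      this.overwrite = value;
      return this;
    }

    /** Fails fast when a required field was never set. */
    QueueMappingTestData build() {
      if (inputUser == null || expectedQueue == null) {
        throw new IllegalStateException("inputUser and expectedQueue are required");
      }
      return new QueueMappingTestData(this);
    }
  }
}
```

A test would then read `QueueMappingTestData.builder().inputUser("a").expectedQueue("q1").overwrite(true).build()` instead of passing a long list of positional arguments.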
[jira] [Commented] (YARN-9912) Support u:user2:%secondary_group queue mapping
[ https://issues.apache.org/jira/browse/YARN-9912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971479#comment-16971479 ] Szilard Nemeth commented on YARN-9912: -- Hi [~maniraj...@gmail.com]! Should this jira be in Patch Available status? If so, please adjust the status of the jira so Jenkins can pick the patch up. Thanks! > Support u:user2:%secondary_group queue mapping > -- > > Key: YARN-9912 > URL: https://issues.apache.org/jira/browse/YARN-9912 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-9912.001.patch, YARN-9912.002.patch > > > Similar to u:user2:%primary_group mapping, add support for > u:user2:%secondary_group queue mapping as well.
[jira] [Updated] (YARN-9697) Efficient allocation of Opportunistic containers.
[ https://issues.apache.org/jira/browse/YARN-9697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Modi updated YARN-9697: Attachment: YARN-9697.009.patch > Efficient allocation of Opportunistic containers. > - > > Key: YARN-9697 > URL: https://issues.apache.org/jira/browse/YARN-9697 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9697.001.patch, YARN-9697.002.patch, > YARN-9697.003.patch, YARN-9697.004.patch, YARN-9697.005.patch, > YARN-9697.006.patch, YARN-9697.007.patch, YARN-9697.008.patch, > YARN-9697.009.patch, YARN-9697.ut.patch, YARN-9697.ut2.patch, > YARN-9697.wip1.patch, YARN-9697.wip2.patch > > > In the current implementation, opportunistic containers are allocated based > on the number of queued opportunistic container information received in node > heartbeat. This information becomes stale as soon as more opportunistic > containers are allocated on that node. > Allocation of opportunistic containers happens on the same heartbeat in which > AM asks for the containers. When multiple applications request for > Opportunistic containers, containers might get allocated on the same set of > nodes as already allocated containers on the node are not considered while > serving requests from different applications. This can lead to uneven > allocation of Opportunistic containers across the cluster leading to > increased queuing time -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9697) Efficient allocation of Opportunistic containers.
[ https://issues.apache.org/jira/browse/YARN-9697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971480#comment-16971480 ] Abhishek Modi commented on YARN-9697: - Thanks [~bibinchundatt] for the review. {code:java} private int numNodesForAnyAllocation = DEFAULT_OPP_CONTAINER_ALLOCATION_NODES_NUMBER_USED; {code} This field is used by another constructor that is used in the test cases. Apart from that, I have addressed all other comments in YARN-9697.009.patch. > Efficient allocation of Opportunistic containers. > - > > Key: YARN-9697 > URL: https://issues.apache.org/jira/browse/YARN-9697 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Abhishek Modi >Assignee: Abhishek Modi >Priority: Major > Attachments: YARN-9697.001.patch, YARN-9697.002.patch, > YARN-9697.003.patch, YARN-9697.004.patch, YARN-9697.005.patch, > YARN-9697.006.patch, YARN-9697.007.patch, YARN-9697.008.patch, > YARN-9697.009.patch, YARN-9697.ut.patch, YARN-9697.ut2.patch, > YARN-9697.wip1.patch, YARN-9697.wip2.patch > > > In the current implementation, opportunistic containers are allocated based > on the number of queued opportunistic container information received in node > heartbeat. This information becomes stale as soon as more opportunistic > containers are allocated on that node. > Allocation of opportunistic containers happens on the same heartbeat in which > AM asks for the containers. When multiple applications request for > Opportunistic containers, containers might get allocated on the same set of > nodes as already allocated containers on the node are not considered while > serving requests from different applications. This can lead to uneven > allocation of Opportunistic containers across the cluster leading to > increased queuing time
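To illustrate why heartbeat-reported queue lengths go stale, the sketch below keeps a local count of containers handed out since each node's last heartbeat and adds it to the reported queue length when picking a node. This only illustrates the problem statement in the description; it is not the approach taken in YARN-9697.009.patch, and `NodeLoadTracker` and its methods are hypothetical names rather than YARN classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical tracker that corrects stale heartbeat data with local allocation counts. */
final class NodeLoadTracker {
  // Queue length as last reported by each node's heartbeat.
  private final Map<String, Integer> reportedQueueLength = new LinkedHashMap<>();
  // Containers allocated on each node since that heartbeat.
  private final Map<String, Integer> allocatedSinceHeartbeat = new LinkedHashMap<>();

  synchronized void onHeartbeat(String node, int queuedContainers) {
    reportedQueueLength.put(node, queuedContainers);
    allocatedSinceHeartbeat.put(node, 0); // the fresh report already includes earlier allocations
  }

  /** Picks the node with the lowest estimated load and records the allocation; null if no nodes. */
  synchronized String allocateOnLeastLoaded() {
    String best = null;
    int bestLoad = Integer.MAX_VALUE;
    for (Map.Entry<String, Integer> entry : reportedQueueLength.entrySet()) {
      int load = entry.getValue() + allocatedSinceHeartbeat.getOrDefault(entry.getKey(), 0);
      if (load < bestLoad) {
        best = entry.getKey();
        bestLoad = load;
      }
    }
    if (best != null) {
      allocatedSinceHeartbeat.merge(best, 1, Integer::sum);
    }
    return best;
  }
}
```

Without the second map, every request arriving between two heartbeats would see the same reported queue length and keep choosing the same node, which is the uneven allocation the description refers to.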
[jira] [Updated] (YARN-9966) Code duplication in UserGroupMappingPlacementRule
[ https://issues.apache.org/jira/browse/YARN-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szilard Nemeth updated YARN-9966: - Description: The methods org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule#validateParentQueue and org.apache.hadoop.yarn.server.resourcemanager.placement.QueuePlacementRuleUtils#validateQueueMappingUnderParentQueue are exactly the same. In these 2 classes, we also have a duplicate method named "extractQueuePath". We need to extract these to a common method and delete one of these dupes. was:The methods org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule#validateParentQueue and org.apache.hadoop.yarn.server.resourcemanager.placement.QueuePlacementRuleUtils#validateQueueMappingUnderParentQueue are exactly the same. We need to extract it to a common method and delete one of these dupes. > Code duplication in UserGroupMappingPlacementRule > - > > Key: YARN-9966 > URL: https://issues.apache.org/jira/browse/YARN-9966 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Szilard Nemeth >Priority: Major > Labels: newbie, newbie++ > > The methods > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule#validateParentQueue > and > org.apache.hadoop.yarn.server.resourcemanager.placement.QueuePlacementRuleUtils#validateQueueMappingUnderParentQueue > are exactly the same. > In these 2 classes, we also have a duplicate method named "extractQueuePath". > We need to extract these to a common method and delete one of these dupes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
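As a rough sketch of the refactoring proposed here, both classes could delegate to one shared static helper instead of each keeping its own copy. The code below uses simplified stand-ins: `QueuePathUtil`, `RuleA`, and `RuleB` are hypothetical, and the body of `extractQueuePath` (splitting at the last dot) is an assumption made for illustration, not the actual Hadoop implementation.

```java
/** Hypothetical shared helper that replaces two identical private copies. */
final class QueuePathUtil {
  private QueuePathUtil() {
  }

  /**
   * Splits "parent.leaf" at the last dot and returns {parent, leaf}.
   * The parent element is null when the name contains no dot.
   */
  static String[] extractQueuePath(String queueName) {
    int lastDot = queueName.lastIndexOf('.');
    if (lastDot < 0) {
      return new String[] {null, queueName.trim()};
    }
    return new String[] {
        queueName.substring(0, lastDot).trim(),
        queueName.substring(lastDot + 1).trim()
    };
  }
}

/** Stand-in for the first class that used to own a private copy. */
final class RuleA {
  String leafOf(String queueName) {
    return QueuePathUtil.extractQueuePath(queueName)[1];
  }
}

/** Stand-in for the second class; it now calls the same helper. */
final class RuleB {
  String parentOf(String queueName) {
    return QueuePathUtil.extractQueuePath(queueName)[0];
  }
}
```

With a single definition, a behavior change or bug fix in the helper reaches both callers at once, and only one set of unit tests is needed for it.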
[jira] [Created] (YARN-9966) Code duplication in UserGroupMappingPlacementRule
Szilard Nemeth created YARN-9966: Summary: Code duplication in UserGroupMappingPlacementRule Key: YARN-9966 URL: https://issues.apache.org/jira/browse/YARN-9966 Project: Hadoop YARN Issue Type: Improvement Reporter: Szilard Nemeth The methods org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule#validateParentQueue and org.apache.hadoop.yarn.server.resourcemanager.placement.QueuePlacementRuleUtils#validateQueueMappingUnderParentQueue are exactly the same. We need to extract it to a common method and delete one of these dupes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9886) Queue mapping based on userid passed through application tag
[ https://issues.apache.org/jira/browse/YARN-9886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971471#comment-16971471 ] Hadoop QA commented on YARN-9886: -
| (/) *{color:green}+1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 43s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 49s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 57s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 9s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 4s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 53s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 50s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 87m 4s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}183m 43s{color} | {color:black} {color} |
\\ \\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.4 Server=19.03.4 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9886 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985500/YARN-9886.004.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux 7240e76ac522 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (YARN-9965) Fix NodeManager failing to start when Hdfs Auxillary Jar is set
[ https://issues.apache.org/jira/browse/YARN-9965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971450#comment-16971450 ] Prabhu Joseph commented on YARN-9965: - Thanks [~abmodi]. > Fix NodeManager failing to start when Hdfs Auxillary Jar is set > --- > > Key: YARN-9965 > URL: https://issues.apache.org/jira/browse/YARN-9965 > Project: Hadoop YARN > Issue Type: Bug > Components: auxservices, nodemanager >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9965-001.patch > > > Loading an auxiliary jar from a Hdfs location on a node manager works as > expected on first time. The subsequent restart fails with > ClassNotFoundException > {code:java} > 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: > classpath: [] > 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: > system classes: [java., javax.accessibility., javax.activation., > javax.activity., javax.annotation., javax.annotation.processing., > javax.crypto., javax.imageio., javax.jws., javax.lang.model., > -javax.management.j2ee., javax.management., javax.naming., javax.net., > javax.print., javax.rmi., javax.script., -javax.security.auth.message., > javax.security.auth., javax.security.cert., javax.security.sasl., > javax.sound., javax.sql., javax.swing., javax.tools., javax.transaction., > -javax.xml.registry., -javax.xml.rpc., javax.xml., org.w3c.dom., > org.xml.sax., org.apache.commons.logging., org.apache.log4j., > -org.apache.hadoop.hbase., org.apache.hadoop., core-default.xml, > hdfs-default.xml, mapred-default.xml, yarn-default.xml] > 2019-11-08 03:59:49,257 INFO org.apache.hadoop.service.AbstractService: > Service > org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed > in state INITED > java.lang.ClassNotFoundException: org.apache.auxtest.AuxServiceFromHDFS > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at 
java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at > org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:189) > at > org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:157) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.getInstance(AuxiliaryServiceWithCustomClassLoader.java:169) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:270) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:321) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:478) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:936) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1016) > {code} > > The issue happens when reusing the previous localized auxillary service jar. > The localized jar file is appended with /* when reusing which has caused the > issue. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9965) Fix NodeManager failing to start when Hdfs Auxillary Jar is set
[ https://issues.apache.org/jira/browse/YARN-9965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971442#comment-16971442 ] Hudson commented on YARN-9965: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17626 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17626/]) YARN-9965. Fix NodeManager failing to start when Hdfs Auxillary Jar is (abmodi: rev 516377bfa6faa21f50b7e7c3889e4196c6d464b8) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/AuxServices.java > Fix NodeManager failing to start when Hdfs Auxillary Jar is set > --- > > Key: YARN-9965 > URL: https://issues.apache.org/jira/browse/YARN-9965 > Project: Hadoop YARN > Issue Type: Bug > Components: auxservices, nodemanager >Affects Versions: 3.2.0 >Reporter: Prabhu Joseph >Assignee: Prabhu Joseph >Priority: Major > Fix For: 3.3.0 > > Attachments: YARN-9965-001.patch > > > Loading an auxiliary jar from a Hdfs location on a node manager works as > expected on first time. 
The subsequent restart fails with > ClassNotFoundException > {code:java} > 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: > classpath: [] > 2019-11-08 03:59:49,256 INFO org.apache.hadoop.util.ApplicationClassLoader: > system classes: [java., javax.accessibility., javax.activation., > javax.activity., javax.annotation., javax.annotation.processing., > javax.crypto., javax.imageio., javax.jws., javax.lang.model., > -javax.management.j2ee., javax.management., javax.naming., javax.net., > javax.print., javax.rmi., javax.script., -javax.security.auth.message., > javax.security.auth., javax.security.cert., javax.security.sasl., > javax.sound., javax.sql., javax.swing., javax.tools., javax.transaction., > -javax.xml.registry., -javax.xml.rpc., javax.xml., org.w3c.dom., > org.xml.sax., org.apache.commons.logging., org.apache.log4j., > -org.apache.hadoop.hbase., org.apache.hadoop., core-default.xml, > hdfs-default.xml, mapred-default.xml, yarn-default.xml] > 2019-11-08 03:59:49,257 INFO org.apache.hadoop.service.AbstractService: > Service > org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices failed > in state INITED > java.lang.ClassNotFoundException: org.apache.auxtest.AuxServiceFromHDFS > at java.net.URLClassLoader.findClass(URLClassLoader.java:381) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > at > org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:189) > at > org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:157) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:348) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.getInstance(AuxiliaryServiceWithCustomClassLoader.java:169) > at > 
org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:270) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:321) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:478) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:936) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1016) > {code} > > The issue happens when reusing the previous localized auxillary service jar. > The localized jar file is appended with /* when reusing which has caused the > issue. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
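A possible reading of the `classpath: []` line in the log above, sketched under assumptions: a classpath entry ending in `/*` is normally treated as "every jar inside this directory", so appending `/*` to the path of a jar file (which is not a directory) would match nothing. The class and method names below are hypothetical, and this is a simplified model for illustration, not the code in `AuxServices` or `ApplicationClassLoader`.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a simplified model of classpath wildcard expansion.
class ClasspathWildcardSketch {

  /** Expands one entry; "dir/*" means "every .jar file directly inside dir". */
  static List<String> expand(String entry) {
    List<String> result = new ArrayList<>();
    if (entry.endsWith("/*")) {
      // Strip the trailing "/*" and treat the rest as a directory.
      File dir = new File(entry.substring(0, entry.length() - 2));
      File[] children = dir.listFiles(); // null when dir is not a directory
      if (children != null) {
        for (File child : children) {
          if (child.getName().endsWith(".jar")) {
            result.add(child.getPath());
          }
        }
      }
    } else if (new File(entry).exists()) {
      // A plain entry is kept as-is when it exists.
      result.add(entry);
    }
    return result;
  }
}
```

In this model, passing the localized jar path itself yields one entry, while the same path with `/*` appended yields an empty list, which would be consistent with the `ClassNotFoundException` on restart.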
[jira] [Commented] (YARN-9927) RM multi-thread event processing mechanism
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971421#comment-16971421 ] hcarrot commented on YARN-9927: --- Sorry about the first upload; our YARN version is not compatible with the original version. The patch has been uploaded again. > RM multi-thread event processing mechanism > -- > > Key: YARN-9927 > URL: https://issues.apache.org/jira/browse/YARN-9927 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.0.0, 2.9.2 >Reporter: hcarrot >Priority: Major > Attachments: RM multi-thread event processing mechanism.pdf, > YARN-9927.001.patch > > > Recently, we have observed serious event blocking in the RM event dispatcher > queue. After analyzing the RM event monitoring data and the RM event processing > logic, we found that: > 1) Environment: a cluster with thousands of nodes. > 2) RMNodeStatusEvent accounts for 90% of the time consumed by the RM event scheduler. > 3) Meanwhile, RM event processing is single-threaded, which leaves the RM event > scheduler with little headroom and thus limits RM performance. > So we propose an RM multi-thread event processing mechanism to improve RM > performance.
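To make the idea of multi-thread event processing concrete, here is a minimal sketch under stated assumptions. It shards events by a key (for example a node ID) across several single-thread executors, so events for the same key stay in order while different keys are handled in parallel. The design in the attached PDF and patch is not shown in this thread and may differ; every name below is hypothetical and none of it is the RM dispatcher API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: one single-thread executor per shard.
class ShardedDispatcherSketch {
  private final List<ExecutorService> workers = new ArrayList<>();

  ShardedDispatcherSketch(int threads) {
    for (int i = 0; i < threads; i++) {
      workers.add(Executors.newSingleThreadExecutor());
    }
  }

  /** Maps a key to a worker index; the same key always gets the same index. */
  int shardFor(String key) {
    return Math.floorMod(key.hashCode(), workers.size());
  }

  /** Queues the handler on the worker that owns the key. */
  void dispatch(String key, Runnable handler) {
    workers.get(shardFor(key)).execute(handler);
  }

  /** Stops accepting events and waits for queued ones to finish. */
  boolean shutdownAndWait(long timeoutMillis) {
    for (ExecutorService worker : workers) {
      worker.shutdown();
    }
    try {
      for (ExecutorService worker : workers) {
        if (!worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
          return false;
        }
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

Sharding by key is only one option; it trades perfect load balance for per-key ordering, which matters if node status updates for one node must be applied in sequence.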
[jira] [Updated] (YARN-9927) RM multi-thread event processing mechanism
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hcarrot updated YARN-9927: -- Attachment: YARN-9927.001.patch > RM multi-thread event processing mechanism > -- > > Key: YARN-9927 > URL: https://issues.apache.org/jira/browse/YARN-9927 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.0.0, 2.9.2 >Reporter: hcarrot >Priority: Major > Attachments: RM multi-thread event processing mechanism.pdf, > YARN-9927.001.patch > > > Recently, we have observed serious event blocking in the RM event dispatcher > queue. After analyzing the RM event monitoring data and the RM event processing > logic, we found that: > 1) Environment: a cluster with thousands of nodes. > 2) RMNodeStatusEvent accounts for 90% of the time consumed by the RM event scheduler. > 3) Meanwhile, RM event processing is single-threaded, which leaves the RM event > scheduler with little headroom and thus limits RM performance. > So we propose an RM multi-thread event processing mechanism to improve RM > performance.
[jira] [Updated] (YARN-9927) RM multi-thread event processing mechanism
[ https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hcarrot updated YARN-9927: -- Attachment: (was: YARN-9927.001.patch) > RM multi-thread event processing mechanism > -- > > Key: YARN-9927 > URL: https://issues.apache.org/jira/browse/YARN-9927 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 3.0.0, 2.9.2 >Reporter: hcarrot >Priority: Major > Attachments: RM multi-thread event processing mechanism.pdf > > > Recently, we have observed serious event blocking in the RM event dispatcher > queue. After analyzing the RM event monitoring data and the RM event processing > logic, we found that: > 1) Environment: a cluster with thousands of nodes. > 2) RMNodeStatusEvent accounts for 90% of the time consumed by the RM event scheduler. > 3) Meanwhile, RM event processing is single-threaded, which leaves the RM event > scheduler with little headroom and thus limits RM performance. > So we propose an RM multi-thread event processing mechanism to improve RM > performance.
[jira] [Updated] (YARN-9886) Queue mapping based on userid passed through application tag
[ https://issues.apache.org/jira/browse/YARN-9886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kinga Marton updated YARN-9886: --- Attachment: YARN-9886.004.patch > Queue mapping based on userid passed through application tag > > > Key: YARN-9886 > URL: https://issues.apache.org/jira/browse/YARN-9886 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Reporter: Kinga Marton >Assignee: Kinga Marton >Priority: Major > Attachments: YARN-9886-WIP.patch, YARN-9886.001.patch, > YARN-9886.002.patch, YARN-9886.003.patch, YARN-9886.004.patch > > > There are situations when the real submitting user differs from the user that > arrives at YARN. For example, for a Hive application with Hive > impersonation turned off, the Hive queries run as the Hive user and the > mapping is done based on this username. Unfortunately, in this case YARN > doesn't have any information about the real user, and there are cases when the > customer may want to map these applications to the real submitting user's > queue instead of the Hive queue. > For these cases, if they pass the username in the application tag, we > can read it and use it during the queue mapping, if that user has rights to > run on the real user's queue. > [~sunilg] please correct me if I missed something.
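The mapping described above could look roughly like the following sketch. The tag prefix `userid=`, the class name, and the `allowedTargets` set are all assumptions made for illustration; the thread does not state the tag format or how the rights check is done in the patch.

```java
import java.util.Set;

// Hypothetical sketch of picking the user for queue mapping from application tags.
class TagUserMappingSketch {
  // Assumed tag prefix; the real format is not given in this thread.
  static final String PREFIX = "userid=";

  /**
   * Returns the user named in a tag when the submitter may run in that
   * user's queue, otherwise falls back to the submitting user.
   */
  static String effectiveUser(String submitUser, Set<String> tags,
                              Set<String> allowedTargets) {
    for (String tag : tags) {
      if (tag.startsWith(PREFIX)) {
        String candidate = tag.substring(PREFIX.length());
        if (!candidate.isEmpty() && allowedTargets.contains(candidate)) {
          return candidate;
        }
      }
    }
    return submitUser;
  }
}
```

Falling back to the submitting user keeps the current behavior whenever the tag is missing or names a user whose queue the submitter is not allowed to use.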
[jira] [Commented] (YARN-9537) Add configuration to disable AM preemption
[ https://issues.apache.org/jira/browse/YARN-9537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971376#comment-16971376 ] Yufei Gu commented on YARN-9537: Agreed with [~snemeth]. The production code shouldn't do the null check. Class FairScheduler should make sure that {{getConf}} won't be null before creating any {{FSAppAttempt}} object. Hi [~cane], can you refactor the test code, since it fails a test case per Hadoop QA? > Add configuration to disable AM preemption > -- > > Key: YARN-9537 > URL: https://issues.apache.org/jira/browse/YARN-9537 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 3.2.0, 3.1.2 >Reporter: zhoukang >Assignee: zhoukang >Priority: Major > Attachments: YARN-9537-002.patch, YARN-9537.001.patch, > YARN-9537.003.patch, YARN-9537.004.patch, YARN-9537.005.patch > > > In this issue, I will add a configuration option to disable AM preemption.