[jira] [Commented] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290712#comment-17290712 ]

Hadoop QA commented on YARN-10532:
----------------------------------

(x) -1 overall

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 54s | | Docker mode activated. |

Prechecks:
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | | The patch appears to include 3 new or modified test files. |

trunk Compile Tests:
| +1 | mvninstall | 25m 53s | | trunk passed |
| +1 | compile | 1m 9s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | compile | 0m 58s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | checkstyle | 0m 53s | | trunk passed |
| +1 | mvnsite | 1m 2s | | trunk passed |
| +1 | shadedclient | 18m 15s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 44s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javadoc | 0m 43s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| 0 | spotbugs | 2m 8s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 2m 5s | | trunk passed |

Patch Compile Tests:
| +1 | mvninstall | 0m 55s | | the patch passed |
| +1 | compile | 0m 59s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javac | 0m 59s | | the patch passed |
| +1 | compile | 0m 49s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | javac | 0m 49s | | the patch passed |
| -0 | checkstyle | 0m 50s | https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/672/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 9 new + 278 unchanged - 1 fixed = 287 total (was 279) |
| +1 | mvnsite | 1m 3s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 15m 50s | | patch has no errors when building and testing our client artifacts. |
[jira] [Commented] (YARN-10651) CapacityScheduler crashed with NPE in AbstractYarnScheduler.updateNodeResource()
[ https://issues.apache.org/jira/browse/YARN-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290705#comment-17290705 ]

Hadoop QA commented on YARN-10651:
----------------------------------

(x) -1 overall

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 0m 41s | | Docker mode activated. |

Prechecks:
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |

trunk Compile Tests:
| +1 | mvninstall | 19m 49s | | trunk passed |
| +1 | compile | 1m 1s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | compile | 0m 52s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | checkstyle | 0m 47s | | trunk passed |
| +1 | mvnsite | 0m 55s | | trunk passed |
| +1 | shadedclient | 15m 24s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 43s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javadoc | 0m 40s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| 0 | spotbugs | 1m 48s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 1m 46s | | trunk passed |

Patch Compile Tests:
| +1 | mvninstall | 0m 51s | | the patch passed |
| +1 | compile | 0m 55s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javac | 0m 55s | | the patch passed |
| +1 | compile | 0m 44s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | javac | 0m 44s | | the patch passed |
| +1 | checkstyle | 0m 38s | | the patch passed |
| +1 | mvnsite | 0m 47s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 47s | | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 37s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
[jira] [Commented] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290682#comment-17290682 ]

Hadoop QA commented on YARN-10258:
----------------------------------

(x) -1 overall

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 0m 39s | | Docker mode activated. |

Prechecks:
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | | The patch appears to include 2 new or modified test files. |

trunk Compile Tests:
| +1 | mvninstall | 20m 41s | | trunk passed |
| +1 | compile | 1m 30s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | compile | 1m 20s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | checkstyle | 0m 31s | | trunk passed |
| +1 | mvnsite | 0m 43s | | trunk passed |
| +1 | shadedclient | 14m 33s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 35s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javadoc | 0m 35s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| 0 | spotbugs | 1m 25s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 1m 23s | | trunk passed |

Patch Compile Tests:
| +1 | mvninstall | 0m 38s | | the patch passed |
| +1 | compile | 1m 27s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javac | 1m 27s | | the patch passed |
| +1 | compile | 1m 15s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | javac | 1m 15s | | the patch passed |
| +1 | checkstyle | 0m 28s | | the patch passed |
| +1 | mvnsite | 0m 37s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 22s | | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 32s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javadoc | 0m 31s | | the patch
[jira] [Updated] (YARN-10258) Add metrics for 'ApplicationsRunning' in NodeManager
[ https://issues.apache.org/jira/browse/YARN-10258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ANANDA G B updated YARN-10258:
------------------------------
    Attachment: YARN-10258-007.patch

> Add metrics for 'ApplicationsRunning' in NodeManager
> ----------------------------------------------------
>
>                 Key: YARN-10258
>                 URL: https://issues.apache.org/jira/browse/YARN-10258
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>    Affects Versions: 3.1.3
>            Reporter: ANANDA G B
>            Assignee: ANANDA G B
>            Priority: Minor
>         Attachments: YARN-10258-001.patch, YARN-10258-002.patch, YARN-10258-003.patch, YARN-10258-005.patch, YARN-10258-006.patch, YARN-10258-007.patch, YARN-10258_004.patch
>
> Add metrics for 'ApplicationsRunning' in NodeManagers.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10651) CapacityScheduler crashed with NPE in AbstractYarnScheduler.updateNodeResource()
[ https://issues.apache.org/jira/browse/YARN-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibo Chen updated YARN-10651:
------------------------------
    Attachment: YARN-10651.00.patch

> CapacityScheduler crashed with NPE in AbstractYarnScheduler.updateNodeResource()
> --------------------------------------------------------------------------------
>
>                 Key: YARN-10651
>                 URL: https://issues.apache.org/jira/browse/YARN-10651
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.10.0, 2.10.1
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>            Priority: Major
>         Attachments: YARN-10651.00.patch, event_seq.jpg
>
> {code:java}
> 2021-02-24 17:07:39,798 FATAL org.apache.hadoop.yarn.event.EventDispatcher: Error in handling event type NODE_RESOURCE_UPDATE to the Event Dispatcher
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNodeResource(AbstractYarnScheduler.java:809)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeAndQueueResource(CapacityScheduler.java:1116)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1505)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
>         at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
>         at java.lang.Thread.run(Thread.java:748)
> 2021-02-24 17:07:39,798 INFO org.apache.hadoop.yarn.event.EventDispatcher: Exiting, bbye..{code}
[jira] [Updated] (YARN-10651) CapacityScheduler crashed with NPE in AbstractYarnScheduler.updateNodeResource()
[ https://issues.apache.org/jira/browse/YARN-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibo Chen updated YARN-10651:
------------------------------
    Affects Version/s: 2.10.0
                       2.10.1

> CapacityScheduler crashed with NPE in AbstractYarnScheduler.updateNodeResource()
> --------------------------------------------------------------------------------
>
>                 Key: YARN-10651
>                 URL: https://issues.apache.org/jira/browse/YARN-10651
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.10.0, 2.10.1
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>            Priority: Major
>         Attachments: event_seq.jpg
>
> {code:java}
> 2021-02-24 17:07:39,798 FATAL org.apache.hadoop.yarn.event.EventDispatcher: Error in handling event type NODE_RESOURCE_UPDATE to the Event Dispatcher
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNodeResource(AbstractYarnScheduler.java:809)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeAndQueueResource(CapacityScheduler.java:1116)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1505)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
>         at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
>         at java.lang.Thread.run(Thread.java:748)
> 2021-02-24 17:07:39,798 INFO org.apache.hadoop.yarn.event.EventDispatcher: Exiting, bbye..{code}
[jira] [Commented] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290651#comment-17290651 ]

Qi Zhu commented on YARN-10532:
-------------------------------

Thanks [~pbacsko] for the thorough review, I appreciate your valid suggestions and have addressed them in the latest change. I have also changed the test's Thread.sleep to GenericTestUtils.waitFor. Thanks again, and let me know if you have any other advice before merge. :D

> Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used
> --------------------------------------------------------------------------------------------
>
>                 Key: YARN-10532
>                 URL: https://issues.apache.org/jira/browse/YARN-10532
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Qi Zhu
>            Priority: Major
>         Attachments: YARN-10532.001.patch, YARN-10532.002.patch, YARN-10532.003.patch, YARN-10532.004.patch, YARN-10532.005.patch, YARN-10532.006.patch, YARN-10532.007.patch, YARN-10532.008.patch, YARN-10532.009.patch, YARN-10532.010.patch, YARN-10532.011.patch, YARN-10532.012.patch, YARN-10532.013.patch, YARN-10532.014.patch, YARN-10532.015.patch, YARN-10532.016.patch, YARN-10532.017.patch, YARN-10532.018.patch, YARN-10532.019.patch, YARN-10532.020.patch, YARN-10532.021.patch, YARN-10532.022.patch, image-2021-02-12-21-32-02-267.png
>
> It's better if we can delete auto-created queues when they are not in use for a period of time (like 5 mins). It will be helpful when we have a large number of auto-created queues (e.g. from 500 users), but only a small subset of queues are actively used.
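The Thread.sleep-to-GenericTestUtils.waitFor change mentioned above swaps a fixed sleep for condition polling, which makes tests both faster and less flaky. The sketch below is a minimal, self-contained stand-in for that polling pattern; it is not the actual Hadoop utility, and the class and method names are illustrative only:

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

// Illustrative polling-wait helper, modeled on the GenericTestUtils.waitFor
// idiom: re-check a condition every `intervalMs` until it holds, failing
// with a TimeoutException after `timeoutMs`. Unlike a fixed Thread.sleep,
// the test proceeds as soon as the condition becomes true.
public class WaitFor {
  public static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws TimeoutException, InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!check.getAsBoolean()) {
      if (System.currentTimeMillis() > deadline) {
        throw new TimeoutException("condition not met within " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
  }

  public static void main(String[] args) throws Exception {
    long start = System.currentTimeMillis();
    // Condition becomes true after ~100 ms; waitFor returns shortly after,
    // instead of sleeping for a pessimistic fixed duration.
    waitFor(() -> System.currentTimeMillis() - start > 100, 10, 5000);
    System.out.println("condition met");
  }
}
```

The real Hadoop helper (org.apache.hadoop.test.GenericTestUtils.waitFor) follows the same shape: a boolean supplier, a check interval, and an overall timeout.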
[jira] [Updated] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Qi Zhu updated YARN-10532:
--------------------------
    Attachment: YARN-10532.022.patch

> Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used
> --------------------------------------------------------------------------------------------
>
>                 Key: YARN-10532
>                 URL: https://issues.apache.org/jira/browse/YARN-10532
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Qi Zhu
>            Priority: Major
>         Attachments: YARN-10532.001.patch, YARN-10532.002.patch, YARN-10532.003.patch, YARN-10532.004.patch, YARN-10532.005.patch, YARN-10532.006.patch, YARN-10532.007.patch, YARN-10532.008.patch, YARN-10532.009.patch, YARN-10532.010.patch, YARN-10532.011.patch, YARN-10532.012.patch, YARN-10532.013.patch, YARN-10532.014.patch, YARN-10532.015.patch, YARN-10532.016.patch, YARN-10532.017.patch, YARN-10532.018.patch, YARN-10532.019.patch, YARN-10532.020.patch, YARN-10532.021.patch, YARN-10532.022.patch, image-2021-02-12-21-32-02-267.png
>
> It's better if we can delete auto-created queues when they are not in use for a period of time (like 5 mins). It will be helpful when we have a large number of auto-created queues (e.g. from 500 users), but only a small subset of queues are actively used.
[jira] [Commented] (YARN-10651) CapacityScheduler crashed with NPE in AbstractYarnScheduler.updateNodeResource()
[ https://issues.apache.org/jira/browse/YARN-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290641#comment-17290641 ]

Haibo Chen commented on YARN-10651:
-----------------------------------

By the time the scheduler thread processes a node update scheduler event, the node may have turned unhealthy and transitioned to the DECOMMISSIONING state, in which case the scheduler generates a NodeResourceUpdateSchedulerEvent. If a NodeRemovedSchedulerEvent is already queued on the scheduler event loop (because the node was unhealthy), the scheduler thread first processes the NodeRemovedSchedulerEvent, removing the SchedulerNode, and then processes the NodeResourceUpdateSchedulerEvent, which currently assumes the SchedulerNode is still present. That assumption is what produces the NPE. The attached diagram shows the sequence of events triggering this.

> CapacityScheduler crashed with NPE in AbstractYarnScheduler.updateNodeResource()
> --------------------------------------------------------------------------------
>
>                 Key: YARN-10651
>                 URL: https://issues.apache.org/jira/browse/YARN-10651
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>            Priority: Major
>         Attachments: event_seq.jpg
>
> {code:java}
> 2021-02-24 17:07:39,798 FATAL org.apache.hadoop.yarn.event.EventDispatcher: Error in handling event type NODE_RESOURCE_UPDATE to the Event Dispatcher
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNodeResource(AbstractYarnScheduler.java:809)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeAndQueueResource(CapacityScheduler.java:1116)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1505)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
>         at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
>         at java.lang.Thread.run(Thread.java:748)
> 2021-02-24 17:07:39,798 INFO org.apache.hadoop.yarn.event.EventDispatcher: Exiting, bbye..{code}
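The race described above (a NodeRemovedSchedulerEvent processed before a queued NodeResourceUpdateSchedulerEvent) points toward a defensive null check on the SchedulerNode lookup, so a late resource update is skipped rather than crashing the event dispatcher. The following is a minimal, self-contained sketch of that guard; the class, field, and method names are illustrative stand-ins, not the actual Hadoop code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a guarded node-resource update: the update handler tolerates a
// node that was already removed by an earlier NODE_REMOVED event, instead of
// dereferencing the (now null) SchedulerNode and killing the dispatcher.
public class NodeResourceUpdateSketch {
  // Hypothetical stand-in for the scheduler's nodeId -> SchedulerNode map;
  // the int[] holds {memoryMB, vcores}.
  static final Map<String, int[]> nodes = new ConcurrentHashMap<>();

  static boolean updateNodeResource(String nodeId, int memMB, int vcores) {
    int[] node = nodes.get(nodeId);
    if (node == null) {
      // Node already removed by a NodeRemovedSchedulerEvent that raced ahead;
      // log-and-skip here is what prevents the NPE seen in the stack trace.
      return false;
    }
    node[0] = memMB;
    node[1] = vcores;
    return true;
  }

  public static void main(String[] args) {
    nodes.put("host1:8041", new int[] {8192, 8});
    System.out.println(updateNodeResource("host1:8041", 4096, 4)); // true
    nodes.remove("host1:8041"); // NODE_REMOVED wins the race
    System.out.println(updateNodeResource("host1:8041", 4096, 4)); // false, no crash
  }
}
```

Whether the real fix takes this shape or instead reorders event handling is up to the patch; the sketch only illustrates the guard the comment implies.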
[jira] [Updated] (YARN-10651) CapacityScheduler crashed with NPE in AbstractYarnScheduler.updateNodeResource()
[ https://issues.apache.org/jira/browse/YARN-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibo Chen updated YARN-10651:
------------------------------
    Attachment: event_seq.jpg

> CapacityScheduler crashed with NPE in AbstractYarnScheduler.updateNodeResource()
> --------------------------------------------------------------------------------
>
>                 Key: YARN-10651
>                 URL: https://issues.apache.org/jira/browse/YARN-10651
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>            Priority: Major
>         Attachments: event_seq.jpg
>
> {code:java}
> 2021-02-24 17:07:39,798 FATAL org.apache.hadoop.yarn.event.EventDispatcher: Error in handling event type NODE_RESOURCE_UPDATE to the Event Dispatcher
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNodeResource(AbstractYarnScheduler.java:809)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeAndQueueResource(CapacityScheduler.java:1116)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1505)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154)
>         at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66)
>         at java.lang.Thread.run(Thread.java:748)
> 2021-02-24 17:07:39,798 INFO org.apache.hadoop.yarn.event.EventDispatcher: Exiting, bbye..{code}
[jira] [Commented] (YARN-10651) CapacityScheduler crashed with NPE in AbstractYarnScheduler.updateNodeResource()
[ https://issues.apache.org/jira/browse/YARN-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290636#comment-17290636 ]

Haibo Chen commented on YARN-10651:
-----------------------------------

Relevant RM log

{code:java}
6553854:2021-02-24 17:06:33,934 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL PASS,modules.CGROUP PASS
6553856:2021-02-24 17:06:33,935 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: xxx.linkedin.com:8041 Node Transitioned from RUNNING to UNHEALTHY
6667464:2021-02-24 17:06:43,316 INFO org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: Gracefully decommission node xxx.linkedin.com:8041 with state UNHEALTHY
6667894:2021-02-24 17:06:43,344 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Put Node xxx.linkedin.com:8041 in DECOMMISSIONING.
6667896:2021-02-24 17:06:43,344 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: xxx.linkedin.com:8041 Node Transitioned from UNHEALTHY to DECOMMISSIONING
6674223:2021-02-24 17:06:44,019 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL PASS,modules.CGROUP PASS
6685460:2021-02-24 17:06:45,021 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL PASS,modules.CGROUP PASS
6694638:2021-02-24 17:06:46,021 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL PASS,modules.CGROUP PASS
6708206:2021-02-24 17:06:46,482 INFO org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: No action for node xxx.linkedin.com:8041 with state DECOMMISSIONING
6713019:2021-02-24 17:06:47,064 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL PASS,modules.CGROUP PASS
6722017:2021-02-24 17:06:48,022 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL PASS,modules.CGROUP PASS
6731628:2021-02-24 17:06:49,024 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL PASS,modules.CGROUP PASS
6743847:2021-02-24 17:06:50,063 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL PASS,modules.CGROUP PASS
6753586:2021-02-24 17:06:51,026 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL PASS,modules.CGROUP PASS
6762950:2021-02-24 17:06:52,028 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL PASS,modules.CGROUP PASS
6772642:2021-02-24 17:06:53,081 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node xxx.linkedin.com:8041 reported UNHEALTHY with details: ERROR -> /dev/sdi - modules.DISK FAILED | OK -> User:yarn,modules.CPU PASS,modules.RAM PASS,modules.PROCESSES PASS,modules.NET PASS,modules.TMP_FULL PASS,modules.CGROUP PASS
6781739:2021-02-24 17:06:54,033 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node xxx.linkedin.com:8041
[jira] [Commented] (YARN-10613) Config to allow Intra- and Inter-queue preemption to enable/disable conservativeDRF
[ https://issues.apache.org/jira/browse/YARN-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290634#comment-17290634 ] Hadoop QA commented on YARN-10613: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 8s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 1s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 2 new or modified test files. 
{color} | || || || || {color:brown} branch-2.10 Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 44s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK Oracle Corporation-1.7.0_95-b00 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~16.04-b08 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 35s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK Oracle Corporation-1.7.0_95-b00 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~16.04-b08 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 1m 33s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 31s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 40s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green}{color} | {color:green} the patch passed with JDK Oracle Corporation-1.7.0_95-b00 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~16.04-b08 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green}{color} | {color:green} the patch passed with JDK Oracle Corporation-1.7.0_95-b00 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~16.04-b08 {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 37s{color} | {color:green}{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 68m 29s{color} |
[jira] [Comment Edited] (YARN-10633) setup yarn federation failed
[ https://issues.apache.org/jira/browse/YARN-10633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290615#comment-17290615 ] yuguang edited comment on YARN-10633 at 2/25/21, 2:38 AM: -- [~subru] , Thanks for your reply .Actually I have configured all the necessary item and didn't configure the optional item in that link. Below is my yarn-site.xml configuration . But still get above error message yarn.nodemanager.aux-services mapreduce_shuffle yarn.resourcemanager.address yarna:8032 yarn.resourcemanager.scheduler.address yarna:8030 yarn.resourcemanager.resource-tracker.address yarna:8031 yarn.resourcemanager.admin.address yarna:8033 yarn.resourcemanager.webapp.address yarna:8088 yarn.resourcemanager.hostname yarna yarn.resourcemanager.scheduler.class org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler yarn.scheduler.maximum-allocation-mb 12288 yarn.scheduler.minimum-allocation-mb 32 yarn.nodemanager.resource.memory-mb 12288 yarn.nodemanager.vmem-pmem-ratio 5 yarn.scheduler.maximum-allocation-vcores 3 yarn.nodemanager.local-dirs /hadoop/yarn true Where to store container logs. yarn.nodemanager.log-dirs /hadoop/var/log/hadoop-yarn/containers Where to aggregate logs to. 
yarn.nodemanager.remote-app-log-dir /hadoop/var/log/hadoop-yarn/apps yarn.federation.enabled true yarn.resourcemanager.cluster-id clustera yarn.federation.state-store.class org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore yarn.federation.state-store.sql.url jdbc:mysql://nn:3306/FederationStateStore yarn.federation.state-store.sql.jdbc-class com.mysql.jdbc.jdbc2.optional.MysqlDataSource yarn.federation.state-store.sql.username FederationUser yarn.federation.state-store.sql.password yarn.nodemanager.amrmproxy.enabled true yarn.nodemanager.amrmproxy.interceptor-class.pipeline org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor was (Author: hanfrank): [~subru] , Thanks for your reply .Actually I have configured all the necessary item and didn't configure the optional item in that link. Below is my yarn-site.xml configuration . yarn.nodemanager.aux-services mapreduce_shuffle yarn.resourcemanager.address yarna:8032 yarn.resourcemanager.scheduler.address yarna:8030 yarn.resourcemanager.resource-tracker.address yarna:8031 yarn.resourcemanager.admin.address yarna:8033 yarn.resourcemanager.webapp.address yarna:8088 yarn.resourcemanager.hostname yarna yarn.resourcemanager.scheduler.class org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler yarn.scheduler.maximum-allocation-mb 12288 yarn.scheduler.minimum-allocation-mb 32 yarn.nodemanager.resource.memory-mb 12288 yarn.nodemanager.vmem-pmem-ratio 5 yarn.scheduler.maximum-allocation-vcores 3 yarn.nodemanager.local-dirs /hadoop/yarn true Where to store container logs. yarn.nodemanager.log-dirs /hadoop/var/log/hadoop-yarn/containers Where to aggregate logs to. 
yarn.nodemanager.remote-app-log-dir /hadoop/var/log/hadoop-yarn/apps yarn.federation.enabled true yarn.resourcemanager.cluster-id clustera yarn.federation.state-store.class org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore yarn.federation.state-store.sql.url jdbc:mysql://nn:3306/FederationStateStore yarn.federation.state-store.sql.jdbc-class com.mysql.jdbc.jdbc2.optional.MysqlDataSource yarn.federation.state-store.sql.username FederationUser yarn.federation.state-store.sql.password yarn.nodemanager.amrmproxy.enabled true yarn.nodemanager.amrmproxy.interceptor-class.pipeline org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor > setup yarn federation failed > > > Key: YARN-10633 > URL: https://issues.apache.org/jira/browse/YARN-10633 > Project: Hadoop YARN > Issue Type: Bug > Components: federation >Affects Versions: 3.2.2 >Reporter: yuguang >Priority: Major > > Hi > I am trying to setup yarn federation mode. But after I add below > configuration in etc/hadoop/yarn-site.xml > > yarn.federation.enabled > true > > then when I run yarn node -list . Get below error . Also the historyserver > service can not be
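The yarn-site.xml pasted in the comments above lost its XML markup when the comment was posted, leaving only bare name/value pairs. The federation-related portion corresponds to entries of the following shape (the `<property>`/`<name>`/`<value>` framing is standard Hadoop configuration syntax; the values are taken verbatim from the comment, and the elided SQL password value is omitted rather than guessed):

```xml
<property>
  <name>yarn.federation.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>clustera</value>
</property>
<property>
  <name>yarn.federation.state-store.class</name>
  <value>org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore</value>
</property>
<property>
  <name>yarn.federation.state-store.sql.url</name>
  <value>jdbc:mysql://nn:3306/FederationStateStore</value>
</property>
<property>
  <name>yarn.federation.state-store.sql.username</name>
  <value>FederationUser</value>
</property>
<property>
  <name>yarn.nodemanager.amrmproxy.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.nodemanager.amrmproxy.interceptor-class.pipeline</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor</value>
</property>
```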
[jira] [Commented] (YARN-10633) setup yarn federation failed
[ https://issues.apache.org/jira/browse/YARN-10633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290615#comment-17290615 ] yuguang commented on YARN-10633: [~subru] , Thanks for your reply .Actually I have configured all the necessary item and didn't configure the optional item in that link. Below is my yarn-site.xml configuration . yarn.nodemanager.aux-services mapreduce_shuffle yarn.resourcemanager.address yarna:8032 yarn.resourcemanager.scheduler.address yarna:8030 yarn.resourcemanager.resource-tracker.address yarna:8031 yarn.resourcemanager.admin.address yarna:8033 yarn.resourcemanager.webapp.address yarna:8088 yarn.resourcemanager.hostname yarna yarn.resourcemanager.scheduler.class org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler yarn.scheduler.maximum-allocation-mb 12288 yarn.scheduler.minimum-allocation-mb 32 yarn.nodemanager.resource.memory-mb 12288 yarn.nodemanager.vmem-pmem-ratio 5 yarn.scheduler.maximum-allocation-vcores 3 yarn.nodemanager.local-dirs /hadoop/yarn true Where to store container logs. yarn.nodemanager.log-dirs /hadoop/var/log/hadoop-yarn/containers Where to aggregate logs to. 
yarn.nodemanager.remote-app-log-dir /hadoop/var/log/hadoop-yarn/apps yarn.federation.enabled true yarn.resourcemanager.cluster-id clustera yarn.federation.state-store.class org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore yarn.federation.state-store.sql.url jdbc:mysql://nn:3306/FederationStateStore yarn.federation.state-store.sql.jdbc-class com.mysql.jdbc.jdbc2.optional.MysqlDataSource yarn.federation.state-store.sql.username FederationUser yarn.federation.state-store.sql.password yarn.nodemanager.amrmproxy.enabled true yarn.nodemanager.amrmproxy.interceptor-class.pipeline org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor > setup yarn federation failed > > > Key: YARN-10633 > URL: https://issues.apache.org/jira/browse/YARN-10633 > Project: Hadoop YARN > Issue Type: Bug > Components: federation >Affects Versions: 3.2.2 >Reporter: yuguang >Priority: Major > > Hi > I am trying to setup yarn federation mode. But after I add below > configuration in etc/hadoop/yarn-site.xml > > yarn.federation.enabled > true > > then when I run yarn node -list . Get below error . Also the historyserver > service can not be started either . > I am using hadoop-3.2.2 version . 
> [root@yarna hadoop-3.2.2]# yarn node -list > 2021-02-18 05:51:39,178 INFO service.AbstractService: Service > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl failed in state > STARTEDjava.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for > length 0 at > org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.init(ConfiguredRMFailoverProxyProvider.java:62) > at > org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:175) > at org.apache.hadoop.yarn.client.RMProxy.newProxyInstance(RMProxy.java:130) > at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:103) at > org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:233) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.yarn.client.cli.YarnCLI.createAndStartYarnClient(YarnCLI.java:55) > at org.apache.hadoop.yarn.client.cli.NodeCLI.run(NodeCLI.java:110) at > org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at > org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) at > org.apache.hadoop.yarn.client.cli.NodeCLI.main(NodeCLI.java:62)Exception in > thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds > for length 0 at > org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.init(ConfiguredRMFailoverProxyProvider.java:62) > at > org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:175) > at org.apache.hadoop.yarn.client.RMProxy.newProxyInstance(RMProxy.java:130) > at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:103) at > org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:233) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at >
[jira] [Issue Comment Deleted] (YARN-10633) setup yarn federation failed
[ https://issues.apache.org/jira/browse/YARN-10633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuguang updated YARN-10633: --- Comment: was deleted (was: Thanks Subramaniam. I follow the configuration guide and configure all the necessary item . Some optional I didn't configure it . Below is my yarn-site.xml yarn.nodemanager.aux-services mapreduce_shuffle yarn.resourcemanager.address yarna:8032 yarn.resourcemanager.scheduler.address yarna:8030 yarn.resourcemanager.resource-tracker.address yarna:8031 yarn.resourcemanager.admin.address yarna:8033 yarn.resourcemanager.webapp.address yarna:8088 yarn.resourcemanager.hostname yarna yarn.resourcemanager.scheduler.class org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler yarn.scheduler.maximum-allocation-mb 12288 yarn.scheduler.minimum-allocation-mb 32 yarn.nodemanager.resource.memory-mb 12288 yarn.nodemanager.vmem-pmem-ratio 5 yarn.scheduler.maximum-allocation-vcores 3 yarn.nodemanager.local-dirs /hadoop/yarn true Where to store container logs. yarn.nodemanager.log-dirs /hadoop/var/log/hadoop-yarn/containers Where to aggregate logs to. 
yarn.nodemanager.remote-app-log-dir /hadoop/var/log/hadoop-yarn/apps yarn.federation.enable true yarn.resourcemanager.cluster-id clustera yarn.federation.state-store.class org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore yarn.federation.state-store.sql.url jdbc:mysql://nn:3306/FederationStateStore yarn.federation.state-store.sql.jdbc-class com.mysql.jdbc.jdbc2.optional.MysqlDataSource yarn.federation.state-store.sql.username FederationUser yarn.federation.state-store.sql.password xx yarn.nodemanager.amrmproxy.enabled true yarn.nodemanager.amrmproxy.interceptor-class.pipeline org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor ) > setup yarn federation failed > > > Key: YARN-10633 > URL: https://issues.apache.org/jira/browse/YARN-10633 > Project: Hadoop YARN > Issue Type: Bug > Components: federation >Affects Versions: 3.2.2 >Reporter: yuguang >Priority: Major > > Hi > I am trying to setup yarn federation mode. But after I add below > configuration in etc/hadoop/yarn-site.xml > > yarn.federation.enabled > true > > then when I run yarn node -list . Get below error . Also the historyserver > service can not be started either . > I am using hadoop-3.2.2 version . 
> [root@yarna hadoop-3.2.2]# yarn node -list > 2021-02-18 05:51:39,178 INFO service.AbstractService: Service > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl failed in state > STARTEDjava.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for > length 0 at > org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.init(ConfiguredRMFailoverProxyProvider.java:62) > at > org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:175) > at org.apache.hadoop.yarn.client.RMProxy.newProxyInstance(RMProxy.java:130) > at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:103) at > org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:233) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.yarn.client.cli.YarnCLI.createAndStartYarnClient(YarnCLI.java:55) > at org.apache.hadoop.yarn.client.cli.NodeCLI.run(NodeCLI.java:110) at > org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at > org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) at > org.apache.hadoop.yarn.client.cli.NodeCLI.main(NodeCLI.java:62)Exception in > thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds > for length 0 at > org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.init(ConfiguredRMFailoverProxyProvider.java:62) > at > org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:175) > at org.apache.hadoop.yarn.client.RMProxy.newProxyInstance(RMProxy.java:130) > at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:103) at > org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:233) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > 
org.apache.hadoop.yarn.client.cli.YarnCLI.createAndStartYarnClient(YarnCLI.java:55) >
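The "Index 0 out of bounds for length 0" in ConfiguredRMFailoverProxyProvider.init suggests the provider indexed the first element of an RM id list that came back empty from the effective configuration. The sketch below is a hypothetical reduction of that failure mode for illustration, not the actual Hadoop code:

```java
public class EmptyRmIdsDemo {
    // Mimics parsing a comma-separated rm-ids setting; an unset or
    // empty value yields a zero-length array.
    static String[] parseRmIds(String configured) {
        if (configured == null || configured.isEmpty()) {
            return new String[0];
        }
        return configured.split(",");
    }

    static String firstRmId(String[] rmIds) {
        // Unchecked access: throws ArrayIndexOutOfBoundsException
        // ("Index 0 out of bounds for length 0") when the list is empty.
        return rmIds[0];
    }

    public static void main(String[] args) {
        System.out.println(firstRmId(parseRmIds("rm1,rm2"))); // rm1
        try {
            firstRmId(parseRmIds(""));
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("empty rm-ids -> " + e.getMessage());
        }
    }
}
```

A guard that checks the parsed list before indexing (or a clearer error message naming the missing property) would turn the raw exception into an actionable diagnostic.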
[jira] [Assigned] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Ahuja reassigned YARN-10652: -- Assignee: Siddharth Ahuja > Capacity Scheduler fails to handle user weights for a user that has a "." > (dot) in it > - > > Key: YARN-10652 > URL: https://issues.apache.org/jira/browse/YARN-10652 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Affects Versions: 3.3.0 >Reporter: Siddharth Ahuja >Assignee: Siddharth Ahuja >Priority: Major > > AD usernames can have a "." (dot) in them i.e. they can be of the format -> > {{firstname.lastname}}. However, if you specify a username with this format > against the Capacity Scheduler setting -> > {{yarn.scheduler.capacity.root.default.user-settings.firstname.lastname.weight}}, > it fails to be applied and is instead assigned the default of 1.0f weight. > This renders the user weight feature (being used as a means of setting user > priorities for a queue) unusable for such users. > This limitation comes from [1]. From [1], only word characters (A word > character: [a-zA-Z_0-9]) (see [2]) are permissible at the moment which is no > good for AD names that contain a "." (dot). > Similar discussion has been had in a few HADOOP jiras e.g. HADOOP-7050 and > HADOOP-15395 and the outcome was to use non-whitespace characters i.e. > instead of {{\w+}}, use {{\S+}}. > We could go down similar path and unblock this feature for the AD usernames > with a "." (dot) in them. 
> [1] > https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java#L1953 > [2] > https://docs.oracle.com/javase/tutorial/essential/regex/pre_char_classes.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
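The {{\w+}} vs {{\S+}} point above can be demonstrated with a small standalone sketch. The pattern strings below are simplified stand-ins for the real CapacitySchedulerConfiguration matching logic linked at [1], not the actual Hadoop code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UserWeightKeyMatch {
    // \w+ only accepts word characters [a-zA-Z_0-9], so a username
    // containing a "." can never satisfy this pattern.
    static final Pattern WORD = Pattern.compile("user-settings\\.(\\w+)\\.weight");
    // \S+ accepts any non-whitespace run; greedy matching plus the
    // trailing literal "\.weight" captures "firstname.lastname" whole.
    static final Pattern NON_WS = Pattern.compile("user-settings\\.(\\S+)\\.weight");

    static String extractUser(Pattern p, String key) {
        Matcher m = p.matcher(key);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String plain = "yarn.scheduler.capacity.root.default.user-settings.alice.weight";
        String dotted = "yarn.scheduler.capacity.root.default.user-settings.firstname.lastname.weight";
        System.out.println(extractUser(WORD, plain));    // alice
        System.out.println(extractUser(WORD, dotted));   // null -> weight silently defaults to 1.0f
        System.out.println(extractUser(NON_WS, dotted)); // firstname.lastname
    }
}
```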
[jira] [Commented] (YARN-10633) setup yarn federation failed
[ https://issues.apache.org/jira/browse/YARN-10633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290612#comment-17290612 ] yuguang commented on YARN-10633: Thanks Subramaniam. I follow the configuration guide and configure all the necessary item . Some optional I didn't configure it . Below is my yarn-site.xml yarn.nodemanager.aux-services mapreduce_shuffle yarn.resourcemanager.address yarna:8032 yarn.resourcemanager.scheduler.address yarna:8030 yarn.resourcemanager.resource-tracker.address yarna:8031 yarn.resourcemanager.admin.address yarna:8033 yarn.resourcemanager.webapp.address yarna:8088 yarn.resourcemanager.hostname yarna yarn.resourcemanager.scheduler.class org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler yarn.scheduler.maximum-allocation-mb 12288 yarn.scheduler.minimum-allocation-mb 32 yarn.nodemanager.resource.memory-mb 12288 yarn.nodemanager.vmem-pmem-ratio 5 yarn.scheduler.maximum-allocation-vcores 3 yarn.nodemanager.local-dirs /hadoop/yarn true Where to store container logs. yarn.nodemanager.log-dirs /hadoop/var/log/hadoop-yarn/containers Where to aggregate logs to. 
yarn.nodemanager.remote-app-log-dir /hadoop/var/log/hadoop-yarn/apps yarn.federation.enable true yarn.resourcemanager.cluster-id clustera yarn.federation.state-store.class org.apache.hadoop.yarn.server.federation.store.impl.SQLFederationStateStore yarn.federation.state-store.sql.url jdbc:mysql://nn:3306/FederationStateStore yarn.federation.state-store.sql.jdbc-class com.mysql.jdbc.jdbc2.optional.MysqlDataSource yarn.federation.state-store.sql.username FederationUser yarn.federation.state-store.sql.password xx yarn.nodemanager.amrmproxy.enabled true yarn.nodemanager.amrmproxy.interceptor-class.pipeline org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor > setup yarn federation failed > > > Key: YARN-10633 > URL: https://issues.apache.org/jira/browse/YARN-10633 > Project: Hadoop YARN > Issue Type: Bug > Components: federation >Affects Versions: 3.2.2 >Reporter: yuguang >Priority: Major > > Hi > I am trying to setup yarn federation mode. But after I add below > configuration in etc/hadoop/yarn-site.xml > > yarn.federation.enabled > true > > then when I run yarn node -list . Get below error . Also the historyserver > service can not be started either . > I am using hadoop-3.2.2 version . 
> [root@yarna hadoop-3.2.2]# yarn node -list > 2021-02-18 05:51:39,178 INFO service.AbstractService: Service > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl failed in state > STARTEDjava.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for > length 0 at > org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.init(ConfiguredRMFailoverProxyProvider.java:62) > at > org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:175) > at org.apache.hadoop.yarn.client.RMProxy.newProxyInstance(RMProxy.java:130) > at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:103) at > org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:233) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at > org.apache.hadoop.yarn.client.cli.YarnCLI.createAndStartYarnClient(YarnCLI.java:55) > at org.apache.hadoop.yarn.client.cli.NodeCLI.run(NodeCLI.java:110) at > org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) at > org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) at > org.apache.hadoop.yarn.client.cli.NodeCLI.main(NodeCLI.java:62)Exception in > thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds > for length 0 at > org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider.init(ConfiguredRMFailoverProxyProvider.java:62) > at > org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:175) > at org.apache.hadoop.yarn.client.RMProxy.newProxyInstance(RMProxy.java:130) > at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:103) at > org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:233) > at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) > at >
[jira] [Created] (YARN-10651) CapacityScheduler crashed with NPE in AbstractYarnScheduler.updateNodeResource()
Haibo Chen created YARN-10651: - Summary: CapacityScheduler crashed with NPE in AbstractYarnScheduler.updateNodeResource() Key: YARN-10651 URL: https://issues.apache.org/jira/browse/YARN-10651 Project: Hadoop YARN Issue Type: Bug Reporter: Haibo Chen Assignee: Haibo Chen {code:java} 2021-02-24 17:07:39,798 FATAL org.apache.hadoop.yarn.event.EventDispatcher: Error in handling event type NODE_RESOURCE_UPDATE to the Event Dispatcher java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNodeResource(AbstractYarnScheduler.java:809) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeAndQueueResource(CapacityScheduler.java:1116) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1505) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154){code} at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) at java.lang.Thread.run(Thread.java:748) 2021-02-24 17:07:39,798 INFO org.apache.hadoop.yarn.event.EventDispatcher: Exiting, bbye.. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10651) CapacityScheduler crashed with NPE in AbstractYarnScheduler.updateNodeResource()
[ https://issues.apache.org/jira/browse/YARN-10651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haibo Chen updated YARN-10651: -- Description: {code:java} 2021-02-24 17:07:39,798 FATAL org.apache.hadoop.yarn.event.EventDispatcher: Error in handling event type NODE_RESOURCE_UPDATE to the Event Dispatcher java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNodeResource(AbstractYarnScheduler.java:809) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeAndQueueResource(CapacityScheduler.java:1116) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1505) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154) at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) at java.lang.Thread.run(Thread.java:748) 2021-02-24 17:07:39,798 INFO org.apache.hadoop.yarn.event.EventDispatcher: Exiting, bbye..{code} was: {code:java} 2021-02-24 17:07:39,798 FATAL org.apache.hadoop.yarn.event.EventDispatcher: Error in handling event type NODE_RESOURCE_UPDATE to the Event Dispatcher java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNodeResource(AbstractYarnScheduler.java:809) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeAndQueueResource(CapacityScheduler.java:1116) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1505) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154){code} at org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) at java.lang.Thread.run(Thread.java:748) 2021-02-24 17:07:39,798 INFO 
org.apache.hadoop.yarn.event.EventDispatcher: Exiting, bbye.. > CapacityScheduler crashed with NPE in > AbstractYarnScheduler.updateNodeResource() > - > > Key: YARN-10651 > URL: https://issues.apache.org/jira/browse/YARN-10651 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Haibo Chen >Assignee: Haibo Chen >Priority: Major > > {code:java} > 2021-02-24 17:07:39,798 FATAL org.apache.hadoop.yarn.event.EventDispatcher: > Error in handling event type NODE_RESOURCE_UPDATE to the Event Dispatcher > java.lang.NullPointerException > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.updateNodeResource(AbstractYarnScheduler.java:809) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updateNodeAndQueueResource(CapacityScheduler.java:1116) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1505) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:154) > at > org.apache.hadoop.yarn.event.EventDispatcher$EventProcessor.run(EventDispatcher.java:66) > at java.lang.Thread.run(Thread.java:748) > 2021-02-24 17:07:39,798 INFO org.apache.hadoop.yarn.event.EventDispatcher: > Exiting, bbye..{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
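A null dereference at that line is consistent with the scheduler handling a NODE_RESOURCE_UPDATE event for a node that has already been removed from its node map (for example, decommissioned between event enqueue and dispatch, as in the DECOMMISSIONING logs elsewhere in this digest). The usual defensive pattern is shown here as a hypothetical standalone sketch, not as the actual Hadoop fix:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NodeResourceUpdateSketch {
    static class SchedulerNode {
        int memoryMb;
        SchedulerNode(int memoryMb) { this.memoryMb = memoryMb; }
    }

    final Map<String, SchedulerNode> nodes = new ConcurrentHashMap<>();

    // Returns false (and logs) instead of throwing NPE when the node has
    // already been removed, e.g. by a concurrent node-removal event.
    boolean updateNodeResource(String nodeId, int newMemoryMb) {
        SchedulerNode node = nodes.get(nodeId);
        if (node == null) {
            System.err.println("Node " + nodeId + " already removed; skipping resource update");
            return false;
        }
        node.memoryMb = newMemoryMb;
        return true;
    }
}
```

Guarding and logging keeps a stale event from killing the event dispatcher thread, which is what turns this NPE into a full scheduler crash ("Exiting, bbye..").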
[jira] [Created] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
Siddharth Ahuja created YARN-10652: -- Summary: Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it Key: YARN-10652 URL: https://issues.apache.org/jira/browse/YARN-10652 Project: Hadoop YARN Issue Type: Bug Components: capacity scheduler Affects Versions: 3.3.0 Reporter: Siddharth Ahuja AD usernames can have a "." (dot) in them i.e. they can be of the format -> {{firstname.lastname}}. However, if you specify a username with this format against the Capacity Scheduler setting -> {{yarn.scheduler.capacity.root.default.user-settings.firstname.lastname.weight}}, it fails to be applied and is instead assigned the default of 1.0f weight. This renders the user weight feature (being used as a means of setting user priorities for a queue) unusable for such users. This limitation comes from [1]. From [1], only word characters (A word character: [a-zA-Z_0-9]) (see [2]) are permissible at the moment which is no good for AD names that contain a "." (dot). Similar discussion has been had in a few HADOOP jiras e.g. HADOOP-7050 and HADOOP-15395 and the outcome was to use non-whitespace characters i.e. instead of {{\w+}}, use {{\S+}}. We could go down similar path and unblock this feature for the AD usernames with a "." (dot) in them. [1] https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java#L1953 [2] https://docs.oracle.com/javase/tutorial/essential/regex/pre_char_classes.html -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-4754) Too many connection opened to TimelineServer while publishing entities
[ https://issues.apache.org/jira/browse/YARN-4754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290609#comment-17290609 ] 王帅 commented on YARN-4754: -- * so,hasn't the problem been solved? > Too many connection opened to TimelineServer while publishing entities > -- > > Key: YARN-4754 > URL: https://issues.apache.org/jira/browse/YARN-4754 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Rohith Sharma K S >Priority: Critical > Attachments: ConnectionLeak.rar > > > It is observed that there are too many connections are kept opened to > TimelineServer while publishing entities via SystemMetricsPublisher. This > cause sometimes resource shortage for other process or RM itself > {noformat} > tcp0 0 10.18.99.110:3999 10.18.214.60:59265 > ESTABLISHED 115302/java > tcp0 0 10.18.99.110:25001 :::*LISTEN > 115302/java > tcp0 0 10.18.99.110:25002 :::*LISTEN > 115302/java > tcp0 0 10.18.99.110:25003 :::*LISTEN > 115302/java > tcp0 0 10.18.99.110:25004 :::*LISTEN > 115302/java > tcp0 0 10.18.99.110:25005 :::*LISTEN > 115302/java > tcp1 0 10.18.99.110:48866 10.18.99.110:8188 > CLOSE_WAIT 115302/java > tcp1 0 10.18.99.110:48137 10.18.99.110:8188 > CLOSE_WAIT 115302/java > tcp1 0 10.18.99.110:47553 10.18.99.110:8188 > CLOSE_WAIT 115302/java > tcp1 0 10.18.99.110:48424 10.18.99.110:8188 > CLOSE_WAIT 115302/java > tcp1 0 10.18.99.110:48139 10.18.99.110:8188 > CLOSE_WAIT 115302/java > tcp1 0 10.18.99.110:48096 10.18.99.110:8188 > CLOSE_WAIT 115302/java > tcp1 0 10.18.99.110:47558 10.18.99.110:8188 > CLOSE_WAIT 115302/java > tcp1 0 10.18.99.110:49270 10.18.99.110:8188 > CLOSE_WAIT 115302/java > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10613) Config to allow Intra- and Inter-queue preemption to enable/disable conservativeDRF
[ https://issues.apache.org/jira/browse/YARN-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290587#comment-17290587 ] Eric Payne commented on YARN-10613: --- The {{TestRMRestart}} failure is unrelated to this patch. I attached the branch-2.10 patch. > Config to allow Intra- and Inter-queue preemption to enable/disable > conservativeDRF > > > Key: YARN-10613 > URL: https://issues.apache.org/jira/browse/YARN-10613 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler, scheduler preemption >Affects Versions: 3.3.0, 3.2.2, 3.1.4, 2.10.1 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Minor > Attachments: YARN-10613.branch-2.10.002.patch, > YARN-10613.trunk.001.patch, YARN-10613.trunk.002.patch > > > YARN-8292 added code that prevents CS intra-queue preemption from preempting > containers from an app unless all of the major resources used by the app are > greater than the user limit for that user. > Ex: > | Used | User Limit | > | <58GB, 58> | <30GB, 300> | > In this example, only used memory is above the user limit, not used vcores. > So, intra-queue preemption will not occur. > YARN-8292 added the {{conservativeDRF}} flag to > {{CapacitySchedulerPreemptionUtils#tryPreemptContainerAndDeductResToObtain}}. > If {{conservativeDRF}} is false, containers will be preempted from apps in > the example state. If true, containers will not be preempted. > This flag is hard-coded to false for Inter-queue (cross-queue) preemption and > true for intra-queue (in-queue) preemption. > I propose that in some cases, we want intra-queue preemption to be more > aggressive and preempt in the example case. To accommodate that, I propose > the addition of a config property. 
> Also, we may want inter-queue (cross-queue) preemption to be more > conservative, so I propose also making that a configuration property: -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
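The two behaviours described in this issue can be modelled as a small predicate. This is a simplified sketch, not the actual CapacitySchedulerPreemptionUtils logic: conservative DRF preempts only when every resource is above the user limit, while the non-conservative variant preempts when any resource is.

```java
public class ConservativeDrfDemo {
    // used[i] and userLimit[i] hold the same resource in the same unit,
    // e.g. index 0 = memory (GB), index 1 = vcores.
    static boolean shouldPreempt(long[] used, long[] userLimit,
                                 boolean conservativeDRF) {
        boolean any = false, all = true;
        for (int i = 0; i < used.length; i++) {
            if (used[i] > userLimit[i]) { any = true; } else { all = false; }
        }
        // Conservative: ALL resources must exceed the limit.
        // Aggressive: ANY resource exceeding the limit is enough.
        return conservativeDRF ? all : any;
    }

    public static void main(String[] args) {
        long[] used = {58, 58};    // <58GB, 58 vcores> from the example
        long[] limit = {30, 300};  // <30GB, 300 vcores> user limit
        // Memory is above the limit but vcores are not:
        System.out.println(shouldPreempt(used, limit, true));  // false
        System.out.println(shouldPreempt(used, limit, false)); // true
    }
}
```

For the <58GB, 58> vs <30GB, 300> example from the description, the conservative check refuses to preempt (vcores are under the limit) while the aggressive one preempts, which is exactly the difference the proposed config property would expose.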
[jira] [Updated] (YARN-10613) Config to allow Intra- and Inter-queue preemption to enable/disable conservativeDRF
[ https://issues.apache.org/jira/browse/YARN-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated YARN-10613: -- Attachment: YARN-10613.branch-2.10.002.patch > Config to allow Intra- and Inter-queue preemption to enable/disable > conservativeDRF > > > Key: YARN-10613 > URL: https://issues.apache.org/jira/browse/YARN-10613 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler, scheduler preemption >Affects Versions: 3.3.0, 3.2.2, 3.1.4, 2.10.1 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Minor > Attachments: YARN-10613.branch-2.10.002.patch, > YARN-10613.trunk.001.patch, YARN-10613.trunk.002.patch > > > YARN-8292 added code that prevents CS intra-queue preemption from preempting > containers from an app unless all of the major resources used by the app are > greater than the user limit for that user. > Ex: > | Used | User Limit | > | <58GB, 58> | <30GB, 300> | > In this example, only used memory is above the user limit, not used vcores. > So, intra-queue preemption will not occur. > YARN-8292 added the {{conservativeDRF}} flag to > {{CapacitySchedulerPreemptionUtils#tryPreemptContainerAndDeductResToObtain}}. > If {{conservativeDRF}} is false, containers will be preempted from apps in > the example state. If true, containers will not be preempted. > This flag is hard-coded to false for Inter-queue (cross-queue) preemption and > true for intra-queue (in-queue) preemption. > I propose that in some cases, we want intra-queue preemption to be more > aggressive and preempt in the example case. To accommodate that, I propose > the addition of a config property. > Also, we may want inter-queue (cross-queue) preemption to be more > conservative, so I propose also making that a configuration property: -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10613) Config to allow Intra- and Inter-queue preemption to enable/disable conservativeDRF
[ https://issues.apache.org/jira/browse/YARN-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290579#comment-17290579 ] Hadoop QA commented on YARN-10613: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 16s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 2 new or modified test files. 
{color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 5s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 53s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 53s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 1m 53s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 51s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 49s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 44s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green}{color} | {color:green} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 0 new + 103 unchanged - 1 fixed = 103 total (was 104) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 42s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 38s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
[jira] [Commented] (YARN-10613) Config to allow Intra- and Inter-queue preemption to enable/disable conservativeDRF
[ https://issues.apache.org/jira/browse/YARN-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290520#comment-17290520 ] Jim Brennan commented on YARN-10613: Thanks for the update [~epayne]. This looks good to me. +1 on patch 002. I will wait for the pre-commit build to finish before committing this. > Config to allow Intra- and Inter-queue preemption to enable/disable > conservativeDRF > > > Key: YARN-10613 > URL: https://issues.apache.org/jira/browse/YARN-10613 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler, scheduler preemption >Affects Versions: 3.3.0, 3.2.2, 3.1.4, 2.10.1 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Minor > Attachments: YARN-10613.trunk.001.patch, YARN-10613.trunk.002.patch > > > YARN-8292 added code that prevents CS intra-queue preemption from preempting > containers from an app unless all of the major resources used by the app are > greater than the user limit for that user. > Ex: > | Used | User Limit | > | <58GB, 58> | <30GB, 300> | > In this example, only used memory is above the user limit, not used vcores. > So, intra-queue preemption will not occur. > YARN-8292 added the {{conservativeDRF}} flag to > {{CapacitySchedulerPreemptionUtils#tryPreemptContainerAndDeductResToObtain}}. > If {{conservativeDRF}} is false, containers will be preempted from apps in > the example state. If true, containers will not be preempted. > This flag is hard-coded to false for Inter-queue (cross-queue) preemption and > true for intra-queue (in-queue) preemption. > I propose that in some cases, we want intra-queue preemption to be more > aggressive and preempt in the example case. To accommodate that, I propose > the addition of a config property. 
> Also, we may want inter-queue (cross-queue) preemption to be more > conservative, so I propose also making that a configuration property: -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10613) Config to allow Intra- and Inter-queue preemption to enable/disable conservativeDRF
[ https://issues.apache.org/jira/browse/YARN-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290483#comment-17290483 ] Eric Payne commented on YARN-10613: --- Thanks a lot, [~Jim_Brennan], for the review! I have attached version 002 of the patch. This patch backports fairly cleanly (with minor import conflicts) back to branch-3.1. The patch has quite a few conflicts with branch-2.10, so I will need to put up a separate patch for that. > Config to allow Intra- and Inter-queue preemption to enable/disable > conservativeDRF > > > Key: YARN-10613 > URL: https://issues.apache.org/jira/browse/YARN-10613 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler, scheduler preemption >Affects Versions: 3.3.0, 3.2.2, 3.1.4, 2.10.1 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Minor > Attachments: YARN-10613.trunk.001.patch, YARN-10613.trunk.002.patch > > > YARN-8292 added code that prevents CS intra-queue preemption from preempting > containers from an app unless all of the major resources used by the app are > greater than the user limit for that user. > Ex: > | Used | User Limit | > | <58GB, 58> | <30GB, 300> | > In this example, only used memory is above the user limit, not used vcores. > So, intra-queue preemption will not occur. > YARN-8292 added the {{conservativeDRF}} flag to > {{CapacitySchedulerPreemptionUtils#tryPreemptContainerAndDeductResToObtain}}. > If {{conservativeDRF}} is false, containers will be preempted from apps in > the example state. If true, containers will not be preempted. > This flag is hard-coded to false for Inter-queue (cross-queue) preemption and > true for intra-queue (in-queue) preemption. > I propose that in some cases, we want intra-queue preemption to be more > aggressive and preempt in the example case. To accommodate that, I propose > the addition of a config property. 
> Also, we may want inter-queue (cross-queue) preemption to be more > conservative, so I propose also making that a configuration property: -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10613) Config to allow Intra- and Inter-queue preemption to enable/disable conservativeDRF
[ https://issues.apache.org/jira/browse/YARN-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Payne updated YARN-10613: -- Attachment: YARN-10613.trunk.002.patch > Config to allow Intra- and Inter-queue preemption to enable/disable > conservativeDRF > > > Key: YARN-10613 > URL: https://issues.apache.org/jira/browse/YARN-10613 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler, scheduler preemption >Affects Versions: 3.3.0, 3.2.2, 3.1.4, 2.10.1 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Minor > Attachments: YARN-10613.trunk.001.patch, YARN-10613.trunk.002.patch > > > YARN-8292 added code that prevents CS intra-queue preemption from preempting > containers from an app unless all of the major resources used by the app are > greater than the user limit for that user. > Ex: > | Used | User Limit | > | <58GB, 58> | <30GB, 300> | > In this example, only used memory is above the user limit, not used vcores. > So, intra-queue preemption will not occur. > YARN-8292 added the {{conservativeDRF}} flag to > {{CapacitySchedulerPreemptionUtils#tryPreemptContainerAndDeductResToObtain}}. > If {{conservativeDRF}} is false, containers will be preempted from apps in > the example state. If true, containers will not be preempted. > This flag is hard-coded to false for Inter-queue (cross-queue) preemption and > true for intra-queue (in-queue) preemption. > I propose that in some cases, we want intra-queue preemption to be more > aggressive and preempt in the example case. To accommodate that, I propose > the addition of a config property. > Also, we may want inter-queue (cross-queue) preemption to be more > conservative, so I propose also making that a configuration property: -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9615) Add dispatcher metrics to RM
[ https://issues.apache.org/jira/browse/YARN-9615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290237#comment-17290237 ] Peter Bacsko commented on YARN-9615: Thanks [~zhuqi] for the patch. I'll try to review this tomorrow. > Add dispatcher metrics to RM > > > Key: YARN-9615 > URL: https://issues.apache.org/jira/browse/YARN-9615 > Project: Hadoop YARN > Issue Type: Task >Reporter: Jonathan Hung >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-9615.001.patch, YARN-9615.002.patch, > YARN-9615.003.patch, YARN-9615.poc.patch, screenshot-1.png > > > It'd be good to have counts/processing times for each event type in RM async > dispatcher and scheduler async dispatcher. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
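A per-event-type count and processing-time accumulator of the kind requested in this issue can be sketched in a few lines (names and structure here are illustrative, not taken from the actual patch):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

public class DispatcherMetricsDemo {
    // One count and one total-latency accumulator per event type name.
    static final Map<String, LongAdder> COUNTS = new ConcurrentHashMap<>();
    static final Map<String, LongAdder> NANOS = new ConcurrentHashMap<>();

    // Wraps an event handler, recording count and elapsed time per type.
    static void handle(String eventType, Runnable handler) {
        long start = System.nanoTime();
        try {
            handler.run();
        } finally {
            COUNTS.computeIfAbsent(eventType, k -> new LongAdder()).increment();
            NANOS.computeIfAbsent(eventType, k -> new LongAdder())
                 .add(System.nanoTime() - start);
        }
    }

    public static void main(String[] args) {
        handle("APP_ADDED", () -> { });
        handle("APP_ADDED", () -> { });
        handle("NODE_UPDATE", () -> { });
        System.out.println(COUNTS.get("APP_ADDED").sum());   // 2
        System.out.println(COUNTS.get("NODE_UPDATE").sum()); // 1
    }
}
```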
[jira] [Comment Edited] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289968#comment-17289968 ] Peter Bacsko edited comment on YARN-10532 at 2/24/21, 8:12 PM: --- FIRST round review. I might post more but these are the ones that stand out to me right now. 1. AbstractYarnScheduler: {noformat} public void removeQueue(CSQueue queueName) throws YarnException { throw new YarnException(getClass().getSimpleName() + " does not support removing queues"); } {noformat} If this is an abstract class, just make this method abstract without implementation: {{public abstract void removeQueue(CSQueue queueName) throws YarnException;}} 2. {noformat} // When this queue has application submit to? // This property only applies to dynamic queue, // and will be used to check when the queue need to be removed. {noformat} Rephrase this comment a little bit: {noformat} // The timestamp of the last submitted application to this queue. // Only applies to dynamic queues. {noformat} 3. {noformat} // "Tab" the queue, so this queue won't be removed because of idle timeout. public void signalToSubmitToQueue() { {noformat} I'd comment that "Update the timestamp of the last submitted application". Also, the method name sounds weird to me. What it does is really simple. Call it {{updateLastSubmittedTimeStamp()}}. If you use the right naming, then the comment is probably unnecessary. We don't need comments if the method is simple and easy to understand its purpose. 4. Instead of this: {noformat} // just for test public void setLastSubmittedTimestamp(long lastSubmittedTimestamp) { {noformat} use this: {noformat} @VisibleForTesting public void setLastSubmittedTimestamp(long lastSubmittedTimestamp) { {noformat} 5. This comment is completely unnecessary I think: {noformat} // Expired queue, when there are no applications in queue, // and the last submit time has been expired. // Delete queue when expired deletion enabled. 
{noformat} It's obvious what the method is doing. Or if you insist on having a comment there, just add "Timeout expired, delete the dynamic queue" 6. I suggest a better exception message: {noformat} throw new SchedulerDynamicEditException( "The queue " + queue.getQueuePath() + " can't removed normally."); {noformat} It should say "The queue ABC cannot be removed because its parent is null". 7. {{LOG.info("Removed queue: " + queue.getQueuePath());}} – not necessary to log a successful removal. If there is no message, it means that the removal was successful. 8. Typo in comment: {{// 300s for expired defualt}} --> "default" 9. These methods are used by the code itself, not just test: {noformat} @VisibleForTesting public void prepareForAutoDeletion() { ... @VisibleForTesting public void triggerAutoDeletionForExpiredQueues() { {noformat} So "VisibleForTesting" should be removed. 10. {noformat} private void queueAutoDeletion(CSQueue checkQueue) { //Scheduler update is asynchronous if (checkQueue != null) { {noformat} Three things: * {{queueAutoDeletion()}} - the method name is a noun. Ideally, methods begin with a verb. For example "deleteDynamicQueue()" or "deleteAutoCreatedQueue()". * Also, why is it called "checkQueue"? Just call it "queue". * The comment is confusing: "Scheduler update is asynchronous". Why is it there? This statement does not tell me anything in this context. Does it refer to the null-check? 11. {noformat} @Before public void setUp() throws Exception { // The expired time for deletion will be 1s super.setUp(); } {noformat} This method is unnecessary, the setUp() method in the super class will be called anyway. 12. Test methods: {{testEditSchedule}}, {{testCapacitySchedulerAutoQueueDeletion}}, {{testCapacitySchedulerAutoQueueDeletionDisabled}} These test methods are long, but that's not my main problem. There are {{Thread.sleep()}} calls inside. This is really problematic, especially short sleeps like {{Thread.sleep(100)}}. 
I have fixed many flaky tests where the test code was full of {{Thread.sleep()}}. This must be avoided wherever possible. We should come up with a better solution, e.g. polling a certain state regularly: {noformat} GenericTestUtils.waitFor(() -> someObject.isConditionTrue(), 500, 10_000); {noformat} This method calls {{someObject.isConditionTrue()}} every 500 ms and it times out after 10 seconds. In case of a timeout, a {{TimeoutException}} will be thrown. was (Author: pbacsko): FIRST round review. I might post more but these are that stand out to me
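For readers unfamiliar with the polling pattern recommended above, here is a self-contained stand-in with the same shape as Hadoop's GenericTestUtils.waitFor (a sketch for illustration, not the Hadoop class itself):

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class WaitForDemo {
    // Polls the condition every checkEveryMillis until it is true,
    // throwing TimeoutException once waitForMillis has elapsed.
    static void waitFor(BooleanSupplier condition, long checkEveryMillis,
                        long waitForMillis)
            throws TimeoutException, InterruptedException {
        long deadline = System.currentTimeMillis() + waitForMillis;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new TimeoutException(
                    "condition not met in " + waitForMillis + " ms");
            }
            Thread.sleep(checkEveryMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        // Condition becomes true ~50 ms in; waitFor returns as soon as it does,
        // instead of sleeping for a fixed, hope-it-is-enough duration.
        waitFor(() -> System.currentTimeMillis() - start >= 50, 10, 1_000);
        System.out.println("condition met");
    }
}
```

The key difference from a bare {{Thread.sleep()}} is that the test finishes as soon as the state is reached, and fails loudly with a timeout instead of silently racing.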
[jira] [Commented] (YARN-10623) Capacity scheduler should support refresh queue automatically by a thread policy.
[ https://issues.apache.org/jira/browse/YARN-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290235#comment-17290235 ] Peter Bacsko commented on YARN-10623: - I have some minor comments. 1. {{LOG.info("Auto refreshed queue successfully!");}} The sentence "Queue auto refresh completed successfully" sounds better. 2. {noformat} LOG.error("Can't refresh queue: " + e.getMessage()); ... LOG.error("Can't get file status for refresh : " + e.getMessage()); {noformat} We don't have the stack trace. Having the stack trace is very important for debugging, so either use {{LOG.error("Can't refresh queue", e);}} or log it separately. 3. {noformat} public FileSystem getFs() { return fs; } public Path getAllocCsFile() { return allocCsFile; } public ResourceCalculator getResourceCalculator() { return rc; } public RMContext getRmContext() { return rmContext; } public CapacityScheduler getScheduler() { return scheduler; } {noformat} Are these methods used? To me it looks like not even the test code calls these methods. So remove the ones that are unused. 4. {noformat} try { Thread.sleep(3000); } catch (Exception e) { // do nothing } {noformat} Just as I mentioned in a different review, we should refrain from {{Thread.sleep()}}. It unnecessarily slows down the test. Use {{GenericTestUtils.waitFor()}}. 5. {noformat} try { rm = new MockRM(configuration); rm.init(configuration); rm.start(); } catch(Exception ex) { fail("Should not get any exceptions"); } {noformat} You don't have to catch the exceptions from MockRM. If it fails, the test fails anyway. In this case, it will be counted as a failed test. But if it cannot start, that's really a test error, which is a separate counter in JUnit. Just remove the try-catch block. > Capacity scheduler should support refresh queue automatically by a thread > policy. 
> - > > Key: YARN-10623 > URL: https://issues.apache.org/jira/browse/YARN-10623 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10623.001.patch, YARN-10623.002.patch, > YARN-10623.003.patch > > > In fair scheduler, it is supported that refresh queue related conf > automatically by a thread to reload, but in capacity scheduler we only > support to refresh queue related changes by refreshQueues, it is needed for > our cluster to realize queue manage. > cc [~wangda] [~ztang] [~pbacsko] [~snemeth] [~gandras] [~bteke] [~shuzirra] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10609) Update the document for YARN-10531(Be able to disable user limit factor for CapacityScheduler Leaf Queue)
[ https://issues.apache.org/jira/browse/YARN-10609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10609: Hadoop Flags: Reviewed > Update the document for YARN-10531(Be able to disable user limit factor for > CapacityScheduler Leaf Queue) > - > > Key: YARN-10609 > URL: https://issues.apache.org/jira/browse/YARN-10609 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Fix For: 3.4.0 > > Attachments: YARN-10609.001.patch, YARN-10609.002.patch, > YARN-10609.003.patch, YARN-10609.004.patch, YARN-10609.005.patch > > > Since we have finished YARN-10531. > We should update the corresponding document. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10609) Update the document for YARN-10531(Be able to disable user limit factor for CapacityScheduler Leaf Queue)
[ https://issues.apache.org/jira/browse/YARN-10609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290217#comment-17290217 ] Peter Bacsko commented on YARN-10609: - +1 Thanks [~zhuqi] for the patch and [~bteke] for the review. Committed to master. > Update the document for YARN-10531(Be able to disable user limit factor for > CapacityScheduler Leaf Queue) > - > > Key: YARN-10609 > URL: https://issues.apache.org/jira/browse/YARN-10609 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10609.001.patch, YARN-10609.002.patch, > YARN-10609.003.patch, YARN-10609.004.patch, YARN-10609.005.patch > > > Since we have finished YARN-10531. > We should update the corresponding document. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10613) Config to allow Intra- and Inter-queue preemption to enable/disable conservativeDRF
[ https://issues.apache.org/jira/browse/YARN-10613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290079#comment-17290079 ] Jim Brennan commented on YARN-10613: Thanks [~epayne]! The patch looks good to me but for two minor issues: # In ProportionalCapacityPreemptionPolicy, I think you need to add a {{"\n"}} between the new lines. # In the new inter-queue test, I think you should explicitly set the property to true instead of relying on {{DEFAULT_CROSS_QUEUE_PREEMPTION_CONSERVATIVE_DRF}} to be true (line 237). Same applies for the first part of the Intra-Queue test, you should explicitly set the property to true - it's currently not set at all. > Config to allow Intra- and Inter-queue preemption to enable/disable > conservativeDRF > > > Key: YARN-10613 > URL: https://issues.apache.org/jira/browse/YARN-10613 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler, scheduler preemption >Affects Versions: 3.3.0, 3.2.2, 3.1.4, 2.10.1 >Reporter: Eric Payne >Assignee: Eric Payne >Priority: Minor > Attachments: YARN-10613.trunk.001.patch > > > YARN-8292 added code that prevents CS intra-queue preemption from preempting > containers from an app unless all of the major resources used by the app are > greater than the user limit for that user. > Ex: > | Used | User Limit | > | <58GB, 58> | <30GB, 300> | > In this example, only used memory is above the user limit, not used vcores. > So, intra-queue preemption will not occur. > YARN-8292 added the {{conservativeDRF}} flag to > {{CapacitySchedulerPreemptionUtils#tryPreemptContainerAndDeductResToObtain}}. > If {{conservativeDRF}} is false, containers will be preempted from apps in > the example state. If true, containers will not be preempted. > This flag is hard-coded to false for Inter-queue (cross-queue) preemption and > true for intra-queue (in-queue) preemption. 
> I propose that in some cases, we want intra-queue preemption to be more > aggressive and preempt in the example case. To accommodate that, I propose > the addition of a config property. > Also, we may want inter-queue (cross-queue) preemption to be more > conservative, so I propose also making that a configuration property: -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10640) Adjust the queue Configured capacity to Configured weight number for weight mode in UI.
[ https://issues.apache.org/jira/browse/YARN-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Teke updated YARN-10640: - Summary: Adjust the queue Configured capacity to Configured weight number for weight mode in UI. (was: Ajust the queue Configured capacity to Configured weight number for weight mode in UI.) > Adjust the queue Configured capacity to Configured weight number for weight > mode in UI. > > > Key: YARN-10640 > URL: https://issues.apache.org/jira/browse/YARN-10640 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10640.001.patch, YARN-10640.002.patch, > image-2021-02-20-11-21-50-306.png, image-2021-02-20-14-18-56-261.png, > image-2021-02-20-14-19-30-767.png > > > In weight mode: > Both the weight mode static queue and the dynamic queue will show the > Configured Capacity to 0. I think this should change to Configured Weight if > we use weight mode, this will be helpful. > Such as in dynamic weight mode queue: > !image-2021-02-20-11-21-50-306.png|width=528,height=374! -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10639) Queueinfo related capacity, should ajusted to weight mode.
[ https://issues.apache.org/jira/browse/YARN-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290022#comment-17290022 ] Gergely Pollak commented on YARN-10639: --- While this fix fills the capacity field in the case of a weight configured queue, I think we should investigate whether we could painlessly add a weight field to the queue info and fill that instead. The problem is, I've checked the usages of the capacity field of the QueueInfo, and while it is mostly used for display purposes, it is considered a percentage value, and parsed as such, so it might cause confusion or, even worse, result in bogus behaviour. > Queueinfo related capacity, should ajusted to weight mode. > -- > > Key: YARN-10639 > URL: https://issues.apache.org/jira/browse/YARN-10639 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10639.001.patch, YARN-10639.002.patch > > > {color:#172b4d}The class QueueInfo capacity field should consider the weight > mode.{color} > {color:#172b4d}Now when a client uses getQueueInfo to get queue capacity in > weight mode, it always returns 0, which is wrong.{color} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
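The confusion described here is easy to demonstrate: consumers treat QueueInfo's capacity as a fraction of the parent, so a weight value stored in the same field renders nonsensically. The formatting helper below is hypothetical and only illustrates the point:

```java
import java.util.Locale;

public class CapacityFieldDemo {
    // Consumers of a capacity field treat the float as a fraction,
    // i.e. 0.25f means 25% of the parent queue.
    static String formatAsPercent(float capacity) {
        return String.format(Locale.ROOT, "%.1f%%", capacity * 100.0f);
    }

    public static void main(String[] args) {
        System.out.println(formatAsPercent(0.25f)); // a real capacity: 25.0%
        // Stuffing a weight (e.g. w=4) into the same field misleads callers:
        System.out.println(formatAsPercent(4.0f));  // nonsensical: 400.0%
    }
}
```

This is the argument for a dedicated weight field rather than overloading capacity.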
[jira] [Commented] (YARN-10623) Capacity scheduler should support refresh queue automatically by a thread policy.
[ https://issues.apache.org/jira/browse/YARN-10623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17290013#comment-17290013 ] Gergely Pollak commented on YARN-10623: --- [~zhuqi] thank you for the patch! I really like the idea that the solution is a configurable extra component (scheduling edit policy); this way it is cleanly decoupled from the main code base and won't cause any issue for those who don't want to use it. I very much agree with [~gandras] that the last modification check was necessary, however I think it introduced a bug: {code:java} 126if (lastModified > lastSuccessfulReload && 127time > lastModified + monitoringInterval) { {code} So in the edge case when you update this file MORE frequently than the monitoringInterval (which is configurable, so it might be in the minute or even 10 minutes range), you won't ever refresh the config, since lastModified + monitoringInterval will ALWAYS be greater than the current time. I think you should go with {code:java} 126 if (lastModified > lastSuccessfulReload && 127 time > lastSuccessfulReload + monitoringInterval) { {code} Or even better, introduce a lastReloadAttempt, since a reload can fail, and in that case the policy would keep trying to reload the invalid configuration. So if you introduce a lastReloadAttempt and set it each time before you try to reload the configuration, then you could use {code:java} 126 if (lastModified > lastReloadAttempt && 127 time > lastReloadAttempt + monitoringInterval) { {code} This would guarantee that you don't reload more frequently than the monitoringInterval, you don't reload if the configuration hasn't been modified, but you still keep reloading if the file is updated frequently. Otherwise the patch looks good to me (non-binding). > Capacity scheduler should support refresh queue automatically by a thread > policy. 
> - > > Key: YARN-10623 > URL: https://issues.apache.org/jira/browse/YARN-10623 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacity scheduler >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10623.001.patch, YARN-10623.002.patch, > YARN-10623.003.patch > > > In the fair scheduler, automatically refreshing queue-related configuration via a > reloading thread is supported, but in the capacity scheduler queue-related changes > can only be refreshed via refreshQueues; our cluster needs this for queue management. > cc [~wangda] [~ztang] [~pbacsko] [~snemeth] [~gandras] [~bteke] [~shuzirra]
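The third variant suggested above can be sketched as a small standalone guard. The class, field, and method names here are illustrative only (in the real patch this logic would live inside the scheduling edit policy), but the sketch shows why tracking the last reload *attempt* both avoids starvation under frequent edits and avoids hammering an invalid config:

```java
// Hypothetical sketch of the suggested reload guard: reload only if the file
// changed since the last attempt AND at least one monitoring interval has
// passed since that attempt.
public class ConfigReloadGuard {
    private long lastModified;      // mtime of the config file
    private long lastReloadAttempt; // when we last tried to reload (success or not)
    private final long monitoringInterval;

    public ConfigReloadGuard(long monitoringIntervalMs) {
        this.monitoringInterval = monitoringIntervalMs;
    }

    /** Called periodically with the current time and the file's mtime. */
    public boolean shouldReload(long now, long fileModified) {
        this.lastModified = fileModified;
        if (lastModified > lastReloadAttempt
            && now > lastReloadAttempt + monitoringInterval) {
            // Record the attempt up front, so a failed reload of an invalid
            // file is not retried before the next interval elapses.
            lastReloadAttempt = now;
            return true;
        }
        return false;
    }
}
```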
[jira] [Commented] (YARN-10532) Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is not being used
[ https://issues.apache.org/jira/browse/YARN-10532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289968#comment-17289968 ] Peter Bacsko commented on YARN-10532: - FIRST round review. I might post more, but these are the ones that stand out to me right now. 1. AbstractYarnScheduler: {noformat} public void removeQueue(CSQueue queueName) throws YarnException { throw new YarnException(getClass().getSimpleName() + " does not support removing queues"); } {noformat} If this is an abstract class, just make this method abstract without an implementation: {{public abstract void removeQueue(CSQueue queueName) throws YarnException;}} 2. {noformat} // When this queue has application submit to? // This property only applies to dynamic queue, // and will be used to check when the queue need to be removed. {noformat} Rephrase this comment a little bit: {noformat} // The timestamp of the last submitted application to this queue. // Only applies to dynamic queues. {noformat} 3. {noformat} // "Tab" the queue, so this queue won't be removed because of idle timeout. public void signalToSubmitToQueue() { {noformat} I'd comment that "Update the timestamp of the last submitted application". Also, the method name sounds weird to me. What it does is really simple. Call it {{updateLastSubmittedTimeStamp()}}. If you use the right naming, then the comment is probably unnecessary. We don't need comments if the method is simple and its purpose is easy to understand. 4. Instead of this: {noformat} // just for test public void setLastSubmittedTimestamp(long lastSubmittedTimestamp) { {noformat} use this: {noformat} @VisibleForTesting public void setLastSubmittedTimestamp(long lastSubmittedTimestamp) { {noformat} 5. This comment is completely unnecessary I think: {noformat} // Expired queue, when there are no applications in queue, // and the last submit time has been expired. // Delete queue when expired deletion enabled. {noformat} It's obvious what the method is doing. 
Or if you insist on having a comment there, just add "Timeout expired, delete the dynamic queue". 6. I suggest a better exception message: {noformat} throw new SchedulerDynamicEditException( "The queue " + queue.getQueuePath() + " can't removed normally."); {noformat} It should say "The queue ABC cannot be removed because its parent is null". 7. {{LOG.info("Removed queue: " + queue.getQueuePath());}} – it's not necessary to log a successful removal. If there is no message, it means that the removal was successful. 8. Typo in comment: {{// 300s for expired defualt}} --> "default" 9. These methods are used by the code itself, not just tests: {noformat} @VisibleForTesting public void prepareForAutoDeletion() { ... @VisibleForTesting public void triggerAutoDeletionForExpiredQueues() { {noformat} So "VisibleForTesting" should be removed. 10. {noformat} private void queueAutoDeletion(CSQueue checkQueue) { //Scheduler update is asynchronous if (checkQueue != null) { {noformat} Three things: * {{queueAutoDeletion()}} - this method name is a noun. Ideally, method names begin with a verb, for example "deleteDynamicQueue()" or "deleteAutoCreatedQueue()". * Also, why is it called "checkQueue"? Just call it "queue". * The comment is confusing: "Scheduler update is asynchronous". Why is it there? This statement does not tell me anything in this context. Does it refer to the null check? 11. {noformat} @Before public void setUp() throws Exception { // The expired time for deletion will be 1s super.setUp(); } {noformat} This method is unnecessary, the setUp() method in the superclass will be called anyway. 12. Test methods: {{testEditSchedule}}, {{testCapacitySchedulerAutoQueueDeletion}}, {{testCapacitySchedulerAutoQueueDeletionDisabled}} These test methods are long, but that's not my main problem. There are {{Thread.sleep()}} calls inside. This is really problematic, especially short sleeps like {{Thread.sleep(100)}}. 
I have fixed many flaky tests where the test code was full of {{Thread.sleep()}}. This must be avoided whenever possible. We should come up with a better solution, e.g. polling a certain state regularly, for example: {noformat} GenericTestUtils.waitFor(() -> someObject.isConditionTrue(), 500, 10_000); {noformat} This calls {{someObject.isConditionTrue()}} every 500 ms and times out after 10 seconds. In case of a timeout, a {{TimeoutException}} is thrown. > Capacity Scheduler Auto Queue Creation: Allow auto delete queue when queue is > not being used >
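For reference, the polling pattern behind {{GenericTestUtils.waitFor}} can be sketched in a few lines. This is a simplified standalone version for illustration, not Hadoop's actual implementation:

```java
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Minimal sketch of the waitFor pattern: re-check a condition every
// checkEveryMs until it holds or waitForMs elapses, instead of a blind
// Thread.sleep() that makes tests flaky on slow machines.
public final class WaitFor {
    private WaitFor() {}

    public static void waitFor(Supplier<Boolean> check, long checkEveryMs,
            long waitForMs) throws TimeoutException, InterruptedException {
        long deadline = System.currentTimeMillis() + waitForMs;
        while (!check.get()) {
            if (System.currentTimeMillis() > deadline) {
                throw new TimeoutException(
                    "condition not met within " + waitForMs + " ms");
            }
            Thread.sleep(checkEveryMs);
        }
    }
}
```

A test then waits exactly as long as the condition needs, instead of a fixed worst-case sleep.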
[jira] [Commented] (YARN-10627) Extend logging to give more information about weight mode
[ https://issues.apache.org/jira/browse/YARN-10627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289945#comment-17289945 ] Andras Gyori commented on YARN-10627: - Thank you [~bteke] for the patch. Apart from the checkstyle issues and comments from Peter, I have some minor nit feedback: * GB is not used in TestCapacitySchedulerWeightMode * If it is not too cumbersome, perhaps creating the non-weight-mode counterparts of the tests would be valuable as well, to see that everything works when weight mode is disabled. > Extend logging to give more information about weight mode > - > > Key: YARN-10627 > URL: https://issues.apache.org/jira/browse/YARN-10627 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Benjamin Teke >Assignee: Benjamin Teke >Priority: Major > Attachments: YARN-10627.001.patch, YARN-10627.002.patch, > YARN-10627.003.patch, image-2021-02-20-00-07-09-875.png > > > In YARN-10504 weight mode was added, however the logged information about the > created queues or the toString methods weren't updated accordingly. 
Some > examples: > ParentQueue#setupQueueConfigs: > {code:java} > LOG.info(queueName + ", capacity=" + this.queueCapacities.getCapacity() > + ", absoluteCapacity=" + this.queueCapacities.getAbsoluteCapacity() > + ", maxCapacity=" + this.queueCapacities.getMaximumCapacity() > + ", absoluteMaxCapacity=" + this.queueCapacities > .getAbsoluteMaximumCapacity() + ", state=" + getState() + ", acls=" > + aclsString + ", labels=" + labelStrBuilder.toString() + "\n" > + ", reservationsContinueLooking=" + reservationsContinueLooking > + ", orderingPolicy=" + getQueueOrderingPolicyConfigName() > + ", priority=" + priority > + ", allowZeroCapacitySum=" + allowZeroCapacitySum); > {code} > ParentQueue#toString: > {code:java} > public String toString() { > return queueName + ": " + > "numChildQueue= " + childQueues.size() + ", " + > "capacity=" + queueCapacities.getCapacity() + ", " + > "absoluteCapacity=" + queueCapacities.getAbsoluteCapacity() + ", " + > "usedResources=" + queueUsage.getUsed() + > "usedCapacity=" + getUsedCapacity() + ", " + > "numApps=" + getNumApplications() + ", " + > "numContainers=" + getNumContainers(); > } > {code} > LeafQueue#setupQueueConfigs: > {code:java} > LOG.info( > "Initializing " + getQueuePath() + "\n" + "capacity = " > + queueCapacities.getCapacity() > + " [= (float) configuredCapacity / 100 ]" + "\n" > + "absoluteCapacity = " + queueCapacities.getAbsoluteCapacity() > + " [= parentAbsoluteCapacity * capacity ]" + "\n" > + "maxCapacity = " + queueCapacities.getMaximumCapacity() > + " [= configuredMaxCapacity ]" + "\n" + "absoluteMaxCapacity = > " > + queueCapacities.getAbsoluteMaximumCapacity() > + " [= 1.0 maximumCapacity undefined, " > + "(parentAbsoluteMaxCapacity * maximumCapacity) / 100 > otherwise ]" > + "\n" + "effectiveMinResource=" + > getEffectiveCapacity(CommonNodeLabelsManager.NO_LABEL) + "\n" > + " , effectiveMaxResource=" + > getEffectiveMaxCapacity(CommonNodeLabelsManager.NO_LABEL) > + "\n" + "userLimit = " + 
usersManager.getUserLimit() > + " [= configuredUserLimit ]" + "\n" + "userLimitFactor = " > + usersManager.getUserLimitFactor() > + " [= configuredUserLimitFactor ]" + "\n" + "maxApplications = > " > + maxApplications > + " [= configuredMaximumSystemApplicationsPerQueue or" > + " (int)(configuredMaximumSystemApplications * > absoluteCapacity)]" > + "\n" + "maxApplicationsPerUser = " + maxApplicationsPerUser > + " [= (int)(maxApplications * (userLimit / 100.0f) * " > + "userLimitFactor) ]" + "\n" > + "maxParallelApps = " + getMaxParallelApps() + "\n" > + "usedCapacity = " + > + queueCapacities.getUsedCapacity() + " [= usedResourcesMemory > / " > + "(clusterResourceMemory * absoluteCapacity)]" + "\n" > + "absoluteUsedCapacity = " + absoluteUsedCapacity > + " [= usedResourcesMemory / clusterResourceMemory]" + "\n" > + "maxAMResourcePerQueuePercent = " + > maxAMResourcePerQueuePercent > + " [= configuredMaximumAMResourcePercent ]" + "\n" > + "minimumAllocationFactor = " + minimumAllocationFactor > + " [= (float)(maximumAllocationMemory - > minimumAllocationMemory) /
[jira] [Commented] (YARN-10640) Ajust the queue Configured capacity to Configured weight number for weight mode in UI.
[ https://issues.apache.org/jira/browse/YARN-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289944#comment-17289944 ] Qi Zhu commented on YARN-10640: --- Thanks a lot [~pbacsko] for your review work.(y) > Ajust the queue Configured capacity to Configured weight number for weight > mode in UI. > --- > > Key: YARN-10640 > URL: https://issues.apache.org/jira/browse/YARN-10640 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10640.001.patch, YARN-10640.002.patch, > image-2021-02-20-11-21-50-306.png, image-2021-02-20-14-18-56-261.png, > image-2021-02-20-14-19-30-767.png > > > In weight mode: > Both the weight-mode static queues and the dynamic queues show the > Configured Capacity as 0. I think this should change to Configured Weight when > weight mode is used; this would be helpful. > Such as in a dynamic weight-mode queue: > !image-2021-02-20-11-21-50-306.png|width=528,height=374!
[jira] [Commented] (YARN-10640) Ajust the queue Configured capacity to Configured weight number for weight mode in UI.
[ https://issues.apache.org/jira/browse/YARN-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289929#comment-17289929 ] Peter Bacsko commented on YARN-10640: - [~zhuqi] Thanks for your work. I've just started to review your patches, but there are many, so I'll do my best and give feedback sooner or later.
[jira] [Commented] (YARN-10564) Support Auto Queue Creation template configurations
[ https://issues.apache.org/jira/browse/YARN-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289917#comment-17289917 ] Andras Gyori commented on YARN-10564: - Thank you [~zhuqi] for the collaboration and for the verification. I will address the checkstyle issues. > Support Auto Queue Creation template configurations > --- > > Key: YARN-10564 > URL: https://issues.apache.org/jira/browse/YARN-10564 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-10564.001.patch, YARN-10564.002.patch, > YARN-10564.003.patch, YARN-10564.004.patch, YARN-10564.poc.001.patch > > > Similar to how the template configuration works for ManagedParents, we need > to support templates for the new auto queue creation logic. The proposition is to > allow wildcards in template configs, such as: > {noformat} > yarn.scheduler.capacity.root.*.*.weight 10{noformat} > which would mean: set the weight of every leaf of every parent under root to 10. > We should possibly take an approach that could support an arbitrary depth of > template configuration, because we might need to lift the limitation on auto > queue nesting.
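As an illustration only, the proposed wildcard could appear in capacity-scheduler.xml roughly like this (the final property naming and wildcard semantics depend on how the patch is implemented):

```xml
<!-- Hypothetical capacity-scheduler.xml fragment illustrating the proposal:
     every auto-created leaf queue two levels below root gets weight 10. -->
<property>
  <name>yarn.scheduler.capacity.root.*.*.weight</name>
  <value>10</value>
</property>
```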
[jira] [Commented] (YARN-10640) Ajust the queue Configured capacity to Configured weight number for weight mode in UI.
[ https://issues.apache.org/jira/browse/YARN-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289915#comment-17289915 ] Qi Zhu commented on YARN-10640: --- [~pbacsko] Do you have any other advice? Thanks.
[jira] [Commented] (YARN-10627) Extend logging to give more information about weight mode
[ https://issues.apache.org/jira/browse/YARN-10627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289911#comment-17289911 ] Peter Bacsko commented on YARN-10627: - # There are still 4 checkstyle issues; they're not serious, but if we're not in a rush, we should fix them. # testGetCapacityOrWeightStringUsingWeights / testGetCapacityOrWeightStringParentPctLeafWeights -> make sure MockRM is closed in a finally block
[jira] [Commented] (YARN-10640) Ajust the queue Configured capacity to Configured weight number for weight mode in UI.
[ https://issues.apache.org/jira/browse/YARN-10640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289902#comment-17289902 ] Benjamin Teke commented on YARN-10640: -- [~zhuqi], thanks for working on this. LGTM, +1 (non-binding)
[jira] [Commented] (YARN-10650) Create dispatcher metrics interface, and apply to RM async dispatcher.
[ https://issues.apache.org/jira/browse/YARN-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289892#comment-17289892 ] Hadoop QA commented on YARN-10650: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 1s{color} | | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 1s{color} | | {color:green} No case conflicting files found. {color} | | {color:blue}0{color} | {color:blue} codespell {color} | {color:blue} 0m 1s{color} | | {color:blue} codespell was not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. 
{color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 8m 30s{color} | | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 30m 37s{color} | | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 56s{color} | | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 11s{color} | | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 46s{color} | | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 53s{color} | | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 36s{color} | | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 48s{color} | | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 1m 58s{color} | | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 0s{color} | [/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common-warnings.html|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2720/1/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common-warnings.html] | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in trunk has 1 extant findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 38s{color} | | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 29s{color} | | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 12s{color} | | {color:green} the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 12s{color} | | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 17s{color} | | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 8m 17s{color} | | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} blanks {color} | {color:green} 0m 0s{color} | | {color:green} The patch has no blanks issues. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 40s{color} | | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 47s{color} | | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s{color} | | {color:green} the
[jira] [Commented] (YARN-10564) Support Auto Queue Creation template configurations
[ https://issues.apache.org/jira/browse/YARN-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289891#comment-17289891 ] Qi Zhu commented on YARN-10564: --- Thanks [~gandras] for the update. I have confirmed it is included in the latest patch.
[jira] [Commented] (YARN-10564) Support Auto Queue Creation template configurations
[ https://issues.apache.org/jira/browse/YARN-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289882#comment-17289882 ] Hadoop QA commented on YARN-10564: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 8s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 2 new or modified test files. 
{color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 55s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 16m 49s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 36s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 1m 48s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 45s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 49s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 44s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 41s{color} | {color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/669/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 8 new + 185 unchanged - 0 fixed = 193 total (was 185) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 12s{color} | {color:green}{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | |
[jira] [Comment Edited] (YARN-10642) AsyncDispatcher will stuck introduced by YARN-8995.
[ https://issues.apache.org/jira/browse/YARN-10642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17287730#comment-17287730 ] Qi Zhu edited comment on YARN-10642 at 2/24/21, 9:43 AM: - Thanks [~zhengchenyu] for the update. The patch LGTM, +1. Waiting for committers to review it. [~taoyang] [~ebadger] [~bibinchundatt] [~ztang] [~bteke] Could you help review the patch? was (Author: zhuqi): Thanks [~zhengchenyu] for the update. The patch LGTM, +1. Waiting for committers to review it. [~ztang] [~bteke] Could you help review the patch? > AsyncDispatcher will stuck introduced by YARN-8995. > --- > > Key: YARN-10642 > URL: https://issues.apache.org/jira/browse/YARN-10642 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 3.2.1 >Reporter: zhengchenyu >Assignee: zhengchenyu >Priority: Critical > Attachments: MockForDeadLoop.java, YARN-10642.001.patch, > YARN-10642.002.patch, YARN-10642.003.patch, deadloop.png, debugfornode.png, > put.png, take.png > > > In our cluster, the ResourceManager got stuck twice within twenty days, and YARN clients could not submit applications. I captured jstack output the second time and found the reason. > Analyzing all the jstack output, I found many threads blocked because they could not acquire LinkedBlockingQueue's putLock. (Note: sorry, due to limited space I omit the analytical process.) > The reason is that one thread holds the putLock the whole time: printEventQueueDetails calls forEachRemaining, which holds the putLock and takeLock, so the AsyncDispatcher gets stuck. 
> {code} > Thread 6526 (IPC Server handler 454 on default port 8030): > State: RUNNABLE > Blocked count: 29988 > Waited count: 2035029 > Stack: > > java.util.concurrent.LinkedBlockingQueue$LBQSpliterator.forEachRemaining(LinkedBlockingQueue.java:926) > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) > java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) > > org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.printEventQueueDetails(AsyncDispatcher.java:270) > > org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler.handle(AsyncDispatcher.java:295) > > org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.handleProgress(DefaultAMSProcessor.java:408) > > org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.allocate(DefaultAMSProcessor.java:215) > > org.apache.hadoop.yarn.server.resourcemanager.scheduler.constraint.processor.DisabledPlacementProcessor.allocate(DisabledPlacementProcessor.java:75) > > org.apache.hadoop.yarn.server.resourcemanager.AMSProcessingChain.allocate(AMSProcessingChain.java:92) > > org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.allocate(ApplicationMasterService.java:432) > > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:60) > > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99) > > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528) > org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) > org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1040) > 
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:958) > java.security.AccessController.doPrivileged(Native Method) > {code} > I analyzed LinkedBlockingQueue's source code and found that forEachRemaining in > LinkedBlockingQueue.LBQSpliterator may get stuck when forEachRemaining and take > are called from different threads. > YARN-8995 introduced the printEventQueueDetails method, whose > "eventQueue.stream().collect" call ends up invoking forEachRemaining. > Why? "put.png" shows how put("a") works, and "take.png" shows how take() works. > Special note: a removed Node is made to point to itself to help GC! > The key code is in forEachRemaining: LBQSpliterator uses it to visit every Node, > but after reading an item's value from a Node it releases the lock. If take() is > called at that moment, the variable 'p' in forEachRemaining may end up pointing to > a Node that points to itself, and forEachRemaining enters a dead loop. You can see it in "deadloop.png". > A simple unit test reproduces the problem when forEachRemaining runs > more slowly than take; the unit test is MockForDeadLoop.java. > I debugged MockForDeadLoop.java and saw a Node
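The direction of the fix is to stop iterating the live queue through the spliterator's forEachRemaining. A minimal sketch of one safe alternative (a hypothetical helper, not the actual YARN-10642 patch): snapshot the queue with toArray(), which acquires and releases both internal locks exactly once, so it can never observe a dequeued Node that points at itself.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: instead of streaming over the live queue (which
// goes through LBQSpliterator.forEachRemaining and can interleave badly
// with take()), take a one-shot snapshot with toArray() and count off-line.
class EventQueueDetails {

    // Count queued events by class name from a one-shot snapshot.
    public static Map<String, Long> countByType(BlockingQueue<?> queue) {
        Map<String, Long> counts = new HashMap<>();
        for (Object event : queue.toArray()) {
            counts.merge(event.getClass().getSimpleName(), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        BlockingQueue<Object> q = new LinkedBlockingQueue<>();
        q.add("event-a");
        q.add("event-b");
        q.add(42);
        System.out.println(countByType(q));
    }
}
```

The trade-off is an O(n) copy of the queue contents, but it is taken under the queue's own locking and cannot dead-loop, even while other threads keep calling put() and take().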
[jira] [Commented] (YARN-10564) Support Auto Queue Creation template configurations
[ https://issues.apache.org/jira/browse/YARN-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289804#comment-17289804 ] Qi Zhu commented on YARN-10564: --- Thanks [~gandras] for the update. I just rebased it and added a test to confirm that it includes YARN-10645. > Support Auto Queue Creation template configurations > --- > > Key: YARN-10564 > URL: https://issues.apache.org/jira/browse/YARN-10564 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-10564.001.patch, YARN-10564.002.patch, > YARN-10564.003.patch, YARN-10564.004.patch, YARN-10564.poc.001.patch > > > Similar to how the template configuration works for ManagedParents, we need > to support templates for the new auto queue creation logic. The proposition is to > allow wildcards in template configs, such as: > {noformat} > yarn.scheduler.capacity.root.*.*.weight 10{noformat} > which would mean setting the weight of every leaf of every parent under > root to 10. > We should possibly take an approach that supports arbitrary depth of > template configuration, because we might need to lift the limitation on auto > queue nesting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
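The wildcard proposal above could be rendered in capacity-scheduler.xml roughly as follows (a hypothetical sketch; the property naming that finally ships in YARN-10564 may differ):

```xml
<!-- Hypothetical: set weight to 10 for every leaf of every parent under root. -->
<property>
  <name>yarn.scheduler.capacity.root.*.*.weight</name>
  <value>10</value>
</property>
```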
[jira] [Updated] (YARN-10564) Support Auto Queue Creation template configurations
[ https://issues.apache.org/jira/browse/YARN-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10564: -- Attachment: YARN-10564.004.patch > Support Auto Queue Creation template configurations > --- > > Key: YARN-10564 > URL: https://issues.apache.org/jira/browse/YARN-10564 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-10564.001.patch, YARN-10564.002.patch, > YARN-10564.003.patch, YARN-10564.004.patch, YARN-10564.poc.001.patch > > > Similar to how the template configuration works for ManagedParents, we need > to support templates for the new auto queue creation logic. Proposition is to > allow wildcards in template configs such as: > {noformat} > yarn.scheduler.capacity.root.*.*.weight 10{noformat} > which would mean, that set weight to 10 of every leaf of every parent under > root. > We should possibly take an approach, that could support arbitrary depth of > template configuration, because we might need to lift the limitation of auto > queue nesting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10645) Fix queue state related update for auto created queue.
[ https://issues.apache.org/jira/browse/YARN-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289788#comment-17289788 ] Qi Zhu commented on YARN-10645: --- Thanks [~gandras] for the update, I will check the latest patch in YARN-10564. :D > Fix queue state related update for auto created queue. > -- > > Key: YARN-10645 > URL: https://issues.apache.org/jira/browse/YARN-10645 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Attachments: YARN-10645.001.patch > > > Currently the queue state of an auto created queue can't be updated after the refactor in > YARN-10504. > We should fix the queue state related logic. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10645) Fix queue state related update for auto created queue.
[ https://issues.apache.org/jira/browse/YARN-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289783#comment-17289783 ] Andras Gyori commented on YARN-10645: - Hi [~zhuqi]. I think YARN-10564 supports this case. Can you check if it covers your use case? > Fix queue state related update for auto created queue. > -- > > Key: YARN-10645 > URL: https://issues.apache.org/jira/browse/YARN-10645 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Attachments: YARN-10645.001.patch > > > Currently the queue state of an auto created queue can't be updated after the refactor in > YARN-10504. > We should fix the queue state related logic. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10564) Support Auto Queue Creation template configurations
[ https://issues.apache.org/jira/browse/YARN-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289780#comment-17289780 ] Hadoop QA commented on YARN-10564: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 17s{color} | {color:red}{color} | {color:red} YARN-10564 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-10564 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13021098/YARN-10564.003.patch | | Console output | https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/668/console | | versions | git=2.17.1 | | Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org | This message was automatically generated. > Support Auto Queue Creation template configurations > --- > > Key: YARN-10564 > URL: https://issues.apache.org/jira/browse/YARN-10564 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-10564.001.patch, YARN-10564.002.patch, > YARN-10564.003.patch, YARN-10564.poc.001.patch > > > Similar to how the template configuration works for ManagedParents, we need > to support templates for the new auto queue creation logic. Proposition is to > allow wildcards in template configs such as: > {noformat} > yarn.scheduler.capacity.root.*.*.weight 10{noformat} > which would mean, that set weight to 10 of every leaf of every parent under > root. > We should possibly take an approach, that could support arbitrary depth of > template configuration, because we might need to lift the limitation of auto > queue nesting. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10564) Support Auto Queue Creation template configurations
[ https://issues.apache.org/jira/browse/YARN-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289781#comment-17289781 ] Andras Gyori commented on YARN-10564: - Uploaded a new revision, in which I have fixed the dynamic queue config update issue. It is now possible to update dynamic queues in the same way as static queues. The template configurations can also be overwritten by explicit settings, but they persist after reinitialization. > Support Auto Queue Creation template configurations > --- > > Key: YARN-10564 > URL: https://issues.apache.org/jira/browse/YARN-10564 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-10564.001.patch, YARN-10564.002.patch, > YARN-10564.003.patch, YARN-10564.poc.001.patch > > > Similar to how the template configuration works for ManagedParents, we need > to support templates for the new auto queue creation logic. The proposition is to > allow wildcards in template configs, such as: > {noformat} > yarn.scheduler.capacity.root.*.*.weight 10{noformat} > which would mean setting the weight of every leaf of every parent under > root to 10. > We should possibly take an approach that supports arbitrary depth of > template configuration, because we might need to lift the limitation on auto > queue nesting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10564) Support Auto Queue Creation template configurations
[ https://issues.apache.org/jira/browse/YARN-10564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Gyori updated YARN-10564: Attachment: YARN-10564.003.patch > Support Auto Queue Creation template configurations > --- > > Key: YARN-10564 > URL: https://issues.apache.org/jira/browse/YARN-10564 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-10564.001.patch, YARN-10564.002.patch, > YARN-10564.003.patch, YARN-10564.poc.001.patch > > > Similar to how the template configuration works for ManagedParents, we need > to support templates for the new auto queue creation logic. Proposition is to > allow wildcards in template configs such as: > {noformat} > yarn.scheduler.capacity.root.*.*.weight 10{noformat} > which would mean, that set weight to 10 of every leaf of every parent under > root. > We should possibly take an approach, that could support arbitrary depth of > template configuration, because we might need to lift the limitation of auto > queue nesting. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10650) Create dispatcher metrics interface, and apply to RM async dispatcher.
[ https://issues.apache.org/jira/browse/YARN-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17289770#comment-17289770 ] Qi Zhu commented on YARN-10650: --- [~ebadger] [~epayne] Could you help review this? It adds dispatcher event metrics covering event processing time and event counters. I think this is very helpful for big clusters. Thanks. > Create dispatcher metrics interface, and apply to RM async dispatcher. > -- > > Key: YARN-10650 > URL: https://issues.apache.org/jira/browse/YARN-10650 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > We need to add a dispatcher metrics interface, > and will apply it to the RM async dispatcher first. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
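A per-event-type metrics interface for the dispatcher could look roughly like the sketch below. This is a hypothetical illustration (the interface name, method signatures, and the RMEventType enum are assumptions, not the actual YARN-10650 API), shown with a minimal in-memory implementation:

```java
import java.util.EnumMap;

// Hypothetical sketch of a per-event-type dispatcher metrics interface;
// the actual YARN-10650 API may differ.
interface EventTypeMetrics<T extends Enum<T>> {
    // Record one handled event of the given type and its processing time.
    void increment(T eventType, long processingTimeUs);

    long getEventCount(T eventType);

    long getProcessingTimeUs(T eventType);
}

// Minimal in-memory implementation, synchronized for simplicity.
class SimpleEventTypeMetrics<T extends Enum<T>> implements EventTypeMetrics<T> {
    private final EnumMap<T, long[]> stats; // value = {count, totalTimeUs}

    SimpleEventTypeMetrics(Class<T> enumType) {
        this.stats = new EnumMap<>(enumType);
    }

    @Override
    public synchronized void increment(T eventType, long processingTimeUs) {
        long[] s = stats.computeIfAbsent(eventType, k -> new long[2]);
        s[0]++;
        s[1] += processingTimeUs;
    }

    @Override
    public synchronized long getEventCount(T eventType) {
        long[] s = stats.get(eventType);
        return s == null ? 0L : s[0];
    }

    @Override
    public synchronized long getProcessingTimeUs(T eventType) {
        long[] s = stats.get(eventType);
        return s == null ? 0L : s[1];
    }
}

// Example event type, used only for this sketch.
enum RMEventType { APP_ADDED, NODE_UPDATE }

class DispatcherMetricsDemo {
    public static void main(String[] args) {
        SimpleEventTypeMetrics<RMEventType> m =
                new SimpleEventTypeMetrics<>(RMEventType.class);
        m.increment(RMEventType.APP_ADDED, 120);
        m.increment(RMEventType.APP_ADDED, 80);
        // prints "2 events, 200 us total"
        System.out.println(m.getEventCount(RMEventType.APP_ADDED) + " events, "
                + m.getProcessingTimeUs(RMEventType.APP_ADDED) + " us total");
    }
}
```

A dispatcher would call increment() once per handled event with the measured handling time, and a metrics sink would periodically read the counters per event type.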
[jira] [Updated] (YARN-10650) Create dispatcher metrics interface, and apply to RM async dispatcher.
[ https://issues.apache.org/jira/browse/YARN-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated YARN-10650: -- Labels: pull-request-available (was: ) > Create dispatcher metrics interface, and apply to RM async dispatcher. > -- > > Key: YARN-10650 > URL: https://issues.apache.org/jira/browse/YARN-10650 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > It is needed for us to add dispatcher metrics interface. > And will apply to RM async dispatcher first. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-10650) Create dispatcher metrics interface, and apply to RM async dispatcher.
Qi Zhu created YARN-10650: - Summary: Create dispatcher metrics interface, and apply to RM async dispatcher. Key: YARN-10650 URL: https://issues.apache.org/jira/browse/YARN-10650 Project: Hadoop YARN Issue Type: Improvement Reporter: Qi Zhu Assignee: Qi Zhu It is needed for us to add dispatcher metrics interface. And will apply to RM async dispatcher first. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org