[jira] [Updated] (YARN-10702) Add cluster metric for amount of CPU used by RM Event Processor
[ https://issues.apache.org/jira/browse/YARN-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Badger updated YARN-10702: --- Fix Version/s: 3.3.1 3.4.0 Thanks for the patch, [~Jim_Brennan]. I've committed it to branch-3.3, so it has now been committed to trunk (3.4) and branch-3.3. There's another conflict with branch-3.2. If you'd like it to go back there, please provide a patch for that branch as well. Also, a belated thanks to [~gandras] and [~zhuqi] for the reviews on the original patch. > Add cluster metric for amount of CPU used by RM Event Processor > --- > > Key: YARN-10702 > URL: https://issues.apache.org/jira/browse/YARN-10702 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Affects Versions: 2.10.1, 3.4.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Minor > Fix For: 3.4.0, 3.3.1 > > Attachments: Scheduler-Busy.png, YARN-10702-branch-3.3.006.patch, > YARN-10702.001.patch, YARN-10702.002.patch, YARN-10702.003.patch, > YARN-10702.004.patch, YARN-10702.005.patch, YARN-10702.006.patch, > simon-scheduler-busy.png > > > Add a cluster metric to track the CPU usage of the ResourceManager Event > Processing thread. This lets us know when the critical path of the RM is > running out of headroom. > This feature was originally added for us internally by [~nroberts] and we've > been running with it on production clusters for nearly four years. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
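One way to measure how busy a single thread is, as the YARN-10702 description asks for, is to sample its CPU time through the JDK's ThreadMXBean and compare it with elapsed wall-clock time. The following is only a minimal sketch under that assumption; the class ThreadCpuSampler and its methods are hypothetical names for illustration and are not taken from the committed patch.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Hypothetical sampler; not the class added by the YARN-10702 patch. */
class ThreadCpuSampler {
  private final ThreadMXBean mxBean = ManagementFactory.getThreadMXBean();
  private final long threadId;
  private long lastCpuNanos;
  private long lastWallNanos;

  ThreadCpuSampler(Thread thread) {
    this.threadId = thread.getId();
    this.lastCpuNanos = readCpuNanos();
    this.lastWallNanos = System.nanoTime();
  }

  /** CPU time of the watched thread in nanoseconds, or -1 if unavailable. */
  private long readCpuNanos() {
    if (!mxBean.isThreadCpuTimeSupported() || !mxBean.isThreadCpuTimeEnabled()) {
      return -1;
    }
    return mxBean.getThreadCpuTime(threadId); // -1 if the thread is not alive
  }

  /** Share of one core used over an interval, clamped to the range 0..100. */
  static long busyPercent(long cpuDeltaNanos, long wallDeltaNanos) {
    if (cpuDeltaNanos < 0 || wallDeltaNanos <= 0) {
      return 0;
    }
    return Math.min(100, (cpuDeltaNanos * 100) / wallDeltaNanos);
  }

  /** Busy percentage since the previous call, or -1 if CPU time is unavailable. */
  synchronized long sample() {
    long cpu = readCpuNanos();
    long wall = System.nanoTime();
    long result = (cpu < 0 || lastCpuNanos < 0)
        ? -1
        : busyPercent(cpu - lastCpuNanos, wall - lastWallNanos);
    lastCpuNanos = cpu;
    lastWallNanos = wall;
    return result;
  }
}
```

A periodic task could call sample() on the event-processing thread and publish the value as a gauge; a reading that stays near 100 would suggest the thread has little headroom left.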
[jira] [Commented] (YARN-10714) Remove dangling dynamic queues on reinitialization
[ https://issues.apache.org/jira/browse/YARN-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315775#comment-17315775 ] Hadoop QA commented on YARN-10714: -- | (/) *+1 overall* |
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 39s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 24m 57s | | trunk passed |
| +1 | compile | 1m 1s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | compile | 0m 51s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | checkstyle | 0m 44s | | trunk passed |
| +1 | mvnsite | 0m 53s | | trunk passed |
| +1 | shadedclient | 17m 0s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 41s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javadoc | 0m 37s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| 0 | spotbugs | 20m 8s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 1m 51s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 0m 49s | | the patch passed |
| +1 | compile | 0m 54s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javac | 0m 54s | | the patch passed |
| +1 | compile | 0m 46s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | javac | 0m 46s | | the patch passed |
| -0 | checkstyle | 0m 39s | https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/900/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 2 unchanged - 2 fixed = 3 total (was 4) |
| +1 | mvnsite | 0m 48s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 14m 55s | | patch has no errors when building and testing our client artifacts. |
| +1 |
[jira] [Created] (YARN-10727) ParentQueue does not validate the queue on removal
Andras Gyori created YARN-10727: --- Summary: ParentQueue does not validate the queue on removal Key: YARN-10727 URL: https://issues.apache.org/jira/browse/YARN-10727 Project: Hadoop YARN Issue Type: Bug Reporter: Andras Gyori Assignee: Andras Gyori With the addition of YARN-10532, ParentQueue has a public method, removeQueue, which allows the deletion of a queue at runtime. However, the queue to be removed is not validated, so it is possible to remove a queue from the CSQueueManager that is not a child of the ParentQueue. Since it is a public method, it should validate its argument, for example: * check that the parent of the queue to be removed is the current ParentQueue * check that the parent actually contains the queue in its childQueues collection
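The two checks proposed in YARN-10727 can be sketched as follows. This is an illustration under simplifying assumptions, not the eventual patch: SketchQueue and SketchParentQueue are hypothetical stand-ins for the real CSQueue and ParentQueue classes, and the choice of IllegalArgumentException is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for a scheduler queue; not the real CSQueue interface. */
class SketchQueue {
  final String name;
  final SketchParentQueue parent; // null for the root queue

  SketchQueue(String name, SketchParentQueue parent) {
    this.name = name;
    this.parent = parent;
  }
}

/** Simplified stand-in for ParentQueue with a validated removeQueue. */
class SketchParentQueue extends SketchQueue {
  private final List<SketchQueue> childQueues = new ArrayList<>();

  SketchParentQueue(String name, SketchParentQueue parent) {
    super(name, parent);
  }

  void addChild(SketchQueue queue) {
    childQueues.add(queue);
  }

  int childCount() {
    return childQueues.size();
  }

  void removeQueue(SketchQueue queue) {
    // Check 1: the queue must name this ParentQueue as its parent.
    if (queue.parent != this) {
      throw new IllegalArgumentException(
          queue.name + " is not a child of " + name);
    }
    // Check 2: the queue must actually be present in childQueues.
    if (!childQueues.contains(queue)) {
      throw new IllegalArgumentException(
          queue.name + " is not registered under " + name);
    }
    childQueues.remove(queue);
  }
}
```

With these checks in place, asking one parent to remove a queue that belongs to a different parent fails fast instead of silently mutating shared state.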
[jira] [Commented] (YARN-10714) Remove dangling dynamic queues on reinitialization
[ https://issues.apache.org/jira/browse/YARN-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315674#comment-17315674 ] Andras Gyori commented on YARN-10714: - Uploaded a new patch with extra validation and test scenarios. > Remove dangling dynamic queues on reinitialization > -- > > Key: YARN-10714 > URL: https://issues.apache.org/jira/browse/YARN-10714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-10714.001.patch, YARN-10714.002.patch, > YARN-10714.003.patch > > > Current logic does not handle orphaned auto created child queues. The > following example steps show a scenario in which it is possible to submit > applications to an orphaned queue, that has an invalid (already removed) > ParentQueue. > # Auto create a queue root.a.a-auto > # Remove root.a from the config > # Reinitialize CS without restarting it (possible via mutation API) > # Submit application to root.a.a-auto, while root.a is a non-existent queue
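The scenario in the steps above comes down to a dynamic queue whose parent path no longer exists after reinitialization. A minimal sketch of detecting such queues, assuming queues are identified by dot-separated path strings, might look like this. DanglingQueueFinder and its methods are hypothetical and do not reflect how the YARN-10714 patch is actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper working on queue path strings only. */
class DanglingQueueFinder {

  /** Returns the parent path of a queue path, or null for the root queue. */
  static String parentPath(String queuePath) {
    int lastDot = queuePath.lastIndexOf('.');
    return lastDot < 0 ? null : queuePath.substring(0, lastDot);
  }

  /** Collects dynamic queues whose parent is missing from the new config. */
  static List<String> findDangling(Set<String> configuredQueues,
                                   Collection<String> dynamicQueues) {
    List<String> dangling = new ArrayList<>();
    for (String queue : dynamicQueues) {
      String parent = parentPath(queue);
      if (parent != null && !configuredQueues.contains(parent)) {
        dangling.add(queue);
      }
    }
    return dangling;
  }

  public static void main(String[] args) {
    // root.a was removed from the config; root.b is still configured.
    Set<String> configured = new HashSet<>(Arrays.asList("root", "root.b"));
    List<String> dynamic = Arrays.asList("root.a.a-auto", "root.b.b-auto");
    System.out.println(findDangling(configured, dynamic));
  }
}
```

Queues reported by such a check could then be removed during reinitialization so that no application can be submitted to them.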
[jira] [Updated] (YARN-10714) Remove dangling dynamic queues on reinitialization
[ https://issues.apache.org/jira/browse/YARN-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andras Gyori updated YARN-10714: Attachment: YARN-10714.003.patch > Remove dangling dynamic queues on reinitialization > -- > > Key: YARN-10714 > URL: https://issues.apache.org/jira/browse/YARN-10714 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Andras Gyori >Assignee: Andras Gyori >Priority: Major > Attachments: YARN-10714.001.patch, YARN-10714.002.patch, > YARN-10714.003.patch > > > Current logic does not handle orphaned auto created child queues. The > following example steps show a scenario in which it is possible to submit > applications to an orphaned queue, that has an invalid (already removed) > ParentQueue. > # Auto create a queue root.a.a-auto > # Remove root.a from the config > # Reinitialize CS without restarting it (possible via mutation API) > # Submit application to root.a.a-auto, while root.a is a non-existent queue
[jira] [Commented] (YARN-10702) Add cluster metric for amount of CPU used by RM Event Processor
[ https://issues.apache.org/jira/browse/YARN-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315597#comment-17315597 ] Jim Brennan commented on YARN-10702: The failed unit test for branch-3.3 is unrelated. Looks like it was fixed in [YARN-10337], which was only committed to trunk. > Add cluster metric for amount of CPU used by RM Event Processor > --- > > Key: YARN-10702 > URL: https://issues.apache.org/jira/browse/YARN-10702 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Affects Versions: 2.10.1, 3.4.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Minor > Attachments: Scheduler-Busy.png, YARN-10702-branch-3.3.006.patch, > YARN-10702.001.patch, YARN-10702.002.patch, YARN-10702.003.patch, > YARN-10702.004.patch, YARN-10702.005.patch, YARN-10702.006.patch, > simon-scheduler-busy.png > > > Add a cluster metric to track the cpu usage of the ResourceManager Event > Processing thread. This lets us know when the critical path of the RM is > running out of headroom. > This feature was originally added for us internally by [~nroberts] and we've > been running with it on production clusters for nearly four years.
[jira] [Commented] (YARN-10503) Support queue capacity in terms of absolute resources with custom resourceType.
[ https://issues.apache.org/jira/browse/YARN-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315488#comment-17315488 ] Hadoop QA commented on YARN-10503: -- | (/) *+1 overall* |
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 23s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests || ||
| 0 | mvndep | 1m 46s | | Maven dependency ordering for branch |
| +1 | mvninstall | 22m 32s | | trunk passed |
| +1 | compile | 9m 57s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | compile | 8m 19s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | checkstyle | 1m 40s | | trunk passed |
| +1 | mvnsite | 1m 50s | | trunk passed |
| +1 | shadedclient | 19m 7s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 30s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javadoc | 1m 23s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| 0 | spotbugs | 26m 0s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 4m 2s | | trunk passed |
|| || || || Patch Compile Tests || ||
| 0 | mvndep | 0m 21s | | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 25s | | the patch passed |
| +1 | compile | 9m 8s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javac | 9m 8s | | the patch passed |
| +1 | compile | 8m 14s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | javac | 8m 14s | | the patch passed |
| +1 | checkstyle | 1m 35s | | the patch passed |
| +1 | mvnsite | 1m 46s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | | The patch has no ill-formed XML file. |
| +1 | shadedclient
[jira] [Comment Edited] (YARN-10723) Change CS nodes page in UI to support custom resource.
[ https://issues.apache.org/jira/browse/YARN-10723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315382#comment-17315382 ] Qi Zhu edited comment on YARN-10723 at 4/6/21, 9:22 AM: Thanks [~gandras] for the very good suggestions. I updated it in the latest patch. I only have the original GPU page; I will update the screenshot. It will look the same with the patch applied. !image-2021-04-06-17-22-32-733.png|width=613,height=69! was (Author: zhuqi): Thanks [~gandras] for the very good suggestions. I updated it in the latest patch. I only have the original GPU page; I will update the screenshot. It will look the same with the patch applied. > Change CS nodes page in UI to support custom resource. > -- > > Key: YARN-10723 > URL: https://issues.apache.org/jira/browse/YARN-10723 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10723.001.patch, YARN-10723.002.patch, > YARN-10723.003.patch, YARN-10723.004.patch, image-2021-04-06-17-22-32-733.png > > > The nodes page currently supports only GPU as a custom resource. > We should support all custom resources.
[jira] [Commented] (YARN-10723) Change CS nodes page in UI to support custom resource.
[ https://issues.apache.org/jira/browse/YARN-10723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315382#comment-17315382 ] Qi Zhu commented on YARN-10723: --- Thanks [~gandras] for the very good suggestions. I updated it in the latest patch. I only have the original GPU page; I will update the screenshot. It will look the same with the patch applied. > Change CS nodes page in UI to support custom resource. > -- > > Key: YARN-10723 > URL: https://issues.apache.org/jira/browse/YARN-10723 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10723.001.patch, YARN-10723.002.patch, > YARN-10723.003.patch, YARN-10723.004.patch > > > The nodes page currently supports only GPU as a custom resource. > We should support all custom resources.
[jira] [Updated] (YARN-10723) Change CS nodes page in UI to support custom resource.
[ https://issues.apache.org/jira/browse/YARN-10723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10723: -- Attachment: YARN-10723.004.patch > Change CS nodes page in UI to support custom resource. > -- > > Key: YARN-10723 > URL: https://issues.apache.org/jira/browse/YARN-10723 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10723.001.patch, YARN-10723.002.patch, > YARN-10723.003.patch, YARN-10723.004.patch > > > The nodes page currently supports only GPU as a custom resource. > We should support all custom resources.
[jira] [Commented] (YARN-10723) Change CS nodes page in UI to support custom resource.
[ https://issues.apache.org/jira/browse/YARN-10723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315369#comment-17315369 ] Andras Gyori commented on YARN-10723: - Thank you [~zhuqi] for working on this issue. The patch looks good to me. Can you provide a screenshot of a custom resource on UI1? I have some minor comments: * 248,250: String.valueOf is unnecessary * nodesPage#L116 has a null check, but it is unnecessary. If this Map could contain a null key, that is already a problem; if it cannot, the check is redundant. In either case, integerEntry.getKey().equals is called before the null check, which would throw an NPE if there were a null key > Change CS nodes page in UI to support custom resource. > -- > > Key: YARN-10723 > URL: https://issues.apache.org/jira/browse/YARN-10723 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10723.001.patch, YARN-10723.002.patch, > YARN-10723.003.patch > > > The nodes page currently supports only GPU as a custom resource. > We should support all custom resources.
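The ordering problem raised in the review can be reproduced in isolation. The sketch below is not the code from the patch: the class NullKeyCheckDemo, the String key type, and the method names are assumptions made for illustration. It shows why a null check placed after getKey().equals(...) cannot protect against a null key, and how java.util.Objects.equals avoids the issue.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical demo of null-check ordering on map entries. */
class NullKeyCheckDemo {

  /** Mirrors the reviewed pattern: equals is invoked before the null check. */
  static boolean matchesUnsafe(Map.Entry<String, Integer> entry, String wanted) {
    return entry.getKey().equals(wanted) && entry.getKey() != null;
  }

  /** Null-safe comparison; no separate null check is needed. */
  static boolean matchesSafe(Map.Entry<String, Integer> entry, String wanted) {
    return Objects.equals(entry.getKey(), wanted);
  }

  public static void main(String[] args) {
    Map<String, Integer> resources = new HashMap<>();
    resources.put(null, 1); // HashMap permits a single null key

    for (Map.Entry<String, Integer> entry : resources.entrySet()) {
      System.out.println("safe: " + matchesSafe(entry, "gpu"));
      try {
        matchesUnsafe(entry, "gpu");
      } catch (NullPointerException e) {
        System.out.println("unsafe: NullPointerException before the null check ran");
      }
    }
  }
}
```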
[jira] [Comment Edited] (YARN-10503) Support queue capacity in terms of absolute resources with custom resourceType.
[ https://issues.apache.org/jira/browse/YARN-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315349#comment-17315349 ] Qi Zhu edited comment on YARN-10503 at 4/6/21, 8:51 AM: Thanks [~gandras] for the review and confirmation. The suggestion is valid; I moved getCustomResourcesStrings to ResourceUtils in the latest patch. [~pbacsko] Do you have any other advice? Thanks. was (Author: zhuqi): Thanks [~gandras] for the review and confirmation. The suggestion is valid; I moved getCustomResourcesStrings to ResourceUtils in the latest patch. > Support queue capacity in terms of absolute resources with custom > resourceType. > --- > > Key: YARN-10503 > URL: https://issues.apache.org/jira/browse/YARN-10503 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Attachments: YARN-10503.001.patch, YARN-10503.002.patch, > YARN-10503.003.patch, YARN-10503.004.patch, YARN-10503.005.patch, > YARN-10503.006.patch, YARN-10503.007.patch, YARN-10503.008.patch, > YARN-10503.009.patch, YARN-10503.010.patch > > > Currently, the only absolute resources are memory and vcores. > {code:java} > /** > * Different resource types supported. > */ > public enum AbsoluteResourceType { > MEMORY, VCORES; > }{code} > But in our GPU production clusters, we need to support more resourceTypes. > This is very important for cluster scaling when there are absolute demands > for different resourceTypes. > > This Jira will handle GPU first.
[jira] [Comment Edited] (YARN-10657) We should make max application per queue to support node label.
[ https://issues.apache.org/jira/browse/YARN-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315353#comment-17315353 ] Qi Zhu edited comment on YARN-10657 at 4/6/21, 8:49 AM: Thanks [~gandras] for the review. I agree that the current approach is not true node label support. However, the original code only uses the maxApplication value of the last node label, which may be very small and can have a big side effect for the user, so for now I just use the max over all labels as an admittedly ugly fix for this problem. Also, I agree that true node label support would involve too much work for a feature that is not a high priority item. [~pbacsko] [~epayne] What are your opinions? Thanks. was (Author: zhuqi): Thanks [~gandras] for the review. I agree that the current approach is not true node label support. However, the original code only uses the maxApplication value of the last node label, which may be very small and can have a big side effect for the user, so for now I just use the max over all labels as an admittedly ugly fix for this problem. I agree that true node label support would involve too much work for a feature that is not a high priority item. [~pbacsko] [~epayne] What are your opinions? Thanks. > We should make max application per queue to support node label. > --- > > Key: YARN-10657 > URL: https://issues.apache.org/jira/browse/YARN-10657 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10657.001.patch, YARN-10657.002.patch > > > https://issues.apache.org/jira/browse/YARN-10641?focusedCommentId=17291708=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17291708 > As we discussed in the above comment: > We should dig deeper into the label-related max applications per queue. > I think when node labels are enabled in a queue, max applications should consider > the max capacity of all labels. 
[jira] [Commented] (YARN-10657) We should make max application per queue to support node label.
[ https://issues.apache.org/jira/browse/YARN-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315353#comment-17315353 ] Qi Zhu commented on YARN-10657: --- Thanks [~gandras] for the review. I agree that the current approach is not true node label support. However, the original code only uses the maxApplication value of the last node label, which may be very small and can have a big side effect for the user, so for now I just use the max over all labels as an admittedly ugly fix for this problem. I agree that true node label support would involve too much work for a feature that is not a high priority item. [~pbacsko] [~epayne] What are your opinions? Thanks. > We should make max application per queue to support node label. > --- > > Key: YARN-10657 > URL: https://issues.apache.org/jira/browse/YARN-10657 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10657.001.patch, YARN-10657.002.patch > > > https://issues.apache.org/jira/browse/YARN-10641?focusedCommentId=17291708=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17291708 > As we discussed in the above comment: > We should dig deeper into the label-related max applications per queue. > I think when node labels are enabled in a queue, max applications should consider > the max capacity of all labels.
[jira] [Commented] (YARN-10503) Support queue capacity in terms of absolute resources with custom resourceType.
[ https://issues.apache.org/jira/browse/YARN-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315349#comment-17315349 ] Qi Zhu commented on YARN-10503: --- Thanks [~gandras] for the review and confirmation. The suggestion is valid; I moved getCustomResourcesStrings to ResourceUtils in the latest patch. > Support queue capacity in terms of absolute resources with custom > resourceType. > --- > > Key: YARN-10503 > URL: https://issues.apache.org/jira/browse/YARN-10503 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Attachments: YARN-10503.001.patch, YARN-10503.002.patch, > YARN-10503.003.patch, YARN-10503.004.patch, YARN-10503.005.patch, > YARN-10503.006.patch, YARN-10503.007.patch, YARN-10503.008.patch, > YARN-10503.009.patch, YARN-10503.010.patch > > > Currently, the only absolute resources are memory and vcores. > {code:java} > /** > * Different resource types supported. > */ > public enum AbsoluteResourceType { > MEMORY, VCORES; > }{code} > But in our GPU production clusters, we need to support more resourceTypes. > This is very important for cluster scaling when there are absolute demands > for different resourceTypes. > > This Jira will handle GPU first.
[jira] [Updated] (YARN-10503) Support queue capacity in terms of absolute resources with custom resourceType.
[ https://issues.apache.org/jira/browse/YARN-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Qi Zhu updated YARN-10503: -- Attachment: YARN-10503.010.patch > Support queue capacity in terms of absolute resources with custom > resourceType. > --- > > Key: YARN-10503 > URL: https://issues.apache.org/jira/browse/YARN-10503 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Attachments: YARN-10503.001.patch, YARN-10503.002.patch, > YARN-10503.003.patch, YARN-10503.004.patch, YARN-10503.005.patch, > YARN-10503.006.patch, YARN-10503.007.patch, YARN-10503.008.patch, > YARN-10503.009.patch, YARN-10503.010.patch > > > Currently, the only absolute resources are memory and vcores. > {code:java} > /** > * Different resource types supported. > */ > public enum AbsoluteResourceType { > MEMORY, VCORES; > }{code} > But in our GPU production clusters, we need to support more resourceTypes. > This is very important for cluster scaling when there are absolute demands > for different resourceTypes. > > This Jira will handle GPU first.
[jira] [Commented] (YARN-10657) We should make max application per queue to support node label.
[ https://issues.apache.org/jira/browse/YARN-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315348#comment-17315348 ] Andras Gyori commented on YARN-10657: - Thank you [~zhuqi] for working on this issue. After analysing the affected code I have the following concerns: * The current approach is not true node label support; it still uses a single maxApplication value for all nodeLabels. I am not sure the current approach would make anything better, but it could easily create confusion. * To be able to support true nodeLabels, one would need to store a maxApplication value for each nodeLabel (e.g. in a Map). However, as I see it, on application submission, where this maxApplication is retrieved and used as some sort of validation in the queue, we do not have a reference to the node label. I think it would involve too much work for a feature that is not a high priority item. > We should make max application per queue to support node label. > --- > > Key: YARN-10657 > URL: https://issues.apache.org/jira/browse/YARN-10657 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Major > Attachments: YARN-10657.001.patch, YARN-10657.002.patch > > > https://issues.apache.org/jira/browse/YARN-10641?focusedCommentId=17291708=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17291708 > As we discussed in the above comment: > We should dig deeper into the label-related max applications per queue. > I think when node labels are enabled in a queue, max applications should consider > the max capacity of all labels.
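To make the two options in this thread concrete, the sketch below contrasts a per-label lookup (the Map mentioned in the review) with the interim approach of taking the maximum over all labels. LabelMaxApplications and its methods are hypothetical names for illustration and do not correspond to any class in the CapacityScheduler code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder of max-application limits keyed by node label. */
class LabelMaxApplications {
  private final Map<String, Integer> maxAppsByLabel = new HashMap<>();

  void set(String label, int maxApplications) {
    maxAppsByLabel.put(label, maxApplications);
  }

  /** True per-label support: requires knowing the label at submission time. */
  int forLabel(String label, int fallback) {
    return maxAppsByLabel.getOrDefault(label, fallback);
  }

  /** Interim approach from the thread: one limit, the largest over all labels. */
  int maxOverLabels(int fallback) {
    return maxAppsByLabel.isEmpty()
        ? fallback
        : Collections.max(maxAppsByLabel.values());
  }
}
```

The per-label lookup is only usable where the node label is known, which, according to the review comment, is not the case at application submission; the single maximum needs no such reference.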
[jira] [Commented] (YARN-10503) Support queue capacity in terms of absolute resources with custom resourceType.
[ https://issues.apache.org/jira/browse/YARN-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315321#comment-17315321 ] Andras Gyori commented on YARN-10503: - Thank you [~zhuqi] for the patch. I think I understand the points made by [~ebadger], but I see that the repetition was necessary, because AbsoluteResourceType.valueOf would throw an exception if encountering an unknown secondary resource. I thought about a solution, but I do not think it is worth it. My only suggestion would be to move getCustomResourcesStrings to the ResourceUtils instead. Apart from this, it looks good to me. > Support queue capacity in terms of absolute resources with custom > resourceType. > --- > > Key: YARN-10503 > URL: https://issues.apache.org/jira/browse/YARN-10503 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Qi Zhu >Assignee: Qi Zhu >Priority: Critical > Attachments: YARN-10503.001.patch, YARN-10503.002.patch, > YARN-10503.003.patch, YARN-10503.004.patch, YARN-10503.005.patch, > YARN-10503.006.patch, YARN-10503.007.patch, YARN-10503.008.patch, > YARN-10503.009.patch > > > Currently, the only absolute resources are memory and vcores. > {code:java} > /** > * Different resource types supported. > */ > public enum AbsoluteResourceType { > MEMORY, VCORES; > }{code} > But in our GPU production clusters, we need to support more resourceTypes. > This is very important for cluster scaling when there are absolute demands > for different resourceTypes. > > This Jira will handle GPU first.
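The exception mentioned in the review is standard Java behaviour: Enum.valueOf throws IllegalArgumentException for a name that is not a declared constant. The sketch below reuses the enum from the issue description; the wrapper class AbsoluteResourceTypeDemo and the helper tryParse are hypothetical and only illustrate one way a caller could separate built-in types from custom ones such as GPU.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical demo around the enum quoted in the issue description. */
class AbsoluteResourceTypeDemo {

  /** Different resource types supported (copied from the description). */
  enum AbsoluteResourceType {
    MEMORY, VCORES
  }

  /** Returns the matching constant, or empty for custom resource names. */
  static Optional<AbsoluteResourceType> tryParse(String name) {
    try {
      return Optional.of(
          AbsoluteResourceType.valueOf(name.toUpperCase(Locale.ROOT)));
    } catch (IllegalArgumentException e) {
      // valueOf rejects any name that is not a declared constant.
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    System.out.println(tryParse("memory").isPresent());
    System.out.println(tryParse("gpu").isPresent());
  }
}
```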