[jira] [Updated] (YARN-10702) Add cluster metric for amount of CPU used by RM Event Processor

2021-04-06 Thread Eric Badger (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated YARN-10702:
---
Fix Version/s: 3.3.1, 3.4.0

Thanks for the patch, [~Jim_Brennan]. I've committed it to branch-3.3, so it is 
now in trunk (3.4) and branch-3.3. There's another conflict with branch-3.2. 
If you'd like it to go back there, please provide a patch for that branch as 
well.

Also, a belated thanks to [~gandras] and [~zhuqi] for the reviews on the 
original patch.

> Add cluster metric for amount of CPU used by RM Event Processor
> ---
>
> Key: YARN-10702
> URL: https://issues.apache.org/jira/browse/YARN-10702
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 2.10.1, 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Fix For: 3.4.0, 3.3.1
>
> Attachments: Scheduler-Busy.png, YARN-10702-branch-3.3.006.patch, 
> YARN-10702.001.patch, YARN-10702.002.patch, YARN-10702.003.patch, 
> YARN-10702.004.patch, YARN-10702.005.patch, YARN-10702.006.patch, 
> simon-scheduler-busy.png
>
>
> Add a cluster metric to track the CPU usage of the ResourceManager Event 
> Processing thread. This lets us know when the critical path of the RM is 
> running out of headroom.
> This feature was originally added for us internally by [~nroberts] and we've 
> been running with it on production clusters for nearly four years.
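
As a rough illustration of the kind of measurement the description refers to, per-thread CPU time can be sampled with the JDK's ThreadMXBean and compared against elapsed wall-clock time. The following is a minimal sketch under that assumption; the class ThreadCpuSampler and its method names are hypothetical and are not taken from the YARN-10702 patch.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper for illustration only; not the class added by the patch. */
final class ThreadCpuSampler {
  private final ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
  private final long threadId;
  private long lastCpuNanos;
  private long lastWallNanos;

  ThreadCpuSampler(Thread thread) {
    this.threadId = thread.getId();
    this.lastCpuNanos = cpuNanos();
    this.lastWallNanos = System.nanoTime();
  }

  /** CPU time consumed by the thread so far; 0 if unsupported, disabled, or the thread is gone. */
  private long cpuNanos() {
    if (!threadBean.isThreadCpuTimeSupported()) {
      return 0L;
    }
    // getThreadCpuTime returns -1 when measurement is disabled or the thread is not alive.
    return Math.max(threadBean.getThreadCpuTime(threadId), 0L);
  }

  /** Share of one core (0..100) that the thread used since the previous call. */
  synchronized int samplePercent() {
    long cpuNow = cpuNanos();
    long wallNow = System.nanoTime();
    long cpuDelta = cpuNow - lastCpuNanos;
    long wallDelta = wallNow - lastWallNanos;
    lastCpuNanos = cpuNow;
    lastWallNanos = wallNow;
    if (wallDelta <= 0 || cpuDelta <= 0) {
      return 0;
    }
    return (int) Math.min(100L, cpuDelta * 100L / wallDelta);
  }
}
```

A value that stays close to 100 would suggest the sampled thread has little headroom left. How the real metric is sampled and published is defined by the patch itself, not by this sketch.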



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10714) Remove dangling dynamic queues on reinitialization

2021-04-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315775#comment-17315775
 ] 

Hadoop QA commented on YARN-10714:
--

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 39s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 24m 57s | | trunk passed |
| +1 | compile | 1m 1s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | compile | 0m 51s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | checkstyle | 0m 44s | | trunk passed |
| +1 | mvnsite | 0m 53s | | trunk passed |
| +1 | shadedclient | 17m 0s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 41s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javadoc | 0m 37s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| 0 | spotbugs | 20m 8s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 1m 51s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 0m 49s | | the patch passed |
| +1 | compile | 0m 54s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javac | 0m 54s | | the patch passed |
| +1 | compile | 0m 46s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | javac | 0m 46s | | the patch passed |
| -0 | checkstyle | 0m 39s | https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/900/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 2 unchanged - 2 fixed = 3 total (was 4) |
| +1 | mvnsite | 0m 48s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 14m 55s | | patch has no errors when building and testing our client artifacts. |
| +1 | 

[jira] [Created] (YARN-10727) ParentQueue does not validate the queue on removal

2021-04-06 Thread Andras Gyori (Jira)
Andras Gyori created YARN-10727:
---

 Summary: ParentQueue does not validate the queue on removal
 Key: YARN-10727
 URL: https://issues.apache.org/jira/browse/YARN-10727
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Andras Gyori
Assignee: Andras Gyori


With the addition of YARN-10532, ParentQueue has a public method, removeQueue, 
which allows the deletion of a queue at runtime. However, the queue to be 
removed is not validated, so it is possible to remove a queue from the 
CSQueueManager that is not a child of the ParentQueue. Since it is a public 
method, it should perform validations such as:
 * check that the parent of the queue to be removed is the current ParentQueue
 * check that the parent actually contains the queue in its childQueues 
collection
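
The two checks listed above could look roughly like the following. This is only a sketch with simplified stand-in classes: SketchQueue and SketchParentQueue are hypothetical and do not mirror the real CSQueue or ParentQueue signatures, and the choice of IllegalArgumentException is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for a queue; not the real CSQueue interface. */
class SketchQueue {
  final String name;
  SketchParentQueue parent;

  SketchQueue(String name) {
    this.name = name;
  }
}

/** Simplified stand-in for ParentQueue, showing only the validation idea. */
class SketchParentQueue extends SketchQueue {
  private final List<SketchQueue> childQueues = new ArrayList<>();

  SketchParentQueue(String name) {
    super(name);
  }

  void addChild(SketchQueue queue) {
    queue.parent = this;
    childQueues.add(queue);
  }

  /** Removes a child queue, rejecting queues that do not belong to this parent. */
  void removeQueue(SketchQueue queue) {
    if (queue == null) {
      throw new IllegalArgumentException("queue must not be null");
    }
    // Check 1: the queue's parent must be this ParentQueue.
    if (queue.parent != this) {
      throw new IllegalArgumentException(
          queue.name + " is not a child of " + name);
    }
    // Check 2: the queue must actually be present in childQueues.
    if (!childQueues.contains(queue)) {
      throw new IllegalArgumentException(
          queue.name + " is not registered under " + name);
    }
    childQueues.remove(queue);
  }

  int childCount() {
    return childQueues.size();
  }
}
```

With these checks, calling removeQueue on a parent that does not own the queue fails fast instead of silently removing the queue from the manager.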






[jira] [Commented] (YARN-10714) Remove dangling dynamic queues on reinitialization

2021-04-06 Thread Andras Gyori (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315674#comment-17315674
 ] 

Andras Gyori commented on YARN-10714:
-

Uploaded a new patch with extra validation and test scenarios.

> Remove dangling dynamic queues on reinitialization
> --
>
> Key: YARN-10714
> URL: https://issues.apache.org/jira/browse/YARN-10714
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Andras Gyori
>Assignee: Andras Gyori
>Priority: Major
> Attachments: YARN-10714.001.patch, YARN-10714.002.patch, 
> YARN-10714.003.patch
>
>
> The current logic does not handle orphaned auto-created child queues. The 
> following steps show a scenario in which it is possible to submit 
> applications to an orphaned queue that has an invalid (already removed) 
> ParentQueue.
>  # Auto-create a queue root.a.a-auto
>  # Remove root.a from the config
>  # Reinitialize CS without restarting it (possible via mutation API)
>  # Submit an application to root.a.a-auto, while root.a is a non-existent queue
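
The scenario above can be sketched as a check run at reinitialization time: any dynamic queue whose parent path is no longer configured is treated as dangling. This is a minimal illustration that models queues as plain path strings; DanglingQueueFinder and its methods are hypothetical, it is not the logic of the attached patches, and it ignores nested dynamic parents.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

/** Hypothetical helper; models queues as dotted path strings only. */
final class DanglingQueueFinder {
  private DanglingQueueFinder() {
  }

  /** Returns the parent path of a queue path, or null for the root. */
  static String parentPath(String queuePath) {
    int idx = queuePath.lastIndexOf('.');
    return idx < 0 ? null : queuePath.substring(0, idx);
  }

  /** Returns the dynamic queues whose parent is missing from the configuration. */
  static List<String> findDangling(Collection<String> dynamicQueues,
      Set<String> configuredQueues) {
    List<String> dangling = new ArrayList<>();
    for (String queue : dynamicQueues) {
      String parent = parentPath(queue);
      if (parent != null && !configuredQueues.contains(parent)) {
        dangling.add(queue);
      }
    }
    return dangling;
  }
}
```

For the steps in the description, root.a.a-auto would be reported once root.a has been removed from the configuration, so it could be removed before any application is submitted to it.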






[jira] [Updated] (YARN-10714) Remove dangling dynamic queues on reinitialization

2021-04-06 Thread Andras Gyori (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Gyori updated YARN-10714:

Attachment: YARN-10714.003.patch

> Remove dangling dynamic queues on reinitialization
> --
>
> Key: YARN-10714
> URL: https://issues.apache.org/jira/browse/YARN-10714
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Andras Gyori
>Assignee: Andras Gyori
>Priority: Major
> Attachments: YARN-10714.001.patch, YARN-10714.002.patch, 
> YARN-10714.003.patch
>
>
> The current logic does not handle orphaned auto-created child queues. The 
> following steps show a scenario in which it is possible to submit 
> applications to an orphaned queue that has an invalid (already removed) 
> ParentQueue.
>  # Auto-create a queue root.a.a-auto
>  # Remove root.a from the config
>  # Reinitialize CS without restarting it (possible via mutation API)
>  # Submit an application to root.a.a-auto, while root.a is a non-existent queue






[jira] [Commented] (YARN-10702) Add cluster metric for amount of CPU used by RM Event Processor

2021-04-06 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315597#comment-17315597
 ] 

Jim Brennan commented on YARN-10702:


The failed unit test for branch-3.3 is unrelated.  Looks like it was fixed in 
[YARN-10337], which was only committed to trunk.


> Add cluster metric for amount of CPU used by RM Event Processor
> ---
>
> Key: YARN-10702
> URL: https://issues.apache.org/jira/browse/YARN-10702
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 2.10.1, 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: Scheduler-Busy.png, YARN-10702-branch-3.3.006.patch, 
> YARN-10702.001.patch, YARN-10702.002.patch, YARN-10702.003.patch, 
> YARN-10702.004.patch, YARN-10702.005.patch, YARN-10702.006.patch, 
> simon-scheduler-busy.png
>
>
> Add a cluster metric to track the CPU usage of the ResourceManager Event 
> Processing thread. This lets us know when the critical path of the RM is 
> running out of headroom.
> This feature was originally added for us internally by [~nroberts] and we've 
> been running with it on production clusters for nearly four years.






[jira] [Commented] (YARN-10503) Support queue capacity in terms of absolute resources with custom resourceType.

2021-04-06 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315488#comment-17315488
 ] 

Hadoop QA commented on YARN-10503:
--

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 23s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests || ||
| 0 | mvndep | 1m 46s | | Maven dependency ordering for branch |
| +1 | mvninstall | 22m 32s | | trunk passed |
| +1 | compile | 9m 57s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | compile | 8m 19s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | checkstyle | 1m 40s | | trunk passed |
| +1 | mvnsite | 1m 50s | | trunk passed |
| +1 | shadedclient | 19m 7s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 30s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javadoc | 1m 23s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| 0 | spotbugs | 26m 0s | | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 4m 2s | | trunk passed |
|| || || || Patch Compile Tests || ||
| 0 | mvndep | 0m 21s | | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 25s | | the patch passed |
| +1 | compile | 9m 8s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javac | 9m 8s | | the patch passed |
| +1 | compile | 8m 14s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | javac | 8m 14s | | the patch passed |
| +1 | checkstyle | 1m 35s | | the patch passed |
| +1 | mvnsite | 1m 46s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | xml | 0m 2s | | The patch has no ill-formed XML file. |
| +1 | shadedclient 

[jira] [Comment Edited] (YARN-10723) Change CS nodes page in UI to support custom resource.

2021-04-06 Thread Qi Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315382#comment-17315382
 ] 

Qi Zhu edited comment on YARN-10723 at 4/6/21, 9:22 AM:


Thanks [~gandras] for the very good suggestions.

I have updated it in the latest patch.

I only have the original GPU page, so I will update the screenshot. It looks 
the same with the patch applied.

 

!image-2021-04-06-17-22-32-733.png|width=613,height=69!

 


was (Author: zhuqi):
Thanks [~gandras] for the very good suggestions.

I have updated it in the latest patch.

I only have the original GPU page, so I will update the screenshot. It looks 
the same with the patch applied.

 

> Change CS nodes page in UI to support custom resource.
> --
>
> Key: YARN-10723
> URL: https://issues.apache.org/jira/browse/YARN-10723
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: YARN-10723.001.patch, YARN-10723.002.patch, 
> YARN-10723.003.patch, YARN-10723.004.patch, image-2021-04-06-17-22-32-733.png
>
>
> The nodes page currently supports only GPU as a custom resource.
> We should make it support all custom resources.






[jira] [Commented] (YARN-10723) Change CS nodes page in UI to support custom resource.

2021-04-06 Thread Qi Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315382#comment-17315382
 ] 

Qi Zhu commented on YARN-10723:
---

Thanks [~gandras] for the very good suggestions.

I have updated it in the latest patch.

I only have the original GPU page, so I will update the screenshot. It looks 
the same with the patch applied.

 

> Change CS nodes page in UI to support custom resource.
> --
>
> Key: YARN-10723
> URL: https://issues.apache.org/jira/browse/YARN-10723
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: YARN-10723.001.patch, YARN-10723.002.patch, 
> YARN-10723.003.patch, YARN-10723.004.patch
>
>
> The nodes page currently supports only GPU as a custom resource.
> We should make it support all custom resources.






[jira] [Updated] (YARN-10723) Change CS nodes page in UI to support custom resource.

2021-04-06 Thread Qi Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qi Zhu updated YARN-10723:
--
Attachment: YARN-10723.004.patch

> Change CS nodes page in UI to support custom resource.
> --
>
> Key: YARN-10723
> URL: https://issues.apache.org/jira/browse/YARN-10723
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: YARN-10723.001.patch, YARN-10723.002.patch, 
> YARN-10723.003.patch, YARN-10723.004.patch
>
>
> The nodes page currently supports only GPU as a custom resource.
> We should make it support all custom resources.






[jira] [Commented] (YARN-10723) Change CS nodes page in UI to support custom resource.

2021-04-06 Thread Andras Gyori (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315369#comment-17315369
 ] 

Andras Gyori commented on YARN-10723:
-

Thank you [~zhuqi] for working on this issue. The patch looks good to me. Can 
you provide a screenshot of a custom resource on UI1? I have some minor comments:
 * 248,250: String.valueOf is unnecessary.
 * nodesPage#L116 has a null check, but it is unnecessary. If this Map could 
contain a null key, that is already troublesome; if it cannot, the check is 
redundant. In either case, you call integerEntry.getKey().equals before the 
null check, which would throw an NPE if there were a null key.
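
To illustrate the ordering problem in the second comment, here is a small self-contained sketch. It does not reproduce the code at nodesPage#L116; the class NullKeyCheckOrder, its method names, and the String/Integer entry types are assumptions made for the example.

```java
import java.util.Map;
import java.util.Objects;

/** Hypothetical example of a null check placed after a dereference. */
final class NullKeyCheckOrder {
  private NullKeyCheckOrder() {
  }

  /** equals() is called on the key first, so the later null check can never help. */
  static boolean matchesUnsafe(Map.Entry<String, Integer> entry, String wanted) {
    return entry.getKey().equals(wanted) && entry.getKey() != null;
  }

  /** Null-safe comparison; no separate null check is needed. */
  static boolean matchesSafe(Map.Entry<String, Integer> entry, String wanted) {
    return Objects.equals(entry.getKey(), wanted);
  }
}
```

With a null key, matchesUnsafe throws a NullPointerException before the null check is evaluated, while matchesSafe simply returns false.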

> Change CS nodes page in UI to support custom resource.
> --
>
> Key: YARN-10723
> URL: https://issues.apache.org/jira/browse/YARN-10723
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: YARN-10723.001.patch, YARN-10723.002.patch, 
> YARN-10723.003.patch
>
>
> The nodes page currently supports only GPU as a custom resource.
> We should make it support all custom resources.






[jira] [Comment Edited] (YARN-10503) Support queue capacity in terms of absolute resources with custom resourceType.

2021-04-06 Thread Qi Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315349#comment-17315349
 ] 

Qi Zhu edited comment on YARN-10503 at 4/6/21, 8:51 AM:


Thanks [~gandras] for the review and confirmation.

The suggestion is valid; I moved getCustomResourcesStrings to 
ResourceUtils in the latest patch.

[~pbacsko], do you have any other advice?

Thanks.

 


was (Author: zhuqi):
Thanks [~gandras] for the review and confirmation.

The suggestion is valid; I moved getCustomResourcesStrings to 
ResourceUtils in the latest patch.

 

> Support queue capacity in terms of absolute resources with custom 
> resourceType.
> ---
>
> Key: YARN-10503
> URL: https://issues.apache.org/jira/browse/YARN-10503
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Critical
> Attachments: YARN-10503.001.patch, YARN-10503.002.patch, 
> YARN-10503.003.patch, YARN-10503.004.patch, YARN-10503.005.patch, 
> YARN-10503.006.patch, YARN-10503.007.patch, YARN-10503.008.patch, 
> YARN-10503.009.patch, YARN-10503.010.patch
>
>
> Now the absolute resources are memory and cores.
> {code:java}
> /**
>  * Different resource types supported.
>  */
> public enum AbsoluteResourceType {
>   MEMORY, VCORES;
> }{code}
> But in our GPU production clusters, we need to support more resource types.
> This is very important for cluster scaling when there are absolute demands 
> for different resource types.
>  
> This Jira will handle GPU first.






[jira] [Comment Edited] (YARN-10657) We should make max application per queue to support node label.

2021-04-06 Thread Qi Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315353#comment-17315353
 ] 

Qi Zhu edited comment on YARN-10657 at 4/6/21, 8:49 AM:


Thanks [~gandras] for the review.

I agree that the current approach is not true node label support. However, the 
original code only uses the maxApplication value of the last node label, which 
may be very small and can have a big side effect for the user, so for now I 
just use the max across all labels as an admittedly ugly workaround.

I also agree that implementing true node label support would involve too much 
work for a feature that is not a high priority item.

[~pbacsko] [~epayne] 

What are your opinions?

Thanks.


was (Author: zhuqi):
Thanks [~gandras] for the review.

I agree that the current approach is not true node label support. However, the 
original code only uses the maxApplication value of the last node label, which 
may be very small and can have a big side effect for the user, so for now I 
just use the max across all labels as an admittedly ugly workaround.

I agree that implementing true node label support would involve too much work 
for a feature that is not a high priority item.

[~pbacsko] [~epayne] 

What are your opinions?

Thanks.

> We should make max application per queue to support node label.
> ---
>
> Key: YARN-10657
> URL: https://issues.apache.org/jira/browse/YARN-10657
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: YARN-10657.001.patch, YARN-10657.002.patch
>
>
> https://issues.apache.org/jira/browse/YARN-10641?focusedCommentId=17291708=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17291708
> As we discussed in the comment above:
> We should dig deeper into the label-related max applications per queue.
> I think when node labels are enabled in a queue, max applications should 
> consider the max capacity of all labels.






[jira] [Commented] (YARN-10657) We should make max application per queue to support node label.

2021-04-06 Thread Qi Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315353#comment-17315353
 ] 

Qi Zhu commented on YARN-10657:
---

Thanks [~gandras] for the review.

I agree that the current approach is not true node label support. However, the 
original code only uses the maxApplication value of the last node label, which 
may be very small and can have a big side effect for the user, so for now I 
just use the max across all labels as an admittedly ugly workaround.

I agree that implementing true node label support would involve too much work 
for a feature that is not a high priority item.

[~pbacsko] [~epayne] 

What are your opinions?

Thanks.

> We should make max application per queue to support node label.
> ---
>
> Key: YARN-10657
> URL: https://issues.apache.org/jira/browse/YARN-10657
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: YARN-10657.001.patch, YARN-10657.002.patch
>
>
> https://issues.apache.org/jira/browse/YARN-10641?focusedCommentId=17291708=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17291708
> As we discussed in the comment above:
> We should dig deeper into the label-related max applications per queue.
> I think when node labels are enabled in a queue, max applications should 
> consider the max capacity of all labels.






[jira] [Commented] (YARN-10503) Support queue capacity in terms of absolute resources with custom resourceType.

2021-04-06 Thread Qi Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315349#comment-17315349
 ] 

Qi Zhu commented on YARN-10503:
---

Thanks [~gandras] for the review and confirmation.

The suggestion is valid; I moved getCustomResourcesStrings to 
ResourceUtils in the latest patch.

 

> Support queue capacity in terms of absolute resources with custom 
> resourceType.
> ---
>
> Key: YARN-10503
> URL: https://issues.apache.org/jira/browse/YARN-10503
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Critical
> Attachments: YARN-10503.001.patch, YARN-10503.002.patch, 
> YARN-10503.003.patch, YARN-10503.004.patch, YARN-10503.005.patch, 
> YARN-10503.006.patch, YARN-10503.007.patch, YARN-10503.008.patch, 
> YARN-10503.009.patch, YARN-10503.010.patch
>
>
> Now the absolute resources are memory and cores.
> {code:java}
> /**
>  * Different resource types supported.
>  */
> public enum AbsoluteResourceType {
>   MEMORY, VCORES;
> }{code}
> But in our GPU production clusters, we need to support more resource types.
> This is very important for cluster scaling when there are absolute demands 
> for different resource types.
>  
> This Jira will handle GPU first.






[jira] [Updated] (YARN-10503) Support queue capacity in terms of absolute resources with custom resourceType.

2021-04-06 Thread Qi Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qi Zhu updated YARN-10503:
--
Attachment: YARN-10503.010.patch

> Support queue capacity in terms of absolute resources with custom 
> resourceType.
> ---
>
> Key: YARN-10503
> URL: https://issues.apache.org/jira/browse/YARN-10503
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Critical
> Attachments: YARN-10503.001.patch, YARN-10503.002.patch, 
> YARN-10503.003.patch, YARN-10503.004.patch, YARN-10503.005.patch, 
> YARN-10503.006.patch, YARN-10503.007.patch, YARN-10503.008.patch, 
> YARN-10503.009.patch, YARN-10503.010.patch
>
>
> Now the absolute resources are memory and cores.
> {code:java}
> /**
>  * Different resource types supported.
>  */
> public enum AbsoluteResourceType {
>   MEMORY, VCORES;
> }{code}
> But in our GPU production clusters, we need to support more resource types.
> This is very important for cluster scaling when there are absolute demands 
> for different resource types.
>  
> This Jira will handle GPU first.






[jira] [Commented] (YARN-10657) We should make max application per queue to support node label.

2021-04-06 Thread Andras Gyori (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315348#comment-17315348
 ] 

Andras Gyori commented on YARN-10657:
-

Thank you [~zhuqi] for working on this issue. After analysing the affected code 
I have the following concerns:
 * The current approach is not true node label support; it still uses a 
single maxApplication value for all nodeLabels. I am not sure the current 
approach would make anything better, and it could easily create confusion.
 * To support true nodeLabels, one would need to store a 
maxApplication value for each nodeLabel (e.g. in a Map). However, as I see it, 
on application submission, where this maxApplication is retrieved and used as 
some sort of validation in the queue, we do not have a reference to the node 
label. I think it would involve too much work for a feature that is not a high 
priority item.
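
For reference, the per-label storage mentioned in the second point could be sketched as below. This only illustrates the data structure: PerLabelMaxApplications, its methods, and the fallback to a default value are hypothetical, and the sketch does not address the harder problem noted above, namely that the node label is not available at application submission.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder of one maxApplication value per node label. */
final class PerLabelMaxApplications {
  private final Map<String, Integer> maxAppsByLabel = new HashMap<>();
  private final int defaultMaxApps;

  PerLabelMaxApplications(int defaultMaxApps) {
    this.defaultMaxApps = defaultMaxApps;
  }

  void setMaxApplications(String nodeLabel, int maxApps) {
    maxAppsByLabel.put(nodeLabel, maxApps);
  }

  /** Returns the limit for the label, or the default if none was stored (an assumption). */
  int getMaxApplications(String nodeLabel) {
    return maxAppsByLabel.getOrDefault(nodeLabel, defaultMaxApps);
  }
}
```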

> We should make max application per queue to support node label.
> ---
>
> Key: YARN-10657
> URL: https://issues.apache.org/jira/browse/YARN-10657
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: YARN-10657.001.patch, YARN-10657.002.patch
>
>
> https://issues.apache.org/jira/browse/YARN-10641?focusedCommentId=17291708=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17291708
> As we discussed in the comment above:
> We should dig deeper into the label-related max applications per queue.
> I think when node labels are enabled in a queue, max applications should 
> consider the max capacity of all labels.






[jira] [Commented] (YARN-10503) Support queue capacity in terms of absolute resources with custom resourceType.

2021-04-06 Thread Andras Gyori (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17315321#comment-17315321
 ] 

Andras Gyori commented on YARN-10503:
-

Thank you [~zhuqi] for the patch. I think I understand the points made by 
[~ebadger], but I see that the repetition was necessary, because 
AbsoluteResourceType.valueOf would throw an exception if encountering an 
unknown secondary resource. I thought about a solution, but I do not think it 
is worth it.

My only suggestion would be to move getCustomResourcesStrings to the 
ResourceUtils instead. Apart from this, it looks good to me.
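
The valueOf behaviour referred to above is standard Java: Enum.valueOf throws IllegalArgumentException for a name that is not an enum constant. The sketch below demonstrates it with the two-constant enum quoted in the issue description. The wrapper class ResourceTypeLookup and the tryParse helper are hypothetical and are not the approach taken in the patch.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical wrapper used only to demonstrate Enum.valueOf behaviour. */
final class ResourceTypeLookup {
  /** Same constants as the enum quoted in the issue description. */
  enum AbsoluteResourceType {
    MEMORY, VCORES
  }

  private ResourceTypeLookup() {
  }

  /** Returns the matching constant, or empty for any other (e.g. custom) resource name. */
  static Optional<AbsoluteResourceType> tryParse(String name) {
    try {
      return Optional.of(
          AbsoluteResourceType.valueOf(name.toUpperCase(Locale.ROOT)));
    } catch (IllegalArgumentException e) {
      // valueOf rejects every name that is not MEMORY or VCORES.
      return Optional.empty();
    }
  }
}
```

Any custom resource name therefore has to be handled outside the enum lookup, which is consistent with the repetition in the patch being necessary.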

 

> Support queue capacity in terms of absolute resources with custom 
> resourceType.
> ---
>
> Key: YARN-10503
> URL: https://issues.apache.org/jira/browse/YARN-10503
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Critical
> Attachments: YARN-10503.001.patch, YARN-10503.002.patch, 
> YARN-10503.003.patch, YARN-10503.004.patch, YARN-10503.005.patch, 
> YARN-10503.006.patch, YARN-10503.007.patch, YARN-10503.008.patch, 
> YARN-10503.009.patch
>
>
> Now the absolute resources are memory and cores.
> {code:java}
> /**
>  * Different resource types supported.
>  */
> public enum AbsoluteResourceType {
>   MEMORY, VCORES;
> }{code}
> But in our GPU production clusters, we need to support more resource types.
> This is very important for cluster scaling when there are absolute demands 
> for different resource types.
>  
> This Jira will handle GPU first.


