[jira] [Commented] (YARN-10704) The CS effective capacity for absolute mode in UI should support GPU and other custom resources.

2021-03-19 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305325#comment-17305325
 ] 

Hadoop QA commented on YARN-10704:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
36s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to 
include any new or modified tests. Please justify why no new tests are needed 
for this patch. Also please list what manual steps were performed to verify 
this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
42s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 52s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 20m  
4s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  1m 
55s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
50s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 39s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/828/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt{color}
 | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 1 new + 82 unchanged - 0 fixed = 83 total (was 82) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 53s{color} |

[jira] [Commented] (YARN-10704) The CS effective capacity for absolute mode in UI should support GPU and other custom resources.

2021-03-19 Thread Qi Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305298#comment-17305298
 ] 

Qi Zhu commented on YARN-10704:
---

Fixed the checkstyle issue and the test in the latest patch. :D

> The CS effective capacity for absolute mode in UI should support GPU and 
> other custom resources.
> 
>
> Key: YARN-10704
> URL: https://issues.apache.org/jira/browse/YARN-10704
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: YARN-10704.001.patch, YARN-10704.002.patch, 
> image-2021-03-19-12-05-28-412.png, image-2021-03-19-12-08-35-273.png
>
>
> Currently there is no information about the effective GPU capacity in the 
> UI for absolute resource mode.
> !image-2021-03-19-12-05-28-412.png|width=873,height=136!
> But we do have this information in QueueMetrics:
> !image-2021-03-19-12-08-35-273.png|width=613,height=268!
>  
> This is very important for our GPU users running in absolute mode; today 
> there is no way to see the absolute GPU information in the CS Queue UI.
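For context, a minimal sketch (not the YARN-10704 patch; the class and method names below follow the CapacityScheduler APIs as I recall them, so treat the exact signatures as assumptions) of where the effective resources, including custom types such as yarn.io/gpu, are available on a queue:

{code:java}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceInformation;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CSQueue;

public final class EffectiveGpuCapacitySketch {
  // Returns the queue's effective minimum GPU guarantee. Assumes the
  // yarn.io/gpu resource type is registered on this cluster; otherwise
  // getResourceInformation() will fail.
  static long effectiveMinGpus(CSQueue queue) {
    Resource effectiveMin =
        queue.getQueueResourceQuotas().getEffectiveMinResource();
    ResourceInformation gpu =
        effectiveMin.getResourceInformation(ResourceInformation.GPU_URI);
    return gpu.getValue();
  }
}
{code}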






[jira] [Updated] (YARN-10704) The CS effective capacity for absolute mode in UI should support GPU and other custom resources.

2021-03-19 Thread Qi Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qi Zhu updated YARN-10704:
--
Attachment: YARN-10704.002.patch

> The CS effective capacity for absolute mode in UI should support GPU and 
> other custom resources.
> 
>
> Key: YARN-10704
> URL: https://issues.apache.org/jira/browse/YARN-10704
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: YARN-10704.001.patch, YARN-10704.002.patch, 
> image-2021-03-19-12-05-28-412.png, image-2021-03-19-12-08-35-273.png
>
>
> Currently there is no information about the effective GPU capacity in the 
> UI for absolute resource mode.
> !image-2021-03-19-12-05-28-412.png|width=873,height=136!
> But we do have this information in QueueMetrics:
> !image-2021-03-19-12-08-35-273.png|width=613,height=268!
>  
> This is very important for our GPU users running in absolute mode; today 
> there is no way to see the absolute GPU information in the CS Queue UI.






[jira] [Commented] (YARN-10503) Support queue capacity in terms of absolute resources with gpu resourceType.

2021-03-19 Thread Qi Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305290#comment-17305290
 ] 

Qi Zhu commented on YARN-10503:
---

Thanks [~epayne] for the reply.

I also think YARN-9936 goes beyond this requirement. I will try to extend this 
Jira to enable absolute queue resource configuration in a general way for 
custom resources. :D

> Support queue capacity in terms of absolute resources with gpu resourceType.
> 
>
> Key: YARN-10503
> URL: https://issues.apache.org/jira/browse/YARN-10503
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Critical
> Attachments: YARN-10503.001.patch, YARN-10503.002.patch
>
>
> Currently, the absolute resources are only memory and vcores:
> {code:java}
> /**
>  * Different resource types supported.
>  */
> public enum AbsoluteResourceType {
>   MEMORY, VCORES;
> }{code}
> But in our GPU production clusters, we need to support more resourceTypes.
> This is very important for cluster scaling when queues have absolute demands 
> for different resourceTypes.
>  
> This Jira will handle GPU first.






[jira] [Commented] (YARN-10493) RunC container repository v2

2021-03-19 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305211#comment-17305211
 ] 

Hadoop QA commented on YARN-10493:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m  
1s{color} |  | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} |  | {color:green} No case conflicting files found. {color} |
| {color:blue}0{color} | {color:blue} codespell {color} | {color:blue}  0m  
1s{color} |  | {color:blue} codespell was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} |  | {color:green} The patch does not contain any @author tags. 
{color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} |  | {color:green} The patch appears to include 2 new or modified 
test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 14m  
5s{color} |  | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 
10s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
40s{color} |  | {color:green} trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m  
7s{color} |  | {color:green} trunk passed with JDK Private 
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 2s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
53s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
26s{color} |  | {color:green} trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
34s{color} |  | {color:green} trunk passed with JDK Private 
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  6m 
43s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
19m 32s{color} |  | {color:green} branch has no errors when building and 
testing our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} |  | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
25s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
37s{color} |  | {color:green} the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 11m 
37s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
17s{color} |  | {color:green} the patch passed with JDK Private 
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 10m 
17s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} blanks {color} | {color:green}  0m  
0s{color} |  | {color:green} The patch has no blanks issues. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 52s{color} | 
[/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2789/2/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt]
 | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 
207 unchanged - 0 fixed = 209 total (was 207) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
32s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} |  | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
11s{color} |  | {color:green} the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
32s{color} |  | {color:

[jira] [Commented] (YARN-6538) Inter Queue preemption is not happening when DRF is configured

2021-03-19 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305144#comment-17305144
 ] 

Eric Payne commented on YARN-6538:
--

[~novaboy], please provide a specific use case to reproduce this issue. For 
example, include the cluster size and the applicable queue configuration 
parameters: number of queues, queue capacities, queue max capacities, queue 
user limit factors, queue minimum user limit percents, queue ordering 
policies, preemption parameters for each queue, etc.

> Inter Queue preemption is not happening when DRF is configured
> --
>
> Key: YARN-6538
> URL: https://issues.apache.org/jira/browse/YARN-6538
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, scheduler preemption
>Affects Versions: 2.8.0
>Reporter: Sunil G
>Assignee: Sunil G
>Priority: Major
>
> Consider a cluster capacity where memory is plentiful and vcores are scarce. 
> If applications have more demand, vcores might be exhausted. Inter-queue 
> preemption should ideally kick in once vcores are over-utilized; however, 
> preemption is not happening.
> Analysis:
> In {{AbstractPreemptableResourceCalculator.computeFixpointAllocation}}, 
> {code}
> // assign all cluster resources until no more demand, or no resources are
> // left
> while (!orderedByNeed.isEmpty() && Resources.greaterThan(rc, totGuarant,
> unassigned, Resources.none())) {
> {code}
> the loop continues even when available vcores are 0 (because memory is still 
> positive). Hence idealAssigned ends up with more vcores than the cluster can 
> actually provide, which causes the no-preemption cases.
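To make the analysis above concrete, here is a tiny illustration (an assumption-level sketch, not a test from any patch) of why that loop condition stays true once vcores run out: with the DominantResourceCalculator, a resource that still has memory left compares greater than Resources.none() even when its vcores are 0.

{code:java}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.DominantResourceCalculator;
import org.apache.hadoop.yarn.util.resource.ResourceCalculator;
import org.apache.hadoop.yarn.util.resource.Resources;

public class DrfLoopConditionSketch {
  public static void main(String[] args) {
    ResourceCalculator rc = new DominantResourceCalculator();
    Resource totGuarant = Resource.newInstance(64 * 1024, 32); // total guarantee
    Resource unassigned = Resource.newInstance(8 * 1024, 0);   // memory left, no vcores

    // Prints "true": the computeFixpointAllocation loop condition is still
    // satisfied, so idealAssigned keeps growing although no vcores remain.
    System.out.println(
        Resources.greaterThan(rc, totGuarant, unassigned, Resources.none()));
  }
}
{code}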






[jira] [Commented] (YARN-10702) Add cluster metric for amount of CPU used by RM Event Processor

2021-03-19 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305110#comment-17305110
 ] 

Jim Brennan commented on YARN-10702:


Thanks for the suggestions [~gandras]!  I agree this should be configurable.  I 
will put up a new patch with those changes.

I don't think the new thread has a significant impact. I wasn't trying to 
measure that, but when I was recently looking at an RM whose dispatcher thread 
was very busy, the monitoring thread did not appear to be a significant 
factor; IIRC it was popping up as using less than 10% of a single CPU for 
brief periods of time. I'll have to take a closer look, but I think making the 
sampling rate configurable is a good idea.
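For illustration, here is a sketch (not the YARN-10702 patch; the class name and the percent-of-one-core conversion are my own assumptions) of sampling a single thread's CPU time at a configurable interval with the standard java.lang.management ThreadMXBean API:

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class EventThreadCpuSampler implements Runnable {
  private final long targetThreadId;    // e.g. the RM event dispatcher thread
  private final long sampleIntervalMs;  // the value proposed to be configurable
  private volatile double lastCpuPercent;

  public EventThreadCpuSampler(long targetThreadId, long sampleIntervalMs) {
    this.targetThreadId = targetThreadId;
    this.sampleIntervalMs = sampleIntervalMs;
  }

  @Override
  public void run() {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    long lastCpuNanos = mx.getThreadCpuTime(targetThreadId);
    long lastWallNanos = System.nanoTime();
    while (!Thread.currentThread().isInterrupted()) {
      try {
        Thread.sleep(sampleIntervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      long cpuNanos = mx.getThreadCpuTime(targetThreadId); // -1 if unsupported
      long wallNanos = System.nanoTime();
      if (cpuNanos >= 0 && lastCpuNanos >= 0) {
        // CPU time consumed per unit of wall time, as a percent of one core.
        lastCpuPercent =
            100.0 * (cpuNanos - lastCpuNanos) / (wallNanos - lastWallNanos);
      }
      lastCpuNanos = cpuNanos;
      lastWallNanos = wallNanos;
    }
  }

  public double getLastCpuPercent() {
    return lastCpuPercent;
  }
}
{code}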

> Add cluster metric for amount of CPU used by RM Event Processor
> ---
>
> Key: YARN-10702
> URL: https://issues.apache.org/jira/browse/YARN-10702
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 2.10.1, 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: Scheduler-Busy.png, YARN-10702.001.patch, 
> YARN-10702.002.patch, YARN-10702.003.patch, YARN-10702.004.patch, 
> simon-scheduler-busy.png
>
>
> Add a cluster metric to track the CPU usage of the ResourceManager Event 
> Processing thread. This lets us know when the critical path of the RM is 
> running out of headroom.
> This feature was originally added for us internally by [~nroberts] and we've 
> been running with it on production clusters for nearly four years.






[jira] [Commented] (YARN-10493) RunC container repository v2

2021-03-19 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305091#comment-17305091
 ] 

Hadoop QA commented on YARN-10493:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
55s{color} |  | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} |  | {color:green} No case conflicting files found. {color} |
| {color:blue}0{color} | {color:blue} codespell {color} | {color:blue}  0m  
1s{color} |  | {color:blue} codespell was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} |  | {color:green} The patch does not contain any @author tags. 
{color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} |  | {color:green} The patch appears to include 2 new or modified 
test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 14m 
23s{color} |  | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
11s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m  
7s{color} |  | {color:green} trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
53s{color} |  | {color:green} trunk passed with JDK Private 
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
44s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
54s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
39s{color} |  | {color:green} trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} |  | {color:green} trunk passed with JDK Private 
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  3m 
35s{color} |  | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 50s{color} |  | {color:green} branch has no errors when building and 
testing our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} |  | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
14s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
28s{color} |  | {color:green} the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
28s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
46s{color} |  | {color:green} the patch passed with JDK Private 
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
46s{color} |  | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} blanks {color} | {color:green}  0m  
0s{color} |  | {color:green} The patch has no blanks issues. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 39s{color} | 
[/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2789/1/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt]
 | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 7 new + 
207 unchanged - 0 fixed = 214 total (was 207) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
44s{color} |  | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
44s{color} | 
[/results-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdkUbuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04.txt|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2789/1/artifact/out/results-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdkUbuntu-11.0

[jira] [Commented] (YARN-6538) Inter Queue preemption is not happening when DRF is configured

2021-03-19 Thread Michael Zeoli (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305084#comment-17305084
 ] 

Michael Zeoli commented on YARN-6538:
-

As we transition from Fair Scheduler to Capacity Scheduler, we're running into 
what we believe is this same issue. We typically assign 1 core to our 
executors, as our work is typically memory-bound and multiple cores per 
container offer no performance increase. Under Fair Scheduler, preemption 
worked well for us. Under Capacity Scheduler, we see situations where jobs are 
starved for AMs and/or executors when they should otherwise receive their 
minimum guaranteed capacity via preempted resources from jobs in other queues.

While our configuration may be uncommon, it's certainly a valid use case in the 
grand scheme of YARN and Spark, and this bug seems to create significant issues 
where they did not exist before (in Fair).

 

 

> Inter Queue preemption is not happening when DRF is configured
> --
>
> Key: YARN-6538
> URL: https://issues.apache.org/jira/browse/YARN-6538
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler, scheduler preemption
>Affects Versions: 2.8.0
>Reporter: Sunil G
>Assignee: Sunil G
>Priority: Major
>
> Consider a cluster capacity where memory is plentiful and vcores are scarce. 
> If applications have more demand, vcores might be exhausted. Inter-queue 
> preemption should ideally kick in once vcores are over-utilized; however, 
> preemption is not happening.
> Analysis:
> In {{AbstractPreemptableResourceCalculator.computeFixpointAllocation}}, 
> {code}
> // assign all cluster resources until no more demand, or no resources are
> // left
> while (!orderedByNeed.isEmpty() && Resources.greaterThan(rc, totGuarant,
> unassigned, Resources.none())) {
> {code}
> the loop continues even when available vcores are 0 (because memory is still 
> positive). Hence idealAssigned ends up with more vcores than the cluster can 
> actually provide, which causes the no-preemption cases.






[jira] [Comment Edited] (YARN-10702) Add cluster metric for amount of CPU used by RM Event Processor

2021-03-19 Thread Andras Gyori (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305033#comment-17305033
 ] 

Andras Gyori edited comment on YARN-10702 at 3/19/21, 5:10 PM:
---

Thank you [~Jim_Brennan] for the contribution! The logic is clearly sound, as 
you have tested it thoroughly on a live cluster, so I have nothing to add to 
that part. However, wouldn't it be worthwhile to make this optional? Did the 
background thread make any noticeable difference in terms of resource usage? 
As I see it, this thread runs approximately every second. Making the sampling 
rate configurable might be useful.


was (Author: gandras):
Thank you [~Jim_Brennan] for the contribution! The logic is obviously good as 
you have tested it thoroughly on a live cluster, therefore I have no addition 
to that part. However, would not it be worthwhile to make this optional? Did 
the background thread make any noticeable difference in terms of resource 
usage? As I see it, this thread is running approximately every second, which 
could be configured as well.

> Add cluster metric for amount of CPU used by RM Event Processor
> ---
>
> Key: YARN-10702
> URL: https://issues.apache.org/jira/browse/YARN-10702
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 2.10.1, 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: Scheduler-Busy.png, YARN-10702.001.patch, 
> YARN-10702.002.patch, YARN-10702.003.patch, YARN-10702.004.patch, 
> simon-scheduler-busy.png
>
>
> Add a cluster metric to track the CPU usage of the ResourceManager Event 
> Processing thread. This lets us know when the critical path of the RM is 
> running out of headroom.
> This feature was originally added for us internally by [~nroberts] and we've 
> been running with it on production clusters for nearly four years.






[jira] [Commented] (YARN-10702) Add cluster metric for amount of CPU used by RM Event Processor

2021-03-19 Thread Andras Gyori (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305033#comment-17305033
 ] 

Andras Gyori commented on YARN-10702:
-

Thank you [~Jim_Brennan] for the contribution! The logic is clearly sound, as 
you have tested it thoroughly on a live cluster, so I have nothing to add to 
that part. However, wouldn't it be worthwhile to make this optional? Did the 
background thread make any noticeable difference in terms of resource usage? 
As I see it, this thread runs approximately every second, which could be made 
configurable as well.

> Add cluster metric for amount of CPU used by RM Event Processor
> ---
>
> Key: YARN-10702
> URL: https://issues.apache.org/jira/browse/YARN-10702
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 2.10.1, 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Minor
> Attachments: Scheduler-Busy.png, YARN-10702.001.patch, 
> YARN-10702.002.patch, YARN-10702.003.patch, YARN-10702.004.patch, 
> simon-scheduler-busy.png
>
>
> Add a cluster metric to track the CPU usage of the ResourceManager Event 
> Processing thread. This lets us know when the critical path of the RM is 
> running out of headroom.
> This feature was originally added for us internally by [~nroberts] and we've 
> been running with it on production clusters for nearly four years.






[jira] [Commented] (YARN-10597) CSMappingPlacementRule should not create new instance of Groups

2021-03-19 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17305024#comment-17305024
 ] 

Ahmed Hussein commented on YARN-10597:
--

That's interesting. I ran the unit tests in YARN-10425 from IntelliJ and they 
all passed.
Just a quick question: in {{CSMappingPlacementRule.java}}, aren't we supposed 
to pass the configuration object to {{Groups.getUserToGroupsMappingService}}? 
I am considering the case where the singleton has not been initialized yet. In 
that case {{Groups.getUserToGroupsMappingService}} won't pick up the 
{{HADOOP_SECURITY_GROUP_MAPPING}} setting inside {{conf}}:

{code:java}
- groups = Groups.getUserToGroupsMappingService();
+ groups = Groups.getUserToGroupsMappingService(conf);
{code}
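For illustration, a minimal sketch of the concern (this paraphrases the behaviour of org.apache.hadoop.security.Groups from memory, so treat the details as assumptions rather than a verbatim description): the call that first initializes the singleton decides which Configuration is used, and later calls simply return the cached instance.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.Groups;

public class GroupsInitSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // HADOOP_SECURITY_GROUP_MAPPING: the group mapping chosen by the cluster admin.
    conf.set("hadoop.security.group.mapping",
        "org.apache.hadoop.security.LdapGroupsMapping");

    // If the singleton were created by a no-arg call before this point, it would
    // be built from a default Configuration and never see the setting above.
    Groups groups = Groups.getUserToGroupsMappingService(conf);

    // Subsequent calls return the already-initialized singleton either way.
    assert groups == Groups.getUserToGroupsMappingService();
  }
}
{code}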


> CSMappingPlacementRule should not create new instance of Groups
> ---
>
> Key: YARN-10597
> URL: https://issues.apache.org/jira/browse/YARN-10597
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: YARN-10597.001.patch
>
>
> As [~ahussein] pointed out in YARN-10425, no new Groups instance should be 
> created.






[jira] [Updated] (YARN-10705) Misleading DEBUG log for container assignment needs to be removed when the container is actually reserved, not assigned in FairScheduler

2021-03-19 Thread Siddharth Ahuja (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja updated YARN-10705:
---
Affects Version/s: 3.4.0

> Misleading DEBUG log for container assignment needs to be removed when the 
> container is actually reserved, not assigned in FairScheduler
> 
>
> Key: YARN-10705
> URL: https://issues.apache.org/jira/browse/YARN-10705
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 3.4.0
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
>
> The following DEBUG logs are logged if a container reservation is made when 
> a node has been offered to the queue in FairScheduler:
> {code}
> 2021-02-10 07:33:55,049 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
> application_1610442362681_2607's resource request is reserved.
> 2021-02-10 07:33:55,049 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
> Assigned container in queue:root.pj_dc_pe container:
> {code}
> The latter log seems to indicate a container assignment, whereas in fact it 
> is a misleading log that shouldn't have been emitted in the first place.
> This log comes from [1] after an application attempt with an unmet demand is 
> checked for container assignment/reservation.
> If the container for this app attempt is reserved on the node, then 
> FairScheduler.CONTAINER_RESERVED is returned from [2].
> From [3]:
> {quote}
>* If an assignment was made, returns the resources allocated to the
>* container.  If a reservation was made, returns
>* FairScheduler.CONTAINER_RESERVED.  If no assignment or reservation was
>* made, returns an empty resource.
> {quote}
> At [4] we check for the empty resource, but not for 
> FairScheduler.CONTAINER_RESERVED, before logging a container-assignment 
> message, which is incorrect.
> Instead of:
> {code}
>   if (!assigned.equals(none())) {
> LOG.debug("Assigned container in queue:{} container:{}",
> getName(), assigned);
> break;
>   }
> {code}
> it should be:
> {code}
>   // check if an assignment or a reservation was made.
>   if (!assigned.equals(none())) {
> // only log container assignment if there is
> // an actual assignment, not a reservation.
> if (!assigned.equals(FairScheduler.CONTAINER_RESERVED)
> && LOG.isDebugEnabled()) {
>   LOG.debug("Assigned container in queue:" + getName() + " " +
> "container:" + assigned);
> }
> break;
>   }
> {code}
> [1] 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java#L356
> [2] 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java#L911
> [3] 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java#L842
> [4] 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java#L355






[jira] [Updated] (YARN-10705) Misleading DEBUG log for container assignment needs to be removed when the container is actually reserved, not assigned in FairScheduler

2021-03-19 Thread Siddharth Ahuja (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja updated YARN-10705:
---
Component/s: yarn

> Misleading DEBUG log for container assignment needs to be removed when the 
> container is actually reserved, not assigned in FairScheduler
> 
>
> Key: YARN-10705
> URL: https://issues.apache.org/jira/browse/YARN-10705
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
>
> The following DEBUG logs are logged if a container reservation is made when 
> a node has been offered to the queue in FairScheduler:
> {code}
> 2021-02-10 07:33:55,049 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
> application_1610442362681_2607's resource request is reserved.
> 2021-02-10 07:33:55,049 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
> Assigned container in queue:root.pj_dc_pe container:
> {code}
> The latter log seems to indicate a container assignment, whereas in fact it 
> is a misleading log that shouldn't have been emitted in the first place.
> This log comes from [1] after an application attempt with an unmet demand is 
> checked for container assignment/reservation.
> If the container for this app attempt is reserved on the node, then 
> FairScheduler.CONTAINER_RESERVED is returned from [2].
> From [3]:
> {quote}
>* If an assignment was made, returns the resources allocated to the
>* container.  If a reservation was made, returns
>* FairScheduler.CONTAINER_RESERVED.  If no assignment or reservation was
>* made, returns an empty resource.
> {quote}
> At [4] we check for the empty resource, but not for 
> FairScheduler.CONTAINER_RESERVED, before logging a container-assignment 
> message, which is incorrect.
> Instead of:
> {code}
>   if (!assigned.equals(none())) {
> LOG.debug("Assigned container in queue:{} container:{}",
> getName(), assigned);
> break;
>   }
> {code}
> it should be:
> {code}
>   // check if an assignment or a reservation was made.
>   if (!assigned.equals(none())) {
> // only log container assignment if there is
> // an actual assignment, not a reservation.
> if (!assigned.equals(FairScheduler.CONTAINER_RESERVED)
> && LOG.isDebugEnabled()) {
>   LOG.debug("Assigned container in queue:" + getName() + " " +
> "container:" + assigned);
> }
> break;
>   }
> {code}
> [1] 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java#L356
> [2] 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java#L911
> [3] 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java#L842
> [4] 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java#L355






[jira] [Assigned] (YARN-10705) Misleading DEBUG log for container assignment needs to be removed when the container is actually reserved, not assigned in FairScheduler

2021-03-19 Thread Siddharth Ahuja (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Ahuja reassigned YARN-10705:
--

Assignee: Siddharth Ahuja

> Misleading DEBUG log for container assignment needs to be removed when the 
> container is actually reserved, not assigned in FairScheduler
> 
>
> Key: YARN-10705
> URL: https://issues.apache.org/jira/browse/YARN-10705
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Siddharth Ahuja
>Assignee: Siddharth Ahuja
>Priority: Minor
>
> The following DEBUG logs are logged if a container reservation is made when 
> a node has been offered to the queue in FairScheduler:
> {code}
> 2021-02-10 07:33:55,049 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
> application_1610442362681_2607's resource request is reserved.
> 2021-02-10 07:33:55,049 DEBUG 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
> Assigned container in queue:root.pj_dc_pe container:
> {code}
> The latter log seems to indicate a container assignment, whereas in fact it 
> is a misleading log that shouldn't have been emitted in the first place.
> This log comes from [1] after an application attempt with an unmet demand is 
> checked for container assignment/reservation.
> If the container for this app attempt is reserved on the node, then 
> FairScheduler.CONTAINER_RESERVED is returned from [2].
> From [3]:
> {quote}
>* If an assignment was made, returns the resources allocated to the
>* container.  If a reservation was made, returns
>* FairScheduler.CONTAINER_RESERVED.  If no assignment or reservation was
>* made, returns an empty resource.
> {quote}
> At [4] we check for the empty resource, but not for 
> FairScheduler.CONTAINER_RESERVED, before logging a container-assignment 
> message, which is incorrect.
> Instead of:
> {code}
>   if (!assigned.equals(none())) {
> LOG.debug("Assigned container in queue:{} container:{}",
> getName(), assigned);
> break;
>   }
> {code}
> it should be:
> {code}
>   // check if an assignment or a reservation was made.
>   if (!assigned.equals(none())) {
> // only log container assignment if there is
> // an actual assignment, not a reservation.
> if (!assigned.equals(FairScheduler.CONTAINER_RESERVED)
> && LOG.isDebugEnabled()) {
>   LOG.debug("Assigned container in queue:" + getName() + " " +
> "container:" + assigned);
> }
> break;
>   }
> {code}
> [1] 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java#L356
> [2] 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java#L911
> [3] 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java#L842
> [4] 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java#L355






[jira] [Created] (YARN-10705) Misleading DEBUG log for container assignment needs to be removed when the container is actually reserved, not assigned in FairScheduler

2021-03-19 Thread Siddharth Ahuja (Jira)
Siddharth Ahuja created YARN-10705:
--

 Summary: Misleading DEBUG log for container assignment needs to be 
removed when the container is actually reserved, not assigned in FairScheduler
 Key: YARN-10705
 URL: https://issues.apache.org/jira/browse/YARN-10705
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Siddharth Ahuja


The following DEBUG logs are logged if a container reservation is made when a 
node has been offered to the queue in FairScheduler:

{code}
2021-02-10 07:33:55,049 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
application_1610442362681_2607's resource request is reserved.
2021-02-10 07:33:55,049 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
Assigned container in queue:root.pj_dc_pe container:
{code}

The latter log seems to indicate a container assignment, whereas in fact it is 
a misleading log that shouldn't have been emitted in the first place.

This log comes from [1] after an application attempt with an unmet demand is 
checked for container assignment/reservation.

If the container for this app attempt is reserved on the node, then 
FairScheduler.CONTAINER_RESERVED is returned from [2].

From [3]:

{quote}
   * If an assignment was made, returns the resources allocated to the
   * container.  If a reservation was made, returns
   * FairScheduler.CONTAINER_RESERVED.  If no assignment or reservation was
   * made, returns an empty resource.
{quote}

At [4] we check for the empty resource, but not for 
FairScheduler.CONTAINER_RESERVED, before logging a container-assignment 
message, which is incorrect.

Instead of:

{code}
  if (!assigned.equals(none())) {
LOG.debug("Assigned container in queue:{} container:{}",
getName(), assigned);
break;
  }
{code}

it should be:

{code}
  // check if an assignment or a reservation was made.
  if (!assigned.equals(none())) {
// only log container assignment if there is
// an actual assignment, not a reservation.
if (!assigned.equals(FairScheduler.CONTAINER_RESERVED)
&& LOG.isDebugEnabled()) {
  LOG.debug("Assigned container in queue:" + getName() + " " +
"container:" + assigned);
}
break;
  }
{code}

[1] 
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java#L356
[2] 
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java#L911
[3] 
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java#L842
[4] 
https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java#L355






[jira] [Comment Edited] (YARN-9927) RM multi-thread event processing mechanism

2021-03-19 Thread Andras Gyori (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304992#comment-17304992
 ] 

Andras Gyori edited comment on YARN-9927 at 3/19/21, 4:10 PM:
--

Thank you [~hcarrot] for raising this issue and [~zhuqi] for stepping up to 
continue this undertaking.
These are my feedback and suggestions; they might not mirror the actual 
situation, so feel free to correct anything inaccurate. I think the current 
patch is not the best approach to this problem. The points below include the 
concerns already raised in this jira, as well as my own insights.

The single entry point for an event is AsyncDispatcher#handle, which puts the 
event in the eventQueue, and the event is then processed asynchronously in a 
single thread. There is no way this could be circumvented, because it is used 
as rmContext.getDispatcher() all over the place. We must retain this entry 
point.
However, I have a strong sense that the performance bottleneck is actually the 
AsyncDispatcher#eventQueue (a BlockingQueue). In my opinion, the solution is 
exactly the suggestion already described in the documentation of 
AsyncDispatcher:
{code:java}
/**
  Dispatches {@link Event}s in a separate thread. Currently only single thread
  does that. Potentially there could be multiple channels for each event type
  class and a thread pool can be used to dispatch the events.
 */
{code}
My suggestion would be:
 # Store a new BlockingQueue for each event type in a HashMap
 # Create a new thread for each registered event type / eventQueue
 # Make every thread responsible for processing one eventQueue
 # The Dispatcher would map to an N:N:N (EventQueue:Thread:EventHandler) system 
(or an N:M:N one, where M is smaller than N, in order to reduce the number of 
threads), where N is the number of EventTypes registered

A more fine-grained solution is possible by making an M*N:M*N:N 
(EventQueue:Thread:EventHandler) system, where M is a number given on 
registration (how many threads should be processing this kind of event) and N 
is the number of EventTypes registered (as far as I can tell, the 
EventHandlers do not use locks internally and are thread safe). I am not sure 
whether this is feasible, because of the external locks used in EventHandlers 
(e.g. NodeEventHandler uses getRMNodes(), which is guarded by a ConcurrentMap 
-> I think this is the feedback that was given by [~adam.antal] and [~epayne]).

A dummy implementation of the aforementioned system would be:
{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.LinkedBlockingQueue;
import org.apache.hadoop.yarn.event.Event;
import org.apache.hadoop.yarn.event.EventHandler;

public class ThreadedDispatcher {
  private final ConcurrentMap<Class<? extends Enum>, BlockingQueue<Event>> events =
      new ConcurrentHashMap<>();
  private final ConcurrentMap<Class<? extends Enum>, EventHandler<Event>> eventHandlers =
      new ConcurrentHashMap<>();
  private volatile boolean stopped = false;

  public void register(Class<? extends Enum> eventType, EventHandler<Event> handler) {
    eventHandlers.put(eventType, handler);
    events.putIfAbsent(eventType, new LinkedBlockingQueue<>());
    // one consumer thread per registered event type / queue
    new Thread(() -> {
      BlockingQueue<Event> eventQueue = events.get(eventType);
      while (!stopped && !Thread.currentThread().isInterrupted()) {
        try {
          handler.handle(eventQueue.take());
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    }).start();
  }

  // single entry point, same role as AsyncDispatcher's GenericEventHandler
  class GenericEventHandler implements EventHandler<Event> {
    @Override
    public void handle(Event event) {
      events.get(event.getType().getDeclaringClass()).add(event);
    }
  }
}
{code}
This could also be the less disruptive solution: simply change the 
AsyncDispatcher to this ThreadedDispatcher and retain the single entry point 
via GenericEventHandler#handle. Ideally, nothing needs to be changed apart 
from the initialisation of the rmDispatcher.

cc: [~pbacsko]


was (Author: gandras):
Thank you [~hcarrot] for raising this issue and [~zhuqi] for stepping up to 
continue this undertaking.
 These are my feedback and suggestions, which might not mirror the actual 
situation, therefore feel free to discuss the false information. I think the 
current patch is not the best approach of this problem. It does include the 
concerns already raised in this jira, and my insights as well.

The single entry point for an event is AsyncDispatcher#handle, which puts the 
event in the eventQueue, and is processed asynchronously in a single thread. 
There is no way this could be circumvented, because it is used as 
rmContext.getDispatcher() all over the place. We must retain this entry point.
 However I have a strong sense that the performance  bottleneck is actually the 
AsyncDispatcher#eventQueue (a BlockingQueue). In my opinion, the solution is 
exactly the suggestion that is already described in the documentation of 
AsyncDispatcher:
{code:java}
/**
  Dispatches {@link Event}s in a separate thread. Currently only single thread
  does that. Potentially there could be multiple channels for each event type
  class and a thread pool can be used to dispatch the events.
 */
{code}
My suggestion would be:
 # Store a new BlockingQueue for each event type in a HashMap
 # Create a new thread for each of the registered ev

[jira] [Commented] (YARN-9927) RM multi-thread event processing mechanism

2021-03-19 Thread Andras Gyori (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304992#comment-17304992
 ] 

Andras Gyori commented on YARN-9927:


Thank you [~hcarrot] for raising this issue and [~zhuqi] for stepping up to 
continue this undertaking.
These are my feedback and suggestions; they might not mirror the actual 
situation, so feel free to correct anything inaccurate. I think the current 
patch is not the best approach to this problem. The points below include the 
concerns already raised in this jira, as well as my own insights.

The single entry point for an event is AsyncDispatcher#handle, which puts the 
event in the eventQueue, and the event is then processed asynchronously in a 
single thread. There is no way this could be circumvented, because it is used 
as rmContext.getDispatcher() all over the place. We must retain this entry 
point.
However, I have a strong sense that the performance bottleneck is actually the 
AsyncDispatcher#eventQueue (a BlockingQueue). In my opinion, the solution is 
exactly the suggestion already described in the documentation of 
AsyncDispatcher:
{code:java}
/**
  Dispatches {@link Event}s in a separate thread. Currently only single thread
  does that. Potentially there could be multiple channels for each event type
  class and a thread pool can be used to dispatch the events.
 */
{code}
My suggestion would be:
 # Store a new BlockingQueue for each event type in a HashMap
 # Create a new thread for each registered event type / eventQueue
 # Make every thread responsible for processing one eventQueue
 # The Dispatcher would map to an N:N:N (EventQueue:Thread:EventHandler) system 
(or an N:M:N one, where M is smaller than N, in order to reduce the number of 
threads), where N is the number of EventTypes registered

A more fine-grained solution is possible by making an M*N:M*N:N 
(EventQueue:Thread:EventHandler) system, where M is a number given on 
registration (how many threads should be processing this kind of event) and N 
is the number of EventTypes registered (as far as I can tell, the 
EventHandlers do not use locks internally and are thread safe). I am not sure 
whether this is feasible, because of the external locks used in EventHandlers 
(e.g. NodeEventHandler uses getRMNodes(), which is guarded by a ConcurrentMap 
-> I think this is the feedback that was given by [~adam.antal] and [~epayne]).

A dummy implementation of the aforementioned system would be:
{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.LinkedBlockingQueue;
import org.apache.hadoop.yarn.event.Event;
import org.apache.hadoop.yarn.event.EventHandler;

public class ThreadedDispatcher {
  private final ConcurrentMap<Class<? extends Enum>, BlockingQueue<Event>> events =
      new ConcurrentHashMap<>();
  private final ConcurrentMap<Class<? extends Enum>, EventHandler<Event>> eventHandlers =
      new ConcurrentHashMap<>();
  private volatile boolean stopped = false;

  public void register(Class<? extends Enum> eventType, EventHandler<Event> handler) {
    eventHandlers.put(eventType, handler);
    events.putIfAbsent(eventType, new LinkedBlockingQueue<>());
    // one consumer thread per registered event type / queue
    new Thread(() -> {
      BlockingQueue<Event> eventQueue = events.get(eventType);
      while (!stopped && !Thread.currentThread().isInterrupted()) {
        try {
          handler.handle(eventQueue.take());
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      }
    }).start();
  }

  // single entry point, same role as AsyncDispatcher's GenericEventHandler
  class GenericEventHandler implements EventHandler<Event> {
    @Override
    public void handle(Event event) {
      events.get(event.getType().getDeclaringClass()).add(event);
    }
  }
}
{code}
cc: [~pbacsko]

> RM multi-thread event processing mechanism
> --
>
> Key: YARN-9927
> URL: https://issues.apache.org/jira/browse/YARN-9927
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Affects Versions: 3.0.0, 2.9.2
>Reporter: hcarrot
>Assignee: Qi Zhu
>Priority: Major
> Attachments: RM multi-thread event processing mechanism.pdf, 
> YARN-9927.001.patch
>
>
> Recently, we have observed serious event blocking in the RM event dispatcher 
> queue. After analyzing RM event monitoring data and the RM event processing 
> logic, we found that:
> 1) environment: a cluster with thousands of nodes
> 2) RMNodeStatusEvent dominates 90% of the time consumed by the RM event 
> scheduler
> 3) meanwhile, RM event processing runs in single-thread mode, which results 
> in low headroom for the RM event scheduler and thus reduced RM performance.
> So we propose an RM multi-thread event processing mechanism to improve RM 
> performance.






[jira] [Commented] (YARN-10503) Support queue capacity in terms of absolute resources with gpu resourceType.

2021-03-19 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304984#comment-17304984
 ] 

Eric Payne commented on YARN-10503:
---

[~leftnoteasy] and [~sunilg], is there a reason custom resources were not 
included when the absolute resource feature was added?

[~zhuqi], I would prefer that custom resources be treated in a generic way for 
calculating absolute queue resources. I would rather not treat GPU as a special 
case. However, I think YARN-9936 is going beyond this requirement. Can we use 
this JIRA (YARN-10503) to extend the absolute queue resource feature in a 
general way for custom resources?
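One possible shape of such a generic treatment (an illustrative sketch only, not the actual YARN-10503 patch; the input map of configured per-type values is a hypothetical stand-in for the parsed queue configuration) would be to iterate the registered resource types rather than a fixed MEMORY/VCORES enum:

{code:java}
import java.util.Map;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceInformation;
import org.apache.hadoop.yarn.util.resource.ResourceUtils;

public final class AbsoluteResourceSketch {
  // Builds a queue's absolute minimum Resource from per-type configured values
  // (memory-mb, vcores, yarn.io/gpu, ...) instead of a fixed two-entry enum.
  static Resource fromConfiguredValues(Map<String, Long> configuredValues) {
    Resource min = Resource.newInstance(0, 0);
    for (ResourceInformation ri : ResourceUtils.getResourceTypesArray()) {
      Long value = configuredValues.get(ri.getName());
      if (value != null) {
        min.setResourceValue(ri.getName(), value);
      }
    }
    return min;
  }
}
{code}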

> Support queue capacity in terms of absolute resources with gpu resourceType.
> 
>
> Key: YARN-10503
> URL: https://issues.apache.org/jira/browse/YARN-10503
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Critical
> Attachments: YARN-10503.001.patch, YARN-10503.002.patch
>
>
> Now the absolute resources are memory and cores.
> {code:java}
> /**
>  * Different resource types supported.
>  */
> public enum AbsoluteResourceType {
>   MEMORY, VCORES;
> }{code}
> But in our GPU production clusters, we need to support more resourceTypes.
> It's very important for cluster scaling when queues have absolute demands for 
> different resourceTypes.
>  
> This Jira will handle GPU first.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10493) RunC container repository v2

2021-03-19 Thread Matthew Sharp (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304979#comment-17304979
 ] 

Matthew Sharp commented on YARN-10493:
--

I have an initial PR to address the improvements outlined in the attached pdf.  
I have some thoughts around the manifest caching that I would like to address 
in a follow up Jira.  We have this running internally with the Java CLI tool 
from YARN-10494.  

> RunC container repository v2
> 
>
> Key: YARN-10493
> URL: https://issues.apache.org/jira/browse/YARN-10493
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, yarn
>Affects Versions: 3.3.0
>Reporter: Craig Condit
>Assignee: Matthew Sharp
>Priority: Major
>  Labels: pull-request-available
> Attachments: runc-container-repository-v2-design.pdf
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current runc container repository design has scalability and usability 
> issues which will likely limit widespread adoption. We should address this 
> with a new, V2 layout.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-10493) RunC container repository v2

2021-03-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated YARN-10493:
--
Labels: pull-request-available  (was: )

> RunC container repository v2
> 
>
> Key: YARN-10493
> URL: https://issues.apache.org/jira/browse/YARN-10493
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, yarn
>Affects Versions: 3.3.0
>Reporter: Craig Condit
>Assignee: Matthew Sharp
>Priority: Major
>  Labels: pull-request-available
> Attachments: runc-container-repository-v2-design.pdf
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The current runc container repository design has scalability and usability 
> issues which will likely limit widespread adoption. We should address this 
> with a new, V2 layout.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-10597) CSMappingPlacementRule should not create new instance of Groups

2021-03-19 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304971#comment-17304971
 ] 

Peter Bacsko edited comment on YARN-10597 at 3/19/21, 3:35 PM:
---

[~shuzirra] is it really that simple? You told me that there were a bunch of 
unit test failures when you tried to change it months back. Anyway, it's great 
news if the change is tiny.


was (Author: pbacsko):
[~shuzirra] is it really that simple? You told me that there were a bunch of 
unit test failures. Anyway, it's great news if the change is tiny.

> CSMappingPlacementRule should not create new instance of Groups
> ---
>
> Key: YARN-10597
> URL: https://issues.apache.org/jira/browse/YARN-10597
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: YARN-10597.001.patch
>
>
> As [~ahussein] pointed out in YARN-10425, no new Groups instance should be 
> created.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10597) CSMappingPlacementRule should not create new instance of Groups

2021-03-19 Thread Peter Bacsko (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304971#comment-17304971
 ] 

Peter Bacsko commented on YARN-10597:
-

[~shuzirra] is it really that simple? You told me that there were a bunch of 
unit test failures. Anyway, it's great news if the change is tiny.

> CSMappingPlacementRule should not create new instance of Groups
> ---
>
> Key: YARN-10597
> URL: https://issues.apache.org/jira/browse/YARN-10597
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gergely Pollak
>Assignee: Gergely Pollak
>Priority: Major
> Attachments: YARN-10597.001.patch
>
>
> As [~ahussein] pointed out in YARN-10425, no new Groups instance should be 
> created.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10697) Resources are displayed in bytes in UI for schedulers other than capacity

2021-03-19 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304941#comment-17304941
 ] 

Jim Brennan commented on YARN-10697:


{quote}
So can we introduce a new method in Resource.java which can print it in 
MB|GB|TB?
{quote}
[~BilwaST] I think that is a good suggestion.  There are places where this 
format would be nice.
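
As a rough sketch of what such a helper could look like (the class and method 
names here are hypothetical, not the actual Resource.java API):
{code:java}
// Illustrative helper: pretty-print a memory size given in MB as MB/GB/TB.
// Names are hypothetical; this is not the real Resource.java method.
public final class MemoryFormat {
  private MemoryFormat() {
  }

  public static String toReadable(long memoryMB) {
    if (memoryMB >= 1024L * 1024L) {
      return String.format("%.1f TB", memoryMB / (1024.0 * 1024.0));
    } else if (memoryMB >= 1024L) {
      return String.format("%.1f GB", memoryMB / 1024.0);
    }
    return memoryMB + " MB";
  }
}
{code}
For example, toReadable(524288) would render as "512.0 GB" instead of a raw 
megabyte or byte count.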


> Resources are displayed in bytes in UI for schedulers other than capacity
> -
>
> Key: YARN-10697
> URL: https://issues.apache.org/jira/browse/YARN-10697
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bilwa S T
>Assignee: Bilwa S T
>Priority: Major
> Attachments: YARN-10697.001.patch, image-2021-03-17-11-30-57-216.png
>
>
> Resources.newInstance expects memory in MB, whereas MetricsOverviewTable 
> passes resources in bytes. Also, we should display memory in GB for better 
> readability for users.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-10534) Enable runC container transformations

2021-03-19 Thread Matthew Sharp (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Sharp reassigned YARN-10534:


Assignee: Matthew Sharp

> Enable runC container transformations
> -
>
> Key: YARN-10534
> URL: https://issues.apache.org/jira/browse/YARN-10534
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Matthew Sharp
>Assignee: Matthew Sharp
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The goal of this Jira is to provide an optional plugin to apply runC 
> container transformations. Enabling runC container transformations will 
> provide an easy way to apply site specific customizations to all containers.
> An example of one transformation that many clusters may need could be a 
> Kerberos transformation. This would apply cluster Kerberos configurations and 
> mount them to all runC containers that are submitted, without requiring users 
> to manage them within their own images.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10704) The CS effective capacity for absolute mode in UI should support GPU and other custom resources.

2021-03-19 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304816#comment-17304816
 ] 

Hadoop QA commented on YARN-10704:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
20s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red}{color} | {color:red} The patch doesn't appear to 
include any new or modified tests. Please justify why no new tests are needed 
for this patch. Also please list what manual steps were performed to verify 
this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
59s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 52s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
40s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 19m 
57s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  1m 
49s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
48s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 40s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/827/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt{color}
 | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 3 new + 82 unchanged - 0 fixed = 85 total (was 82) {color} 
|
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 15m  
3s{color} | 
{col

[jira] [Updated] (YARN-10704) The CS effective capacity for absolute mode in UI should support GPU and other custom resources.

2021-03-19 Thread Qi Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qi Zhu updated YARN-10704:
--
Summary: The CS effective capacity for absolute mode in UI should support 
GPU and other custom resources.  (was: The CS effective capacity for absolute 
mode in UI should support GPU.)

> The CS effective capacity for absolute mode in UI should support GPU and 
> other custom resources.
> 
>
> Key: YARN-10704
> URL: https://issues.apache.org/jira/browse/YARN-10704
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: YARN-10704.001.patch, image-2021-03-19-12-05-28-412.png, 
> image-2021-03-19-12-08-35-273.png
>
>
> Actually there is no information about the effective GPU capacity in the UI 
> for absolute resource mode.
> !image-2021-03-19-12-05-28-412.png|width=873,height=136!
> But we have this information in QueueMetrics:
> !image-2021-03-19-12-08-35-273.png|width=613,height=268!
>  
> This is very important for our GPU users in absolute mode, but there is still 
> no way to see the absolute GPU information in the CS Queue UI.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-10704) The CS effective capacity for absolute mode in UI should support GPU.

2021-03-19 Thread Qi Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304701#comment-17304701
 ] 

Qi Zhu edited comment on YARN-10704 at 3/19/21, 8:00 AM:
-

cc [~pbacsko]  [~gandras] [~ebadger]  

Could you help review this? I first implemented the no-label case, because the 
custom resource metrics still don't support labels.

I think it's important for users of custom resources in absolute mode to see 
the effective custom resource values in the UI.

Thanks.


was (Author: zhuqi):
cc [~pbacsko]  [~gandras] [~ebadger]  

Could you help review this? I first implemented the no-label case, because the 
custom resource metrics still don't support labels.

> The CS effective capacity for absolute mode in UI should support GPU.
> -
>
> Key: YARN-10704
> URL: https://issues.apache.org/jira/browse/YARN-10704
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: YARN-10704.001.patch, image-2021-03-19-12-05-28-412.png, 
> image-2021-03-19-12-08-35-273.png
>
>
> Actually there is no information about the effective GPU capacity in the UI 
> for absolute resource mode.
> !image-2021-03-19-12-05-28-412.png|width=873,height=136!
> But we have this information in QueueMetrics:
> !image-2021-03-19-12-08-35-273.png|width=613,height=268!
>  
> This is very important for our GPU users in absolute mode, but there is still 
> no way to see the absolute GPU information in the CS Queue UI.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-10704) The CS effective capacity for absolute mode in UI should support GPU.

2021-03-19 Thread Qi Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17304701#comment-17304701
 ] 

Qi Zhu commented on YARN-10704:
---

cc [~pbacsko]  [~gandras] [~ebadger]  

Could you help review this? I first implemented the no-label case, because the 
custom resource metrics still don't support labels.

> The CS effective capacity for absolute mode in UI should support GPU.
> -
>
> Key: YARN-10704
> URL: https://issues.apache.org/jira/browse/YARN-10704
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: capacity scheduler
>Reporter: Qi Zhu
>Assignee: Qi Zhu
>Priority: Major
> Attachments: YARN-10704.001.patch, image-2021-03-19-12-05-28-412.png, 
> image-2021-03-19-12-08-35-273.png
>
>
> Actually there is no information about the effective GPU capacity in the UI 
> for absolute resource mode.
> !image-2021-03-19-12-05-28-412.png|width=873,height=136!
> But we have this information in QueueMetrics:
> !image-2021-03-19-12-08-35-273.png|width=613,height=268!
>  
> This is very important for our GPU users in absolute mode, but there is still 
> no way to see the absolute GPU information in the CS Queue UI.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org