[jira] [Commented] (YARN-7728) Expose and expand container preemptions in Capacity Scheduler queue metrics

2018-01-24 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338202#comment-16338202
 ] 

Eric Payne commented on YARN-7728:
--

{quote}I am fine with latest patch. If no issues, I could commit this patch.
{quote}
Hi [~sunilg]. Any update?

> Expose and expand container preemptions in Capacity Scheduler queue metrics
> ---
>
> Key: YARN-7728
> URL: https://issues.apache.org/jira/browse/YARN-7728
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.9.0, 2.8.3, 3.0.0
>Reporter: Eric Payne
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-7728.001.patch, YARN-7728.002.patch
>
>
> YARN-1047 exposed queue metrics for the number of preempted containers to the 
> fair scheduler. I would like to also expose these to the capacity scheduler 
> and add metrics for the amount of lost memory seconds and vcore seconds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7728) Expose and expand container preemptions in Capacity Scheduler queue metrics

2018-01-17 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328522#comment-16328522
 ] 

Sunil G commented on YARN-7728:
---

Sorry [~eepayne] for the delay here.

Yes, it makes sense. I was looking for one more metric earlier where I can get 
an idea that how much preemption happens over a time. But its not needed very 
much and lost cycles is more critical for cluster.

 

I am fine with latest patch. If no issues, I could commit this patch.

> Expose and expand container preemptions in Capacity Scheduler queue metrics
> ---
>
> Key: YARN-7728
> URL: https://issues.apache.org/jira/browse/YARN-7728
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.9.0, 2.8.3, 3.0.0
>Reporter: Eric Payne
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-7728.001.patch, YARN-7728.002.patch
>
>
> YARN-1047 exposed queue metrics for the number of preempted containers to the 
> fair scheduler. I would like to also expose these to the capacity scheduler 
> and add metrics for the amount of lost memory seconds and vcore seconds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7728) Expose and expand container preemptions in Capacity Scheduler queue metrics

2018-01-16 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327877#comment-16327877
 ] 

Eric Payne commented on YARN-7728:
--

Hi [~sunilg]. Did you have a chance to review my comments and the new patch? I 
would be interested in your comments.

> Expose and expand container preemptions in Capacity Scheduler queue metrics
> ---
>
> Key: YARN-7728
> URL: https://issues.apache.org/jira/browse/YARN-7728
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.9.0, 2.8.3, 3.0.0
>Reporter: Eric Payne
>Assignee: Eric Payne
>Priority: Major
> Attachments: YARN-7728.001.patch, YARN-7728.002.patch
>
>
> YARN-1047 exposed queue metrics for the number of preempted containers to the 
> fair scheduler. I would like to also expose these to the capacity scheduler 
> and add metrics for the amount of lost memory seconds and vcore seconds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7728) Expose and expand container preemptions in Capacity Scheduler queue metrics

2018-01-12 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324624#comment-16324624
 ] 

genericqa commented on YARN-7728:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 57s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 25s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 2 new + 144 unchanged - 0 fixed = 146 total (was 144) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 64m  
2s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}110m 10s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7728 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12905922/YARN-7728.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 51d954c543b5 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 3d65dbe |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/19231/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/19231/testReport/ |
| Max. process+thread count | 846 (vs. ulimit of 5000) |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 

[jira] [Commented] (YARN-7728) Expose and expand container preemptions in Capacity Scheduler queue metrics

2018-01-12 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16324165#comment-16324165
 ] 

Eric Payne commented on YARN-7728:
--

Thanks a lot for the comments, [~sunilg].

bq. n 3.0, we support multiple types, and this covers only cpu and memory. So 
could we cover preemption metrics also in case of multi resources.
I agree with this in principle. However, I made a conscious decision not to do 
this. There are a couple of difficulties that I see. First, this is not done 
for other resource metrics in QueueMetrics (or any of the other system metrics 
I could find). The resource metrics only cover memory and vcores. Second, 
making the metric names match the resource names is a little difficult if the 
resource names could be dynamic. Because of these two things, I feel that 
solving this should be done all at the same time in a more general JIRA.

{quote}
One more doubt is with aggregateVcoreSecondsPreempted. MutableCounterLong is 
used for this. But under one queue, we ll have multiple containers gets 
preempted and each container resource size vary drastically. So are we looking 
for an aggregate resource among all preempted containers in a given time ?
{quote}
I don't think I understand the question. The metrics are updated when each 
container is preempted, and the value keeps increasing over time. Similar to 
memory, it's basically a metric of total lost (virtual) cpu cycles due to 
preemption since the RM was started.

{quote}
 aggregateMegabyteSecondsPreempted: MegaByte seems a bit confusing, MemoryMB is 
used in another places as well. Could we use something similar (like prepending 
memory)
{quote}
Good point. I will update a new patch.

> Expose and expand container preemptions in Capacity Scheduler queue metrics
> ---
>
> Key: YARN-7728
> URL: https://issues.apache.org/jira/browse/YARN-7728
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.9.0, 2.8.3, 3.0.0
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: YARN-7728.001.patch
>
>
> YARN-1047 exposed queue metrics for the number of preempted containers to the 
> fair scheduler. I would like to also expose these to the capacity scheduler 
> and add metrics for the amount of lost memory seconds and vcore seconds.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7728) Expose and expand container preemptions in Capacity Scheduler queue metrics

2018-01-12 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323837#comment-16323837
 ] 

Sunil G commented on YARN-7728:
---

Thanks [~eepayne], this will be really helpful when preemption happens.

I have doubt w.r.t multiple resources scenario. In 3.0, we support multiple 
types, and this covers only cpu and memory. So could we cover preemption 
metrics also in case of multi resources.
One more doubt is with aggregateVcoreSecondsPreempted. MutableCounterLong is 
used for this. But under one queue, we ll have multiple containers gets 
preempted and each container resource size vary drastically. So are we looking 
for an aggregate resource among all preempted containers in a given time ?

Minor Nits:
aggregateMegabyteSecondsPreempted: MegaByte seems a bit confusing, MemoryMB is 
used in another places as well. Could we use something similar (like prepending 
memory)


> Expose and expand container preemptions in Capacity Scheduler queue metrics
> ---
>
> Key: YARN-7728
> URL: https://issues.apache.org/jira/browse/YARN-7728
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.9.0, 2.8.3, 3.0.0
>Reporter: Eric Payne
>Assignee: Eric Payne
> Attachments: YARN-7728.001.patch
>
>
> YARN-1047 exposed queue metrics for the number of preempted containers to the 
> fair scheduler. I would like to also expose these to the capacity scheduler 
> and add metrics for the amount of lost memory seconds and vcore seconds.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7728) Expose and expand container preemptions in Capacity Scheduler queue metrics

2018-01-11 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16323234#comment-16323234
 ] 

genericqa commented on YARN-7728:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  2s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 26s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
 The patch generated 2 new + 144 unchanged - 0 fixed = 146 total (was 144) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 12s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m  4s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}107m 19s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | YARN-7728 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12905721/YARN-7728.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux db74d9cd4dbe 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / bc285da |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/19206/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
| unit |