[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-07-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15369725#comment-15369725
 ] 

Hudson commented on YARN-4712:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #10074 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10074/])
YARN-4712. CPU Usage Metric is not captured properly in YARN-2928. (sjlee: rev 
6f6cc647d6e77f6cc4c66e0534f8c73bc1612a1b)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/timelineservice/NMTimelinePublisher.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/timelineservice/TestNMTimelinePublisher.java


> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Fix For: YARN-2928
>
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch, 
> YARN-4712-YARN-2928.v1.006.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-19 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201871#comment-15201871
 ] 

Naganarasimha G R commented on YARN-4712:
-

Thanks for the review and commit [~varun_saxena], [~sjlee0], [~djp], & 
[~sunilg].

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Fix For: YARN-2928
>
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch, 
> YARN-4712-YARN-2928.v1.006.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-19 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15199768#comment-15199768
 ] 

Varun Saxena commented on YARN-4712:


+1 on the latest patch from me too.
>From the point of view of aggregation, I agree with using container CPU per 
>core.

Before committing this, I think we can discuss with others in meeting today and 
find out if they agree.
And if we need other metrics at the container level.


> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch, 
> YARN-4712-YARN-2928.v1.006.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-19 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201868#comment-15201868
 ] 

Varun Saxena commented on YARN-4712:


I have committed this to branch YARN-2928.
Thanks [~Naganarasimha] for your contribution.
And thanks [~sjlee0], [~djp] and [~sunilg] for reviews.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Fix For: YARN-2928
>
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch, 
> YARN-4712-YARN-2928.v1.006.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-18 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201835#comment-15201835
 ] 

Varun Saxena commented on YARN-4712:


bq.  Varun Saxena, will you do the honors of committing this patch?
Sure. WiIl commit it shortly.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch, 
> YARN-4712-YARN-2928.v1.006.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-18 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201695#comment-15201695
 ] 

Sangjin Lee commented on YARN-4712:
---

For the record (and following up on yesterday's offline discussion), I am +1 on 
the patch. If there are different opinions on this, we can reopen the 
discussion. Thanks! [~varun_saxena], will you do the honors of committing this 
patch?

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch, 
> YARN-4712-YARN-2928.v1.006.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-15 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195852#comment-15195852
 ] 

Sangjin Lee commented on YARN-4712:
---

The latest patch looks good to me. However, we still need others' feedback, and 
more importantly thoughts on which metric we should capture and report...

On a somewhat separate note, I see that this metrics emission interval is tied 
to {{yarn.nodemanager.resource-monitor.interval-ms}} whose default is 3 
seconds. I am afraid that this is probably too aggressive for the timeline 
service v.2. I think it'd be good to have a separate control for publishing 
these metrics to the timeline service. I'll open a JIRA for that.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch, 
> YARN-4712-YARN-2928.v1.006.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194836#comment-15194836
 ] 

Hadoop QA commented on YARN-4712:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
37s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
15s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
57s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
13s {color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 patch generated 0 new + 24 unchanged - 1 fixed = 24 total (was 25) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 27s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 2s {color} | 
{color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 37s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12793474/YARN-4712-YARN-2928.v1.006.patch
 |
| JIRA Issue | YARN-4712 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 3448f00169b7 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality 

[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-14 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193764#comment-15193764
 ] 

Sangjin Lee commented on YARN-4712:
---

Also some quick comments on the latest patch:

(ContainersMonitorImpl.java)
- l.469-473: We need to note other usages of {{cpuUsageTotalCoresPercentage}}. 
It is used in tracking the container resource utilization, as well as passed to 
{{ContainerMetrics.forContainer()}}. If we're no longer going to use this for 
the {{NMTimelinePublisher}}, we might need to it differently?

(NMTimelinePublisher.java)
- l.117: we should change the argument name from 
{{cpuUsageTotalCoresPercentage}} to {{cpuUsagePercentPerCore}}


> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-14 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193732#comment-15193732
 ] 

Sangjin Lee commented on YARN-4712:
---

I also think that {{cpuUsagePercentPerCore}} might be a better metric to record 
than {{cpuUsageTotalCorePercentage}}.

One way to understand the difference in using either is with the former the 
unit is the cores and with the latter it is the machines. Other aspects are 
entirely similar. Thus, it follows that {{cpuUsagePercentPerCore}} is a 
finer-grained value than {{cpuUsageTotalCorePercentage}}.

For example, to come up with a relative utilization of an app against the full 
cluster, you need the number of cores as the denominator with the former, and 
the number of machines with the latter. Granted, obtaining the number of cores 
can be more difficult than the number of machines.

Either model breaks down when those units are no longer interchangeable. For 
example, with {{cpuUsageTotalCorePercentage}}, it causes inaccurate values if 
the machines are not of equal size (e.g. machines with different numbers of 
cores). With {{cpuUsagePercentPerCore}}, it can report inaccurate utilization 
of the cluster if clock speeds are different between machines.

\[1\] cpuUsagePercentPerCore
- pro: more accurate and finer-grained reporting of utilization
- con: requires the number of cores to come up with the cluster-wide 
utilization of anything
- con: still doesn’t account for different core performance

\[2\] cpuUsageTotalCoresPercentage
- pro: easier to come up with cluster-wide utilization
- con: coarser-grained metric that breaks down the moment machines are not 
equivalent

\[other points\]
\[1\] stick with pure utilization
One point to consider is whether we should take into account the available 
capacity as opposed to full machine capacity. There are a couple of ways the 
available capacity can be different than the full capacity. One is via 
{{nodeCpuPercentageForYARN}} (coming from the cpu-limit config). Another 
mechanism is via the allocated vcores mechanism. Either way, for example, one 
may allocate only 6 cores out of a 8-core machine. If a container is using 6 
cores, the question is whether that should be reported as 100% utilization or 
75% utilization.

Although an argument can be made for either outcome, I think it might be 
simpler to stick with a pure utilization approach. It would be easier to match 
those numbers against CPU measurements coming from direct means. We should 
consider CPU reported by the NM as plain utilization numbers.

\[2\] stick with physical cores vs. vcores
Another potentially complicating factor is whether we should consider using 
vcores Using vcores would put this closer to YARN’s resource scheduling model. 
However, IMO it would make things unnecessarily more complicated.

Again, in the vein of treating the CPU as plain utilization that can be matched 
against the direct measurements, I think we should stick with physical cores.

Thoughts?

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192239#comment-15192239
 ] 

Hadoop QA commented on YARN-4712:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
21s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
56s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 patch generated 1 new + 22 unchanged - 2 fixed = 23 total (was 24) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 34s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 6s {color} | 
{color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 30s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12793186/YARN-4712-YARN-2928.v1.005.patch
 |
| JIRA Issue | YARN-4712 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 6d596507dcba 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-08 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185588#comment-15185588
 ] 

Varun Saxena commented on YARN-4712:


>From the example, it seems it does make more sense to capture 
>cpuUsagePercentPerCore from an aggregation point of view.

As I pointed out in my earlier comment, I was thinking what we report to RM as 
resource utilization would be a useful metric to capture as well. Maybe have it 
only at the container level and do not aggregate it. I was saying this because 
of the factor nodeCpuPercentageForYARN. However from offline discussion with 
Naga, it seems it isn't calculated correctly. But if it is, wouldn't it be 
useful ? 

bq. it would involve complexities when nodes go down intermittently before the 
aggregation and effective usage should not be more than the actual cluster cores
I am assuming we will have to add some system level metrics in the future. 
Maybe we can capture things like reduction/addition in cluster vcores/memory 
there(so that we can reconcile with CPU/memory values reported from container 
and aggregated upwards)

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-08 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15185015#comment-15185015
 ] 

Naganarasimha G R commented on YARN-4712:
-

On second thoughts if no one is testing ATSV2 i can wait till the conclusion of 
the above CPU usage and then get it committed (doesnt make sense to have one 
more issue on CPU again!), Either way i am fine !


> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-07 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184400#comment-15184400
 ] 

Naganarasimha G R commented on YARN-4712:
-

Thanks [~sjlee0] for the clarification with example and agree that there is no 
point in summing up {{cpuUsageTotalCoresPercentage}} but there are issues with 
other approach( aggregating *cpuUsagePercentPerCore*) which you have also 
mentioned
# total number of cores for the cluster would be required to actually arrive at 
a percentage and also it would involve complexities when nodes go down 
intermittently before the aggregation and effective usage should not be more 
than the actual cluster cores 
# total number of cores should not be just the machines cores but actually 
which is made available to YARN (*nodeCpuPercentageForYARN*)
# Still it might not be better comparison as  the type of the core might also 
be different in a heterogenous cluster. Usually we try to have some 
multiplication factor so that vcores overall match. so may be we need to 
consider this factor too ?

But anyway this would require more discussion on what would be the right one to 
choose so i will raise a new jira as this bug we have currently will block some 
one testing the ATSv2 


> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-07 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183611#comment-15183611
 ] 

Sangjin Lee commented on YARN-4712:
---

My main concern with using {{cpuUsageTotalCoresPercentage}} is about 
*aggregation*, and I think using {{cpuUsageTotalCoresPercentage}} breaks down 
in a heterogeneous cluster. Here is an illustrative example.

Suppose you have a 2-node cluster, where the first node has 4 cores and the 
second node has 8 cores. Furthermore, suppose that the container on the 4-core 
node is utilizing all 4 cores and the container on the 8-core node is utilizing 
1 core. Since the entire cluster has 12 cores and the app is using 5 cores, the 
utilization of this app should be 42% (5/12 cores).

However, if we use {{cpuUsageTotalCoresPercentage}}, we have a problem. The 
container on the 4-core node will report 100% utilization on that node, and the 
other container on the 8-core node will report 12.5% utilization. Then, if we 
aggregated the container metrics to the app, the app would have 112.5% 
utilization of the cluster or 56% per node. IMO this is not correct, or at best 
misleading.

If the node capacity in terms of cores is homogeneous, it does not make a 
difference by using either. However, if we have a heterogeneous cluster, the 
latter essentially under-weighs larger machines by using *simple* averages. 
This would result in a misleading and confusing result on aggregation.

I do recognize using {{cpuUsagePercentPerCore}} would require the total number 
of cores for the cluster when aggregated to arrive at a relative percentage 
number. But overall I do feel that {{cpuUsagePercentPerCore}} would be a more 
accurate measure of the cluster utilization when aggregated.

I am OK with separating that discussion to another JIRA.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-07 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183343#comment-15183343
 ] 

Varun Saxena commented on YARN-4712:


Thanks [~Naganarasimha] for the patch.
bq. IMO cpuUsageTotalCoresPercentage is important to gauge how much of the 
cluster's CPU is getting utlized, if its cpuUsagePercentPerCore i beleive it 
doesnt give the cluster's CPU on aggregation from all containers. Infact we 
need to report both and also IMO cpuUsageTotalCoresPercentage is not calculated 
properly it should be
In ContainersMonitorImpl, we calculate CPU per container process.
There are 2 primary CPU values here. 
{{cpuUsagePercentPerCore}} is similar to what we see in top command output i.e. 
if we have a 4 core machine, and 3 of the cores are used by a specific process, 
we will see CPU% as 300%.
This value on Linux will be calculated by reading {{/proc//stat}} from 
where we read the amount of time spent(in terms of CPU jiffies) in kernel and 
user space by the process. And get the effective CPU% based on the sample 
values read earlier and read now.
{{cpuUsageTotalCoresPercentage}} on the other hand is a normalized value based 
on number of processors configured for the node.
{{nodeCpuPercentageForYARN}} is a config which places an upper limit on CPU to 
be used by containers. IIUC this config has to be used with cgroups(although 
its not restricted in code).
This config is factored in while reporting CPU resource utilization in Node HB 
to RM.
In a heterogeneous cluster, all these 3 values maybe useful. Thoughts ?

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, 
> YARN-4712-YARN-2928.v1.004.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182347#comment-15182347
 ] 

Hadoop QA commented on YARN-4712:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 15m 33s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
16s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
17s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
55s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 patch generated 1 new + 22 unchanged - 2 fixed = 23 total (was 24) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 34s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 7s {color} | 
{color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m 51s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12791682/YARN-4712-YARN-2928.v1.004.patch
 |
| JIRA Issue | YARN-4712 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 4b22a6b1e10d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-04 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181424#comment-15181424
 ] 

Sangjin Lee commented on YARN-4712:
---

I agree that we should take care of the UNAVAILABLE metrics via YARN-4308. My 
position is that we should skip reporting the value rather than reporting 0.

I know I might be opening a can of worms, but I'd like to raise a couple of 
points as they are closely related to this.

First, what should we report via {{NMTimelinePublisher}}? There are 2 choices: 
{{cpuUsagePercentPerCore}} (300% in the example mentioned in the comment) and 
{{cpuUsageTotalCoresPercentage}} (50% in the same example). I see that we're 
storing {{cpuUsageTotalCoresPercentage}}. I wonder if that is the best choice 
here.

For example, consider a cluster with workers with substantially different 
capacity (number of cores). If we used the latter and tried to aggregate them 
later for the application or the flow, this would lead to a highly misleading 
sum. 50% of a 6-core node is very different than 50% of a 24-core node.

Most of YARN's CPU accounting is based on cores rather than nodes/machines. IMO 
{{cpuUsagePercentPerCore}} would be a better value to emit. Thoughts?

The second point is the following line in the existing code:
{code}
cpuMetric.setId(ContainerMetric.CPU.toString() + pId);
{code}

I vaguely remember reading this line and being puzzled. Why are we appending 
the process id to the metric id? Doesn't this cause issues when we do the 
aggregation? For example, suppose we have a container #1 (process id = 1234) on 
some machine whose CPU usage is 10%, and container #2 (process id = 5678) on 
another machine whose CPU usage is 20%. The object model will be

{noformat}
(container #1) -> (metric) -> ("CPU1234" => 10)
(container #2) -> (metric) -> ("CPU5678" => 20)
{noformat}

But we want to add them for the parent application. It would be real awkward to 
add these metrics with different keys. Why is process id needed here in the 
first place?

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-03 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178165#comment-15178165
 ] 

Varun Saxena commented on YARN-4712:


Yeah even I think we should make minimal changes here i.e. only those required 
specifically for YARN-2928. Because this is something which has to be done 
primarily in trunk.
I will let [~djp] comment on it as he was reviewing YARN-4308 as well.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-03 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178143#comment-15178143
 ] 

Sunil G commented on YARN-4712:
---

Hi [~Naganarasimha Garla] and [~varun_saxena]

I think changes in {{ContainersMonitorImpl}} need to be trunk also. A patch was 
given in YARN-4308, but there were few discussion on  YARN-3304 and most people 
agreed for -1 there. Hence YARN-4308 is in limbo state, and I think for first 
time we can send 0 rather than -1. So patch there can go to trunk i think. In 
that case, next trunk sync will fetch that change. Thoughts?

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-03 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178065#comment-15178065
 ] 

Sunil G commented on YARN-4712:
---

[~varun_saxena]  I was also planning to handle this pblm from it root cause 
end,  which is in NM reporting side,  as YARN-4308. Ideally to me,  it's not 
good to send UNAVAILABLE,  bcz we send it for every first reading. In some 
error care,  when there s no reading s there,  I think we may need this error 
code also.  But I would like to fix sending - 1 in first reponse at least. 

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-03 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178055#comment-15178055
 ] 

Varun Saxena commented on YARN-4712:


[~Naganarasimha],
UNAVAILABLE related issue exists in trunk. Shouldnt we fix all issues related 
to UNAVAILABLE as part of JIRA in trunk ? YARN-4308 is raised specifically for 
this purpose if I am not wrong. I was seeing this JIRA more for handling the 
issue with sending float as CPU metric and just bringing in YARN-4308 fix in 
the interim.
However I am open to fixing it here. But this can duplicate effort and cause 
clash if JIRA in trunk goes before this JIRA.
Thoughts [~djp] ?

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177970#comment-15177970
 ] 

Hadoop QA commented on YARN-4712:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 11m 
4s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 36s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
14s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 15s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 patch generated 1 new + 20 unchanged - 4 fixed = 21 total (was 24) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 32s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 16s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 20s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 40m 34s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12791179/YARN-4712-YARN-2928.v1.003.patch
 |
| JIRA Issue | YARN-4712 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 60f598883cac 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-02 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15177177#comment-15177177
 ] 

Naganarasimha G R commented on YARN-4712:
-

Thanks for the comments [~djp] & [~varun_saxena],\
bq. Regarding checkstyle, you can fix them for now.
As you can note in the latest patch line length issues are already taken care 
of.

bq. We shouldn't let Eclipse's bug affect our code convention.
Well its not that i dont want to do it, but i presume Eclipse optimizes it in 
some ways and does only when required, Anyway have taken care of it but it 
would more easy to rely on the editors formatter if accepted  :) 

bq.  it seems more things need to be fixed for UNAVAILABLE case,
agree , milliVcoresUsed can be set to 0 in UNAVAILABLE case, right ?

bq. It sounds weird if cpuUsageTotalCoresPercentage is -1 in UNAVAILABLE case.
we have set it -1 to indicate not to store this  value in the ATS. if its 
*unavaiable do we need to store it as 0 or not store at all* ?

bq. it make cpu metric to be either 0 or 1 which is not expected here?
As [~varun_saxena] explained it directly gives as percent values (no need to 
multiply with 100) and we round of to only remove the decimals values

[~djp], if you can confirm on these queries, i can finish the patch

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-02 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175997#comment-15175997
 ] 

Varun Saxena commented on YARN-4712:


bq. it make cpu metric to be either 0 or 1 which is not expected here?
Are you saying this because you are expecting cpuUsageTotalCoresPercentage to 
be in the range 0-1 ? I was thinking the same initially and hence in the 
initial patch we were multiplying this value with 100. But that doesnt seem to 
be the case. Upon testing found that this value is not between 0-1. If there 
are 4 cores and 2 cores are fully used, this value will be 50.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-02 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175980#comment-15175980
 ] 

Junping Du commented on YARN-4712:
--

Back on this patch, it seems more things need to be fixed for UNAVAILABLE case, 
like code blow:
{code}
// Multiply by 1000 to avoid losing data when converting to int
int milliVcoresUsed = (int) (cpuUsageTotalCoresPercentage * 1000
* maxVCoresAllottedForContainers /nodeCpuPercentageForYARN);
{code}
It sounds weird if cpuUsageTotalCoresPercentage is -1 in UNAVAILABLE case.

In addition, code below sounds not right:
{code}
+cpuMetric.addValue(currentTimeMillis,
+(long) Math.round(cpuUsageTotalCoresPercentage));
{code}
it make cpu metric to be either 0 or 1 which is not expected here?

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-02 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175973#comment-15175973
 ] 

Junping Du commented on YARN-4712:
--

That's correct. We shouldn't let Eclipse's bug affect our code convention. My 
practice is show the print margin of 80 chars and address it myself ahead. Or 
you can run a local checkstyle tools before submit the patch.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-02 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175878#comment-15175878
 ] 

Varun Saxena commented on YARN-4712:


[~djp], the checkstyle issues I was referring to were line > 80 characters, the 
ones fixed in last patch.
Naga was telling me offline that he uses eclipse formatter given on Hadoop Wiki 
page which does not always take care of > 80 characters issue especially if its 
just 5-6 characters extra. And as he uses it, these checkstyle issues(for > 80 
chars) keep on cropping up in every patch.

So for this branch, we need to fix line > 80 characters issue. Right ?

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-02 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175864#comment-15175864
 ] 

Junping Du commented on YARN-4712:
--

Regarding checkstyle, I think YARN-2928 dev branch should follow the same 
standard/criteria as trunk branch or it will have trouble when merge back. The 
common practice for trunk on checksyte issues is we need to fix them as much as 
we can. However, for some annoy warnings like "method too long" (like this 
case), "method parameter too many", etc., we don't need to worry about it 
unless there is strong justification for refactor the code.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-02 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175841#comment-15175841
 ] 

Varun Saxena commented on YARN-4712:


Regarding checkstyle, you can fix them for now. We can confirm with [~sjlee0] 
in tomorrow's meeting if for this branch we need to follow checkstyle or do not 
consider checkstyle issues which appear despite using the eclipse formatter 
given on Hadoop Wiki.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-02 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175838#comment-15175838
 ] 

Varun Saxena commented on YARN-4712:


Thanks [~Naganarasimha] for the patch.
Looks good overall. A couple of nits.

# In {{NMTimelinePublisher}}, cast to long is not required as Math#round 
returns an int/long depending on input value.
# TestNMTimelinePublisher line 84, use 
ResourceCalculatorProcessTree.UNAVAILABLE instead of -1.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch, 
> YARN-4712-YARN-2928.v1.002.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-03-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15175029#comment-15175029
 ] 

Hadoop QA commented on YARN-4712:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
22s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
57s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 patch generated 1 new + 22 unchanged - 2 fixed = 23 total (was 24) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 7s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 46s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 14s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
19s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 35m 27s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12790851/YARN-4712-YARN-2928.v1.002.patch
 |
| JIRA Issue | YARN-4712 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux c78f28d74503 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-02-28 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171214#comment-15171214
 ] 

Varun Saxena commented on YARN-4712:


[~Naganarasimha],
I was incorrectly assuming that CPU % is reported to NMTimelinePublisher in the 
range of 0-1. This doesn't seem to be the case though.
If we have 8 cores, and 4 are fully used, CPU % reported will be 50%. I do not 
think we should multiply this value with 100 then. Right ?
We can just normalize the value by either rounding or flooring it ?

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-02-28 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171188#comment-15171188
 ] 

Varun Saxena commented on YARN-4712:


[~Naganarasimha], thanks for the patch.
Looks fine overall. Few minor comments/nits.

* The javac and 2 of the checkstyle issues seem fixable.
* Moreover, should we use Math#round instead of floor ? I would personally 
prefer round.
* In the test, we can include an assertion with CPU value of less than 1 or a 
value reported with decimals. This way we can verify behavior of conversion of 
value after applying round or floor(whichever we use). Somebody should not be 
able to change the behavior inadvertently then.
* Also, we are using a deterministic sleep in the test(looping 5 times over). 
Is 750 ms enough for a slow machine ? Maybe use a while loop and wait till 
condition meets and add an overall larger timeout to the test or add more 
iterations, just to avoid test failing on a slow machine ?
* Nit : In assertEquals message being passed is empty (""), either we can have 
some message or just call the assertEquals version which is without message.
* In the test, for the line 
{{publisher.reportContainerResourceUsage(aContainer, PID, 1024L, -1F);}}, use 
the constant {{ResourceCalculatorProcessTree.UNAVAILABLE}} instead of -1 to 
avoid failure in case we change this constant's value.
* {{verifyPublishedResourceUsageMetrics(timelineClient, 1, 1024L, -999);}}. 
Here -999 can be used to denote a special meaning that cpu usage is not 
published. We do not explicitly check for it within  
verifyPublishedResourceUsageMetrics method. I mean we are indicating that 
numberOfResourceMetrics published is 1. But which one(MEMORY or CPU) is not 
checked. Maybe this -999 or better still we can use 
{{ResourceCalculatorProcessTree.UNAVAILABLE}} to verify if we have to check 
CPU/MEMORY or not. And if we get unexpected metric, we can fail the test.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-02-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171037#comment-15171037
 ] 

Hadoop QA commented on YARN-4712:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
49s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
16s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s 
{color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
54s {color} | {color:green} YARN-2928 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s 
{color} | {color:green} YARN-2928 passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
24s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 31s {color} 
| {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdk1.8.0_72
 with JDK v1.8.0_72 generated 1 new + 1 unchanged - 1 fixed = 2 total (was 2) 
{color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 55s {color} 
| {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdk1.7.0_95
 with JDK v1.7.0_95 generated 1 new + 2 unchanged - 0 fixed = 3 total (was 2) 
{color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 patch generated 3 new + 22 unchanged - 2 fixed = 25 total (was 24) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 6s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 32s {color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 9m 7s {color} | 
{color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK 
v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | 

[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-02-28 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171030#comment-15171030
 ] 

Naganarasimha G R commented on YARN-4712:
-

Hi [~varun_saxena], [~sunilg] & [~sjlee0], 
Attaching a patch, please review. I have taken a slightly different approach 
than YARN-4308 for issue 1.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
> Attachments: YARN-4712-YARN-2928.v1.001.patch
>
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-02-26 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169452#comment-15169452
 ] 

Sangjin Lee commented on YARN-4712:
---

Yes, at least for now we should emit the CPU as a percentage (i.e. multiply by 
100).

As for the "unavailable", can we not simply skip sending the CPU if the value 
is equal to unavailable?

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-02-25 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168524#comment-15168524
 ] 

Varun Saxena commented on YARN-4712:


[~Naganarasimha],
We had decided in YARN-4053 that we wont support doubles/floats as of now 
because of the complexities involved due to its implementation.
Moreover, we found that most of the times floats would come for some metric 
which is represented as a percentage(%) or a ratio.
In this case the denominator is fixed so you can easily normalize the value and 
store it as an integer. And divide it by 100(if its a %) when you read it back.
Moreover, in case of percentages a precision higher than 2 decimal places, wont 
be mighty important.

We thought if user/client wanted to store a value/ratio where denominator is 
not fixed, he can merely store numerator and denominator separately and do the 
necessary computation after reading it back.

We had decided until and unless there was a strong use case, we will go with 
only longs for now.
And we could not think of any use cases for current metrics as of now.
And there is a way around it as well if somebody wants to store a float by 
normalizing the value and storing numerator and denominator separately.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>  Labels: yarn-2928-1st-milestone
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-02-24 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166740#comment-15166740
 ] 

Naganarasimha G R commented on YARN-4712:
-

One way to avoid ( in particular to CPU usage) is multiply with 100 and floor 
it and then type cast it to int. 
But we need to further think YARN-4053 's aproach is not a limitation for 
others to load the metrics as it doesnt support decimals !
cc /[~sjlee0] & [~varun_saxena]

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-02-24 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15166682#comment-15166682
 ] 

Naganarasimha G R commented on YARN-4712:
-

Thanks [~sunilg],
Yes the first scenario is same as that jira, we should not proceed ahead with 
the calculations (divide by processors) if its -1 as usage, hope to see that 
jira to be committed.
2nd we need to discuss whether long is sufficient or we need to support double

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-02-22 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158230#comment-15158230
 ] 

Sunil G commented on YARN-4712:
---

Thanks Varun. Yes, such metrics has to be in double.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-02-22 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156642#comment-15156642
 ] 

Varun Saxena commented on YARN-4712:


I dont thnk its related to YARN-4308.
This is related to ATSv2 specific case wherein we are storing metrics as longs 
but this is double.
We need to discuss this though.

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928

2016-02-21 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156478#comment-15156478
 ] 

Sunil G commented on YARN-4712:
---

Hi [~Naganarasimha Garla]
For point 1, YARN-4308 was also trying to indicate a possible -ve CPU usage. Is 
it  similar?

> CPU Usage Metric is not captured properly in YARN-2928
> --
>
> Key: YARN-4712
> URL: https://issues.apache.org/jira/browse/YARN-4712
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Naganarasimha G R
>Assignee: Naganarasimha G R
>
> There are 2 issues with CPU usage collection 
> * I was able to observe that that many times CPU usage got from 
> {{pTree.getCpuUsagePercent()}} is 
> ResourceCalculatorProcessTree.UNAVAILABLE(i.e. -1) but ContainersMonitor do 
> the calculation  i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore 
> /resourceCalculatorPlugin.getNumProcessors()}} because of which UNAVAILABLE 
> check in {{NMTimelinePublisher.reportContainerResourceUsage}} is not 
> encountered. so proper checks needs to be handled
> * {{EntityColumnPrefix.METRIC}} uses always LongConverter but 
> ContainerMonitor is publishing decimal values for the CPU usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)