[jira] [Updated] (YARN-4308) ContainersAggregated CPU resource utilization reports negative usage in first few heartbeats
[ https://issues.apache.org/jira/browse/YARN-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-4308:
--------------------------
    Attachment: 0010-YARN-4308.patch

Thanks [~templedf]. I have added messages for the assert conditions. There were a couple of methods in {{MockCPUResourceCalculatorProcessTree}} which were not needed for this test. I have removed them too as part of this new patch.

[~Naganarasimha Garla], thank you for the thoughts. I had a similar line of thought before choosing this new mock class for the CPU resource utilization test.
- There were 4 more methods which needed to be overridden along with {{getCPUPercentage}}, so a separate mock class looked cleaner and more readable.
- Currently we have added some comments and javadoc to point out that -1 has to be returned only in the first run, where there is not enough data to produce a CPU usage value. Still, new implementations may come up with cases where -1 needs to be returned between cycles as well, so a separate mock class can help in writing all such specific test logic as needed.

This is done just to have a CPU-specific test class so that any mock work can be done in one common place. It's not a very strong reason, but it seems good to keep it separate. More thoughts are welcome :)

> ContainersAggregated CPU resource utilization reports negative usage in first few heartbeats
>
>                 Key: YARN-4308
>                 URL: https://issues.apache.org/jira/browse/YARN-4308
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>   Affects Versions: 2.7.1
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-YARN-4308.patch, 0002-YARN-4308.patch, 0003-YARN-4308.patch, 0004-YARN-4308.patch, 0005-YARN-4308.patch, 0006-YARN-4308.patch, 0007-YARN-4308.patch, 0008-YARN-4308.patch, 0009-YARN-4308.patch, 0010-YARN-4308.patch
>
> NodeManager reports ContainerAggregated CPU resource utilization as a negative value in the first few heartbeat cycles. I added a new debug print and received the values below from heartbeats.
> {noformat}
> INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: ContainersResource Utilization : CpuTrackerUsagePercent : -1.0
> INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:ContainersResource Utilization : CpuTrackerUsagePercent : 198.94598
> {noformat}
> It would be better to send 0 as the CPU usage rather than a negative value in heartbeats, even though this happens only in the first few heartbeats.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
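The first-run behaviour discussed in this update (return -1 only while there is not yet enough data, then return real readings) could be mocked roughly as follows. This is only a minimal sketch in plain Java and not the code from the attached patch: the class name {{FirstRunUnavailableCpuMock}}, the method name and the constant are assumptions made for illustration, and the class deliberately does not extend any Hadoop type.

```java
/**
 * Hypothetical mock of a CPU usage source. All names here are illustrative
 * assumptions; this is not MockCPUResourceCalculatorProcessTree from the patch.
 */
final class FirstRunUnavailableCpuMock {
  /** Sentinel meaning "not enough samples yet", matching the -1 seen in the log. */
  static final float UNAVAILABLE = -1.0f;

  private final float steadyStatePercent;
  private boolean firstCall = true;

  FirstRunUnavailableCpuMock(float steadyStatePercent) {
    this.steadyStatePercent = steadyStatePercent;
  }

  /** Returns UNAVAILABLE on the first call and the configured percentage afterwards. */
  float getCpuUsagePercent() {
    if (firstCall) {
      firstCall = false;
      return UNAVAILABLE;
    }
    return steadyStatePercent;
  }

  public static void main(String[] args) {
    FirstRunUnavailableCpuMock mock = new FirstRunUnavailableCpuMock(50.0f);
    System.out.println(mock.getCpuUsagePercent()); // first run: no data yet
    System.out.println(mock.getCpuUsagePercent()); // later runs: real reading
  }
}
```

Keeping this logic in its own small class means a later test that needs -1 between cycles only has to change the mock, not the test body.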
[jira] [Updated] (YARN-4308) ContainersAggregated CPU resource utilization reports negative usage in first few heartbeats
[ https://issues.apache.org/jira/browse/YARN-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-4308:
--------------------------
    Attachment: 0009-YARN-4308.patch

Fixed a few checkstyle issues.
[jira] [Updated] (YARN-4308) ContainersAggregated CPU resource utilization reports negative usage in first few heartbeats
[ https://issues.apache.org/jira/browse/YARN-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-4308:
--------------------------
    Attachment: 0008-YARN-4308.patch

Thanks [~templedf] for pointing out the javadoc issue. I have handled all the comments, and I have some points to add on the sleep issue you mentioned.

{{MonitoringThread}} iterates through the process trees of all running containers and sets the utilization. It also checks for memory over-utilization and kills such containers. However, in our current scenario we are verifying only CPU values.

I also wanted to avoid these sleeps in the first place and to verify based on some events or processed values. However, the only value we can watch for a change is containerResourceUtilization, and a default value of {{ResourceUtilization.newInstance(0, 0, 0.0f)}} is already set. So if the CPU readings come back as 0, the value will still be 0. Hence I can do this check only for the test case I added, because {{MockCPUResourceCalculatorProcessTree}} returns a CPU value of 50. We can see whether it can be generalized for similar cases in future.
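One way to avoid fixed sleeps in the scenario described in this update, where the mock returns 50 and the result therefore differs from the default of 0, is to poll for the expected value with a timeout. The following is a minimal sketch under that assumption; {{WaitForSketch}}, {{waitFor}} and the {{AtomicInteger}} standing in for the monitored CPU value are hypothetical names for illustration and are not taken from the patch.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper; the names are illustrative and not from the patch. */
final class WaitForSketch {

  /**
   * Polls the condition every intervalMs until it holds or timeoutMs elapses.
   * Returns true if the condition was observed, false on timeout or interrupt.
   */
  static boolean waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs) {
    final long start = System.nanoTime();
    final long timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - start >= timeoutNanos) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for callers
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Stand-in for the monitored CPU value; it starts at the default of 0.
    AtomicInteger cpuPercent = new AtomicInteger(0);

    // Stand-in for the monitoring thread publishing the mock's reading of 50.
    Thread monitor = new Thread(() -> {
      try {
        Thread.sleep(100);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      cpuPercent.set(50);
    });
    monitor.start();

    boolean seen = waitFor(() -> cpuPercent.get() == 50, 10, 5000);
    System.out.println("observed 50: " + seen);
  }
}
```

As the update notes, this only works when the expected reading differs from the default; a mock that returns 0 gives the poll nothing to detect.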
[jira] [Updated] (YARN-4308) ContainersAggregated CPU resource utilization reports negative usage in first few heartbeats
[ https://issues.apache.org/jira/browse/YARN-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-4308:
--------------------------
    Attachment: 0007-YARN-4308.patch

[~templedf], thank you very much. Extremely sorry for that debug log, my bad. I added it to debug some tests. Attaching a new patch that removes the unwanted logs and also fixes the checkstyle/javac warnings. Kindly help to check the same.
[jira] [Updated] (YARN-4308) ContainersAggregated CPU resource utilization reports negative usage in first few heartbeats
[ https://issues.apache.org/jira/browse/YARN-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-4308:
--------------------------
    Attachment: 0006-YARN-4308.patch

I guess I missed one file in the earlier patch. Reattaching an updated patch.
[jira] [Updated] (YARN-4308) ContainersAggregated CPU resource utilization reports negative usage in first few heartbeats
[ https://issues.apache.org/jira/browse/YARN-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-4308:
--------------------------
    Attachment: 0005-YARN-4308.patch

Hi [~templedf] [~Naganarasimha Garla],
I have added a test case to check whether the UNAVAILABLE return value for CPU percentage is handled properly in ContainersMonitorImpl. Please help to check the same.
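To illustrate what "handled properly" could mean for the UNAVAILABLE value, here is a minimal sketch of an aggregation that skips unavailable readings so the total is never negative. It is an assumption about the intended behaviour and not the logic of the attached patch; {{CpuAggregationSketch}} and {{aggregate}} are hypothetical names, and plain Java collections stand in for the NodeManager's container data.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical aggregation sketch; not the actual ContainersMonitorImpl code. */
final class CpuAggregationSketch {
  /** Sentinel meaning "not enough samples yet", matching the -1 seen in the log. */
  static final float UNAVAILABLE = -1.0f;

  /** Sums per-container CPU percentages, ignoring readings that are not yet available. */
  static float aggregate(List<Float> perContainerCpuPercent) {
    float total = 0.0f;
    for (float reading : perContainerCpuPercent) {
      if (reading < 0) {
        continue; // covers UNAVAILABLE, so it contributes 0 to the total
      }
      total += reading;
    }
    return total;
  }

  public static void main(String[] args) {
    // First heartbeat: the only container has no CPU data yet.
    System.out.println(aggregate(Arrays.asList(UNAVAILABLE)));
    // Later heartbeat: one container still warming up, one reporting 50 percent.
    System.out.println(aggregate(Arrays.asList(UNAVAILABLE, 50.0f)));
  }
}
```

A test along the lines described in this update would then assert that the aggregated value is 0 rather than -1 while readings are unavailable.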
[jira] [Updated] (YARN-4308) ContainersAggregated CPU resource utilization reports negative usage in first few heartbeats
[ https://issues.apache.org/jira/browse/YARN-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-4308:
--------------------------
    Attachment: 0004-YARN-4308.patch

Thanks [~Naganarasimha Garla] and [~templedf].
Uploading a new patch addressing the comments. It now uses javadoc comments in {{ResourceCalculatorProcessTree}} and its child classes (no comments were added in the test classes). Please help to check the same, and kindly let me know if there are any issues.
[jira] [Updated] (YARN-4308) ContainersAggregated CPU resource utilization reports negative usage in first few heartbeats
[ https://issues.apache.org/jira/browse/YARN-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-4308:
--------------------------
    Attachment: 0003-YARN-4308.patch

Uploading a new patch as per the discussion. Kindly help to check the same.
[jira] [Updated] (YARN-4308) ContainersAggregated CPU resource utilization reports negative usage in first few heartbeats
[ https://issues.apache.org/jira/browse/YARN-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-4308:
--------------------------
    Attachment: 0002-YARN-4308.patch

Thank you [~kasha]. Yes, I think your patch can go in without any change. I am re-attaching the same here, only making some test case changes to make sure the tests pass. Thanks once again for sharing your thoughts and the patch.
[jira] [Updated] (YARN-4308) ContainersAggregated CPU resource utilization reports negative usage in first few heartbeats
[ https://issues.apache.org/jira/browse/YARN-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-4308:
--------------------------
    Attachment: 0001-YARN-4308.patch

Attaching a patch to handle this corner case.