[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long
[ https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192771#comment-15192771 ] Sunil G commented on YARN-4807: --- Yes [~kasha]. Many of the waitForState methods have a hardcoded cumulative timeout, and the per-round wait is also hardcoded differently from method to method. So the total number of rounds to wait for resource allocations is given by (timeout / per-round-wait). As I see it, the number of rounds can also contribute a little to the duration of the tests. So if we start with 10ms, we can also define how many rounds are needed for a minimum try; with a few dry runs, a good figure could be calculated. bq. I have seen test cases where there is no reference to an RM or MockRM. Yes, there are places where we use this even from MockRM. So, as you suggested, it is better to focus on improving test time first; some cleanup can be tried later. > MockAM#waitForState sleep duration is too long > -- > > Key: YARN-4807 > URL: https://issues.apache.org/jira/browse/YARN-4807 > Project: Hadoop YARN > Issue Type: Sub-task > Affects Versions: 2.8.0 > Reporter: Karthik Kambatla > Assignee: Yufei Gu > Labels: newbie > > MockAM#waitForState sleep duration (500 ms) is too long. Also, there is > significant duplication with MockRM#waitForState. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
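The round arithmetic discussed in the comment above can be sketched as a single polling helper in which only the per-round sleep and the overall timeout are tunable, so the number of rounds follows directly from timeout / per-round-wait. This is an illustrative sketch, not the actual MockAM/MockRM code; the 10 ms interval and 2 s timeout are assumed values for demonstration.

```java
import java.util.function.Supplier;

public class StateWaiter {
    // Assumed values for illustration; the discussion suggests starting
    // from a small per-round sleep such as 10 ms.
    static final long POLL_INTERVAL_MS = 10;
    static final long TIMEOUT_MS = 2_000;

    /**
     * Polls until the condition holds or the timeout elapses, so the
     * number of rounds is roughly TIMEOUT_MS / POLL_INTERVAL_MS.
     */
    public static boolean waitFor(Supplier<Boolean> condition) {
        long deadline = System.currentTimeMillis() + TIMEOUT_MS;
        while (!condition.get()) {
            if (System.currentTimeMillis() >= deadline) {
                return false; // timed out before the state was reached
            }
            try {
                Thread.sleep(POLL_INTERVAL_MS); // short sleep => many cheap rounds
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

With a single pair of constants like this, lowering the interval (as proposed in the thread) shortens tests uniformly instead of method by method.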
[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long
[ https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192753#comment-15192753 ] Varun Vasudev commented on YARN-4807: - I'd prefer to start with Karthik's approach. Let's make the sleep time as low as possible.
[jira] [Commented] (YARN-4773) Log aggregation performs extraneous filesystem operations when rolling log aggregation is disabled
[ https://issues.apache.org/jira/browse/YARN-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192735#comment-15192735 ] Jun Gong commented on YARN-4773: Hi [~bibinchundatt], if the aggregation restarts again, the previous tmp file will be overwritten because the tmp file name does not change. > Log aggregation performs extraneous filesystem operations when rolling log > aggregation is disabled > -- > > Key: YARN-4773 > URL: https://issues.apache.org/jira/browse/YARN-4773 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager > Affects Versions: 2.6.0 > Reporter: Jason Lowe > Assignee: Jun Gong > Priority: Minor > Attachments: YARN-4773.01.patch > > > I noticed that when log aggregation occurs for an application, the nodemanager > lists the application's log directory in HDFS. Apparently this is for > removing old logs before uploading new ones. This is a wasteful operation > when rolling log aggregation is disabled, since there will be no prior logs > in HDFS -- aggregation only occurs once when rolling log aggregation is > disabled.
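A minimal illustration of why the restart described in the comment is safe, assuming (as the comment states) that the tmp file name is derived deterministically from stable identifiers. The path layout and helper below are hypothetical, not the real NodeManager layout.

```java
public class TmpLogPath {
    /**
     * Deterministic .tmp path for an app's aggregated-log upload.
     * Because the name contains no random or time-based component,
     * a restarted aggregation writes to the same path and simply
     * overwrites any partial file left by the previous attempt.
     */
    static String tmpPathFor(String appId, String nodeId) {
        return "/logs/" + appId + "/" + nodeId + ".tmp"; // same name every attempt
    }
}
```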
[jira] [Commented] (YARN-4808) SchedulerNode can use a few more cosmetic changes
[ https://issues.apache.org/jira/browse/YARN-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192693#comment-15192693 ] Inigo Goiri commented on YARN-4808: --- I'd say it's slightly cleaner to use just {{RMNode}} to hold the utilization data and avoid having to update it. Not 100% sure we are updating everything, but {{TestMiniYarnClusterNodeUtilization}} from YARN-3980 should be checking that, so we should be good. > SchedulerNode can use a few more cosmetic changes > - > > Key: YARN-4808 > URL: https://issues.apache.org/jira/browse/YARN-4808 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler > Affects Versions: 2.8.0 > Reporter: Karthik Kambatla > Assignee: Karthik Kambatla > Attachments: yarn-4808-1.patch > > > We have made some cosmetic changes to SchedulerNode recently. While working > on YARN-4511, realized we could improve it a little more: > # Remove volatile variables - don't see the need for them being volatile > # Some methods end up doing very similar things, so consolidating them > # Renaming totalResource to capacity. YARN-4511 plans to add inflatedCapacity > to include the un-utilized resources, and having two totals can be a little > confusing.
[jira] [Commented] (YARN-4773) Log aggregation performs extraneous filesystem operations when rolling log aggregation is disabled
[ https://issues.apache.org/jira/browse/YARN-4773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192685#comment-15192685 ] Bibin A Chundatt commented on YARN-4773: [~hex108]/[~jlowe] {quote} however we do not need to call AppLogAggregatorImpl#cleanOldLogs because no containers' logs have been uploaded before. {quote} If the NM got killed while aggregation was going on and then restarted, will there be a possibility that a .tmp file exists in the folder?
[jira] [Commented] (YARN-3054) Preempt policy in FairScheduler may cause mapreduce job never finish
[ https://issues.apache.org/jira/browse/YARN-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192683#comment-15192683 ] Peng Zhang commented on YARN-3054: -- Thanks [~kasha]. I agree with "have a preemption priority or even a preemption cost per container". In my temporary fix, I preempt the latest scheduled container instead of low- or high-priority containers. I think this keeps the containers for a baseline amount of resources (at least the steady fair share) stable, so MapReduce job progress will proceed. > Preempt policy in FairScheduler may cause mapreduce job never finish > > > Key: YARN-3054 > URL: https://issues.apache.org/jira/browse/YARN-3054 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler > Affects Versions: 2.6.0 > Reporter: Peng Zhang > > The preemption policy is tied to the schedule policy now. Using the schedule policy's > comparator to find preemption candidates cannot guarantee that a subset of > containers is never preempted, and this may cause tasks to be preempted > repeatedly before they finish, so the job cannot make any progress. > I think preemption in YARN should give the following assurances: > 1. MapReduce jobs can get additional resources when others are idle; > 2. MapReduce jobs for one user in one queue can still progress with their min > share when others preempt resources back. > Maybe always preempting the latest app and container can achieve this?
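The "preempt the latest scheduled container" idea from the comment can be sketched as a recency-based ordering of victim candidates, instead of ordering by the scheduling policy's priority comparator. The {{ContainerInfo}} stand-in below is hypothetical, not the real RMContainer API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PreemptionOrder {
    // Minimal stand-in for a running container; fields are illustrative.
    static class ContainerInfo {
        final String id;
        final long allocationTime;
        ContainerInfo(String id, long allocationTime) {
            this.id = id;
            this.allocationTime = allocationTime;
        }
    }

    /**
     * Orders candidates latest-allocated first, so long-running containers
     * (which carry the most completed work) are preempted last and a stable
     * subset of containers can keep making progress.
     */
    static List<ContainerInfo> preemptionOrder(List<ContainerInfo> containers) {
        List<ContainerInfo> sorted = new ArrayList<>(containers);
        sorted.sort(Comparator.comparingLong(
            (ContainerInfo c) -> c.allocationTime).reversed());
        return sorted;
    }
}
```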
[jira] [Commented] (YARN-4808) SchedulerNode can use a few more cosmetic changes
[ https://issues.apache.org/jira/browse/YARN-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192599#comment-15192599 ] Hadoop QA commented on YARN-4808:

(x) -1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 6m 28s | trunk passed |
| +1 | compile | 0m 25s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 0m 29s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 21s | trunk passed |
| +1 | mvnsite | 0m 34s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 1m 2s | trunk passed |
| +1 | javadoc | 0m 20s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 25s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 29s | the patch passed |
| +1 | compile | 0m 23s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 0m 23s | the patch passed |
| +1 | compile | 0m 27s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 27s | the patch passed |
| -1 | checkstyle | 0m 18s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: patch generated 1 new + 259 unchanged - 1 fixed = 260 total (was 260) |
| +1 | mvnsite | 0m 31s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 14s | the patch passed |
| +1 | javadoc | 0m 19s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 23s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 71m 38s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. |
| -1 | unit | 72m 8s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
| | | 159m 50s | |

|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |

|| Subsystem || Report/Notes ||
| Docker |
[jira] [Commented] (YARN-3933) Race condition when calling AbstractYarnScheduler.completedContainer.
[ https://issues.apache.org/jira/browse/YARN-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192582#comment-15192582 ] Shiwei Guo commented on YARN-3933: -- I noticed this, will fix it soon. > Race condition when calling AbstractYarnScheduler.completedContainer. > - > > Key: YARN-3933 > URL: https://issues.apache.org/jira/browse/YARN-3933 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler > Affects Versions: 2.6.0, 2.7.0, 2.5.2, 2.7.1 > Reporter: Lavkesh Lahngir > Assignee: Shiwei Guo > Attachments: YARN-3933.001.patch, YARN-3933.002.patch > > > In our cluster we are seeing available memory and cores going negative. > Initial inspection: > Scenario no. 1: > In the capacity scheduler, the method allocateContainersToNode() checks whether > there are excess reservations of containers for an application; if they are > no longer needed, it calls queue.completedContainer(), which causes > resources to go negative even though they were never assigned in the first place. > I am still looking through the code. Can somebody suggest how to simulate > excess container assignments?
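One common shape for this kind of accounting race is a duplicate completedContainer call driving the usage counters negative (e.g. once for an excess reservation and once for the real completion). The sketch below illustrates an idempotent-release guard with hypothetical names; it is not the actual YARN-3933 patch.

```java
import java.util.HashSet;
import java.util.Set;

public class ContainerAccounting {
    private int usedMB;
    private final Set<String> live = new HashSet<>();

    /** Records an allocation exactly once per container id. */
    void allocate(String containerId, int mb) {
        if (live.add(containerId)) {
            usedMB += mb;
        }
    }

    /**
     * Subtracts the container's resources exactly once. A duplicate
     * completion event for the same container id is ignored, so usedMB
     * cannot be driven negative by a double release.
     */
    void completedContainer(String containerId, int mb) {
        if (live.remove(containerId)) {
            usedMB -= mb;
        }
    }

    int getUsedMB() { return usedMB; }
}
```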
[jira] [Created] (YARN-4809) De-duplicate container completion across schedulers
Karthik Kambatla created YARN-4809: -- Summary: De-duplicate container completion across schedulers Key: YARN-4809 URL: https://issues.apache.org/jira/browse/YARN-4809 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Reporter: Karthik Kambatla CapacityScheduler and FairScheduler implement containerCompleted the exact same way. Duplication across the schedulers can be avoided.
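The de-duplication being proposed can be sketched as pulling the shared method up into the common base class, leaving only genuinely scheduler-specific behavior abstract. The class names below are simplified stand-ins for AbstractYarnScheduler and its subclasses, not the real code.

```java
// Shared logic lives once in the base class instead of being copied
// into each concrete scheduler.
abstract class AbstractScheduler {
    /** Formerly duplicated in both concrete schedulers; now hoisted here. */
    final String containerCompleted(String containerId) {
        releaseResources(containerId);
        return "completed:" + containerId;
    }

    /** Only the scheduler-specific part remains for subclasses. */
    abstract void releaseResources(String containerId);
}

class FairLikeScheduler extends AbstractScheduler {
    int released = 0;

    @Override
    void releaseResources(String containerId) {
        released++; // fair-scheduler-specific bookkeeping would go here
    }
}
```

Marking the hoisted method `final` prevents the subclasses from drifting back into divergent copies.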
[jira] [Updated] (YARN-4808) SchedulerNode can use a few more cosmetic changes
[ https://issues.apache.org/jira/browse/YARN-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-4808: --- Attachment: yarn-4808-1.patch
[jira] [Commented] (YARN-4808) SchedulerNode can use a few more cosmetic changes
[ https://issues.apache.org/jira/browse/YARN-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192557#comment-15192557 ] Karthik Kambatla commented on YARN-4808: [~elgoiri], [~leftnoteasy] - could you take a quick look?
[jira] [Created] (YARN-4808) SchedulerNode can use a few more cosmetic changes
Karthik Kambatla created YARN-4808: -- Summary: SchedulerNode can use a few more cosmetic changes Key: YARN-4808 URL: https://issues.apache.org/jira/browse/YARN-4808 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.8.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla We have made some cosmetic changes to SchedulerNode recently. While working on YARN-4511, realized we could improve it a little more: # Remove volatile variables - don't see the need for them being volatile # Some methods end up doing very similar things, so consolidating them # Renaming totalResource to capacity. YARN-4511 plans to add inflatedCapacity to include the un-utilized resources, and having two totals can be a little confusing.
[jira] [Commented] (YARN-1297) Miscellaneous Fair Scheduler speedups
[ https://issues.apache.org/jira/browse/YARN-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192547#comment-15192547 ] Yufei Gu commented on YARN-1297: Thanks, [~asuresh]. > Miscellaneous Fair Scheduler speedups > - > > Key: YARN-1297 > URL: https://issues.apache.org/jira/browse/YARN-1297 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler > Reporter: Sandy Ryza > Assignee: Yufei Gu > Attachments: YARN-1297-1.patch, YARN-1297-2.patch, YARN-1297.3.patch, > YARN-1297.4.patch, YARN-1297.4.patch, YARN-1297.patch, YARN-1297.patch > > > I ran the Fair Scheduler's core scheduling loop through a profiler tool and > identified a bunch of minimally invasive changes that can shave off a few > milliseconds. > The main one is demoting a couple of INFO log messages to DEBUG, which brought > my benchmark down from 16000 ms to 6000 ms. > A few others (which had far less of an impact) were: > * Most of the time in comparisons was being spent in Math.signum. I switched > this to direct ifs and elses, which halved the percent of time spent in > comparisons. > * I removed some unnecessary instantiations of Resource objects. > * I made it so that queues' usage wasn't calculated from the applications up > each time getResourceUsage was called.
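The Math.signum point in the description above is easy to demonstrate: the signum-based comparator pays for a long-to-float conversion (and risks overflow in the subtraction), while direct branches do neither. A sketch of the transformation, not the actual FairScheduler comparator:

```java
public class CompareSpeedup {
    // Before: pays for a float conversion and a signum call,
    // and (a - b) can overflow for extreme long values.
    static int compareWithSignum(long a, long b) {
        return (int) Math.signum(a - b);
    }

    // After: plain branches, no floating point, no overflow.
    static int compareDirect(long a, long b) {
        if (a < b) {
            return -1;
        }
        if (a > b) {
            return 1;
        }
        return 0;
    }
}
```

The two agree for typical inputs; the direct version is also what `Long.compare` does in the JDK.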
[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long
[ https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192416#comment-15192416 ] Karthik Kambatla commented on YARN-4807: Actually, the problem seems worse. We have multiple {{waitForState}} methods in MockRM - for app, appattempt, container. There is some duplication there, but what is worse, each method has its own sleep time and cumulative wait time. It would be nice to use a constant for the sleep time instead of hard-coded values. If it is not a problem, I would like us to start with a really small value like 10 ms; it seems to be working for one of the {{waitForState}} methods in MockRM. Regarding removing the MockAM#waitForState methods altogether, I am not sure that would be straightforward or desired: I have seen test cases where there is no reference to an RM or MockRM.
[jira] [Updated] (YARN-4795) ContainerMetrics drops records
[ https://issues.apache.org/jira/browse/YARN-4795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton updated YARN-4795: --- Attachment: YARN-4795.001.patch > ContainerMetrics drops records > -- > > Key: YARN-4795 > URL: https://issues.apache.org/jira/browse/YARN-4795 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager > Affects Versions: 2.9.0 > Reporter: Daniel Templeton > Assignee: Daniel Templeton > Attachments: YARN-4795.001.patch > > > The metrics2 system was implemented to deal with persistent sources. > {{ContainerMetrics}} is an ephemeral source, and so it causes problems. > Specifically, the {{ContainerMetrics}} only reports metrics once after the > container has been stopped. This behavior is a problem because the metrics2 > system can ask sources for reports that will be quietly dropped by the sinks > that care. (It's a metrics2 feature, not a bug.) If that final report is > silently dropped, it's lost, because the {{ContainerMetrics}} won't report > anything else ever anymore.
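The failure mode described in this issue can be sketched with a toy ephemeral source: if the final snapshot is offered exactly once, a single dropped report loses it forever, whereas re-offering it on every collection cycle until the source is unregistered does not. Names below are illustrative, not the metrics2 API.

```java
public class ContainerMetricsSketch {
    private boolean finished = false;
    private String finalSnapshot = null;
    private boolean emitted = false;

    /** Called when the container stops; records the final snapshot. */
    void markFinished(String snapshot) {
        finished = true;
        finalSnapshot = snapshot;
    }

    /**
     * Fragile behavior described in the issue: the final snapshot is
     * offered exactly once. If the sink drops that one report, the
     * record is gone.
     */
    String reportOnce() {
        if (finished && !emitted) {
            emitted = true;
            return finalSnapshot;
        }
        return null;
    }

    /**
     * Safer alternative: keep offering the final snapshot on every
     * collection cycle until the source is explicitly unregistered,
     * so a dropped report is retried on the next cycle.
     */
    String reportUntilUnregistered() {
        return finished ? finalSnapshot : null;
    }
}
```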
[jira] [Resolved] (YARN-4802) ContainerMetrics drops records
[ https://issues.apache.org/jira/browse/YARN-4802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton resolved YARN-4802. Resolution: Duplicate > ContainerMetrics drops records > -- > > Key: YARN-4802 > URL: https://issues.apache.org/jira/browse/YARN-4802 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager > Affects Versions: 2.9.0 > Reporter: Daniel Templeton > Assignee: Daniel Templeton
[jira] [Resolved] (YARN-4803) ContainerMetrics drops records
[ https://issues.apache.org/jira/browse/YARN-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton resolved YARN-4803. Resolution: Duplicate > ContainerMetrics drops records > -- > > Key: YARN-4803 > URL: https://issues.apache.org/jira/browse/YARN-4803 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager > Affects Versions: 2.9.0 > Reporter: Daniel Templeton > Assignee: Daniel Templeton
[jira] [Resolved] (YARN-4801) ContainerMetrics drops records
[ https://issues.apache.org/jira/browse/YARN-4801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton resolved YARN-4801. Resolution: Duplicate > ContainerMetrics drops records > -- > > Key: YARN-4801 > URL: https://issues.apache.org/jira/browse/YARN-4801 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager > Affects Versions: 2.9.0 > Reporter: Daniel Templeton > Assignee: Daniel Templeton
[jira] [Resolved] (YARN-4796) ContainerMetrics drops records
[ https://issues.apache.org/jira/browse/YARN-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton resolved YARN-4796. Resolution: Duplicate > ContainerMetrics drops records > -- > > Key: YARN-4796 > URL: https://issues.apache.org/jira/browse/YARN-4796 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager > Affects Versions: 2.9.0 > Reporter: Daniel Templeton > Assignee: Daniel Templeton
[jira] [Resolved] (YARN-4800) ContainerMetrics drops records
[ https://issues.apache.org/jira/browse/YARN-4800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton resolved YARN-4800. Resolution: Duplicate > ContainerMetrics drops records > -- > > Key: YARN-4800 > URL: https://issues.apache.org/jira/browse/YARN-4800 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager > Affects Versions: 2.9.0 > Reporter: Daniel Templeton > Assignee: Daniel Templeton
[jira] [Resolved] (YARN-4797) ContainerMetrics drops records
[ https://issues.apache.org/jira/browse/YARN-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton resolved YARN-4797. Resolution: Duplicate > ContainerMetrics drops records > -- > > Key: YARN-4797 > URL: https://issues.apache.org/jira/browse/YARN-4797 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager > Affects Versions: 2.9.0 > Reporter: Daniel Templeton > Assignee: Daniel Templeton
[jira] [Resolved] (YARN-4798) ContainerMetrics drops records
[ https://issues.apache.org/jira/browse/YARN-4798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton resolved YARN-4798. Resolution: Duplicate > ContainerMetrics drops records > -- > > Key: YARN-4798 > URL: https://issues.apache.org/jira/browse/YARN-4798 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager > Affects Versions: 2.9.0 > Reporter: Daniel Templeton > Assignee: Daniel Templeton
[jira] [Resolved] (YARN-4799) ContainerMetrics drops records
[ https://issues.apache.org/jira/browse/YARN-4799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton resolved YARN-4799. Resolution: Duplicate > ContainerMetrics drops records > -- > > Key: YARN-4799 > URL: https://issues.apache.org/jira/browse/YARN-4799 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager > Affects Versions: 2.9.0 > Reporter: Daniel Templeton > Assignee: Daniel Templeton
[jira] [Commented] (YARN-4545) Allow YARN distributed shell to use ATS v1.5 APIs
[ https://issues.apache.org/jira/browse/YARN-4545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15192356#comment-15192356 ] Steve Loughran commented on YARN-4545: -- LGTM, +1 > Allow YARN distributed shell to use ATS v1.5 APIs > - > > Key: YARN-4545 > URL: https://issues.apache.org/jira/browse/YARN-4545 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver > Reporter: Li Lu > Assignee: Li Lu > Attachments: YARN-4545-YARN-4265.001.patch, > YARN-4545-trunk.001.patch, YARN-4545-trunk.002.patch, > YARN-4545-trunk.003.patch, YARN-4545-trunk.004.patch, > YARN-4545-trunk.005.patch, YARN-4545-trunk.006.patch, > YARN-4545-trunk.007.patch, YARN-4545-trunk.008.patch, > YARN-4545-trunk.009.patch > > > We can use YARN distributed shell as a demo for the ATS v1.5 APIs. We need to > allow distributed shell to post data with the ATS v1.5 API if 1.5 is enabled in the > system. We also need to provide a sample plugin to read those data out.
[jira] [Commented] (YARN-4678) Cluster used capacity is > 100 when container reserved
[ https://issues.apache.org/jira/browse/YARN-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192334#comment-15192334 ] Hadoop QA commented on YARN-4678: -
(x) *-1 overall*
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 30s | trunk passed |
| +1 | compile | 0m 25s | trunk passed with JDK v1.8.0_74 |
| +1 | compile | 0m 29s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvnsite | 0m 34s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 1m 6s | trunk passed |
| +1 | javadoc | 0m 20s | trunk passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 26s | trunk passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 29s | the patch passed |
| +1 | compile | 0m 23s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 0m 23s | the patch passed |
| +1 | compile | 0m 26s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 26s | the patch passed |
| -1 | checkstyle | 0m 19s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: patch generated 8 new + 333 unchanged - 1 fixed = 341 total (was 334) |
| +1 | mvnsite | 0m 31s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 14s | the patch passed |
| +1 | javadoc | 0m 18s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 23s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 70m 20s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_74. |
| -1 | unit | 71m 54s | hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
| | | 158m 24s | |
|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesForCSWithPartitions |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesForCSWithPartitions |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
[jira] [Commented] (YARN-4807) MockAM#waitForState sleep duration is too long
[ https://issues.apache.org/jira/browse/YARN-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192270#comment-15192270 ] Sunil G commented on YARN-4807: --- As I see it, {{MockRM#waitForState}} also uses sleep durations of 500ms to 1s in various methods. I think those can be reduced to 200ms. Recently [~rohithsharma] and I worked on a few random test-case failures in CS, and all of them were fixed once we made proper use of {{waitForState}} with the correct timeout. I think 200ms is more than enough, and we can remove the MockAM method and consolidate everything into MockRM. Thoughts? > MockAM#waitForState sleep duration is too long > -- > > Key: YARN-4807 > URL: https://issues.apache.org/jira/browse/YARN-4807 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Yufei Gu > Labels: newbie > > MockAM#waitForState sleep duration (500 ms) is too long. Also, there is > significant duplication with MockRM#waitForState. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
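The consolidation discussed above can be sketched as a generic poll loop with a short interval and an overall deadline. This is an illustrative sketch only — {{WaitForStateSketch}} and {{waitForCondition}} are hypothetical names, not the actual MockRM/MockAM API:

```java
import java.util.function.Supplier;

// Sketch of a consolidated waitForState-style helper: poll with a short
// interval (e.g. 200 ms instead of 500 ms) under one overall timeout,
// so the number of rounds is derived rather than hardcoded per method.
class WaitForStateSketch {
  static boolean waitForCondition(Supplier<Boolean> condition,
      long pollIntervalMs, long timeoutMs) {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!condition.get()) {
      if (System.currentTimeMillis() >= deadline) {
        return false;  // timed out without reaching the desired state
      }
      try {
        Thread.sleep(pollIntervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();  // preserve interrupt status
        return false;
      }
    }
    return true;  // condition became true within the timeout
  }
}
```

With a structure like this, shrinking the poll interval speeds up the common case (state is reached quickly) without changing the worst-case timeout.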
[jira] [Commented] (YARN-4751) In 2.7, Labeled queue usage not shown properly in capacity scheduler UI
[ https://issues.apache.org/jira/browse/YARN-4751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192268#comment-15192268 ] Sunil G commented on YARN-4751: --- Thanks [~eepayne] for pointing out the dependencies. Yes, it comes with a chain of patches. We are trying to discuss in YARN-3216 about getting it ready for the 2.7 line. If we can confirm that, I can make a clean patch with the minimal changes needed to get AM Resource Percent for labels, and it can pick up almost all of the new changes we have made recently. But that is a rather different approach, since we cannot cherry-pick the needed patches, so maintaining the 2.7 line may become complex later. I would like to hear your thoughts on that as well. Coming to this patch, I think we only need to verify the resource usage in the node-label scenario. In trunk, {{TestCapacitySchedulerNodeLabelUpdate}} covers various cases, and it uses {{NullRMNodeLabelsManager}}, so many events won't be fired, which is good for unit tests. Could we consider this test class? > In 2.7, Labeled queue usage not shown properly in capacity scheduler UI > --- > > Key: YARN-4751 > URL: https://issues.apache.org/jira/browse/YARN-4751 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler, yarn >Affects Versions: 2.7.3 >Reporter: Eric Payne >Assignee: Eric Payne > Attachments: 2.7 CS UI No BarGraph.jpg, > YARH-4752-branch-2.7.001.patch, YARH-4752-branch-2.7.002.patch > > > In 2.6 and 2.7, the capacity scheduler UI does not have the queue graphs > separated by partition. When applications are running on a labeled queue, no > color is shown in the bar graph, and several of the "Used" metrics are zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4678) Cluster used capacity is > 100 when container reserved
[ https://issues.apache.org/jira/browse/YARN-4678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4678: -- Attachment: 0003-YARN-4678.patch Updated the patch by removing reservedCapacity from LeafQueue as well, so it is in line with ParentQueue / root. Since reservedCapacity is already available in ClusterMetrics and in the QueueInfo area, I think this will be fine. [~brahmareddy], could you please confirm? > Cluster used capacity is > 100 when container reserved > --- > > Key: YARN-4678 > URL: https://issues.apache.org/jira/browse/YARN-4678 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Sunil G > Attachments: 0001-YARN-4678.patch, 0002-YARN-4678.patch, > 0003-YARN-4678.patch > > > *Scenario:* > * Start a cluster with three NMs, each having 8GB (cluster memory: 24GB). > * Configure queues with elasticity and userlimitfactor=10. > * Disable preemption. > * Run two jobs with different priorities in different queues at the same time: > ** yarn jar hadoop-mapreduce-examples-2.7.2.jar pi -Dyarn.app.priority=LOW > -Dmapreduce.job.queuename=QueueA -Dmapreduce.map.memory.mb=4096 > -Dyarn.app.mapreduce.am.resource.mb=1536 > -Dmapreduce.job.reduce.slowstart.completedmaps=1.0 10 1 > ** yarn jar hadoop-mapreduce-examples-2.7.2.jar pi -Dyarn.app.priority=HIGH > -Dmapreduce.job.queuename=QueueB -Dmapreduce.map.memory.mb=4096 > -Dyarn.app.mapreduce.am.resource.mb=1536 3 1 > * Observe the used cluster capacity in the RM web UI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
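The effect described in this issue can be illustrated with simple arithmetic. This is a simplified model with hypothetical numbers, not the actual CapacityScheduler code: if reserved memory is folded into "used" memory, the reported ratio against cluster capacity can exceed 100%.

```java
// Simplified model: when a reservation is counted as "used", the
// used-capacity percentage can exceed 100% even though no node is
// actually over-allocated.
class ReservedCapacitySketch {
  static double usedCapacityPercent(long allocatedMb, long reservedMb,
      long clusterMb) {
    // Folding reservedMb into the numerator is what pushes the
    // percentage past 100 when the cluster is nearly full.
    return 100.0 * (allocatedMb + reservedMb) / clusterMb;
  }
}
```

For example, with a 24GB cluster that is almost fully allocated, a single reserved 4GB container makes the reported figure exceed 100%, which is why the patch drops reservedCapacity from the queue's used-capacity computation.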
[jira] [Commented] (YARN-4781) Support intra-queue preemption for fairness ordering policy.
[ https://issues.apache.org/jira/browse/YARN-4781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192255#comment-15192255 ] Sunil G commented on YARN-4781: --- Thanks [~leftnoteasy] for sharing more thoughts on this. IIUC, this will mainly address imbalances in resource usage within a queue when FairOrderingPolicy is used, correct? I think this will be very helpful. Perhaps the existing PCPP will take care of identifying these imbalances, or a new SchedulerMonitor will be created for it. If a new SchedulerMonitor is planned, YARN-2009 can also make use of it. Thoughts? > Support intra-queue preemption for fairness ordering policy. > > > Key: YARN-4781 > URL: https://issues.apache.org/jira/browse/YARN-4781 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > > We introduced the fairness queue policy in YARN-3319, which lets large > applications make progress without starving small applications. However, if a > large application takes the queue's resources, and the containers of the large > app have long lifespans, small applications could still wait a long time for > resources, and SLAs cannot be guaranteed. > Instead of waiting for applications to release resources on their own, we need to > preempt resources in queues with the fairness policy enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928
[ https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192239#comment-15192239 ] Hadoop QA commented on YARN-4712: -
(x) *-1 overall*
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 21s | YARN-2928 passed |
| +1 | compile | 0m 23s | YARN-2928 passed with JDK v1.8.0_74 |
| +1 | compile | 0m 26s | YARN-2928 passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 16s | YARN-2928 passed |
| +1 | mvnsite | 0m 30s | YARN-2928 passed |
| +1 | mvneclipse | 0m 14s | YARN-2928 passed |
| +1 | findbugs | 0m 56s | YARN-2928 passed |
| +1 | javadoc | 0m 17s | YARN-2928 passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 22s | YARN-2928 passed with JDK v1.7.0_95 |
| +1 | mvninstall | 0m 23s | the patch passed |
| +1 | compile | 0m 20s | the patch passed with JDK v1.8.0_74 |
| +1 | javac | 0m 20s | the patch passed |
| +1 | compile | 0m 23s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 0m 23s | the patch passed |
| -1 | checkstyle | 0m 13s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: patch generated 1 new + 22 unchanged - 2 fixed = 23 total (was 24) |
| +1 | mvnsite | 0m 26s | the patch passed |
| +1 | mvneclipse | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 1m 4s | the patch passed |
| +1 | javadoc | 0m 15s | the patch passed with JDK v1.8.0_74 |
| +1 | javadoc | 0m 19s | the patch passed with JDK v1.7.0_95 |
| -1 | unit | 8m 34s | hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_74. |
| -1 | unit | 9m 6s | hadoop-yarn-server-nodemanager in the patch failed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 17s | Patch does not generate ASF License warnings. |
| | | 34m 30s | |
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12793186/YARN-4712-YARN-2928.v1.005.patch |
| JIRA Issue | YARN-4712 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 6d596507dcba 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Updated] (YARN-4712) CPU Usage Metric is not captured properly in YARN-2928
[ https://issues.apache.org/jira/browse/YARN-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4712: Attachment: YARN-4712-YARN-2928.v1.005.patch Hi [~sjlee0] & [~varun_saxena], Considering the discussions so far, {{cpuUsagePercentPerCore}} would be ideal for aggregation. I have uploaded a new patch doing the same. If anything else needs to be captured, please let me know. > CPU Usage Metric is not captured properly in YARN-2928 > -- > > Key: YARN-4712 > URL: https://issues.apache.org/jira/browse/YARN-4712 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > Attachments: YARN-4712-YARN-2928.v1.001.patch, > YARN-4712-YARN-2928.v1.002.patch, YARN-4712-YARN-2928.v1.003.patch, > YARN-4712-YARN-2928.v1.004.patch, YARN-4712-YARN-2928.v1.005.patch > > > There are 2 issues with CPU usage collection: > * I observed that, many times, the CPU usage obtained from > {{pTree.getCpuUsagePercent()}} is > ResourceCalculatorProcessTree.UNAVAILABLE (i.e. -1), but ContainersMonitor still does > the calculation, i.e. {{cpuUsageTotalCoresPercentage = cpuUsagePercentPerCore > /resourceCalculatorPlugin.getNumProcessors()}}, because of which the UNAVAILABLE > check in {{NMTimelinePublisher.reportContainerResourceUsage}} is never > triggered. Proper checks need to be added. > * {{EntityColumnPrefix.METRIC}} always uses LongConverter, but > ContainerMonitor publishes decimal values for the CPU usage. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
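The guard described in the issue can be sketched as follows. This is an illustrative sketch under stated assumptions — {{CpuUsageSketch}} and {{totalCoresPercentage}} are hypothetical names, not the actual ContainersMonitor code; only the UNAVAILABLE sentinel (-1) comes from the source:

```java
// Sketch: propagate the UNAVAILABLE sentinel instead of dividing it by
// the processor count, so a downstream UNAVAILABLE check still fires.
class CpuUsageSketch {
  static final float UNAVAILABLE = -1.0f;  // mirrors ResourceCalculatorProcessTree.UNAVAILABLE

  static float totalCoresPercentage(float cpuUsagePercentPerCore,
      int numProcessors) {
    if (cpuUsagePercentPerCore == UNAVAILABLE || numProcessors <= 0) {
      return UNAVAILABLE;  // no data: don't compute -1 / numProcessors
    }
    return cpuUsagePercentPerCore / numProcessors;
  }
}
```

Without the guard, -1 divided by the core count yields a small negative fraction that no longer equals -1, which is exactly why the UNAVAILABLE check downstream never matches.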
[jira] [Commented] (YARN-4805) Don't go through all schedulers in ParameterizedTestBase
[ https://issues.apache.org/jira/browse/YARN-4805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192221#comment-15192221 ] Karthik Kambatla commented on YARN-4805: Comparing against earlier builds from today, this change seems to shave 22 minutes off the build. [~jianhe] - will you be able to review this? Thanks. > Don't go through all schedulers in ParameterizedTestBase > > > Key: YARN-4805 > URL: https://issues.apache.org/jira/browse/YARN-4805 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: yarn-4805-1.patch > > > ParameterizedSchedulerTestBase was created to make sure tests that were > written with CapacityScheduler in mind don't fail when run against > FairScheduler. Before this was introduced, tests would fail because > FairScheduler requires an allocation file. > However, the tests that extend it take about 10 minutes per scheduler. So, > instead of running every test against both schedulers, we could set up the > configured scheduler appropriately so the tests pass with either one. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4796) ContainerMetrics drops records
[ https://issues.apache.org/jira/browse/YARN-4796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192215#comment-15192215 ] Varun Vasudev commented on YARN-4796: - [~templedf] - I think this got filed twice by mistake. YARN-4795 seems to be the same issue. > ContainerMetrics drops records > -- > > Key: YARN-4796 > URL: https://issues.apache.org/jira/browse/YARN-4796 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.9.0 >Reporter: Daniel Templeton >Assignee: Daniel Templeton > > The metrics2 system was implemented to deal with persistent sources. > {{ContainerMetrics}} is an ephemeral source, and so it causes problems. > Specifically, the {{ContainerMetrics}} only reports metrics once after the > container has been stopped. This behavior is a problem because the metrics2 > system can ask sources for reports that will be quietly dropped by the sinks > that care. (It's a metrics2 feature, not a bug.) If that final report is > silently dropped, it's lost, because the {{ContainerMetrics}} won't report > anything else ever anymore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3122) Metrics for container's actual CPU usage
[ https://issues.apache.org/jira/browse/YARN-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192201#comment-15192201 ] Naganarasimha G R commented on YARN-3122: - Any thoughts on the above comment, [~adhoot] & [~kasha]? Is it a bug? > Metrics for container's actual CPU usage > > > Key: YARN-3122 > URL: https://issues.apache.org/jira/browse/YARN-3122 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Anubhav Dhoot >Assignee: Anubhav Dhoot > Fix For: 2.7.0 > > Attachments: YARN-3122.001.patch, YARN-3122.002.patch, > YARN-3122.003.patch, YARN-3122.004.patch, YARN-3122.005.patch, > YARN-3122.006.patch, YARN-3122.007.patch, YARN-3122.prelim.patch, > YARN-3122.prelim.patch > > > It would be nice to capture resource usage per container, for a variety of > reasons. This JIRA is to track CPU usage. > YARN-2965 tracks the resource usage on the node, and the two implementations > should reuse code as much as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)