[jira] [Updated] (YARN-3508) Preemption processing occurring on the main RM dispatcher
[ https://issues.apache.org/jira/browse/YARN-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3508: --- Attachment: (was: YARN-3508.01.patch) Preemption processing occurring on the main RM dispatcher Key: YARN-3508 URL: https://issues.apache.org/jira/browse/YARN-3508 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Varun Saxena We recently saw the RM for a large cluster lag far behind on the AsyncDispatcher event queue. The AsyncDispatcher thread was consistently blocked on the highly-contended CapacityScheduler lock trying to dispatch preemption-related events for RMContainerPreemptEventDispatcher. Preemption processing should occur on the scheduler event dispatcher thread or a separate thread to avoid delaying the processing of other events in the primary dispatcher queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
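For illustration only, a minimal sketch of the decoupling the description asks for: hand preemption work to its own single-threaded executor so a handler blocked on the scheduler lock cannot back up the primary dispatcher queue. The class and method names below are hypothetical, not the actual YARN-3508 patch.

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hedged sketch: route preemption work onto a dedicated thread instead of
// processing it on the shared AsyncDispatcher thread. Names are illustrative.
public class PreemptionEventDispatcher {
  private final ExecutorService preemptionExecutor =
      Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "preemption-event-handler");
        t.setDaemon(true);
        return t;
      });

  // Called from the main dispatcher: enqueue and return immediately, rather
  // than taking the contended CapacityScheduler lock on the dispatcher thread.
  public void dispatch(Runnable preemptionWork) {
    preemptionExecutor.execute(preemptionWork);
  }

  public void stop() {
    preemptionExecutor.shutdownNow();
  }
}
{code}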
[jira] [Commented] (YARN-3706) Generalize native HBase writer for additional tables
[ https://issues.apache.org/jira/browse/YARN-3706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563824#comment-14563824 ] Joep Rottinghuis commented on YARN-3706: Just realized an additional problem. If we use the same separator character for compound column qualifiers (e!eventId!eventInfoKey) as well as the separator for values (relatedToKey-id7!id8!id9) then we have problems recognizing what is part of the key and what is the value. I think I have to introduce a value separator and add an additional argument to the cleanse method for this. Generalize native HBase writer for additional tables Key: YARN-3706 URL: https://issues.apache.org/jira/browse/YARN-3706 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Joep Rottinghuis Assignee: Joep Rottinghuis Priority: Minor Attachments: YARN-3706-YARN-2928.001.patch When reviewing YARN-3411 we noticed that we could change the class hierarchy a little in order to accommodate additional tables easily. In order to get ready for benchmark testing we left the original layout in place, as performance would not be impacted by the code hierarchy. Here is a separate jira to address the hierarchy. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
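A small, hedged illustration of the ambiguity described in the comment: if one character delimits both the parts of a compound column qualifier and the items of a multi-valued cell, a cleanse step cannot tell structure from user data, whereas a distinct value separator (plus a separator argument to a cleanse-like method) keeps the two apart. The separator characters and the cleanse method here are stand-ins, not the actual YARN-3706 code.

{code:java}
// Hedged sketch only: '!' as the key separator and '?' as an illustrative
// value separator; "cleanse" is a stand-in for the method mentioned above.
public class SeparatorDemo {
  private static final String KEY_SEP = "!";
  private static final String VALUE_SEP = "?";

  // Strip a given separator from user-supplied data so it cannot fake structure;
  // the extra argument is the point: key parts and value items are cleansed
  // against different separators.
  static String cleanse(String raw, String separator) {
    return raw.replace(separator, "");
  }

  public static void main(String[] args) {
    String qualifier = String.join(KEY_SEP,
        "e", cleanse("eventId", KEY_SEP), cleanse("eventInfoKey", KEY_SEP));
    // Value items may still legally contain '!' without being mistaken for key parts.
    String value = String.join(VALUE_SEP,
        cleanse("id7", VALUE_SEP), cleanse("id8", VALUE_SEP), cleanse("id9", VALUE_SEP));
    System.out.println(qualifier + " -> " + value);
  }
}
{code}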
[jira] [Updated] (YARN-3508) Preemption processing occurring on the main RM dispatcher
[ https://issues.apache.org/jira/browse/YARN-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3508: --- Attachment: YARN-3508.01.patch Preemption processing occurring on the main RM dispatcher Key: YARN-3508 URL: https://issues.apache.org/jira/browse/YARN-3508 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Varun Saxena Attachments: YARN-3508.01.patch We recently saw the RM for a large cluster lag far behind on the AsyncDispatcher event queue. The AsyncDispatcher thread was consistently blocked on the highly-contended CapacityScheduler lock trying to dispatch preemption-related events for RMContainerPreemptEventDispatcher. Preemption processing should occur on the scheduler event dispatcher thread or a separate thread to avoid delaying the processing of other events in the primary dispatcher queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3736) Persist the Plan information, i.e. accepted reservations to the RMStateStore for failover
Subru Krishnan created YARN-3736: Summary: Persist the Plan information, i.e. accepted reservations to the RMStateStore for failover Key: YARN-3736 URL: https://issues.apache.org/jira/browse/YARN-3736 Project: Hadoop YARN Issue Type: Sub-task Reporter: Subru Krishnan Assignee: Anubhav Dhoot We need to persist the current state of the plan, i.e. the accepted ReservationAllocations and corresponding RLESparseResourceAllocations to the RMStateStore so that we can recover them on RM failover. This involves making all the reservation system data structures protobuf friendly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
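As a rough sketch of the kind of per-reservation state the description says must become protobuf friendly, the following value class lists plausible fields; every name here is hypothetical, and the real ReservationAllocation/RLESparseResourceAllocation classes carry more detail than this.

{code:java}
import java.util.NavigableMap;
import java.util.TreeMap;

// Hedged sketch only: the shape of the state a protobuf-friendly record for an
// accepted reservation would need to capture before it can go into the RMStateStore.
public class ReservationAllocationState {
  String planName;            // plan (reservable queue) the reservation belongs to
  String reservationId;       // e.g. reservation_<cluster-ts>_<seq>
  String user;
  long acceptanceTime;        // when the reservation was accepted into the Plan
  long startTime;
  long endTime;
  // Run-length-encoded allocation: interval start time -> {memory MB, vcores},
  // mirroring what an RLE sparse resource allocation represents.
  NavigableMap<Long, int[]> resourcesOverTime = new TreeMap<>();
}
{code}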
[jira] [Updated] (YARN-3737) Add support for recovery of reserved apps (running under dynamic queues) to Fair Scheduler
[ https://issues.apache.org/jira/browse/YARN-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-3737: - Component/s: (was: capacityscheduler) Description: YARN-3736 persists the current state of the Plan to the RMStateStore. This JIRA covers recovery of the Plan, i.e. dynamic reservation queues with associated apps, as part of the Fair Scheduler failover mechanism. (was: We need to persist the current state of the plan, i.e. the accepted ReservationAllocations and corresponding RLESparseResourceAllocations to the RMStateStore so that we can recover them on RM failover. This involves making all the reservation system data structures protobuf friendly.) Summary: Add support for recovery of reserved apps (running under dynamic queues) to Fair Scheduler (was: Persist the Plan information, i.e. accepted reservations to the RMStateStore for failover) Add support for recovery of reserved apps (running under dynamic queues) to Fair Scheduler -- Key: YARN-3737 URL: https://issues.apache.org/jira/browse/YARN-3737 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler, resourcemanager Reporter: Subru Krishnan Assignee: Anubhav Dhoot YARN-3736 persists the current state of the Plan to the RMStateStore. This JIRA covers recovery of the Plan, i.e. dynamic reservation queues with associated apps, as part of the Fair Scheduler failover mechanism. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3733) On RM restart AM getting more than maximum possible memory when many tasks in queue
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-3733: - Attachment: YARN-3733.patch Attached the patch fixing the issue. Kindly review the patch. On RM restart AM getting more than maximum possible memory when many tasks in queue - Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 SP3, 2 NM, 2 RM; one NM - 3 GB, 6 vcores Reporter: Bibin A Chundatt Assignee: Rohith Priority: Blocker Attachments: YARN-3733.patch Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent tasks 5. Switch RM Actual = For 12 jobs AMs get allocated and all 12 start running. No other YARN child is initiated; *all 12 jobs stay in RUNNING state forever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
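A short worked calculation behind the "only 6 should be running" expectation above, using the reporter's numbers only; this mirrors the expectation stated in the report, not the CapacityScheduler's actual accounting code.

{code:java}
// Hedged back-of-the-envelope for the expected AM count in the steps above.
public class AmLimitMath {
  public static void main(String[] args) {
    int clusterMemMb = 2 * 3072;               // 2 NMs x 3072 MB = 6144 MB
    double maxAmResourcePercent = 0.5;         // configured AM limit
    int amContainerMb = 512;                   // map/reduce/AM size used in the test

    int amHeadroomMb = (int) (clusterMemMb * maxAmResourcePercent);  // 3072 MB
    int expectedConcurrentAms = amHeadroomMb / amContainerMb;        // 6 AMs
    System.out.println("AM headroom = " + amHeadroomMb + " MB, expected AMs = "
        + expectedConcurrentAms);  // the report observes 12 AMs after the RM switch
  }
}
{code}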
[jira] [Created] (YARN-3739) Add recovery of reservation system to RM failover process
Subru Krishnan created YARN-3739: Summary: Add recovery of reservation system to RM failover process Key: YARN-3739 URL: https://issues.apache.org/jira/browse/YARN-3739 Project: Hadoop YARN Issue Type: Sub-task Reporter: Subru Krishnan Assignee: Subru Krishnan YARN-1051 introduced a reservation system in the YARN RM. This JIRA tracks the recovery of the reservation system in case of a RM failover. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2556) Tool to measure the performance of the timeline server
[ https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563720#comment-14563720 ] Sangjin Lee commented on YARN-2556: --- [~lichangleo], thanks much for the latest patch! I feel that it's real close. I like that it's much more modular and can support v2 more easily. I do have a few more comments and suggestions. (TimelineServicePerformance.java) - l.35-36: there should be no timeline type imports here; unused imports? - l.201-212: it's an oversight, but this belongs in the SimpleEntityWriter class, and should be moved there (JobHistoryFileReplayMapperV1.java) - Can you refactor this as much as possible so that v1 and v2 do not duplicate any shared code? For example, JobFiles, constants, and some operations inside map() are clearly common between v1 and v2. It might involve extracting some common (helper) methods. - l.162: We found this with the v2 code, but there is a bug here: it is possible that the JobFiles instance may not have both the jobhistory file and the configuration file. We should skip processing the JobFiles instance if either is null. Tool to measure the performance of the timeline server -- Key: YARN-2556 URL: https://issues.apache.org/jira/browse/YARN-2556 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Chang Li Labels: BB2015-05-TBR Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, YARN-2556.1.patch, YARN-2556.10.patch, YARN-2556.11.patch, YARN-2556.2.patch, YARN-2556.3.patch, YARN-2556.4.patch, YARN-2556.5.patch, YARN-2556.6.patch, YARN-2556.7.patch, YARN-2556.8.patch, YARN-2556.9.patch, YARN-2556.patch, yarn2556.patch, yarn2556.patch, yarn2556_wip.patch We need to be able to understand the capacity model for the timeline server to give users the tools they need to deploy a timeline server with the correct capacity. I propose we create a mapreduce job that can measure timeline server write and read performance. Transactions per second, I/O for both read and write would be a good start. This could be done as an example or test job that could be tied into gridmix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
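A hedged sketch of the guard suggested in the review comment at l.162: skip a JobFiles instance unless both the job history file and the job configuration file are present. The field and method names below are assumptions about the patch's helper class, not quoted from it.

{code:java}
import org.apache.hadoop.fs.Path;

// Hedged sketch only: bail out early instead of hitting an NPE later in map().
public class JobFilesGuard {
  static class JobFiles {            // stand-in for the mapper's helper class
    Path jobHistoryFilePath;
    Path jobConfFilePath;
  }

  static boolean shouldProcess(JobFiles job) {
    // Either file can legitimately be missing while the scan is in progress,
    // so only process instances that have both.
    return job != null
        && job.jobHistoryFilePath != null
        && job.jobConfFilePath != null;
  }
}
{code}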
[jira] [Updated] (YARN-3716) Node-label-expression should be included by ResourceRequestPBImpl.toString
[ https://issues.apache.org/jira/browse/YARN-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3716: - Target Version/s: 2.8.0 Summary: Node-label-expression should be included by ResourceRequestPBImpl.toString (was: Output node label expression in ResourceRequestPBImpl.toString) Node-label-expression should be included by ResourceRequestPBImpl.toString -- Key: YARN-3716 URL: https://issues.apache.org/jira/browse/YARN-3716 Project: Hadoop YARN Issue Type: Sub-task Components: api Reporter: Xianyin Xin Assignee: Xianyin Xin Priority: Minor Attachments: YARN-3716.001.patch It's convenient for debug and log trace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3733) On RM restart AM getting more than maximum possible memory when many tasks in queue
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563769#comment-14563769 ] Hadoop QA commented on YARN-3733: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 14s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 28s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 53s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 33s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 1m 56s | Tests failed in hadoop-yarn-common. | | | | 40m 21s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.util.resource.TestResourceCalculator | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735960/YARN-3733.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / ae14543 | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8119/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8119/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8119/console | This message was automatically generated. On RM restart AM getting more than maximum possible memory when many tasks in queue - Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 , 2 NM , 2 RM one NM - 3 GB 6 v core Reporter: Bibin A Chundatt Assignee: Rohith Priority: Blocker Attachments: YARN-3733.patch Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent task 5. Switch RM Actual = For 12 Jobs AM gets allocated and all 12 starts running No other Yarn child is initiated , *all 12 Jobs in Running state for ever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3737) Persist the Plan information, i.e. accepted reservations to the RMStateStore for failover
Subru Krishnan created YARN-3737: Summary: Persist the Plan information, i.e. accepted reservations to the RMStateStore for failover Key: YARN-3737 URL: https://issues.apache.org/jira/browse/YARN-3737 Project: Hadoop YARN Issue Type: Sub-task Reporter: Subru Krishnan Assignee: Anubhav Dhoot We need to persist the current state of the plan, i.e. the accepted ReservationAllocations and corresponding RLESparseResourceAllocations to the RMStateStore so that we can recover them on RM failover. This involves making all the reservation system data structures protobuf friendly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3508) Preemption processing occurring on the main RM dispatcher
[ https://issues.apache.org/jira/browse/YARN-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3508: --- Attachment: (was: YARN-3508.01.patch) Preemption processing occurring on the main RM dispatcher Key: YARN-3508 URL: https://issues.apache.org/jira/browse/YARN-3508 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Varun Saxena Attachments: YARN-3508.01.patch We recently saw the RM for a large cluster lag far behind on the AsyncDispatcher event queue. The AsyncDispatcher thread was consistently blocked on the highly-contended CapacityScheduler lock trying to dispatch preemption-related events for RMContainerPreemptEventDispatcher. Preemption processing should occur on the scheduler event dispatcher thread or a separate thread to avoid delaying the processing of other events in the primary dispatcher queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3017) ContainerID in ResourceManager Log Has Slightly Different Format From AppAttemptID
[ https://issues.apache.org/jira/browse/YARN-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563646#comment-14563646 ] Hadoop QA commented on YARN-3017: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 3s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 40s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 49s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 56s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 9s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 21s | Tests passed in hadoop-yarn-api. | | {color:red}-1{color} | yarn tests | 1m 53s | Tests failed in hadoop-yarn-common. | | | | 45m 28s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.logaggregation.TestAggregatedLogsBlock | | | hadoop.yarn.util.TestConverterUtils | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735943/YARN-3017.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 94e7d49 | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8117/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8117/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8117/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8117/console | This message was automatically generated. ContainerID in ResourceManager Log Has Slightly Different Format From AppAttemptID -- Key: YARN-3017 URL: https://issues.apache.org/jira/browse/YARN-3017 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.8.0 Reporter: MUFEED USMAN Priority: Minor Labels: PatchAvailable Attachments: YARN-3017.patch Not sure if this should be filed as a bug or not. In the ResourceManager log in the events surrounding the creation of a new application attempt, ... ... 2014-11-14 17:45:37,258 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1412150883650_0001_02 ... ... The application attempt has the ID format _1412150883650_0001_02. Whereas the associated ContainerID goes by _1412150883650_0001_02_. ... ... 
2014-11-14 17:45:37,260 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_1412150883650_0001_02_01, NodeId: n67:55933, NodeHttpAddress: n67:8042, Resource: memory:2048, vCores:1, disks:0.0, Priority: 0, Token: Token { kind: ContainerToken, service: 10.10.70.67:55933 }, ] for AM appattempt_1412150883650_0001_02 ... ... Curious to know if this is kept like that for a reason. If not, then while using filtering tools to, say, grep events surrounding a specific attempt by the numeric ID part, information may slip out during troubleshooting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
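A small snippet that reproduces the formatting difference the report describes, using the public YARN record classes; the printed forms in the comments are the ones observed in the RM log excerpts above (the attempt number is zero-padded differently in the two toString() representations).

{code:java}
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ContainerId;

// Hedged illustration of the ID-format mismatch discussed in this report.
public class IdFormatDemo {
  public static void main(String[] args) {
    ApplicationId appId = ApplicationId.newInstance(1412150883650L, 1);
    ApplicationAttemptId attemptId = ApplicationAttemptId.newInstance(appId, 2);
    ContainerId containerId = ContainerId.newContainerId(attemptId, 1);

    System.out.println(attemptId);    // appattempt_1412150883650_0001_000002
    System.out.println(containerId);  // container_1412150883650_0001_02_000001
    // The attempt number is padded to 6 digits in one form and 2 in the other,
    // so a grep for one spelling of the attempt can miss the other.
  }
}
{code}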
[jira] [Created] (YARN-3738) Add support for recovery of reserved apps (running under dynamic queues) to Capacity Scheduler
Subru Krishnan created YARN-3738: Summary: Add support for recovery of reserved apps (running under dynamic queues) to Capacity Scheduler Key: YARN-3738 URL: https://issues.apache.org/jira/browse/YARN-3738 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler, resourcemanager Reporter: Subru Krishnan Assignee: Subru Krishnan YARN-3736 persists the current state of the Plan to the RMStateStore. This JIRA covers recovery of the Plan, i.e. dynamic reservation queues with associated apps, as part of the Capacity Scheduler failover mechanism. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3560) Not able to navigate to the cluster from tracking url (proxy) generated after submission of job
[ https://issues.apache.org/jira/browse/YARN-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563661#comment-14563661 ] Hadoop QA commented on YARN-3560: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 6m 12s | Pre-patch trunk has 1 extant Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 33s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 19s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 36s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 6s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | mapreduce tests | 9m 17s | Tests passed in hadoop-mapreduce-client-app. | | | | 27m 15s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735941/YARN-3560.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / 94e7d49 | | Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8118/artifact/patchprocess/trunkFindbugsWarningshadoop-mapreduce-client-app.html | | hadoop-mapreduce-client-app test log | https://builds.apache.org/job/PreCommit-YARN-Build/8118/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8118/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8118/console | This message was automatically generated. Not able to navigate to the cluster from tracking url (proxy) generated after submission of job --- Key: YARN-3560 URL: https://issues.apache.org/jira/browse/YARN-3560 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Anushri Priority: Minor Attachments: YARN-3560.patch, YARN-3560.patch a standalone web proxy server is enabled in the cluster when a job is submitted the url generated contains proxy track this url in the web page , if we try to navigate to the cluster links [about. applications, or scheduler] it gets redirected to some default port instead of actual RM web port configured as such it throws webpage not available -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2556) Tool to measure the performance of the timeline server
[ https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563662#comment-14563662 ] Hadoop QA commented on YARN-2556: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 6m 58s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 6 new or modified test files. | | {color:green}+1{color} | javac | 8m 47s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 21s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 39s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 42s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 37s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 55s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | mapreduce tests | 86m 41s | Tests failed in hadoop-mapreduce-client-jobclient. | | | | 106m 46s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.mapred.join.TestDatamerge | | | hadoop.mapred.TestLazyOutput | | | hadoop.mapred.TestReduceFetch | | | hadoop.mapred.TestJobName | | Timed out tests | org.apache.hadoop.mapreduce.v2.TestMRJobs | | | org.apache.hadoop.mapred.TestMRIntermediateDataEncryption | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735924/YARN-2556.11.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / 246cefa | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/8116/artifact/patchprocess/whitespace.txt | | hadoop-mapreduce-client-jobclient test log | https://builds.apache.org/job/PreCommit-YARN-Build/8116/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8116/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8116/console | This message was automatically generated. Tool to measure the performance of the timeline server -- Key: YARN-2556 URL: https://issues.apache.org/jira/browse/YARN-2556 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Chang Li Labels: BB2015-05-TBR Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, YARN-2556.1.patch, YARN-2556.10.patch, YARN-2556.11.patch, YARN-2556.2.patch, YARN-2556.3.patch, YARN-2556.4.patch, YARN-2556.5.patch, YARN-2556.6.patch, YARN-2556.7.patch, YARN-2556.8.patch, YARN-2556.9.patch, YARN-2556.patch, yarn2556.patch, yarn2556.patch, yarn2556_wip.patch We need to be able to understand the capacity model for the timeline server to give users the tools they need to deploy a timeline server with the correct capacity. I propose we create a mapreduce job that can measure timeline server write and read performance. 
Transactions per second, I/O for both read and write would be a good start. This could be done as an example or test job that could be tied into gridmix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3723) Need to clearly document primaryFilter and otherInfo value type
[ https://issues.apache.org/jira/browse/YARN-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563298#comment-14563298 ] Hudson commented on YARN-3723: -- FAILURE: Integrated in Hadoop-trunk-Commit #7916 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7916/]) YARN-3723. Need to clearly document primaryFilter and otherInfo value (xgong: rev 3077c299da4c5142503c9f92aad4b82349b2fde2) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md Need to clearly document primaryFilter and otherInfo value type --- Key: YARN-3723 URL: https://issues.apache.org/jira/browse/YARN-3723 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical Fix For: 2.7.1 Attachments: YARN-3723.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3585) NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled
[ https://issues.apache.org/jira/browse/YARN-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-3585: - Attachment: YARN-3585.patch NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled -- Key: YARN-3585 URL: https://issues.apache.org/jira/browse/YARN-3585 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Rohith Priority: Critical Attachments: YARN-3585.patch With NM recovery enabled, after decommission, nodemanager log show stop but process cannot end. non daemon thread: {noformat} DestroyJavaVM prio=10 tid=0x7f3460011800 nid=0x29ec waiting on condition [0x] leveldb prio=10 tid=0x7f3354001800 nid=0x2a97 runnable [0x] VM Thread prio=10 tid=0x7f3460167000 nid=0x29f8 runnable Gang worker#0 (Parallel GC Threads) prio=10 tid=0x7f346002 nid=0x29ed runnable Gang worker#1 (Parallel GC Threads) prio=10 tid=0x7f3460022000 nid=0x29ee runnable Gang worker#2 (Parallel GC Threads) prio=10 tid=0x7f3460024000 nid=0x29ef runnable Gang worker#3 (Parallel GC Threads) prio=10 tid=0x7f3460025800 nid=0x29f0 runnable Gang worker#4 (Parallel GC Threads) prio=10 tid=0x7f3460027800 nid=0x29f1 runnable Gang worker#5 (Parallel GC Threads) prio=10 tid=0x7f3460029000 nid=0x29f2 runnable Gang worker#6 (Parallel GC Threads) prio=10 tid=0x7f346002b000 nid=0x29f3 runnable Gang worker#7 (Parallel GC Threads) prio=10 tid=0x7f346002d000 nid=0x29f4 runnable Concurrent Mark-Sweep GC Thread prio=10 tid=0x7f3460120800 nid=0x29f7 runnable Gang worker#0 (Parallel CMS Threads) prio=10 tid=0x7f346011c800 nid=0x29f5 runnable Gang worker#1 (Parallel CMS Threads) prio=10 tid=0x7f346011e800 nid=0x29f6 runnable VM Periodic Task Thread prio=10 tid=0x7f346019f800 nid=0x2a01 waiting on condition {noformat} and jni leveldb thread stack {noformat} Thread 12 (Thread 0x7f33dd842700 (LWP 10903)): #0 0x003d8340b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x7f33dfce2a3b in leveldb::(anonymous namespace)::PosixEnv::BGThreadWrapper(void*) () from /tmp/libleveldbjni-64-1-6922178968300745716.8 #2 0x003d83407851 in start_thread () from /lib64/libpthread.so.0 #3 0x003d830e811d in clone () from /lib64/libc.so.6 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2556) Tool to measure the performance of the timeline server
[ https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563404#comment-14563404 ] Chang Li commented on YARN-2556: [~sjlee0] Thanks so much for review and providing valuable suggestions! I have removed entityWriter from TimelineServicePerformance. For now we are going to have separate simpleEntityWriter and JobHistoryFileReplayMapper mappers files/classes for V1 and V2. Let me know what else I can try to make it better. Thanks! Tool to measure the performance of the timeline server -- Key: YARN-2556 URL: https://issues.apache.org/jira/browse/YARN-2556 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Chang Li Labels: BB2015-05-TBR Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, YARN-2556.1.patch, YARN-2556.10.patch, YARN-2556.11.patch, YARN-2556.2.patch, YARN-2556.3.patch, YARN-2556.4.patch, YARN-2556.5.patch, YARN-2556.6.patch, YARN-2556.7.patch, YARN-2556.8.patch, YARN-2556.9.patch, YARN-2556.patch, yarn2556.patch, yarn2556.patch, yarn2556_wip.patch We need to be able to understand the capacity model for the timeline server to give users the tools they need to deploy a timeline server with the correct capacity. I propose we create a mapreduce job that can measure timeline server write and read performance. Transactions per second, I/O for both read and write would be a good start. This could be done as an example or test job that could be tied into gridmix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563475#comment-14563475 ] Sangjin Lee commented on YARN-3721: --- Another note: I see that there are used but undeclared direct dependencies (indicating we're getting them implicitly via transitive dependencies): {noformat} [WARNING] Used undeclared dependencies found: [WARNING] org.apache.hbase:hbase-common:jar:1.0.1:compile [WARNING] org.apache.hbase:hbase-server:test-jar:tests:1.0.1:compile [WARNING] commons-lang:commons-lang:jar:2.6:compile [WARNING] commons-cli:commons-cli:jar:1.2:compile {noformat} This is not a good practice for several reasons. I'll fix this at some point on our branch, but it doesn't need to be done as part of this JIRA. build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch, YARN-3721-YARN-2928.002.patch, YARN-3721-YARN-2928.002.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3725) App submission via REST API is broken in secure mode due to Timeline DT service address is empty
[ https://issues.apache.org/jira/browse/YARN-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563268#comment-14563268 ] Zhijie Shen commented on YARN-3725: --- [~jeagles], would you please take a look at this jira? App submission via REST API is broken in secure mode due to Timeline DT service address is empty Key: YARN-3725 URL: https://issues.apache.org/jira/browse/YARN-3725 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, timelineserver Affects Versions: 2.7.0 Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Blocker Attachments: YARN-3725.1.patch YARN-2971 changes TimelineClient to use the service address from the Timeline DT to renew the DT instead of the configured address. This breaks the procedure of submitting a YARN app via the REST API in secure mode. The problem is that the service address is set by the client instead of the server in Java code. The REST API response is an encoded token String, so it is inconvenient to deserialize it, set the service address, and serialize it again. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
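For context, a hedged sketch of the client-side dance the description calls inconvenient: decode the token string returned by the REST API, fill in the service address, and re-encode it. The host:port value is a placeholder, and this is not the attached patch.

{code:java}
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.security.token.Token;
import org.apache.hadoop.security.token.TokenIdentifier;

// Hedged sketch only: patch the service address into an encoded delegation token.
public class TimelineTokenServiceFixup {
  public static String withServiceAddress(String encodedToken, String serviceAddr)
      throws IOException {
    Token<TokenIdentifier> token = new Token<>();
    token.decodeFromUrlString(encodedToken);     // string handed back by the REST API
    token.setService(new Text(serviceAddr));     // e.g. "timeline-host:8190" (placeholder)
    return token.encodeToUrlString();            // usable for renewal afterwards
  }
}
{code}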
[jira] [Updated] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)
[ https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2900: -- Target Version/s: (was: 2.6.1) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500) --- Key: YARN-2900 URL: https://issues.apache.org/jira/browse/YARN-2900 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Mit Desai Attachments: YARN-2900-b2.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch Caused by: java.lang.NullPointerException at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118) at org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222) at org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) at org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218) ... 59 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
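One way the NPE in the stack trace above could surface as a 404 rather than a 500 is to guard the lookup in the web layer and raise NotFoundException for a missing application; this is a hedged sketch of that idea, not the attached patch.

{code:java}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.webapp.NotFoundException;

// Hedged sketch only: translate a missing application into a 404 instead of
// letting the null report reach convertToApplicationReport and blow up as a 500.
public class AppLookupGuard {
  public static ApplicationReport requireApp(ApplicationReport report,
                                             ApplicationId appId) {
    if (report == null) {
      throw new NotFoundException("app with id: " + appId + " not found");
    }
    return report;
  }
}
{code}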
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563422#comment-14563422 ] Sangjin Lee commented on YARN-3721: --- I'll commit it if I don't hear from anyone in the next 30 minutes. build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch, YARN-3721-YARN-2928.002.patch, YARN-3721-YARN-2928.002.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2556) Tool to measure the performance of the timeline server
[ https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated YARN-2556: --- Attachment: YARN-2556.11.patch Tool to measure the performance of the timeline server -- Key: YARN-2556 URL: https://issues.apache.org/jira/browse/YARN-2556 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Chang Li Labels: BB2015-05-TBR Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, YARN-2556.1.patch, YARN-2556.10.patch, YARN-2556.11.patch, YARN-2556.2.patch, YARN-2556.3.patch, YARN-2556.4.patch, YARN-2556.5.patch, YARN-2556.6.patch, YARN-2556.7.patch, YARN-2556.8.patch, YARN-2556.9.patch, YARN-2556.patch, yarn2556.patch, yarn2556.patch, yarn2556_wip.patch We need to be able to understand the capacity model for the timeline server to give users the tools they need to deploy a timeline server with the correct capacity. I propose we create a mapreduce job that can measure timeline server write and read performance. Transactions per second, I/O for both read and write would be a good start. This could be done as an example or test job that could be tied into gridmix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563415#comment-14563415 ] Li Lu commented on YARN-3721: - Since YARN-3726 is in, maybe we can commit this one soon? [~zjshen] any further concerns? Thanks! build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch, YARN-3721-YARN-2928.002.patch, YARN-3721-YARN-2928.002.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3723) Need to clearly document primaryFilter and otherInfo value type
[ https://issues.apache.org/jira/browse/YARN-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563282#comment-14563282 ] Xuan Gong commented on YARN-3723: - Committed into trunk/branch-2/branch-2.7. Thanks, zhijie Need to clearly document primaryFilter and otherInfo value type --- Key: YARN-3723 URL: https://issues.apache.org/jira/browse/YARN-3723 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical Fix For: 2.7.1 Attachments: YARN-3723.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3518) default rm/am expire interval should not be less than default resourcemanager connect wait time
[ https://issues.apache.org/jira/browse/YARN-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563322#comment-14563322 ] Hadoop QA commented on YARN-3518: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 13s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 9m 21s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 13m 33s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 30s | The applied patch generated 5 new checkstyle issues (total was 212, now 216). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 2m 2s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 5m 34s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 27s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 7m 14s | Tests passed in hadoop-yarn-client. | | {color:green}+1{color} | yarn tests | 2m 18s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 6m 37s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 68m 54s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735899/YARN-3518.004.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / f1cea9c | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8114/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8114/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/8114/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8114/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8114/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8114/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8114/console | This message was automatically generated. 
default rm/am expire interval should not be less than default resourcemanager connect wait time Key: YARN-3518 URL: https://issues.apache.org/jira/browse/YARN-3518 Project: Hadoop YARN Issue Type: Bug Components: nodemanager, resourcemanager Reporter: sandflee Assignee: sandflee Labels: BB2015-05-TBR, configuration, newbie Attachments: YARN-3518.001.patch, YARN-3518.002.patch, YARN-3518.003.patch, YARN-3518.004.patch Take the AM for example: if the AM can't connect to the RM, then after the AM expires (600s) the RM relaunches the AM, and there will be two AMs at the same time until the resourcemanager connect max wait time (900s) has passed. DEFAULT_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS = 15 * 60 * 1000; DEFAULT_RM_AM_EXPIRY_INTERVAL_MS = 600000; DEFAULT_RM_NM_EXPIRY_INTERVAL_MS = 600000; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
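A small check of the invariant the summary asks for, expressed against the YarnConfiguration constants quoted in the description (connect max wait 15 min, AM/NM expiry 600 s); the snippet only illustrates the mismatch, it is not part of the attached patches.

{code:java}
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Hedged sketch: with the current defaults both checks print false, i.e. an
// expired AM can be relaunched while the old one is still retrying its RM connection.
public class ExpiryVsConnectWait {
  public static void main(String[] args) {
    long connectMaxWaitMs = YarnConfiguration.DEFAULT_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS;
    long amExpiryMs = YarnConfiguration.DEFAULT_RM_AM_EXPIRY_INTERVAL_MS;
    long nmExpiryMs = YarnConfiguration.DEFAULT_RM_NM_EXPIRY_INTERVAL_MS;

    System.out.println("AM expiry >= connect max wait? " + (amExpiryMs >= connectMaxWaitMs));
    System.out.println("NM expiry >= connect max wait? " + (nmExpiryMs >= connectMaxWaitMs));
  }
}
{code}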
[jira] [Commented] (YARN-3723) Need to clearly document primaryFilter and otherInfo value type
[ https://issues.apache.org/jira/browse/YARN-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563272#comment-14563272 ] Xuan Gong commented on YARN-3723: - +1 lgtm. Will commit Need to clearly document primaryFilter and otherInfo value type --- Key: YARN-3723 URL: https://issues.apache.org/jira/browse/YARN-3723 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical Attachments: YARN-3723.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3585) NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled
[ https://issues.apache.org/jira/browse/YARN-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563474#comment-14563474 ] Hadoop QA commented on YARN-3585: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 16m 36s | Findbugs (version ) appears to be broken on trunk. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 9m 13s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 11m 9s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 28s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 29s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 2m 9s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 1m 2s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 1m 35s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 6m 40s | Tests failed in hadoop-yarn-server-nodemanager. | | | | 49m 27s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-server-nodemanager | | Failed unit tests | hadoop.yarn.server.nodemanager.TestNodeStatusUpdaterForLabels | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735922/YARN-3585.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 5df1fad | | Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8115/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8115/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8115/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8115/console | This message was automatically generated. NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled -- Key: YARN-3585 URL: https://issues.apache.org/jira/browse/YARN-3585 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Assignee: Rohith Priority: Critical Attachments: YARN-3585.patch With NM recovery enabled, after decommission, nodemanager log show stop but process cannot end. 
non daemon thread: {noformat} DestroyJavaVM prio=10 tid=0x7f3460011800 nid=0x29ec waiting on condition [0x] leveldb prio=10 tid=0x7f3354001800 nid=0x2a97 runnable [0x] VM Thread prio=10 tid=0x7f3460167000 nid=0x29f8 runnable Gang worker#0 (Parallel GC Threads) prio=10 tid=0x7f346002 nid=0x29ed runnable Gang worker#1 (Parallel GC Threads) prio=10 tid=0x7f3460022000 nid=0x29ee runnable Gang worker#2 (Parallel GC Threads) prio=10 tid=0x7f3460024000 nid=0x29ef runnable Gang worker#3 (Parallel GC Threads) prio=10 tid=0x7f3460025800 nid=0x29f0 runnable Gang worker#4 (Parallel GC Threads) prio=10 tid=0x7f3460027800 nid=0x29f1 runnable Gang worker#5 (Parallel GC Threads) prio=10 tid=0x7f3460029000 nid=0x29f2 runnable Gang worker#6 (Parallel GC Threads) prio=10 tid=0x7f346002b000 nid=0x29f3 runnable Gang worker#7 (Parallel GC Threads) prio=10 tid=0x7f346002d000 nid=0x29f4 runnable Concurrent Mark-Sweep GC Thread prio=10 tid=0x7f3460120800 nid=0x29f7 runnable Gang worker#0 (Parallel CMS Threads) prio=10 tid=0x7f346011c800 nid=0x29f5 runnable Gang worker#1 (Parallel CMS Threads) prio=10 tid=0x7f346011e800 nid=0x29f6 runnable VM Periodic Task Thread prio=10 tid=0x7f346019f800 nid=0x2a01 waiting on condition {noformat} and jni leveldb thread stack {noformat} Thread 12 (Thread 0x7f33dd842700 (LWP 10903)): #0 0x003d8340b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #1 0x7f33dfce2a3b in leveldb::(anonymous namespace)::PosixEnv::BGThreadWrapper(void*) () from /tmp/libleveldbjni-64-1-6922178968300745716.8 #2
[jira] [Updated] (YARN-3017) ContainerID in ResourceManager Log Has Slightly Different Format From AppAttemptID
[ https://issues.apache.org/jira/browse/YARN-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Shahid Khan updated YARN-3017: --- Attachment: YARN-3017.patch Please find the fix in the attached patch. ContainerID in ResourceManager Log Has Slightly Different Format From AppAttemptID -- Key: YARN-3017 URL: https://issues.apache.org/jira/browse/YARN-3017 Project: Hadoop YARN Issue Type: Improvement Reporter: MUFEED USMAN Priority: Minor Attachments: YARN-3017.patch Not sure if this should be filed as a bug or not. In the ResourceManager log in the events surrounding the creation of a new application attempt, ... ... 2014-11-14 17:45:37,258 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1412150883650_0001_02 ... ... The application attempt has the ID format _1412150883650_0001_02, whereas the associated ContainerID goes by _1412150883650_0001_02_. ... ... 2014-11-14 17:45:37,260 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_1412150883650_0001_02_01, NodeId: n67:55933, NodeHttpAddress: n67:8042, Resource: memory:2048, vCores:1, disks:0.0, Priority: 0, Token: Token { kind: ContainerToken, service: 10.10.70.67:55933 }, ] for AM appattempt_1412150883650_0001_02 ... ... Curious to know if this is kept like that for a reason. If not, then while using filtering tools to, say, grep events surrounding a specific attempt by the numeric ID part, information may slip out during troubleshooting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)
[ https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2900: -- Target Version/s: 2.7.1 Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500) --- Key: YARN-2900 URL: https://issues.apache.org/jira/browse/YARN-2900 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Mit Desai Attachments: YARN-2900-b2.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch Caused by: java.lang.NullPointerException at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118) at org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222) at org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) at org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218) ... 59 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2900) Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500)
[ https://issues.apache.org/jira/browse/YARN-2900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563296#comment-14563296 ] Zhijie Shen commented on YARN-2900: --- [~mitdesai], are you still working on this jira? Any luck to find the cause of the problem mentioned before? Application (Attempt and Container) Not Found in AHS results in Internal Server Error (500) --- Key: YARN-2900 URL: https://issues.apache.org/jira/browse/YARN-2900 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Mit Desai Attachments: YARN-2900-b2.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch, YARN-2900.patch Caused by: java.lang.NullPointerException at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.convertToApplicationReport(ApplicationHistoryManagerImpl.java:128) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerImpl.getApplication(ApplicationHistoryManagerImpl.java:118) at org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:222) at org.apache.hadoop.yarn.server.webapp.WebServices$2.run(WebServices.java:219) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) at org.apache.hadoop.yarn.server.webapp.WebServices.getApp(WebServices.java:218) ... 59 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2573) Integrate ReservationSystem with the RM failover mechanism
[ https://issues.apache.org/jira/browse/YARN-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Subru Krishnan updated YARN-2573: - Issue Type: Improvement (was: Sub-task) Parent: (was: YARN-2572) Integrate ReservationSystem with the RM failover mechanism -- Key: YARN-2573 URL: https://issues.apache.org/jira/browse/YARN-2573 Project: Hadoop YARN Issue Type: Improvement Components: capacityscheduler, fairscheduler, resourcemanager Reporter: Subru Krishnan Assignee: Subru Krishnan YARN-1051 introduces the ReservationSystem and the current implementation is completely in-memory based. YARN-149 brings in the notion of RM HA with a highly available state store. This JIRA proposes persisting the Plan into the RMStateStore and recovering it post RM failover -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3560) Not able to navigate to the cluster from tracking url (proxy) generated after submission of job
[ https://issues.apache.org/jira/browse/YARN-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Shahid Khan updated YARN-3560: --- Attachment: YARN-3560.patch The same implementation is already in place. Attaching a patch file that adds a test case for that implementation. Not able to navigate to the cluster from tracking url (proxy) generated after submission of job --- Key: YARN-3560 URL: https://issues.apache.org/jira/browse/YARN-3560 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Anushri Priority: Minor Attachments: YARN-3560.patch, YARN-3560.patch A standalone web proxy server is enabled in the cluster. When a job is submitted, the generated tracking URL goes through the proxy. Opening this URL in a web page and then trying to navigate to the cluster links [about, applications, or scheduler] redirects to some default port instead of the actual configured RM web port, so the browser reports that the web page is not available. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3733) On RM restart AM getting more than maximum possible memory when many tasks in queue
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564218#comment-14564218 ] Devaraj K commented on YARN-3733: - Thanks [~bibinchundatt], [~rohithsharma] and [~sunilg] for reporting and fixing, appreciate your efforts. Some comments on the patch. 1. {code:xml}
+if (Float.isNaN(l) && Float.isNaN(r)) {
+  return 0;
+} else if (Float.isNaN(l)) {
+  return -1;
+} else if (Float.isNaN(r)) {
+  return 1;
+}
+
+// TODO what if both l and r infinity? Should infinity compared? how?
+
{code} Here l and r are getting derived from lhs, rhs and clusterResource which are not infinite. Can we check for lhs/rhs emptiness and compare these before ending up with infinite values? 2. The newly added code is duplicated in two places, can you eliminate the duplicate code? 3. In the Test class, can you add the message for all assertEquals() calls using this API: {code:xml} Assert.assertEquals(String message, expected, actual) {code} On RM restart AM getting more than maximum possible memory when many tasks in queue - Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 , 2 NM , 2 RM one NM - 3 GB 6 v core Reporter: Bibin A Chundatt Assignee: Rohith Priority: Blocker Attachments: YARN-3733.patch Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent task 5. Switch RM Actual = For 12 Jobs AM gets allocated and all 12 starts running No other Yarn child is initiated , *all 12 Jobs in Running state for ever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
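For readers following the review, a minimal sketch of the NaN-aware comparison being discussed, with the '&&' restored; the class and method names are illustrative only and are not the actual YARN-3733 patch:
{code:java}
public final class NanAwareCompare {
  private NanAwareCompare() {}

  // Orders NaN before any real ratio, matching the snippet quoted in the review;
  // Float.compare() then handles +/- infinity consistently, which touches on the TODO.
  public static int compareRatios(float l, float r) {
    if (Float.isNaN(l) && Float.isNaN(r)) {
      return 0;
    } else if (Float.isNaN(l)) {
      return -1;
    } else if (Float.isNaN(r)) {
      return 1;
    }
    return Float.compare(l, r);
  }
}
{code}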
[jira] [Updated] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state
[ https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3705: --- Description: Executing {{rmadmin -transitionToStandby --forcemanual}} in automatic-failover.enabled mode makes ResouceManager standby while keeping the state of ActiveStandbyElector. It should make elector to quit and rejoin in order to enable other candidates to promote, otherwise forcemanual transition should not be allowed in automatic-failover mode in order to avoid confusion. (was: Executing {{rmadmin -transitionToActive --forcemanual}} and {{rmadmin -transitionToActive --forcemanual}} in automatic-failover.enabled mode changes the active/standby state of ResouceManager while keeping the state of ActiveStandbyElector. It should make elector to quit and rejoin otherwise forcemanual transition should not be allowed in automatic-failover mode in order to avoid confusion.) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state Key: YARN-3705 URL: https://issues.apache.org/jira/browse/YARN-3705 Project: Hadoop YARN Issue Type: Sub-task Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Executing {{rmadmin -transitionToStandby --forcemanual}} in automatic-failover.enabled mode makes ResouceManager standby while keeping the state of ActiveStandbyElector. It should make elector to quit and rejoin in order to enable other candidates to promote, otherwise forcemanual transition should not be allowed in automatic-failover mode in order to avoid confusion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3722) Merge multiple TestWebAppUtils into o.a.h.yarn.webapp.util.TestWebAppUtils
[ https://issues.apache.org/jira/browse/YARN-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563955#comment-14563955 ] Masatake Iwasaki commented on YARN-3722: Thanks, [~devaraj.k]. Merge multiple TestWebAppUtils into o.a.h.yarn.webapp.util.TestWebAppUtils -- Key: YARN-3722 URL: https://issues.apache.org/jira/browse/YARN-3722 Project: Hadoop YARN Issue Type: Improvement Components: test Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Fix For: 2.8.0 Attachments: YARN-3722.001.patch The tests in {{o.a.h.yarn.util.TestWebAppUtils}} could be moved to {{o.a.h.yarn.webapp.util.TestWebAppUtils}}. WebAppUtils belongs to {{o.a.h.yarn.webapp.util}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3547) FairScheduler: Apps that have no resource demand should not participate scheduling
[ https://issues.apache.org/jira/browse/YARN-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564039#comment-14564039 ] Xianyin Xin commented on YARN-3547: --- Thanks [~kasha], [~leftnoteasy]. [~kasha], do you have any other concerns on YARN-3547.005.patch? FairScheduler: Apps that have no resource demand should not participate scheduling -- Key: YARN-3547 URL: https://issues.apache.org/jira/browse/YARN-3547 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Xianyin Xin Assignee: Xianyin Xin Attachments: YARN-3547.001.patch, YARN-3547.002.patch, YARN-3547.003.patch, YARN-3547.004.patch, YARN-3547.005.patch At present, all of the 'running' apps participate the scheduling process, however, most of them may have no resource demand on a production cluster, as the app's status is running other than waiting for resource at the most of the app's lifetime. It's not a wise way we sort all the 'running' apps and try to fulfill them, especially on a large-scale cluster which has heavy scheduling load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3528) Tests with 12345 as hard-coded port break jenkins
[ https://issues.apache.org/jira/browse/YARN-3528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564091#comment-14564091 ] Robert Kanter commented on YARN-3528: - I don't think we can simply put 0 everywhere. In some places it looks like we set a config (e.g. NM address) and then some other code actually starts using the port based on the config. In that case, I think we can do something like this to choose a port: {code:java}
public int getPort() throws RuntimeException {
  Random rand = new Random();
  int port = -1;
  int tries = 0;
  while (port == -1) {
    ServerSocket s = null;
    try {
      int tryPort = 49152 + rand.nextInt(65535 - 49152);
      System.out.println("Using port " + tryPort);
      s = new ServerSocket(tryPort);
      port = tryPort;
    } catch (IOException e) {
      tries++;
      if (tries >= 10) {
        System.out.println("Port is already in use; giving up");
        throw new RuntimeException(e);
      } else {
        System.out.println("Port is already in use; trying again");
      }
    } finally {
      IOUtils.closeQuietly(s);
    }
  }
  return port;
}
{code} It's possible for something to steal the port between the time this method finds an open one and the time that we actually start using it, but it's not likely. Thoughts? Tests with 12345 as hard-coded port break jenkins - Key: YARN-3528 URL: https://issues.apache.org/jira/browse/YARN-3528 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0 Environment: ASF Jenkins Reporter: Steve Loughran Assignee: Brahma Reddy Battula Priority: Blocker Labels: test A lot of the YARN tests have hard-coded the port 12345 for their services to come up on. This makes it impossible to have scheduled or precommit tests run consistently on the ASF jenkins hosts. Instead the tests fail regularly and appear to get ignored completely. A quick grep of 12345 shows up many places in the test suite where this practice has developed. * All {{BaseContainerManagerTest}} subclasses * {{TestNodeManagerShutdown}} * {{TestContainerManager}} + others This needs to be addressed through portscanning and dynamic port allocation. Please can someone do this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
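A hedged sketch of the config-then-start pattern mentioned above, assuming a helper like getPort(); the test wiring shown here is illustrative and not taken from an actual YARN test:
{code:java}
import java.util.Random;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class PortPickingExample {
  public static void main(String[] args) {
    Configuration conf = new YarnConfiguration();
    // Single random pick for brevity; see the getPort() retry loop above for robustness.
    int port = 49152 + new Random().nextInt(65535 - 49152);
    conf.set(YarnConfiguration.NM_ADDRESS, "0.0.0.0:" + port);
    System.out.println("NM address set to " + conf.get(YarnConfiguration.NM_ADDRESS));
    // ...the test would then start the NodeManager/ContainerManager that reads this config...
  }
}
{code}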
[jira] [Commented] (YARN-3547) FairScheduler: Apps that have no resource demand should not participate scheduling
[ https://issues.apache.org/jira/browse/YARN-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564128#comment-14564128 ] Karthik Kambatla commented on YARN-3547: Per recent discussion, let us go with the approach in YARN-3547.004.patch? Review comments for 004 patch: # What is the readLock protecting? If it is only runnableApps, we should release the lock as soon as we are done iterating through runnableApps. # Use {{!pending.equals(Resources.none())}} instead of an elaborate check? # We can avoid importing {{Set}} by just using {{TreeSet}}. Since it is all local, that should be fine. FairScheduler: Apps that have no resource demand should not participate scheduling -- Key: YARN-3547 URL: https://issues.apache.org/jira/browse/YARN-3547 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Xianyin Xin Assignee: Xianyin Xin Attachments: YARN-3547.001.patch, YARN-3547.002.patch, YARN-3547.003.patch, YARN-3547.004.patch, YARN-3547.005.patch At present, all of the 'running' apps participate the scheduling process, however, most of them may have no resource demand on a production cluster, as the app's status is running other than waiting for resource at the most of the app's lifetime. It's not a wise way we sort all the 'running' apps and try to fulfill them, especially on a large-scale cluster which has heavy scheduling load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
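To make the second review point concrete, a rough sketch of the suggested demand check; the method and variable names here are made up for illustration and are not the FairScheduler code:
{code:java}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

public class DemandCheckSketch {
  // True only if the app still has outstanding demand; comparing against
  // Resources.none() replaces a more elaborate field-by-field check.
  static boolean hasPendingDemand(Resource demand, Resource consumed) {
    Resource pending = Resources.subtract(demand, consumed);
    return !pending.equals(Resources.none());
  }
}
{code}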
[jira] [Updated] (YARN-3740) Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS
[ https://issues.apache.org/jira/browse/YARN-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-3740: Attachment: YARN-3740.1.patch trivial patch without testcases Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS --- Key: YARN-3740 URL: https://issues.apache.org/jira/browse/YARN-3740 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager, webapp, yarn Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-3740.1.patch YARN-3700 introduces a new configuration named APPLICATION_HISTORY_PREFIX_MAX_APPS, which need be changed to APPLICATION_HISTORY_MAX_APPS. This is not an incompatibility change since YARN-3700 is in 2.8 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3547) FairScheduler: Apps that have no resource demand should not participate scheduling
[ https://issues.apache.org/jira/browse/YARN-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563934#comment-14563934 ] Wangda Tan commented on YARN-3547: -- I'm fine with the approach suggested by [~kasha]. [~xinxianyin], you can figure out with Karthik about how to continue with the patch. FairScheduler: Apps that have no resource demand should not participate scheduling -- Key: YARN-3547 URL: https://issues.apache.org/jira/browse/YARN-3547 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Xianyin Xin Assignee: Xianyin Xin Attachments: YARN-3547.001.patch, YARN-3547.002.patch, YARN-3547.003.patch, YARN-3547.004.patch, YARN-3547.005.patch At present, all of the 'running' apps participate the scheduling process, however, most of them may have no resource demand on a production cluster, as the app's status is running other than waiting for resource at the most of the app's lifetime. It's not a wise way we sort all the 'running' apps and try to fulfill them, especially on a large-scale cluster which has heavy scheduling load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3510) Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness
[ https://issues.apache.org/jira/browse/YARN-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-3510: -- Attachment: YARN-3510.2.patch Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness Key: YARN-3510 URL: https://issues.apache.org/jira/browse/YARN-3510 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Reporter: Craig Welch Assignee: Craig Welch Attachments: YARN-3510.2.patch The ProportionalCapacityPreemptionPolicy preempts as many containers from applications as it can during it's preemption run. For fifo this makes sense, as it is prempting in reverse order therefore maintaining the primacy of the oldest. For fair ordering this does not have the desired effect - instead, it should preempt a number of containers from each application which maintains a fair balance /close to a fair balance between them -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3705) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state
[ https://issues.apache.org/jira/browse/YARN-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3705: --- Summary: forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state (was: forcemanual transition of RM active/standby state in automatic-failover mode should change elector state) forcemanual transitionToStandby in RM-HA automatic-failover mode should change elector state Key: YARN-3705 URL: https://issues.apache.org/jira/browse/YARN-3705 Project: Hadoop YARN Issue Type: Sub-task Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Executing {{rmadmin -transitionToActive --forcemanual}} and {{rmadmin -transitionToActive --forcemanual}} in automatic-failover.enabled mode changes the active/standby state of ResouceManager while keeping the state of ActiveStandbyElector. It should make elector to quit and rejoin otherwise forcemanual transition should not be allowed in automatic-failover mode in order to avoid confusion. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3508) Preemption processing occuring on the main RM dispatcher
[ https://issues.apache.org/jira/browse/YARN-3508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564007#comment-14564007 ] Hadoop QA commented on YARN-3508: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 43s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:red}-1{color} | javac | 9m 9s | The applied patch generated 6 additional warning messages. | | {color:green}+1{color} | javadoc | 11m 16s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 30s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 55s | The applied patch generated 3 new checkstyle issues (total was 53, now 55). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 2m 12s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 39s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 1m 48s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 51m 54s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 96m 10s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-server-resourcemanager | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735991/YARN-3508.01.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 9acd24f | | javac | https://builds.apache.org/job/PreCommit-YARN-Build/8120/artifact/patchprocess/diffJavacWarnings.txt | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8120/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8120/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8120/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8120/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8120/console | This message was automatically generated. Preemption processing occuring on the main RM dispatcher Key: YARN-3508 URL: https://issues.apache.org/jira/browse/YARN-3508 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.6.0 Reporter: Jason Lowe Assignee: Varun Saxena Attachments: YARN-3508.01.patch We recently saw the RM for a large cluster lag far behind on the AsyncDispacher event queue. The AsyncDispatcher thread was consistently blocked on the highly-contended CapacityScheduler lock trying to dispatch preemption-related events for RMContainerPreemptEventDispatcher. 
Preemption processing should occur on the scheduler event dispatcher thread or a separate thread to avoid delaying the processing of other events in the primary dispatcher queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3218) Implement ConcatenatableAggregatedLogFormat Reader and Writer
[ https://issues.apache.org/jira/browse/YARN-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated YARN-3218: Attachment: YARN-3218.002.patch The 002 patch adds the {{ConcatenatableAggregatedLogFormat}} Reader and Writer. It's designed to have a similar API to the {{AggregatedLogFormat}} Reader and Writer to make replacing it easier. Implement ConcatenatableAggregatedLogFormat Reader and Writer - Key: YARN-3218 URL: https://issues.apache.org/jira/browse/YARN-3218 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.8.0 Reporter: Robert Kanter Assignee: Robert Kanter Attachments: YARN-3218.001.patch, YARN-3218.002.patch We need to create a Reader and Writer for the {{ConcatenatableAggregatedLogFormat}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564006#comment-14564006 ] Naganarasimha G R commented on YARN-3044: - Hi [~zjshen], it seems the test case failures are not related to this jira, and the findbugs warning reported is not significant (the earlier code had the same approach); can you please take a look @ the latest patch? /cc [~vrushalic], the test failure in TestHBaseTimelineWriterImpl.testWriteEntityToHBase seems to be related to YARN-3411, can you please have a look? [Event producers] Implement RM writing app lifecycle events to ATS -- Key: YARN-3044 URL: https://issues.apache.org/jira/browse/YARN-3044 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Naganarasimha G R Attachments: YARN-3044-YARN-2928.004.patch, YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, YARN-3044-YARN-2928.007.patch, YARN-3044-YARN-2928.008.patch, YARN-3044-YARN-2928.009.patch, YARN-3044.20150325-1.patch, YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch Per design in YARN-2928, implement RM writing app lifecycle events to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3218) Implement ConcatenatableAggregatedLogFormat Reader and Writer
[ https://issues.apache.org/jira/browse/YARN-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564062#comment-14564062 ] Hadoop QA commented on YARN-3218: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 16s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 34s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 55s | The applied patch generated 21 new checkstyle issues (total was 14, now 35). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 35s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 33s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 1m 58s | Tests passed in hadoop-yarn-common. | | | | 40m 28s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12736010/YARN-3218.002.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 788bfa0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8124/artifact/patchprocess/diffcheckstylehadoop-yarn-common.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8124/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8124/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8124/console | This message was automatically generated. Implement ConcatenatableAggregatedLogFormat Reader and Writer - Key: YARN-3218 URL: https://issues.apache.org/jira/browse/YARN-3218 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: 2.8.0 Reporter: Robert Kanter Assignee: Robert Kanter Attachments: YARN-3218.001.patch, YARN-3218.002.patch We need to create a Reader and Writer for the {{ConcatenatableAggregatedLogFormat}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3510) Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness
[ https://issues.apache.org/jira/browse/YARN-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564107#comment-14564107 ] Hadoop QA commented on YARN-3510: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 34s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 50s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 53s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 47s | The applied patch generated 5 new checkstyle issues (total was 64, now 69). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 1m 29s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 48m 37s | Tests failed in hadoop-yarn-server-resourcemanager. | | | | 87m 45s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-server-resourcemanager | | Failed unit tests | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesFairScheduler | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodeLabels | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12736012/YARN-3510.5.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 788bfa0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8123/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8123/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8123/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8123/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8123/console | This message was automatically generated. 
Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness Key: YARN-3510 URL: https://issues.apache.org/jira/browse/YARN-3510 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Reporter: Craig Welch Assignee: Craig Welch Attachments: YARN-3510.2.patch, YARN-3510.3.patch, YARN-3510.5.patch The ProportionalCapacityPreemptionPolicy preempts as many containers from applications as it can during it's preemption run. For fifo this makes sense, as it is prempting in reverse order therefore maintaining the primacy of the oldest. For fair ordering this does not have the desired effect - instead, it should preempt a number of containers from each application which maintains a fair balance /close to a fair balance between them -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3723) Need to clearly document primaryFilter and otherInfo value type
[ https://issues.apache.org/jira/browse/YARN-3723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563918#comment-14563918 ] Hudson commented on YARN-3723: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #200 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/200/]) YARN-3723. Need to clearly document primaryFilter and otherInfo value (xgong: rev 3077c299da4c5142503c9f92aad4b82349b2fde2) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/TimelineServer.md * hadoop-yarn-project/CHANGES.txt Need to clearly document primaryFilter and otherInfo value type --- Key: YARN-3723 URL: https://issues.apache.org/jira/browse/YARN-3723 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen Priority: Critical Fix For: 2.7.1 Attachments: YARN-3723.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3740) Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS
[ https://issues.apache.org/jira/browse/YARN-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-3740: Labels: (was: new) Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS --- Key: YARN-3740 URL: https://issues.apache.org/jira/browse/YARN-3740 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager, webapp, yarn Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-3740.1.patch YARN-3700 introduces a new configuration named APPLICATION_HISTORY_PREFIX_MAX_APPS, which need be changed to APPLICATION_HISTORY_MAX_APPS. This is not an incompatibility change since YARN-3700 is in 2.8 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3740) Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS
[ https://issues.apache.org/jira/browse/YARN-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-3740: Labels: new (was: ) Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS --- Key: YARN-3740 URL: https://issues.apache.org/jira/browse/YARN-3740 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager, webapp, yarn Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-3740.1.patch YARN-3700 introduces a new configuration named APPLICATION_HISTORY_PREFIX_MAX_APPS, which need be changed to APPLICATION_HISTORY_MAX_APPS. This is not an incompatibility change since YARN-3700 is in 2.8 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3510) Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness
[ https://issues.apache.org/jira/browse/YARN-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-3510: -- Attachment: YARN-3510.3.patch Remove some unnecessary changes to other preemption tests introduced while exploring behavior... Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness Key: YARN-3510 URL: https://issues.apache.org/jira/browse/YARN-3510 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Reporter: Craig Welch Assignee: Craig Welch Attachments: YARN-3510.2.patch, YARN-3510.3.patch The ProportionalCapacityPreemptionPolicy preempts as many containers from applications as it can during it's preemption run. For fifo this makes sense, as it is prempting in reverse order therefore maintaining the primacy of the oldest. For fair ordering this does not have the desired effect - instead, it should preempt a number of containers from each application which maintains a fair balance /close to a fair balance between them -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3733) On RM restart AM getting more than maximum possible memory when many tasks in queue
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-3733: - Attachment: YARN-3733.patch On RM restart AM getting more than maximum possible memory when many tasks in queue - Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 , 2 NM , 2 RM one NM - 3 GB 6 v core Reporter: Bibin A Chundatt Assignee: Rohith Priority: Blocker Attachments: YARN-3733.patch Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent task 5. Switch RM Actual = For 12 Jobs AM gets allocated and all 12 starts running No other Yarn child is initiated , *all 12 Jobs in Running state for ever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3690) 'mvn site' fails on JDK8
[ https://issues.apache.org/jira/browse/YARN-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated YARN-3690: --- Attachment: YARN-3690-002.patch 'mvn site' fails on JDK8 Key: YARN-3690 URL: https://issues.apache.org/jira/browse/YARN-3690 Project: Hadoop YARN Issue Type: Bug Components: api, site Environment: CentOS 7.0, Oracle JDK 8u45. Reporter: Akira AJISAKA Assignee: Brahma Reddy Battula Attachments: YARN-3690-002.patch, YARN-3690-patch 'mvn site' failed by the following error: {noformat} [ERROR] /home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18: error: package org.apache.hadoop.yarn.factories has already been annotated [ERROR] @InterfaceAudience.LimitedPrivate({ MapReduce, YARN }) [ERROR] ^ [ERROR] java.lang.AssertionError [ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126) [ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45) [ERROR] at com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161) [ERROR] at com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215) [ERROR] at com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952) [ERROR] at com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64) [ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876) [ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143) [ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129) [ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512) [ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471) [ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78) [ERROR] at com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186) [ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346) [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219) [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205) [ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64) [ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54) [ERROR] javadoc: error - fatal error [ERROR] [ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc -J-Xmx1024m @options @packages [ERROR] [ERROR] Refer to the generated Javadoc files in '/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir. [ERROR] - [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3740) Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS
[ https://issues.apache.org/jira/browse/YARN-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-3740: Description: YARN-3700 introduces a new configuration named APPLICATION_HISTORY_PREFIX_MAX_APPS, which need be changed to APPLICATION_HISTORY_MAX_APPS. This is not an incompatibility change since YARN-3700 is in 2.8 was: YARN-3700 introduces a new configuration named APPLICATION_HISTORY_PREFIX_MAX_APPS, which need be changed to APPLICATION_HISTORY_MAX_APPS. This is not a incompatibility changes since YARN-3700 is in 2.8 Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS --- Key: YARN-3740 URL: https://issues.apache.org/jira/browse/YARN-3740 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager, webapp, yarn Reporter: Xuan Gong Assignee: Xuan Gong YARN-3700 introduces a new configuration named APPLICATION_HISTORY_PREFIX_MAX_APPS, which need be changed to APPLICATION_HISTORY_MAX_APPS. This is not an incompatibility change since YARN-3700 is in 2.8 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3740) Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS
[ https://issues.apache.org/jira/browse/YARN-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-3740: Summary: Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS (was: fix the type of configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS) Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS --- Key: YARN-3740 URL: https://issues.apache.org/jira/browse/YARN-3740 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager, webapp, yarn Reporter: Xuan Gong Assignee: Xuan Gong YARN-3700 introduces a new configuration named APPLICATION_HISTORY_PREFIX_MAX_APPS, which need be changed to APPLICATION_HISTORY_MAX_APPS. This is not a incompatibility changes since YARN-3700 is in 2.8 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3740) fix the type of configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS
Xuan Gong created YARN-3740: --- Summary: fix the type of configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS Key: YARN-3740 URL: https://issues.apache.org/jira/browse/YARN-3740 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong YARN-3700 introduces a new configuration named APPLICATION_HISTORY_PREFIX_MAX_APPS, which need be changed to APPLICATION_HISTORY_MAX_APPS. This is not a incompatibility changes since YARN-3700 is in 2.8 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3713) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition
[ https://issues.apache.org/jira/browse/YARN-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564031#comment-14564031 ] Robert Kanter commented on YARN-3713: - +1 Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition Key: YARN-3713 URL: https://issues.apache.org/jira/browse/YARN-3713 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Labels: cleanup Attachments: YARN-3713.000.patch remove duplicate function call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition. {{storeContainerDiagnostics}} is already called at ContainerImpl#addDiagnostics. {code}
private void addDiagnostics(String... diags) {
  for (String s : diags) {
    this.diagnostics.append(s);
  }
  try {
    stateStore.storeContainerDiagnostics(containerId, diagnostics);
  } catch (IOException e) {
    LOG.warn("Unable to update diagnostics in state store for " + containerId, e);
  }
}
{code} So we don't need to call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition#transition. {code}
container.addDiagnostics(updateEvent.getDiagnosticsUpdate(), "\n");
try {
  container.stateStore.storeContainerDiagnostics(container.containerId, container.diagnostics);
} catch (IOException e) {
  LOG.warn("Unable to update state store diagnostics for " + container.containerId, e);
}
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
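A rough sketch of the proposed simplification, i.e. the transition body relying on addDiagnostics() alone; this is paraphrased for illustration, see the attached patch for the actual change:
{code:java}
// Inside ContainerDiagnosticsUpdateTransition#transition (sketch):
// addDiagnostics() already persists the diagnostics via
// stateStore.storeContainerDiagnostics(), so the second try/catch block
// quoted above can simply be dropped.
container.addDiagnostics(updateEvent.getDiagnosticsUpdate(), "\n");
{code}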
[jira] [Commented] (YARN-3690) 'mvn site' fails on JDK8
[ https://issues.apache.org/jira/browse/YARN-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564095#comment-14564095 ] Akira AJISAKA commented on YARN-3690: - Thanks [~brahmareddy] for taking this issue. After the patch, mvn site still fails. {code} [ERROR] /home/centos/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factory/providers/package-info.java:18: error: package org.apache.hadoop.yarn.factory.providers has already been annotated [ERROR] @InterfaceAudience.LimitedPrivate({ MapReduce, YARN }) [ERROR] ^ [ERROR] java.lang.AssertionError [ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126) {code} Would you fix org.apache.hadoop.yarn.factory.providers package also? I found the following packages should be fixed as well. * org.apache.hadoop.yarn.util * org.apache.hadoop.yarn.client.api.impl * org.apache.hadoop.yarn.client.api 'mvn site' fails on JDK8 Key: YARN-3690 URL: https://issues.apache.org/jira/browse/YARN-3690 Project: Hadoop YARN Issue Type: Bug Components: api, site Environment: CentOS 7.0, Oracle JDK 8u45. Reporter: Akira AJISAKA Assignee: Brahma Reddy Battula Attachments: YARN-3690-patch 'mvn site' failed by the following error: {noformat} [ERROR] /home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18: error: package org.apache.hadoop.yarn.factories has already been annotated [ERROR] @InterfaceAudience.LimitedPrivate({ MapReduce, YARN }) [ERROR] ^ [ERROR] java.lang.AssertionError [ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126) [ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45) [ERROR] at com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161) [ERROR] at com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215) [ERROR] at com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952) [ERROR] at com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64) [ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876) [ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143) [ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129) [ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512) [ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471) [ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78) [ERROR] at com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186) [ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346) [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219) [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205) [ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64) [ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54) [ERROR] javadoc: error - fatal error [ERROR] [ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc -J-Xmx1024m @options @packages [ERROR] [ERROR] Refer to the generated Javadoc files in '/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir. [ERROR] - [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. 
[ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
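For reference, the package-info.java files named in these errors carry a single package-level audience annotation along the lines of the following (quotes restored; this illustrates what JDK8's javadoc is objecting to, not the fix applied in the patch):
{code:java}
// Illustrative package-info.java; javadoc on JDK8 reports
// "package ... has already been annotated" when it processes such an
// annotated package declaration more than once.
@InterfaceAudience.LimitedPrivate({"MapReduce", "YARN"})
package org.apache.hadoop.yarn.factories;

import org.apache.hadoop.classification.InterfaceAudience;
{code}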
[jira] [Commented] (YARN-3716) Node-label-expression should be included by ResourceRequestPBImpl.toString
[ https://issues.apache.org/jira/browse/YARN-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564001#comment-14564001 ] Hudson commented on YARN-3716: -- FAILURE: Integrated in Hadoop-trunk-Commit #7922 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7922/]) YARN-3716. Node-label-expression should be included by ResourceRequestPBImpl.toString. (Xianyin Xin via wangda) (wangda: rev 788bfa0359c1789fa48f21724a8117fe3fd9e532) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/records/impl/pb/ResourceRequestPBImpl.java Node-label-expression should be included by ResourceRequestPBImpl.toString -- Key: YARN-3716 URL: https://issues.apache.org/jira/browse/YARN-3716 Project: Hadoop YARN Issue Type: Sub-task Components: api Reporter: Xianyin Xin Assignee: Xianyin Xin Priority: Minor Fix For: 2.8.0 Attachments: YARN-3716.001.patch It's convenient for debug and log trace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3510) Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness
[ https://issues.apache.org/jira/browse/YARN-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564110#comment-14564110 ] Hadoop QA commented on YARN-3510: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 26s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 53s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 1s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 49s | The applied patch generated 5 new checkstyle issues (total was 64, now 69). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 40s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 1m 29s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 57m 50s | Tests failed in hadoop-yarn-server-resourcemanager. | | | | 97m 7s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-server-resourcemanager | | Failed unit tests | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesApps | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodeLabels | | | hadoop.yarn.server.resourcemanager.TestRMRestart | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesNodes | | Timed out tests | org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12736012/YARN-3510.5.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 788bfa0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8122/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8122/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8122/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8122/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8122/console | This message was automatically generated. 
Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness Key: YARN-3510 URL: https://issues.apache.org/jira/browse/YARN-3510 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Reporter: Craig Welch Assignee: Craig Welch Attachments: YARN-3510.2.patch, YARN-3510.3.patch, YARN-3510.5.patch The ProportionalCapacityPreemptionPolicy preempts as many containers from applications as it can during it's preemption run. For fifo this makes sense, as it is prempting in reverse order therefore maintaining the primacy of the oldest. For fair ordering this does not have the desired effect - instead, it should preempt a number of containers from each application which maintains a fair balance /close to a fair balance between them -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3733) On RM restart AM getting more than maximum possible memory when many tasks in queue
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-3733: - Attachment: (was: YARN-3733.patch) On RM restart AM getting more than maximum possible memory when many tasks in queue - Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 , 2 NM , 2 RM one NM - 3 GB 6 v core Reporter: Bibin A Chundatt Assignee: Rohith Priority: Blocker Attachments: YARN-3733.patch Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent task 5. Switch RM Actual = For 12 Jobs AM gets allocated and all 12 starts running No other Yarn child is initiated , *all 12 Jobs in Running state for ever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3740) Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS
[ https://issues.apache.org/jira/browse/YARN-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564003#comment-14564003 ] Hadoop QA commented on YARN-3740: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 31s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 47s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 59s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 24s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 34s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 3m 10s | Tests passed in hadoop-yarn-server-applicationhistoryservice. | | | | 45m 22s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735996/YARN-3740.1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 788bfa0 | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8121/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-server-applicationhistoryservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8121/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8121/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8121/console | This message was automatically generated. Fixed the typo with the configuration name: APPLICATION_HISTORY_PREFIX_MAX_APPS --- Key: YARN-3740 URL: https://issues.apache.org/jira/browse/YARN-3740 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager, webapp, yarn Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-3740.1.patch YARN-3700 introduces a new configuration named APPLICATION_HISTORY_PREFIX_MAX_APPS, which need be changed to APPLICATION_HISTORY_MAX_APPS. This is not an incompatibility change since YARN-3700 is in 2.8 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564014#comment-14564014 ] Vrushali C commented on YARN-3044: -- Hi [~Naganarasimha Garla] bq. Test code failure in TestHBaseTimelineWriterImpl.testWriteEntityToHBase seems to be related to YARN-3411, can you please have a look ? yes, this was fixed yesterday in YARN-3726. Could you please rebase your patch to pull in latest changes? thanks Vrushali [Event producers] Implement RM writing app lifecycle events to ATS -- Key: YARN-3044 URL: https://issues.apache.org/jira/browse/YARN-3044 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Naganarasimha G R Attachments: YARN-3044-YARN-2928.004.patch, YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, YARN-3044-YARN-2928.007.patch, YARN-3044-YARN-2928.008.patch, YARN-3044-YARN-2928.009.patch, YARN-3044.20150325-1.patch, YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch Per design in YARN-2928, implement RM writing app lifecycle events to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3510) Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness
[ https://issues.apache.org/jira/browse/YARN-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-3510: -- Attachment: YARN-3510.5.patch Found a couple of little missed things and improved the documentation of the new configuration parameter. Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness Key: YARN-3510 URL: https://issues.apache.org/jira/browse/YARN-3510 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Reporter: Craig Welch Assignee: Craig Welch Attachments: YARN-3510.2.patch, YARN-3510.3.patch, YARN-3510.5.patch The ProportionalCapacityPreemptionPolicy preempts as many containers from applications as it can during its preemption run. For fifo this makes sense, as it is preempting in reverse order, therefore maintaining the primacy of the oldest. For fair ordering this does not have the desired effect - instead, it should preempt a number of containers from each application which maintains a fair balance (or close to a fair balance) between them -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3558) Additional containers getting reserved from RM in case of Fair scheduler
[ https://issues.apache.org/jira/browse/YARN-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564112#comment-14564112 ] Xianyin Xin commented on YARN-3558: --- By checking the log, I think I found the cause. First, note that the pending container requests held by the AM and by the scheduler may not be consistent at a given moment. For example, at time 01'' the app submits a request for 3 containers; the scheduler allocates 2, leaving 1 pending. At time 02'' the AM updates the request with #containers still at 3, and after the request info is updated the AM gets back the 2 allocated containers. But now the pending count is still 3 in AppSchedulingInfo, even though the real outstanding request is 1. In the next heartbeat at 03'' the AM then updates the request to 1 container. If there are many tasks, such inconsistency gets corrected to some extent. However, this is not the only cause of this jira. Near the end of the map tasks, the AM updates the request with 1 container at 15:10:38,606 (in fact this container had already been allocated, but had not yet been fetched by the AM), and the scheduler makes two reservations on two nodes for this container request (containers 19 and 20). However, this 1 container has already been fulfilled, so at 15:10:39,622 the AM updates the request to 0 containers. But during that second, 19 and 20 are reserved. There are two problems here: 1) the requests held by the AM and the scheduler are not consistent; 2) reservations are made on many nodes. We should consider the reasonableness of both, especially the first. Additional containers getting reserved from RM in case of Fair scheduler Key: YARN-3558 URL: https://issues.apache.org/jira/browse/YARN-3558 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler, resourcemanager Affects Versions: 2.7.0 Environment: OS :Suse 11 Sp3 Setup : 2 RM 2 NM Scheduler : Fair scheduler Reporter: Bibin A Chundatt Attachments: Amlog.txt, rm.log Submit PI job with 16 maps Total container expected : 16 MAPS + 1 Reduce + 1 AM Total containers reserved by RM is 21 Below set of containers are not being used for execution container_1430213948957_0001_01_20 container_1430213948957_0001_01_19 RM Containers reservation and states {code} Processing container_1430213948957_0001_01_01 of type START Processing container_1430213948957_0001_01_01 of type ACQUIRED Processing container_1430213948957_0001_01_01 of type LAUNCHED Processing container_1430213948957_0001_01_02 of type START Processing container_1430213948957_0001_01_03 of type START Processing container_1430213948957_0001_01_02 of type ACQUIRED Processing container_1430213948957_0001_01_03 of type ACQUIRED Processing container_1430213948957_0001_01_04 of type START Processing container_1430213948957_0001_01_05 of type START Processing container_1430213948957_0001_01_04 of type ACQUIRED Processing container_1430213948957_0001_01_05 of type ACQUIRED Processing container_1430213948957_0001_01_02 of type LAUNCHED Processing container_1430213948957_0001_01_04 of type LAUNCHED Processing container_1430213948957_0001_01_06 of type RESERVED Processing container_1430213948957_0001_01_03 of type LAUNCHED Processing container_1430213948957_0001_01_05 of type LAUNCHED Processing container_1430213948957_0001_01_07 of type START Processing container_1430213948957_0001_01_07 of type ACQUIRED Processing container_1430213948957_0001_01_07 of type LAUNCHED Processing container_1430213948957_0001_01_08 of type RESERVED Processing container_1430213948957_0001_01_02 of type FINISHED Processing container_1430213948957_0001_01_06 of type START 
Processing container_1430213948957_0001_01_06 of type ACQUIRED Processing container_1430213948957_0001_01_06 of type LAUNCHED Processing container_1430213948957_0001_01_04 of type FINISHED Processing container_1430213948957_0001_01_09 of type START Processing container_1430213948957_0001_01_09 of type ACQUIRED Processing container_1430213948957_0001_01_09 of type LAUNCHED Processing container_1430213948957_0001_01_10 of type RESERVED Processing container_1430213948957_0001_01_03 of type FINISHED Processing container_1430213948957_0001_01_08 of type START Processing container_1430213948957_0001_01_08 of type ACQUIRED Processing container_1430213948957_0001_01_08 of type LAUNCHED Processing container_1430213948957_0001_01_05 of type FINISHED Processing container_1430213948957_0001_01_11 of type START Processing container_1430213948957_0001_01_11 of type ACQUIRED Processing container_1430213948957_0001_01_11 of type LAUNCHED Processing
[jira] [Commented] (YARN-3690) 'mvn site' fails on JDK8
[ https://issues.apache.org/jira/browse/YARN-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564147#comment-14564147 ] Brahma Reddy Battula commented on YARN-3690: [~ajisakaa] updated the patch..kindly review. 'mvn site' fails on JDK8 Key: YARN-3690 URL: https://issues.apache.org/jira/browse/YARN-3690 Project: Hadoop YARN Issue Type: Bug Components: api, site Environment: CentOS 7.0, Oracle JDK 8u45. Reporter: Akira AJISAKA Assignee: Brahma Reddy Battula Attachments: YARN-3690-002.patch, YARN-3690-patch 'mvn site' failed by the following error: {noformat} [ERROR] /home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18: error: package org.apache.hadoop.yarn.factories has already been annotated [ERROR] @InterfaceAudience.LimitedPrivate({ MapReduce, YARN }) [ERROR] ^ [ERROR] java.lang.AssertionError [ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126) [ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45) [ERROR] at com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161) [ERROR] at com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215) [ERROR] at com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952) [ERROR] at com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64) [ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876) [ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143) [ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129) [ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512) [ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471) [ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78) [ERROR] at com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186) [ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346) [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219) [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205) [ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64) [ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54) [ERROR] javadoc: error - fatal error [ERROR] [ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc -J-Xmx1024m @options @packages [ERROR] [ERROR] Refer to the generated Javadoc files in '/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir. [ERROR] - [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3510) Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness
[ https://issues.apache.org/jira/browse/YARN-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563983#comment-14563983 ] Craig Welch commented on YARN-3510: --- The attached patch adds an optional configuration option for the preemption policy, {code}yarn.resourcemanager.monitor.capacity.preemption.preempt_evenly{code}, which (when set to true) causes the policy to only preempt one live container per application per round, and to do multiple rounds until the desired resources are obtained (or no further progress is occurring), so that preemption should generally maintain the existing relative usage between apps. This is in contrast to the default behavior (when unset or set to false, equivalent to the existing behavior), which is to take as much as possible from each app in order of the preemption iterator. The default works well for the fifo case, but will unbalance usage between apps in the fair case. Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness Key: YARN-3510 URL: https://issues.apache.org/jira/browse/YARN-3510 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Reporter: Craig Welch Assignee: Craig Welch Attachments: YARN-3510.2.patch The ProportionalCapacityPreemptionPolicy preempts as many containers from applications as it can during its preemption run. For fifo this makes sense, as it is preempting in reverse order, therefore maintaining the primacy of the oldest. For fair ordering this does not have the desired effect - instead, it should preempt a number of containers from each application which maintains a fair balance (or close to a fair balance) between them -- This message was sent by Atlassian JIRA (v6.3.4#6332)
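For reference, a minimal sketch of how the option described above could be flipped on programmatically; the property name is taken verbatim from the comment, while the class name and the use of a plain Hadoop Configuration object are illustrative assumptions (in a real cluster the setting would normally live in yarn-site.xml):
{code}
import org.apache.hadoop.conf.Configuration;

public class PreemptEvenlyExample {
  // Property name as described in the comment above.
  static final String PREEMPT_EVENLY =
      "yarn.resourcemanager.monitor.capacity.preemption.preempt_evenly";

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Unset or false keeps the existing "take as much as possible per app" behavior;
    // true preempts one live container per app per round.
    conf.setBoolean(PREEMPT_EVENLY, true);
    System.out.println(PREEMPT_EVENLY + " = " + conf.getBoolean(PREEMPT_EVENLY, false));
  }
}
{code}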
[jira] [Updated] (YARN-2556) Tool to measure the performance of the timeline server
[ https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chang Li updated YARN-2556: --- Attachment: YARN-2556.12.patch Tool to measure the performance of the timeline server -- Key: YARN-2556 URL: https://issues.apache.org/jira/browse/YARN-2556 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Chang Li Labels: BB2015-05-TBR Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, YARN-2556.1.patch, YARN-2556.10.patch, YARN-2556.11.patch, YARN-2556.12.patch, YARN-2556.2.patch, YARN-2556.3.patch, YARN-2556.4.patch, YARN-2556.5.patch, YARN-2556.6.patch, YARN-2556.7.patch, YARN-2556.8.patch, YARN-2556.9.patch, YARN-2556.patch, yarn2556.patch, yarn2556.patch, yarn2556_wip.patch We need to be able to understand the capacity model for the timeline server to give users the tools they need to deploy a timeline server with the correct capacity. I propose we create a mapreduce job that can measure timeline server write and read performance. Transactions per second, I/O for both read and write would be a good start. This could be done as an example or test job that could be tied into gridmix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3547) FairScheduler: Apps that have no resource demand should not participate scheduling
[ https://issues.apache.org/jira/browse/YARN-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564211#comment-14564211 ] Xianyin Xin commented on YARN-3547: --- Sorry [~kasha], I corrected the comment in https://issues.apache.org/jira/browse/YARN-3547?focusedCommentId=14564204page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14564204. I first used Resources.greaterThan(Resources.none()), which is not suitable for zero-memory container requests. But {{!pending.equals(Resources.none())}} is. FairScheduler: Apps that have no resource demand should not participate scheduling -- Key: YARN-3547 URL: https://issues.apache.org/jira/browse/YARN-3547 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Xianyin Xin Assignee: Xianyin Xin Attachments: YARN-3547.001.patch, YARN-3547.002.patch, YARN-3547.003.patch, YARN-3547.004.patch, YARN-3547.005.patch, YARN-3547.006.patch At present, all of the 'running' apps participate in the scheduling process; however, most of them may have no resource demand on a production cluster, as an app's status is 'running' rather than 'waiting for resources' for most of its lifetime. It is not wise to sort all the 'running' apps and try to fulfill them, especially on a large-scale cluster which has a heavy scheduling load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
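As a rough illustration of the difference between the two checks discussed above, a minimal sketch (the class name and example resource sizes are assumptions; the Resource/Resources calls are the standard org.apache.hadoop.yarn.util.resource APIs):
{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator;
import org.apache.hadoop.yarn.util.resource.ResourceCalculator;
import org.apache.hadoop.yarn.util.resource.Resources;

public class PendingCheckSketch {
  public static void main(String[] args) {
    ResourceCalculator calc = new DefaultResourceCalculator();
    Resource cluster = Resource.newInstance(6144, 12);  // arbitrary cluster size
    Resource pending = Resource.newInstance(0, 1);      // zero-memory, one-vcore request

    // Memory-only comparison: a 0 MB pending resource is not "greater than" none(),
    // so this check would treat the app as having no demand.
    boolean byGreaterThan = Resources.greaterThan(calc, cluster, pending, Resources.none());

    // Equality check: any non-zero dimension counts as outstanding demand.
    boolean byEquals = !pending.equals(Resources.none());

    System.out.println("greaterThan=" + byGreaterThan + ", equals-based=" + byEquals);
    // prints: greaterThan=false, equals-based=true
  }
}
{code}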
[jira] [Commented] (YARN-3510) Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness
[ https://issues.apache.org/jira/browse/YARN-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564248#comment-14564248 ] Hadoop QA commented on YARN-3510: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 14s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 46s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 1m 28s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 50m 23s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 88m 34s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-server-resourcemanager | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12736042/YARN-3510.6.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / d725dd8 | | Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8127/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8127/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8127/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8127/console | This message was automatically generated. Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness Key: YARN-3510 URL: https://issues.apache.org/jira/browse/YARN-3510 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Reporter: Craig Welch Assignee: Craig Welch Attachments: YARN-3510.2.patch, YARN-3510.3.patch, YARN-3510.5.patch, YARN-3510.6.patch The ProportionalCapacityPreemptionPolicy preempts as many containers from applications as it can during it's preemption run. For fifo this makes sense, as it is prempting in reverse order therefore maintaining the primacy of the oldest. For fair ordering this does not have the desired effect - instead, it should preempt a number of containers from each application which maintains a fair balance /close to a fair balance between them -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3547) FairScheduler: Apps that have no resource demand should not participate scheduling
[ https://issues.apache.org/jira/browse/YARN-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564204#comment-14564204 ] Xianyin Xin commented on YARN-3547: --- Sorry [~kasha], I misunderstood the discussion. Anyway, thanks for your review. One comment on point 2: in fact I first used {{!pending.equals(Resources.none())}}; however, a test case in {{TestFairScheduler}} was testing a zero-memory container request (0MB, 1core). To avoid breaking it, I changed the check. But it seems that test has been removed recently. Now all of the 3 points are addressed. FairScheduler: Apps that have no resource demand should not participate scheduling -- Key: YARN-3547 URL: https://issues.apache.org/jira/browse/YARN-3547 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Xianyin Xin Assignee: Xianyin Xin Attachments: YARN-3547.001.patch, YARN-3547.002.patch, YARN-3547.003.patch, YARN-3547.004.patch, YARN-3547.005.patch At present, all of the 'running' apps participate in the scheduling process; however, most of them may have no resource demand on a production cluster, as an app's status is 'running' rather than 'waiting for resources' for most of its lifetime. It is not wise to sort all the 'running' apps and try to fulfill them, especially on a large-scale cluster which has a heavy scheduling load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2556) Tool to measure the performance of the timeline server
[ https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564205#comment-14564205 ] Chang Li commented on YARN-2556: [~sjlee0] Thanks so much for patiently guiding me to improve the work! I have updated my patch according to your suggestions and created a JobHistoryFileReplayHelper to extract the common code between the v1 and v2 JobHistoryFileReplay mappers. Please let me know what you think of the latest patch. Thanks! Tool to measure the performance of the timeline server -- Key: YARN-2556 URL: https://issues.apache.org/jira/browse/YARN-2556 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Jonathan Eagles Assignee: Chang Li Labels: BB2015-05-TBR Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, YARN-2556.1.patch, YARN-2556.10.patch, YARN-2556.11.patch, YARN-2556.12.patch, YARN-2556.2.patch, YARN-2556.3.patch, YARN-2556.4.patch, YARN-2556.5.patch, YARN-2556.6.patch, YARN-2556.7.patch, YARN-2556.8.patch, YARN-2556.9.patch, YARN-2556.patch, yarn2556.patch, yarn2556.patch, yarn2556_wip.patch We need to be able to understand the capacity model for the timeline server to give users the tools they need to deploy a timeline server with the correct capacity. I propose we create a mapreduce job that can measure timeline server write and read performance. Transactions per second, I/O for both read and write would be a good start. This could be done as an example or test job that could be tied into gridmix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3547) FairScheduler: Apps that have no resource demand should not participate scheduling
[ https://issues.apache.org/jira/browse/YARN-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xianyin Xin updated YARN-3547: -- Attachment: YARN-3547.006.patch Upload YARN-3547.006.patch. FairScheduler: Apps that have no resource demand should not participate scheduling -- Key: YARN-3547 URL: https://issues.apache.org/jira/browse/YARN-3547 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Xianyin Xin Assignee: Xianyin Xin Attachments: YARN-3547.001.patch, YARN-3547.002.patch, YARN-3547.003.patch, YARN-3547.004.patch, YARN-3547.005.patch, YARN-3547.006.patch At present, all of the 'running' apps participate in the scheduling process; however, most of them may have no resource demand on a production cluster, as an app's status is 'running' rather than 'waiting for resources' for most of its lifetime. It is not wise to sort all the 'running' apps and try to fulfill them, especially on a large-scale cluster which has a heavy scheduling load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3741) consider nulling member maps/sets of TimelineEntity
Sangjin Lee created YARN-3741: - Summary: consider nulling member maps/sets of TimelineEntity Key: YARN-3741 URL: https://issues.apache.org/jira/browse/YARN-3741 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Sangjin Lee Currently there are multiple collection members of TimelineEntity that are always instantiated, regardless of whether they are used or not: info, configs, metrics, events, isRelatedToEntities, and relatesToEntities. Since TimelineEntities will be created very often and in lots of cases many of these members will be empty, creating these empty collections is wasteful in terms of garbage collector pressure. It would be good to start out with null members, and instantiate these collections only if they are actually used. Of course, we need to make that contract very clear and refactor all client code to handle that scenario. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
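A minimal, hypothetical sketch of the lazy-initialization pattern being proposed; the class and method names below are illustrative only and are not the actual TimelineEntity API:
{code}
import java.util.HashMap;
import java.util.Map;

// Collections start out null and are created on first write, so entities that
// never use info/configs/metrics/events allocate nothing for them.
public class LazyEntitySketch {
  private Map<String, Object> info;   // null until actually needed

  public void addInfo(String key, Object value) {
    if (info == null) {               // instantiate only on first use
      info = new HashMap<>();
    }
    info.put(key, value);
  }

  public Map<String, Object> getInfo() {
    return info;                      // callers must be prepared for null
  }
}
{code}
The trade-off, as the description notes, is that every reader of such getters must handle a null return, which is why the contract change would need to be made explicit and client code refactored accordingly.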
[jira] [Commented] (YARN-3690) 'mvn site' fails on JDK8
[ https://issues.apache.org/jira/browse/YARN-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564192#comment-14564192 ] Hadoop QA commented on YARN-3690: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 20s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 46s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 46s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 27s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 5s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 26s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 6m 54s | Tests passed in hadoop-yarn-client. | | {color:green}+1{color} | yarn tests | 1m 58s | Tests passed in hadoop-yarn-common. | | | | 54m 15s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12736040/YARN-3690-002.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / d725dd8 | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8126/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/8126/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8126/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8126/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8126/console | This message was automatically generated. 'mvn site' fails on JDK8 Key: YARN-3690 URL: https://issues.apache.org/jira/browse/YARN-3690 Project: Hadoop YARN Issue Type: Bug Components: api, site Environment: CentOS 7.0, Oracle JDK 8u45. 
Reporter: Akira AJISAKA Assignee: Brahma Reddy Battula Attachments: YARN-3690-002.patch, YARN-3690-patch 'mvn site' failed by the following error: {noformat} [ERROR] /home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18: error: package org.apache.hadoop.yarn.factories has already been annotated [ERROR] @InterfaceAudience.LimitedPrivate({ MapReduce, YARN }) [ERROR] ^ [ERROR] java.lang.AssertionError [ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126) [ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45) [ERROR] at com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161) [ERROR] at com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215) [ERROR] at com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952) [ERROR] at com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64) [ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876) [ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143) [ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129) [ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512) [ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471) [ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78) [ERROR] at com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186) [ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346) [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219) [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205)
[jira] [Updated] (YARN-3510) Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness
[ https://issues.apache.org/jira/browse/YARN-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-3510: -- Attachment: YARN-3510.6.patch Ran all of the tests listed as failing or timing out on my box with the patch and they all pass; it must be a build server issue or something of that nature. Clicking on the findbugs link indicates that there are no findbugs issues (0 listed) - is there something wrong with the feedback process? Fixed all of the checkstyle issues except one which I don't think is important. Create an extension of ProportionalCapacityPreemptionPolicy which preempts a number of containers from each application in a way which respects fairness Key: YARN-3510 URL: https://issues.apache.org/jira/browse/YARN-3510 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Reporter: Craig Welch Assignee: Craig Welch Attachments: YARN-3510.2.patch, YARN-3510.3.patch, YARN-3510.5.patch, YARN-3510.6.patch The ProportionalCapacityPreemptionPolicy preempts as many containers from applications as it can during its preemption run. For fifo this makes sense, as it is preempting in reverse order, therefore maintaining the primacy of the oldest. For fair ordering this does not have the desired effect - instead, it should preempt a number of containers from each application which maintains a fair balance (or close to a fair balance) between them -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3733) On RM restart AM getting more than maximum possible memory when many tasks in queue
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564163#comment-14564163 ] Hadoop QA commented on YARN-3733: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 12s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 54s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 37s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 33s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 1m 58s | Tests passed in hadoop-yarn-common. | | | | 40m 27s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12736037/YARN-3733.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / d725dd8 | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8125/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8125/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8125/console | This message was automatically generated. On RM restart AM getting more than maximum possible memory when many tasks in queue - Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 , 2 NM , 2 RM one NM - 3 GB 6 v core Reporter: Bibin A Chundatt Assignee: Rohith Priority: Blocker Attachments: YARN-3733.patch Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent task 5. Switch RM Actual = For 12 Jobs AM gets allocated and all 12 starts running No other Yarn child is initiated , *all 12 Jobs in Running state for ever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3699) Decide if flow version should be part of row key or column
[ https://issues.apache.org/jira/browse/YARN-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564216#comment-14564216 ] Vrushali C commented on YARN-3699: -- Hi [~djp], Wanted to check in and see if you got a chance to read [~jrottinghuis]'s comments above. I believe we don't need the flow version as part of the row key. [~gtCarrera9] also mentioned he is good with not having the flow version as part of the row key, it was perhaps an oversight to have included it in the Phoenix schema's row key. thanks Vrushali Decide if flow version should be part of row key or column --- Key: YARN-3699 URL: https://issues.apache.org/jira/browse/YARN-3699 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vrushali C Based on discussions in YARN-3411 with [~djp], filing jira for continuing discussion on putting the flow version in rowkey or column. Either phoenix/hbase approach will update the jira with the conclusions.. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers
[ https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562380#comment-14562380 ] Varun Saxena commented on YARN-3051: Thanks for the replies. [Storage abstraction] Create backing storage read interface for ATS readers --- Key: YARN-3051 URL: https://issues.apache.org/jira/browse/YARN-3051 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Varun Saxena Attachments: YARN-3051-YARN-2928.003.patch, YARN-3051-YARN-2928.03.patch, YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch Per design in YARN-2928, create backing storage read interface that can be implemented by multiple backing storage implementations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562367#comment-14562367 ] Sangjin Lee commented on YARN-3721: --- See https://issues.apache.org/jira/browse/YARN-3411?focusedCommentId=14514872page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14514872 :) build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch, YARN-3721-YARN-2928.002.patch, YARN-3721-YARN-2928.002.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3732) Change NodeHeartbeatResponse.java and RegisterNodeManagerResponse.java as abstract classes
Devaraj K created YARN-3732: --- Summary: Change NodeHeartbeatResponse.java and RegisterNodeManagerResponse.java as abstract classes Key: YARN-3732 URL: https://issues.apache.org/jira/browse/YARN-3732 Project: Hadoop YARN Issue Type: Improvement Reporter: Devaraj K Assignee: Devaraj K Priority: Minor All the other protocol record classes are abstract classes. Change NodeHeartbeatResponse.java and RegisterNodeManagerResponse.java as abstract classes to make it consistent with other protocol record classes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3733) On RM restart AM getting more than maximum possible memory when many tasks in queue
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-3733: --- Affects Version/s: 2.7.0 On RM restart AM getting more than maximum possible memory when many tasks in queue - Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 , 2 NM , 2 RM one NM - 3 GB 6 v core Reporter: Bibin A Chundatt Assignee: Rohith Priority: Critical Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent task 5. Switch RM Actual = For 12 Jobs AM gets allocated and all 12 starts running No other Yarn child is initiated , *all 12 Jobs in Running state for ever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3733) On RM restart AM getting more than maximum possible memory when many tasks in queue
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith reassigned YARN-3733: Assignee: Rohith On RM restart AM getting more than maximum possible memory when many tasks in queue - Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Environment: Suse 11 Sp3 , 2 NM , 2 RM one NM - 3 GB 6 v core Reporter: Bibin A Chundatt Assignee: Rohith Priority: Critical Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent task 5. Switch RM Actual = For 12 Jobs AM gets allocated and all 12 starts running No other Yarn child is initiated , *all 12 Jobs in Running state for ever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3733) On RM restart AM getting more than maximum possible memory when many tasks in queue
Bibin A Chundatt created YARN-3733: -- Summary: On RM restart AM getting more than maximum possible memory when many tasks in queue Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Environment: Suse 11 Sp3 , 2 NM , 2 RM one NM - 3 GB 6 v core Reporter: Bibin A Chundatt Priority: Critical Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent task 5. Switch RM Actual = For 12 Jobs AM gets allocated and all 12 starts running No other Yarn child is initiated , *all 12 Jobs in Running state for ever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562353#comment-14562353 ] Zhijie Shen commented on YARN-3721: --- +1 for the approach. BTW, there is always a risk: HBase is downstream of Hadoop and compiles against a certain version of Hadoop. Say we use HBase X.X.X, and it depends on Hadoop Y.Y.Y. However, on trunk/branch-2, if we make an incompatible change to the mini DFS cluster for a later release Y.Y.Z, our test cases are very likely to break, because HBase test utils compiled against Y.Y.Y are no longer compatible with the Y.Y.Z runtime libs. build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch, YARN-3721-YARN-2928.002.patch, YARN-3721-YARN-2928.002.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3066) Hadoop leaves orphaned tasks running after job is killed
[ https://issues.apache.org/jira/browse/YARN-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562618#comment-14562618 ] Alan Burlison commented on YARN-3066: - Bugtraq is long gone, everything is now in the bug database accessible via My Oracle Support (https://support.oracle.com) Hadoop leaves orphaned tasks running after job is killed Key: YARN-3066 URL: https://issues.apache.org/jira/browse/YARN-3066 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Environment: Hadoop 2.4.1 (probably all later too), FreeBSD-10.1 Reporter: Dmitry Sivachenko When spawning a user task, the node manager checks for the setsid(1) utility and spawns the task program via it. See hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java for instance: String exec = Shell.isSetsidAvailable ? "exec setsid " : "exec "; FreeBSD, unlike Linux, does not have a setsid(1) utility, so plain exec is used to spawn the user task. If that task spawns other external programs (this is the common case if the task program is a shell script) and the user kills the job via mapred job -kill Job, these child processes remain running. 1) Why do you silently ignore the absence of setsid(1) and spawn the task process via plain exec? This guarantees orphaned processes when a job is prematurely killed. 2) FreeBSD has a replacement third-party program called ssid (which does almost the same as Linux's setsid). It would be nice to detect which binary is present during the configure stage and put an @SETSID@ macro into the java file to use the correct name. I propose to make the Shell.isSetsidAvailable test more strict and fail to start if it is not found: at least we will know about the problem at startup rather than guessing why there are orphaned tasks running forever. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
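As a rough sketch of the stricter startup check being proposed, one could probe the PATH for setsid (or the FreeBSD ssid replacement) and refuse to start if neither is present. This is a standalone illustration under stated assumptions, not the actual Shell/DefaultContainerExecutor code:
{code}
import java.io.File;

public class SetsidProbe {
  // Look for a session-leader wrapper on the PATH: setsid on Linux, ssid on FreeBSD.
  static String findWrapper() {
    for (String name : new String[] {"setsid", "ssid"}) {
      for (String dir : System.getenv("PATH").split(File.pathSeparator)) {
        if (new File(dir, name).canExecute()) {
          return name;
        }
      }
    }
    return null;
  }

  public static void main(String[] args) {
    String wrapper = findWrapper();
    if (wrapper == null) {
      // Fail fast so the problem is visible at startup instead of as orphaned tasks later.
      throw new IllegalStateException("neither setsid nor ssid found on PATH");
    }
    System.out.println("using session wrapper: " + wrapper);
  }
}
{code}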
[jira] [Created] (YARN-3735) Retain JRE Fatal error logs upon container failure
Srikanth Sundarrajan created YARN-3735: -- Summary: Retain JRE Fatal error logs upon container failure Key: YARN-3735 URL: https://issues.apache.org/jira/browse/YARN-3735 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Srikanth Sundarrajan -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3722) Merge multiple TestWebAppUtils into o.a.h.yarn.webapp.util.TestWebAppUtils
[ https://issues.apache.org/jira/browse/YARN-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562750#comment-14562750 ] Hudson commented on YARN-3722: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #941 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/941/]) YARN-3722. Merge multiple TestWebAppUtils into (devaraj: rev 7e509f58439b1089462f51f9a0f9782faec7d198) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestWebAppUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/webapp/util/TestWebAppUtils.java * hadoop-yarn-project/CHANGES.txt Merge multiple TestWebAppUtils into o.a.h.yarn.webapp.util.TestWebAppUtils -- Key: YARN-3722 URL: https://issues.apache.org/jira/browse/YARN-3722 Project: Hadoop YARN Issue Type: Improvement Components: test Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Fix For: 2.8.0 Attachments: YARN-3722.001.patch The tests in {{o.a.h.yarn.util.TestWebAppUtils}} could be moved to {{o.a.h.yarn.webapp.util.TestWebAppUtils}}. WebAppUtils belongs to {{o.a.h.yarn.webapp.util}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3581) Deprecate -directlyAccessNodeLabelStore in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562746#comment-14562746 ] Hudson commented on YARN-3581: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #941 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/941/]) YARN-3581. Deprecate -directlyAccessNodeLabelStore in RMAdminCLI. (Naganarasimha G R via wangda) (wangda: rev cab7674e54c4fe56838668462de99a6787841309) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/ClusterCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestClusterCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java * hadoop-yarn-project/CHANGES.txt Deprecate -directlyAccessNodeLabelStore in RMAdminCLI - Key: YARN-3581 URL: https://issues.apache.org/jira/browse/YARN-3581 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Fix For: 2.8.0 Attachments: YARN-3581.20150525-1.patch, YARN-3581.20150528-1.patch In 2.6.0, we added an option called -directlyAccessNodeLabelStore to make RM can start with label-configured queue settings. After YARN-2918, we don't need this option any more, admin can configure queue setting, start RM and configure node label via RMAdminCLI without any error. In addition, this option is very restrictive, first it needs to run on the same node where RM is running if admin configured to store labels in local disk. Second, when admin run the option when RM is running, multiple process write to a same file can happen, this could make node label store becomes invalid. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2355) MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container
[ https://issues.apache.org/jira/browse/YARN-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562748#comment-14562748 ] Hudson commented on YARN-2355: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #941 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/941/]) YARN-2355. MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container (Darrell Taylor via aw) (aw: rev d6e3164d4a18271299c63377326ca56e8a980830) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/amlauncher/AMLauncher.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterLauncher.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java * hadoop-yarn-project/CHANGES.txt MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container -- Key: YARN-2355 URL: https://issues.apache.org/jira/browse/YARN-2355 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Darrell Taylor Labels: newbie Fix For: 3.0.0 Attachments: YARN-2355.001.patch After YARN-2074, YARN-614 and YARN-611, the application cannot judge whether it has the chance to try based on MAX_APP_ATTEMPTS_ENV alone. We should be able to notify the application of the up-to-date remaining retry quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3722) Merge multiple TestWebAppUtils into o.a.h.yarn.webapp.util.TestWebAppUtils
[ https://issues.apache.org/jira/browse/YARN-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-3722: Summary: Merge multiple TestWebAppUtils into o.a.h.yarn.webapp.util.TestWebAppUtils (was: Merge multiple TestWebAppUtils) Hadoop Flags: Reviewed +1, patch looks good to me. Merge multiple TestWebAppUtils into o.a.h.yarn.webapp.util.TestWebAppUtils -- Key: YARN-3722 URL: https://issues.apache.org/jira/browse/YARN-3722 Project: Hadoop YARN Issue Type: Improvement Components: test Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3722.001.patch The tests in {{o.a.h.yarn.util.TestWebAppUtils}} could be moved to {{o.a.h.yarn.webapp.util.TestWebAppUtils}}. WebAppUtils belongs to {{o.a.h.yarn.webapp.util}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3734) Force dirhandler.checkDirs() when there is a localization failure
Lavkesh Lahngir created YARN-3734: - Summary: Force dirhandler.checkDirs() when there is a localization failure Key: YARN-3734 URL: https://issues.apache.org/jira/browse/YARN-3734 Project: Hadoop YARN Issue Type: Bug Reporter: Lavkesh Lahngir Assignee: Lavkesh Lahngir -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3733) On RM restart AM getting more than maximum possible memory when many tasks in queue
[ https://issues.apache.org/jira/browse/YARN-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562622#comment-14562622 ] Rohith commented on YARN-3733: -- Verified the RM logs from [~bibinchundatt] offline. The sequence of events that occurred is: # 30 applications are submitted to RM1 concurrently. *pendingApplications=18 and activeApplications=12*. The active applications have started and are in RUNNING state. # RM1 switched to standby, RM2 transitioned to Active state. The currently active RM is RM2. # The previously submitted 30 applications started recovering. As part of the recovery process, all 30 applications were submitted to the scheduler and all of them became active, i.e. *activeApplications=30 and pendingApplications=0*, which is not expected to happen. # The NMs registered with the RM and the running AMs registered with the RM. # Since 30 applications are activated, the scheduler tries to launch the ApplicationMaster of every activated application and occupies the full cluster capacity. Basically, the AM limit check in LeafQueue#activateApplications is not working as expected for {{DominantResourceCalculator}}. In order to confirm this, I wrote a simple program exercising both the Default and Dominant resource calculators with the memory configuration below. The output of the program: for DefaultResourceCalculator the result is false, which limits the applications being activated when the AM resource limit is exceeded. For DominantResourceCalculator the result is true, which allows all the applications to be activated even if the AM resource limit is exceeded. {noformat} 2015-05-28 14:00:52,704 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: application AMResource memory:4096, vCores:1 maxAMResourcePerQueuePercent 0.5 amLimit memory:0, vCores:0 lastClusterResource memory:0, vCores:0 amIfStarted memory:4096, vCores:1 {noformat}
{code}
package com.test.hadoop;

import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator;
import org.apache.hadoop.yarn.util.resource.DominantResourceCalculator;
import org.apache.hadoop.yarn.util.resource.ResourceCalculator;
import org.apache.hadoop.yarn.util.resource.Resources;

public class TestResourceCalculator {
  public static void main(String[] args) {
    // Default resource calculator (memory only)
    ResourceCalculator defaultResourceCalculator = new DefaultResourceCalculator();
    // Dominant resource calculator (memory and vcores)
    ResourceCalculator dominantResourceCalculator = new DominantResourceCalculator();

    Resource lastClusterResource = Resource.newInstance(0, 0);
    Resource amIfStarted = Resource.newInstance(4096, 1);
    Resource amLimit = Resource.newInstance(0, 0);

    // expected result false, and the actual result is also false
    System.out.println("DefaultResourceCalculator : "
        + Resources.lessThanOrEqual(defaultResourceCalculator,
            lastClusterResource, amIfStarted, amLimit));

    // expected result false, but the actual result is true for DominantResourceCalculator
    System.out.println("DominantResourceCalculator : "
        + Resources.lessThanOrEqual(dominantResourceCalculator,
            lastClusterResource, amIfStarted, amLimit));
  }
}
{code} On RM restart AM getting more than maximum possible memory when many tasks in queue - Key: YARN-3733 URL: https://issues.apache.org/jira/browse/YARN-3733 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Suse 11 Sp3 , 2 NM , 2 RM one NM - 3 GB 6 v core Reporter: Bibin A Chundatt Assignee: Rohith Priority: Critical Steps to reproduce = 1. Install HA with 2 RM 2 NM (3072 MB * 2 total cluster) 2. 
Configure map and reduce size to 512 MB after changing scheduler minimum size to 512 MB 3. Configure capacity scheduler and AM limit to .5 (DominantResourceCalculator is configured) 4. Submit 30 concurrent task 5. Switch RM Actual = For 12 Jobs AM gets allocated and all 12 starts running No other Yarn child is initiated , *all 12 Jobs in Running state for ever* Expected === Only 6 should be running at a time since max AM allocated is .5 (3072 MB) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
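For reference, a back-of-the-envelope check of the expected count above, using the numbers from the steps to reproduce and assuming each AM container is the 512 MB minimum (the snippet itself is only illustrative):
{code}
public class AmLimitMath {
  public static void main(String[] args) {
    int clusterMemMb = 2 * 3072;                              // 2 NMs x 3072 MB
    double amLimitFraction = 0.5;                             // maxAMResourcePerQueuePercent
    int amSizeMb = 512;                                       // assumed AM container size
    int amLimitMb = (int) (clusterMemMb * amLimitFraction);   // 3072 MB
    System.out.println("max concurrent AMs = " + (amLimitMb / amSizeMb));  // 6
  }
}
{code}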
[jira] [Commented] (YARN-3581) Deprecate -directlyAccessNodeLabelStore in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562664#comment-14562664 ] Hudson commented on YARN-3581: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #211 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/211/]) YARN-3581. Deprecate -directlyAccessNodeLabelStore in RMAdminCLI. (Naganarasimha G R via wangda) (wangda: rev cab7674e54c4fe56838668462de99a6787841309) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/ClusterCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestRMAdminCLI.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestClusterCLI.java Deprecate -directlyAccessNodeLabelStore in RMAdminCLI - Key: YARN-3581 URL: https://issues.apache.org/jira/browse/YARN-3581 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Fix For: 2.8.0 Attachments: YARN-3581.20150525-1.patch, YARN-3581.20150528-1.patch In 2.6.0, we added an option called -directlyAccessNodeLabelStore to make RM can start with label-configured queue settings. After YARN-2918, we don't need this option any more, admin can configure queue setting, start RM and configure node label via RMAdminCLI without any error. In addition, this option is very restrictive, first it needs to run on the same node where RM is running if admin configured to store labels in local disk. Second, when admin run the option when RM is running, multiple process write to a same file can happen, this could make node label store becomes invalid. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2355) MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container
[ https://issues.apache.org/jira/browse/YARN-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562666#comment-14562666 ] Hudson commented on YARN-2355: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #211 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/211/]) YARN-2355. MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container (Darrell Taylor via aw) (aw: rev d6e3164d4a18271299c63377326ca56e8a980830) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterLauncher.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/amlauncher/AMLauncher.java MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container -- Key: YARN-2355 URL: https://issues.apache.org/jira/browse/YARN-2355 Project: Hadoop YARN Issue Type: Bug Reporter: Zhijie Shen Assignee: Darrell Taylor Labels: newbie Fix For: 3.0.0 Attachments: YARN-2355.001.patch After YARN-2074, YARN-614 and YARN-611, the application cannot judge whether it still has a chance to retry based on MAX_APP_ATTEMPTS_ENV alone. We should be able to notify the application of the up-to-date remaining retry quota. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
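For context, a minimal sketch of how an AM-side check on this value looks, assuming the variable is exposed under the name "MAX_APP_ATTEMPTS" (the exact constant lives in ApplicationConstants; the class below is hypothetical):
{code}
// Hypothetical sketch of an AM reading the max-attempts hint from its environment.
public class MaxAttemptsCheck {
  public static void main(String[] args) {
    // Assumes the RM exports the value as MAX_APP_ATTEMPTS in the AM's environment.
    String maxAttempts = System.getenv("MAX_APP_ATTEMPTS");
    if (maxAttempts == null) {
      System.out.println("MAX_APP_ATTEMPTS not set for this container");
    } else {
      // This is the configured maximum, not the remaining retry quota,
      // which is exactly the gap this issue describes.
      System.out.println("Configured max AM attempts: " + Integer.parseInt(maxAttempts));
    }
  }
}
{code}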
[jira] [Commented] (YARN-3626) On Windows localized resources are not moved to the front of the classpath when they should be
[ https://issues.apache.org/jira/browse/YARN-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562674#comment-14562674 ] Hudson commented on YARN-3626: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #211 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/211/]) YARN-3626. On Windows localized resources are not moved to the front of the classpath when they should be. Contributed by Craig Welch. (cnauroth: rev 4102e5882e17b75507ae5cf8b8979485b3e24cbc) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/conf/YarnConfiguration.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-common/src/main/java/org/apache/hadoop/mapreduce/v2/util/MRApps.java * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java On Windows localized resources are not moved to the front of the classpath when they should be -- Key: YARN-3626 URL: https://issues.apache.org/jira/browse/YARN-3626 Project: Hadoop YARN Issue Type: Bug Components: yarn Environment: Windows Reporter: Craig Welch Assignee: Craig Welch Fix For: 2.7.1 Attachments: YARN-3626.0.patch, YARN-3626.11.patch, YARN-3626.14.patch, YARN-3626.15.patch, YARN-3626.16.patch, YARN-3626.4.patch, YARN-3626.6.patch, YARN-3626.9.patch In response to the mapreduce.job.user.classpath.first setting, the classpath is ordered differently so that localized resources appear before system classpath resources when tasks execute. On Windows this does not work because the localized resources are not linked into their final location when the classpath jar is created. To compensate for that, localized jar resources are added directly to the classpath generated for the jar rather than being discovered from the localized directories. Unfortunately, they are always appended to the classpath and so are never preferred over system resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
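For reference, a minimal sketch of setting the property this issue revolves around, using the mapreduce.job.user.classpath.first key mentioned in the description (the surrounding class is illustrative only):
{code}
import org.apache.hadoop.conf.Configuration;

// Illustrative sketch: request that user/localized jars be preferred over the
// system classpath. Whether the resulting ordering is honored on Windows is
// what this issue fixes.
public class UserClasspathFirstExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.setBoolean("mapreduce.job.user.classpath.first", true);
    System.out.println("user classpath first = "
        + conf.getBoolean("mapreduce.job.user.classpath.first", false));
  }
}
{code}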