[jira] [Commented] (YARN-230) Make changes for RM restart phase 1
[ https://issues.apache.org/jira/browse/YARN-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534044#comment-13534044 ]

Tom White commented on YARN-230:

Arun, yes, it looks good to me, +1. We can address any changes that come up in later JIRAs.

Key: YARN-230
URL: https://issues.apache.org/jira/browse/YARN-230
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Bikas Saha
Assignee: Bikas Saha
Attachments: PB-impl.patch, Recovery.patch, Store.patch, Test.patch, YARN-230.1.patch, YARN-230.4.patch, YARN-230.5.patch

As described in YARN-128, phase 1 of RM restart puts in place mechanisms to save application state and read it back after restart. Upon restart, the NMs are asked to reboot and the previously running AMs are restarted. Once this is done, RM HA and work-preserving restart can proceed in parallel. For more details, please refer to the design document in YARN-128.
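The phase-1 mechanism described here amounts to a persistent store of per-application state that the RM replays on startup. A minimal sketch of such a store, assuming the phase-1 design from YARN-128 (all names below are illustrative, not the actual YARN-230 API):

{code:java}
import java.io.IOException;
import java.util.Map;

// Illustrative recoverable application-state store: the RM persists enough
// state per application to restart its AM after an RM restart. Hypothetical
// names; the real work is in the YARN-230 Store/Recovery patches.
public interface AppStateStore {

  // Persist the submission context before the application is accepted.
  void storeApplication(String appId, byte[] submissionContext) throws IOException;

  // Remove state once the application completes, so it is not re-run on restart.
  void removeApplication(String appId) throws IOException;

  // Load all persisted applications on RM startup; the RM then asks NMs to
  // reboot and restarts the previously running AMs from this state.
  Map<String, byte[]> loadAllApplications() throws IOException;
}
{code}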
[jira] [Commented] (YARN-230) Make changes for RM restart phase 1
[ https://issues.apache.org/jira/browse/YARN-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534098#comment-13534098 ]

Bikas Saha commented on YARN-230:

Thanks guys!

Key: YARN-230
URL: https://issues.apache.org/jira/browse/YARN-230
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Bikas Saha
Assignee: Bikas Saha
Attachments: PB-impl.patch, Recovery.patch, Store.patch, Test.patch, YARN-230.1.patch, YARN-230.4.patch, YARN-230.5.patch

As described in YARN-128, phase 1 of RM restart puts in place mechanisms to save application state and read it back after restart. Upon restart, the NMs are asked to reboot and the previously running AMs are restarted. Once this is done, RM HA and work-preserving restart can proceed in parallel. For more details, please refer to the design document in YARN-128.
[jira] [Commented] (YARN-223) Change processTree interface to work better with native code
[ https://issues.apache.org/jira/browse/YARN-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534156#comment-13534156 ]

Bikas Saha commented on YARN-223:

+1 for the code and approach. The patch changes some public members; I am not sure about those changes, since they may not meet back-compat requirements.

Key: YARN-223
URL: https://issues.apache.org/jira/browse/YARN-223
Project: Hadoop YARN
Issue Type: Bug
Reporter: Radim Kolar
Assignee: Radim Kolar
Priority: Critical
Attachments: pstree-update4.txt, pstree-update6.txt, pstree-update6.txt

The problem is that every update of the process tree requires a new object. This is undesirable when working with a process-tree implementation in native code. Replace ProcessTree.getProcessTree() with updateProcessTree(): no new object allocation is needed, and it simplifies application code a bit.
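For readers following along, the proposed change is easiest to see side by side. A sketch with illustrative names (the actual classes are the process-tree hierarchy touched by the pstree-update patches; signatures here are paraphrased, not exact):

{code:java}
// Sketch of the YARN-223 proposal, with illustrative names. Instead of
// getProcessTree() returning a fresh snapshot object on every monitoring
// cycle, updateProcessTree() refreshes the existing object in place, so a
// native-backed implementation can keep its internal handles alive.
abstract class ProcessTree {
  // Old style (removed by the patch): allocate a new snapshot per update.
  //   abstract ProcessTree getProcessTree();

  // New style: refresh this object's view of the process tree in place.
  abstract void updateProcessTree();

  // Accessors read the most recently updated state.
  abstract long getCumulativeRssMemory();
}

class ContainerMonitor {
  void monitor(ProcessTree tree) throws InterruptedException {
    while (!Thread.currentThread().isInterrupted()) {
      tree.updateProcessTree();  // no per-cycle object allocation
      System.out.println("rss=" + tree.getCumulativeRssMemory());
      Thread.sleep(3000);
    }
  }
}
{code}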
[jira] [Commented] (YARN-3) Add support for CPU isolation/monitoring of containers
[ https://issues.apache.org/jira/browse/YARN-3?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534171#comment-13534171 ]

Vinod Kumar Vavilapalli commented on YARN-3:

Will review by EOD today. Thanks for the tip.

Key: YARN-3
URL: https://issues.apache.org/jira/browse/YARN-3
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun C Murthy
Assignee: Andrew Ferguson
Attachments: mapreduce-4334-design-doc.txt, mapreduce-4334-design-doc-v2.txt, MAPREDUCE-4334-executor-v1.patch, MAPREDUCE-4334-executor-v2.patch, MAPREDUCE-4334-executor-v3.patch, MAPREDUCE-4334-executor-v4.patch, MAPREDUCE-4334-pre1.patch, MAPREDUCE-4334-pre2.patch, MAPREDUCE-4334-pre2-with_cpu.patch, MAPREDUCE-4334-pre3.patch, MAPREDUCE-4334-pre3-with_cpu.patch, MAPREDUCE-4334-v1.patch, MAPREDUCE-4334-v2.patch, YARN-3-lce_only-v1.patch
[jira] [Commented] (YARN-270) RM scheduler event handler thread gets behind
[ https://issues.apache.org/jira/browse/YARN-270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534264#comment-13534264 ]

Vinod Kumar Vavilapalli commented on YARN-270:

Thanks for filing this, Thomas. IIRC, the event handler's upper limit is about 0.6 million events; somehow we only focused on the number of nodes and never thought about the scaling issue with a large number of applications. There are multiple solutions for this, in order of importance:
- Make NodeManagers *NOT* blindly heartbeat irrespective of whether the previous heartbeat is processed or not.
- Figure out any obvious bottlenecks in the scheduling code.
- When all else fails, try to parallelize the scheduler dispatcher.

Key: YARN-270
URL: https://issues.apache.org/jira/browse/YARN-270
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 0.23.5
Reporter: Thomas Graves
Assignee: Thomas Graves

We had a couple of incidents on a 2800-node cluster where the RM scheduler event handler thread got behind processing events and basically became unusable. It was still processing apps, but taking a long time (1 hour 45 minutes) to accept new apps. This actually happened twice within 5 days. We are using the capacity scheduler and at the time had between 400 and 500 applications running. There were another 250 apps in the SUBMITTED state in the RM that the scheduler hadn't yet processed into the pending state. We had about 15 queues, none of them hierarchical. We also had plenty of space left on the cluster.
[jira] [Created] (YARN-275) Make NodeManagers NOT blindly heartbeat irrespective of whether the previous heartbeat is processed or not.
Vinod Kumar Vavilapalli created YARN-275:

Summary: Make NodeManagers NOT blindly heartbeat irrespective of whether the previous heartbeat is processed or not.
Key: YARN-275
URL: https://issues.apache.org/jira/browse/YARN-275
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli

We need NMs to back off. The event handler mechanism is very scalable, but not infinitely so :)
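A minimal sketch of the kind of back-off being requested, with hypothetical names: the NM keeps at most one heartbeat in flight and widens its interval when the RM signals that it is overloaded. This is illustrative only, not the eventual YARN-275 patch:

{code:java}
import java.util.concurrent.TimeUnit;

// Illustrative NM heartbeat loop with back-off (not the actual YARN-275
// patch). The NM never has more than one heartbeat in flight, and it
// stretches its interval when the RM reports that it is overloaded.
class HeartbeatLoop {
  interface ResourceTracker {
    // Blocking call; returns true if the RM asks the NM to slow down.
    boolean nodeHeartbeat() throws InterruptedException;
  }

  private static final long BASE_INTERVAL_MS = 1_000;
  private static final long MAX_INTERVAL_MS = 60_000;

  void run(ResourceTracker tracker) throws InterruptedException {
    long interval = BASE_INTERVAL_MS;
    while (!Thread.currentThread().isInterrupted()) {
      // Waiting for the response guarantees the previous heartbeat was
      // processed before the next one is sent -- no blind heartbeating.
      boolean slowDown = tracker.nodeHeartbeat();
      interval = slowDown
          ? Math.min(interval * 2, MAX_INTERVAL_MS)  // exponential back-off
          : BASE_INTERVAL_MS;                        // RM healthy: reset
      TimeUnit.MILLISECONDS.sleep(interval);
    }
  }
}
{code}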
[jira] [Updated] (YARN-275) Make NodeManagers NOT blindly heartbeat irrespective of whether the previous heartbeat is processed or not.
[ https://issues.apache.org/jira/browse/YARN-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-275:

Component/s: nodemanager

Key: YARN-275
URL: https://issues.apache.org/jira/browse/YARN-275
Project: Hadoop YARN
Issue Type: Sub-task
Components: nodemanager, resourcemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong

We need NMs to back off. The event handler mechanism is very scalable, but not infinitely so :)
[jira] [Assigned] (YARN-275) Make NodeManagers NOT blindly heartbeat irrespective of whether the previous heartbeat is processed or not.
[ https://issues.apache.org/jira/browse/YARN-275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli reassigned YARN-275:

Assignee: Xuan Gong (was: Vinod Kumar Vavilapalli)

Xuan, can you please take this up? Thanks.

Key: YARN-275
URL: https://issues.apache.org/jira/browse/YARN-275
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Vinod Kumar Vavilapalli
Assignee: Xuan Gong

We need NMs to back off. The event handler mechanism is very scalable, but not infinitely so :)
[jira] [Assigned] (YARN-198) If we are navigating to the NodeManager UI from the ResourceManager, then there is no link to navigate back to the ResourceManager
[ https://issues.apache.org/jira/browse/YARN-198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli reassigned YARN-198:

Assignee: Senthil V Kumar

Key: YARN-198
URL: https://issues.apache.org/jira/browse/YARN-198
Project: Hadoop YARN
Issue Type: Improvement
Components: nodemanager
Reporter: Ramgopal N
Assignee: Senthil V Kumar
Priority: Minor

When we navigate to the NodeManager by clicking on the node link in the RM, there is no link provided on the NM to navigate back to the RM. It would be good to have such a link.
[jira] [Commented] (YARN-270) RM scheduler event handler thread gets behind
[ https://issues.apache.org/jira/browse/YARN-270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534337#comment-13534337 ]

Nathan Roberts commented on YARN-270:

Could we also add some additional flow control within the RM to prevent this work from getting into the event queues in the first place? Having the clients throttle on their end is important in the short term, but in the long run we need a flow-control strategy that can exert back pressure at all stages of the pipeline.

Key: YARN-270
URL: https://issues.apache.org/jira/browse/YARN-270
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 0.23.5
Reporter: Thomas Graves
Assignee: Thomas Graves

We had a couple of incidents on a 2800-node cluster where the RM scheduler event handler thread got behind processing events and basically became unusable. It was still processing apps, but taking a long time (1 hour 45 minutes) to accept new apps. This actually happened twice within 5 days. We are using the capacity scheduler and at the time had between 400 and 500 applications running. There were another 250 apps in the SUBMITTED state in the RM that the scheduler hadn't yet processed into the pending state. We had about 15 queues, none of them hierarchical. We also had plenty of space left on the cluster.
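The flow control Nathan describes would look roughly like the following, stated with generic java.util.concurrent types rather than YARN's actual dispatcher (which, per the next comment, cannot exert back pressure today):

{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

// Sketch of a dispatcher stage with back pressure: a bounded queue makes
// producers block instead of letting an unbounded backlog build up.
// Illustrative only -- the dispatcher of this era used an unbounded queue,
// which is why producers must throttle themselves.
class BoundedStage<E> {
  private final BlockingQueue<E> queue = new ArrayBlockingQueue<>(10_000);

  // Producers block here when the stage is saturated: back pressure
  // propagates upstream instead of growing the event queue without limit.
  void submit(E event) throws InterruptedException {
    queue.put(event);
  }

  // A single consumer thread drains the queue, like the scheduler's
  // event-handling thread.
  void runConsumer(Consumer<E> handler) throws InterruptedException {
    while (!Thread.currentThread().isInterrupted()) {
      handler.accept(queue.take());
    }
  }
}
{code}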
[jira] [Commented] (YARN-223) Change processTree interface to work better with native code
[ https://issues.apache.org/jira/browse/YARN-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534397#comment-13534397 ]

Radim Kolar commented on YARN-223:

It's a private API:

@InterfaceAudience.Private
@InterfaceStability.Unstable

Key: YARN-223
URL: https://issues.apache.org/jira/browse/YARN-223
Project: Hadoop YARN
Issue Type: Bug
Reporter: Radim Kolar
Assignee: Radim Kolar
Priority: Critical
Attachments: pstree-update4.txt, pstree-update6.txt, pstree-update6.txt

The problem is that every update of the process tree requires a new object. This is undesirable when working with a process-tree implementation in native code. Replace ProcessTree.getProcessTree() with updateProcessTree(): no new object allocation is needed, and it simplifies application code a bit.
[jira] [Updated] (YARN-271) Fair scheduler hits IllegalStateException trying to reserve different apps on same node
[ https://issues.apache.org/jira/browse/YARN-271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-271:

Description: After the fair scheduler reserves a container on a node, it doesn't check for reservations it just made when trying to make more reservations during the same heartbeat.
(was: After the fair scheduler reserves a container on a node, it doesn't check for reservations it just made, when trying to make more reservations during the same heartbeat.)

Key: YARN-271
URL: https://issues.apache.org/jira/browse/YARN-271
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager, scheduler
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Attachments: YARN-271.patch

After the fair scheduler reserves a container on a node, it doesn't check for reservations it just made when trying to make more reservations during the same heartbeat.
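The bug pattern is easiest to see in code. A hedged sketch with illustrative types, not the actual FairScheduler source: during a single node heartbeat the scheduler may try to reserve the node for several apps, and the fix is to re-check the node's reservation state before each attempt:

{code:java}
import java.util.List;

// Sketch of the YARN-271 bug and fix, with illustrative types. Reserving
// one node for two different apps in the same heartbeat is what trips the
// IllegalStateException, so the node's reservation state must be
// re-checked before every reservation attempt within the pass.
class NodeHeartbeatScheduling {
  interface SchedulerNode { Object getReservedContainer(); }
  interface SchedulerApp { void assignOrReserve(SchedulerNode node); }

  void onNodeHeartbeat(SchedulerNode node, List<SchedulerApp> sortedApps) {
    for (SchedulerApp app : sortedApps) {
      // Fix: stop once *any* app has reserved this node during this same
      // heartbeat; the buggy code only consulted state from before the pass.
      if (node.getReservedContainer() != null) {
        break;
      }
      app.assignOrReserve(node);
    }
  }
}
{code}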
[jira] [Created] (YARN-276) Capacity Scheduler can hang when many jobs are submitted concurrently
nemon lou created YARN-276:

Summary: Capacity Scheduler can hang when many jobs are submitted concurrently
Key: YARN-276
URL: https://issues.apache.org/jira/browse/YARN-276
Project: Hadoop YARN
Issue Type: Bug
Components: capacityscheduler
Affects Versions: 2.0.1-alpha, 3.0.0
Reporter: nemon lou

In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang with most resources taken up by AMs, leaving too few resources for tasks; all applications then hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from the minimum allocation (not from what the AMs actually use).
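A quick worked example of why deriving the limit from the minimum allocation under-constrains AMs; the numbers are made up and the formula paraphrases the CapacityScheduler logic of this era:

{code:java}
// Illustrative arithmetic for this report: maxActiveApplications is derived
// from the minimum allocation, not from what AMs actually request, so large
// AMs can together take far more of the cluster than the percent suggests.
public class AmLimitExample {
  public static void main(String[] args) {
    int clusterMemMb = 10 * 8192;       // e.g. 10 nodes x 8 GB
    int minAllocationMb = 1024;         // yarn.scheduler.minimum-allocation-mb
    double amResourcePercent = 0.1;     // maximum-am-resource-percent

    // Paraphrased formula: percent * cluster / minimum allocation.
    int maxActiveApps =
        (int) Math.ceil(amResourcePercent * clusterMemMb / minAllocationMb);
    System.out.println("maxActiveApplications = " + maxActiveApps);  // 8

    // But if each AM actually needs 2 GB, 8 active apps pin 16 GB of AM
    // containers -- 20% of the cluster, double what the percent implies.
    System.out.println("actual AM usage = " + (maxActiveApps * 2048) + " MB");
  }
}
{code}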
[jira] [Updated] (YARN-276) Capacity Scheduler can hang when many jobs are submitted concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nemon lou updated YARN-276:

Attachment: YARN-276.patch

Key: YARN-276
URL: https://issues.apache.org/jira/browse/YARN-276
Project: Hadoop YARN
Issue Type: Bug
Components: capacityscheduler
Affects Versions: 3.0.0, 2.0.1-alpha
Reporter: nemon lou
Attachments: YARN-276.patch
Original Estimate: 24h
Remaining Estimate: 24h

In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang with most resources taken up by AMs, leaving too few resources for tasks; all applications then hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from the minimum allocation (not from what the AMs actually use).
[jira] [Commented] (YARN-276) Capacity Scheduler can hang when many jobs are submitted concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534585#comment-13534585 ]

Hadoop QA commented on YARN-276:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12561406/YARN-276.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test file.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:
  org.apache.hadoop.yarn.server.resourcemanager.TestApplicationCleanup
  org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits
  org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCNodeUpdates
  org.apache.hadoop.yarn.server.resourcemanager.TestApplicationMasterLauncher
  org.apache.hadoop.yarn.server.resourcemanager.TestRM
  org.apache.hadoop.yarn.server.resourcemanager.security.TestApplicationTokens
  org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCResponseId
  org.apache.hadoop.yarn.server.resourcemanager.TestAMAuthorization
  org.apache.hadoop.yarn.server.resourcemanager.security.TestClientTokens
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/227//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/227//console

Key: YARN-276
URL: https://issues.apache.org/jira/browse/YARN-276
Project: Hadoop YARN
Issue Type: Bug
Components: capacityscheduler
Affects Versions: 3.0.0, 2.0.1-alpha
Reporter: nemon lou
Attachments: YARN-276.patch
Original Estimate: 24h
Remaining Estimate: 24h

In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang with most resources taken up by AMs, leaving too few resources for tasks; all applications then hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from the minimum allocation (not from what the AMs actually use).
[jira] [Commented] (YARN-270) RM scheduler event handler thread gets behind
[ https://issues.apache.org/jira/browse/YARN-270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534586#comment-13534586 ]

Vinod Kumar Vavilapalli commented on YARN-270:

Nathan, unfortunately the dispatcher framework cannot exert back pressure in general; each event producer needs to control itself. OTOH, YARN-275 is indeed a long-term fix: NMs back off just like the TTs do in 1.*.

Key: YARN-270
URL: https://issues.apache.org/jira/browse/YARN-270
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 0.23.5
Reporter: Thomas Graves
Assignee: Thomas Graves

We had a couple of incidents on a 2800-node cluster where the RM scheduler event handler thread got behind processing events and basically became unusable. It was still processing apps, but taking a long time (1 hour 45 minutes) to accept new apps. This actually happened twice within 5 days. We are using the capacity scheduler and at the time had between 400 and 500 applications running. There were another 250 apps in the SUBMITTED state in the RM that the scheduler hadn't yet processed into the pending state. We had about 15 queues, none of them hierarchical. We also had plenty of space left on the cluster.
[jira] [Updated] (YARN-272) Fair scheduler log messages try to print objects without overridden toString methods
[ https://issues.apache.org/jira/browse/YARN-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-272:

Attachment: YARN-272.patch

Key: YARN-272
URL: https://issues.apache.org/jira/browse/YARN-272
Project: Hadoop YARN
Issue Type: Bug
Components: scheduler
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Attachments: YARN-272.patch

A lot of junk gets printed out like this:

2012-12-11 17:31:52,998 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp: Application application_1355270529654_0003 reserved container org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl@324f0f97 on node host: c1416.hal.cloudera.com:46356 #containers=7 available=0 used=8192, currently has 4 at priority org.apache.hadoop.yarn.api.records.impl.pb.PriorityPBImpl@33; currentReservation 4096
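The usual fix for this class of log noise is overriding toString() on the objects that end up in log statements, so they print their fields instead of ClassName@hashCode. A generic sketch (illustrative class, not the actual PriorityPBImpl/RMContainerImpl patch):

{code:java}
// Without an overridden toString(), logging an object falls back to
// ClassName@hexHashCode (e.g. RMContainerImpl@324f0f97). Overriding it is
// the whole fix pattern; this class is a generic illustration.
class Priority {
  private final int priority;

  Priority(int priority) { this.priority = priority; }

  @Override
  public String toString() {
    // Logs as "{Priority: 20}" instead of "Priority@33".
    return "{Priority: " + priority + "}";
  }

  public static void main(String[] args) {
    System.out.println("reserved at priority " + new Priority(20));
  }
}
{code}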
[jira] [Commented] (YARN-271) Fair scheduler hits IllegalStateException trying to reserve different apps on same node
[ https://issues.apache.org/jira/browse/YARN-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534624#comment-13534624 ]

Sandy Ryza commented on YARN-271:

The last patch doesn't fix the problem entirely. Added one that does and includes a test.

Key: YARN-271
URL: https://issues.apache.org/jira/browse/YARN-271
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager, scheduler
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Attachments: YARN-271-1.patch, YARN-271.patch

After the fair scheduler reserves a container on a node, it doesn't check for reservations it just made when trying to make more reservations during the same heartbeat.
[jira] [Updated] (YARN-271) Fair scheduler hits IllegalStateException trying to reserve different apps on same node
[ https://issues.apache.org/jira/browse/YARN-271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-271:

Attachment: YARN-271-1.patch

Key: YARN-271
URL: https://issues.apache.org/jira/browse/YARN-271
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager, scheduler
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Attachments: YARN-271-1.patch, YARN-271.patch

After the fair scheduler reserves a container on a node, it doesn't check for reservations it just made when trying to make more reservations during the same heartbeat.
[jira] [Commented] (YARN-271) Fair scheduler hits IllegalStateException trying to reserve different apps on same node
[ https://issues.apache.org/jira/browse/YARN-271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534636#comment-13534636 ]

Hadoop QA commented on YARN-271:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12561420/YARN-271-1.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 1 new or modified test file.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/229//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/229//console

Key: YARN-271
URL: https://issues.apache.org/jira/browse/YARN-271
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager, scheduler
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Attachments: YARN-271-1.patch, YARN-271.patch

After the fair scheduler reserves a container on a node, it doesn't check for reservations it just made when trying to make more reservations during the same heartbeat.
[jira] [Commented] (YARN-3) Add support for CPU isolation/monitoring of containers
[ https://issues.apache.org/jira/browse/YARN-3?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534640#comment-13534640 ]

Vinod Kumar Vavilapalli commented on YARN-3:

Did a quick review (incremental review, trusting my previous self :) ). Looks good; let's track the pending items separately. Triggering Jenkins on YARN-147 and will close the tickets once blessed.

Key: YARN-3
URL: https://issues.apache.org/jira/browse/YARN-3
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun C Murthy
Assignee: Andrew Ferguson
Attachments: mapreduce-4334-design-doc.txt, mapreduce-4334-design-doc-v2.txt, MAPREDUCE-4334-executor-v1.patch, MAPREDUCE-4334-executor-v2.patch, MAPREDUCE-4334-executor-v3.patch, MAPREDUCE-4334-executor-v4.patch, MAPREDUCE-4334-pre1.patch, MAPREDUCE-4334-pre2.patch, MAPREDUCE-4334-pre2-with_cpu.patch, MAPREDUCE-4334-pre3.patch, MAPREDUCE-4334-pre3-with_cpu.patch, MAPREDUCE-4334-v1.patch, MAPREDUCE-4334-v2.patch, YARN-3-lce_only-v1.patch
[jira] [Commented] (YARN-147) Add support for CPU isolation/monitoring of containers
[ https://issues.apache.org/jira/browse/YARN-147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534645#comment-13534645 ]

Hadoop QA commented on YARN-147:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12555695/YARN-147-v8.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 2 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
-1 findbugs. The patch appears to introduce 1 new Findbugs (version 1.3.9) warning.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
  org.apache.hadoop.yarn.server.nodemanager.TestLinuxContainerExecutorWithMocks
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/230//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/230//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/230//console

Key: YARN-147
URL: https://issues.apache.org/jira/browse/YARN-147
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.0.3-alpha
Reporter: Alejandro Abdelnur
Assignee: Andrew Ferguson
Fix For: 2.0.3-alpha
Attachments: YARN-147-v1.patch, YARN-147-v2.patch, YARN-147-v3.patch, YARN-147-v4.patch, YARN-147-v5.patch, YARN-147-v6.patch, YARN-147-v8.patch, YARN-3.patch

This is a clone of YARN-3, created to be able to submit the patch, as YARN-3 does not show the SUBMIT PATCH button.
[jira] [Commented] (YARN-3) Add support for CPU isolation/monitoring of containers
[ https://issues.apache.org/jira/browse/YARN-3?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534648#comment-13534648 ]

Vinod Kumar Vavilapalli commented on YARN-3:

Andrew, can you please look at the FindBugs and test-case issues at YARN-147? Let's try and get this in tomorrow. Also, can you please find the pending issues? I can file any that I know of tomorrow. Tx.

Key: YARN-3
URL: https://issues.apache.org/jira/browse/YARN-3
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Arun C Murthy
Assignee: Andrew Ferguson
Attachments: mapreduce-4334-design-doc.txt, mapreduce-4334-design-doc-v2.txt, MAPREDUCE-4334-executor-v1.patch, MAPREDUCE-4334-executor-v2.patch, MAPREDUCE-4334-executor-v3.patch, MAPREDUCE-4334-executor-v4.patch, MAPREDUCE-4334-pre1.patch, MAPREDUCE-4334-pre2.patch, MAPREDUCE-4334-pre2-with_cpu.patch, MAPREDUCE-4334-pre3.patch, MAPREDUCE-4334-pre3-with_cpu.patch, MAPREDUCE-4334-v1.patch, MAPREDUCE-4334-v2.patch, YARN-3-lce_only-v1.patch
[jira] [Commented] (YARN-103) Add a yarn AM - RM client module
[ https://issues.apache.org/jira/browse/YARN-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534703#comment-13534703 ]

Hadoop QA commented on YARN-103:

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12559906/YARN-103.7.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 2 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/231//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/231//console

Key: YARN-103
URL: https://issues.apache.org/jira/browse/YARN-103
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Bikas Saha
Assignee: Bikas Saha
Attachments: YARN-103.1.patch, YARN-103.2.patch, YARN-103.3.patch, YARN-103.4.patch, YARN-103.4.wrapper.patch, YARN-103.5.patch, YARN-103.6.patch, YARN-103.7.patch

Add a basic client wrapper library to the AM-RM protocol in order to prevent proliferation of duplicated code everywhere. Provide helper functions to perform reverse mapping of container requests to the RM allocation resource-request table format.
[jira] [Commented] (YARN-103) Add a yarn AM - RM client module
[ https://issues.apache.org/jira/browse/YARN-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534733#comment-13534733 ]

Siddharth Seth commented on YARN-103:

Apologies for taking ages to look at this. Some minor stuff pending:
- AMRMClient JavaDoc: 1) the javadoc for the ContainerRequest class has some typos and needs to be punctuated (instead of newlines); 2) the allocate javadoc makes a reference to makeContainerRequest, which is now called addContainerRequest. Also, it'll be useful to mention the reboot flag, which may be sent as part of the response.
- AMRMClientImpl unregisterApplicationMaster: setAppAttemptId doesn't need to be in a synchronized block.
- AMRMClientImpl: the add/decContainerRequest rack null checks need fixing (host instead of rack).
- AMRMClientImpl.addResourceRequestToAsk: I am not sure why this method is needed. A simple synchronized asks.add should be sufficient.

Also, I would prefer the DistributedShell changes in a separate JIRA, just to keep this patch clean. Breaking that out of the current patch should be simple enough.

Key: YARN-103
URL: https://issues.apache.org/jira/browse/YARN-103
Project: Hadoop YARN
Issue Type: Improvement
Reporter: Bikas Saha
Assignee: Bikas Saha
Attachments: YARN-103.1.patch, YARN-103.2.patch, YARN-103.3.patch, YARN-103.4.patch, YARN-103.4.wrapper.patch, YARN-103.5.patch, YARN-103.6.patch, YARN-103.7.patch

Add a basic client wrapper library to the AM-RM protocol in order to prevent proliferation of duplicated code everywhere. Provide helper functions to perform reverse mapping of container requests to the RM allocation resource-request table format.
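For context, the wrapper under review centers on an addContainerRequest/allocate loop. A hedged usage sketch; the method names come from the review comments above, but the exact signatures in the committed YARN-103 API may differ:

{code:java}
import java.util.List;

// Illustrative AM-side loop against an AMRMClient-style wrapper. The
// interfaces below paraphrase the API under review; they are not the
// final YARN-103 signatures.
class AppMasterSketch {
  interface AMRMClient {
    void registerApplicationMaster(String host, int port, String trackingUrl);
    void addContainerRequest(Object request);  // formerly makeContainerRequest
    AllocateResponse allocate(float progress); // doubles as the AM heartbeat
    void unregisterApplicationMaster(String finalStatus, String message, String url);
  }

  interface AllocateResponse {
    boolean getReboot();                       // the reboot flag noted above
    List<Object> getAllocatedContainers();
  }

  void run(AMRMClient client, Object containerRequest) throws InterruptedException {
    client.registerApplicationMaster("am-host", 0, "");
    client.addContainerRequest(containerRequest);
    while (true) {
      AllocateResponse response = client.allocate(0.5f);
      if (response.getReboot()) {
        break;  // RM restarted: resync or shut down
      }
      if (!response.getAllocatedContainers().isEmpty()) {
        break;  // launch containers here
      }
      Thread.sleep(1000);
    }
    client.unregisterApplicationMaster("SUCCEEDED", "", "");
  }
}
{code}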