[jira] [Resolved] (YARN-1231) Fix test cases that will hit max- am-used-resources-percent limit after YARN-276
[ https://issues.apache.org/jira/browse/YARN-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nemon Lou resolved YARN-1231.
Resolution: Won't Fix

YARN-2637 has fixed the problem described in YARN-276, so this ticket no longer needs to be fixed.

Fix test cases that will hit max- am-used-resources-percent limit after YARN-276
Key: YARN-1231
URL: https://issues.apache.org/jira/browse/YARN-1231
Project: Hadoop YARN
Issue Type: Task
Affects Versions: 2.1.1-beta
Reporter: Nemon Lou
Assignee: Nemon Lou
Labels: test
Attachments: YARN-1231.patch

Use a separate JIRA to fix YARN's test cases that will fail by hitting the max- am-used-resources-percent limit after YARN-276.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1033) Expose RM active/standby state to Web UI and REST API
[ https://issues.apache.org/jira/browse/YARN-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867413#comment-13867413 ]

Nemon Lou commented on YARN-1033:

Thanks, Karthik Kambatla. You are really efficient. +1 (non-binding). Agree that HA state in JMX can be added later in another JIRA when needed.

Expose RM active/standby state to Web UI and REST API
Key: YARN-1033
URL: https://issues.apache.org/jira/browse/YARN-1033
Project: Hadoop YARN
Issue Type: Sub-task
Affects Versions: 2.1.0-beta
Reporter: Nemon Lou
Assignee: Karthik Kambatla
Attachments: yarn-1033-1.patch

Both the active and the standby RM shall expose their web servers and show their current state (active or standby) on the web page. Users should be able to access this information through the REST API as well.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (YARN-1033) Expose RM active/standby state to web UI and metrics
[ https://issues.apache.org/jira/browse/YARN-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nemon Lou updated YARN-1033:
Assignee: Karthik Kambatla (was: Nemon Lou)

Expose RM active/standby state to web UI and metrics
Key: YARN-1033
URL: https://issues.apache.org/jira/browse/YARN-1033
Project: Hadoop YARN
Issue Type: Sub-task
Affects Versions: 2.1.0-beta
Reporter: Nemon Lou
Assignee: Karthik Kambatla

Both the active and the standby RM shall expose their web servers and show their current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. Standby RM web services shall refuse client requests unless they query for the RM state.
[jira] [Commented] (YARN-1033) Expose RM active/standby state to web UI and metrics
[ https://issues.apache.org/jira/browse/YARN-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864948#comment-13864948 ]

Nemon Lou commented on YARN-1033:

Hi, Karthik Kambatla. Feel free to take it. : ) Thanks

Expose RM active/standby state to web UI and metrics
Key: YARN-1033
URL: https://issues.apache.org/jira/browse/YARN-1033
Project: Hadoop YARN
Issue Type: Sub-task
Affects Versions: 2.1.0-beta
Reporter: Nemon Lou
Assignee: Nemon Lou

Both the active and the standby RM shall expose their web servers and show their current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. Standby RM web services shall refuse client requests unless they query for the RM state.
[jira] [Updated] (YARN-1231) Fix test cases that will hit max- am-used-resources-percent limit after YARN-276
[ https://issues.apache.org/jira/browse/YARN-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nemon Lou updated YARN-1231:
Attachment: YARN-1231.patch

A patch fixing test cases in the hadoop-yarn-server-resourcemanager project.

Fix test cases that will hit max- am-used-resources-percent limit after YARN-276
Key: YARN-1231
URL: https://issues.apache.org/jira/browse/YARN-1231
Project: Hadoop YARN
Issue Type: Task
Affects Versions: 2.1.1-beta
Reporter: Nemon Lou
Assignee: Nemon Lou
Labels: test
Attachments: YARN-1231.patch

Use a separate JIRA to fix YARN's test cases that will fail by hitting the max- am-used-resources-percent limit after YARN-276.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-1231) Fix test cases that will hit max- am-used-resources-percent limit after YARN-276
Nemon Lou created YARN-1231:
Summary: Fix test cases that will hit max- am-used-resources-percent limit after YARN-276
Key: YARN-1231
URL: https://issues.apache.org/jira/browse/YARN-1231
Project: Hadoop YARN
Issue Type: Task
Affects Versions: 2.1.1-beta
Reporter: Nemon Lou
Assignee: Nemon Lou

Use a separate JIRA to fix YARN's test cases that will fail by hitting the max- am-used-resources-percent limit after YARN-276.
[jira] [Created] (YARN-1196) LocalDirsHandlerService never change failedDirs back to normal even when these disks turn good
Nemon Lou created YARN-1196:
Summary: LocalDirsHandlerService never change failedDirs back to normal even when these disks turn good
Key: YARN-1196
URL: https://issues.apache.org/jira/browse/YARN-1196
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Nemon Lou

A simple way to reproduce it:
1. Change the access mode of one node manager's local-dirs to 000. After a few seconds, this node manager will become unhealthy.
2. Change the access mode of one node manager's local-dirs back to normal. The node manager is still unhealthy, with all local-dirs in a bad state, even after a long time.
[jira] [Updated] (YARN-1196) LocalDirsHandlerService never change failedDirs back to normal even when these disks turn good
[ https://issues.apache.org/jira/browse/YARN-1196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nemon Lou updated YARN-1196:
Description:
A simple way to reproduce it:
1. Change the access mode of one node manager's local-dirs to 000. After a few seconds, this node manager will become unhealthy.
2. Change the access mode of the node manager's local-dirs back to normal. The node manager is still unhealthy, with all local-dirs in a bad state, even after a long time.

was:
A simple way to reproduce it:
1. Change the access mode of one node manager's local-dirs to 000. After a few seconds, this node manager will become unhealthy.
2. Change the access mode of one node manager's local-dirs back to normal. The node manager is still unhealthy, with all local-dirs in a bad state, even after a long time.

LocalDirsHandlerService never change failedDirs back to normal even when these disks turn good
Key: YARN-1196
URL: https://issues.apache.org/jira/browse/YARN-1196
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Affects Versions: 2.1.1-beta
Reporter: Nemon Lou

A simple way to reproduce it:
1. Change the access mode of one node manager's local-dirs to 000. After a few seconds, this node manager will become unhealthy.
2. Change the access mode of the node manager's local-dirs back to normal. The node manager is still unhealthy, with all local-dirs in a bad state, even after a long time.
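The behaviour the report asks for, where a directory that failed earlier is tested again on every health check so it can return to the good list, might look roughly like the sketch below. `DirHealthTracker` and its methods are hypothetical names chosen for illustration; this is not the actual LocalDirsHandlerService code, and the usability test is an assumed simplification.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tracker for illustration; not the NodeManager implementation. */
class DirHealthTracker {
    private final List<File> goodDirs = new ArrayList<>();
    private final List<File> failedDirs = new ArrayList<>();

    DirHealthTracker(List<File> dirs) {
        goodDirs.addAll(dirs);
    }

    /** Re-tests every directory, including ones that failed on an earlier check. */
    synchronized void checkDirs() {
        List<File> all = new ArrayList<>(goodDirs);
        all.addAll(failedDirs);
        goodDirs.clear();
        failedDirs.clear();
        for (File dir : all) {
            if (isUsable(dir)) {
                goodDirs.add(dir);
            } else {
                failedDirs.add(dir);
            }
        }
    }

    /** Assumed health test: the path is a directory the process can read, write and enter. */
    static boolean isUsable(File dir) {
        return dir.isDirectory() && dir.canRead() && dir.canWrite() && dir.canExecute();
    }

    synchronized List<File> getGoodDirs() {
        return new ArrayList<>(goodDirs);
    }

    synchronized List<File> getFailedDirs() {
        return new ArrayList<>(failedDirs);
    }
}
```

Because `checkDirs()` starts from the union of both lists, a directory whose permissions are restored moves back to the good list on the next check instead of staying failed forever.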
[jira] [Commented] (YARN-938) Hadoop 2 benchmarking
[ https://issues.apache.org/jira/browse/YARN-938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13763838#comment-13763838 ]

Nemon Lou commented on YARN-938:

Thanks, Mayank Bansal, for your work. Do you mind sharing how much input data you ran for TeraSort?

Hadoop 2 benchmarking
Key: YARN-938
URL: https://issues.apache.org/jira/browse/YARN-938
Project: Hadoop YARN
Issue Type: Task
Reporter: Mayank Bansal
Assignee: Mayank Bansal
Attachments: Hadoop-benchmarking-2.x-vs-1.x.xls

I am running the benchmarks on Hadoop 2 and will update the results soon. Thanks, Mayank
[jira] [Commented] (YARN-842) Resource Manager Node Manager UI's doesn't work with IE
[ https://issues.apache.org/jira/browse/YARN-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761600#comment-13761600 ]

Nemon Lou commented on YARN-842:

Following Harsh J's advice, a different error occurs:
{code}
user agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; aff-kingsoft-ciba)
timestamp: Mon, 9 Sep 2013 03:45:49 UTC
message: Object doesn't support this property or method
line: 652
char: 21
code: 0
URI: http://158.1.131.13:8088/static/jt/jquery.jstree.js
{code}
Any suggestions?

Resource Manager Node Manager UI's doesn't work with IE
Key: YARN-842
URL: https://issues.apache.org/jira/browse/YARN-842
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager, resourcemanager
Affects Versions: 2.0.4-alpha
Reporter: Devaraj K

{code:xml}
Webpage error details
User Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0)
Timestamp: Mon, 17 Jun 2013 12:06:03 UTC
Message: 'JSON' is undefined
Line: 41
Char: 218
Code: 0
URI: http://10.18.40.24:8088/cluster/apps
{code}
The RM and NM UIs do not work with IE and show the above error for every link on the UI.
[jira] [Commented] (YARN-292) ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt
[ https://issues.apache.org/jira/browse/YARN-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750861#comment-13750861 ]

Nemon Lou commented on YARN-292:

I will try to post my test results after applying this patch when I have time. I have no idea about the test case part.

ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt
Key: YARN-292
URL: https://issues.apache.org/jira/browse/YARN-292
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 2.0.1-alpha
Reporter: Devaraj K
Assignee: Zhijie Shen
Attachments: YARN-292.1.patch, YARN-292.2.patch, YARN-292.3.patch

{code:xml}
2012-12-26 08:41:15,030 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Calling allocate on removed or non existant application appattempt_1356385141279_49525_01
2012-12-26 08:41:15,031 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type CONTAINER_ALLOCATED for applicationAttempt application_1356385141279_49525
java.lang.ArrayIndexOutOfBoundsException: 0
at java.util.Arrays$ArrayList.get(Arrays.java:3381)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:655)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:644)
at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:490)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:80)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:433)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:414)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
at java.lang.Thread.run(Thread.java:662)
{code}
[jira] [Commented] (YARN-292) ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt
[ https://issues.apache.org/jira/browse/YARN-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742021#comment-13742021 ]

nemon lou commented on YARN-292:

Thanks, Zhijie Shen, for your update. Do you plan to add some test cases for it? I think the test part will be the most difficult one.

ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt
Key: YARN-292
URL: https://issues.apache.org/jira/browse/YARN-292
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 2.0.1-alpha
Reporter: Devaraj K
Assignee: Zhijie Shen
Attachments: YARN-292.1.patch, YARN-292.2.patch, YARN-292.3.patch

{code:xml}
2012-12-26 08:41:15,030 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Calling allocate on removed or non existant application appattempt_1356385141279_49525_01
2012-12-26 08:41:15,031 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type CONTAINER_ALLOCATED for applicationAttempt application_1356385141279_49525
java.lang.ArrayIndexOutOfBoundsException: 0
at java.util.Arrays$ArrayList.get(Arrays.java:3381)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:655)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:644)
at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:490)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:80)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:433)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:414)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
at java.lang.Thread.run(Thread.java:662)
{code}
[jira] [Commented] (YARN-292) ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt
[ https://issues.apache.org/jira/browse/YARN-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13740880#comment-13740880 ]

nemon lou commented on YARN-292:

The FIFO scheduler uses a TreeMap to keep applications in FIFO order; ConcurrentHashMap will break this feature, right?

ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt
Key: YARN-292
URL: https://issues.apache.org/jira/browse/YARN-292
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 2.0.1-alpha
Reporter: Devaraj K
Assignee: Zhijie Shen
Attachments: YARN-292.1.patch, YARN-292.2.patch

{code:xml}
2012-12-26 08:41:15,030 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Calling allocate on removed or non existant application appattempt_1356385141279_49525_01
2012-12-26 08:41:15,031 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type CONTAINER_ALLOCATED for applicationAttempt application_1356385141279_49525
java.lang.ArrayIndexOutOfBoundsException: 0
at java.util.Arrays$ArrayList.get(Arrays.java:3381)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:655)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:644)
at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:490)
at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:80)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:433)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:414)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
at java.lang.Thread.run(Thread.java:662)
{code}
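The ordering concern raised in the comment can be illustrated with plain JDK maps. The sketch below uses integer keys as stand-ins for application ids (an assumption made for brevity) and is not taken from any YARN-292 patch. TreeMap and ConcurrentSkipListMap both iterate in key order; ConcurrentHashMap documents no iteration order, which is why it is left out of the demonstration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

class FifoOrderDemo {
    /** Inserts ids out of order and returns the order in which the map iterates them. */
    static List<Integer> iterationOrder(Map<Integer, String> apps) {
        apps.put(3, "app_3");
        apps.put(1, "app_1");
        apps.put(2, "app_2");
        return new ArrayList<>(apps.keySet());
    }

    public static void main(String[] args) {
        // Sorted, but not safe for concurrent modification.
        System.out.println(iterationOrder(new TreeMap<Integer, String>()));
        // Sorted and safe for concurrent access.
        System.out.println(iterationOrder(new ConcurrentSkipListMap<Integer, String>()));
    }
}
```

Both calls print `[1, 2, 3]`. If thread safety is the goal, a sorted concurrent map is one possible way to keep submission order; whether it suits the scheduler is a separate question.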
[jira] [Updated] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nemon lou updated YARN-1027:
Assignee: Karthik Kambatla (was: nemon lou)

Implement RMHAServiceProtocol
Key: YARN-1027
URL: https://issues.apache.org/jira/browse/YARN-1027
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Karthik Kambatla

Implement existing HAServiceProtocol from Hadoop common. This protocol is the single point of interaction between the RM and HA clients/services.
[jira] [Created] (YARN-1033) Expose RM active/standby state to web UI and metrics
nemon lou created YARN-1033:
Summary: Expose RM active/standby state to web UI and metrics
Key: YARN-1033
URL: https://issues.apache.org/jira/browse/YARN-1033
Project: Hadoop YARN
Issue Type: Sub-task
Affects Versions: 2.1.0-beta
Reporter: nemon lou

Both the active and the standby RM shall expose their web servers and show their current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. RM web services shall refuse client requests unless they query for the RM state.
[jira] [Updated] (YARN-1033) Expose RM active/standby state to web UI and metrics
[ https://issues.apache.org/jira/browse/YARN-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nemon lou updated YARN-1033:
Assignee: nemon lou

Expose RM active/standby state to web UI and metrics
Key: YARN-1033
URL: https://issues.apache.org/jira/browse/YARN-1033
Project: Hadoop YARN
Issue Type: Sub-task
Affects Versions: 2.1.0-beta
Reporter: nemon lou
Assignee: nemon lou

Both the active and the standby RM shall expose their web servers and show their current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. RM web services shall refuse client requests unless they query for the RM state.
[jira] [Updated] (YARN-1033) Expose RM active/standby state to web UI and metrics
[ https://issues.apache.org/jira/browse/YARN-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nemon lou updated YARN-1033:
Description:
Both the active and the standby RM shall expose their web servers and show their current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. Standby RM web services shall refuse client requests unless they query for the RM state.

was:
Both the active and the standby RM shall expose their web servers and show their current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. RM web services shall refuse client requests unless they query for the RM state.

Expose RM active/standby state to web UI and metrics
Key: YARN-1033
URL: https://issues.apache.org/jira/browse/YARN-1033
Project: Hadoop YARN
Issue Type: Sub-task
Affects Versions: 2.1.0-beta
Reporter: nemon lou
Assignee: nemon lou

Both the active and the standby RM shall expose their web servers and show their current state (active or standby) on the web page. Cluster metrics also need this state for monitoring. Standby RM web services shall refuse client requests unless they query for the RM state.
[jira] [Commented] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729348#comment-13729348 ]

nemon lou commented on YARN-1027:

I had also started working on this since it was unassigned. It's OK for you to take it up; I will review the patch :)

Implement RMHAServiceProtocol
Key: YARN-1027
URL: https://issues.apache.org/jira/browse/YARN-1027
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: nemon lou

Implement existing HAServiceProtocol from Hadoop common. This protocol is the single point of interaction between the RM and HA clients/services.
[jira] [Updated] (YARN-1027) Implement RMHAServiceProtocol
[ https://issues.apache.org/jira/browse/YARN-1027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nemon lou updated YARN-1027:
Assignee: nemon lou

Implement RMHAServiceProtocol
Key: YARN-1027
URL: https://issues.apache.org/jira/browse/YARN-1027
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: nemon lou

Implement existing HAServiceProtocol from Hadoop common. This protocol is the single point of interaction between the RM and HA clients/services.
[jira] [Updated] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nemon lou updated YARN-276:
Attachment: YARN-276.patch

Capacity Scheduler can hang when submit many jobs concurrently
Key: YARN-276
URL: https://issues.apache.org/jira/browse/YARN-276
Project: Hadoop YARN
Issue Type: Bug
Components: capacityscheduler
Affects Versions: 3.0.0, 2.0.1-alpha
Reporter: nemon lou
Assignee: nemon lou
Labels: incompatible
Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch
Original Estimate: 24h
Remaining Estimate: 24h

In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang with most resources taken up by AMs, leaving too few resources for tasks. Then all applications hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from minimumAllocation (not from the resources the AMs actually use).
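A small worked example may make the reported cause easier to follow. The formula below is an assumed simplification of how an application-count limit could be derived from minimum-allocation-sized slots; it is not copied from the Capacity Scheduler source, and the cluster size, AM size and the 0.25 percentage are made-up numbers.

```java
class AmLimitSketch {
    /** Assumed simplification: the limit counts slots of minimum-allocation size. */
    static int maxActiveApplications(int clusterMb, int minAllocMb, double maxAmPercent) {
        double slots = clusterMb / (double) minAllocMb;
        return Math.max((int) Math.ceil(slots * maxAmPercent), 1);
    }

    /** Fraction of the cluster held by AMs when each active application's AM uses amMb. */
    static double actualAmShare(int activeApps, int amMb, int clusterMb) {
        return (activeApps * (double) amMb) / clusterMb;
    }

    public static void main(String[] args) {
        int clusterMb = 81920;  // hypothetical 80 GB cluster
        int minAllocMb = 1024;  // hypothetical minimum allocation
        int amMb = 2048;        // hypothetical AM container size, twice the minimum
        int limit = maxActiveApplications(clusterMb, minAllocMb, 0.25);
        System.out.println("max active applications: " + limit);
        System.out.println("share of cluster used by AMs: " + actualAmShare(limit, amMb, clusterMb));
    }
}
```

With these numbers the limit is 20 applications, yet 20 AMs of 2048 MB occupy half of the cluster, twice the configured 25 percent. The larger the AM is relative to the minimum allocation, the further actual usage drifts from the configured percentage.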
[jira] [Created] (YARN-764) blank Used Resources on Capacity Scheduler page
nemon lou created YARN-764:
Summary: blank Used Resources on Capacity Scheduler page
Key: YARN-764
URL: https://issues.apache.org/jira/browse/YARN-764
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.0.4-alpha
Reporter: nemon lou
Assignee: nemon lou

Even when there are jobs running, Used Resources is empty on the Capacity Scheduler page for a leaf queue. (I use Google Chrome on Windows 7.) After changing Resource.java's toString method by replacing <> with {}, this bug gets fixed.
[jira] [Updated] (YARN-764) blank Used Resources on Capacity Scheduler page
[ https://issues.apache.org/jira/browse/YARN-764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nemon lou updated YARN-764:
Attachment: YARN-764.patch

No test case added since it's only a symbol change in toString().

blank Used Resources on Capacity Scheduler page
Key: YARN-764
URL: https://issues.apache.org/jira/browse/YARN-764
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.0.4-alpha
Reporter: nemon lou
Assignee: nemon lou
Attachments: YARN-764.patch

Even when there are jobs running, Used Resources is empty on the Capacity Scheduler page for a leaf queue. (I use Google Chrome on Windows 7.) After changing Resource.java's toString method by replacing <> with {}, this bug gets fixed.
[jira] [Updated] (YARN-764) blank Used Resources on Capacity Scheduler page
[ https://issues.apache.org/jira/browse/YARN-764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nemon lou updated YARN-764:
Attachment: YARN-764.patch

Changed the patch as Thomas suggested; escaping HTML is definitely a better way.

blank Used Resources on Capacity Scheduler page
Key: YARN-764
URL: https://issues.apache.org/jira/browse/YARN-764
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.0.4-alpha
Reporter: nemon lou
Assignee: nemon lou
Attachments: YARN-764.patch, YARN-764.patch

Even when there are jobs running, Used Resources is empty on the Capacity Scheduler page for a leaf queue. (I use Google Chrome on Windows 7.) After changing Resource.java's toString method by replacing <> with {}, this bug gets fixed.
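The HTML-escaping approach mentioned in the comment can be sketched as follows. It assumes that Resource.toString() wraps its output in angle brackets, so a browser treats the unescaped value as an unknown tag and renders nothing. The `escapeHtml` helper is hand-written for illustration only; it is not the utility used in the attached patch.

```java
class ResourceHtmlDemo {
    /** Minimal escaper for illustration; production code would normally use a library utility. */
    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            switch (c) {
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '&': out.append("&amp;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed shape of the toString() output; the values are made up.
        String used = "<memory:2048, vCores:2>";
        System.out.println(escapeHtml(used));
    }
}
```

Escaping at render time keeps toString() unchanged for logs and other callers, which is presumably why it was preferred over changing the bracket characters.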
[jira] [Updated] (YARN-606) negative queue metrics apps Failed
[ https://issues.apache.org/jira/browse/YARN-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

nemon lou updated YARN-606:
Assignee: nemon lou

negative queue metrics apps Failed
Key: YARN-606
URL: https://issues.apache.org/jira/browse/YARN-606
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: nemon lou
Assignee: nemon lou
Priority: Minor

The queue metric apps Failed can be negative in some cases (more than one attempt for an application can cause this). It's confusing if we use this metric directly.
[jira] [Commented] (YARN-513) Verify all clients will wait for RM to restart
[ https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640150#comment-13640150 ]

nemon lou commented on YARN-513:

What about the admin client (refreshQueues, refreshNodes, etc.)? These will be needed in HA.

Verify all clients will wait for RM to restart
Key: YARN-513
URL: https://issues.apache.org/jira/browse/YARN-513
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Reporter: Bikas Saha
Assignee: Xuan Gong

When the RM is restarting, the NM, AM and Clients should wait for some time for the RM to come back up.
[jira] [Created] (YARN-606) negative queue metrcis apps Failed
nemon lou created YARN-606:
Summary: negative queue metrcis apps Failed
Key: YARN-606
URL: https://issues.apache.org/jira/browse/YARN-606
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: nemon lou
Priority: Minor

The queue metric apps Failed can be negative in some cases (more than one attempt for an application can cause this). It's confusing if we use this metric directly.
[jira] [Commented] (YARN-606) negative queue metrcis apps Failed
[ https://issues.apache.org/jira/browse/YARN-606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13640214#comment-13640214 ]

nemon lou commented on YARN-606:

The submitApp() method in QueueMetrics.java causes the negative value. It has this logic:

    public void submitApp(String user, int attemptId) {
      if (attemptId == 1) {
        appsSubmitted.incr();
      } else {
        appsFailed.decr();
      }
      ...
    }

This logic was introduced by MAPREDUCE-3870.

negative queue metrcis apps Failed
Key: YARN-606
URL: https://issues.apache.org/jira/browse/YARN-606
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.0.3-alpha
Reporter: nemon lou
Priority: Minor

The queue metric apps Failed can be negative in some cases (more than one attempt for an application can cause this). It's confusing if we use this metric directly.
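To see how the quoted logic can push the counter below zero, here is a stripped-down model that uses plain int fields instead of Hadoop's metrics classes. `QueueMetricsSketch` is a hypothetical stand-in, and the scenario assumes the first attempt ended without ever being counted in appsFailed.

```java
class QueueMetricsSketch {
    int appsSubmitted;
    int appsFailed;

    /** Mirrors the quoted branch: any attempt after the first takes back one failure. */
    void submitApp(String user, int attemptId) {
        if (attemptId == 1) {
            appsSubmitted++;
        } else {
            appsFailed--;
        }
    }

    public static void main(String[] args) {
        QueueMetricsSketch metrics = new QueueMetricsSketch();
        metrics.submitApp("alice", 1); // first attempt: appsSubmitted becomes 1
        // Assumed scenario: the first attempt ends without incrementing appsFailed.
        metrics.submitApp("alice", 2); // retry decrements appsFailed from 0
        System.out.println("appsFailed = " + metrics.appsFailed);
    }
}
```

This prints `appsFailed = -1`. The decrement is only balanced when every earlier attempt was recorded as a failure first, which matches the report's note that multiple attempts can trigger the negative value.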
[jira] [Updated] (YARN-606) negative queue metrics apps Failed
[ https://issues.apache.org/jira/browse/YARN-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-606: --- Summary: negative queue metrics apps Failed (was: negative queue metrcis apps Failed) negative queue metrics apps Failed - Key: YARN-606 URL: https://issues.apache.org/jira/browse/YARN-606 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.0.3-alpha Reporter: nemon lou Priority: Minor The queue metric appsFailed can be negative in some cases (for example, when an application has more than one attempt). This is confusing if the metric is used directly.
[jira] [Updated] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-276: --- Attachment: YARN-276.patch Capacity Scheduler can hang when submit many jobs concurrently -- Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.0.1-alpha Reporter: nemon lou Assignee: nemon lou Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch Original Estimate: 24h Remaining Estimate: 24h In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang: most resources are taken up by AMs, leaving too few for tasks, and then all applications hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from minimumAllocation (not from the resources AMs actually use).
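The gap described in the issue can be illustrated with a little arithmetic. The sketch below is a simplified model, not the scheduler's actual code: the class, method names, and the cluster numbers (100 GB cluster, 1 GB minimum allocation, 2 GB per AM, 10% AM limit) are all assumptions chosen for illustration. It contrasts a count-based limit derived from minimumAllocation with a direct check against the memory AMs really use.

```java
// Hypothetical model of the two ways to enforce the AM resource percent.
class AmLimitSketch {
    // Count-based limit: how many applications may be active if every AM
    // is assumed to occupy exactly one minimum allocation.
    static long maxActiveApplications(long clusterMb, long minAllocMb, long amPercent) {
        long numerator = clusterMb * amPercent;
        long denominator = minAllocMb * 100;
        // Ceiling division, with a floor of one active application.
        return Math.max(1, (numerator + denominator - 1) / denominator);
    }

    // Direct check: how many AMs of the real size fit under the percent limit.
    static long admittedByActualUsage(long clusterMb, long amMb, long amPercent) {
        long limitMb = clusterMb * amPercent / 100;
        return limitMb / amMb;
    }

    public static void main(String[] args) {
        long clusterMb = 102_400; // assumed cluster memory
        long minAllocMb = 1_024;  // assumed minimum allocation
        long amMb = 2_048;        // assumed real AM container size
        long amPercent = 10;      // assumed maximum-am-resource-percent of 0.1

        long byCount = maxActiveApplications(clusterMb, minAllocMb, amPercent);
        long byUsage = admittedByActualUsage(clusterMb, amMb, amPercent);

        System.out.println("active apps allowed by count: " + byCount);            // 10
        System.out.println("memory those AMs really use (MB): " + byCount * amMb); // 20480
        System.out.println("AMs allowed by direct check: " + byUsage);             // 5
    }
}
```

With these assumed numbers the count-based limit admits ten AMs that together hold 20% of the cluster, twice the configured 10%, while a direct check on actual usage would stop at five.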
[jira] [Updated] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-276: --- Labels: incompatible (was: ) Capacity Scheduler can hang when submit many jobs concurrently -- Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.0.1-alpha Reporter: nemon lou Assignee: nemon lou Labels: incompatible Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch Original Estimate: 24h Remaining Estimate: 24h In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang: most resources are taken up by AMs, leaving too few for tasks, and then all applications hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from minimumAllocation (not from the resources AMs actually use).
[jira] [Updated] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-276: --- Attachment: YARN-276.patch Capacity Scheduler can hang when submit many jobs concurrently -- Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.0.1-alpha Reporter: nemon lou Assignee: nemon lou Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch Original Estimate: 24h Remaining Estimate: 24h In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang: most resources are taken up by AMs, leaving too few for tasks, and then all applications hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from minimumAllocation (not from the resources AMs actually use).
[jira] [Updated] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-276: --- Attachment: YARN-276.patch Capacity Scheduler can hang when submit many jobs concurrently -- Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.0.1-alpha Reporter: nemon lou Assignee: nemon lou Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch Original Estimate: 24h Remaining Estimate: 24h In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang: most resources are taken up by AMs, leaving too few for tasks, and then all applications hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from minimumAllocation (not from the resources AMs actually use).
[jira] [Resolved] (YARN-20) More information for yarn.resourcemanager.webapp.address in yarn-default.xml
[ https://issues.apache.org/jira/browse/YARN-20?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou resolved YARN-20. --- Resolution: Won't Fix More information for yarn.resourcemanager.webapp.address in yarn-default.xml -- Key: YARN-20 URL: https://issues.apache.org/jira/browse/YARN-20 Project: Hadoop YARN Issue Type: Improvement Components: documentation, resourcemanager Affects Versions: 2.0.0-alpha Reporter: nemon lou Priority: Trivial Attachments: YARN-20.patch Original Estimate: 1h Remaining Estimate: 1h The parameter yarn.resourcemanager.webapp.address in yarn-default.xml is in host:port format, as noted in the cluster setup guide (http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/ClusterSetup.html). When I read through the code, I found that a host-only format is also supported. In host-only format, the port will be random. So we may want to add more documentation in yarn-default.xml to make this easier to understand. I will submit a patch if it's helpful.
[jira] [Commented] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630939#comment-13630939 ] nemon lou commented on YARN-276: [~tgraves] Here are my initial thoughts on checking the cluster-level AM resource percent in each leaf queue: a leaf queue's capacity is computed based on absoluteMaxCapacity. Suppose we have 10 leaf queues, each configured with 100% absoluteMaxCapacity and 10% maxAMResourcePerQueuePercent; there is still a chance that all the leaf queues' resources are taken up by AMs before any queue reaches its 10% maxAMResourcePerQueuePercent limit. Note that the cluster-wide AM resource percent only applies to a leaf queue if no AM resource percent is configured for that leaf queue. As Thomas Graves mentioned, cluster-level checking will cause one queue to restrict another. I will remove the cluster-level checking. Capacity Scheduler can hang when submit many jobs concurrently -- Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.0.1-alpha Reporter: nemon lou Assignee: nemon lou Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch Original Estimate: 24h Remaining Estimate: 24h In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang: most resources are taken up by AMs, leaving too few for tasks, and then all applications hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from minimumAllocation (not from the resources AMs actually use).
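The ten-queue example in the comment above can be worked through numerically. This is a rough sketch of one reading of that comment, assuming each leaf queue's AM limit is its absoluteMaxCapacity share of the cluster multiplied by its AM percent; the class, the method names, and the 100 GB cluster size are hypothetical.

```java
// Hypothetical arithmetic for per-queue AM limits; not scheduler code.
class QueueAmShareSketch {
    // AM memory a single leaf queue may hold under its own percent limit.
    static long perQueueAmLimitMb(long clusterMb, long absMaxCapacityPercent, long amPercent) {
        long queueMaxMb = clusterMb * absMaxCapacityPercent / 100;
        return queueMaxMb * amPercent / 100;
    }

    // Sum of the per-queue AM limits across identical leaf queues.
    static long totalAmLimitMb(int queues, long clusterMb, long absMaxCapacityPercent, long amPercent) {
        return queues * perQueueAmLimitMb(clusterMb, absMaxCapacityPercent, amPercent);
    }

    public static void main(String[] args) {
        long clusterMb = 102_400; // assumed cluster memory
        long perQueue = perQueueAmLimitMb(clusterMb, 100, 10);
        long total = totalAmLimitMb(10, clusterMb, 100, 10);
        System.out.println("AM limit per queue (MB): " + perQueue);   // 10240
        System.out.println("AM limit over 10 queues (MB): " + total); // 102400
    }
}
```

Under these assumptions the ten per-queue limits add up to the whole cluster, which is the situation a cluster-level check was meant to catch, at the cost of letting one queue restrict another.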
[jira] [Updated] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-276: --- Attachment: YARN-276.patch Uploading an interim patch. Capacity Scheduler can hang when submit many jobs concurrently -- Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.0.1-alpha Reporter: nemon lou Assignee: nemon lou Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch Original Estimate: 24h Remaining Estimate: 24h In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang: most resources are taken up by AMs, leaving too few for tasks, and then all applications hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from minimumAllocation (not from the resources AMs actually use).
[jira] [Updated] (YARN-447) applicationComparator improvement for CS
[ https://issues.apache.org/jira/browse/YARN-447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-447: --- Attachment: YARN-447-trunk.patch Patch rebased. applicationComparator improvement for CS Key: YARN-447 URL: https://issues.apache.org/jira/browse/YARN-447 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.0.3-alpha Reporter: nemon lou Assignee: nemon lou Priority: Minor Attachments: YARN-447-trunk.patch, YARN-447-trunk.patch, YARN-447-trunk.patch, YARN-447-trunk.patch The current compare code is: return a1.getApplicationId().getId() - a2.getApplicationId().getId(); It will be replaced with: return a1.getApplicationId().compareTo(a2.getApplicationId()); This brings some benefits: 1. It leaves the ApplicationId comparison logic to the ApplicationId class. 2. In a future HA mode the cluster timestamp may change, and the ApplicationId class already handles this case.
[jira] [Commented] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13617008#comment-13617008 ] nemon lou commented on YARN-276: [~zjshen] Yes, a dynamic maxActiveApplications will work too, and there is no need to add any new criteria. I'll give it a try. Thanks. Capacity Scheduler can hang when submit many jobs concurrently -- Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.0.1-alpha Reporter: nemon lou Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch Original Estimate: 24h Remaining Estimate: 24h In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang: most resources are taken up by AMs, leaving too few for tasks, and then all applications hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from minimumAllocation (not from the resources AMs actually use).
[jira] [Created] (YARN-447) applicationComparator improvement for CS
nemon lou created YARN-447: -- Summary: applicationComparator improvement for CS Key: YARN-447 URL: https://issues.apache.org/jira/browse/YARN-447 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.0.3-alpha Reporter: nemon lou Priority: Minor Attachments: YARN-447-trunk.patch The current compare code is: return a1.getApplicationId().getId() - a2.getApplicationId().getId(); It will be replaced with: return a1.getApplicationId().compareTo(a2.getApplicationId()); This brings some benefits: 1. It leaves the ApplicationId comparison logic to the ApplicationId class. 2. In a future HA mode the cluster timestamp may change, and the ApplicationId class already handles this case.
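The second benefit can be seen in a small stand-alone sketch. AppIdSketch below is a hypothetical stand-in for ApplicationId, written only for illustration; it assumes the ID's natural ordering compares the cluster timestamp first and the sequence number second, which is the behaviour the issue relies on. The timestamps and IDs are made up.

```java
import java.util.Comparator;

// Hypothetical stand-in for ApplicationId: a cluster timestamp plus a sequence number.
class AppIdSketch implements Comparable<AppIdSketch> {
    final long clusterTimestamp;
    final int id;

    AppIdSketch(long clusterTimestamp, int id) {
        this.clusterTimestamp = clusterTimestamp;
        this.id = id;
    }

    // Assumed ordering: older cluster start time first, then lower sequence number.
    @Override
    public int compareTo(AppIdSketch other) {
        int byCluster = Long.compare(clusterTimestamp, other.clusterTimestamp);
        return byCluster != 0 ? byCluster : Integer.compare(id, other.id);
    }

    // Old style: subtract sequence numbers and ignore the cluster timestamp.
    static final Comparator<AppIdSketch> BY_ID_ONLY = (a, b) -> a.id - b.id;

    // New style: delegate to the ID class itself.
    static final Comparator<AppIdSketch> BY_COMPARE_TO = Comparator.naturalOrder();

    public static void main(String[] args) {
        AppIdSketch beforeRestart = new AppIdSketch(1000L, 5); // submitted first
        AppIdSketch afterRestart = new AppIdSketch(2000L, 1);  // submitted after an RM restart

        // A positive result means the first argument sorts after the second.
        System.out.println(BY_ID_ONLY.compare(beforeRestart, afterRestart) > 0);    // true
        System.out.println(BY_COMPARE_TO.compare(beforeRestart, afterRestart) < 0); // true
    }
}
```

With the subtraction-based comparator the application submitted after the restart jumps ahead of the older one because its sequence number restarted; delegating to compareTo keeps submission order in this scenario.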
[jira] [Updated] (YARN-447) applicationComparator improvement for CS
[ https://issues.apache.org/jira/browse/YARN-447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-447: --- Attachment: YARN-447-trunk.patch Attaching a simple patch with a test case. applicationComparator improvement for CS Key: YARN-447 URL: https://issues.apache.org/jira/browse/YARN-447 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.0.3-alpha Reporter: nemon lou Priority: Minor Attachments: YARN-447-trunk.patch The current compare code is: return a1.getApplicationId().getId() - a2.getApplicationId().getId(); It will be replaced with: return a1.getApplicationId().compareTo(a2.getApplicationId()); This brings some benefits: 1. It leaves the ApplicationId comparison logic to the ApplicationId class. 2. In a future HA mode the cluster timestamp may change, and the ApplicationId class already handles this case.
[jira] [Updated] (YARN-447) applicationComparator improvement for CS
[ https://issues.apache.org/jira/browse/YARN-447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-447: --- Attachment: YARN-447-trunk.patch Use a real ApplicationId instead of a mock one in TestUtil, so that ApplicationId's compareTo method does its work. applicationComparator improvement for CS Key: YARN-447 URL: https://issues.apache.org/jira/browse/YARN-447 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.0.3-alpha Reporter: nemon lou Priority: Minor Attachments: YARN-447-trunk.patch, YARN-447-trunk.patch The current compare code is: return a1.getApplicationId().getId() - a2.getApplicationId().getId(); It will be replaced with: return a1.getApplicationId().compareTo(a2.getApplicationId()); This brings some benefits: 1. It leaves the ApplicationId comparison logic to the ApplicationId class. 2. In a future HA mode the cluster timestamp may change, and the ApplicationId class already handles this case.
[jira] [Updated] (YARN-447) applicationComparator improvement for CS
[ https://issues.apache.org/jira/browse/YARN-447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-447: --- Attachment: YARN-447-trunk.patch Adding a timeout. applicationComparator improvement for CS Key: YARN-447 URL: https://issues.apache.org/jira/browse/YARN-447 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.0.3-alpha Reporter: nemon lou Priority: Minor Attachments: YARN-447-trunk.patch, YARN-447-trunk.patch, YARN-447-trunk.patch The current compare code is: return a1.getApplicationId().getId() - a2.getApplicationId().getId(); It will be replaced with: return a1.getApplicationId().compareTo(a2.getApplicationId()); This brings some benefits: 1. It leaves the ApplicationId comparison logic to the ApplicationId class. 2. In a future HA mode the cluster timestamp may change, and the ApplicationId class already handles this case.
[jira] [Commented] (YARN-447) applicationComparator improvement for CS
[ https://issues.apache.org/jira/browse/YARN-447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13593135#comment-13593135 ] nemon lou commented on YARN-447: This patch is ready for review now. Thank you. applicationComparator improvement for CS Key: YARN-447 URL: https://issues.apache.org/jira/browse/YARN-447 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Affects Versions: 2.0.3-alpha Reporter: nemon lou Priority: Minor Attachments: YARN-447-trunk.patch, YARN-447-trunk.patch, YARN-447-trunk.patch The current compare code is: return a1.getApplicationId().getId() - a2.getApplicationId().getId(); It will be replaced with: return a1.getApplicationId().compareTo(a2.getApplicationId()); This brings some benefits: 1. It leaves the ApplicationId comparison logic to the ApplicationId class. 2. In a future HA mode the cluster timestamp may change, and the ApplicationId class already handles this case.
[jira] [Resolved] (YARN-111) Application level priority in Resource Manager Schedulers
[ https://issues.apache.org/jira/browse/YARN-111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou resolved YARN-111. Resolution: Won't Fix Application level priority in Resource Manager Schedulers - Key: YARN-111 URL: https://issues.apache.org/jira/browse/YARN-111 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.1-alpha Reporter: nemon lou We need application-level priority for Hadoop 2.0, in both the FIFO scheduler and the Capacity Scheduler. In Hadoop 1.0.x, job priority is supported.
[jira] [Updated] (YARN-20) More information for yarn.resourcemanager.webapp.address in yarn-default.xml
[ https://issues.apache.org/jira/browse/YARN-20?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-20: -- Attachment: YARN-20.patch Adding the annotation just as Harsh J said. Sorry for coming back so late. No test case is added since it's only a trivial documentation change. More information for yarn.resourcemanager.webapp.address in yarn-default.xml -- Key: YARN-20 URL: https://issues.apache.org/jira/browse/YARN-20 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.0.0-alpha Reporter: nemon lou Priority: Trivial Attachments: YARN-20.patch Original Estimate: 1h Remaining Estimate: 1h The parameter yarn.resourcemanager.webapp.address in yarn-default.xml is in host:port format, as noted in the cluster setup guide (http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/ClusterSetup.html). When I read through the code, I found that a host-only format is also supported. In host-only format, the port will be random. So we may want to add more documentation in yarn-default.xml to make this easier to understand. I will submit a patch if it's helpful.
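The two accepted forms of the address can be pictured with a short parsing sketch. This is not the Hadoop implementation: WebAppAddressSketch and its parse method are hypothetical, and the sketch assumes that a missing port is represented as port 0, the conventional way to ask the operating system for any free port when a server socket is bound. The host name is made up.

```java
import java.net.InetSocketAddress;

// Hypothetical parser for a "host:port" or host-only address value.
class WebAppAddressSketch {
    static InetSocketAddress parse(String value) {
        int colon = value.lastIndexOf(':');
        if (colon < 0) {
            // Host-only form: port 0 stands for "pick any free port at bind time".
            return InetSocketAddress.createUnresolved(value, 0);
        }
        String host = value.substring(0, colon);
        int port = Integer.parseInt(value.substring(colon + 1));
        return InetSocketAddress.createUnresolved(host, port);
    }

    public static void main(String[] args) {
        InetSocketAddress fixed = parse("rm.example.com:8088");
        InetSocketAddress anyPort = parse("rm.example.com");
        System.out.println(fixed.getHostString() + " -> " + fixed.getPort());     // rm.example.com -> 8088
        System.out.println(anyPort.getHostString() + " -> " + anyPort.getPort()); // rm.example.com -> 0
    }
}
```

A host-only value is convenient for tests, but clients then have no fixed port to connect to, which is why the documentation note matters.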
[jira] [Commented] (YARN-111) Application level priority in Resource Manager Schedulers
[ https://issues.apache.org/jira/browse/YARN-111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573142#comment-13573142 ] nemon lou commented on YARN-111: In the end I used two queues in the Capacity Scheduler, which basically meets our needs. Both queues have an Absolute Max Capacity of 100%. The higher-priority queue has more Absolute Capacity configured (85%). Jobs that need high priority are submitted to the queue with more Absolute Capacity configured. Application level priority in Resource Manager Schedulers - Key: YARN-111 URL: https://issues.apache.org/jira/browse/YARN-111 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.1-alpha Reporter: nemon lou We need application-level priority for Hadoop 2.0, in both the FIFO scheduler and the Capacity Scheduler. In Hadoop 1.0.x, job priority is supported.
[jira] [Commented] (YARN-374) Job History Server doesn't show jobs which killed by ClientRMProtocol.forceKillApplication
[ https://issues.apache.org/jira/browse/YARN-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573165#comment-13573165 ] nemon lou commented on YARN-374: Thanks for the information. But why not add one more API, such as gracefullyKillApplication (or just change the behavior of force kill)? With this method, the RM would ask the AM to kill the app itself, and a force kill would be triggered if the AM has not killed itself within some period. Job History Server doesn't show jobs which killed by ClientRMProtocol.forceKillApplication -- Key: YARN-374 URL: https://issues.apache.org/jira/browse/YARN-374 Project: Hadoop YARN Issue Type: Bug Components: client, resourcemanager Affects Versions: 2.0.1-alpha Reporter: nemon lou After I kill an app by typing bin/yarn rmadmin app -kill APP_ID, no job info is kept on the JHS web page. However, when I kill a job by typing bin/mapred job -kill JOB_ID, I can see a killed job left on the JHS. Some Hive users are confused that their jobs have been killed but nothing is left on the JHS, and the killed app's info on the RM web page is not enough. (They kill jobs through ClientRMProtocol.)
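The proposal in the comment above (ask the AM to shut down, then force kill after a timeout) could look roughly like the sketch below. Every name here is hypothetical: no gracefullyKillApplication API is confirmed by this thread, and the AppMaster interface merely models "the AM acknowledges once it has stopped itself".

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of a graceful kill that falls back to a force kill.
class GracefulKillSketch {
    // Hypothetical handle: the future completes once the AM has shut itself down.
    interface AppMaster {
        CompletableFuture<Void> requestShutdown();
    }

    enum Outcome { GRACEFUL, FORCED }

    static Outcome kill(AppMaster am, long timeoutMs) {
        try {
            // Give the AM a bounded amount of time to finish and record its history.
            am.requestShutdown().get(timeoutMs, TimeUnit.MILLISECONDS);
            return Outcome.GRACEFUL;
        } catch (TimeoutException | ExecutionException e) {
            // A real RM would trigger the existing force kill at this point.
            return Outcome.FORCED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Outcome.FORCED;
        }
    }

    public static void main(String[] args) {
        AppMaster responsive = () -> CompletableFuture.completedFuture(null);
        AppMaster stuck = () -> new CompletableFuture<Void>(); // never completes
        System.out.println(kill(responsive, 100)); // GRACEFUL
        System.out.println(kill(stuck, 50));       // FORCED
    }
}
```

The point of the graceful path is that the AM gets a chance to write its job history before exiting, which is what the force kill path skips.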
[jira] [Commented] (YARN-374) Job History Server doesn't show jobs which killed by ClientRMProtocol.forceKillApplication
[ https://issues.apache.org/jira/browse/YARN-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573238#comment-13573238 ] nemon lou commented on YARN-374: Agree that YARN-321 will help. Job History Server doesn't show jobs which killed by ClientRMProtocol.forceKillApplication -- Key: YARN-374 URL: https://issues.apache.org/jira/browse/YARN-374 Project: Hadoop YARN Issue Type: Bug Components: client, resourcemanager Affects Versions: 2.0.1-alpha Reporter: nemon lou After I kill an app by typing bin/yarn rmadmin app -kill APP_ID, no job info is kept on the JHS web page. However, when I kill a job by typing bin/mapred job -kill JOB_ID, I can see a killed job left on the JHS. Some Hive users are confused that their jobs have been killed but nothing is left on the JHS, and the killed app's info on the RM web page is not enough. (They kill jobs through ClientRMProtocol.)
[jira] [Created] (YARN-374) Job History Server doesn't show jobs which killed by ClientRMProtocol.forceKillApplication
nemon lou created YARN-374: -- Summary: Job History Server doesn't show jobs which killed by ClientRMProtocol.forceKillApplication Key: YARN-374 URL: https://issues.apache.org/jira/browse/YARN-374 Project: Hadoop YARN Issue Type: Bug Reporter: nemon lou After I kill an app by typing bin/yarn rmadmin app -kill APP_ID, no job info is kept on the JHS web page. However, when I kill a job by typing bin/mapred job -kill JOB_ID, I can see a killed job left on the JHS. Some Hive users are confused that their jobs have been killed but nothing is left on the JHS, and the killed app's info on the RM web page is not enough. (They kill jobs through ClientRMProtocol.)
[jira] [Updated] (YARN-374) Job History Server doesn't show jobs which killed by ClientRMProtocol.forceKillApplication
[ https://issues.apache.org/jira/browse/YARN-374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-374: --- Component/s: resourcemanager client Affects Version/s: 2.0.1-alpha The difference between bin/yarn and bin/mapred is that the former uses ClientRMProtocol to send the request to the RM, while the latter uses MRClientProtocol to send the request to the AM. Job History Server doesn't show jobs which killed by ClientRMProtocol.forceKillApplication -- Key: YARN-374 URL: https://issues.apache.org/jira/browse/YARN-374 Project: Hadoop YARN Issue Type: Bug Components: client, resourcemanager Affects Versions: 2.0.1-alpha Reporter: nemon lou After I kill an app by typing bin/yarn rmadmin app -kill APP_ID, no job info is kept on the JHS web page. However, when I kill a job by typing bin/mapred job -kill JOB_ID, I can see a killed job left on the JHS. Some Hive users are confused that their jobs have been killed but nothing is left on the JHS, and the killed app's info on the RM web page is not enough. (They kill jobs through ClientRMProtocol.)
[jira] [Commented] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539260#comment-13539260 ] nemon lou commented on YARN-276: Updating the patch. Four properties have been added to the CS web page: Max AM Used Per Queue Percent; Actual AM Used Per Queue Percent; Max AM Used Percent For Cluster; Actual AM Used Percent For Cluster. This patch keeps track of AM-used resources and checks them at both the cluster level and the leaf queue level. Capacity Scheduler can hang when submit many jobs concurrently -- Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.0.1-alpha Reporter: nemon lou Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch Original Estimate: 24h Remaining Estimate: 24h In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang: most resources are taken up by AMs, leaving too few for tasks, and then all applications hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from minimumAllocation (not from the resources AMs actually use).
[jira] [Commented] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537643#comment-13537643 ] nemon lou commented on YARN-276: Good idea, Robert. Thank you for your comment. I think it would be good to display the AM-used resources and the AM percent limit (or the max resources that AMs can use) for each leaf queue on the Capacity Scheduler page. Capacity Scheduler can hang when submit many jobs concurrently -- Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.0.1-alpha Reporter: nemon lou Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch Original Estimate: 24h Remaining Estimate: 24h In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang: most resources are taken up by AMs, leaving too few for tasks, and then all applications hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from minimumAllocation (not from the resources AMs actually use).
[jira] [Updated] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-276: --- Attachment: YARN-276.patch Capacity Scheduler can hang when submit many jobs concurrently -- Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.0.1-alpha Reporter: nemon lou Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch Original Estimate: 24h Remaining Estimate: 24h In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang: most resources are taken up by AMs, leaving too few for tasks, and then all applications hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from minimumAllocation (not from the resources AMs actually use).
[jira] [Commented] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535809#comment-13535809 ] nemon lou commented on YARN-276: All YARN and MR tests passed on my own cluster, so I am submitting the patch again. Capacity Scheduler can hang when submit many jobs concurrently -- Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.0.1-alpha Reporter: nemon lou Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch Original Estimate: 24h Remaining Estimate: 24h
[jira] [Updated] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-276: --- Attachment: YARN-276.patch Capacity Scheduler can hang when submit many jobs concurrently -- Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.0.1-alpha Reporter: nemon lou Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch Original Estimate: 24h Remaining Estimate: 24h
[jira] [Commented] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13536818#comment-13536818 ] nemon lou commented on YARN-276: This patch is ready for review now. Thank you. Capacity Scheduler can hang when submit many jobs concurrently -- Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.0.1-alpha Reporter: nemon lou Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch Original Estimate: 24h Remaining Estimate: 24h
[jira] [Created] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
nemon lou created YARN-276: -- Summary: Capacity Scheduler can hang when submit many jobs concurrently Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.0.1-alpha, 3.0.0 Reporter: nemon lou In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang with most resources taken up by AMs, leaving too few resources for tasks. All applications then hang. The cause is that yarn.scheduler.capacity.maximum-am-resource-percent is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from minimumAllocation (not from what the AMs actually use).
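The arithmetic behind the report can be sketched as follows. This is a toy model only, not the CapacityScheduler code: the class, the method names, the simplified cap formula and all the numbers are assumptions made for illustration. It shows how a cap on the number of applications that is derived from minimumAllocation can let AMs hold far more memory than the configured percentage when each AM container is larger than the minimum allocation, and what a direct check against the percentage would look like.

```java
/**
 * Toy model of the limit described in the report; not the CapacityScheduler code.
 * All names and numbers here are hypothetical.
 */
final class AmLimitSketch {

    /** Application cap derived from the minimum allocation (simplified). */
    static int maxActiveApplications(int clusterMb, int minAllocationMb, double maxAmPercent) {
        return Math.max(1, (int) Math.ceil((clusterMb / (double) minAllocationMb) * maxAmPercent));
    }

    /** Memory the AMs really hold once that many applications are active. */
    static int amMemoryInUseMb(int activeApps, int amContainerMb) {
        return activeApps * amContainerMb;
    }

    /** A direct check against the percentage: does one more AM still fit the budget? */
    static boolean fitsAmBudget(int amUsedMb, int amContainerMb, int clusterMb, double maxAmPercent) {
        return amUsedMb + amContainerMb <= clusterMb * maxAmPercent;
    }

    public static void main(String[] args) {
        int clusterMb = 20 * 1024;   // hypothetical 20 GB cluster
        int minAllocationMb = 1024;  // hypothetical minimum allocation
        int amContainerMb = 2048;    // each AM actually asks for 2 GB
        double maxAmPercent = 0.5;   // hypothetical AM percent limit

        int cap = maxActiveApplications(clusterMb, minAllocationMb, maxAmPercent);
        int used = amMemoryInUseMb(cap, amContainerMb);

        System.out.println("application cap = " + cap);
        System.out.println("AM memory in use = " + used + " MB of " + clusterMb + " MB");
        System.out.println("memory left for tasks = " + (clusterMb - used) + " MB");
    }
}
```

With these made-up numbers the cap admits 10 applications, their AMs occupy the whole 20 GB, and nothing is left for tasks, which matches the hang described in the report. The `fitsAmBudget` check would instead stop admitting AMs once 5 of them (10 GB) are running.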
[jira] [Updated] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently
[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-276: --- Attachment: YARN-276.patch Capacity Scheduler can hang when submit many jobs concurrently -- Key: YARN-276 URL: https://issues.apache.org/jira/browse/YARN-276 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 3.0.0, 2.0.1-alpha Reporter: nemon lou Attachments: YARN-276.patch Original Estimate: 24h Remaining Estimate: 24h
[jira] [Commented] (YARN-111) Application level priority in Resource Manager Schedulers
[ https://issues.apache.org/jira/browse/YARN-111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459420#comment-13459420 ] nemon lou commented on YARN-111: To Harsh J: I have looked into the code. CS's LeafQueue keeps active applications and pending applications in a TreeSet. The TreeSet's comparator comes from CapacityScheduler's applicationComparator. Its compare method looks like this: return a1.getApplicationId().getId() - a2.getApplicationId().getId(); (org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.java, line 106). So an application's priority doesn't take effect. Am I right? To Robert: I'm not sure whether job priority has been removed in recent MR1 code, but it would be very nice of you to add application-level priority in YARN. :) Application level priority in Resource Manager Schedulers - Key: YARN-111 URL: https://issues.apache.org/jira/browse/YARN-111 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.1-alpha Reporter: nemon lou We need application-level priority for Hadoop 2.0, in both the FIFO scheduler and the Capacity Scheduler. In Hadoop 1.0.x, job priority is supported.
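The effect of an id-only comparator on a TreeSet can be seen in a small standalone sketch. The `App` class, its `priority` field and the helper names are hypothetical stand-ins, not YARN types. Only the shape of the compare method (subtracting one id from the other) follows the line quoted in the comment.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

final class IdOrderSketch {

    /** Hypothetical stand-in for a scheduler application: an id plus a priority. */
    static final class App {
        final int id;
        final int priority;

        App(int id, int priority) {
            this.id = id;
            this.priority = priority;
        }
    }

    /**
     * Mirrors the shape of the quoted compare(): ids only, priority never consulted.
     * (Integer.compare would avoid overflow for extreme ids; subtraction is kept
     * here to match the quoted line.)
     */
    static final Comparator<App> BY_ID = new Comparator<App>() {
        @Override
        public int compare(App a1, App a2) {
            return a1.id - a2.id;
        }
    };

    /** Returns the ids in the order a TreeSet built with BY_ID iterates them. */
    static List<Integer> iterationOrder(List<App> apps) {
        TreeSet<App> set = new TreeSet<App>(BY_ID);
        set.addAll(apps);
        List<Integer> ids = new ArrayList<Integer>();
        for (App app : set) {
            ids.add(app.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<App> apps = new ArrayList<App>();
        apps.add(new App(3, 10)); // highest priority, but the largest id
        apps.add(new App(1, 1));
        apps.add(new App(2, 5));
        System.out.println(iterationOrder(apps)); // prints [1, 2, 3]
    }
}
```

Whatever priorities are assigned, the set iterates in id order, which is consistent with the observation that priority has no effect on ordering under such a comparator.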
[jira] [Created] (YARN-111) Application level priority in Resource Manager Schedulers
nemon lou created YARN-111: -- Summary: Application level priority in Resource Manager Schedulers Key: YARN-111 URL: https://issues.apache.org/jira/browse/YARN-111 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.0.1-alpha Reporter: nemon lou We need application-level priority for Hadoop 2.0, in both the FIFO scheduler and the Capacity Scheduler. In Hadoop 1.0.x, job priority is supported.
[jira] [Commented] (YARN-77) Test case TestAMRMRPCNodeUpdates.testAMRMUnusableNodes fails occasionally
[ https://issues.apache.org/jira/browse/YARN-77?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447104#comment-13447104 ] nemon lou commented on YARN-77: --- In other words, the syncNodeHeartbeat method doesn't work synchronously: DrainDispatcher's await() method returns before the queue becomes empty. Test case TestAMRMRPCNodeUpdates.testAMRMUnusableNodes fails occasionally - Key: YARN-77 URL: https://issues.apache.org/jira/browse/YARN-77 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-alpha Environment: Linux 2.6.32.12-0.7-default x86_64 java version 1.6.0_26 Java HotSpot(TM) 64-Bit Server VM Reporter: nemon lou Attachments: TestAMRMRPCNodeUpdates_output.TXT Test case org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCNodeUpdates.testAMRMUnusableNodes fails occasionally in my environment. Here is the error message. Standard output will be uploaded in a file later.
Error Message: expected:<1> but was:<0>
Stacktrace:
junit.framework.AssertionFailedError: expected:<1> but was:<0>
    at junit.framework.Assert.fail(Assert.java:47)
    at junit.framework.Assert.failNotEquals(Assert.java:283)
    at junit.framework.Assert.assertEquals(Assert.java:64)
    at junit.framework.Assert.assertEquals(Assert.java:195)
    at junit.framework.Assert.assertEquals(Assert.java:201)
    at org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCNodeUpdates.testAMRMUnusableNodes(TestAMRMRPCNodeUpdates.java:123)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
    at org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:236)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:134)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:113)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
    at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
    at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:103)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:74)
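One way a wait-for-drain call can return while work is still queued is a stale "drained" flag. The sketch below is a toy model of that pattern under that assumption. It is not the DrainDispatcher source, and the comment above does not establish that this is the actual mechanism. `ToyDispatcher` and all of its methods are hypothetical, and the interleaving is written out single-threaded so that it is deterministic.

```java
import java.util.ArrayDeque;
import java.util.Queue;

final class StaleDrainSketch {

    /** Hypothetical dispatcher; single-threaded so the interleaving is explicit. */
    static final class ToyDispatcher {
        private final Queue<Runnable> queue = new ArrayDeque<Runnable>();
        // Recorded the last time the handler looked at the queue.
        private boolean drained = true;

        void enqueue(Runnable event) {
            queue.add(event); // note: 'drained' is NOT reset here
        }

        /** True if a caller waiting on the flag would stop waiting right now. */
        boolean awaitWouldReturn() {
            return drained;
        }

        /** What a handler thread would do for each event. */
        void handleNext() {
            Runnable event = queue.poll();
            if (event != null) {
                event.run();
            }
            drained = queue.isEmpty();
        }

        int pending() {
            return queue.size();
        }
    }

    public static void main(String[] args) {
        final int[] updatedNodes = {0};
        ToyDispatcher dispatcher = new ToyDispatcher();

        dispatcher.enqueue(new Runnable() {
            @Override
            public void run() {
                updatedNodes[0]++;
            }
        });

        // The waiter checks the flag before the handler has run: it sees "drained"
        // although one event is still pending, so an assertion made now would
        // observe 0 updated nodes instead of 1.
        System.out.println("await returns: " + dispatcher.awaitWouldReturn()
                + ", pending: " + dispatcher.pending()
                + ", updatedNodes: " + updatedNodes[0]);

        dispatcher.handleNext();
        System.out.println("after handling, updatedNodes: " + updatedNodes[0]);
    }
}
```

In this toy model, resetting the flag inside `enqueue`, or having the waiter check the queue itself, would close the window. Whether either applies to the real dispatcher is left open here.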
[jira] [Updated] (YARN-77) Test case TestAMRMRPCNodeUpdates.testAMRMUnusableNodes fails occasionally
[ https://issues.apache.org/jira/browse/YARN-77?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nemon lou updated YARN-77: -- Attachment: TestAMRMRPCNodeUpdates_output.TXT Test case TestAMRMRPCNodeUpdates.testAMRMUnusableNodes fails occasionally - Key: YARN-77 URL: https://issues.apache.org/jira/browse/YARN-77 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.1.0-alpha Environment: Linux 2.6.32.12-0.7-default x86_64 java version 1.6.0_26 Java HotSpot(TM) 64-Bit Server VM Reporter: nemon lou Attachments: TestAMRMRPCNodeUpdates_output.TXT