[jira] [Commented] (YARN-2574) Add support for FairScheduler to the ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216093#comment-14216093 ] Hudson commented on YARN-2574: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #9 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/9/]) YARN-2690. [YARN-2574] Make ReservationSystem and its dependent classes independent of Scheduler type. (Anubhav Dhoot via kasha) (kasha: rev 2fce6d61412843f0447f60cfe02326f769edae25) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestNoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SimpleCapacityReplanner.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/Planner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SharingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/NoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestSimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestGreedyReservationAgent.java Add support for FairScheduler to the ReservationSystem -- Key: YARN-2574 URL: https://issues.apache.org/jira/browse/YARN-2574 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Subru Krishnan Assignee: Anubhav Dhoot YARN-1051 introduces the ReservationSystem and the current implementation is based on CapacityScheduler. This JIRA proposes adding support for FairScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2414) RM web UI: app page will crash if app is failed before any attempt has been created
[ https://issues.apache.org/jira/browse/YARN-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216086#comment-14216086 ] Hudson commented on YARN-2414: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #9 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/9/]) YARN-2414. RM web UI: app page will crash if app is failed before any attempt has been created. Contributed by Wangda Tan (jlowe: rev 81c9d17af84ed87b9ded7057cb726a3855ddd32d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestAppPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/QueueACLsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/AppBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/security/ApplicationACLsManager.java * hadoop-yarn-project/CHANGES.txt RM web UI: app page will crash if app is failed before any attempt has been created --- Key: YARN-2414 URL: https://issues.apache.org/jira/browse/YARN-2414 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Zhijie Shen Assignee: Wangda Tan Fix For: 2.7.0 Attachments: YARN-2414.20141104-1.patch, YARN-2414.20141104-2.patch, YARN-2414.patch
{code}
2014-08-12 16:45:13,573 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /cluster/app/application_1407887030038_0001
java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
	at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
	at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
	at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:84)
	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
	at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
	at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:460)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1191)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
	at
[jira] [Commented] (YARN-2690) Make ReservationSystem and its dependent classes independent of Scheduler type
[ https://issues.apache.org/jira/browse/YARN-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216100#comment-14216100 ] Hudson commented on YARN-2690: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #747 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/747/]) YARN-2690. [YARN-2574] Make ReservationSystem and its dependent classes independent of Scheduler type. (Anubhav Dhoot via kasha) (kasha: rev 2fce6d61412843f0447f60cfe02326f769edae25) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestSimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestNoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SharingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestGreedyReservationAgent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/Planner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSchedulerConfiguration.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/NoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityOverTimePolicy.java Make ReservationSystem and its dependent classes independent of Scheduler type Key: YARN-2690 URL: https://issues.apache.org/jira/browse/YARN-2690 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.7.0 Attachments: YARN-2690.001.patch, YARN-2690.002.patch, YARN-2690.002.patch, YARN-2690.003.patch, YARN-2690.004.patch, YARN-2690.004.patch A lot of common 
reservation classes depend on CapacityScheduler, and specifically on its configuration. This JIRA makes them ready for other schedulers by abstracting out the configuration.
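The abstraction described in YARN-2690 can be sketched roughly as follows. This is an illustrative sketch only: the class names and values below are invented for this example (the actual patch introduces ReservationSchedulerConfiguration in the resourcemanager reservation package); it just shows the idea of reservation classes depending on an abstract configuration type rather than on CapacityScheduler directly.

```java
// Hypothetical sketch: reservation code depends only on the abstract type,
// so a FairScheduler-backed implementation can plug in later.
abstract class ReservationConfigSketch {
    // Common reservation knobs every scheduler's configuration must expose.
    abstract boolean isReservable(String queue);
    abstract long getReservationWindow(String queue);  // milliseconds
}

// Capacity-style implementation; the rules here are made up for illustration.
class CapacityReservationConfigSketch extends ReservationConfigSketch {
    @Override boolean isReservable(String queue) { return queue.startsWith("res-"); }
    @Override long getReservationWindow(String queue) { return 24L * 3600 * 1000; }
}

// A fair-scheduler-style implementation can now be added without touching
// the reservation classes that consume ReservationConfigSketch.
class FairReservationConfigSketch extends ReservationConfigSketch {
    @Override boolean isReservable(String queue) { return true; }
    @Override long getReservationWindow(String queue) { return 12L * 3600 * 1000; }
}
```

With this shape, classes like the sharing policies and planners need only the abstract configuration, which is the decoupling the JIRA describes.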
[jira] [Commented] (YARN-2574) Add support for FairScheduler to the ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216106#comment-14216106 ] Hudson commented on YARN-2574: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #747 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/747/]) YARN-2690. [YARN-2574] Make ReservationSystem and its dependent classes independent of Scheduler type. (Anubhav Dhoot via kasha) (kasha: rev 2fce6d61412843f0447f60cfe02326f769edae25) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestSimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestNoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SharingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestGreedyReservationAgent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/Planner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSchedulerConfiguration.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/NoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityOverTimePolicy.java Add support for FairScheduler to the ReservationSystem -- Key: YARN-2574 URL: https://issues.apache.org/jira/browse/YARN-2574 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Subru Krishnan Assignee: Anubhav Dhoot YARN-1051 introduces the ReservationSystem and the current implementation is based on CapacityScheduler. 
This JIRA proposes adding support for FairScheduler.
[jira] [Commented] (YARN-2414) RM web UI: app page will crash if app is failed before any attempt has been created
[ https://issues.apache.org/jira/browse/YARN-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216099#comment-14216099 ] Hudson commented on YARN-2414: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #747 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/747/]) YARN-2414. RM web UI: app page will crash if app is failed before any attempt has been created. Contributed by Wangda Tan (jlowe: rev 81c9d17af84ed87b9ded7057cb726a3855ddd32d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/security/ApplicationACLsManager.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/AppBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestAppPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/QueueACLsManager.java RM web UI: app page will crash if app is failed before any attempt has been created --- Key: YARN-2414 URL: https://issues.apache.org/jira/browse/YARN-2414 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Zhijie Shen Assignee: Wangda Tan Fix For: 2.7.0 Attachments: YARN-2414.20141104-1.patch, YARN-2414.20141104-2.patch, YARN-2414.patch
{code}
2014-08-12 16:45:13,573 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /cluster/app/application_1407887030038_0001
java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
	at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263)
	at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178)
	at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62)
	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
	at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:84)
	at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
	at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
	at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
	at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
	at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:460)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1191)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
	at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
	at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
	at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
	at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
	at
[jira] [Commented] (YARN-2865) Application recovery continuously fails with Application with id already present. Cannot duplicate
[ https://issues.apache.org/jira/browse/YARN-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216134#comment-14216134 ] Rohith commented on YARN-2865: -- bq. Is it YARN-1874? That is a different issue, about moving RMActiveService out of ResourceManager; ignore it. Application recovery continuously fails with Application with id already present. Cannot duplicate Key: YARN-2865 URL: https://issues.apache.org/jira/browse/YARN-2865 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Assignee: Rohith Priority: Critical Attachments: YARN-2865.patch YARN-2588 handles the exception thrown while transitioning to active and resets activeServices, but it misses clearing the RMContext apps/nodes details, ClusterMetrics, and QueueMetrics. This causes application recovery to fail.
[jira] [Updated] (YARN-2865) Application recovery continuously fails with Application with id already present. Cannot duplicate
[ https://issues.apache.org/jira/browse/YARN-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith updated YARN-2865: - Attachment: YARN-2865.patch Application recovery continuously fails with Application with id already present. Cannot duplicate Key: YARN-2865 URL: https://issues.apache.org/jira/browse/YARN-2865 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Assignee: Rohith Priority: Critical Attachments: YARN-2865.patch, YARN-2865.patch YARN-2588 handles the exception thrown while transitioning to active and resets activeServices, but it misses clearing the RMContext apps/nodes details, ClusterMetrics, and QueueMetrics. This causes application recovery to fail.
[jira] [Commented] (YARN-2865) Application recovery continuously fails with Application with id already present. Cannot duplicate
[ https://issues.apache.org/jira/browse/YARN-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216180#comment-14216180 ] Rohith commented on YARN-2865: -- Attached a patch that separates RMContext and RMActiveServiceContext with minimal changes. # Retain the RMContext interface, and add RMActiveServiceContext to own the ActiveServices details. Active-service state is accessed via activeServiceContext.getX() and activeServiceContext.setX() in RMContextImpl. #* RMContext: rmDispatcher, isHAEnabled, haServiceState, adminService, configurationProvider, activeServiceContext. #* RMActiveServiceContext: the remaining field variables from RMContext (such as stateStore). Application recovery continuously fails with Application with id already present. Cannot duplicate Key: YARN-2865 URL: https://issues.apache.org/jira/browse/YARN-2865 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Assignee: Rohith Priority: Critical Attachments: YARN-2865.patch, YARN-2865.patch YARN-2588 handles the exception thrown while transitioning to active and resets activeServices, but it misses clearing the RMContext apps/nodes details, ClusterMetrics, and QueueMetrics. This causes application recovery to fail.
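The split Rohith describes can be sketched as follows. This is a minimal, hypothetical sketch: the field names and types are simplified stand-ins (the real RMActiveServiceContext would carry the state store, apps/nodes maps, and metrics references), but it shows the delegation pattern where RMContextImpl keeps always-on state itself and forwards active-service state to a replaceable context object.

```java
// Illustrative stand-in for RMActiveServiceContext: state that should be
// reset whenever the RM transitions to active.
class ActiveServiceContextSketch {
    private Object stateStore;  // e.g. the RM state store
    Object getStateStore() { return stateStore; }
    void setStateStore(Object s) { stateStore = s; }
}

// Illustrative stand-in for RMContextImpl: always-on fields stay here,
// active-service getters/setters delegate to the active context.
class RMContextSketch {
    private final boolean haEnabled;  // survives failover
    private ActiveServiceContextSketch active = new ActiveServiceContextSketch();

    RMContextSketch(boolean haEnabled) { this.haEnabled = haEnabled; }

    // On transition to active, the whole active context can be replaced,
    // dropping stale apps/nodes/metrics state in one step.
    void resetActiveContext() { active = new ActiveServiceContextSketch(); }

    boolean isHAEnabled() { return haEnabled; }
    Object getStateStore() { return active.getStateStore(); }
    void setStateStore(Object s) { active.setStateStore(s); }
}
```

The point of the delegation is that resetting one object clears everything tied to the previous active incarnation, which is the failure mode YARN-2865 reports.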
[jira] [Commented] (YARN-2574) Add support for FairScheduler to the ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216205#comment-14216205 ] Hudson commented on YARN-2574: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1937 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1937/]) YARN-2690. [YARN-2574] Make ReservationSystem and its dependent classes independent of Scheduler type. (Anubhav Dhoot via kasha) (kasha: rev 2fce6d61412843f0447f60cfe02326f769edae25) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityOverTimePolicy.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/NoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SharingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestGreedyReservationAgent.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestNoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/Planner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestSimpleCapacityReplanner.java Add support for FairScheduler to the ReservationSystem -- Key: YARN-2574 URL: https://issues.apache.org/jira/browse/YARN-2574 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Subru Krishnan Assignee: Anubhav Dhoot YARN-1051 introduces the ReservationSystem and the current implementation is based on CapacityScheduler. This JIRA proposes adding support for FairScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2690) Make ReservationSystem and its dependent classes independent of Scheduler type
[ https://issues.apache.org/jira/browse/YARN-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216199#comment-14216199 ] Hudson commented on YARN-2690: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1937 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1937/]) YARN-2690. [YARN-2574] Make ReservationSystem and its dependent classes independent of Scheduler type. (Anubhav Dhoot via kasha) (kasha: rev 2fce6d61412843f0447f60cfe02326f769edae25) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestGreedyReservationAgent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/NoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestNoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestSimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityOverTimePolicy.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SharingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/Planner.java Make ReservationSystem and its dependent classes independent of Scheduler type Key: YARN-2690 URL: https://issues.apache.org/jira/browse/YARN-2690 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.7.0 Attachments: YARN-2690.001.patch, YARN-2690.002.patch, YARN-2690.002.patch, YARN-2690.003.patch, YARN-2690.004.patch, YARN-2690.004.patch A lot of common reservation 
classes depend on CapacityScheduler and specifically its configuration. This JIRA makes them ready for other schedulers by abstracting out the configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2414) RM web UI: app page will crash if app is failed before any attempt has been created
[ https://issues.apache.org/jira/browse/YARN-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216198#comment-14216198 ] Hudson commented on YARN-2414: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1937 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1937/]) YARN-2414. RM web UI: app page will crash if app is failed before any attempt has been created. Contributed by Wangda Tan (jlowe: rev 81c9d17af84ed87b9ded7057cb726a3855ddd32d) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestAppPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/security/ApplicationACLsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/AppBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/QueueACLsManager.java RM web UI: app page will crash if app is failed before any attempt has been created --- Key: YARN-2414 URL: https://issues.apache.org/jira/browse/YARN-2414 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Zhijie Shen Assignee: Wangda Tan Fix For: 2.7.0 Attachments: YARN-2414.20141104-1.patch, YARN-2414.20141104-2.patch, YARN-2414.patch {code} 2014-08-12 16:45:13,573 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /cluster/app/application_1407887030038_0001 java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) 
at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:84) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118) at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:460) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1191) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) at 
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at
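The YARN-2414 crash above comes from the app page dereferencing the current attempt when a failed app never created one. The following is an illustrative sketch of that pattern and the guard the fix adds; the names are hypothetical, not the actual AppBlock code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class AppPageSketch {
    // Render info for the latest attempt. Before the fix, code along these
    // lines indexed into the attempt list unconditionally, which throws for
    // an app that failed before any attempt was created.
    static String renderAttemptInfo(List<String> attempts) {
        if (attempts == null || attempts.isEmpty()) {
            return "N/A";   // render a placeholder instead of crashing the page
        }
        return attempts.get(attempts.size() - 1);
    }

    public static void main(String[] args) {
        System.out.println(renderAttemptInfo(Collections.emptyList())); // N/A
        System.out.println(renderAttemptInfo(Arrays.asList("appattempt_1407887030038_0001_000001")));
    }
}
```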
[jira] [Commented] (YARN-2690) Make ReservationSystem and its dependent classes independent of Scheduler type
[ https://issues.apache.org/jira/browse/YARN-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216213#comment-14216213 ] Hudson commented on YARN-2690: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #9 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/9/]) YARN-2690. [YARN-2574] Make ReservationSystem and its dependent classes independent of Scheduler type. (Anubhav Dhoot via kasha) (kasha: rev 2fce6d61412843f0447f60cfe02326f769edae25) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestNoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/NoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SharingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestSimpleCapacityReplanner.java 
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestGreedyReservationAgent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/Planner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java Make ReservationSystem and its dependent classes independent of Scheduler type Key: YARN-2690 URL: https://issues.apache.org/jira/browse/YARN-2690 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.7.0 Attachments: YARN-2690.001.patch, YARN-2690.002.patch, YARN-2690.002.patch, YARN-2690.003.patch, YARN-2690.004.patch, YARN-2690.004.patch A lot of common reservation classes depend on CapacityScheduler and specifically its configuration. This jira is to make them ready for other Schedulers by abstracting out the configuration. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2574) Add support for FairScheduler to the ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216219#comment-14216219 ] Hudson commented on YARN-2574: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #9 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/9/]) YARN-2690. [YARN-2574] Make ReservationSystem and its dependent classes independent of Scheduler type. (Anubhav Dhoot via kasha) (kasha: rev 2fce6d61412843f0447f60cfe02326f769edae25) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestNoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/NoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SharingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestSimpleCapacityReplanner.java 
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestGreedyReservationAgent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/Planner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java Add support for FairScheduler to the ReservationSystem -- Key: YARN-2574 URL: https://issues.apache.org/jira/browse/YARN-2574 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Subru Krishnan Assignee: Anubhav Dhoot YARN-1051 introduces the ReservationSystem and the current implementation is based on CapacityScheduler. This JIRA proposes adding support for FairScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
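The YARN-2690 change referenced above decouples the reservation classes from CapacityScheduler by introducing an abstract configuration (ReservationSchedulerConfiguration). A minimal sketch of that shape, with illustrative method names and a hypothetical fixed implementation rather than the actual Hadoop classes:

```java
// Reservation classes program against this abstraction instead of
// CapacitySchedulerConfiguration, so a FairScheduler-backed implementation
// can be plugged in later. Names here are illustrative.
public abstract class ReservationConfigSketch {
    // Per-queue knobs the reservation system needs, scheduler-agnostic.
    public abstract boolean isReservable(String queue);
    public abstract long getReservationWindow(String queue);

    // A CapacityScheduler-flavoured subclass would read these values from
    // capacity-scheduler.xml; a FairScheduler one from fair-scheduler.xml.
    public static class FixedConfig extends ReservationConfigSketch {
        public boolean isReservable(String queue) {
            return queue.startsWith("res");
        }
        public long getReservationWindow(String queue) {
            return 24L * 60 * 60 * 1000;  // one day, in milliseconds
        }
    }

    public static void main(String[] args) {
        ReservationConfigSketch conf = new FixedConfig();
        System.out.println(conf.isReservable("resQueue")); // true
    }
}
```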
[jira] [Commented] (YARN-2414) RM web UI: app page will crash if app is failed before any attempt has been created
[ https://issues.apache.org/jira/browse/YARN-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216212#comment-14216212 ] Hudson commented on YARN-2414: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #9 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/9/]) YARN-2414. RM web UI: app page will crash if app is failed before any attempt has been created. Contributed by Wangda Tan (jlowe: rev 81c9d17af84ed87b9ded7057cb726a3855ddd32d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/security/ApplicationACLsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/QueueACLsManager.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/AppBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestAppPage.java RM web UI: app page will crash if app is failed before any attempt has been created --- Key: YARN-2414 URL: https://issues.apache.org/jira/browse/YARN-2414 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Zhijie Shen Assignee: Wangda Tan Fix For: 2.7.0 Attachments: YARN-2414.20141104-1.patch, YARN-2414.20141104-2.patch, YARN-2414.patch {code} 2014-08-12 16:45:13,573 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /cluster/app/application_1407887030038_0001 java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at 
java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:84) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118) at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:460) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1191) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at 
org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at
[jira] [Updated] (YARN-2165) Timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero
[ https://issues.apache.org/jira/browse/YARN-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vasanth kumar RJ updated YARN-2165: --- Attachment: YARN-2165.1.patch [~zjshen] Implemented your comments. Please comment if any further change is required. Timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero - Key: YARN-2165 URL: https://issues.apache.org/jira/browse/YARN-2165 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Karam Singh Assignee: Vasanth kumar RJ Attachments: YARN-2165.1.patch, YARN-2165.patch Timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero. Currently, if yarn.timeline-service.ttl-ms=0 or yarn.timeline-service.ttl-ms=-86400 is set, the timeline server starts successfully without complaining: {code} 2014-06-15 14:52:16,562 INFO timeline.LeveldbTimelineStore (LeveldbTimelineStore.java:init(247)) - Starting deletion thread with ttl -60480 and cycle interval 30 {code} At startup the timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero; otherwise, especially for a negative value, the discard-old-entities timestamp is set to a future value, which may lead to inconsistent behavior: {code} public void run() { while (true) { long timestamp = System.currentTimeMillis() - ttl; try { discardOldEntities(timestamp); Thread.sleep(ttlInterval); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
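The validation YARN-2165 asks for can be sketched as a simple startup check that rejects a non-positive ttl before the deletion thread is created. The method name below is hypothetical; it is not the actual LeveldbTimelineStore code.

```java
// Sketch: fail fast on a non-positive yarn.timeline-service.ttl-ms instead of
// starting a deletion thread whose "discard older than now - ttl" cutoff
// would lie in the future for negative ttl values.
public class TtlValidationSketch {
    static long validateTtl(long ttlMs) {
        if (ttlMs <= 0) {
            throw new IllegalArgumentException(
                "yarn.timeline-service.ttl-ms must be greater than zero, got " + ttlMs);
        }
        return ttlMs;
    }

    public static void main(String[] args) {
        System.out.println(validateTtl(604800000L));  // a valid one-week ttl
        try {
            validateTtl(-86400L);  // the misconfiguration from the report
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```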
[jira] [Commented] (YARN-2865) Application recovery continuously fails with Application with id already present. Cannot duplicate
[ https://issues.apache.org/jira/browse/YARN-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216242#comment-14216242 ] Hadoop QA commented on YARN-2865: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12682163/YARN-2865.patch against trunk revision 9dd5d67. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5866//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5866//console This message is automatically generated. Application recovery continuously fails with Application with id already present. Cannot duplicate Key: YARN-2865 URL: https://issues.apache.org/jira/browse/YARN-2865 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Assignee: Rohith Priority: Critical Attachments: YARN-2865.patch, YARN-2865.patch YARN-2588 handles exception thrown while transitioningToActive and reset activeServices. But it misses out clearing RMcontext apps/nodes details and ClusterMetrics and QueueMetrics. 
This causes application recovery to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
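The YARN-2865 failure mode above can be reduced to a small sketch: if a failed transition-to-active resets services but leaves recovered apps in the context, the next recovery attempt sees the app "already present". Types and names below are simplified stand-ins, not the actual RMContext code.

```java
import java.util.HashMap;
import java.util.Map;

public class RecoveryResetSketch {
    // Stands in for the RMContext application map that YARN-2588 left uncleared.
    private final Map<String, String> apps = new HashMap<>();

    String recover(String appId) {
        if (apps.containsKey(appId)) {
            // This is the duplicate-id failure the JIRA reports on retry.
            throw new IllegalStateException(
                "Application with id " + appId + " is already present! Cannot duplicate");
        }
        apps.put(appId, "RECOVERED");
        return apps.get(appId);
    }

    // The proposed fix: clear recovered state (and, in the real RM, cluster
    // and queue metrics) when transitionToActive fails, so a retried
    // recovery starts clean.
    void resetOnFailedTransition() {
        apps.clear();
    }

    public static void main(String[] args) {
        RecoveryResetSketch rm = new RecoveryResetSketch();
        rm.recover("application_1407887030038_0001");
        rm.resetOnFailedTransition();                 // simulate cleanup after a failed transition
        rm.recover("application_1407887030038_0001"); // succeeds after reset
        System.out.println("recovery retried cleanly");
    }
}
```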
[jira] [Created] (YARN-2874) Dead lock in DelegationTokenRenewer which blocks RM to execute any further apps
Naganarasimha G R created YARN-2874: --- Summary: Dead lock in DelegationTokenRenewer which blocks RM to execute any further apps Key: YARN-2874 URL: https://issues.apache.org/jira/browse/YARN-2874 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.5.1, 2.4.1, 2.5.0 Reporter: Naganarasimha G R Assignee: Naganarasimha G R Priority: Critical When token renewal fails and the application finishes this dead lock can occur Jstack dump : {quote} Found one Java-level deadlock: = DelegationTokenRenewer #181865: waiting to lock monitor 0x00900918 (object 0xc18a9998, a java.util.Collections$SynchronizedSet), which is held by DelayedTokenCanceller DelayedTokenCanceller: waiting to lock monitor 0x04141718 (object 0xc7eae720, a org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask), which is held by Timer-4 Timer-4: waiting to lock monitor 0x00900918 (object 0xc18a9998, a java.util.Collections$SynchronizedSet), which is held by DelayedTokenCanceller Java stack information for the threads listed above: === DelegationTokenRenewer #181865: at java.util.Collections$SynchronizedCollection.add(Collections.java:1636) - waiting to lock 0xc18a9998 (a java.util.Collections$SynchronizedSet) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.addTokenToList(DelegationTokenRenewer.java:322) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:398) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$500(DelegationTokenRenewer.java:70) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:657) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:638) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) DelayedTokenCanceller: at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask.cancel(DelegationTokenRenewer.java:443) - waiting to lock 0xc7eae720 (a org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.removeApplicationFromRenewal(DelegationTokenRenewer.java:558) - locked 0xc18a9998 (a java.util.Collections$SynchronizedSet) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$300(DelegationTokenRenewer.java:70) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelayedTokenRemovalRunnable.run(DelegationTokenRenewer.java:599) at java.lang.Thread.run(Thread.java:745) Timer-4: at java.util.Collections$SynchronizedCollection.remove(Collections.java:1639) - waiting to lock 0xc18a9998 (a java.util.Collections$SynchronizedSet) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.removeFailedDelegationToken(DelegationTokenRenewer.java:503) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$100(DelegationTokenRenewer.java:70) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask.run(DelegationTokenRenewer.java:437) - locked 0xc7eae720 (a org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) Found 1 deadlock. {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
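The YARN-2874 jstack above is a classic lock-ordering deadlock: one thread locks the synchronized token set and then waits on the timer task's monitor, while the timer thread holds the task's monitor and waits on the set. A reduced illustration, with hypothetical names, of the deadlock-free variant where both paths acquire the two monitors in the same order:

```java
// Reduction of the deadlock: (tokenSet -> timerTask) in one thread versus
// (timerTask -> tokenSet) in another. The safe version below takes the
// monitors in one consistent order on both paths (an alternative fix is to
// cancel the timer task outside the set's monitor entirely).
public class LockOrderSketch {
    private final Object tokenSet = new Object();   // stands in for the SynchronizedSet
    private final Object timerTask = new Object();  // stands in for RenewalTimerTask

    String removeApplication() {
        synchronized (tokenSet) {
            synchronized (timerTask) {
                return "cancelled";   // cancel the renewal task, drop its tokens
            }
        }
    }

    String renewalFailed() {
        synchronized (tokenSet) {     // same order as removeApplication()
            synchronized (timerTask) {
                return "removed";     // remove the failed delegation token
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LockOrderSketch s = new LockOrderSketch();
        Thread a = new Thread(s::removeApplication);
        Thread b = new Thread(s::renewalFailed);
        a.start(); b.start();
        a.join(); b.join();
        System.out.println("no deadlock with a consistent lock order");
    }
}
```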
[jira] [Commented] (YARN-2738) Add FairReservationSystem for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216246#comment-14216246 ] Anubhav Dhoot commented on YARN-2738: - The minimal configuration we need is the ability to mark queues as usable by the reservation system. We can use global defaults for the rest to begin with. Add FairReservationSystem for FairScheduler --- Key: YARN-2738 URL: https://issues.apache.org/jira/browse/YARN-2738 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2738.001.patch, YARN-2738.002.patch Need to create a FairReservationSystem that will implement ReservationSystem for FairScheduler. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
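The comment above asks only for a way to mark a queue as usable by the reservation system. One plausible shape for that marker in the fair-scheduler.xml allocation file is sketched below; the `<reservation>` element is a hypothetical illustration of the proposal here, since the exact syntax is settled in the sub-task patches, not in this comment.

```xml
<allocations>
  <queue name="reservable">
    <!-- hypothetical marker: flags this queue's resources as available
         to the ReservationSystem; all other reservation settings fall
         back to global defaults, per the comment above -->
    <reservation></reservation>
    <weight>1.0</weight>
  </queue>
</allocations>
```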
[jira] [Commented] (YARN-2690) Make ReservationSystem and its dependent classes independent of Scheduler type
[ https://issues.apache.org/jira/browse/YARN-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216273#comment-14216273 ] Hudson commented on YARN-2690: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1961 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1961/]) YARN-2690. [YARN-2574] Make ReservationSystem and its dependent classes independent of Scheduler type. (Anubhav Dhoot via kasha) (kasha: rev 2fce6d61412843f0447f60cfe02326f769edae25) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestNoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityOverTimePolicy.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/Planner.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestGreedyReservationAgent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/NoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestSimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SharingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java Make ReservationSystem and its dependent classes independent of Scheduler type Key: YARN-2690 URL: https://issues.apache.org/jira/browse/YARN-2690 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.7.0 Attachments: YARN-2690.001.patch, YARN-2690.002.patch, YARN-2690.002.patch, YARN-2690.003.patch, YARN-2690.004.patch, YARN-2690.004.patch A lot of common reservation classes depend on CapacityScheduler and 
specifically its configuration. This JIRA makes them ready for other schedulers by abstracting out the configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2414) RM web UI: app page will crash if app is failed before any attempt has been created
[ https://issues.apache.org/jira/browse/YARN-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216272#comment-14216272 ] Hudson commented on YARN-2414: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1961 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1961/]) YARN-2414. RM web UI: app page will crash if app is failed before any attempt has been created. Contributed by Wangda Tan (jlowe: rev 81c9d17af84ed87b9ded7057cb726a3855ddd32d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/QueueACLsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/AppBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestAppPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/security/ApplicationACLsManager.java * hadoop-yarn-project/CHANGES.txt RM web UI: app page will crash if app is failed before any attempt has been created --- Key: YARN-2414 URL: https://issues.apache.org/jira/browse/YARN-2414 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Zhijie Shen Assignee: Wangda Tan Fix For: 2.7.0 Attachments: YARN-2414.20141104-1.patch, YARN-2414.20141104-2.patch, YARN-2414.patch {code} 2014-08-12 16:45:13,573 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /cluster/app/application_1407887030038_0001 java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at 
java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:84) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118) at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:460) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1191) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at 
org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at
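The crash pattern behind YARN-2414 — the app page assumes a current attempt always exists, but an app that fails before any attempt is created has none — can be sketched as a simple null guard. The `AppAttemptInfo` type and `renderAttempt` method below are hypothetical stand-ins for illustration, not the actual AppBlock code:

```java
/**
 * Illustrative sketch of the guard YARN-2414 adds: an app that failed
 * before any attempt was created has no current attempt, so the web UI
 * must render a placeholder instead of dereferencing null.
 * These names are hypothetical, not the actual Hadoop web UI classes.
 */
public class AppPageGuard {

    /** Minimal stand-in for the attempt data the app page renders. */
    public static class AppAttemptInfo {
        public final String id;
        public AppAttemptInfo(String id) { this.id = id; }
    }

    /** Renders the attempt section, tolerating a missing attempt. */
    public static String renderAttempt(AppAttemptInfo attempt) {
        if (attempt == null) {
            return "N/A"; // app failed before any attempt was created
        }
        return attempt.id;
    }

    public static void main(String[] args) {
        System.out.println(renderAttempt(null)); // no NPE, prints placeholder
        System.out.println(renderAttempt(
            new AppAttemptInfo("appattempt_1407887030038_0001_000001")));
    }
}
```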
[jira] [Commented] (YARN-2165) Timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero
[ https://issues.apache.org/jira/browse/YARN-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216289#comment-14216289 ] Hadoop QA commented on YARN-2165: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12682170/YARN-2165.1.patch against trunk revision 9dd5d67. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice: org.apache.hadoop.yarn.server.timeline.security.TestTimelineAuthenticationFilter {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5867//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5867//console This message is automatically generated. 
Timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero - Key: YARN-2165 URL: https://issues.apache.org/jira/browse/YARN-2165 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Karam Singh Assignee: Vasanth kumar RJ Attachments: YARN-2165.1.patch, YARN-2165.patch The timeline server should validate that yarn.timeline-service.ttl-ms is greater than zero. Currently, if you set yarn.timeline-service.ttl-ms=0 or yarn.timeline-service.ttl-ms=-86400, the timeline server starts successfully, merely logging {code} 2014-06-15 14:52:16,562 INFO timeline.LeveldbTimelineStore (LeveldbTimelineStore.java:init(247)) - Starting deletion thread with ttl -60480 and cycle interval 30 {code} At startup the timeline server should check that yarn.timeline-service.ttl-ms is greater than 0; otherwise, especially for a negative value, the timestamp passed to discardOldEntities will lie in the future, which may lead to inconsistent behavior: {code} public void run() { while (true) { long timestamp = System.currentTimeMillis() - ttl; try { discardOldEntities(timestamp); Thread.sleep(ttlInterval); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
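The fail-fast check this issue asks for can be sketched as follows. The class and method names are hypothetical (the real fix would live in the timeline store's service init); the point is simply to reject a non-positive TTL before the deletion thread is started:

```java
/**
 * Illustrative sketch of the validation YARN-2165 requests: fail fast at
 * service init when yarn.timeline-service.ttl-ms is zero or negative,
 * instead of silently starting a deletion thread with a bogus TTL.
 * Class and method names are hypothetical, not the actual Hadoop code.
 */
public class TtlValidation {
    static final String TTL_MS = "yarn.timeline-service.ttl-ms";

    /** Returns the validated TTL, or throws if it is not strictly positive. */
    public static long validateTtl(long ttlMs) {
        if (ttlMs <= 0) {
            throw new IllegalArgumentException(
                TTL_MS + " must be greater than zero, but was " + ttlMs);
        }
        return ttlMs;
    }

    public static void main(String[] args) {
        System.out.println(validateTtl(604800000L)); // 7 days: accepted
        try {
            validateTtl(-86400L); // negative TTL: rejected at startup
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```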
[jira] [Commented] (YARN-2414) RM web UI: app page will crash if app is failed before any attempt has been created
[ https://issues.apache.org/jira/browse/YARN-2414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216293#comment-14216293 ] Hudson commented on YARN-2414: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #9 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/9/]) YARN-2414. RM web UI: app page will crash if app is failed before any attempt has been created. Contributed by Wangda Tan (jlowe: rev 81c9d17af84ed87b9ded7057cb726a3855ddd32d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestAppPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/QueueACLsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/AppBlock.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/security/ApplicationACLsManager.java * hadoop-yarn-project/CHANGES.txt RM web UI: app page will crash if app is failed before any attempt has been created --- Key: YARN-2414 URL: https://issues.apache.org/jira/browse/YARN-2414 Project: Hadoop YARN Issue Type: Bug Components: webapp Reporter: Zhijie Shen Assignee: Wangda Tan Fix For: 2.7.0 Attachments: YARN-2414.20141104-1.patch, YARN-2414.20141104-2.patch, YARN-2414.patch {code} 2014-08-12 16:45:13,573 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /cluster/app/application_1407887030038_0001 java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at 
java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.yarn.webapp.Dispatcher.service(Dispatcher.java:153) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:263) at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:178) at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:62) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834) at org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebAppFilter.doFilter(RMWebAppFilter.java:84) at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795) at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163) at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58) at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118) at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:460) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1191) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at 
org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at
[jira] [Commented] (YARN-2574) Add support for FairScheduler to the ReservationSystem
[ https://issues.apache.org/jira/browse/YARN-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216300#comment-14216300 ] Hudson commented on YARN-2574: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #9 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/9/]) YARN-2690. [YARN-2574] Make ReservationSystem and its dependent classes independent of Scheduler type. (Anubhav Dhoot via kasha) (kasha: rev 2fce6d61412843f0447f60cfe02326f769edae25) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/NoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/Planner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestGreedyReservationAgent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SharingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestSimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestNoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityReservationSystem.java Add support for FairScheduler to the ReservationSystem -- Key: YARN-2574 URL: https://issues.apache.org/jira/browse/YARN-2574 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Subru Krishnan Assignee: Anubhav Dhoot YARN-1051 introduces the ReservationSystem and the current implementation is based on CapacityScheduler. This JIRA proposes adding support for FairScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2690) Make ReservationSystem and its dependent classes independent of Scheduler type
[ https://issues.apache.org/jira/browse/YARN-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216294#comment-14216294 ] Hudson commented on YARN-2690: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #9 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/9/]) YARN-2690. [YARN-2574] Make ReservationSystem and its dependent classes independent of Scheduler type. (Anubhav Dhoot via kasha) (kasha: rev 2fce6d61412843f0447f60cfe02326f769edae25) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/NoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityReservationSystem.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/Planner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSystemTestUtil.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestGreedyReservationAgent.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/ReservationSchedulerConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/SharingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/CapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestSimpleCapacityReplanner.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestNoOverCommitPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityOverTimePolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/TestCapacityReservationSystem.java Make ReservationSystem and its dependent classes independent of Scheduler type Key: YARN-2690 URL: https://issues.apache.org/jira/browse/YARN-2690 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Fix For: 2.7.0 Attachments: YARN-2690.001.patch, YARN-2690.002.patch, YARN-2690.002.patch, YARN-2690.003.patch, YARN-2690.004.patch, YARN-2690.004.patch A lot of common reservation classes depend on 
CapacityScheduler and specifically its configuration. This jira is to make them ready for other Schedulers by abstracting out the configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
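The refactoring YARN-2690 describes — pulling reservation settings out of CapacitySchedulerConfiguration behind an abstract ReservationSchedulerConfiguration so a FairScheduler implementation can plug in — can be sketched roughly like this. Beyond the ReservationSchedulerConfiguration idea itself, the class and method names here are simplified guesses for illustration, not the actual Hadoop API:

```java
/**
 * Rough sketch of abstracting reservation settings away from a concrete
 * scheduler, in the spirit of YARN-2690. The real
 * ReservationSchedulerConfiguration has a richer API; these names are
 * simplified stand-ins.
 */
public class SchedulerAbstractionSketch {

    /** Scheduler-agnostic view of per-plan reservation settings. */
    public abstract static class ReservationConfig {
        public abstract boolean isReservable(String queue);
        public abstract long getReservationWindowMs(String queue);
    }

    /** Capacity-scheduler-backed settings (hard-coded for the sketch). */
    public static class CapacityReservationConfig extends ReservationConfig {
        public boolean isReservable(String queue) {
            return queue.startsWith("root.reservable");
        }
        public long getReservationWindowMs(String queue) {
            return 24L * 60 * 60 * 1000; // one day, in ms
        }
    }

    /** A fair-scheduler-backed implementation plugs in the same way. */
    public static class FairReservationConfig extends ReservationConfig {
        public boolean isReservable(String queue) {
            return true; // would be read from the allocation file
        }
        public long getReservationWindowMs(String queue) {
            return 24L * 60 * 60 * 1000;
        }
    }

    public static void main(String[] args) {
        // Code like SimpleCapacityReplanner or the sharing policies can now
        // depend only on ReservationConfig, never on a concrete scheduler.
        ReservationConfig conf = new CapacityReservationConfig();
        System.out.println(conf.isReservable("root.reservable.q1"));
    }
}
```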
[jira] [Updated] (YARN-2874) Dead lock in DelegationTokenRenewer which blocks RM to execute any further apps
[ https://issues.apache.org/jira/browse/YARN-2874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-2874: Attachment: YARN-2874.20141118-2.patch Updated patch with fixes for the review comments. Dead lock in DelegationTokenRenewer which blocks RM to execute any further apps - Key: YARN-2874 URL: https://issues.apache.org/jira/browse/YARN-2874 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.5.0, 2.4.1, 2.5.1 Reporter: Naganarasimha G R Assignee: Naganarasimha G R Priority: Critical Attachments: YARN-2874.20141118-1.patch, YARN-2874.20141118-2.patch When token renewal fails and the application finishes, this deadlock can occur. Jstack dump: {quote} Found one Java-level deadlock: = DelegationTokenRenewer #181865: waiting to lock monitor 0x00900918 (object 0xc18a9998, a java.util.Collections$SynchronizedSet), which is held by DelayedTokenCanceller DelayedTokenCanceller: waiting to lock monitor 0x04141718 (object 0xc7eae720, a org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask), which is held by Timer-4 Timer-4: waiting to lock monitor 0x00900918 (object 0xc18a9998, a java.util.Collections$SynchronizedSet), which is held by DelayedTokenCanceller Java stack information for the threads listed above: === DelegationTokenRenewer #181865: at java.util.Collections$SynchronizedCollection.add(Collections.java:1636) - waiting to lock 0xc18a9998 (a java.util.Collections$SynchronizedSet) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.addTokenToList(DelegationTokenRenewer.java:322) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:398) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$500(DelegationTokenRenewer.java:70) at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:657) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:638) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) DelayedTokenCanceller: at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask.cancel(DelegationTokenRenewer.java:443) - waiting to lock 0xc7eae720 (a org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.removeApplicationFromRenewal(DelegationTokenRenewer.java:558) - locked 0xc18a9998 (a java.util.Collections$SynchronizedSet) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$300(DelegationTokenRenewer.java:70) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelayedTokenRemovalRunnable.run(DelegationTokenRenewer.java:599) at java.lang.Thread.run(Thread.java:745) Timer-4: at java.util.Collections$SynchronizedCollection.remove(Collections.java:1639) - waiting to lock 0xc18a9998 (a java.util.Collections$SynchronizedSet) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.removeFailedDelegationToken(DelegationTokenRenewer.java:503) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$100(DelegationTokenRenewer.java:70) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask.run(DelegationTokenRenewer.java:437) - locked 0xc7eae720 (a 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) Found 1 deadlock. {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
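The cycle in the jstack dump above is the classic ABBA pattern: one thread holds the synchronized token set while waiting for the timer-task monitor, while another holds the timer-task monitor while waiting for the set. One common remedy, sketched below with plain locks (an illustration of the lock-ordering idea, not the actual YARN-2874 patch), is to never hold the collection lock while acquiring the task lock:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/**
 * Sketch of avoiding the ABBA deadlock seen in YARN-2874: never hold the
 * synchronized token-set lock while acquiring the timer-task lock, so no
 * two threads can wait on each other's monitors. This illustrates the
 * lock-ordering idea only; it is not the actual patch.
 */
public class LockOrderingSketch {
    public static final Set<String> tokens =
        Collections.synchronizedSet(new HashSet<>());
    public static final Object timerTaskLock = new Object();

    // Deadlock-prone shape (do NOT do this):
    //   synchronized (tokens) { synchronized (timerTaskLock) { ... } }

    /** Safer shape: release the set's lock before touching the task lock. */
    public static boolean removeAndCancel(String token) {
        boolean removed = tokens.remove(token); // holds only the set's lock
        if (removed) {
            synchronized (timerTaskLock) {
                // cancel the renewal timer task here, one lock at a time
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        tokens.add("t1");
        System.out.println(removeAndCancel("t1"));
    }
}
```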
[jira] [Commented] (YARN-2165) Timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero
[ https://issues.apache.org/jira/browse/YARN-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216407#comment-14216407 ] Chen He commented on YARN-2165: --- Hi [~vasanthkumar], thank you for the patch. A small nit in the unit test code: once the test catches the expected exception, it would be good to verify which parameter produced it, e.g. by checking the exception message. Timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero - Key: YARN-2165 URL: https://issues.apache.org/jira/browse/YARN-2165 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Karam Singh Assignee: Vasanth kumar RJ Attachments: YARN-2165.1.patch, YARN-2165.patch The timeline server should validate that yarn.timeline-service.ttl-ms is greater than zero. Currently, if you set yarn.timeline-service.ttl-ms=0 or yarn.timeline-service.ttl-ms=-86400, the timeline server starts successfully, merely logging {code} 2014-06-15 14:52:16,562 INFO timeline.LeveldbTimelineStore (LeveldbTimelineStore.java:init(247)) - Starting deletion thread with ttl -60480 and cycle interval 30 {code} At startup the timeline server should check that yarn.timeline-service.ttl-ms is greater than 0; otherwise, especially for a negative value, the timestamp passed to discardOldEntities will lie in the future, which may lead to inconsistent behavior: {code} public void run() { while (true) { long timestamp = System.currentTimeMillis() - ttl; try { discardOldEntities(timestamp); Thread.sleep(ttlInterval); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2165) Timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero
[ https://issues.apache.org/jira/browse/YARN-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216413#comment-14216413 ] Chen He commented on YARN-2165: --- Agree with [~zhijie shen] that the parameter checks should be combined. Timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero - Key: YARN-2165 URL: https://issues.apache.org/jira/browse/YARN-2165 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Karam Singh Assignee: Vasanth kumar RJ Attachments: YARN-2165.1.patch, YARN-2165.patch The timeline server should validate that yarn.timeline-service.ttl-ms is greater than zero. Currently, if you set yarn.timeline-service.ttl-ms=0 or yarn.timeline-service.ttl-ms=-86400, the timeline server starts successfully, merely logging {code} 2014-06-15 14:52:16,562 INFO timeline.LeveldbTimelineStore (LeveldbTimelineStore.java:init(247)) - Starting deletion thread with ttl -60480 and cycle interval 30 {code} At startup the timeline server should check that yarn.timeline-service.ttl-ms is greater than 0; otherwise, especially for a negative value, the timestamp passed to discardOldEntities will lie in the future, which may lead to inconsistent behavior: {code} public void run() { while (true) { long timestamp = System.currentTimeMillis() - ttl; try { discardOldEntities(timestamp); Thread.sleep(ttlInterval); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2375) Allow enabling/disabling timeline server per framework
[ https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated YARN-2375: Attachment: YARN-2375.patch Attaching the patch Allow enabling/disabling timeline server per framework -- Key: YARN-2375 URL: https://issues.apache.org/jira/browse/YARN-2375 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jonathan Eagles Assignee: Mit Desai Attachments: YARN-2375.patch This JIRA is to remove the ats enabled flag check within the TimelineClientImpl. Example where this fails is below. While running secure timeline server with ats flag set to disabled on resource manager, Timeline delegation token renewer throws an NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2375) Allow enabling/disabling timeline server per framework
[ https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated YARN-2375: Attachment: YARN-2375.patch Attaching the patch. [~jeagles], [~zjshen] can you see if this approach is good? Allow enabling/disabling timeline server per framework -- Key: YARN-2375 URL: https://issues.apache.org/jira/browse/YARN-2375 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jonathan Eagles Assignee: Mit Desai Attachments: YARN-2375.patch, YARN-2375.patch This JIRA is to remove the ats enabled flag check within the TimelineClientImpl. Example where this fails is below. While running secure timeline server with ats flag set to disabled on resource manager, Timeline delegation token renewer throws an NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2675) the containersKilled metrics is not updated when the container is killed during localization.
[ https://issues.apache.org/jira/browse/YARN-2675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216452#comment-14216452 ] zhihai xu commented on YARN-2675: - Hi [~vinodkv], Are you ok with the new patch? The new patch passed Hadoop QA test. thanks zhihai the containersKilled metrics is not updated when the container is killed during localization. - Key: YARN-2675 URL: https://issues.apache.org/jira/browse/YARN-2675 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.5.0 Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-2675.000.patch, YARN-2675.001.patch, YARN-2675.002.patch, YARN-2675.003.patch The containersKilled metrics is not updated when the container is killed during localization. We should add KILLING state in finished of ContainerImpl.java to update killedContainer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (YARN-2863) ResourceManager will shutdown when job's queuename is empty
[ https://issues.apache.org/jira/browse/YARN-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K resolved YARN-2863. - Resolution: Invalid ResourceManager will shutdown when job's queuename is empty --- Key: YARN-2863 URL: https://issues.apache.org/jira/browse/YARN-2863 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.2.0 Reporter: yangping wu Labels: hadoop Fix For: 3.0.0 Original Estimate: 8h Remaining Estimate: 8h When I submit a job to the Hadoop cluster without specifying a queue name, as follows: {code} $HADOOP_HOME/bin/hadoop jar statistics.jar com.iteblog.Sts -Dmapreduce.job.queuename= {code} and *yarn.scheduler.fair.allow-undeclared-pools* is not overridden by the user (default is true), then QueueManager will call the createLeafQueue method to create the queue, because mapreduce.job.queuename is empty and cannot be found in QueueManager. But this throws a MetricsException: {code} 2014-11-14 16:07:57,358 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type APP_ADDED to the scheduler org.apache.hadoop.metrics2.MetricsException: Metrics source QueueMetrics,q0=root already exists! 
at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.newSourceName(DefaultMetricsSystem.java:126) at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.sourceName(DefaultMetricsSystem.java:107) at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:217) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueueMetrics.forQueue(FSQueueMetrics.java:94) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSQueue.init(FSQueue.java:57) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.init(FSLeafQueue.java:57) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.createLeafQueue(QueueManager.java:191) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:136) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.assignToQueue(FairScheduler.java:652) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.addApplication(FairScheduler.java:610) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1015) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:112) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:440) at java.lang.Thread.run(Thread.java:744) 2014-11-14 16:07:57,359 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye.. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
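The symptom above stems from an empty queue name reaching queue creation. A minimal sketch of defensive normalization, falling back to the default queue when the submitted name is empty (a hypothetical helper for illustration, not FairScheduler's actual assignToQueue logic):

```java
/**
 * Sketch of defending against an empty mapreduce.job.queuename, which in
 * YARN-2863 drives queue creation into a duplicate "QueueMetrics,q0=root"
 * metrics registration. Hypothetical helper, not the actual
 * FairScheduler code.
 */
public class QueueNameSketch {
    public static final String DEFAULT_QUEUE = "root.default";

    /** Maps null or blank queue names to the default queue. */
    public static String normalize(String queueName) {
        if (queueName == null || queueName.trim().isEmpty()) {
            return DEFAULT_QUEUE;
        }
        return queueName.trim();
    }

    public static void main(String[] args) {
        System.out.println(normalize(""));          // falls back to root.default
        System.out.println(normalize("root.prod")); // passes through unchanged
    }
}
```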
[jira] [Commented] (YARN-2873) improve LevelDB error handling for missing files DBException to avoid NM start failure.
[ https://issues.apache.org/jira/browse/YARN-2873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216464#comment-14216464 ] zhihai xu commented on YARN-2873: - Hi [~jlowe], I agree with you. The root cause is the Sorted Tables(*.sst) and MANIFEST file being deleted. If these files are stored away from tmp directory, it may solve the problem. thanks zhihai improve LevelDB error handling for missing files DBException to avoid NM start failure. --- Key: YARN-2873 URL: https://issues.apache.org/jira/browse/YARN-2873 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.5.0 Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-2873.000.patch, YARN-2873.001.patch improve LevelDB error handling for missing files DBException to avoid NM start failure. We saw the following three level DB exceptions, all these exceptions cause NM start failure. DBException 1 in ShuffleHandler {code} INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl failed in state STARTED; cause: org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 1 missing files; e.g.: /tmp/hadoop-yarn/yarn-nm-recovery/nm-aux-services/mapreduce_shuffle/mapreduce_shuffle_state/05.sst org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 1 missing files; e.g.: /tmp/hadoop-yarn/yarn-nm-recovery/nm-aux-services/mapreduce_shuffle/mapreduce_shuffle_state/05.sst at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:204) at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceStart(AuxServices.java:159) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120) at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceStart(ContainerManagerImpl.java:441) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:261) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:446) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:492) Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 1 missing files; e.g.: /tmp/hadoop-yarn/yarn-nm-recovery/nm-aux-services/mapreduce_shuffle/mapreduce_shuffle_state/05.sst at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:200) at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:218) at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168) at org.apache.hadoop.mapred.ShuffleHandler.startStore(ShuffleHandler.java:475) at org.apache.hadoop.mapred.ShuffleHandler.recoverState(ShuffleHandler.java:443) at org.apache.hadoop.mapred.ShuffleHandler.serviceStart(ShuffleHandler.java:379) at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) ... 
10 more {code} DBException 2 in NMLeveldbStateStoreService: {code} Error starting NodeManager org.apache.hadoop.service.ServiceStateException: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 1 missing files; e.g.: /tmp/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/05.sst at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartRecoveryStore(NodeManager.java:152) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:190) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:445) at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:492) Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 1 missing files; e.g.:
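The JIRA title points at the fix direction: handle the missing-files `DBException` instead of letting it abort NM startup. A rough, self-contained sketch of that open-or-recreate pattern — not the actual `NMLeveldbStateStoreService` code; `StateStore`, its `MANIFEST` check, and `openOrRecreate` are illustrative stand-ins for the leveldbjni calls:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class StateStore {
    // Stand-in for the leveldb open: fail when a required file is missing,
    // mirroring the "Corruption: 1 missing files" DBException above.
    static void open(Path dir) throws IOException {
        if (!Files.exists(dir.resolve("MANIFEST"))) {
            throw new IOException("Corruption: missing MANIFEST in " + dir);
        }
    }

    // Fallback: on corruption, recreate an empty store instead of
    // propagating the failure and failing NM startup.
    static boolean openOrRecreate(Path dir) {
        try {
            open(dir);
            return true;                       // opened existing state
        } catch (IOException corrupt) {
            try {
                if (Files.exists(dir)) {
                    try (DirectoryStream<Path> ds = Files.newDirectoryStream(dir)) {
                        for (Path p : ds) Files.delete(p);
                    }
                    Files.delete(dir);
                }
                Files.createDirectories(dir);
                Files.createFile(dir.resolve("MANIFEST"));
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
            return false;                      // recovered with empty state
        }
    }

    // Demonstration: first open finds no store and recreates it,
    // second open succeeds against the recreated store.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("nm-state");
            Files.delete(dir);                 // simulate a wiped /tmp store
            boolean first = openOrRecreate(dir);
            boolean second = openOrRecreate(dir);
            return !first && second;
        } catch (IOException e) {
            return false;
        }
    }
}
```

With leveldbjni itself, the analogous fallback would catch the exception from `JniDBFactory.open` and re-initialize the database directory; moving the store out of `/tmp`, as suggested above, avoids the corruption in the first place.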
[jira] [Commented] (YARN-2351) YARN CLI should provide a command to list the configurations in use
[ https://issues.apache.org/jira/browse/YARN-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216465#comment-14216465 ] Rohith commented on YARN-2351: -- HDFS has a command, {{hdfs getconf -namenodes}}, to determine the cluster's NameNodes. Similarly, it would be good if YARN supported commands to get cluster details, like {{yarn getConf -resourcemanager}} and others. YARN CLI should provide a command to list the configurations in use --- Key: YARN-2351 URL: https://issues.apache.org/jira/browse/YARN-2351 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 2.6.0 Reporter: Zhijie Shen To more easily understand the expected behavior of a YARN component, it would be good to have the command line be able to print the configurations in use for the RM, NM, and timeline server daemons, as we can do now via the web interfaces: {code} http://RM|NM|Timeline host:port/conf {code} The command line could be something like: {code} yarn conf resourcemanager|nodemanager|timelineserver [host] {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
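For illustration, a hypothetical implementation of such a command could fetch the /conf servlet output mentioned above and look up properties in it. A minimal sketch of the parsing half, assuming the servlet's Configuration-style XML (`ConfDump` and `lookup` are invented names; the HTTP fetch is omitted):

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import java.io.ByteArrayInputStream;

public class ConfDump {
    // Returns the value of a named property from <configuration> XML,
    // or null if the property is absent or the XML is unparseable.
    static String lookup(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                String n = p.getElementsByTagName("name").item(0).getTextContent();
                if (n.equals(name)) {
                    return p.getElementsByTagName("value").item(0).getTextContent();
                }
            }
            return null;
        } catch (Exception e) {
            return null;
        }
    }
}
```

A real `yarn conf <daemon> [host]` command would resolve the daemon's address from configuration, fetch `http://host:port/conf`, and print every property rather than one.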
[jira] [Assigned] (YARN-2351) YARN CLI should provide a command to list the configurations in use
[ https://issues.apache.org/jira/browse/YARN-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena reassigned YARN-2351: -- Assignee: Varun Saxena YARN CLI should provide a command to list the configurations in use --- Key: YARN-2351 URL: https://issues.apache.org/jira/browse/YARN-2351 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 2.6.0 Reporter: Zhijie Shen Assignee: Varun Saxena To more easily understand the expected behavior of a YARN component, it would be good to have the command line be able to print the configurations in use for the RM, NM, and timeline server daemons, as we can do now via the web interfaces: {code} http://RM|NM|Timeline host:port/conf {code} The command line could be something like: {code} yarn conf resourcemanager|nodemanager|timelineserver [host] {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2875) Bump SLF4J to 1.7.7 from 1.7.5
Tim Robertson created YARN-2875: --- Summary: Bump SLF4J to 1.7.7 from 1.7.5 Key: YARN-2875 URL: https://issues.apache.org/jira/browse/YARN-2875 Project: Hadoop YARN Issue Type: Bug Reporter: Tim Robertson Priority: Minor hadoop-yarn-common [uses log4j directly|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/pom.xml#L167] and when trying to redirect that through an SLF4J bridge version 1.7.5 has issues, due to use of methods missing in log4j-over-slf4j version 1.7.5. This is documented on the [1.7.6 release notes|http://www.slf4j.org/news.html] but 1.7.7 should be suitable. This is applicable to all the projects using Hadoop motherpom, but Yarn appears to be bringing Log4J in, rather than coding to the SLF4J API. The issue shows in the logs as follows in Yarn MR apps, which is painful to diagnose. {code} WARN [2014-11-18 09:58:06,390+0100] [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Caught exception in callback postStart java.lang.reflect.InvocationTargetException: null at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.7.0_71] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[na:1.7.0_71] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_71] at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_71] at org.apache.hadoop.metrics2.impl.MetricsSystemImpl$3.invoke(MetricsSystemImpl.java:290) ~[job.jar:0.22-SNAPSHOT] at com.sun.proxy.$Proxy2.postStart(Unknown Source) [na:na] at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.start(MetricsSystemImpl.java:185) [job.jar:0.22-SNAPSHOT] at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.init(MetricsSystemImpl.java:157) [job.jar:0.22-SNAPSHOT] at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.init(DefaultMetricsSystem.java:54) [job.jar:0.22-SNAPSHOT] at org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.initialize(DefaultMetricsSystem.java:50) 
[job.jar:0.22-SNAPSHOT] at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1036) [job.jar:0.22-SNAPSHOT] at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193) [job.jar:0.22-SNAPSHOT] at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1478) [job.jar:0.22-SNAPSHOT] at java.security.AccessController.doPrivileged(Native Method) [na:1.7.0_71] at javax.security.auth.Subject.doAs(Subject.java:415) [na:1.7.0_71] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) [job.jar:0.22-SNAPSHOT] at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1474) [job.jar:0.22-SNAPSHOT] at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1407) [job.jar:0.22-SNAPSHOT] Caused by: java.lang.IncompatibleClassChangeError: Implementing class at java.lang.ClassLoader.defineClass1(Native Method) ~[na:1.7.0_71] at java.lang.ClassLoader.defineClass(ClassLoader.java:800) ~[na:1.7.0_71] at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) ~[na:1.7.0_71] at java.net.URLClassLoader.defineClass(URLClassLoader.java:449) ~[na:1.7.0_71] at java.net.URLClassLoader.access$100(URLClassLoader.java:71) ~[na:1.7.0_71] at java.net.URLClassLoader$1.run(URLClassLoader.java:361) ~[na:1.7.0_71] at java.net.URLClassLoader$1.run(URLClassLoader.java:355) ~[na:1.7.0_71] at java.security.AccessController.doPrivileged(Native Method) [na:1.7.0_71] at java.net.URLClassLoader.findClass(URLClassLoader.java:354) ~[na:1.7.0_71] at java.lang.ClassLoader.loadClass(ClassLoader.java:425) ~[na:1.7.0_71] at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) ~[na:1.7.0_71] at java.lang.ClassLoader.loadClass(ClassLoader.java:358) ~[na:1.7.0_71] at org.apache.hadoop.metrics2.source.JvmMetrics.getEventCounters(JvmMetrics.java:183) ~[job.jar:0.22-SNAPSHOT] at org.apache.hadoop.metrics2.source.JvmMetrics.getMetrics(JvmMetrics.java:100) 
~[job.jar:0.22-SNAPSHOT] at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMetrics(MetricsSourceAdapter.java:195) ~[job.jar:0.22-SNAPSHOT] at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.updateJmxCache(MetricsSourceAdapter.java:172) ~[job.jar:0.22-SNAPSHOT] at org.apache.hadoop.metrics2.impl.MetricsSourceAdapter.getMBeanInfo(MetricsSourceAdapter.java:151) ~[job.jar:0.22-SNAPSHOT] at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getNewMBeanClassName(DefaultMBeanServerInterceptor.java:333) ~[na:1.7.0_71]
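Until the version is bumped, a common downstream workaround is to exclude log4j from hadoop-yarn-common and route it through the log4j-over-slf4j bridge at 1.7.7. A sketch of what the dependency section might look like (coordinates are the usual ones for these artifacts, but verify against your build):

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-yarn-common</artifactId>
  <exclusions>
    <exclusion>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>log4j-over-slf4j</artifactId>
  <version>1.7.7</version>
</dependency>
```

The `IncompatibleClassChangeError` above is what the missing-methods problem in log4j-over-slf4j 1.7.5 looks like at runtime, which is why the bridge version matters here.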
[jira] [Commented] (YARN-2351) YARN CLI should provide a command to list the configurations in use
[ https://issues.apache.org/jira/browse/YARN-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216556#comment-14216556 ] Allen Wittenauer commented on YARN-2351: We really just need to move getconf to common. YARN CLI should provide a command to list the configurations in use --- Key: YARN-2351 URL: https://issues.apache.org/jira/browse/YARN-2351 Project: Hadoop YARN Issue Type: Bug Affects Versions: 3.0.0, 2.6.0 Reporter: Zhijie Shen Assignee: Varun Saxena To more easily understand the expected behavior of a YARN component, it would be good to have the command line be able to print the configurations in use for the RM, NM, and timeline server daemons, as we can do now via the web interfaces: {code} http://RM|NM|Timeline host:port/conf {code} The command line could be something like: {code} yarn conf resourcemanager|nodemanager|timelineserver [host] {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2874) Dead lock in DelegationTokenRenewer which blocks RM to execute any further apps
[ https://issues.apache.org/jira/browse/YARN-2874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-2874: --- Priority: Blocker (was: Critical) Dead lock in DelegationTokenRenewer which blocks RM to execute any further apps - Key: YARN-2874 URL: https://issues.apache.org/jira/browse/YARN-2874 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.5.0, 2.4.1, 2.5.1 Reporter: Naganarasimha G R Assignee: Naganarasimha G R Priority: Blocker Attachments: YARN-2874.20141118-1.patch, YARN-2874.20141118-2.patch When token renewal fails and the application finishes this dead lock can occur Jstack dump : {quote} Found one Java-level deadlock: = DelegationTokenRenewer #181865: waiting to lock monitor 0x00900918 (object 0xc18a9998, a java.util.Collections$SynchronizedSet), which is held by DelayedTokenCanceller DelayedTokenCanceller: waiting to lock monitor 0x04141718 (object 0xc7eae720, a org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask), which is held by Timer-4 Timer-4: waiting to lock monitor 0x00900918 (object 0xc18a9998, a java.util.Collections$SynchronizedSet), which is held by DelayedTokenCanceller Java stack information for the threads listed above: === DelegationTokenRenewer #181865: at java.util.Collections$SynchronizedCollection.add(Collections.java:1636) - waiting to lock 0xc18a9998 (a java.util.Collections$SynchronizedSet) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.addTokenToList(DelegationTokenRenewer.java:322) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:398) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$500(DelegationTokenRenewer.java:70) at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:657) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:638) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) DelayedTokenCanceller: at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask.cancel(DelegationTokenRenewer.java:443) - waiting to lock 0xc7eae720 (a org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.removeApplicationFromRenewal(DelegationTokenRenewer.java:558) - locked 0xc18a9998 (a java.util.Collections$SynchronizedSet) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$300(DelegationTokenRenewer.java:70) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelayedTokenRemovalRunnable.run(DelegationTokenRenewer.java:599) at java.lang.Thread.run(Thread.java:745) Timer-4: at java.util.Collections$SynchronizedCollection.remove(Collections.java:1639) - waiting to lock 0xc18a9998 (a java.util.Collections$SynchronizedSet) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.removeFailedDelegationToken(DelegationTokenRenewer.java:503) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$100(DelegationTokenRenewer.java:70) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask.run(DelegationTokenRenewer.java:437) - locked 0xc7eae720 (a 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) Found 1 deadlock. {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
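The cycle in the jstack above is the classic inverted-lock-order deadlock: one thread holds the synchronized-set monitor and waits for the `RenewalTimerTask` monitor, while the timer thread holds the task monitor and waits for the set. The standard fix is a single consistent acquisition order (or cancelling the task outside the set's lock). A minimal, self-contained illustration of the consistent-order rule — the locks are named after the two monitors in the dump, not YARN's actual fields:

```java
public class LockOrder {
    static final Object tokenSet = new Object();   // the SynchronizedSet monitor
    static final Object timerTask = new Object();  // the RenewalTimerTask monitor

    // Both paths acquire tokenSet before timerTask, so the wait-for
    // graph can never contain the cycle shown in the jstack output.
    static void removeApplication() {
        synchronized (tokenSet) {
            synchronized (timerTask) { /* cancel the renewal timer task */ }
        }
    }

    static void renewalFailed() {
        synchronized (tokenSet) {
            synchronized (timerTask) { /* remove the failed token */ }
        }
    }

    // Runs both paths concurrently; returns true if both threads finish.
    static boolean runBoth() {
        Thread a = new Thread(LockOrder::removeApplication);
        Thread b = new Thread(LockOrder::renewalFailed);
        a.start();
        b.start();
        try {
            a.join(5000);
            b.join(5000);
        } catch (InterruptedException e) {
            return false;
        }
        return !a.isAlive() && !b.isAlive();
    }
}
```

If `renewalFailed` instead took `timerTask` first (as `RenewalTimerTask.run` effectively does in the dump), the two threads could block each other forever.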
[jira] [Commented] (YARN-2802) add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issue.
[ https://issues.apache.org/jira/browse/YARN-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216635#comment-14216635 ] Anubhav Dhoot commented on YARN-2802: - This metric probably will not catch issues with queue configuration, so cluster-wide might be fine. But in general, adding to queue metrics gives both queue-specific metrics and cluster-wide metrics via the root queue metrics; it just gives us more granular data. For this patch: 1. Can we please use separate variables for the two uses of launchAMStartTime? In AMLaunchedTransition we should use a new variable at {{appAttempt.launchAMStartTime = System.currentTimeMillis();}} 2. In TestQueueMetrics we are adding to the metrics but not checking them. It would be good to check the value back if possible. add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issue. -- Key: YARN-2802 URL: https://issues.apache.org/jira/browse/YARN-2802 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.5.0 Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-2802.000.patch, YARN-2802.001.patch, YARN-2802.002.patch, YARN-2802.003.patch add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issue. Added two metrics in QueueMetrics: aMLaunchDelay: the time spent from sending event AMLauncherEventType.LAUNCH to receiving event RMAppAttemptEventType.LAUNCHED in RMAppAttemptImpl. aMRegisterDelay: the time waiting from receiving event RMAppAttemptEventType.LAUNCHED to receiving event RMAppAttemptEventType.REGISTERED(ApplicationMasterService#registerApplicationMaster) in RMAppAttemptImpl. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
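The two delays being measured can be sketched as follows — field and method names here are illustrative, not the actual RMAppAttemptImpl/QueueMetrics members, and the point of the separate fields is exactly the review comment above about not reusing one variable for two purposes:

```java
public class AmDelays {
    long launchAMStartTime;   // set when AMLauncherEventType.LAUNCH is sent
    long launchAMEndTime;     // set when RMAppAttemptEventType.LAUNCHED arrives

    // LAUNCH -> LAUNCHED: how long the AM container took to launch.
    long aMLaunchDelay() {
        return launchAMEndTime - launchAMStartTime;
    }

    // LAUNCHED -> REGISTERED: how long the AM took to register back.
    long aMRegisterDelay(long registeredTime) {
        return registeredTime - launchAMEndTime;
    }

    // Demonstration with fixed timestamps: {launchDelay, registerDelay}.
    static long[] demo() {
        AmDelays d = new AmDelays();
        d.launchAMStartTime = 1000;
        d.launchAMEndTime = 1500;
        return new long[] { d.aMLaunchDelay(), d.aMRegisterDelay(1800) };
    }
}
```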
[jira] [Updated] (YARN-2874) Dead lock in DelegationTokenRenewer which blocks RM to execute any further apps
[ https://issues.apache.org/jira/browse/YARN-2874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-2874: --- Target Version/s: 2.7.0 Affects Version/s: (was: 2.4.1) (was: 2.5.0) Dead lock in DelegationTokenRenewer which blocks RM to execute any further apps - Key: YARN-2874 URL: https://issues.apache.org/jira/browse/YARN-2874 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.5.1 Reporter: Naganarasimha G R Assignee: Naganarasimha G R Priority: Blocker Attachments: YARN-2874.20141118-1.patch, YARN-2874.20141118-2.patch When token renewal fails and the application finishes this dead lock can occur Jstack dump : {quote} Found one Java-level deadlock: = DelegationTokenRenewer #181865: waiting to lock monitor 0x00900918 (object 0xc18a9998, a java.util.Collections$SynchronizedSet), which is held by DelayedTokenCanceller DelayedTokenCanceller: waiting to lock monitor 0x04141718 (object 0xc7eae720, a org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask), which is held by Timer-4 Timer-4: waiting to lock monitor 0x00900918 (object 0xc18a9998, a java.util.Collections$SynchronizedSet), which is held by DelayedTokenCanceller Java stack information for the threads listed above: === DelegationTokenRenewer #181865: at java.util.Collections$SynchronizedCollection.add(Collections.java:1636) - waiting to lock 0xc18a9998 (a java.util.Collections$SynchronizedSet) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.addTokenToList(DelegationTokenRenewer.java:322) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:398) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$500(DelegationTokenRenewer.java:70) at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:657) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:638) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) DelayedTokenCanceller: at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask.cancel(DelegationTokenRenewer.java:443) - waiting to lock 0xc7eae720 (a org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.removeApplicationFromRenewal(DelegationTokenRenewer.java:558) - locked 0xc18a9998 (a java.util.Collections$SynchronizedSet) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$300(DelegationTokenRenewer.java:70) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelayedTokenRemovalRunnable.run(DelegationTokenRenewer.java:599) at java.lang.Thread.run(Thread.java:745) Timer-4: at java.util.Collections$SynchronizedCollection.remove(Collections.java:1639) - waiting to lock 0xc18a9998 (a java.util.Collections$SynchronizedSet) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.removeFailedDelegationToken(DelegationTokenRenewer.java:503) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$100(DelegationTokenRenewer.java:70) at org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask.run(DelegationTokenRenewer.java:437) - locked 0xc7eae720 (a 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$RenewalTimerTask) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) Found 1 deadlock. {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1963) Support priorities across applications within the same queue
[ https://issues.apache.org/jira/browse/YARN-1963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216674#comment-14216674 ] Wangda Tan commented on YARN-1963: -- Thanks [~sunilg], [~vinodkv] for your great effort on this! I've just read through the design doc; some comments: 1) yarn.app.priority How will this be implemented? Does this mean any YARN application can specify a priority at submission via the yarn CLI without changing a line of its code? If this can be done, I think we should extend it to other YARN parameters like queue, node-label-expression, etc. 2) Specify only the highest priority for queue and user I found there are properties like {{yarn.scheduler.root.queue_name.priority_label=high,low}} and {{yarn.scheduler.capacity.root.queue_name.priority_label.acl=user1,user2}}. I would prefer to specify only the highest priority for a queue and user. For example, it doesn't make sense to me if priority = \{high,mid,low\} and a queue can access \{high,low\} only. Is there any benefit to specifying individual priorities instead of the highest priority? 3) User limit and priority I think we shouldn't consider user limit within a priority level, because priority is not a specific kind of resource. Compared to node labels: you cannot say user-X of queue-A used 8G of highest-priority resource, but you can say user-X of queue-A used 8G of resource on nodes with label=GPU. There's no difference between a 2G resource allocated at the highest or the lowest priority. If we want to implement this, bq. it will not be fair to schedule resources in a uniform manner for all application in a queue with respect to user limits. I suggest adding preemption within a queue that considers priority. Building on YARN-2069, we can consider user-limit and priority together -- while enforcing user-limit, we always preempt from lower-priority applications. Any thoughts? 
Thanks, Wangda Support priorities across applications within the same queue - Key: YARN-1963 URL: https://issues.apache.org/jira/browse/YARN-1963 Project: Hadoop YARN Issue Type: New Feature Components: api, resourcemanager Reporter: Arun C Murthy Assignee: Sunil G Attachments: YARN Application Priorities Design.pdf It will be very useful to support priorities among applications within the same queue, particularly in production scenarios. It allows for finer-grained controls without having to force admins to create a multitude of queues, plus allows existing applications to continue using existing queues which are usually part of institutional memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
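The in-queue ordering the feature implies can be sketched as a comparator: higher priority first, FIFO by submission time within a priority. This is a hedged illustration of the concept, not the design doc's or the scheduler's actual policy class (`PriorityOrdering` and `App` are invented names):

```java
import java.util.Arrays;

public class PriorityOrdering {
    static final class App {
        final String id;
        final int priority;
        final long submitTime;
        App(String id, int priority, long submitTime) {
            this.id = id;
            this.priority = priority;
            this.submitTime = submitTime;
        }
    }

    // Returns app ids in scheduling order: priority descending,
    // then submission time ascending within the same priority.
    static String order(App... apps) {
        App[] sorted = apps.clone();
        Arrays.sort(sorted, (x, y) -> x.priority != y.priority
                ? Integer.compare(y.priority, x.priority)
                : Long.compare(x.submitTime, y.submitTime));
        StringBuilder sb = new StringBuilder();
        for (App a : sorted) {
            if (sb.length() > 0) sb.append(',');
            sb.append(a.id);
        }
        return sb.toString();
    }
}
```

Wangda's point 3 is about what such an ordering should *not* do: user limits are charged per user regardless of the priority a container was allocated at, so priority influences ordering (and, with preemption, reclamation) rather than acting as a separate resource pool.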
[jira] [Updated] (YARN-2802) add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issue.
[ https://issues.apache.org/jira/browse/YARN-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-2802: Attachment: YARN-2802.004.patch add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issue. -- Key: YARN-2802 URL: https://issues.apache.org/jira/browse/YARN-2802 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.5.0 Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-2802.000.patch, YARN-2802.001.patch, YARN-2802.002.patch, YARN-2802.003.patch, YARN-2802.004.patch add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issue. Added two metrics in QueueMetrics: aMLaunchDelay: the time spent from sending event AMLauncherEventType.LAUNCH to receiving event RMAppAttemptEventType.LAUNCHED in RMAppAttemptImpl. aMRegisterDelay: the time waiting from receiving event RMAppAttemptEventType.LAUNCHED to receiving event RMAppAttemptEventType.REGISTERED(ApplicationMasterService#registerApplicationMaster) in RMAppAttemptImpl. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2802) add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issue.
[ https://issues.apache.org/jira/browse/YARN-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216741#comment-14216741 ] zhihai xu commented on YARN-2802: - Hi [~adhoot], thanks for the review. Good finding, I uploaded a new patch YARN-2802.004.patch, which addressed your comments. thanks zhihai add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issue. -- Key: YARN-2802 URL: https://issues.apache.org/jira/browse/YARN-2802 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.5.0 Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-2802.000.patch, YARN-2802.001.patch, YARN-2802.002.patch, YARN-2802.003.patch, YARN-2802.004.patch add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issue. Added two metrics in QueueMetrics: aMLaunchDelay: the time spent from sending event AMLauncherEventType.LAUNCH to receiving event RMAppAttemptEventType.LAUNCHED in RMAppAttemptImpl. aMRegisterDelay: the time waiting from receiving event RMAppAttemptEventType.LAUNCHED to receiving event RMAppAttemptEventType.REGISTERED(ApplicationMasterService#registerApplicationMaster) in RMAppAttemptImpl. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2165) Timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero
[ https://issues.apache.org/jira/browse/YARN-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216765#comment-14216765 ] Hadoop QA commented on YARN-2165: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12682226/YARN-2165.2.patch against trunk revision bcd402a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice: org.apache.hadoop.yarn.server.timeline.security.TestTimelineAuthenticationFilter {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5868//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5868//console This message is automatically generated. 
Timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero - Key: YARN-2165 URL: https://issues.apache.org/jira/browse/YARN-2165 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Karam Singh Assignee: Vasanth kumar RJ Attachments: YARN-2165.1.patch, YARN-2165.2.patch, YARN-2165.patch Timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero. Currently, if yarn.timeline-service.ttl-ms=0 or yarn.timeline-service.ttl-ms=-86400 is set, the timeline server starts successfully, merely logging {code} 2014-06-15 14:52:16,562 INFO timeline.LeveldbTimelineStore (LeveldbTimelineStore.java:init(247)) - Starting deletion thread with ttl -60480 and cycle interval 30 {code} At startup, the timeline server should validate that yarn.timeline-service.ttl-ms is greater than zero; otherwise, especially for a negative value, the discard-old-entities timestamp will be set to a future value, which may lead to inconsistent behavior: {code} public void run() { while (true) { long timestamp = System.currentTimeMillis() - ttl; try { discardOldEntities(timestamp); Thread.sleep(ttlInterval); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
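The requested startup check is straightforward: reject a non-positive ttl instead of silently starting the deletion thread with it. A minimal sketch (class and method names are illustrative, not the actual LeveldbTimelineStore code):

```java
public class TtlCheck {
    // Fail fast at startup rather than run the deletion thread with a
    // zero or negative ttl, which would compute a future timestamp in
    // System.currentTimeMillis() - ttl and discard current entities.
    static long validateTtl(long ttlMs) {
        if (ttlMs <= 0) {
            throw new IllegalArgumentException(
                "yarn.timeline-service.ttl-ms must be greater than zero, got " + ttlMs);
        }
        return ttlMs;
    }

    // Convenience wrapper used for demonstration.
    static boolean isValid(long ttlMs) {
        try {
            validateTtl(ttlMs);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```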
[jira] [Commented] (YARN-2865) Application recovery continuously fails with Application with id already present. Cannot duplicate
[ https://issues.apache.org/jira/browse/YARN-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216806#comment-14216806 ] Karthik Kambatla commented on YARN-2865: Patch looks mostly good. Minor comments - we should reduce the visibility of the class and its methods to package-private, mark it @Private @Unstable, and add comments that this class is expected to be used only by RMContext and ResourceManager. I just want to guard against new code using this instead of RMContext; we might want this to be accessible in the future, but we should probably keep the changes small in this JIRA. Application recovery continuously fails with Application with id already present. Cannot duplicate Key: YARN-2865 URL: https://issues.apache.org/jira/browse/YARN-2865 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Assignee: Rohith Priority: Critical Attachments: YARN-2865.patch, YARN-2865.patch YARN-2588 handles exception thrown while transitioningToActive and reset activeServices. But it misses out clearing RMcontext apps/nodes details and ClusterMetrics and QueueMetrics. This causes application recovery to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2745) YARN new pluggable scheduler which does multi-resource packing
[ https://issues.apache.org/jira/browse/YARN-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216828#comment-14216828 ] Wangda Tan commented on YARN-2745: -- [~srikanthkandula], Exactly, but I think YARN-314 is about permitting one priority to have multiple resource requests, though the locality might be different. Thanks, YARN new pluggable scheduler which does multi-resource packing -- Key: YARN-2745 URL: https://issues.apache.org/jira/browse/YARN-2745 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager, scheduler Reporter: Robert Grandl Attachments: sigcomm_14_tetris_talk.pptx, tetris_paper.pdf In this umbrella JIRA we propose a new pluggable scheduler which accounts for all resources used by a task (CPU, memory, disk, network) and is able to achieve three competing objectives: fairness, improved cluster utilization, and reduced average job completion time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
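The core heuristic in the attached Tetris paper scores each fitting task by the dot product of its demand vector with the machine's free-resource vector and packs the best-aligned task. A minimal sketch of that scoring step, assuming four dimensions (cpu, memory, disk, network) and inventing the class name:

```java
public class PackingScore {
    // Alignment score of a task's demand against a machine's free
    // resources: dot product if the task fits, -1 if it does not.
    static long score(long[] free, long[] demand) {
        for (int i = 0; i < free.length; i++) {
            if (demand[i] > free[i]) {
                return -1;            // task does not fit on this machine
            }
        }
        long dot = 0;
        for (int i = 0; i < free.length; i++) {
            dot += free[i] * demand[i];
        }
        return dot;
    }
}
```

The scheduler would then pick, among all runnable tasks, the one with the highest score on the machine being offered, trading a little fairness for better packing.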
[jira] [Commented] (YARN-2802) add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issue.
[ https://issues.apache.org/jira/browse/YARN-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216853#comment-14216853 ] Hadoop QA commented on YARN-2802: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12682228/YARN-2802.004.patch against trunk revision bcd402a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5869//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5869//console This message is automatically generated. add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issue. 
-- Key: YARN-2802 URL: https://issues.apache.org/jira/browse/YARN-2802 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.5.0 Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-2802.000.patch, YARN-2802.001.patch, YARN-2802.002.patch, YARN-2802.003.patch, YARN-2802.004.patch Add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issues. Added two metrics in QueueMetrics: aMLaunchDelay: the time from sending the AMLauncherEventType.LAUNCH event to receiving the RMAppAttemptEventType.LAUNCHED event in RMAppAttemptImpl. aMRegisterDelay: the time from receiving the RMAppAttemptEventType.LAUNCHED event to receiving the RMAppAttemptEventType.REGISTERED event (ApplicationMasterService#registerApplicationMaster) in RMAppAttemptImpl. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
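The two metrics are simply deltas between attempt lifecycle timestamps. A minimal sketch of how they could be derived, assuming hypothetical callback names (this is not the actual patch, which records into Hadoop's QueueMetrics):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrative sketch: track per-attempt timestamps and compute
// aMLaunchDelay and aMRegisterDelay as differences between them.
public class AmDelayMetricsSketch {
    private final ConcurrentMap<String, Long> launchRequested = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, Long> launched = new ConcurrentHashMap<>();

    // Called when AMLauncherEventType.LAUNCH is sent for an attempt.
    public void onLaunchRequested(String attemptId, long nowMs) {
        launchRequested.put(attemptId, nowMs);
    }

    // Called on RMAppAttemptEventType.LAUNCHED; returns aMLaunchDelay in ms.
    public long onLaunched(String attemptId, long nowMs) {
        launched.put(attemptId, nowMs);
        return nowMs - launchRequested.get(attemptId);
    }

    // Called on RMAppAttemptEventType.REGISTERED; returns aMRegisterDelay in ms.
    public long onRegistered(String attemptId, long nowMs) {
        return nowMs - launched.get(attemptId);
    }
}
```

In the real RM these deltas would feed a rate metric per queue so that slow AM launches or registrations show up in JMX.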
[jira] [Commented] (YARN-2375) Allow enabling/disabling timeline server per framework
[ https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216865#comment-14216865 ] Hadoop QA commented on YARN-2375: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12682195/YARN-2375.patch against trunk revision bcd402a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5870//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5870//console This message is automatically generated. Allow enabling/disabling timeline server per framework -- Key: YARN-2375 URL: https://issues.apache.org/jira/browse/YARN-2375 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jonathan Eagles Assignee: Mit Desai Attachments: YARN-2375.patch, YARN-2375.patch This JIRA is to remove the ats enabled flag check within the TimelineClientImpl. Example where this fails is below. 
While running a secure timeline server with the ATS flag set to disabled on the resource manager, the timeline delegation token renewer throws an NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to disable
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2800: - Summary: Remove MemoryNodeLabelsStore and add a way to disable (was: Should print WARN log in both RM/RMAdminCLI side when MemoryRMNodeLabelsManager is enabled) Remove MemoryNodeLabelsStore and add a way to disable -- Key: YARN-2800 URL: https://issues.apache.org/jira/browse/YARN-2800 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch Even though we have documented this, it would be better to explicitly print a message on both the RM and RMAdminCLI side saying that node labels being added will be lost across an RM restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to disable node labels feature
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2800: - Summary: Remove MemoryNodeLabelsStore and add a way to disable node labels feature (was: Remove MemoryNodeLabelsStore and add a way to disable ) Remove MemoryNodeLabelsStore and add a way to disable node labels feature - Key: YARN-2800 URL: https://issues.apache.org/jira/browse/YARN-2800 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch Even though we have documented this, it would be better to explicitly print a message on both the RM and RMAdminCLI side saying that node labels being added will be lost across an RM restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2800: - Summary: Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature (was: Remove MemoryNodeLabelsStore and add a way to disable node labels feature) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature Key: YARN-2800 URL: https://issues.apache.org/jira/browse/YARN-2800 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch Even though we have documented this, it would be better to explicitly print a message on both the RM and RMAdminCLI side saying that node labels being added will be lost across an RM restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2800: - Description: In the past, we had a MemoryNodeLabelStore, mostly for users to try this feature without configuring where to store node labels on the file system. It seems convenient for users to try this, but it actually causes a bad user experience. Users may add/remove labels and edit capacity-scheduler.xml; after an RM restart the labels will be gone (we store them in memory), and the RM cannot get started if some queues use labels that don't exist in the cluster. As we discussed, we should have an explicit way to let users specify whether they want this feature or not. If node labels are disabled, any operation trying to modify/use node labels will throw an exception. was: In the past, we had a MemoryNodeLabelStore, mostly for users to try this feature without configuring where to store node labels on the file system. It seems convenient for users to try this, but it actually causes a bad experience. Users may add/remove labels and edit capacity-scheduler.xml; after an RM restart the labels will be gone (we store them in memory), and the RM cannot start if some queues use labels that don't exist in the cluster. As we discussed, we should have an explicit way to let users specify whether they want this feature or not. If node labels are disabled, all operations trying to modify/use node labels will throw an exception. Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature Key: YARN-2800 URL: https://issues.apache.org/jira/browse/YARN-2800 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch In the past, we had a MemoryNodeLabelStore, mostly for users to try this feature without configuring where to store node labels on the file system.
It seems convenient for users to try this, but it actually causes a bad user experience. Users may add/remove labels and edit capacity-scheduler.xml; after an RM restart the labels will be gone (we store them in memory), and the RM cannot get started if some queues use labels that don't exist in the cluster. As we discussed, we should have an explicit way to let users specify whether they want this feature or not. If node labels are disabled, any operation trying to modify/use node labels will throw an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2800: - Description: In the past, we had a MemoryNodeLabelStore, mostly for users to try this feature without configuring where to store node labels on the file system. It seems convenient for users to try this, but it actually causes a bad experience. Users may add/remove labels and edit capacity-scheduler.xml; after an RM restart the labels will be gone (we store them in memory), and the RM cannot start if some queues use labels that don't exist in the cluster. As we discussed, we should have an explicit way to let users specify whether they want this feature or not. If node labels are disabled, all operations trying to modify/use node labels will throw an exception. was: Even though we have documented this, it would be better to explicitly print a message on both the RM and RMAdminCLI side saying that node labels being added will be lost across an RM restart. Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature Key: YARN-2800 URL: https://issues.apache.org/jira/browse/YARN-2800 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch In the past, we had a MemoryNodeLabelStore, mostly for users to try this feature without configuring where to store node labels on the file system. It seems convenient for users to try this, but it actually causes a bad experience. Users may add/remove labels and edit capacity-scheduler.xml; after an RM restart the labels will be gone (we store them in memory), and the RM cannot start if some queues use labels that don't exist in the cluster. As we discussed, we should have an explicit way to let users specify whether they want this feature or not.
If node labels are disabled, all operations trying to modify/use node labels will throw an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2865) Application recovery continuously fails with Application with id already present. Cannot duplicate
[ https://issues.apache.org/jira/browse/YARN-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216893#comment-14216893 ] Jian He commented on YARN-2865: --- looks good to me too. One minor thing: in RMActiveServices, some places use {{rmContext#setter}} and some use {{activeServiceContext#setter}}; we may want to make them consistent and use the latter. Application recovery continuously fails with Application with id already present. Cannot duplicate Key: YARN-2865 URL: https://issues.apache.org/jira/browse/YARN-2865 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Assignee: Rohith Priority: Critical Attachments: YARN-2865.patch, YARN-2865.patch YARN-2588 handles the exception thrown while transitioning to active and resets activeServices, but it misses clearing the RMContext apps/nodes details as well as ClusterMetrics and QueueMetrics. This causes application recovery to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216897#comment-14216897 ] Wangda Tan commented on YARN-2800: -- [~vinodkv], [~ozawa], thanks for your comments. I've edited the title/description of this JIRA and will upload a patch soon. Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature Key: YARN-2800 URL: https://issues.apache.org/jira/browse/YARN-2800 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch In the past, we had a MemoryNodeLabelStore, mostly for users to try this feature without configuring where to store node labels on the file system. It seems convenient for users to try this, but it actually causes a bad user experience. Users may add/remove labels and edit capacity-scheduler.xml; after an RM restart the labels will be gone (we store them in memory), and the RM cannot get started if some queues use labels that don't exist in the cluster. As we discussed, we should have an explicit way to let users specify whether they want this feature or not. If node labels are disabled, any operation trying to modify/use node labels will throw an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2165) Timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero
[ https://issues.apache.org/jira/browse/YARN-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216914#comment-14216914 ] Zhijie Shen commented on YARN-2165: --- [~vasanthkumar], thanks for your contribution! Some comments about the patch. 1. TIMELINE_SERVICE_CLIENT_MAX_RETRIES can be -1 for endless retry. It's good to make this clear in yarn-default.xml too. 2. Instead of {{ property value should be positive and non-zero}}, can we simply say {{ property value should be greater than zero}}? 3. You can use {{com.google.common.base.Preconditions.checkArgument}}. 4. Multiple lines are longer than 80 chars. 5. TIMELINE_SERVICE_LEVELDB_READ_CACHE_SIZE can be zero. 6. TIMELINE_SERVICE_LEVELDB_START_TIME_READ_CACHE_SIZE and TIMELINE_SERVICE_LEVELDB_START_TIME_WRITE_CACHE_SIZE seem unable to be 0 because LRUMap requires this. However, ideally we should be able to disable the cache completely. Let's deal with that separately. Timelineserver should validate that yarn.timeline-service.ttl-ms is greater than zero - Key: YARN-2165 URL: https://issues.apache.org/jira/browse/YARN-2165 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Karam Singh Assignee: Vasanth kumar RJ Attachments: YARN-2165.1.patch, YARN-2165.2.patch, YARN-2165.patch The timeline server should validate that yarn.timeline-service.ttl-ms is greater than zero. Currently, if you set yarn.timeline-service.ttl-ms=0 or yarn.timeline-service.ttl-ms=-86400, the timeline server starts successfully, only logging {code} 2014-06-15 14:52:16,562 INFO timeline.LeveldbTimelineStore (LeveldbTimelineStore.java:init(247)) - Starting deletion thread with ttl -60480 and cycle interval 30 {code} At startup, the timeline server should validate that yarn.timeline-service.ttl-ms is greater than zero; otherwise, especially for a negative value, the discard-old-entities timestamp will be set to a future value, which may lead to inconsistent behavior:
{code}
public void run() {
  while (true) {
    long timestamp = System.currentTimeMillis() - ttl;
    try {
      discardOldEntities(timestamp);
      Thread.sleep(ttlInterval);
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
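The suggested fix (point 3 in Zhijie Shen's review) amounts to a one-line check at startup, before the deletion thread is created. A minimal, dependency-free sketch, re-implementing the {{checkArgument}} style locally rather than pulling in Guava; the class name is hypothetical:

```java
// Sketch of validating yarn.timeline-service.ttl-ms at startup.
public class TtlValidationSketch {
    // Same contract as com.google.common.base.Preconditions.checkArgument.
    static void checkArgument(boolean ok, String message) {
        if (!ok) {
            throw new IllegalArgumentException(message);
        }
    }

    // Reject ttl <= 0 before starting the deletion thread, so a value like
    // 0 or -86400 fails fast instead of producing a future timestamp.
    static long validateTtl(long ttlMs) {
        checkArgument(ttlMs > 0,
            "yarn.timeline-service.ttl-ms property value should be greater than zero");
        return ttlMs;
    }
}
```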
[jira] [Created] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues
Siqi Li created YARN-2876: - Summary: In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues Key: YARN-2876 URL: https://issues.apache.org/jira/browse/YARN-2876 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2375) Allow enabling/disabling timeline server per framework
[ https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14216973#comment-14216973 ] Zhijie Shen commented on YARN-2375: --- [~mitdesai], thanks for the patch. Two suggestions: 1. We should still let DS work when the timeline service is disabled; we just need to prevent sending the timeline data to the timeline server while the DS app is running. 2. In JobHistoryEventHandler we need to check both the global config and the MR-specific config to decide whether we emit MR history events. Allow enabling/disabling timeline server per framework -- Key: YARN-2375 URL: https://issues.apache.org/jira/browse/YARN-2375 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jonathan Eagles Assignee: Mit Desai Attachments: YARN-2375.patch, YARN-2375.patch This JIRA is to remove the ATS enabled flag check within the TimelineClientImpl. An example where this fails is below. While running a secure timeline server with the ATS flag set to disabled on the resource manager, the timeline delegation token renewer throws an NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues
[ https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-2876: -- Description: If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and Scheduler UI will display the entire cluster capacity as its maxResource instead of its parent queue's maxResource. In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues Key: YARN-2876 URL: https://issues.apache.org/jira/browse/YARN-2876 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and Scheduler UI will display the entire cluster capacity as its maxResource instead of its parent queue's maxResource. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues
[ https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-2876: -- Attachment: screenshot-1.png In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues Key: YARN-2876 URL: https://issues.apache.org/jira/browse/YARN-2876 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Attachments: YARN-2876.v1.patch, screenshot-1.png If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and Scheduler UI will display the entire cluster capacity as its maxResource instead of its parent queue's maxResource. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues
[ https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-2876: -- Attachment: YARN-2876.v1.patch In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues Key: YARN-2876 URL: https://issues.apache.org/jira/browse/YARN-2876 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Attachments: YARN-2876.v1.patch, screenshot-1.png If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and Scheduler UI will display the entire cluster capacity as its maxResource instead of its parent queue's maxResource. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2157) Document YARN metrics
[ https://issues.apache.org/jira/browse/YARN-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-2157: -- Attachment: YARN-2157.3.patch Document YARN metrics - Key: YARN-2157 URL: https://issues.apache.org/jira/browse/YARN-2157 Project: Hadoop YARN Issue Type: Improvement Components: documentation Reporter: Akira AJISAKA Assignee: Akira AJISAKA Attachments: YARN-2157.2.patch, YARN-2157.3.patch, YARN-2157.patch YARN-side of HADOOP-6350. Add YARN metrics to Metrics document. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2157) Document YARN metrics
[ https://issues.apache.org/jira/browse/YARN-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217006#comment-14217006 ] Jian He commented on YARN-2157: --- thanks [~ajisakaa], sorry for the late feedback. looks good to me. just made some very minor edits myself. pending jenkins. Document YARN metrics - Key: YARN-2157 URL: https://issues.apache.org/jira/browse/YARN-2157 Project: Hadoop YARN Issue Type: Improvement Components: documentation Reporter: Akira AJISAKA Assignee: Akira AJISAKA Attachments: YARN-2157.2.patch, YARN-2157.3.patch, YARN-2157.patch YARN-side of HADOOP-6350. Add YARN metrics to Metrics document. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues
[ https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217008#comment-14217008 ] Siqi Li commented on YARN-2876: --- Hi [~sandyr], can you take a look at this? Although this problem only affects observability, it would be great to get it right, so that users are less worried that things might go wrong with a hierarchical queue structure. In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues Key: YARN-2876 URL: https://issues.apache.org/jira/browse/YARN-2876 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Assignee: Siqi Li Attachments: YARN-2876.v1.patch, screenshot-1.png If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and Scheduler UI will display the entire cluster capacity as its maxResource instead of its parent queue's maxResource. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2375) Allow enabling/disabling timeline server per framework
[ https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217010#comment-14217010 ] Mit Desai commented on YARN-2375: - Thanks for reviewing [~zjshen]. One clarification. bq. 1. We should still let DS work when the timeline service is disable, and we just need to prevent sending the timeline data to the timeline server while the DS app is running. Do you mean that we should not check the TIMELINE_SERVICE_ENABLED flag in the Application Master, and rather have it work the same way it did before, only checking that flag while sending data to the timeline server? Allow enabling/disabling timeline server per framework -- Key: YARN-2375 URL: https://issues.apache.org/jira/browse/YARN-2375 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jonathan Eagles Assignee: Mit Desai Attachments: YARN-2375.patch, YARN-2375.patch This JIRA is to remove the ATS enabled flag check within the TimelineClientImpl. An example where this fails is below. While running a secure timeline server with the ATS flag set to disabled on the resource manager, the timeline delegation token renewer throws an NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues
[ https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217012#comment-14217012 ] Wei Yan commented on YARN-2876: --- {code}
+if (maxShare.equals(Resources.unbounded()) && parent != null) {
+  return parent.getMaxShare();
{code} If the parent queue is also not configured, does it still return UNBOUNDED? In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues Key: YARN-2876 URL: https://issues.apache.org/jira/browse/YARN-2876 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Assignee: Siqi Li Attachments: YARN-2876.v1.patch, screenshot-1.png If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and Scheduler UI will display the entire cluster capacity as its maxResource instead of its parent queue's maxResource. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2802) add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issue.
[ https://issues.apache.org/jira/browse/YARN-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-2802: Attachment: YARN-2802.005.patch add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issue. -- Key: YARN-2802 URL: https://issues.apache.org/jira/browse/YARN-2802 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 2.5.0 Reporter: zhihai xu Assignee: zhihai xu Attachments: YARN-2802.000.patch, YARN-2802.001.patch, YARN-2802.002.patch, YARN-2802.003.patch, YARN-2802.004.patch, YARN-2802.005.patch add AM container launch and register delay metrics in QueueMetrics to help diagnose performance issue. Added two metrics in QueueMetrics: aMLaunchDelay: the time spent from sending event AMLauncherEventType.LAUNCH to receiving event RMAppAttemptEventType.LAUNCHED in RMAppAttemptImpl. aMRegisterDelay: the time waiting from receiving event RMAppAttemptEventType.LAUNCHED to receiving event RMAppAttemptEventType.REGISTERED(ApplicationMasterService#registerApplicationMaster) in RMAppAttemptImpl. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues
[ https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217034#comment-14217034 ] Siqi Li commented on YARN-2876: --- No. If the parent queue is also not configured, it will keep querying its ancestor queues until one of them has a configured maxResource. Only if root is also not configured will this method return UNBOUNDED. In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues Key: YARN-2876 URL: https://issues.apache.org/jira/browse/YARN-2876 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Assignee: Siqi Li Attachments: YARN-2876.v1.patch, screenshot-1.png If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and Scheduler UI will display the entire cluster capacity as its maxResource instead of its parent queue's maxResource. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
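The ancestor walk described here can be sketched in a few lines. This is an illustrative simplification, assuming scalar shares instead of the real FSQueue/Resource objects, and the names are hypothetical, not the actual patch:

```java
// Sketch of the maxResource inheritance walk: a queue with no configured
// max share reports the nearest configured ancestor's value, falling back
// to UNBOUNDED only when nothing up to root is configured.
public class QueueMaxShareSketch {
    static final long UNBOUNDED = Long.MAX_VALUE;

    final QueueMaxShareSketch parent;           // null for the root queue
    final long configuredMaxShare;              // UNBOUNDED = not set in fair-scheduler.xml

    QueueMaxShareSketch(QueueMaxShareSketch parent, long configuredMaxShare) {
        this.parent = parent;
        this.configuredMaxShare = configuredMaxShare;
    }

    long getMaxShare() {
        if (configuredMaxShare == UNBOUNDED && parent != null) {
            return parent.getMaxShare(); // keep walking up until configured
        }
        return configuredMaxShare;
    }
}
```

This matches the behavior Siqi Li describes: an unconfigured subqueue inherits from the nearest configured ancestor, and only an entirely unconfigured chain up to root yields UNBOUNDED.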
[jira] [Commented] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues
[ https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217036#comment-14217036 ] Wei Yan commented on YARN-2876: --- Oh, you're right, I misunderstood it. In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues Key: YARN-2876 URL: https://issues.apache.org/jira/browse/YARN-2876 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Assignee: Siqi Li Attachments: YARN-2876.v1.patch, screenshot-1.png If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and Scheduler UI will display the entire cluster capacity as its maxResource instead of its parent queue's maxResource. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2375) Allow enabling/disabling timeline server per framework
[ https://issues.apache.org/jira/browse/YARN-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217049#comment-14217049 ] Zhijie Shen commented on YARN-2375: --- bq. Do you mean that we should not check for TIMELINE_SERVICE_ENABLED flag in the Application Master and rather have it work same way that it was doing before and only check that flag while sending data to timeline server? I think the logic could be: when TIMELINE_SERVICE_ENABLED == true, read the domain env var and construct the timeline client. Only if the timeline client is not null will the AM send the data to the timeline server where it needs to. Allow enabling/disabling timeline server per framework -- Key: YARN-2375 URL: https://issues.apache.org/jira/browse/YARN-2375 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jonathan Eagles Assignee: Mit Desai Attachments: YARN-2375.patch, YARN-2375.patch This JIRA is to remove the ATS enabled flag check within the TimelineClientImpl. An example where this fails is below. While running a secure timeline server with the ATS flag set to disabled on the resource manager, the timeline delegation token renewer throws an NPE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
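The control flow Zhijie Shen suggests can be sketched as: construct the client only when the service is enabled, then null-guard every publish. The stub below stands in for the real org.apache.hadoop.yarn.client.api.TimelineClient, and all names are illustrative:

```java
// Sketch: the AM keeps working whether or not the timeline service is
// enabled; publishing is simply skipped when the client was never built.
public class TimelineGuardSketch {
    static class TimelineClientStub {
        int entitiesSent = 0;
        void putEntity(String entity) { entitiesSent++; }
    }

    final TimelineClientStub client;

    TimelineGuardSketch(boolean timelineServiceEnabled) {
        // Only construct the client when TIMELINE_SERVICE_ENABLED is true.
        this.client = timelineServiceEnabled ? new TimelineClientStub() : null;
    }

    // Returns true if the entity was published, false if it was dropped
    // because the timeline service is disabled.
    boolean publish(String entity) {
        if (client == null) {
            return false; // disabled: do not contact the timeline server
        }
        client.putEntity(entity);
        return true;
    }
}
```

This avoids the NPE from the JIRA description: with the service disabled, no timeline client (and no delegation token renewal) exists to fail.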
[jira] [Updated] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-2800: - Attachment: YARN-2800-20141118-1.patch Uploaded a patch to kick Jenkins. Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature Key: YARN-2800 URL: https://issues.apache.org/jira/browse/YARN-2800 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch, YARN-2800-20141118-1.patch In the past, we had a MemoryNodeLabelStore, mostly for users to try this feature without configuring where to store node labels on the file system. It seems convenient for users to try this, but it actually causes a bad user experience. Users may add/remove labels and edit capacity-scheduler.xml; after an RM restart the labels will be gone (we store them in memory), and the RM cannot get started if some queues use labels that don't exist in the cluster. As we discussed, we should have an explicit way to let users specify whether they want this feature or not. If node labels are disabled, any operation trying to modify/use node labels will throw an exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2876) In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues
[ https://issues.apache.org/jira/browse/YARN-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217126#comment-14217126 ] Hadoop QA commented on YARN-2876: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12682279/YARN-2876.v1.patch against trunk revision fbf81fb. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5871//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5871//console This message is automatically generated. 
In Fair Scheduler, JMX and Scheduler UI display wrong maxResource info for subqueues Key: YARN-2876 URL: https://issues.apache.org/jira/browse/YARN-2876 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Assignee: Siqi Li Attachments: YARN-2876.v1.patch, screenshot-1.png If a subqueue doesn't have a maxResource set in fair-scheduler.xml, JMX and Scheduler UI will display the entire cluster capacity as its maxResource instead of its parent queue's maxResource. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
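The fix the report implies could look like the sketch below, assuming the intended fallback is the parent queue's maxResource rather than the cluster capacity. All names here are illustrative, not the actual FairScheduler code:

```java
// Illustrative sketch (not the actual FairScheduler code): when a subqueue has
// no explicit maxResource in fair-scheduler.xml, report the parent queue's
// maxResource rather than the full cluster capacity.
public class QueueMaxFallback {
    public static final int UNSET = -1; // stand-in for "not set in fair-scheduler.xml"

    public static int effectiveMaxMemoryMb(int queueMaxMb, int parentMaxMb, int clusterMb) {
        if (queueMaxMb != UNSET) {
            return queueMaxMb;   // explicit per-queue setting wins
        }
        if (parentMaxMb != UNSET) {
            return parentMaxMb;  // fall back to the parent queue, not the cluster
        }
        return clusterMb;        // only the root defaults to cluster capacity
    }
}
```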
[jira] [Commented] (YARN-2877) Extend YARN to support distributed scheduling
[ https://issues.apache.org/jira/browse/YARN-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217207#comment-14217207 ] Sriram Rao commented on YARN-2877: -- The proposal: # Extend the NM to support task queueing. AMs can queue tasks directly at the NMs, and the NMs will execute those tasks opportunistically. # Extend the type of containers that YARN exposes: #* CONSERVATIVE: This corresponds to containers allocated by YARN today. #* OPTIMISTIC: This corresponds to a new class of containers, which will be queued for execution at the NM. This extension allows AMs to control what type of container they are requesting from the RM framework. # Extend the NM with a local RM (i.e., a local Resource Manager) which uses local policies for deciding when an OPTIMISTIC container can be executed. We are exploring using timed leases for OPTIMISTIC containers to ensure a minimum duration of execution. At the same time, this mechanism allows NMs to free up resources and thus guarantee predictable start times for CONSERVATIVE containers. There are additional motivations for this feature, and we will discuss them in follow-up comments. Extend YARN to support distributed scheduling - Key: YARN-2877 URL: https://issues.apache.org/jira/browse/YARN-2877 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Sriram Rao This is an umbrella JIRA that proposes to extend YARN to support distributed scheduling. Briefly, some of the motivations for distributed scheduling are the following: 1. Improve cluster utilization by opportunistically executing tasks on otherwise idle resources on individual machines. 2. Reduce allocation latency for tasks where the scheduling time dominates (i.e., task execution time is much shorter than the time required to obtain a container from the RM). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
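The two container classes in the proposal above can be sketched as a local admission check. The enum, the lease parameters, and the method are assumptions for illustration; the comment describes only the concepts, not this API:

```java
// Minimal sketch of the two container classes described in the YARN-2877
// proposal. All names and the lease check are illustrative assumptions.
public class DistSchedSketch {
    public enum ContainerType { CONSERVATIVE, OPTIMISTIC }

    /**
     * Local-RM style admission decision: CONSERVATIVE containers were already
     * allocated centrally and always run; OPTIMISTIC containers run only when
     * the node has idle capacity and can grant a minimum-duration lease.
     */
    public static boolean canRunNow(ContainerType type, boolean nodeHasIdleCapacity,
                                    long grantableLeaseMs, long minLeaseMs) {
        if (type == ContainerType.CONSERVATIVE) {
            return true;
        }
        return nodeHasIdleCapacity && grantableLeaseMs >= minLeaseMs;
    }
}
```

The lease check is what lets the NM reclaim resources from opportunistic work while still giving each admitted OPTIMISTIC task a minimum execution window, as the comment describes.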
[jira] [Commented] (YARN-2206) Update document for applications REST API response examples
[ https://issues.apache.org/jira/browse/YARN-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217280#comment-14217280 ] Jian He commented on YARN-2206: --- [~kj-ki], thanks for working on this. The patch no longer applies; mind updating it? I can commit once it's updated. Update document for applications REST API response examples --- Key: YARN-2206 URL: https://issues.apache.org/jira/browse/YARN-2206 Project: Hadoop YARN Issue Type: Improvement Components: documentation Affects Versions: 2.4.0 Reporter: Kenji Kikushima Assignee: Kenji Kikushima Priority: Minor Attachments: YARN-2206.patch In ResourceManagerRest.apt.vm, the Applications API responses are missing some elements. - The JSON response should have applicationType and applicationTags. - The XML response should have applicationTags. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2404) Remove ApplicationAttemptState and ApplicationState class in RMStateStore class
[ https://issues.apache.org/jira/browse/YARN-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217302#comment-14217302 ] Jian He commented on YARN-2404: --- [~ozawa], sorry for the late response; the patch no longer applies. Mind updating it? Remove ApplicationAttemptState and ApplicationState class in RMStateStore class Key: YARN-2404 URL: https://issues.apache.org/jira/browse/YARN-2404 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Tsuyoshi OZAWA Attachments: YARN-2404.1.patch, YARN-2404.2.patch, YARN-2404.3.patch, YARN-2404.4.patch We can remove the ApplicationState and ApplicationAttemptState classes in RMStateStore, given that we already have the ApplicationStateData and ApplicationAttemptStateData records. We may just replace ApplicationState with ApplicationStateData, and similarly for ApplicationAttemptState. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2404) Remove ApplicationAttemptState and ApplicationState class in RMStateStore class
[ https://issues.apache.org/jira/browse/YARN-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217308#comment-14217308 ] Hadoop QA commented on YARN-2404: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12668577/YARN-2404.4.patch against trunk revision 79301e8. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5876//console This message is automatically generated. Remove ApplicationAttemptState and ApplicationState class in RMStateStore class Key: YARN-2404 URL: https://issues.apache.org/jira/browse/YARN-2404 Project: Hadoop YARN Issue Type: Sub-task Reporter: Jian He Assignee: Tsuyoshi OZAWA Attachments: YARN-2404.1.patch, YARN-2404.2.patch, YARN-2404.3.patch, YARN-2404.4.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2356) yarn status command for non-existent application/application attempt/container is too verbose
[ https://issues.apache.org/jira/browse/YARN-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217334#comment-14217334 ] Jian He commented on YARN-2356: --- [~sunilg], thanks for working on this. Patch looks good; one minor comment: for {{doesn't exist in RM or History Server.}}, we may say {{TimeLineServer}} instead of {{History Server}}. yarn status command for non-existent application/application attempt/container is too verbose -- Key: YARN-2356 URL: https://issues.apache.org/jira/browse/YARN-2356 Project: Hadoop YARN Issue Type: Bug Components: client Reporter: Sunil G Assignee: Sunil G Priority: Minor Attachments: Yarn-2356.1.patch The *yarn application -status*, *applicationattempt -status* and *container status* commands could suppress exceptions such as ApplicationNotFound, ApplicationAttemptNotFound and ContainerNotFound for non-existent entries in the RM or History Server. For example, the exception below could be suppressed better: sunildev@host-a:~/hadoop/hadoop/bin ./yarn application -status application_1402668848165_0015 No GC_PROFILE is given. Defaults to medium. 14/07/25 16:21:45 INFO client.RMProxy: Connecting to ResourceManager at /10.18.40.77:45022 Exception in thread "main" org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1402668848165_0015' doesn't exist in RM.
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:285) at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145) at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:607) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:932) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2099) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2095) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1626) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2093) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101) at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:166) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190) 
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) at $Proxy12.getApplicationReport(Unknown Source) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:291) at org.apache.hadoop.yarn.client.cli.ApplicationCLI.printApplicationReport(ApplicationCLI.java:428) at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:153) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:76) Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException): Application with id 'application_1402668848165_0015' doesn't exist in RM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
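The concise handling the JIRA asks for could look like the sketch below: catch the not-found exception in the CLI and print one line instead of the full remote stack trace. A stand-in exception class keeps the example self-contained, and the method is illustrative, not the actual ApplicationCLI code:

```java
// Sketch of the requested behavior: catch the not-found exception and return a
// single line instead of the full remote stack trace. The nested exception is
// a stand-in so the example compiles without Hadoop on the classpath.
public class ConciseStatus {
    public static class ApplicationNotFoundException extends Exception {
        public ApplicationNotFoundException(String msg) { super(msg); }
    }

    /** Returns either a status line or the one-line error message. */
    public static String statusOrError(String appId, boolean existsInRm) {
        try {
            if (!existsInRm) {
                throw new ApplicationNotFoundException(
                    "Application with id '" + appId + "' doesn't exist in RM.");
            }
            return appId + ": RUNNING"; // placeholder for the real report
        } catch (ApplicationNotFoundException e) {
            return e.getMessage();      // concise: no stack trace printed
        }
    }
}
```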
[jira] [Commented] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217343#comment-14217343 ] Hadoop QA commented on YARN-2800: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12682325/YARN-2800-20141118-2.patch against trunk revision 79301e8. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 5 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5875//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5875//console This message is automatically generated. 
Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature Key: YARN-2800 URL: https://issues.apache.org/jira/browse/YARN-2800 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch, YARN-2800-20141118-1.patch, YARN-2800-20141118-2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2865) Application recovery continuously fails with Application with id already present. Cannot duplicate
[ https://issues.apache.org/jira/browse/YARN-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217362#comment-14217362 ] Rohith commented on YARN-2865: -- Thanks Karthik and Jian He for review. I will update the patch. bq. In RMActiveServices, some are using rmContext#setter, some are using activeServiceContext#setter, we may make it consistent to use the latter RMContext has 5 setter methods. I used those methods to set from RMActiveService just to retain interface implementation. Application recovery continuously fails with Application with id already present. Cannot duplicate Key: YARN-2865 URL: https://issues.apache.org/jira/browse/YARN-2865 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Assignee: Rohith Priority: Critical Attachments: YARN-2865.patch, YARN-2865.patch YARN-2588 handles exception thrown while transitioningToActive and reset activeServices. But it misses out clearing RMcontext apps/nodes details and ClusterMetrics and QueueMetrics. This causes application recovery to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2865) Application recovery continuously fails with Application with id already present. Cannot duplicate
[ https://issues.apache.org/jira/browse/YARN-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217448#comment-14217448 ] Tsuyoshi OZAWA commented on YARN-2865: -- [~rohithsharma], thanks for taking this issue. +1 for adding the Private and Unstable annotations to the methods defined in RMActiveServiceContext, as Karthik mentioned. The other points look good to me. Application recovery continuously fails with Application with id already present. Cannot duplicate Key: YARN-2865 URL: https://issues.apache.org/jira/browse/YARN-2865 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Reporter: Rohith Assignee: Rohith Priority: Critical Attachments: YARN-2865.patch, YARN-2865.patch YARN-2588 handles exception thrown while transitioningToActive and reset activeServices. But it misses out clearing RMcontext apps/nodes details and ClusterMetrics and QueueMetrics. This causes application recovery to fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2878) Fix DockerContainerExecutor.apt.vm formatting
Abin Shahab created YARN-2878: - Summary: Fix DockerContainerExecutor.apt.vm formatting Key: YARN-2878 URL: https://issues.apache.org/jira/browse/YARN-2878 Project: Hadoop YARN Issue Type: Improvement Reporter: Abin Shahab Assignee: Abin Shahab The formatting on DockerContainerExecutor.apt.vm is off. Needs correction -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2878) Fix DockerContainerExecutor.apt.vm formatting
[ https://issues.apache.org/jira/browse/YARN-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abin Shahab updated YARN-2878: -- Attachment: YARN-1964-docs.patch Fix DockerContainerExecutor.apt.vm formatting - Key: YARN-2878 URL: https://issues.apache.org/jira/browse/YARN-2878 Project: Hadoop YARN Issue Type: Improvement Reporter: Abin Shahab Assignee: Abin Shahab Attachments: YARN-1964-docs.patch The formatting on DockerContainerExecutor.apt.vm is off. Needs correction -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2738) Add FairReservationSystem for FairScheduler
[ https://issues.apache.org/jira/browse/YARN-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2738: Attachment: YARN-2738.003.patch Removed configurability in fairscheduler configuration as per discussion. Add FairReservationSystem for FairScheduler --- Key: YARN-2738 URL: https://issues.apache.org/jira/browse/YARN-2738 Project: Hadoop YARN Issue Type: Sub-task Components: fairscheduler Reporter: Anubhav Dhoot Assignee: Anubhav Dhoot Attachments: YARN-2738.001.patch, YARN-2738.002.patch, YARN-2738.003.patch Need to create a FairReservationSystem that will implement ReservationSystem for FairScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2878) Fix DockerContainerExecutor.apt.vm formatting
[ https://issues.apache.org/jira/browse/YARN-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217516#comment-14217516 ] Hadoop QA commented on YARN-2878: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12682353/YARN-1964-docs.patch against trunk revision 79301e8. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/5877//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5877//console This message is automatically generated. Fix DockerContainerExecutor.apt.vm formatting - Key: YARN-2878 URL: https://issues.apache.org/jira/browse/YARN-2878 Project: Hadoop YARN Issue Type: Improvement Reporter: Abin Shahab Assignee: Abin Shahab Attachments: YARN-1964-docs.patch The formatting on DockerContainerExecutor.apt.vm is off. Needs correction -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation
[ https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch reassigned YARN-2637: - Assignee: Craig Welch maximum-am-resource-percent could be violated when resource of AM is minimumAllocation Key: YARN-2637 URL: https://issues.apache.org/jira/browse/YARN-2637 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Wangda Tan Assignee: Craig Welch Priority: Critical Currently, the number of AMs in a leaf queue is calculated in the following way:
{code}
max_am_resource = queue_max_capacity * maximum_am_resource_percent
#max_am_number = max_am_resource / minimum_allocation
#max_am_number_for_each_user = #max_am_number * userlimit * userlimit_factor
{code}
And when a new application is submitted to the RM, it checks whether the app can be activated in the following way:
{code}
for (Iterator<FiCaSchedulerApp> i = pendingApplications.iterator(); i.hasNext();) {
  FiCaSchedulerApp application = i.next();
  // Check queue limit
  if (getNumActiveApplications() >= getMaximumActiveApplications()) {
    break;
  }
  // Check user limit
  User user = getUser(application.getUser());
  if (user.getActiveApplications() < getMaximumActiveApplicationsPerUser()) {
    user.activateApplication();
    activeApplications.add(application);
    i.remove();
    LOG.info("Application " + application.getApplicationId() + " from user: "
        + application.getUser() + " activated in queue: " + getQueueName());
  }
}
{code}
An example: if a queue has capacity = 1G and max_am_resource_percent = 0.2, the maximum resource that AMs can use is 200M. Assuming minimum_allocation = 1M, the number of AMs that can be launched is 200, and if each AM actually uses 5M (> minimum_allocation), all apps can still be activated, and they will occupy all of the queue's resources instead of only max_am_resource_percent of the queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
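The arithmetic in the YARN-2637 example above can be checked directly. This standalone sketch just replays the numbers from the description; it is not scheduler code:

```java
// Replays the numbers in the YARN-2637 example: capping the *count* of AMs by
// minimum_allocation does not cap the AMs' actual resource usage.
public class AmLimitMath {
    public static int maxAmResourceMb(int queueCapacityMb, double maxAmPercent) {
        return (int) (queueCapacityMb * maxAmPercent);
    }

    public static int maxAmCount(int maxAmResourceMb, int minAllocationMb) {
        return maxAmResourceMb / minAllocationMb;
    }

    public static void main(String[] args) {
        int cap = maxAmResourceMb(1000, 0.2); // "1G" queue, 20% -> 200M for AMs
        int count = maxAmCount(cap, 1);       // minimum_allocation = 1M -> 200 AMs
        int actualMb = count * 5;             // each AM really uses 5M
        // 200 AMs * 5M = 1000M: the whole queue, 5x the intended 200M cap
        System.out.println(actualMb + "M used by AMs vs " + cap + "M intended cap");
    }
}
```

Counting AMs in units of actual requested AM resource, rather than minimum_allocation, would keep the total within the 200M cap.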
[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation
[ https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-2637: -- Attachment: YARN-2637.0.patch Attaching a roughish but I think serviceable work-in-progress patch - based on manual testing and checking the logs it looks to work as it should - still need to write some unit tests and validate it against the existing tests... maximum-am-resource-percent could be violated when resource of AM is minimumAllocation Key: YARN-2637 URL: https://issues.apache.org/jira/browse/YARN-2637 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Wangda Tan Assignee: Craig Welch Priority: Critical Attachments: YARN-2637.0.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2800) Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature
[ https://issues.apache.org/jira/browse/YARN-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14217569#comment-14217569 ] Tsuyoshi OZAWA commented on YARN-2800: -- Thanks for your comment, Vinod, and thanks for the patch, Wangda. +1 for removing MemoryNodeLabelsStore. My comments: * The MemoryRMNodeLabelsManager for tests does nothing in the new patch. How about renaming MemoryRMNodeLabelsManager to NullRMNodeLabelsManager for consistency with RMStateStore? * Maybe not related to this JIRA, but it would be better to add a test of RMRestart with NodeLabelManager to avoid regressions. Remove MemoryNodeLabelsStore and add a way to enable/disable node labels feature Key: YARN-2800 URL: https://issues.apache.org/jira/browse/YARN-2800 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Wangda Tan Assignee: Wangda Tan Attachments: YARN-2800-20141102-1.patch, YARN-2800-20141102-2.patch, YARN-2800-20141118-1.patch, YARN-2800-20141118-2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2637) maximum-am-resource-percent could be violated when resource of AM is minimumAllocation
[ https://issues.apache.org/jira/browse/YARN-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Craig Welch updated YARN-2637: -- Attachment: YARN-2637.1.patch Had forgotten to remove the resource when the application finishes - the updated patch does. I think this actually needs to be a per-cluster (rather than a per-queue) limit: based on the name, that is the behavior most seem to expect - except that there can be a per-queue override of the value, and most other values like it end up being evaluated at the queue level. It seems as though either this should be a global value, or possibly based on a portion of the cluster (perhaps the queue's baseline portion of the cluster, then adjusted). Most likely, the right approach is to make usedAMResources a single per-cluster value by attaching it to the parent queue (so, the AbstractCSQueue instance of the root queue) - which wouldn't be difficult - and then it would be per-cluster, as it probably should be. maximum-am-resource-percent could be violated when resource of AM is minimumAllocation Key: YARN-2637 URL: https://issues.apache.org/jira/browse/YARN-2637 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.6.0 Reporter: Wangda Tan Assignee: Craig Welch Priority: Critical Attachments: YARN-2637.0.patch, YARN-2637.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)