[jira] [Commented] (YARN-888) clean up POM dependencies
[ https://issues.apache.org/jira/browse/YARN-888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13870628#comment-13870628 ]

Hudson commented on YARN-888:
-----------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #452 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/452/])
YARN-888. Cleaned up POM files so that non-leaf modules don't include any dependencies and thus compact the dependency list for leaf modules. Contributed by Alejandro Abdelnur. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1557801)
* /hadoop/common/trunk/hadoop-project/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/pom.xml

clean up POM dependencies
-------------------------

Key: YARN-888
URL: https://issues.apache.org/jira/browse/YARN-888
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Fix For: 2.4.0
Attachments: YARN-888.patch, YARN-888.patch, YARN-888.patch, YARN-888.patch, yarn-888-2.patch

Intermediate 'pom' modules define dependencies that are inherited by the leaf modules. This is causing issues in the IntelliJ IDE. We should normalize the leaf modules as in common, hdfs and tools, where all dependencies are defined in each leaf module and the intermediate 'pom' modules do not define any dependencies.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (YARN-888) clean up POM dependencies
[ https://issues.apache.org/jira/browse/YARN-888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13870712#comment-13870712 ]

Hudson commented on YARN-888:
-----------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1669 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1669/])
YARN-888. Cleaned up POM files so that non-leaf modules don't include any dependencies and thus compact the dependency list for leaf modules. Contributed by Alejandro Abdelnur. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1557801)
* /hadoop/common/trunk/hadoop-project/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/pom.xml

clean up POM dependencies
-------------------------

Key: YARN-888
URL: https://issues.apache.org/jira/browse/YARN-888
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Fix For: 2.4.0
Attachments: YARN-888.patch, YARN-888.patch, YARN-888.patch, YARN-888.patch, yarn-888-2.patch

Intermediate 'pom' modules define dependencies that are inherited by the leaf modules. This is causing issues in the IntelliJ IDE. We should normalize the leaf modules as in common, hdfs and tools, where all dependencies are defined in each leaf module and the intermediate 'pom' modules do not define any dependencies.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (YARN-888) clean up POM dependencies
[ https://issues.apache.org/jira/browse/YARN-888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13870724#comment-13870724 ]

Hudson commented on YARN-888:
-----------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1644 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1644/])
YARN-888. Cleaned up POM files so that non-leaf modules don't include any dependencies and thus compact the dependency list for leaf modules. Contributed by Alejandro Abdelnur. (vinodkv: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1557801)
* /hadoop/common/trunk/hadoop-project/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-unmanaged-am-launcher/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-web-proxy/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/pom.xml
* /hadoop/common/trunk/hadoop-yarn-project/pom.xml

clean up POM dependencies
-------------------------

Key: YARN-888
URL: https://issues.apache.org/jira/browse/YARN-888
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.1.0-beta
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Fix For: 2.4.0
Attachments: YARN-888.patch, YARN-888.patch, YARN-888.patch, YARN-888.patch, yarn-888-2.patch

Intermediate 'pom' modules define dependencies that are inherited by the leaf modules. This is causing issues in the IntelliJ IDE. We should normalize the leaf modules as in common, hdfs and tools, where all dependencies are defined in each leaf module and the intermediate 'pom' modules do not define any dependencies.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
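[Editor's note] For illustration, a minimal sketch of the normalization described in YARN-888, using an invented dependency (guava is only a plausible example; the real diffs are in the commit above). Before the change, an intermediate 'pom' module declared dependencies that every child inherited; after it, the intermediate module declares none and each leaf module lists exactly what it uses, with versions still managed centrally in hadoop-project.

{code:xml}
<!-- Sketch only; not the actual Hadoop POMs. -->

<!-- Before: an intermediate 'pom' module forces a dependency on every
     child, whether the child needs it or not. -->
<project>
  <artifactId>hadoop-yarn</artifactId>
  <packaging>pom</packaging>
  <dependencies>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
    </dependency>
  </dependencies>
</project>

<!-- After: the intermediate module declares no dependencies; a leaf module
     declares what it actually uses. The version is still inherited from
     the <dependencyManagement> section of hadoop-project/pom.xml. -->
<project>
  <artifactId>hadoop-yarn-common</artifactId>
  <dependencies>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
    </dependency>
  </dependencies>
</project>
{code}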
[jira] [Created] (YARN-1600) RM does not startup when security is enabled without spnego configured
Jason Lowe created YARN-1600:
--------------------------------

Summary: RM does not startup when security is enabled without spnego configured
Key: YARN-1600
URL: https://issues.apache.org/jira/browse/YARN-1600
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Jason Lowe
Priority: Blocker

We have a custom auth filter in front of our various UI pages that handles user authentication. However, the RM currently assumes that if security is enabled then the user must have configured spnego for the RM web pages as well, which is not true in our case.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (YARN-1600) RM does not startup when security is enabled without spnego configured
[ https://issues.apache.org/jira/browse/YARN-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13870898#comment-13870898 ]

Jason Lowe commented on YARN-1600:
----------------------------------

A number of ways to address this, and I'm sure there are others:
* have the RM avoid setting spnego confs on the WebApps setup if the confs have no values set (see the sketch after this message)
* have WebApps avoid setting up username and keytab confs for HttpServer if those confs have no values set (similar to early patches on YARN-1463)
* if we want to make sure users are aware that they configured security but not spnego, and we want that to break by default as it does today, then we need a separate config to indicate the user really wants to run with security but without spnego on the RM web pages

RM does not startup when security is enabled without spnego configured
----------------------------------------------------------------------

Key: YARN-1600
URL: https://issues.apache.org/jira/browse/YARN-1600
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Jason Lowe
Priority: Blocker

We have a custom auth filter in front of our various UI pages that handles user authentication. However, the RM currently assumes that if security is enabled then the user must have configured spnego for the RM web pages as well, which is not true in our case.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
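[Editor's note] A minimal sketch of the first option above, assuming the RM wires up its web app through the WebApps builder; the builder methods and configuration keys follow the YARN code of this era but should be read as illustrative, not as the eventual patch.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.webapp.WebApps;

// Only forward spnego settings to the web app when the admin actually set
// them, so a secure cluster without spnego can still start the RM.
Configuration conf = new YarnConfiguration();  // the RM's configuration
String principal = conf.get(YarnConfiguration.RM_WEBAPP_SPNEGO_USER_NAME_KEY);

WebApps.Builder<Object> builder = WebApps.$for("cluster");  // illustrative setup
if (principal != null && !principal.isEmpty()) {
  // spnego is explicitly configured: keep today's behavior
  builder
      .withHttpSpnegoPrincipalKey(YarnConfiguration.RM_WEBAPP_SPNEGO_USER_NAME_KEY)
      .withHttpSpnegoKeytabKey(YarnConfiguration.RM_WEBAPP_SPNEGO_KEYTAB_FILE_KEY);
}
// else: skip the spnego keys entirely instead of failing on empty values
// ... builder.start(...) as today ...
{code}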
[jira] [Commented] (YARN-1530) [Umbrella] Store, manage and serve per-framework application-timeline data
[ https://issues.apache.org/jira/browse/YARN-1530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13870908#comment-13870908 ]

Lohit Vijayarenu commented on YARN-1530:
----------------------------------------

Yes, a proxy server inside the library, but only in the AM, not the containers. Containers could make REST calls to the AM. The main advantage is that we would not send timeline data to one single server. For example, we have seen cases where our history files could grow up to 700MB for large jobs. In that case, having hundreds of those would easily become a bottleneck for a single REST endpoint; distributing the work to each job's own AM would help.

[Umbrella] Store, manage and serve per-framework application-timeline data
---------------------------------------------------------------------------

Key: YARN-1530
URL: https://issues.apache.org/jira/browse/YARN-1530
Project: Hadoop YARN
Issue Type: Bug
Reporter: Vinod Kumar Vavilapalli
Attachments: application timeline design-20140108.pdf

This is a sibling JIRA for YARN-321. Today, each application/framework has to store and serve per-framework data all by itself, as YARN doesn't have a common solution. This JIRA attempts to solve the storage, management and serving of per-framework data from various applications, both running and finished. The aim is to change YARN to collect and store data in a generic manner with plugin points for frameworks to do their own thing w.r.t. interpretation and serving.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Updated] (YARN-1594) YARN-321 branch needs to be updated after YARN-888 pom changes
[ https://issues.apache.org/jira/browse/YARN-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-1594:
------------------------------------------

Fix Version/s: YARN-321

YARN-321 branch needs to be updated after YARN-888 pom changes
--------------------------------------------------------------

Key: YARN-1594
URL: https://issues.apache.org/jira/browse/YARN-1594
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Fix For: YARN-321
Attachments: YARN-1594-20140113.txt, YARN-1594.txt

YARN-888 changed the pom structure, and so the latest merge to trunk breaks the YARN-321 branch.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Updated] (YARN-1596) Javadoc failures on YARN-321 branch
[ https://issues.apache.org/jira/browse/YARN-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-1596:
------------------------------------------

Fix Version/s: YARN-321

Javadoc failures on YARN-321 branch
-----------------------------------

Key: YARN-1596
URL: https://issues.apache.org/jira/browse/YARN-1596
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Fix For: YARN-321
Attachments: YARN-1596.txt

There are some javadoc issues on YARN-321 branch.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Updated] (YARN-1597) FindBugs warnings on YARN-321 branch
[ https://issues.apache.org/jira/browse/YARN-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-1597:
------------------------------------------

Fix Version/s: YARN-321

FindBugs warnings on YARN-321 branch
------------------------------------

Key: YARN-1597
URL: https://issues.apache.org/jira/browse/YARN-1597
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli
Fix For: YARN-321
Attachments: YARN-1597.txt

There are a bunch of findBugs warnings on YARN-321 branch.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (YARN-1496) Protocol additions to allow moving apps between queues
[ https://issues.apache.org/jira/browse/YARN-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871226#comment-13871226 ]

Karthik Kambatla commented on YARN-1496:
----------------------------------------

The patch looks good to me and seems to have addressed Vinod's comments as well. +1

Protocol additions to allow moving apps between queues
------------------------------------------------------

Key: YARN-1496
URL: https://issues.apache.org/jira/browse/YARN-1496
Project: Hadoop YARN
Issue Type: Sub-task
Components: scheduler
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Attachments: YARN-1496-1.patch, YARN-1496-2.patch, YARN-1496-3.patch, YARN-1496-4.patch, YARN-1496-5.patch, YARN-1496-6.patch, YARN-1496.patch

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (YARN-1496) Protocol additions to allow moving apps between queues
[ https://issues.apache.org/jira/browse/YARN-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871250#comment-13871250 ]

Sandy Ryza commented on YARN-1496:
----------------------------------

I'll commit this tomorrow if there are no further objections.

Protocol additions to allow moving apps between queues
------------------------------------------------------

Key: YARN-1496
URL: https://issues.apache.org/jira/browse/YARN-1496
Project: Hadoop YARN
Issue Type: Sub-task
Components: scheduler
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Attachments: YARN-1496-1.patch, YARN-1496-2.patch, YARN-1496-3.patch, YARN-1496-4.patch, YARN-1496-5.patch, YARN-1496-6.patch, YARN-1496.patch

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (YARN-1567) In Fair Scheduler, allow empty queues to change between leaf and parent on allocation file reload
[ https://issues.apache.org/jira/browse/YARN-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871339#comment-13871339 ]

Hudson commented on YARN-1567:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #4996 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4996/])
YARN-1567. In Fair Scheduler, allow empty queues to change between leaf and parent on allocation file reload (Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1558228)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/QueueMetrics.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AllocationConfiguration.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSParentQueue.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSQueue.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestQueueManager.java

In Fair Scheduler, allow empty queues to change between leaf and parent on allocation file reload
-------------------------------------------------------------------------------------------------

Key: YARN-1567
URL: https://issues.apache.org/jira/browse/YARN-1567
Project: Hadoop YARN
Issue Type: Improvement
Components: scheduler
Affects Versions: 2.2.0
Reporter: Sandy Ryza
Assignee: Sandy Ryza
Fix For: 2.4.0
Attachments: YARN-1567-1.patch, YARN-1567-2.patch, YARN-1567-3.patch, YARN-1567.patch

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
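[Editor's note] For context, YARN-1567 lets an empty queue flip between leaf and parent when the allocation file is reloaded. A hypothetical before/after pair of Fair Scheduler allocation files (queue names invented) illustrates the case:

{code:xml}
<!-- Before reload: "adhoc" is a leaf queue (it has no child queues). -->
<allocations>
  <queue name="adhoc">
    <minResources>1024 mb,1 vcores</minResources>
  </queue>
</allocations>

<!-- After reload: "adhoc" gains a child and becomes a parent queue. With
     this change the scheduler accepts the flip, provided "adhoc" is empty
     (no running applications). -->
<allocations>
  <queue name="adhoc">
    <queue name="experiments"/>
  </queue>
</allocations>
{code}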
[jira] [Created] (YARN-1601) 3rd party JARs are missing from hadoop-dist output
Alejandro Abdelnur created YARN-1601:
----------------------------------------

Summary: 3rd party JARs are missing from hadoop-dist output
Key: YARN-1601
URL: https://issues.apache.org/jira/browse/YARN-1601
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 3.0.0, 2.4.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur

With the build changes of YARN-888 we are leaving out all 3rd party JARs used directly by YARN under /share/hadoop/yarn/lib/. We did not notice this when running the minicluster because they all happen to be in the classpath from hadoop-common and hadoop-yarn. As 3rd party JARs are not 'public' interfaces, we cannot rely on them being provided to yarn by common and hdfs (i.e. if common and hdfs stop using a 3rd party dependency that yarn uses, this would break yarn if yarn does not pull that dependency explicitly). Also, this will break the bigtop hadoop build when they move to branch-2, as they expect to find jars in /share/hadoop/yarn/lib/.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (YARN-1598) HA-related rmadmin commands don't work on a secure cluster
[ https://issues.apache.org/jira/browse/YARN-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871417#comment-13871417 ]

Karthik Kambatla commented on YARN-1598:
----------------------------------------

The test failure is unrelated. Committing this shortly.

HA-related rmadmin commands don't work on a secure cluster
----------------------------------------------------------

Key: YARN-1598
URL: https://issues.apache.org/jira/browse/YARN-1598
Project: Hadoop YARN
Issue Type: Sub-task
Components: client, resourcemanager
Affects Versions: 2.4.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Critical
Attachments: yarn-1598-1.patch

The HA-related commands like -getServiceState -checkHealth etc. don't work in a secure cluster.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Updated] (YARN-1601) 3rd party JARs are missing from hadoop-dist output
[ https://issues.apache.org/jira/browse/YARN-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alejandro Abdelnur updated YARN-1601:
-------------------------------------

Attachment: YARN-1601.patch

Patch that adds the submodules as dependencies to the hadoop-yarn POM; this is required to be able to populate the /share/hadoop/lib/ dir with the 3rd party JARs used by YARN. The hadoop-yarn POM isn't a parent of any other POM, so this does not affect existing dependencies. It just makes the yarn assembly pick up YARN's 3rd party JARs when creating the tarball.

3rd party JARs are missing from hadoop-dist output
--------------------------------------------------

Key: YARN-1601
URL: https://issues.apache.org/jira/browse/YARN-1601
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 3.0.0, 2.4.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Attachments: YARN-1601.patch

With the build changes of YARN-888 we are leaving out all 3rd party JARs used directly by YARN under /share/hadoop/yarn/lib/. We did not notice this when running the minicluster because they all happen to be in the classpath from hadoop-common and hadoop-yarn. As 3rd party JARs are not 'public' interfaces, we cannot rely on them being provided to yarn by common and hdfs (i.e. if common and hdfs stop using a 3rd party dependency that yarn uses, this would break yarn if yarn does not pull that dependency explicitly). Also, this will break the bigtop hadoop build when they move to branch-2, as they expect to find jars in /share/hadoop/yarn/lib/.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
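[Editor's note] A sketch of what the patch description says, with illustrative coordinates (see YARN-1601.patch for the real change): the hadoop-yarn aggregator POM lists its own submodules as dependencies so that the dist assembly resolves, and therefore copies, their transitive 3rd party JARs.

{code:xml}
<!-- Illustrative sketch of hadoop-yarn/pom.xml, not the exact patch.
     Because no other POM inherits from hadoop-yarn, adding these here only
     affects the assembly that builds share/hadoop/yarn/lib/. -->
<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-yarn-api</artifactId>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-yarn-common</artifactId>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-yarn-client</artifactId>
  </dependency>
  <!-- ...and the remaining yarn submodules... -->
</dependencies>
{code}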
[jira] [Commented] (YARN-1584) Support explicit failover when automatic failover is enabled
[ https://issues.apache.org/jira/browse/YARN-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871425#comment-13871425 ]

Karthik Kambatla commented on YARN-1584:
----------------------------------------

bq. The duration of failover depends on how long ZK needs to figure out that the leader is gone. Then notifying the new leader. Then new leader reading state.

Right. I agree that the failover takes the same time irrespective of whether it is through graceful failover or shutting the RM down.

bq. Its not clear to me how any of these steps are faster with a admin failover option.

I was referring to the duration after the failover for which only a single RM is up; in other words, the duration for recovering the RM that was shut down. Firstly, it requires manually checking that the other RM has actually taken over, which in itself is slower than handling it automatically. Then there is the start-up time for the second RM; the start-up might become an issue if/when the Standby and the other services retain/pre-fetch state. IMO, the biggest gain of supporting -failover is the ease of use. What do you think of adding a config for whether to support graceful failover? Maybe we can turn it off by default.

bq. When the RM is asked to transition to active via the AdminService (FORCE_USER) flag, then the AdminService can transition to standby and then notify the elector to quitElection(). That API is present on the elector for this specific purpose. The elector gives up participation in the leader election process. This RM will remain in standby (because the elector is not going to notify it anymore) until the admin asks it to transitionToActive(FORCE_USER). Later, when the AdminService is asked to transitionToActive() it can call the joinElection API on the elector to rejoin the leader election and stay in the Standby state. The elector will join the election and notify the RM to transitionToActive if it wins the election.

The transitionToStandby() part sounds reasonable to me. transitionToActive(FORCE_USER) wouldn't actually transition the RM to Active, but instead just make it ready to become Active? Users might find it confusing.

Support explicit failover when automatic failover is enabled
-------------------------------------------------------------

Key: YARN-1584
URL: https://issues.apache.org/jira/browse/YARN-1584
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla

YARN-1029 adds automatic failover support. However, users can't explicitly ask for a failover from one RM to the other without stopping the other RM. Stopping the RM until the other RM takes over and then restarting the first RM is more involved and exposes the RM ensemble to a SPOF for a longer duration. It would be nice to allow explicit failover through a yarn rmadmin -failover command. PS: HDFS supports the -failover option.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
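[Editor's note] A rough sketch of the flow described in the quoted proposal, using the quitElection/joinElection calls named above; only those two are real ActiveStandbyElector methods, and the surrounding AdminService plumbing is hypothetical.

{code:java}
import org.apache.hadoop.ha.ActiveStandbyElector;

// Hypothetical AdminService-side sketch of the proposal above.
class GracefulFailoverSketch {
  private final ActiveStandbyElector elector;
  private final byte[] localActiveNodeInfo;  // identifies this RM in ZK

  GracefulFailoverSketch(ActiveStandbyElector elector, byte[] info) {
    this.elector = elector;
    this.localActiveNodeInfo = info;
  }

  void transitionToStandbyForced() {
    // ... transition the RM's internal services to standby first ...
    // Then leave the election: the elector stops notifying this RM, so it
    // stays standby until an admin explicitly asks it to rejoin.
    elector.quitElection(false /* no fencing needed, we stepped down cleanly */);
  }

  void transitionToActiveForced() {
    // Re-enter the election. Note this does NOT guarantee becoming active:
    // the RM transitions only if (and when) it wins leadership.
    elector.joinElection(localActiveNodeInfo);
  }
}
{code}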
[jira] [Updated] (YARN-1602) All failed RMStateStore operations should not be RMFatalEvents
[ https://issues.apache.org/jira/browse/YARN-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-1602:
-----------------------------------

Priority: Critical (was: Blocker)

All failed RMStateStore operations should not be RMFatalEvents
--------------------------------------------------------------

Key: YARN-1602
URL: https://issues.apache.org/jira/browse/YARN-1602
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Critical

Currently, if a state store operation fails, depending on the exception, either an RMFatalEvent.STATE_STORE_FENCED or an RMFatalEvent.STATE_STORE_OP_FAILED event is created. The latter results in the RM failing. Instead, we should probably kill the application corresponding to the store operation.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Created] (YARN-1602) All failed RMStateStore operations should not be RMFatalEvents
Karthik Kambatla created YARN-1602:
--------------------------------------

Summary: All failed RMStateStore operations should not be RMFatalEvents
Key: YARN-1602
URL: https://issues.apache.org/jira/browse/YARN-1602
Project: Hadoop YARN
Issue Type: Bug
Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Blocker

Currently, if a state store operation fails, depending on the exception, either an RMFatalEvent.STATE_STORE_FENCED or an RMFatalEvent.STATE_STORE_OP_FAILED event is created. The latter results in the RM failing. Instead, we should probably kill the application corresponding to the store operation.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Updated] (YARN-1598) HA-related rmadmin commands don't work on a secure cluster
[ https://issues.apache.org/jira/browse/YARN-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-1598:
-----------------------------------

Fix Version/s: 2.4.0

HA-related rmadmin commands don't work on a secure cluster
----------------------------------------------------------

Key: YARN-1598
URL: https://issues.apache.org/jira/browse/YARN-1598
Project: Hadoop YARN
Issue Type: Sub-task
Components: client, resourcemanager
Affects Versions: 2.4.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Critical
Fix For: 2.4.0
Attachments: yarn-1598-1.patch

The HA-related commands like -getServiceState -checkHealth etc. don't work in a secure cluster.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Updated] (YARN-1587) [YARN-321] MERGE Patch for YARN-321
[ https://issues.apache.org/jira/browse/YARN-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated YARN-1587:
------------------------------------------

Attachment: YARN-1587-20140114.txt

Uber patch with YARN-321..trunk diff + one more attempt at YARN-1595.

[YARN-321] MERGE Patch for YARN-321
-----------------------------------

Key: YARN-1587
URL: https://issues.apache.org/jira/browse/YARN-1587
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Mayank Bansal
Assignee: Vinod Kumar Vavilapalli
Fix For: YARN-321
Attachments: YARN-1587-20140113.txt, YARN-1587-20140114.txt, YARN-321-merge-1.patch

Merge Patch

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (YARN-1598) HA-related rmadmin commands don't work on a secure cluster
[ https://issues.apache.org/jira/browse/YARN-1598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871552#comment-13871552 ]

Hudson commented on YARN-1598:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #4997 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4997/])
YARN-1598. HA-related rmadmin commands don't work on a secure cluster (kasha) (kasha: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1558251)
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/security/authorize/RMPolicyProvider.java

HA-related rmadmin commands don't work on a secure cluster
----------------------------------------------------------

Key: YARN-1598
URL: https://issues.apache.org/jira/browse/YARN-1598
Project: Hadoop YARN
Issue Type: Sub-task
Components: client, resourcemanager
Affects Versions: 2.4.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla
Priority: Critical
Fix For: 2.4.0
Attachments: yarn-1598-1.patch

The HA-related commands like -getServiceState -checkHealth etc. don't work in a secure cluster.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
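[Editor's note] The commit's file list includes RMPolicyProvider.java; presumably (a guess from the file list, not a reading of the actual diff) the fix registers the HA protocol in the RM's service-level authorization policy so that secured rmadmin HA calls pass authorization, along these lines:

{code:java}
import org.apache.hadoop.fs.CommonConfigurationKeys;
import org.apache.hadoop.ha.HAServiceProtocol;
import org.apache.hadoop.security.authorize.Service;

// Guessed sketch: an entry like this in RMPolicyProvider's service table
// lets service-level authorization recognize HAServiceProtocol calls
// (-transitionToActive, -getServiceState, -checkHealth, ...).
new Service(
    CommonConfigurationKeys.SECURITY_HA_SERVICE_PROTOCOL_ACL,
    HAServiceProtocol.class);
{code}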
[jira] [Commented] (YARN-1601) 3rd party JARs are missing from hadoop-dist output
[ https://issues.apache.org/jira/browse/YARN-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871556#comment-13871556 ]

Hadoop QA commented on YARN-1601:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12623023/YARN-1601.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-assemblies.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2885//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2885//console

This message is automatically generated.

3rd party JARs are missing from hadoop-dist output
--------------------------------------------------

Key: YARN-1601
URL: https://issues.apache.org/jira/browse/YARN-1601
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 3.0.0, 2.4.0
Reporter: Alejandro Abdelnur
Assignee: Alejandro Abdelnur
Attachments: YARN-1601.patch

With the build changes of YARN-888 we are leaving out all 3rd party JARs used directly by YARN under /share/hadoop/yarn/lib/. We did not notice this when running the minicluster because they all happen to be in the classpath from hadoop-common and hadoop-yarn. As 3rd party JARs are not 'public' interfaces, we cannot rely on them being provided to yarn by common and hdfs (i.e. if common and hdfs stop using a 3rd party dependency that yarn uses, this would break yarn if yarn does not pull that dependency explicitly). Also, this will break the bigtop hadoop build when they move to branch-2, as they expect to find jars in /share/hadoop/yarn/lib/.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (YARN-1587) [YARN-321] MERGE Patch for YARN-321
[ https://issues.apache.org/jira/browse/YARN-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871590#comment-13871590 ]

Hadoop QA commented on YARN-1587:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12623033/YARN-1587-20140114.txt
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 27 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in
  hadoop-assemblies
  hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient
  hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api
  hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client
  hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common
  hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice
  hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common
  hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.mapreduce.v2.TestNonExistentJob
  org.apache.hadoop.yarn.client.api.impl.TestNMClient
  org.apache.hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer
  org.apache.hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryClientService

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2883//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2883//console

This message is automatically generated.

[YARN-321] MERGE Patch for YARN-321
-----------------------------------

Key: YARN-1587
URL: https://issues.apache.org/jira/browse/YARN-1587
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Mayank Bansal
Assignee: Vinod Kumar Vavilapalli
Fix For: YARN-321
Attachments: YARN-1587-20140113.txt, YARN-1587-20140114.txt, YARN-321-merge-1.patch

Merge Patch

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871606#comment-13871606 ]

Xuan Gong commented on YARN-1410:
---------------------------------

[~bikassaha] any further comments?

Handle client failover during 2 step client API's like app submission
----------------------------------------------------------------------

Key: YARN-1410
URL: https://issues.apache.org/jira/browse/YARN-1410
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Xuan Gong
Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch
Original Estimate: 48h
Remaining Estimate: 48h

App submission involves 1) creating an appId and 2) using that appId to submit an ApplicationSubmissionContext to the RM. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create the app id), the new RM may reject the app submission, resulting in unexpected failure on the client side. The same may happen for other 2 step client API operations.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Updated] (YARN-1603) Remove two *.orig files which were unexpectedly committed
[ https://issues.apache.org/jira/browse/YARN-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhijie Shen updated YARN-1603:
------------------------------

Summary: Remove two *.orig files which were unexpectedly committed (was: Remove two *.orig files which were unexpected committed)

Remove two *.orig files which were unexpectedly committed
---------------------------------------------------------

Key: YARN-1603
URL: https://issues.apache.org/jira/browse/YARN-1603
Project: Hadoop YARN
Issue Type: Bug
Reporter: Zhijie Shen
Assignee: Zhijie Shen
Priority: Minor
Attachments: YARN-1603.1.patch

FairScheduler.java.orig and TestFifoScheduler.java.orig

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (YARN-1603) Remove two *.orig files which were unexpectedly committed
[ https://issues.apache.org/jira/browse/YARN-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871659#comment-13871659 ]

Hadoop QA commented on YARN-1603:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12623058/YARN-1603.1.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/2886//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2886//console

This message is automatically generated.

Remove two *.orig files which were unexpectedly committed
---------------------------------------------------------

Key: YARN-1603
URL: https://issues.apache.org/jira/browse/YARN-1603
Project: Hadoop YARN
Issue Type: Bug
Reporter: Zhijie Shen
Assignee: Zhijie Shen
Priority: Minor
Attachments: YARN-1603.1.patch

FairScheduler.java.orig and TestFifoScheduler.java.orig

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission
[ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871751#comment-13871751 ]

Bikas Saha commented on YARN-1410:
----------------------------------

Don't think I understood the failover policy w.r.t. the restart stuff. If the RM restarts or it fails over, the client will be retrying across 2 different instances of the RM, and so the semantics of the operations should be the same, i.e. the issues we are trying to identify and fix should be the same. Every problem that we have with failover also applies to restart. Irrespective of failover, if the client does submitApp() and then gets an error on the network (even though the RM has accepted the app), then it retries submitApp() and the RM says the app already exists. So this question is fundamental to the retry semantics of the operation. RM failover is an easy way to trigger this condition. Let's spend some time thinking about a solution that avoids doing a getApplication and receiving an exception before submitting the application.

Handle client failover during 2 step client API's like app submission
----------------------------------------------------------------------

Key: YARN-1410
URL: https://issues.apache.org/jira/browse/YARN-1410
Project: Hadoop YARN
Issue Type: Sub-task
Reporter: Bikas Saha
Assignee: Xuan Gong
Attachments: YARN-1410-outline.patch, YARN-1410.1.patch, YARN-1410.2.patch, YARN-1410.2.patch, YARN-1410.3.patch
Original Estimate: 48h
Remaining Estimate: 48h

App submission involves 1) creating an appId and 2) using that appId to submit an ApplicationSubmissionContext to the RM. The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM. Since the new RM has a different notion of cluster timestamp (used to create the app id), the new RM may reject the app submission, resulting in unexpected failure on the client side. The same may happen for other 2 step client API operations.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
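[Editor's note] To make the retry question concrete, a client-side sketch using the public YarnClient API; the probe-and-treat-as-success policy at the end is just one possible answer, not a settled design (IOException handling elided for brevity).

{code:java}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.client.api.YarnClientApplication;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.exceptions.YarnException;

YarnClient yarnClient = YarnClient.createYarnClient();
yarnClient.init(new YarnConfiguration());
yarnClient.start();

// Step 1 and step 2 can straddle an RM failover, or the ack for step 2 can
// be lost on the network even though the RM accepted the app.
YarnClientApplication app = yarnClient.createApplication();        // step 1
ApplicationSubmissionContext context = app.getApplicationSubmissionContext();
ApplicationId appId = context.getApplicationId();
// ... populate the rest of the submission context ...
try {
  yarnClient.submitApplication(context);                           // step 2
} catch (YarnException submitError) {
  try {
    // Probe: if this (possibly new) RM already knows the app, treat the
    // earlier submit as having succeeded.
    yarnClient.getApplicationReport(appId);
  } catch (YarnException unknownApp) {
    // Genuinely rejected, e.g. an appId minted by a previous RM instance
    // with a different cluster timestamp.
    throw submitError;
  }
}
{code}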
[jira] [Commented] (YARN-1584) Support explicit failover when automatic failover is enabled
[ https://issues.apache.org/jira/browse/YARN-1584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871759#comment-13871759 ]

Bikas Saha commented on YARN-1584:
----------------------------------

bq. Firstly, it requires manually checking the other RM has actually taken over, which in itself is slower than handling it automatically. Then, the start-up time for the second RM; the start-up might become an issue if/when the Standby and the other services retain/pre-fetch state.

Is the proposal for the active RM to give up being the leader, then monitor that someone else becomes the leader? Then do what? If someone else does not become the leader, then what should it do? If someone else becomes the leader, then does the one who just gave up try to participate in the election again? If yes, then why did we ask it to give up in the first place? If we did this to do some maintenance on the first RM, then how is it different from shutting it down and letting auto-failover take its course? If we are doing maintenance on the first RM then we cannot avoid a single-RM risk unless we have 3 instances. Under auto-failover, there is no way one can force an RM to become active all by itself. So the documentation of transitionToActive(FORCE) should state that this puts the RM into the election but does not guarantee that it will win. transitionToStandby() can however guarantee that the RM does stop being active. Clearly, I am confused as to how this is resulting in ease of use. How about I get some help in understanding the exact scenario where this is useful? Is there a specific example? What exactly is the chain of events that we think should happen?

Support explicit failover when automatic failover is enabled
-------------------------------------------------------------

Key: YARN-1584
URL: https://issues.apache.org/jira/browse/YARN-1584
Project: Hadoop YARN
Issue Type: Sub-task
Components: resourcemanager
Affects Versions: 2.4.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla

YARN-1029 adds automatic failover support. However, users can't explicitly ask for a failover from one RM to the other without stopping the other RM. Stopping the RM until the other RM takes over and then restarting the first RM is more involved and exposes the RM ensemble to a SPOF for a longer duration. It would be nice to allow explicit failover through a yarn rmadmin -failover command. PS: HDFS supports the -failover option.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
[jira] [Commented] (YARN-1104) NMs to support rolling logs of stdout stderr
[ https://issues.apache.org/jira/browse/YARN-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871774#comment-13871774 ]

Zhijie Shen commented on YARN-1104:
-----------------------------------

I agree that for long-living services, stdout and stderr files need to be rotated. However, I'm not sure it's a good idea to wire them to the log, which may mix stdout/stderr output together with log output. If some messages are really considered log, why not write them directly to the log instead of pushing them to stdout/stderr?

NMs to support rolling logs of stdout stderr
--------------------------------------------

Key: YARN-1104
URL: https://issues.apache.org/jira/browse/YARN-1104
Project: Hadoop YARN
Issue Type: Sub-task
Components: nodemanager
Affects Versions: 2.1.0-beta
Reporter: Steve Loughran

Currently NMs stream the stdout and stderr streams of a container to a file. For longer-lived processes those files need to be rotated so that the log doesn't overflow.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
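[Editor's note] For reference, a minimal, framework-agnostic sketch of what size-based rotation of a container's stdout/stderr stream could look like (plain JDK, no YARN API involved; all names invented). The NM could wrap the container's redirected output with something like this, or delegate rotation to a logging framework as the comment above suggests.

{code:java}
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Invented example: caps a file at maxBytes, keeping a single ".1" backup. */
class RollingStream extends OutputStream {
  private final Path file;
  private final long maxBytes;
  private OutputStream out;
  private long written;

  RollingStream(String name, long maxBytes) throws IOException {
    this.file = Paths.get(name);
    this.maxBytes = maxBytes;
    this.out = new FileOutputStream(name, /* append = */ true);
    this.written = Files.size(file);
  }

  @Override public synchronized void write(int b) throws IOException {
    if (written >= maxBytes) {
      roll();
    }
    out.write(b);
    written++;
  }

  // Rename stdout -> stdout.1 (overwriting the previous backup) and reopen.
  private void roll() throws IOException {
    out.close();
    Files.move(file, Paths.get(file + ".1"), StandardCopyOption.REPLACE_EXISTING);
    out = new FileOutputStream(file.toFile());
    written = 0;
  }

  @Override public synchronized void close() throws IOException {
    out.close();
  }
}
{code}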