[jira] [Commented] (YARN-3626) On Windows localized resources are not moved to the front of the classpath when they should be
[ https://issues.apache.org/jira/browse/YARN-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560210#comment-14560210 ] Craig Welch commented on YARN-3626: --- The checkstyle is insignificant, the rest is all good. On Windows localized resources are not moved to the front of the classpath when they should be -- Key: YARN-3626 URL: https://issues.apache.org/jira/browse/YARN-3626 Project: Hadoop YARN Issue Type: Bug Components: yarn Environment: Windows Reporter: Craig Welch Assignee: Craig Welch Fix For: 2.7.1 Attachments: YARN-3626.0.patch, YARN-3626.11.patch, YARN-3626.14.patch, YARN-3626.15.patch, YARN-3626.16.patch, YARN-3626.4.patch, YARN-3626.6.patch, YARN-3626.9.patch In response to the mapreduce.job.user.classpath.first setting, the classpath is ordered differently so that localized resources will appear before system classpath resources when tasks execute. On Windows this does not work because the localized resources are not linked into their final location when the classpath jar is created. To compensate for that, localized jar resources are added directly to the classpath generated for the jar rather than being discovered from the localized directories. Unfortunately, they are always appended to the classpath, and so are never preferred over system resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
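For context, the ordering under discussion is driven by a per-job property. A minimal client-side sketch of enabling it (the job name and the omitted mapper/reducer setup are placeholders, not part of the patch):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class UserClasspathFirstExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Ask MapReduce to place user (localized) jars ahead of the system classpath.
    conf.setBoolean("mapreduce.job.user.classpath.first", true);
    Job job = Job.getInstance(conf, "classpath-order-example");
    // ... set mapper, reducer, input and output paths here, then submit the job.
    System.out.println("user.classpath.first = "
        + job.getConfiguration().getBoolean("mapreduce.job.user.classpath.first", false));
  }
}
{code}
The bug tracked in this JIRA is that, on Windows, this preference is not honored for localized jars because they end up appended to the generated classpath jar.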
[jira] [Updated] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-3721: Attachment: YARN-3721-YARN-2928.001.patch Add an exclusion to resolve the cyclic dependency in timelineserver's pom file. build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560305#comment-14560305 ] Hadoop QA commented on YARN-3721: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 22s | Pre-patch YARN-2928 compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 42s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 43s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 38s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 40s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | yarn tests | 1m 11s | Tests failed in hadoop-yarn-server-timelineservice. | | | | 36m 42s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.timelineservice.storage.TestHBaseTimelineWriterImpl | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735496/YARN-3721-YARN-2928.001.patch | | Optional Tests | javadoc javac unit | | git revision | YARN-2928 / e19566a | | hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8094/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8094/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8094/console | This message was automatically generated. build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3722) Merge multiple TestWebAppUtils
Masatake Iwasaki created YARN-3722: -- Summary: Merge multiple TestWebAppUtils Key: YARN-3722 URL: https://issues.apache.org/jira/browse/YARN-3722 Project: Hadoop YARN Issue Type: Improvement Components: test Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor The tests in {{o.a.h.yarn.util.TestWebAppUtils}} could be moved to {{o.a.h.yarn.webapp.util.TestWebAppUtils}}. WebAppUtils belongs to {{o.a.h.yarn.webapp.util}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3581) Deprecate -directlyAccessNodeLabelStore in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560179#comment-14560179 ] Wangda Tan commented on YARN-3581: -- [~Naganarasimha], thanks for working on this. Some comments: - {{(Deprecated! Support will be removed in future) Directly access node label store, }}: is it better to make it {{(This is DEPRECATED, will be removed in future releases)...}}? - RMAdminCLI puts the deprecation message in the args option instead of in the help. - printHelp in RMAdminCLI should be consistent with the usage? For changes of {{...}} Deprecate -directlyAccessNodeLabelStore in RMAdminCLI - Key: YARN-3581 URL: https://issues.apache.org/jira/browse/YARN-3581 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Attachments: YARN-3581.20150525-1.patch In 2.6.0, we added an option called -directlyAccessNodeLabelStore so that the RM can start with label-configured queue settings. After YARN-2918, we no longer need this option: an admin can configure queue settings, start the RM, and configure node labels via RMAdminCLI without any error. In addition, this option is very restrictive. First, it needs to run on the same node where the RM is running if the admin configured labels to be stored on local disk. Second, when the admin runs the option while the RM is running, multiple processes can write to the same file, which could leave the node label store invalid. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560178#comment-14560178 ] Jian He commented on YARN-41: - I only briefly scanned the patch and found that UnRegisterNodeManagerRequest/Response had better be abstract classes, to be consistent with the rest of the records. The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41-7.patch, YARN-41.patch Instead of waiting for the NM expiry, the RM should remove and handle an NM that is shut down gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3716) Output node label expression in ResourceRequestPBImpl.toString
[ https://issues.apache.org/jira/browse/YARN-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560182#comment-14560182 ] Wangda Tan commented on YARN-3716: -- Patch LGTM, will commit once Jenkins gets back. Output node label expression in ResourceRequestPBImpl.toString -- Key: YARN-3716 URL: https://issues.apache.org/jira/browse/YARN-3716 Project: Hadoop YARN Issue Type: Sub-task Components: api Reporter: Xianyin Xin Assignee: Xianyin Xin Priority: Minor Attachments: YARN-3716.001.patch It's convenient for debugging and log tracing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
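Purely as an illustration of the kind of change being reviewed (this is not the patch; the class and field names below are made up), surfacing the node label expression in a record's toString could look like:
{code:java}
// Hypothetical sketch: print the node label expression along with the other
// request fields, so scheduler logs can be traced per label. Not the actual
// ResourceRequestPBImpl code.
public class RequestInfo {
  private final String resourceName;
  private final int numContainers;
  private final String nodeLabelExpression;

  public RequestInfo(String resourceName, int numContainers, String nodeLabelExpression) {
    this.resourceName = resourceName;
    this.numContainers = numContainers;
    this.nodeLabelExpression = nodeLabelExpression;
  }

  @Override
  public String toString() {
    return "{Location: " + resourceName
        + ", # Containers: " + numContainers
        + ", Node Label Expression: " + nodeLabelExpression + "}";
  }
}
{code}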
[jira] [Commented] (YARN-3700) ATS Web Performance issue at load time when large number of jobs
[ https://issues.apache.org/jira/browse/YARN-3700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560222#comment-14560222 ] Hadoop QA commented on YARN-3700: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 52s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 49s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 55s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 4m 9s | Site still builds. | | {color:red}-1{color} | checkstyle | 2m 9s | The applied patch generated 1 new checkstyle issues (total was 215, now 215). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 28s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 25s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 57s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 3m 6s | Tests passed in hadoop-yarn-server-applicationhistoryservice. | | {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-server-common. | | | | 55m 32s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735457/YARN-3700.3.patch | | Optional Tests | javadoc javac unit findbugs checkstyle site | | git revision | trunk / cdbd66b | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8092/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8092/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8092/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-applicationhistoryservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8092/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt | | hadoop-yarn-server-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8092/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8092/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8092/console | This message was automatically generated. 
ATS Web Performance issue at load time when large number of jobs Key: YARN-3700 URL: https://issues.apache.org/jira/browse/YARN-3700 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager, webapp, yarn Reporter: Xuan Gong Assignee: Xuan Gong Attachments: YARN-3700.1.patch, YARN-3700.2.1.patch, YARN-3700.2.2.patch, YARN-3700.2.patch, YARN-3700.3.patch Currently, we load all the apps when we try to load the yarn timelineservice web page. If we have a large number of jobs, it will be very slow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3467) Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI
[ https://issues.apache.org/jira/browse/YARN-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3467: Attachment: Screen Shot 2015-05-26 at 5.46.54 PM.png Shows the 2 new sortable columns for Allocated memory and cpu Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI --- Key: YARN-3467 URL: https://issues.apache.org/jira/browse/YARN-3467 Project: Hadoop YARN Issue Type: New Feature Components: webapp, yarn Affects Versions: 2.5.0 Reporter: Anthony Rojas Assignee: Anubhav Dhoot Priority: Minor Attachments: ApplicationAttemptPage.png, Screen Shot 2015-05-26 at 5.46.54 PM.png, YARN-3467.001.patch The YARN REST API can report on the following properties: *allocatedMB*: The sum of memory in MB allocated to the application's running containers *allocatedVCores*: The sum of virtual cores allocated to the application's running containers *runningContainers*: The number of containers currently running for the application Currently, the RM Web UI does not report on these items (at least I couldn't find any entries within the Web UI). It would be useful for YARN Application and Resource troubleshooting to have these properties and their corresponding values exposed on the RM WebUI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu reassigned YARN-3721: --- Assignee: Li Lu build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560247#comment-14560247 ] Li Lu commented on YARN-3721: - Hi [~sjlee0], thanks for catching this! Wow, this is a real problem. I can take a look at it. build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3547) FairScheduler: Apps that have no resource demand should not participate scheduling
[ https://issues.apache.org/jira/browse/YARN-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560255#comment-14560255 ] Xianyin Xin commented on YARN-3547: --- Hi [~kasha], [~leftnoteasy], can we reach a consensus, given that the patch is just a simple fix? FairScheduler: Apps that have no resource demand should not participate scheduling -- Key: YARN-3547 URL: https://issues.apache.org/jira/browse/YARN-3547 Project: Hadoop YARN Issue Type: Improvement Components: fairscheduler Reporter: Xianyin Xin Assignee: Xianyin Xin Attachments: YARN-3547.001.patch, YARN-3547.002.patch, YARN-3547.003.patch, YARN-3547.004.patch, YARN-3547.005.patch At present, all of the 'running' apps participate in the scheduling process; however, on a production cluster most of them may have no resource demand, since an app spends most of its lifetime running rather than waiting for resources. It is not wise to sort all of the 'running' apps and try to fulfill them, especially on a large-scale cluster with a heavy scheduling load. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
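To make the proposal concrete, here is a generic sketch (not the actual FairScheduler code; the class and field names are assumptions) of skipping apps with zero pending demand before the scheduler sorts and offers resources:
{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class DemandFilterSketch {
  // Hypothetical model of a running app and its outstanding demand in MB.
  static class AppInfo {
    final String id;
    final long pendingMemoryMb;
    AppInfo(String id, long pendingMemoryMb) {
      this.id = id;
      this.pendingMemoryMb = pendingMemoryMb;
    }
  }

  // Only apps that still want resources are sorted and considered for assignment;
  // apps with no demand are skipped entirely, which is the idea behind the patch.
  static List<AppInfo> schedulableApps(List<AppInfo> runningApps) {
    List<AppInfo> withDemand = new ArrayList<>();
    for (AppInfo app : runningApps) {
      if (app.pendingMemoryMb > 0) {
        withDemand.add(app);
      }
    }
    // Stand-in for the scheduler's real ordering policy (e.g. fair-share order).
    withDemand.sort(Comparator.comparingLong((AppInfo a) -> -a.pendingMemoryMb));
    return withDemand;
  }
}
{code}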
[jira] [Updated] (YARN-3722) Merge multiple TestWebAppUtils
[ https://issues.apache.org/jira/browse/YARN-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3722: --- Attachment: YARN-3722.001.patch Merge multiple TestWebAppUtils -- Key: YARN-3722 URL: https://issues.apache.org/jira/browse/YARN-3722 Project: Hadoop YARN Issue Type: Improvement Components: test Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3722.001.patch The tests in {{o.a.h.yarn.util.TestWebAppUtils}} could be moved to {{o.a.h.yarn.webapp.util.TestWebAppUtils}}. WebAppUtils belongs to {{o.a.h.yarn.webapp.util}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3722) Merge multiple TestWebAppUtils
[ https://issues.apache.org/jira/browse/YARN-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560321#comment-14560321 ] Hadoop QA commented on YARN-3722: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 5m 13s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 28s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 29s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 23s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 1m 55s | Tests passed in hadoop-yarn-common. | | | | 18m 56s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735502/YARN-3722.001.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / cdbd66b | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8095/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8095/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8095/console | This message was automatically generated. Merge multiple TestWebAppUtils -- Key: YARN-3722 URL: https://issues.apache.org/jira/browse/YARN-3722 Project: Hadoop YARN Issue Type: Improvement Components: test Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3722.001.patch The tests in {{o.a.h.yarn.util.TestWebAppUtils}} could be moved to {{o.a.h.yarn.webapp.util.TestWebAppUtils}}. WebAppUtils belongs to {{o.a.h.yarn.webapp.util}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3682) Decouple PID-file management from ContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-3682: -- Attachment: YARN-3682-20150526.1.txt Updated patch. Apparently, I already forgot how to write code that compiles. Decouple PID-file management from ContainerExecutor --- Key: YARN-3682 URL: https://issues.apache.org/jira/browse/YARN-3682 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: YARN-3682-20150526.1.txt, YARN-3682-20150526.txt The PID-files management currently present in ContainerExecutor really doesn't belong there. I know the original history of why we added it; that was about the only right place to put it in at that point of time. Given the evolution of executors for Windows etc., the ContainerExecutor is getting more complicated than is necessary. We should pull the PID-file management into its own entity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS
[ https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560188#comment-14560188 ] Hudson commented on YARN-160: - SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #209 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/209/]) YARN-160. Enhanced NodeManager to automatically obtain cpu/memory values from underlying OS when configured to do so. Contributed by Varun Vasudev. (vinodkv: rev 500a1d9c76ec612b4e737888f4be79951c11591d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/LinuxResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestCgroupsLCEResourcesHandler.java * hadoop-tools/hadoop-gridmix/src/test/java/org/apache/hadoop/mapred/gridmix/DummyResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/NodeManagerHardwareUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerExecutor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestLinuxResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/CgroupsLCEResourcesHandler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestNodeManagerHardwareUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ResourceCalculatorPlugin.java nodemanagers should obtain cpu/memory values from underlying OS --- Key: YARN-160 URL: https://issues.apache.org/jira/browse/YARN-160 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.3-alpha Reporter: Alejandro Abdelnur Assignee: Varun Vasudev Labels: BB2015-05-TBR Fix For: 2.8.0 Attachments: YARN-160.005.patch, YARN-160.006.patch, YARN-160.007.patch, YARN-160.008.patch, apache-yarn-160.0.patch, apache-yarn-160.1.patch, apache-yarn-160.2.patch, apache-yarn-160.3.patch As mentioned in YARN-2 *NM memory and CPU configs* Currently these values are coming from the config of the NM, we should be able to obtain those values from the OS (ie, in the case of Linux from /proc/meminfo /proc/cpuinfo). 
As this is highly OS dependent, we should have an interface that obtains this information. In addition, implementations of this interface should be able to specify a mem/cpu offset (the amount of mem/cpu not to be made available as YARN resources); this would allow reserving mem/cpu for the OS and other services outside of YARN containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
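The description asks for an interface plus a reservable offset. A minimal, generic illustration of that shape (not the actual ResourceCalculatorPlugin / NodeManagerHardwareUtils code; all names are assumptions):
{code:java}
// Hypothetical sketch: probe the OS for hardware resources, then subtract an
// offset reserved for the OS and other daemons before advertising to YARN.
public class NodeResourceSketch {
  interface ResourceProbe {
    long physicalMemoryBytes(); // e.g. parsed from /proc/meminfo on Linux
    int numProcessors();        // e.g. parsed from /proc/cpuinfo on Linux
  }

  static long yarnMemoryBytes(ResourceProbe probe, long reservedForSystemBytes) {
    return Math.max(0L, probe.physicalMemoryBytes() - reservedForSystemBytes);
  }

  static int yarnVcores(ResourceProbe probe, int reservedForSystemCores) {
    return Math.max(1, probe.numProcessors() - reservedForSystemCores);
  }
}
{code}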
[jira] [Commented] (YARN-3632) Ordering policy should be allowed to reorder an application when demand changes
[ https://issues.apache.org/jira/browse/YARN-3632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560190#comment-14560190 ] Hudson commented on YARN-3632: -- SUCCESS: Integrated in Hadoop-Yarn-trunk-Java8 #209 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/209/]) YARN-3632. Ordering policy should be allowed to reorder an application when demand changes. Contributed by Craig Welch (jianhe: rev 10732d515f62258309f98e4d7d23249f80b1847d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/FifoOrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/AbstractComparatorOrderingPolicy.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/OrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/policy/FairOrderingPolicy.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java Ordering policy should be allowed to reorder an application when demand changes --- Key: YARN-3632 URL: https://issues.apache.org/jira/browse/YARN-3632 Project: Hadoop YARN Issue Type: Sub-task Components: capacityscheduler Reporter: Craig Welch Assignee: Craig Welch Fix For: 2.8.0 Attachments: YARN-3632.0.patch, YARN-3632.1.patch, YARN-3632.3.patch, YARN-3632.4.patch, YARN-3632.5.patch, YARN-3632.6.patch, YARN-3632.7.patch At present, ordering policies have the option to have an application re-ordered (for allocation and preemption) when it is allocated to or a container is recovered from the application. Some ordering policies may also need to reorder when demand changes if that is part of the ordering comparison, this needs to be made available (and used by the fairorderingpolicy when sizebasedweight is true) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
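A generic sketch of why an explicit re-order hook is needed (this is not the CapacityScheduler/OrderingPolicy code; names are assumptions): comparator-backed collections such as TreeSet do not re-sort in place, so when a field used by the comparator, here the demand, changes, the entity must be removed and re-inserted:
{code:java}
import java.util.TreeSet;

public class ReorderOnDemandChangeSketch {
  static class App {
    final String id;
    long demandMb;
    App(String id, long demandMb) { this.id = id; this.demandMb = demandMb; }
  }

  // Order apps by current demand (size-based-weight style), breaking ties by id.
  private final TreeSet<App> order = new TreeSet<>((a, b) -> {
    int byDemand = Long.compare(a.demandMb, b.demandMb);
    return byDemand != 0 ? byDemand : a.id.compareTo(b.id);
  });

  void add(App app) { order.add(app); }

  // Remove before mutating the sort key, then re-insert so the order stays valid.
  void demandUpdated(App app, long newDemandMb) {
    order.remove(app);
    app.demandMb = newDemandMb;
    order.add(app);
  }
}
{code}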
[jira] [Commented] (YARN-3716) Output node label expression in ResourceRequestPBImpl.toString
[ https://issues.apache.org/jira/browse/YARN-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560224#comment-14560224 ] Xianyin Xin commented on YARN-3716: --- Thanks, [~leftnoteasy]. Output node label expression in ResourceRequestPBImpl.toString -- Key: YARN-3716 URL: https://issues.apache.org/jira/browse/YARN-3716 Project: Hadoop YARN Issue Type: Sub-task Components: api Reporter: Xianyin Xin Assignee: Xianyin Xin Priority: Minor Attachments: YARN-3716.001.patch It's convenient for debugging and log tracing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560234#comment-14560234 ] Sangjin Lee commented on YARN-3721: --- The error message: {panel} Failed to execute goal on project hadoop-yarn-server-timelineservice: Could not resolve dependencies for project org.apache.hadoop:hadoop-yarn-server-timelineservice:jar:3.0.0-SNAPSHOT: Failure to find org.apache.hadoop:hadoop-yarn-server-timelineservice:jar:3.0.0-SNAPSHOT in https://repository.apache.org/content/repositories/snapshots was cached in the local repository, resolution will not be reattempted until the update interval of apache.snapshots.https has elapsed or updates are forced {panel} The dependency cycle is introduced by hbase testing util. It has a transitive dependency on timelineservice (test) itself!
{noformat}
org.apache.hbase:hbase-testing-util:jar:1.0.1:test
org.apache.hbase:hbase-common:jar:tests:1.0.1:runtime
org.apache.hbase:hbase-annotations:jar:tests:1.0.1:test
org.apache.hbase:hbase-hadoop-compat:jar:tests:1.0.1:test
org.apache.hbase:hbase-hadoop2-compat:jar:tests:1.0.1:test
org.apache.hadoop:hadoop-client:jar:3.0.0-SNAPSHOT:compile (version managed from 2.5.1 by org.apache.hadoop:hadoop-project:3.0.0-SNAPSHOT)
org.apache.hadoop:hadoop-mapreduce-client-app:jar:3.0.0-SNAPSHOT:compile
org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:3.0.0-SNAPSHOT:compile (version managed from 2.5.1 by org.apache.hadoop:hadoop-project:3.0.0-SNAPSHOT)
org.apache.hadoop:hadoop-mapreduce-client-common:jar:3.0.0-SNAPSHOT:compile
org.apache.hadoop:hadoop-yarn-client:jar:3.0.0-SNAPSHOT:compile
org.apache.hadoop:hadoop-mapreduce-client-shuffle:jar:3.0.0-SNAPSHOT:compile
org.apache.hadoop:hadoop-yarn-server-nodemanager:jar:3.0.0-SNAPSHOT:compile
org.apache.hadoop:hadoop-minicluster:jar:3.0.0-SNAPSHOT:test (version managed from 2.5.1 by org.apache.hadoop:hadoop-project:3.0.0-SNAPSHOT)
org.apache.hadoop:hadoop-yarn-server-tests:jar:tests:3.0.0-SNAPSHOT:test
org.apache.hadoop:hadoop-yarn-server-resourcemanager:jar:3.0.0-SNAPSHOT:test
org.apache.hadoop:hadoop-yarn-server-web-proxy:jar:3.0.0-SNAPSHOT:test
org.apache.zookeeper:zookeeper:jar:tests:3.4.6:test
org.apache.hadoop:hadoop-yarn-server-timelineservice:jar:3.0.0-SNAPSHOT:test
org.apache.hadoop:hadoop-mapreduce-client-jobclient:jar:tests:3.0.0-SNAPSHOT:test
org.apache.hadoop:hadoop-mapreduce-client-hs:jar:3.0.0-SNAPSHOT:test
{noformat}
build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Priority: Blocker The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3652) A SchedulerMetrics may be needed for evaluating the scheduler's performance
[ https://issues.apache.org/jira/browse/YARN-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560252#comment-14560252 ] Xianyin Xin commented on YARN-3652: --- Thanks [~vinodkv]. When I said YARN-3293 and {{SchedulerMetrics}} are similar, I meant the two are similar in functional design, and it had not been implemented yet at that time. A simple {{SchedulerMetrics}} was introduced in YARN-3630, where a {{#ofWaitingSchedulerEvent}} metric was used to evaluate the load of the scheduler. [~vvasudev], looking forward to your thoughts. :) A SchedulerMetrics may be needed for evaluating the scheduler's performance - Key: YARN-3652 URL: https://issues.apache.org/jira/browse/YARN-3652 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager, scheduler Reporter: Xianyin Xin As discussed in YARN-3630, a {{SchedulerMetrics}} may be needed for evaluating the scheduler's performance. The performance indicators include the number of events waiting to be handled by the scheduler, the throughput, the scheduling delay, and/or other measures. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
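As a rough, generic illustration of the metric being discussed (this is not the YARN-3630 code, and the Hadoop metrics2 wiring is omitted; all names are made up), a scheduler-load gauge over the pending event queue could look like:
{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of scheduler-load metrics: events waiting to be handled
// (the "#ofWaitingSchedulerEvent" idea) plus a simple throughput counter.
public class SchedulerMetricsSketch {
  private final BlockingQueue<Runnable> eventQueue = new LinkedBlockingQueue<>();
  private final AtomicLong handledEvents = new AtomicLong();

  public int waitingSchedulerEvents() {
    return eventQueue.size();
  }

  public long handledSchedulerEvents() {
    return handledEvents.get();
  }

  public void offer(Runnable event) {
    eventQueue.offer(event);
  }

  public void handleOne() throws InterruptedException {
    eventQueue.take().run();
    handledEvents.incrementAndGet();
  }
}
{code}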
[jira] [Commented] (YARN-3685) NodeManager unnecessarily knows about classpath-jars due to Windows limitations
[ https://issues.apache.org/jira/browse/YARN-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560262#comment-14560262 ] Vinod Kumar Vavilapalli commented on YARN-3685: --- bq. This was true of Linux even before YARN-316, so in that sense, YARN did already have some classpath logic indirectly. bq. I was thinking of stuff like yarn.application.classpath, where values are defined in terms of things like the HADOOP_YARN_HOME and HADOOP_COMMON_HOME environment variables, and those values might not match the file system layout at the client side. Hm.. YARN_APPLICATION_CLASSPATH is a simple convenience configuration property that the server *does not* load, but which is used by applications like distributed-shell. And yeah, this convenience property was never assumed to work with variable installation layouts. Increasingly our apps are being migrated to a distributed-cache based deployment so as to avoid the layout issue, so in sum YARN_APPLICATION_CLASSPATH is essentially unused. NodeManager unnecessarily knows about classpath-jars due to Windows limitations --- Key: YARN-3685 URL: https://issues.apache.org/jira/browse/YARN-3685 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Vinod Kumar Vavilapalli Found this while looking at cleaning up ContainerExecutor via YARN-3648, making it a sub-task. YARN *should not* know about classpaths. Our original design was modeled around this. But when we added Windows support, due to classpath issues, we ended up breaking this abstraction via YARN-316. We should clean this up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
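For reference, the property being discussed is read through public YarnConfiguration constants; a small sketch of standard client-side usage, shown only to illustrate what the comment above calls a convenience property:
{code:java}
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class AppClasspathExample {
  public static void main(String[] args) {
    YarnConfiguration conf = new YarnConfiguration();
    // yarn.application.classpath, falling back to the shipped defaults; the values
    // are typically expressed via $HADOOP_COMMON_HOME, $HADOOP_YARN_HOME, etc.,
    // which is exactly why they may not match the client's file system layout.
    String[] classpathEntries = conf.getStrings(
        YarnConfiguration.YARN_APPLICATION_CLASSPATH,
        YarnConfiguration.DEFAULT_YARN_APPLICATION_CLASSPATH);
    for (String entry : classpathEntries) {
      System.out.println(entry);
    }
  }
}
{code}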
[jira] [Commented] (YARN-3581) Deprecate -directlyAccessNodeLabelStore in RMAdminCLI
[ https://issues.apache.org/jira/browse/YARN-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560216#comment-14560216 ] Naganarasimha G R commented on YARN-3581: - Hi [~wangda], regarding the deprecation message as an argument: {{-removeFromClusterNodeLabels}} had a comment {{(label splitted by ,)}}, so I thought the important info could be shown in this way. I will move it to the description. One more thing: shall I totally remove the description and have only this Deprecated message, so that no one will use the option? The others will be corrected. Deprecate -directlyAccessNodeLabelStore in RMAdminCLI - Key: YARN-3581 URL: https://issues.apache.org/jira/browse/YARN-3581 Project: Hadoop YARN Issue Type: Sub-task Components: api, client, resourcemanager Reporter: Wangda Tan Assignee: Naganarasimha G R Attachments: YARN-3581.20150525-1.patch In 2.6.0, we added an option called -directlyAccessNodeLabelStore so that the RM can start with label-configured queue settings. After YARN-2918, we no longer need this option: an admin can configure queue settings, start the RM, and configure node labels via RMAdminCLI without any error. In addition, this option is very restrictive. First, it needs to run on the same node where the RM is running if the admin configured labels to be stored on local disk. Second, when the admin runs the option while the RM is running, multiple processes can write to the same file, which could leave the node label store invalid. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3467) Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI
[ https://issues.apache.org/jira/browse/YARN-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-3467: Attachment: YARN-3467.001.patch Changes show allocated CPU and memory on the Applications page Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI --- Key: YARN-3467 URL: https://issues.apache.org/jira/browse/YARN-3467 Project: Hadoop YARN Issue Type: New Feature Components: webapp, yarn Affects Versions: 2.5.0 Reporter: Anthony Rojas Assignee: Anubhav Dhoot Priority: Minor Attachments: ApplicationAttemptPage.png, YARN-3467.001.patch The YARN REST API can report on the following properties: *allocatedMB*: The sum of memory in MB allocated to the application's running containers *allocatedVCores*: The sum of virtual cores allocated to the application's running containers *runningContainers*: The number of containers currently running for the application Currently, the RM Web UI does not report on these items (at least I couldn't find any entries within the Web UI). It would be useful for YARN Application and Resource troubleshooting to have these properties and their corresponding values exposed on the RM WebUI. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
Sangjin Lee created YARN-3721: - Summary: build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Priority: Blocker The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560352#comment-14560352 ] Li Lu commented on YARN-3721: - It seems the HBase UT is also failing on the YARN-2928 branch. [~vrushalic], would you please take a look at it? The UT failure appears to be unrelated to the changes in this patch (the maven failure is gone and the mini-hbase cluster has been successfully launched). build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560373#comment-14560373 ] Sangjin Lee commented on YARN-3721: --- Thanks for the quick patch [~gtCarrera9]! So we don't need the hadoop mini-cluster part of the dependency from hbase-testing-util at all? Could you elaborate how that still works with the mini-HBase cluster? That might help us make the dependency clearer (or more explicit). build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3685) NodeManager unnecessarily knows about classpath-jars due to Windows limitations
[ https://issues.apache.org/jira/browse/YARN-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560391#comment-14560391 ] Chris Nauroth commented on YARN-3685: - bq. YARN_APPLICATION_CLASSPATH is essentially unused. In that case, this is definitely worth revisiting as part of this issue. Perhaps it's not a problem anymore. This had been used in the past, as seen in bug reports like YARN-1138. NodeManager unnecessarily knows about classpath-jars due to Windows limitations --- Key: YARN-3685 URL: https://issues.apache.org/jira/browse/YARN-3685 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Vinod Kumar Vavilapalli Found this while looking at cleaning up ContainerExecutor via YARN-3648, making it a sub-task. YARN *should not* know about classpaths. Our original design was modeled around this. But when we added Windows support, due to classpath issues, we ended up breaking this abstraction via YARN-316. We should clean this up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560416#comment-14560416 ] Zhijie Shen commented on YARN-3044: --- Naga, sorry for the late reply. The new patch looks much better to me, but I am still concerned about the following change:
{code}
@Override
public Dispatcher getDispatcher() {
  Dispatcher dispatcher = null;

  if (publishContainerMetrics) {
    dispatcher = super.getDispatcher();
  } else {
    // Normal dispatcher is sufficient if container metrics are not required
    // to be published
    dispatcher = new AsyncDispatcher();
  }
  return dispatcher;
}
{code}
I think it's better to retain the multiple dispatchers, which is more flexible for different scales. We can change how many threads we need via configuration. Routing an event to one dispatcher takes constant time in the current multiple-dispatcher implementation. Thoughts? [Event producers] Implement RM writing app lifecycle events to ATS -- Key: YARN-3044 URL: https://issues.apache.org/jira/browse/YARN-3044 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Naganarasimha G R Attachments: YARN-3044-YARN-2928.004.patch, YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, YARN-3044-YARN-2928.007.patch, YARN-3044-YARN-2928.008.patch, YARN-3044.20150325-1.patch, YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch Per design in YARN-2928, implement RM writing app lifecycle events to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
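To make the trade-off concrete, here is a generic sketch of the multi-dispatcher idea (not the actual RM/ATS code; the structure and names are assumptions): events are hashed to one of N single-threaded dispatchers, so routing stays constant time, per-key ordering is preserved, and the thread count can be driven by configuration:
{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: route each event to one of N single-threaded "dispatchers"
// by a key hash. Routing is O(1) and events for the same key stay ordered.
public class MultiDispatcherSketch {
  private final ExecutorService[] dispatchers;

  public MultiDispatcherSketch(int numDispatchers) {
    dispatchers = new ExecutorService[numDispatchers];
    for (int i = 0; i < numDispatchers; i++) {
      dispatchers[i] = Executors.newSingleThreadExecutor();
    }
  }

  public void dispatch(String key, Runnable event) {
    int idx = (key.hashCode() & Integer.MAX_VALUE) % dispatchers.length;
    dispatchers[idx].execute(event);
  }

  public void stop() {
    for (ExecutorService dispatcher : dispatchers) {
      dispatcher.shutdown();
    }
  }
}
{code}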
[jira] [Assigned] (YARN-3720) Need comprehensive documentation for configuring CPU/memory resources on NodeManager
[ https://issues.apache.org/jira/browse/YARN-3720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev reassigned YARN-3720: --- Assignee: Varun Vasudev Need comprehensive documentation for configuring CPU/memory resources on NodeManager -- Key: YARN-3720 URL: https://issues.apache.org/jira/browse/YARN-3720 Project: Hadoop YARN Issue Type: Task Components: documentation, nodemanager Reporter: Vinod Kumar Vavilapalli Assignee: Varun Vasudev Things are getting more and more complex after the likes of YARN-160. We need a document explaining how to configure cpu/memory values on a NodeManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3703) Container Launch fails with exitcode 2 with DefaultContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560443#comment-14560443 ] Devaraj K commented on YARN-3703: - I lost the app logs for this issue when it occurred, trying to reproduce this. I am closing it now, will reopen this issue once I get the logs and still feel it an issue. Thanks. Container Launch fails with exitcode 2 with DefaultContainerExecutor Key: YARN-3703 URL: https://issues.apache.org/jira/browse/YARN-3703 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.7.0 Reporter: Devaraj K Priority: Minor Please find the below NM log when the issue occurs. {code:xml} 2015-05-21 20:14:53,907 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1432208816246_0225_01_34 is : 2 2015-05-21 20:14:53,908 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1432208816246_0225_01_34 and exit code: 2 ExitCodeException exitCode=2: at org.apache.hadoop.util.Shell.runCommand(Shell.java:545) at org.apache.hadoop.util.Shell.run(Shell.java:456) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from container-launch. 
2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: container_1432208816246_0225_01_34 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 2 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Stack trace: ExitCodeException exitCode=2: 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.runCommand(Shell.java:545) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.run(Shell.java:456) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.FutureTask.run(FutureTask.java:262) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.lang.Thread.run(Thread.java:745) 2015-05-21 20:14:53,910 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container exited with a non-zero exit code 2 2015-05-21 20:14:53,911 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1432208816246_0225_01_34 transitioned from RUNNING to EXITED_WITH_FAILURE 2015-05-21 20:14:53,911
[jira] [Resolved] (YARN-3703) Container Launch fails with exitcode 2 with DefaultContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K resolved YARN-3703. - Resolution: Not A Problem Container Launch fails with exitcode 2 with DefaultContainerExecutor Key: YARN-3703 URL: https://issues.apache.org/jira/browse/YARN-3703 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 3.0.0, 2.7.0 Reporter: Devaraj K Priority: Minor Please find the below NM log when the issue occurs. {code:xml} 2015-05-21 20:14:53,907 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exit code from container container_1432208816246_0225_01_34 is : 2 2015-05-21 20:14:53,908 WARN org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Exception from container-launch with container ID: container_1432208816246_0225_01_34 and exit code: 2 ExitCodeException exitCode=2: at org.apache.hadoop.util.Shell.runCommand(Shell.java:545) at org.apache.hadoop.util.Shell.run(Shell.java:456) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exception from container-launch. 
2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Container id: container_1432208816246_0225_01_34 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Exit code: 2 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: Stack trace: ExitCodeException exitCode=2: 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.runCommand(Shell.java:545) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell.run(Shell.java:456) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.FutureTask.run(FutureTask.java:262) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 2015-05-21 20:14:53,910 INFO org.apache.hadoop.yarn.server.nodemanager.ContainerExecutor: at java.lang.Thread.run(Thread.java:745) 2015-05-21 20:14:53,910 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container exited with a non-zero exit code 2 2015-05-21 20:14:53,911 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1432208816246_0225_01_34 transitioned from RUNNING to EXITED_WITH_FAILURE 2015-05-21 20:14:53,911 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_1432208816246_0225_01_34 {code} -- This message was sent by Atlassian
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560415#comment-14560415 ] Li Lu commented on YARN-3721: - Also, [~sjlee0], I think I've resolved the problem, but would you please help me verify that my patch actually resolves exactly the problem you raised? Thanks! build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3682) Decouple PID-file management from ContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560369#comment-14560369 ] Hadoop QA commented on YARN-3682: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | patch | 0m 1s | The patch file was not named according to hadoop's naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute for instructions. | | {color:blue}0{color} | pre-patch | 14m 51s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 6 new or modified test files. | | {color:green}+1{color} | javac | 7m 33s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 33s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 19s | The applied patch generated 1 release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 46s | The applied patch generated 7 new checkstyle issues (total was 295, now 298). | | {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 2s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 6m 2s | Tests failed in hadoop-yarn-server-nodemanager. | | | | 42m 19s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.nodemanager.TestContainerManagerWithLCE | | | hadoop.yarn.server.nodemanager.containermanager.container.TestContainer | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735508/YARN-3682-20150526.1.txt | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / cdbd66b | | Release Audit | https://builds.apache.org/job/PreCommit-YARN-Build/8096/artifact/patchprocess/patchReleaseAuditProblems.txt | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8096/artifact/patchprocess/diffcheckstylehadoop-yarn-server-nodemanager.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8096/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8096/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8096/console | This message was automatically generated. Decouple PID-file management from ContainerExecutor --- Key: YARN-3682 URL: https://issues.apache.org/jira/browse/YARN-3682 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Attachments: YARN-3682-20150526.1.txt, YARN-3682-20150526.txt The PID-files management currently present in ContainerExecutor really doesn't belong there. I know the original history of why we added it, that was about the only right place to put it in at that point of time. 
Given the evolution of executors for Windows and other platforms, the ContainerExecutor is getting more complicated than necessary. We should pull the PID-file management out into its own entity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3721) build is broken on YARN-2928 branch due to possible dependency cycle
[ https://issues.apache.org/jira/browse/YARN-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560406#comment-14560406 ] Li Lu commented on YARN-3721: - Hi [~sjlee0], I'm not 100% sure, but at least on my local machine Maven is not just complaining about a cyclic dependency. The direct cause of the failure is "Failure to find org.apache.hadoop:hadoop-yarn-server-timelineservice:jar:3.0.0-SNAPSHOT". I suspect this is because we have not published anything from the YARN-2928 branch to Apache's snapshot server. Previously, if local builds were cached, hadoop-yarn-server-timelineservice was available for subsequent builds; however, if timelineservice is not in the cache, Maven cannot find it on the snapshot server. Of course, the root cause of this problem is the cyclic dependency from timeline-service to hbase-test-util to the mini Hadoop cluster and back to timeline-service itself. We can exclude the compile-time dependency from hbase-test-util on the mini Hadoop cluster, because the mini Hadoop cluster is available for tests, so I don't think we need to enforce it statically. This is only my hunch; I'm not a Maven expert, so I'd truly appreciate more analysis. Thanks! build is broken on YARN-2928 branch due to possible dependency cycle Key: YARN-3721 URL: https://issues.apache.org/jira/browse/YARN-3721 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Sangjin Lee Assignee: Li Lu Priority: Blocker Attachments: YARN-3721-YARN-2928.001.patch The build is broken on the YARN-2928 branch at the hadoop-yarn-server-timelineservice module. It's been broken for a while, but we didn't notice it because the build happens to work despite this if the maven local cache is not cleared. To reproduce, remove all hadoop (3.0.0-SNAPSHOT) artifacts from your maven local cache and build it. Almost certainly it was introduced by YARN-3529. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3585) NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled
[ https://issues.apache.org/jira/browse/YARN-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560410#comment-14560410 ] Rohith commented on YARN-3585: -- I will test the YARN-3641 fix against this JIRA's scenario. About the patch, I think calling System.exit() explicitly after the shutdown thread exits is one option. NodeManager cannot exit on SHUTDOWN event triggered and NM recovery is enabled -- Key: YARN-3585 URL: https://issues.apache.org/jira/browse/YARN-3585 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.6.0 Reporter: Peng Zhang Priority: Critical With NM recovery enabled, after decommission the NodeManager log shows it stopping, but the process cannot end. Non-daemon threads:
{noformat}
DestroyJavaVM prio=10 tid=0x7f3460011800 nid=0x29ec waiting on condition [0x]
leveldb prio=10 tid=0x7f3354001800 nid=0x2a97 runnable [0x]
VM Thread prio=10 tid=0x7f3460167000 nid=0x29f8 runnable
Gang worker#0 (Parallel GC Threads) prio=10 tid=0x7f346002 nid=0x29ed runnable
Gang worker#1 (Parallel GC Threads) prio=10 tid=0x7f3460022000 nid=0x29ee runnable
Gang worker#2 (Parallel GC Threads) prio=10 tid=0x7f3460024000 nid=0x29ef runnable
Gang worker#3 (Parallel GC Threads) prio=10 tid=0x7f3460025800 nid=0x29f0 runnable
Gang worker#4 (Parallel GC Threads) prio=10 tid=0x7f3460027800 nid=0x29f1 runnable
Gang worker#5 (Parallel GC Threads) prio=10 tid=0x7f3460029000 nid=0x29f2 runnable
Gang worker#6 (Parallel GC Threads) prio=10 tid=0x7f346002b000 nid=0x29f3 runnable
Gang worker#7 (Parallel GC Threads) prio=10 tid=0x7f346002d000 nid=0x29f4 runnable
Concurrent Mark-Sweep GC Thread prio=10 tid=0x7f3460120800 nid=0x29f7 runnable
Gang worker#0 (Parallel CMS Threads) prio=10 tid=0x7f346011c800 nid=0x29f5 runnable
Gang worker#1 (Parallel CMS Threads) prio=10 tid=0x7f346011e800 nid=0x29f6 runnable
VM Periodic Task Thread prio=10 tid=0x7f346019f800 nid=0x2a01 waiting on condition
{noformat}
and the JNI leveldb thread stack:
{noformat}
Thread 12 (Thread 0x7f33dd842700 (LWP 10903)):
#0 0x003d8340b43c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x7f33dfce2a3b in leveldb::(anonymous namespace)::PosixEnv::BGThreadWrapper(void*) () from /tmp/libleveldbjni-64-1-6922178968300745716.8
#2 0x003d83407851 in start_thread () from /lib64/libpthread.so.0
#3 0x003d830e811d in clone () from /lib64/libc.so.6
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
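A minimal sketch of the option mentioned in the comment above: run the shutdown work, wait for it, then exit the JVM explicitly so a lingering non-daemon thread created by native code (such as the leveldb JNI background thread in the dump) cannot keep the process alive. The class and thread names are illustrative, not the actual NodeManager code.
{code}
// Illustrative only: wait for the shutdown work to finish, then exit explicitly.
public class ForcedExitSketch {

  public static void main(String[] args) throws InterruptedException {
    Thread shutdownWorker = new Thread(new Runnable() {
      @Override
      public void run() {
        // In the real NM this is where services and the recovery store would be stopped.
        System.out.println("services stopped");
      }
    }, "nm-shutdown");

    shutdownWorker.start();
    shutdownWorker.join(); // wait until the shutdown thread has finished

    // Without this explicit call, a non-daemon JNI thread can block JVM exit.
    System.exit(0);
  }
}
{code}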
[jira] [Updated] (YARN-3711) Documentation of ResourceManager HA should explain about webapp address configuration
[ https://issues.apache.org/jira/browse/YARN-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3711: --- Description: Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. (was: Proper URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._node-id_}} and/or {{yarn.resourcemanager.webapp.https.address._node-id_}} if RM-HA is enabled.) Documentation of ResourceManager HA should explain about webapp address configuration - Key: YARN-3711 URL: https://issues.apache.org/jira/browse/YARN-3711 Project: Hadoop YARN Issue Type: Improvement Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure
[ https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558727#comment-14558727 ] zhihai xu commented on YARN-3591: - Yes, I think we can get newErrorDirs and newRepairedDirs by comparing {{postCheckOtherDirs}} and {{preCheckOtherErrorDirs}} in {{DirectoryCollection#checkDirs}}. Can we use {{String}} to store {{DirectoryCollection#errorDirs}} in the statestore, similar to {{storeContainerDiagnostics}}? Resource Localisation on a bad disk causes subsequent containers failure - Key: YARN-3591 URL: https://issues.apache.org/jira/browse/YARN-3591 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Lavkesh Lahngir Assignee: Lavkesh Lahngir Attachments: 0001-YARN-3591.1.patch, 0001-YARN-3591.patch, YARN-3591.2.patch, YARN-3591.3.patch, YARN-3591.4.patch This happens when a resource is localised on a disk and, after localisation, that disk goes bad. The NM keeps paths for localised resources in memory. At the time of a resource request, isResourcePresent(rsrc) is called, which calls file.exists() on the localised path. In some cases when the disk has gone bad, inodes are still cached and file.exists() returns true, but when it is read the file will not open. Note: file.exists() actually calls stat64 natively, which returns true because it was able to find inode information from the OS. A proposal is to call file.list() on the parent path of the resource, which will call open() natively. If the disk is good it should return an array of paths with length at least 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
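The proposal in the description can be illustrated with a small, self-contained sketch using plain {{java.io.File}}; the class and method names are made up for illustration and are not the actual NM localization code.
{code}
import java.io.File;

public class LocalizedResourceCheckSketch {

  // file.exists() may report true from cached inode metadata even on a bad disk;
  // listing the parent directory forces a native open() and is a stronger check.
  static boolean isResourceReadable(File localizedPath) {
    File parent = localizedPath.getParentFile();
    if (parent == null) {
      return false;
    }
    String[] entries = parent.list(); // null if the directory cannot be read
    if (entries == null || entries.length < 1) {
      return false;
    }
    for (String name : entries) {
      if (name.equals(localizedPath.getName())) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    File resource = new File(args.length > 0 ? args[0] : "/tmp/example-resource");
    System.out.println("readable: " + isResourceReadable(resource));
  }
}
{code}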
[jira] [Commented] (YARN-1772) Fair Scheduler documentation should indicate that admin ACLs also give submit permissions
[ https://issues.apache.org/jira/browse/YARN-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558800#comment-14558800 ] Darrell Taylor commented on YARN-1772: -- I'm just reading through FairScheduler.md and found the following on line 197 : {quote} Anybody who may administer a queue may also submit applications to it. {quote} Does it need to be made clearer, or is everybody happy that covers it? Fair Scheduler documentation should indicate that admin ACLs also give submit permissions - Key: YARN-1772 URL: https://issues.apache.org/jira/browse/YARN-1772 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Reporter: Sandy Ryza Priority: Minor Labels: newbie I can submit to a Fair Scheduler queue if I'm in the submit ACL OR if I'm in the administer ACL. The Fair Scheduler docs seem to leave out the second part. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3714) AM proxy filter can not get proper default proxy address if RM-HA is enabled
Masatake Iwasaki created YARN-3714: -- Summary: AM proxy filter can not get proper default proxy address if RM-HA is enabled Key: YARN-3714 URL: https://issues.apache.org/jira/browse/YARN-3714 Project: Hadoop YARN Issue Type: Bug Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3217) Remove httpclient dependency from hadoop-yarn-server-web-proxy
[ https://issues.apache.org/jira/browse/YARN-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated YARN-3217: Release Note: Removed commons-httpclient dependency from hadoop-yarn-server-web-proxy module. Hadoop Flags: Incompatible change,Reviewed (was: Reviewed) Remove httpclient dependency from hadoop-yarn-server-web-proxy -- Key: YARN-3217 URL: https://issues.apache.org/jira/browse/YARN-3217 Project: Hadoop YARN Issue Type: Task Affects Versions: 2.6.0 Reporter: Akira AJISAKA Assignee: Brahma Reddy Battula Fix For: 2.7.0 Attachments: YARN-3217-002.patch, YARN-3217-003.patch, YARN-3217-003.patch, YARN-3217-004.patch, YARN-3217.patch Sub-task of HADOOP-10105. Remove httpclient dependency from WebAppProxyServlet.java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3127) Avoid timeline events during RM recovery or restart
[ https://issues.apache.org/jira/browse/YARN-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-3127: Description: 1.Start RM with HA and ATS configured and run some yarn applications 2.Once applications are finished sucessfully start timeline server 3.Now failover HA form active to standby 4.Access timeline server URL IP:PORT/applicationhistory //Note Earlier exception was thrown when accessed. Incomplete information is shown in the ATS web UI. i.e. attempt container and other information is not displayed. Also even if timeline server is started with RM, and on RM restart/ recovery ATS events for the applications already existing in ATS are resent which is not required. was: 1.Start RM with HA and ATS configured and run some yarn applications 2.Once applications are finished sucessfully start timeline server 3.Now failover HA form active to standby 4.Access timeline server URL IP:PORT/applicationhistory Result: Application history URL fails with below info {quote} 2015-02-03 20:28:09,511 ERROR org.apache.hadoop.yarn.webapp.View: Failed to read the applications. java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643) at org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:80) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:67) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77) at org.apache.hadoop.yarn.webapp.View.render(View.java:235) at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) ... Caused by: org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: The entity for application attempt appattempt_1422972608379_0001_01 doesn't exist in the timeline store at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getApplicationAttempt(ApplicationHistoryManagerOnTimelineStore.java:151) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.generateApplicationReport(ApplicationHistoryManagerOnTimelineStore.java:499) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getAllApplications(ApplicationHistoryManagerOnTimelineStore.java:108) at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:84) at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:81) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) ... 51 more 2015-02-03 20:28:09,512 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /applicationhistory org.apache.hadoop.yarn.webapp.WebAppException: Error rendering block: nestLevel=6 expected 5 at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77) {quote} Behaviour with AHS with file based history store -Apphistory url is working -No attempt entries are shown for each application. Based on inital analysis when RM switches ,application attempts from state store are not replayed but only applications are. 
So when /applicaitonhistory url is accessed it tries for all attempt id and fails Avoid timeline events during RM recovery or restart --- Key: YARN-3127 URL: https://issues.apache.org/jira/browse/YARN-3127 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, timelineserver Affects Versions: 2.6.0 Environment: RM HA with ATS Reporter: Bibin A Chundatt Assignee: Naganarasimha G R Priority: Critical Attachments: YARN-3127.20150213-1.patch, YARN-3127.20150329-1.patch 1.Start RM with HA and ATS configured and run some yarn applications 2.Once applications are finished sucessfully start timeline server 3.Now failover HA form active to standby 4.Access timeline server URL IP:PORT/applicationhistory //Note Earlier exception was thrown when accessed. Incomplete information is shown in the ATS web UI. i.e. attempt container and other information is not displayed. Also even if timeline server is started with RM, and on RM restart/ recovery ATS events for the applications already existing in ATS are resent which is not required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3711) Documentation of ResourceManager HA should explain about webapp address configuration
[ https://issues.apache.org/jira/browse/YARN-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3711: --- Issue Type: Sub-task (was: Improvement) Parent: YARN-149 Documentation of ResourceManager HA should explain about webapp address configuration - Key: YARN-3711 URL: https://issues.apache.org/jira/browse/YARN-3711 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558815#comment-14558815 ] Hadoop QA commented on YARN-3712: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 35s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 34s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 36s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 2s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 5s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 41m 58s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735262/YARN-3712.02.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 39077db | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8079/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8079/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8079/console | This message was automatically generated. ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously - Key: YARN-3712 URL: https://issues.apache.org/jira/browse/YARN-3712 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong Attachments: YARN-3712.01.patch, YARN-3712.02.patch It will save some time by handling event CLEANUP_CONTAINER asynchronously. This improvement will be useful for cases that cleaning up container cost a little long time(e.g. for our case: we are running Docker container on NM, it will take above 1 seconds to clean up one docker container. ) and many containers to clean up(e.g. NM need clean up all running containers when NM shutdown). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3644) Node manager shuts down if unable to connect with RM
[ https://issues.apache.org/jira/browse/YARN-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raju Bairishetti updated YARN-3644: --- Attachment: YARN-3644.patch Introduced a new config **NODEMANAGER_SHUTDOWN_ON_RM_CONNECTION_FAILURES** to let users decide whether the NM should shut down when it is unable to connect to the RM. Keeping the default value as true to honour the current behavior. Node manager shuts down if unable to connect with RM Key: YARN-3644 URL: https://issues.apache.org/jira/browse/YARN-3644 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Srikanth Sundarrajan Assignee: Raju Bairishetti Attachments: YARN-3644.patch When the NM is unable to connect to the RM, the NM shuts itself down.
{code}
} catch (ConnectException e) {
  //catch and throw the exception if tried MAX wait time to connect RM
  dispatcher.getEventHandler().handle(
      new NodeManagerEvent(NodeManagerEventType.SHUTDOWN));
  throw new YarnRuntimeException(e);
{code}
In large clusters, if the RM is down for maintenance for a longer period, all the NMs shut themselves down, requiring additional work to bring them back up. Setting yarn.resourcemanager.connect.wait-ms to -1 has other side effects, where non-connection failures are retried infinitely by all YarnClients (via RMProxy). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
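A rough sketch of the behaviour the comment describes is shown below; the property name, default value, and surrounding types are assumptions made for illustration, not the contents of the attached patch.
{code}
// Illustrative sketch only: gate the NM shutdown on a configuration flag.
import java.net.ConnectException;

public class ShutdownOnRmFailureSketch {

  // Hypothetical property name and default; the real constant comes from the patch.
  static final String SHUTDOWN_ON_RM_CONNECTION_FAILURES =
      "yarn.nodemanager.shutdown-on-rm-connection-failures";
  static final boolean DEFAULT_SHUTDOWN_ON_RM_CONNECTION_FAILURES = true;

  interface ShutdownHandler {
    void requestShutdown();
  }

  static void handleRmConnectFailure(ConnectException cause,
      boolean shutdownOnFailure, ShutdownHandler handler) {
    if (shutdownOnFailure) {
      // Current behaviour, kept as the default: ask the NM to shut down.
      handler.requestShutdown();
      throw new RuntimeException(cause);
    }
    // When the flag is false, stay up and let the retry policy try again later.
    System.err.println("RM unreachable, NM keeps running and will retry: "
        + cause.getMessage());
  }
}
{code}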
[jira] [Updated] (YARN-3127) Avoid timeline events during RM recovery or restart
[ https://issues.apache.org/jira/browse/YARN-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-3127: Summary: Avoid timeline events during RM recovery or restart (was: Apphistory url crashes when RM switches with ATS enabled) Avoid timeline events during RM recovery or restart --- Key: YARN-3127 URL: https://issues.apache.org/jira/browse/YARN-3127 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, timelineserver Affects Versions: 2.6.0 Environment: RM HA with ATS Reporter: Bibin A Chundatt Assignee: Naganarasimha G R Priority: Critical Attachments: YARN-3127.20150213-1.patch, YARN-3127.20150329-1.patch 1.Start RM with HA and ATS configured and run some yarn applications 2.Once applications are finished sucessfully start timeline server 3.Now failover HA form active to standby 4.Access timeline server URL IP:PORT/applicationhistory Result: Application history URL fails with below info {quote} 2015-02-03 20:28:09,511 ERROR org.apache.hadoop.yarn.webapp.View: Failed to read the applications. java.lang.reflect.UndeclaredThrowableException at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643) at org.apache.hadoop.yarn.server.webapp.AppsBlock.render(AppsBlock.java:80) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:67) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77) at org.apache.hadoop.yarn.webapp.View.render(View.java:235) at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) ... Caused by: org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: The entity for application attempt appattempt_1422972608379_0001_01 doesn't exist in the timeline store at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getApplicationAttempt(ApplicationHistoryManagerOnTimelineStore.java:151) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.generateApplicationReport(ApplicationHistoryManagerOnTimelineStore.java:499) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getAllApplications(ApplicationHistoryManagerOnTimelineStore.java:108) at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:84) at org.apache.hadoop.yarn.server.webapp.AppsBlock$1.run(AppsBlock.java:81) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) ... 51 more 2015-02-03 20:28:09,512 ERROR org.apache.hadoop.yarn.webapp.Dispatcher: error handling URI: /applicationhistory org.apache.hadoop.yarn.webapp.WebAppException: Error rendering block: nestLevel=6 expected 5 at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:77) {quote} Behaviour with AHS with file based history store -Apphistory url is working -No attempt entries are shown for each application. Based on inital analysis when RM switches ,application attempts from state store are not replayed but only applications are. So when /applicaitonhistory url is accessed it tries for all attempt id and fails -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3713) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition
[ https://issues.apache.org/jira/browse/YARN-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-3713: Labels: cleanup maintenance (was: ) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition Key: YARN-3713 URL: https://issues.apache.org/jira/browse/YARN-3713 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Labels: cleanup, maintenance remove duplicate function call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition. {{storeContainerDiagnostics}} is already called at ContainerImpl#addDiagnostics. {code} private void addDiagnostics(String... diags) { for (String s : diags) { this.diagnostics.append(s); } try { stateStore.storeContainerDiagnostics(containerId, diagnostics); } catch (IOException e) { LOG.warn(Unable to update diagnostics in state store for + containerId, e); } } {code} So we don't need call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition#transition. {code} container.addDiagnostics(updateEvent.getDiagnosticsUpdate(), \n); try { container.stateStore.storeContainerDiagnostics(container.containerId, container.diagnostics); } catch (IOException e) { LOG.warn(Unable to update state store diagnostics for + container.containerId, e); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3713) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition
zhihai xu created YARN-3713: --- Summary: Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition Key: YARN-3713 URL: https://issues.apache.org/jira/browse/YARN-3713 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Remove the duplicate function call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition. {{storeContainerDiagnostics}} is already called in ContainerImpl#addDiagnostics.
{code}
private void addDiagnostics(String... diags) {
  for (String s : diags) {
    this.diagnostics.append(s);
  }
  try {
    stateStore.storeContainerDiagnostics(containerId, diagnostics);
  } catch (IOException e) {
    LOG.warn("Unable to update diagnostics in state store for " + containerId, e);
  }
}
{code}
So we don't need to call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition#transition.
{code}
container.addDiagnostics(updateEvent.getDiagnosticsUpdate(), "\n");
try {
  container.stateStore.storeContainerDiagnostics(container.containerId,
      container.diagnostics);
} catch (IOException e) {
  LOG.warn("Unable to update state store diagnostics for " + container.containerId, e);
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
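For illustration, a self-contained sketch of the simplified transition after the cleanup might look like the following; the types are stand-ins for ContainerImpl and the NM state store, not the real classes.
{code}
// Illustrative sketch: diagnostics are persisted in exactly one place.
import java.io.IOException;

public class DiagnosticsDedupSketch {

  interface StateStore {
    void storeContainerDiagnostics(String containerId, CharSequence diags) throws IOException;
  }

  static class Container {
    final String containerId;
    final StringBuilder diagnostics = new StringBuilder();
    final StateStore stateStore;

    Container(String containerId, StateStore stateStore) {
      this.containerId = containerId;
      this.stateStore = stateStore;
    }

    void addDiagnostics(String... diags) {
      for (String s : diags) {
        diagnostics.append(s);
      }
      try {
        // The single store call; callers no longer repeat it.
        stateStore.storeContainerDiagnostics(containerId, diagnostics);
      } catch (IOException e) {
        System.err.println("Unable to update diagnostics in state store for " + containerId);
      }
    }
  }

  // The diagnostics-update transition now only appends.
  static void onDiagnosticsUpdate(Container container, String update) {
    container.addDiagnostics(update, "\n");
  }
}
{code}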
[jira] [Updated] (YARN-3711) Documentation of ResourceManager HA should explain about webapp address configuration
[ https://issues.apache.org/jira/browse/YARN-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3711: --- Attachment: YARN-3711.002.patch I attached patch. 002 fixes markdown formatting nits too. Documentation of ResourceManager HA should explain about webapp address configuration - Key: YARN-3711 URL: https://issues.apache.org/jira/browse/YARN-3711 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3711.002.patch Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3711) Documentation of ResourceManager HA should explain about webapp address configuration
[ https://issues.apache.org/jira/browse/YARN-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3711: --- Description: There should be explanation about webapp address in addition to RPC address. AM proxy filter needs explicit definition of {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} to get proper default addresses in RM-HA mode now. was:Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. Documentation of ResourceManager HA should explain about webapp address configuration - Key: YARN-3711 URL: https://issues.apache.org/jira/browse/YARN-3711 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3711.002.patch There should be explanation about webapp address in addition to RPC address. AM proxy filter needs explicit definition of {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} to get proper default addresses in RM-HA mode now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
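As a concrete illustration of the settings the description refers to, the snippet below sets the per-RM webapp addresses programmatically; the rm-ids ({{rm1}}/{{rm2}}), hostnames, and ports are made-up examples, and in a real cluster these entries would normally live in yarn-site.xml.
{code}
import org.apache.hadoop.conf.Configuration;

public class RmHaWebappAddressExample {

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Per-RM HTTP webapp addresses; hostnames and rm-ids are illustrative.
    conf.set("yarn.resourcemanager.webapp.address.rm1", "rm1.example.com:8088");
    conf.set("yarn.resourcemanager.webapp.address.rm2", "rm2.example.com:8088");
    // HTTPS variants, if the RM web UI is served over HTTPS.
    conf.set("yarn.resourcemanager.webapp.https.address.rm1", "rm1.example.com:8090");
    conf.set("yarn.resourcemanager.webapp.https.address.rm2", "rm2.example.com:8090");

    System.out.println(conf.get("yarn.resourcemanager.webapp.address.rm1"));
  }
}
{code}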
[jira] [Commented] (YARN-3713) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition
[ https://issues.apache.org/jira/browse/YARN-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558837#comment-14558837 ] Hadoop QA commented on YARN-3713: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 44s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 37s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 34s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 51s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 36s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 3s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 11s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 42m 36s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735265/YARN-3713.000.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 56996a6 | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8080/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8080/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8080/console | This message was automatically generated. Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition Key: YARN-3713 URL: https://issues.apache.org/jira/browse/YARN-3713 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Labels: cleanup Attachments: YARN-3713.000.patch remove duplicate function call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition. {{storeContainerDiagnostics}} is already called at ContainerImpl#addDiagnostics. {code} private void addDiagnostics(String... diags) { for (String s : diags) { this.diagnostics.append(s); } try { stateStore.storeContainerDiagnostics(containerId, diagnostics); } catch (IOException e) { LOG.warn(Unable to update diagnostics in state store for + containerId, e); } } {code} So we don't need call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition#transition. 
{code}
container.addDiagnostics(updateEvent.getDiagnosticsUpdate(), "\n");
try {
  container.stateStore.storeContainerDiagnostics(container.containerId,
      container.diagnostics);
} catch (IOException e) {
  LOG.warn("Unable to update state store diagnostics for " + container.containerId, e);
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3713) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition
[ https://issues.apache.org/jira/browse/YARN-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-3713: Attachment: YARN-3713.000.patch Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition Key: YARN-3713 URL: https://issues.apache.org/jira/browse/YARN-3713 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Labels: cleanup, maintenance Attachments: YARN-3713.000.patch remove duplicate function call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition. {{storeContainerDiagnostics}} is already called at ContainerImpl#addDiagnostics. {code} private void addDiagnostics(String... diags) { for (String s : diags) { this.diagnostics.append(s); } try { stateStore.storeContainerDiagnostics(containerId, diagnostics); } catch (IOException e) { LOG.warn(Unable to update diagnostics in state store for + containerId, e); } } {code} So we don't need call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition#transition. {code} container.addDiagnostics(updateEvent.getDiagnosticsUpdate(), \n); try { container.stateStore.storeContainerDiagnostics(container.containerId, container.diagnostics); } catch (IOException e) { LOG.warn(Unable to update state store diagnostics for + container.containerId, e); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3713) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition
[ https://issues.apache.org/jira/browse/YARN-3713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-3713: Labels: cleanup (was: cleanup maintenance) Remove duplicate function call storeContainerDiagnostics in ContainerDiagnosticsUpdateTransition Key: YARN-3713 URL: https://issues.apache.org/jira/browse/YARN-3713 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.0 Reporter: zhihai xu Assignee: zhihai xu Priority: Minor Labels: cleanup Attachments: YARN-3713.000.patch remove duplicate function call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition. {{storeContainerDiagnostics}} is already called at ContainerImpl#addDiagnostics. {code} private void addDiagnostics(String... diags) { for (String s : diags) { this.diagnostics.append(s); } try { stateStore.storeContainerDiagnostics(containerId, diagnostics); } catch (IOException e) { LOG.warn(Unable to update diagnostics in state store for + containerId, e); } } {code} So we don't need call {{storeContainerDiagnostics}} in ContainerDiagnosticsUpdateTransition#transition. {code} container.addDiagnostics(updateEvent.getDiagnosticsUpdate(), \n); try { container.stateStore.storeContainerDiagnostics(container.containerId, container.diagnostics); } catch (IOException e) { LOG.warn(Unable to update state store diagnostics for + container.containerId, e); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3711) Documentation of ResourceManager HA should explain about webapp address configuration
[ https://issues.apache.org/jira/browse/YARN-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3711: --- Attachment: YARN-3711.001.patch Documentation of ResourceManager HA should explain about webapp address configuration - Key: YARN-3711 URL: https://issues.apache.org/jira/browse/YARN-3711 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3711.001.patch Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3714) AM proxy filter can not get proper default proxy address if RM-HA is enabled
[ https://issues.apache.org/jira/browse/YARN-3714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3714: --- Description: Default proxy address could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} explicitly if RM-HA is enabled. AM proxy filter can not get proper default proxy address if RM-HA is enabled Key: YARN-3714 URL: https://issues.apache.org/jira/browse/YARN-3714 Project: Hadoop YARN Issue Type: Bug Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Default proxy address could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} explicitly if RM-HA is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3712: --- Attachment: YARN-3712.02.patch Fix checkstyle warnings. ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously - Key: YARN-3712 URL: https://issues.apache.org/jira/browse/YARN-3712 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong Attachments: YARN-3712.01.patch, YARN-3712.02.patch Handling the CLEANUP_CONTAINER event asynchronously will save some time. This improvement is useful when cleaning up a container takes a fairly long time (e.g., in our case we run Docker containers on the NM, and it takes over 1 second to clean up one Docker container) and there are many containers to clean up (e.g., the NM needs to clean up all running containers when it shuts down). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
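A minimal sketch of the idea in the description, assuming a small dedicated thread pool so that CLEANUP_CONTAINER work runs off the event-dispatching thread; the class, method names, and pool size are illustrative, not the actual ContainersLauncher change.
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncContainerCleanupSketch {

  // Dedicated pool so slow cleanups (e.g. stopping a Docker container)
  // do not block the event dispatcher.
  private final ExecutorService cleanupPool = Executors.newFixedThreadPool(4);

  // Called from the event-handling thread; returns immediately.
  public void handleCleanupEvent(final String containerId) {
    cleanupPool.submit(new Runnable() {
      @Override
      public void run() {
        // The slow cleanup work runs here, off the dispatcher thread.
        System.out.println("cleaning up " + containerId);
      }
    });
  }

  // On NM shutdown, stop accepting new work and let queued cleanups drain.
  public void stop() {
    cleanupPool.shutdown();
  }
}
{code}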
[jira] [Updated] (YARN-3711) Documentation of ResourceManager HA should explain about webapp address configuration
[ https://issues.apache.org/jira/browse/YARN-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3711: --- Attachment: (was: YARN-3711.001.patch) Documentation of ResourceManager HA should explain about webapp address configuration - Key: YARN-3711 URL: https://issues.apache.org/jira/browse/YARN-3711 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3711) Documentation of ResourceManager HA should explain about webapp address configuration
[ https://issues.apache.org/jira/browse/YARN-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558831#comment-14558831 ] Hadoop QA commented on YARN-3711: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 2m 54s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 2m 56s | Site still builds. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | | | 6m 14s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735274/YARN-3711.002.patch | | Optional Tests | site | | git revision | trunk / 56996a6 | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8081/console | This message was automatically generated. Documentation of ResourceManager HA should explain about webapp address configuration - Key: YARN-3711 URL: https://issues.apache.org/jira/browse/YARN-3711 Project: Hadoop YARN Issue Type: Sub-task Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3711.002.patch Proper proxy URL of AM Web UI could not be got without setting {{yarn.resourcemanager.webapp.address._rm-id_}} and/or {{yarn.resourcemanager.webapp.https.address._rm-id_}} if RM-HA is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3644) Node manager shuts down if unable to connect with RM
[ https://issues.apache.org/jira/browse/YARN-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558902#comment-14558902 ] Hadoop QA commented on YARN-3644: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 38s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 35s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 43s | The applied patch generated 1 new checkstyle issues (total was 214, now 215). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 48s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 26s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 56s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 6m 15s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 49m 2s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735276/YARN-3644.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 56996a6 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8082/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8082/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8082/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8082/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8082/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8082/console | This message was automatically generated. Node manager shuts down if unable to connect with RM Key: YARN-3644 URL: https://issues.apache.org/jira/browse/YARN-3644 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Srikanth Sundarrajan Assignee: Raju Bairishetti Attachments: YARN-3644.patch When NM is unable to connect to RM, NM shuts itself down. 
{code}
} catch (ConnectException e) {
  //catch and throw the exception if tried MAX wait time to connect RM
  dispatcher.getEventHandler().handle(
      new NodeManagerEvent(NodeManagerEventType.SHUTDOWN));
  throw new YarnRuntimeException(e);
{code}
In large clusters, if the RM is down for maintenance for a longer period, all the NMs shut themselves down, requiring additional work to bring up the NMs. Setting yarn.resourcemanager.connect.wait-ms to -1 has other side effects, where non-connection failures are retried infinitely by all YarnClients (via RMProxy). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3644) Node manager shuts down if unable to connect with RM
[ https://issues.apache.org/jira/browse/YARN-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558955#comment-14558955 ] Hadoop QA commented on YARN-3644: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 36s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 32s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 20s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 49s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 27s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 57s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 6m 26s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 49m 19s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735288/YARN-3644.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 56996a6 | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8083/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8083/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8083/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8083/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8083/console | This message was automatically generated. Node manager shuts down if unable to connect with RM Key: YARN-3644 URL: https://issues.apache.org/jira/browse/YARN-3644 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Srikanth Sundarrajan Assignee: Raju Bairishetti Attachments: YARN-3644.001.patch, YARN-3644.patch When NM is unable to connect to RM, NM shuts itself down. {code} } catch (ConnectException e) { //catch and throw the exception if tried MAX wait time to connect RM dispatcher.getEventHandler().handle( new NodeManagerEvent(NodeManagerEventType.SHUTDOWN)); throw new YarnRuntimeException(e); {code} In large clusters, if RM is down for maintenance for longer period, all the NMs shuts themselves down, requiring additional work to bring up the NMs. 
Setting the yarn.resourcemanager.connect.wait-ms to -1 has other side effects, where non connection failures are being retried infinitely by all YarnClients (via RMProxy). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3644) Node manager shuts down if unable to connect with RM
[ https://issues.apache.org/jira/browse/YARN-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raju Bairishetti updated YARN-3644: --- Attachment: YARN-3644.001.patch Node manager shuts down if unable to connect with RM Key: YARN-3644 URL: https://issues.apache.org/jira/browse/YARN-3644 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Srikanth Sundarrajan Assignee: Raju Bairishetti Attachments: YARN-3644.001.patch, YARN-3644.patch When NM is unable to connect to RM, NM shuts itself down. {code} } catch (ConnectException e) { //catch and throw the exception if tried MAX wait time to connect RM dispatcher.getEventHandler().handle( new NodeManagerEvent(NodeManagerEventType.SHUTDOWN)); throw new YarnRuntimeException(e); {code} In large clusters, if RM is down for maintenance for longer period, all the NMs shuts themselves down, requiring additional work to bring up the NMs. Setting the yarn.resourcemanager.connect.wait-ms to -1 has other side effects, where non connection failures are being retried infinitely by all YarnClients (via RMProxy). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558941#comment-14558941 ] Tsuyoshi Ozawa commented on YARN-2336: -- +1, committing this shortly. Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree -- Key: YARN-2336 URL: https://issues.apache.org/jira/browse/YARN-2336 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.1, 2.6.0 Reporter: Kenji Kikushima Assignee: Akira AJISAKA Labels: BB2015-05-RFC Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch When we have sub queues in Fair Scheduler, the REST api returns a missing '[' bracket JSON for childQueues. This issue was found by [~ajisakaa] at YARN-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558946#comment-14558946 ] Hudson commented on YARN-2336: -- FAILURE: Integrated in Hadoop-trunk-Commit #7901 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7901/]) YARN-2336. Fair scheduler's REST API returns a missing '[' bracket JSON for deep queue tree. Contributed by Kenji Kikushima and Akira Ajisaka. (ozawa: rev 9a3d617b6325d8918f2833c3e9ce329ecada9242) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesFairScheduler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfoList.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/JAXBContextResolver.java Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree -- Key: YARN-2336 URL: https://issues.apache.org/jira/browse/YARN-2336 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.1, 2.6.0 Reporter: Kenji Kikushima Assignee: Akira AJISAKA Labels: BB2015-05-RFC Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch When we have sub queues in Fair Scheduler, REST api returns a missing '[' blacket JSON for childQueues. This issue found by [~ajisakaa] at YARN-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559003#comment-14559003 ] Hudson commented on YARN-2336: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #208 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/208/]) YARN-2336. Fair scheduler's REST API returns a missing '[' bracket JSON for deep queue tree. Contributed by Kenji Kikushima and Akira Ajisaka. (ozawa: rev 9a3d617b6325d8918f2833c3e9ce329ecada9242) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfoList.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/JAXBContextResolver.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesFairScheduler.java * hadoop-yarn-project/CHANGES.txt Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree -- Key: YARN-2336 URL: https://issues.apache.org/jira/browse/YARN-2336 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.1, 2.6.0 Reporter: Kenji Kikushima Assignee: Akira AJISAKA Labels: BB2015-05-RFC Fix For: 2.8.0 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch When we have sub queues in Fair Scheduler, REST api returns a missing '[' blacket JSON for childQueues. This issue found by [~ajisakaa] at YARN-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2238) filtering on UI sticks even if I move away from the page
[ https://issues.apache.org/jira/browse/YARN-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559005#comment-14559005 ] Hudson commented on YARN-2238: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #208 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/208/]) YARN-2238. Filtering on UI sticks even if I move away from the page. (xgong: rev 39077dba2e877420e7470df253f6154f6ecc64ec) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/JQueryUI.java filtering on UI sticks even if I move away from the page Key: YARN-2238 URL: https://issues.apache.org/jira/browse/YARN-2238 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Jian He Labels: usability Fix For: 2.7.1 Attachments: YARN-2238.patch, YARN-2238.png, filtered.png The main data table in many web pages (RM, AM, etc.) seems to show an unexpected filtering behavior. If I filter the table by typing something in the key or value field (or I suspect any search field), the data table gets filtered. The example I used is the job configuration page for a MR job. That is expected. However, when I move away from that page and visit any other web page of the same type (e.g. a job configuration page), the page is rendered with the filtering! That is unexpected. What's even stranger is that it does not render the filtering term. As a result, I have a page that's mysteriously filtered but doesn't tell me what it's filtering on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3716) Output node label expression in ResourceRequestPBImpl.toString
[ https://issues.apache.org/jira/browse/YARN-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559065#comment-14559065 ] Hadoop QA commented on YARN-3716: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 38s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 39s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 52s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 24s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 1m 56s | Tests passed in hadoop-yarn-common. | | | | 38m 39s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735305/YARN-3716.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 9a3d617 | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8084/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8084/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8084/console | This message was automatically generated. Output node label expression in ResourceRequestPBImpl.toString -- Key: YARN-3716 URL: https://issues.apache.org/jira/browse/YARN-3716 Project: Hadoop YARN Issue Type: Improvement Components: api Reporter: Xianyin Xin Assignee: Xianyin Xin Priority: Minor Attachments: YARN-3716.001.patch It's convenient for debug and log trace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3715) Oozie jobs are failed with IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') on se
Sergey Svinarchuk created YARN-3715: --- Summary: Oozie jobs are failed with IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') on secure cluster with RM HA Key: YARN-3715 URL: https://issues.apache.org/jira/browse/YARN-3715 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Sergey Svinarchuk 2015-05-21 16:06:55,887 WARN ActionStartXCommand:544 - SERVER[centos6.localdomain] USER[mapr] GROUP[-] TOKEN[] APP[Hive] JOB[001-150521123655733-oozie-mapr-W] ACTION[001-150521123655733-oozie-mapr-W@Hive] Error starting action [Hive]. ErrorType [ERROR], ErrorCode [IllegalArgumentException], Message [IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address')] org.apache.oozie.action.ActionExecutorException: IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:401) at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:979) at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1134) at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228) at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63) at org.apache.oozie.command.XCommand.call(XCommand.java:281) at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:323) at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:252) at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:211) at org.apache.hadoop.conf.Configuration.getSocketAddr(Configuration.java:1788) at org.apache.hadoop.mapred.Master.getMasterAddress(Master.java:58) at org.apache.hadoop.mapred.Master.getMasterPrincipal(Master.java:67) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:114) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80) at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:127) at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:460) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562) at 
org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548) at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:964) ... 10 more 2015-05-21 16:06:55,889 WARN ActionStartXCommand:544 - SERVER[centos6.localdomain] USER[mapr] GROUP[-] TOKEN[] APP[Hive] JOB[001-150521123655733-oozie-mapr-W] ACTION[001-150521123655733-oozie-mapr-W@Hive] Setting Action Status to [DONE] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3716) Output node label expression in ResourceRequestPBImpl.toString
Xianyin Xin created YARN-3716: - Summary: Output node label expression in ResourceRequestPBImpl.toString Key: YARN-3716 URL: https://issues.apache.org/jira/browse/YARN-3716 Project: Hadoop YARN Issue Type: Improvement Components: api Reporter: Xianyin Xin Assignee: Xianyin Xin Priority: Minor It's convenient for debug and log trace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3715) Oozie jobs are failed with IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') on
[ https://issues.apache.org/jira/browse/YARN-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14558994#comment-14558994 ] Sergey Svinarchuk commented on YARN-3715: - There are problem in yarn.resourcemanager.address property. When we try submitting regular job this property set to 0.0.0.0:8032, but when Oozie submitting job this property set to jobtracker property from file job.propertires. In case with RM HA we set to job.properties jobTracker=maprfs:/// and then yarn.resourcemanager.address also set to maprfs:///. Then Master.getMasterAddress get socket address from Configuration as maprfs:/// and call NetUtils.createSocketAddr(address, defaultPort, name), but NetUtils.createSocketAddr can work only with format “hostname:port”. I think that for case when using RM HA need call getSocketAddr(String name, String defaultAddress, int defaultPort) from YarnConfiguration class. Oozie jobs are failed with IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') on secure cluster with RM HA -- Key: YARN-3715 URL: https://issues.apache.org/jira/browse/YARN-3715 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Sergey Svinarchuk 2015-05-21 16:06:55,887 WARN ActionStartXCommand:544 - SERVER[centos6.localdomain] USER[mapr] GROUP[-] TOKEN[] APP[Hive] JOB[001-150521123655733-oozie-mapr-W] ACTION[001-150521123655733-oozie-mapr-W@Hive] Error starting action [Hive]. ErrorType [ERROR], ErrorCode [IllegalArgumentException], Message [IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address')] org.apache.oozie.action.ActionExecutorException: IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:401) at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:979) at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1134) at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228) at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63) at org.apache.oozie.command.XCommand.call(XCommand.java:281) at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:323) at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:252) at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:211) at org.apache.hadoop.conf.Configuration.getSocketAddr(Configuration.java:1788) at org.apache.hadoop.mapred.Master.getMasterAddress(Master.java:58) at org.apache.hadoop.mapred.Master.getMasterPrincipal(Master.java:67) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:114) at 
org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80) at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:127) at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:460) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557) at
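A sketch of the fix direction suggested in the comment above: resolve the RM address through YarnConfiguration, whose getSocketAddr override is RM-HA aware, rather than feeding the raw yarn.resourcemanager.address value (maprfs:/// in this setup) into NetUtils.createSocketAddr. This is only an illustration under that assumption, not the actual Master.getMasterAddress code.
{code}
// Illustration only; not the actual Master.getMasterAddress fix.
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

final class RmAddressLookupSketch {
  static InetSocketAddress resolveRmAddress(Configuration jobConf) {
    // Re-wrap the job configuration so the HA-aware override of getSocketAddr is used.
    YarnConfiguration yarnConf = new YarnConfiguration(jobConf);
    return yarnConf.getSocketAddr(
        YarnConfiguration.RM_ADDRESS,
        YarnConfiguration.DEFAULT_RM_ADDRESS,
        YarnConfiguration.DEFAULT_RM_PORT);
  }
}
{code}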
[jira] [Updated] (YARN-3716) Output node label expression in ResourceRequestPBImpl.toString
[ https://issues.apache.org/jira/browse/YARN-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xianyin Xin updated YARN-3716: -- Attachment: YARN-3716.001.patch Output node label expression in ResourceRequestPBImpl.toString -- Key: YARN-3716 URL: https://issues.apache.org/jira/browse/YARN-3716 Project: Hadoop YARN Issue Type: Improvement Components: api Reporter: Xianyin Xin Assignee: Xianyin Xin Priority: Minor Attachments: YARN-3716.001.patch It's convenient for debug and log trace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
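The improvement is small enough to sketch. A standalone illustration of what the enriched toString output could contain; the attached YARN-3716.001.patch changes ResourceRequestPBImpl.toString() directly and may format differently.
{code}
// Standalone illustration; the attached patch may format the output differently.
import org.apache.hadoop.yarn.api.records.ResourceRequest;

final class ResourceRequestToStringSketch {
  static String describe(ResourceRequest req) {
    return "{Priority: " + req.getPriority()
        + ", Capability: " + req.getCapability()
        + ", # Containers: " + req.getNumContainers()
        + ", Location: " + req.getResourceName()
        + ", Relax Locality: " + req.getRelaxLocality()
        + ", Node Label Expression: " + req.getNodeLabelExpression() + "}";
  }
}
{code}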
[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559017#comment-14559017 ] Hudson commented on YARN-2336: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #939 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/939/]) YARN-2336. Fair scheduler's REST API returns a missing '[' bracket JSON for deep queue tree. Contributed by Kenji Kikushima and Akira Ajisaka. (ozawa: rev 9a3d617b6325d8918f2833c3e9ce329ecada9242) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesFairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfoList.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/JAXBContextResolver.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfo.java Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree -- Key: YARN-2336 URL: https://issues.apache.org/jira/browse/YARN-2336 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.1, 2.6.0 Reporter: Kenji Kikushima Assignee: Akira AJISAKA Labels: BB2015-05-RFC Fix For: 2.8.0 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch When we have sub queues in Fair Scheduler, REST api returns a missing '[' blacket JSON for childQueues. This issue found by [~ajisakaa] at YARN-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2238) filtering on UI sticks even if I move away from the page
[ https://issues.apache.org/jira/browse/YARN-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559019#comment-14559019 ] Hudson commented on YARN-2238: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #939 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/939/]) YARN-2238. Filtering on UI sticks even if I move away from the page. (xgong: rev 39077dba2e877420e7470df253f6154f6ecc64ec) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/JQueryUI.java * hadoop-yarn-project/CHANGES.txt filtering on UI sticks even if I move away from the page Key: YARN-2238 URL: https://issues.apache.org/jira/browse/YARN-2238 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Jian He Labels: usability Fix For: 2.7.1 Attachments: YARN-2238.patch, YARN-2238.png, filtered.png The main data table in many web pages (RM, AM, etc.) seems to show an unexpected filtering behavior. If I filter the table by typing something in the key or value field (or I suspect any search field), the data table gets filtered. The example I used is the job configuration page for a MR job. That is expected. However, when I move away from that page and visit any other web page of the same type (e.g. a job configuration page), the page is rendered with the filtering! That is unexpected. What's even stranger is that it does not render the filtering term. As a result, I have a page that's mysteriously filtered but doesn't tell me what it's filtering on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559099#comment-14559099 ] Akira AJISAKA commented on YARN-2336: - Thanks [~ozawa] and [~kj-ki]! Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree -- Key: YARN-2336 URL: https://issues.apache.org/jira/browse/YARN-2336 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.1, 2.6.0 Reporter: Kenji Kikushima Assignee: Akira AJISAKA Labels: BB2015-05-RFC Fix For: 2.8.0 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch When we have sub queues in Fair Scheduler, REST api returns a missing '[' blacket JSON for childQueues. This issue found by [~ajisakaa] at YARN-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS
[ https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559475#comment-14559475 ] Varun Vasudev commented on YARN-160: The change for the Windows cpu limits fixes a bug in the current implementation. The current implementation allows YARN containers to exceed the configured cpu limit in some cases. nodemanagers should obtain cpu/memory values from underlying OS --- Key: YARN-160 URL: https://issues.apache.org/jira/browse/YARN-160 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.3-alpha Reporter: Alejandro Abdelnur Assignee: Varun Vasudev Labels: BB2015-05-TBR Attachments: YARN-160.005.patch, YARN-160.006.patch, YARN-160.007.patch, YARN-160.008.patch, apache-yarn-160.0.patch, apache-yarn-160.1.patch, apache-yarn-160.2.patch, apache-yarn-160.3.patch As mentioned in YARN-2 *NM memory and CPU configs* Currently these values are coming from the config of the NM, we should be able to obtain those values from the OS (ie, in the case of Linux from /proc/meminfo /proc/cpuinfo). As this is highly OS dependent we should have an interface that obtains this information. In addition implementations of this interface should be able to specify a mem/cpu offset (amount of mem/cpu not to be avail as YARN resource), this would allow to reserve mem/cpu for the OS and other services outside of YARN containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3719) Improve Solaris support in YARN
[ https://issues.apache.org/jira/browse/YARN-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Burlison updated YARN-3719: Summary: Improve Solaris support in YARN (was: Improve Solaris support in HDFS) Improve Solaris support in YARN --- Key: YARN-3719 URL: https://issues.apache.org/jira/browse/YARN-3719 Project: Hadoop YARN Issue Type: Task Components: build Affects Versions: 2.7.0 Environment: Solaris x86, Solaris sparc Reporter: Alan Burlison At present the YARN native components aren't fully supported on Solaris primarily due to differences between Linux and Solaris. This top-level task will be used to group together both existing and new issues related to this work. A second goal is to improve YARN performance and functionality on Solaris wherever possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3719) Improve Solaris support in HDFS
Alan Burlison created YARN-3719: --- Summary: Improve Solaris support in HDFS Key: YARN-3719 URL: https://issues.apache.org/jira/browse/YARN-3719 Project: Hadoop YARN Issue Type: Task Components: build Affects Versions: 2.7.0 Environment: Solaris x86, Solaris sparc Reporter: Alan Burlison At present the YARN native components aren't fully supported on Solaris primarily due to differences between Linux and Solaris. This top-level task will be used to group together both existing and new issues related to this work. A second goal is to improve YARN performance and functionality on Solaris wherever possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2238) filtering on UI sticks even if I move away from the page
[ https://issues.apache.org/jira/browse/YARN-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559582#comment-14559582 ] Sangjin Lee commented on YARN-2238: --- Sorry for the belated comment. The changes look good to me. Thanks for working on this [~jianhe]! filtering on UI sticks even if I move away from the page Key: YARN-2238 URL: https://issues.apache.org/jira/browse/YARN-2238 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Jian He Labels: usability Fix For: 2.7.1 Attachments: YARN-2238.patch, YARN-2238.png, filtered.png The main data table in many web pages (RM, AM, etc.) seems to show an unexpected filtering behavior. If I filter the table by typing something in the key or value field (or I suspect any search field), the data table gets filtered. The example I used is the job configuration page for a MR job. That is expected. However, when I move away from that page and visit any other web page of the same type (e.g. a job configuration page), the page is rendered with the filtering! That is unexpected. What's even stranger is that it does not render the filtering term. As a result, I have a page that's mysteriously filtered but doesn't tell me what it's filtering on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-41) The RM should handle the graceful shutdown of the NM.
[ https://issues.apache.org/jira/browse/YARN-41?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-41: -- Attachment: YARN-41-7.patch [~djp] I have updated the patch with the review comments. Could you take a look? In the latest patch I have added a new NodeState, i.e. SHUTDOWN. bq. Add tests for new PB objects UnRegisterNodeManagerRequestPBImpl, UnRegisterNodeManagerResponsePBImpl into TestYarnServerApiClasses.java. I have added a test for UnRegisterNodeManagerRequestPBImpl but not for UnRegisterNodeManagerResponsePBImpl, since the latter has no state to verify and a test would add no value. The RM should handle the graceful shutdown of the NM. - Key: YARN-41 URL: https://issues.apache.org/jira/browse/YARN-41 Project: Hadoop YARN Issue Type: New Feature Components: nodemanager, resourcemanager Reporter: Ravi Teja Ch N V Assignee: Devaraj K Attachments: MAPREDUCE-3494.1.patch, MAPREDUCE-3494.2.patch, MAPREDUCE-3494.patch, YARN-41-1.patch, YARN-41-2.patch, YARN-41-3.patch, YARN-41-4.patch, YARN-41-5.patch, YARN-41-6.patch, YARN-41-7.patch, YARN-41.patch Instead of waiting for the NM expiry, the RM should remove and handle the NM which is shut down gracefully. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
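For readers unfamiliar with the PB record tests mentioned above, a round-trip test for the new request record typically looks like the sketch below; the accessor names are assumptions about the patch, not code copied from it.
{code}
// Hedged sketch of a PB round-trip test for the new request record. The
// setNodeId/getProto accessors and the proto-based constructor follow the usual
// *PBImpl pattern and are assumptions about the patch.
// UnRegisterNodeManagerRequestPBImpl itself is introduced by YARN-41-7.patch,
// so its import is omitted here.
import static org.junit.Assert.assertEquals;

import org.apache.hadoop.yarn.api.records.NodeId;
import org.junit.Test;

public class TestUnRegisterNodeManagerRequestPBImplSketch {
  @Test
  public void testRoundTrip() {
    UnRegisterNodeManagerRequestPBImpl original = new UnRegisterNodeManagerRequestPBImpl();
    original.setNodeId(NodeId.newInstance("host1", 1234));

    // Serialize to the protobuf message and rebuild the record from it.
    UnRegisterNodeManagerRequestPBImpl copy =
        new UnRegisterNodeManagerRequestPBImpl(original.getProto());
    assertEquals(original.getNodeId(), copy.getNodeId());
  }
}
{code}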
[jira] [Updated] (YARN-3719) Improve Solaris support in YARN
[ https://issues.apache.org/jira/browse/YARN-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Burlison updated YARN-3719: Issue Type: New Feature (was: Task) Improve Solaris support in YARN --- Key: YARN-3719 URL: https://issues.apache.org/jira/browse/YARN-3719 Project: Hadoop YARN Issue Type: New Feature Components: build Affects Versions: 2.7.0 Environment: Solaris x86, Solaris sparc Reporter: Alan Burlison At present the YARN native components aren't fully supported on Solaris primarily due to differences between Linux and Solaris. This top-level task will be used to group together both existing and new issues related to this work. A second goal is to improve YARN performance and functionality on Solaris wherever possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3720) Need comprehensive documentation for configuration CPU/memory resources on NodeManager
Vinod Kumar Vavilapalli created YARN-3720: - Summary: Need comprehensive documentation for configuration CPU/memory resources on NodeManager Key: YARN-3720 URL: https://issues.apache.org/jira/browse/YARN-3720 Project: Hadoop YARN Issue Type: Task Components: documentation, nodemanager Reporter: Vinod Kumar Vavilapalli Things are getting more and more complex after the likes of YARN-160. We need a document explaining how to configure cpu/memory values on a NodeManager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3718) hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable
[ https://issues.apache.org/jira/browse/YARN-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559617#comment-14559617 ] Alan Burlison commented on YARN-3718: - Yes, I created a new top-level task and moved it under there as there are a couple of other YARN-related issues as well. -- Alan Burlison -- hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable - Key: YARN-3718 URL: https://issues.apache.org/jira/browse/YARN-3718 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Environment: BSD OSX Solaris Windows Linux Reporter: Alan Burlison hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c makes use of the Linux-only Cgroups feature (http://en.wikipedia.org/wiki/Cgroups) when Hadoop is built on Linux, but there is no corresponding functionality for non-Linux platforms. Other platforms provide similar functionality, e.g. Solaris has an extensive range of resource management features (http://docs.oracle.com/cd/E23824_01/html/821-1460/index.html). Work is needed to abstract the resource management features of Yarn so that the same facilities for resource management can be provided on all platforms that provide the requisite functionality, -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.
[ https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559577#comment-14559577 ] Xuan Gong commented on YARN-221: bq. All the known policies will be part of YARN including SampleRateContainerLogAggregationPolicy. So we still need to config sample rate for that policy. If we don't put it in YarnConfiguration, where can we put it? It seems we already have a bunch of configuration properties in YarnConfiguration that are specific the plugin implementation such as container executor properties. I thought about this. How about adding a new protocol field: String ContainerLogAggregationPolicyParameter along with ContainerLogAggregationPolicy in logAggregationContext. In ContainerLogAggregationPolicyParameter, users can define any parameter format which their ContainerLogAggregationPolicy can understand. For example, we could define ContainerLogAggregationPolicyParameter as SR:0.2 and in SampleRateContainerLogAggregationPolicy, we could add implementation to understand and parse the parameter. Also, we could change to {code} public interface ContainerLogAggregationPolicy { public boolean shouldDoLogAggregation(ContainerId containerId, int exitCode); public void parseParameters(String parameters) } {code} bq. How MR overrides the default policy. Maybe we can have YarnRunner at MR level honor yarn property yarn.container-log-aggregation-policy.class on per job level when it creates the ApplicationSubmissionContext with the proper LogAggregationContext. In that way we don't have to create extra log aggregation properties specific at MR layer. Good question. Another possible solution could be parsing them from command-line if users use ToolRunner.run to launch their MR application. NM should provide a way for AM to tell it not to aggregate logs. Key: YARN-221 URL: https://issues.apache.org/jira/browse/YARN-221 Project: Hadoop YARN Issue Type: Sub-task Components: log-aggregation, nodemanager Reporter: Robert Joseph Evans Assignee: Ming Ma Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, YARN-221-trunk-v3.patch, YARN-221-trunk-v4.patch, YARN-221-trunk-v5.patch The NodeManager should provide a way for an AM to tell it that either the logs should not be aggregated, that they should be aggregated with a high priority, or that they should be aggregated but with a lower priority. The AM should be able to do this in the ContainerLaunch context to provide a default value, but should also be able to update the value when the container is released. This would allow for the NM to not aggregate logs in some cases, and avoid connection to the NN at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
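To make the proposal above concrete, here is a hedged sketch of a sample-rate policy that understands a parameter string such as SR:0.2. The interface shape follows the comment; the parsing format and the per-container sampling choice are assumptions for illustration only.
{code}
// Hedged sketch of the "policy parameters" idea; not part of any attached patch.
import org.apache.hadoop.yarn.api.records.ContainerId;

interface ContainerLogAggregationPolicy {
  boolean shouldDoLogAggregation(ContainerId containerId, int exitCode);
  void parseParameters(String parameters);
}

class SampleRateContainerLogAggregationPolicy implements ContainerLogAggregationPolicy {
  private float sampleRate = 1.0f;          // default: aggregate everything

  @Override
  public void parseParameters(String parameters) {
    // Expected form "SR:<fraction>", e.g. "SR:0.2".
    if (parameters != null && parameters.startsWith("SR:")) {
      sampleRate = Float.parseFloat(parameters.substring(3));
    }
  }

  @Override
  public boolean shouldDoLogAggregation(ContainerId containerId, int exitCode) {
    // Always keep logs of failed containers; sample the successful ones
    // deterministically per container id rather than randomly.
    if (exitCode != 0) {
      return true;
    }
    long id = containerId.getContainerId();
    return (id % 100) < (long) (sampleRate * 100);
  }
}
{code}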
[jira] [Commented] (YARN-3718) hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable
[ https://issues.apache.org/jira/browse/YARN-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559584#comment-14559584 ] Karthik Kambatla commented on YARN-3718: We have container executors for each OS - default for all unix-based, Linux for linux, Windows for windows. Are you proposing adding a new executor for Solaris? If yes, we should mark it a new feature (instead of a bug) and update the title accordingly. hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable - Key: YARN-3718 URL: https://issues.apache.org/jira/browse/YARN-3718 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Environment: BSD OSX Solaris Windows Linux Reporter: Alan Burlison hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c makes use of the Linux-only Cgroups feature (http://en.wikipedia.org/wiki/Cgroups) when Hadoop is built on Linux, but there is no corresponding functionality for non-Linux platforms. Other platforms provide similar functionality, e.g. Solaris has an extensive range of resource management features (http://docs.oracle.com/cd/E23824_01/html/821-1460/index.html). Work is needed to abstract the resource management features of Yarn so that the same facilities for resource management can be provided on all platforms that provide the requisite functionality, -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3719) Improve Solaris support in YARN
[ https://issues.apache.org/jira/browse/YARN-3719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559598#comment-14559598 ] Alan Burlison commented on YARN-3719: - Solaris-related changes to HADOOP and HDFS are covered under the two top-level issues: HADOOP-11985 Improve Solaris support in Hadoop HDFS-8478 Improve Solaris support in HDFS Improve Solaris support in YARN --- Key: YARN-3719 URL: https://issues.apache.org/jira/browse/YARN-3719 Project: Hadoop YARN Issue Type: New Feature Components: build Affects Versions: 2.7.0 Environment: Solaris x86, Solaris sparc Reporter: Alan Burlison At present the YARN native components aren't fully supported on Solaris primarily due to differences between Linux and Solaris. This top-level task will be used to group together both existing and new issues related to this work. A second goal is to improve YARN performance and functionality on Solaris wherever possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS
[ https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559599#comment-14559599 ] Hudson commented on YARN-160: - SUCCESS: Integrated in Hadoop-trunk-Commit #7903 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7903/]) YARN-160. Enhanced NodeManager to automatically obtain cpu/memory values from underlying OS when configured to do so. Contributed by Varun Vasudev. (vinodkv: rev 500a1d9c76ec612b4e737888f4be79951c11591d) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/NodeManagerHardwareUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java * hadoop-tools/hadoop-gridmix/src/test/java/org/apache/hadoop/mapred/gridmix/DummyResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestCgroupsLCEResourcesHandler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/WindowsResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/LinuxResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/ResourceCalculatorPlugin.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/CgroupsLCEResourcesHandler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerExecutor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/util/TestNodeManagerHardwareUtils.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestLinuxResourceCalculatorPlugin.java nodemanagers should obtain cpu/memory values from underlying OS --- Key: YARN-160 URL: https://issues.apache.org/jira/browse/YARN-160 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.3-alpha Reporter: Alejandro Abdelnur Assignee: Varun Vasudev Labels: BB2015-05-TBR Fix For: 2.8.0 Attachments: YARN-160.005.patch, YARN-160.006.patch, YARN-160.007.patch, YARN-160.008.patch, apache-yarn-160.0.patch, apache-yarn-160.1.patch, apache-yarn-160.2.patch, apache-yarn-160.3.patch As mentioned in YARN-2 *NM memory and CPU configs* Currently these values are coming from the config of the NM, we should be able to obtain those values from the OS (ie, in the case of Linux from /proc/meminfo /proc/cpuinfo). 
As this is highly OS dependent we should have an interface that obtains this information. In addition implementations of this interface should be able to specify a mem/cpu offset (amount of mem/cpu not to be avail as YARN resource), this would allow to reserve mem/cpu for the OS and other services outside of YARN containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
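As a rough illustration of what obtaining cpu/memory values from the underlying OS means in practice, the sketch below derives NM capacity from ResourceCalculatorPlugin and subtracts a reserved offset for the OS and other daemons; the configuration keys and defaults are invented for this example, and the committed NodeManagerHardwareUtils logic differs in detail.
{code}
// Illustrative sketch only; not the committed NodeManagerHardwareUtils code.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.util.ResourceCalculatorPlugin;

final class NodeResourcesSketch {
  static int containersMemoryMb(ResourceCalculatorPlugin plugin, Configuration conf) {
    long physicalMb = plugin.getPhysicalMemorySize() / (1024 * 1024);
    // Reserve some memory for the OS and services outside YARN containers.
    long reservedMb = conf.getLong("example.nm.reserved-memory-mb", 2048);
    return (int) Math.max(physicalMb - reservedMb, 0);
  }

  static int containersVcores(ResourceCalculatorPlugin plugin, Configuration conf) {
    int logicalCores = plugin.getNumProcessors();
    // Optionally leave a core for the node's own daemons.
    int reservedCores = conf.getInt("example.nm.reserved-vcores", 1);
    return Math.max(logicalCores - reservedCores, 1);
  }
}
{code}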
[jira] [Commented] (YARN-3715) Oozie jobs are failed with IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') on
[ https://issues.apache.org/jira/browse/YARN-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559479#comment-14559479 ] Sergey Svinarchuk commented on YARN-3715: - Yes, it was configuration issue. Thanks Oozie jobs are failed with IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') on secure cluster with RM HA -- Key: YARN-3715 URL: https://issues.apache.org/jira/browse/YARN-3715 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Sergey Svinarchuk 2015-05-21 16:06:55,887 WARN ActionStartXCommand:544 - SERVER[centos6.localdomain] USER[mapr] GROUP[-] TOKEN[] APP[Hive] JOB[001-150521123655733-oozie-mapr-W] ACTION[001-150521123655733-oozie-mapr-W@Hive] Error starting action [Hive]. ErrorType [ERROR], ErrorCode [IllegalArgumentException], Message [IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address')] org.apache.oozie.action.ActionExecutorException: IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:401) at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:979) at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1134) at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:228) at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:63) at org.apache.oozie.command.XCommand.call(XCommand.java:281) at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:323) at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:252) at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:174) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.IllegalArgumentException: Does not contain a valid host:port authority: maprfs:/// (configuration property 'yarn.resourcemanager.address') at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:211) at org.apache.hadoop.conf.Configuration.getSocketAddr(Configuration.java:1788) at org.apache.hadoop.mapred.Master.getMasterAddress(Master.java:58) at org.apache.hadoop.mapred.Master.getMasterPrincipal(Master.java:67) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:114) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100) at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80) at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:127) at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:460) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:343) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285) at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566) at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562) at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557) at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548) at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:964) ... 10 more 2015-05-21 16:06:55,889 WARN ActionStartXCommand:544 - SERVER[centos6.localdomain] USER[mapr] GROUP[-] TOKEN[] APP[Hive] JOB[001-150521123655733-oozie-mapr-W] ACTION[001-150521123655733-oozie-mapr-W@Hive]
[jira] [Commented] (YARN-1012) NM should report resource utilization of running containers to RM in heartbeat
[ https://issues.apache.org/jira/browse/YARN-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559496#comment-14559496 ] Hadoop QA commented on YARN-1012: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 53s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 35s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 35s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 5s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 48s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 22s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 58s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 6m 17s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 49m 10s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735357/YARN-1012-7.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 022f49d | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8088/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8088/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8088/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8088/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8088/console | This message was automatically generated. NM should report resource utilization of running containers to RM in heartbeat -- Key: YARN-1012 URL: https://issues.apache.org/jira/browse/YARN-1012 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Reporter: Arun C Murthy Assignee: Inigo Goiri Attachments: YARN-1012-1.patch, YARN-1012-2.patch, YARN-1012-3.patch, YARN-1012-4.patch, YARN-1012-5.patch, YARN-1012-6.patch, YARN-1012-7.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3718) hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable
[ https://issues.apache.org/jira/browse/YARN-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Burlison updated YARN-3718: Issue Type: Sub-task (was: Bug) Parent: YARN-3719 hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable - Key: YARN-3718 URL: https://issues.apache.org/jira/browse/YARN-3718 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Environment: BSD OSX Solaris Windows Linux Reporter: Alan Burlison hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c makes use of the Linux-only Cgroups feature (http://en.wikipedia.org/wiki/Cgroups) when Hadoop is built on Linux, but there is no corresponding functionality for non-Linux platforms. Other platforms provide similar functionality, e.g. Solaris has an extensive range of resource management features (http://docs.oracle.com/cd/E23824_01/html/821-1460/index.html). Work is needed to abstract the resource management features of Yarn so that the same facilities for resource management can be provided on all platforms that provide the requisite functionality, -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS
[ https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559579#comment-14559579 ] Vinod Kumar Vavilapalli commented on YARN-160: -- Tx for the explanation, Varun. The new logic definitely makes sense to me. The patch looks good. Checking this in. nodemanagers should obtain cpu/memory values from underlying OS --- Key: YARN-160 URL: https://issues.apache.org/jira/browse/YARN-160 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.3-alpha Reporter: Alejandro Abdelnur Assignee: Varun Vasudev Labels: BB2015-05-TBR Attachments: YARN-160.005.patch, YARN-160.006.patch, YARN-160.007.patch, YARN-160.008.patch, apache-yarn-160.0.patch, apache-yarn-160.1.patch, apache-yarn-160.2.patch, apache-yarn-160.3.patch As mentioned in YARN-2 *NM memory and CPU configs* Currently these values are coming from the config of the NM, we should be able to obtain those values from the OS (ie, in the case of Linux from /proc/meminfo /proc/cpuinfo). As this is highly OS dependent we should have an interface that obtains this information. In addition implementations of this interface should be able to specify a mem/cpu offset (amount of mem/cpu not to be avail as YARN resource), this would allow to reserve mem/cpu for the OS and other services outside of YARN containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3718) hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable
[ https://issues.apache.org/jira/browse/YARN-3718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559590#comment-14559590 ] Karthik Kambatla commented on YARN-3718: Never mind. I see this is a subtask of YARN-3719. hadoop-yarn-server-nodemanager's use of Linux Cgroups is non-portable - Key: YARN-3718 URL: https://issues.apache.org/jira/browse/YARN-3718 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Environment: BSD OSX Solaris Windows Linux Reporter: Alan Burlison hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c makes use of the Linux-only Cgroups feature (http://en.wikipedia.org/wiki/Cgroups) when Hadoop is built on Linux, but there is no corresponding functionality for non-Linux platforms. Other platforms provide similar functionality, e.g. Solaris has an extensive range of resource management features (http://docs.oracle.com/cd/E23824_01/html/821-1460/index.html). Work is needed to abstract the resource management features of Yarn so that the same facilities for resource management can be provided on all platforms that provide the requisite functionality, -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559345#comment-14559345 ] Vinod Kumar Vavilapalli commented on YARN-3712: --- What is the effect of today's way of doing it synchronously? Interesting you mention time taking for cleaning docker containers. /cc [~ashahab], [~sidharta-s] who are looking into that area. ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously - Key: YARN-3712 URL: https://issues.apache.org/jira/browse/YARN-3712 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong Attachments: YARN-3712.01.patch, YARN-3712.02.patch It will save some time by handling event CLEANUP_CONTAINER asynchronously. This improvement will be useful for cases that cleaning up container cost a little long time(e.g. for our case: we are running Docker container on NM, it will take above 1 seconds to clean up one docker container. ) and many containers to clean up(e.g. NM need clean up all running containers when NM shutdown). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
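A minimal sketch of the asynchronous cleanup being discussed, assuming a plain thread pool takes over the CLEANUP_CONTAINER work so the launcher's event thread is not blocked by slow (e.g. Docker) teardown; names and pool size are illustrative and this is not the attached patch.
{code}
// Illustrative sketch; not the attached YARN-3712 patch.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class AsyncCleanupSketch {
  private final ExecutorService cleanupPool = Executors.newFixedThreadPool(4);

  void onCleanupContainerEvent(Runnable cleanupTask) {
    // Returns immediately; the event loop can process the next event while the
    // container (and, say, its Docker resources) is cleaned up in the background.
    cleanupPool.execute(cleanupTask);
  }

  void shutdown() throws InterruptedException {
    cleanupPool.shutdown();
    // Give in-flight cleanups a bounded amount of time during NM shutdown.
    cleanupPool.awaitTermination(30, TimeUnit.SECONDS);
  }
}
{code}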
[jira] [Commented] (YARN-1012) NM should report resource utilization of running containers to RM in heartbeat
[ https://issues.apache.org/jira/browse/YARN-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559351#comment-14559351 ] Inigo Goiri commented on YARN-1012: --- I checked the issue with testContainerStatusPBImpl and I cannot figure out what's wrong there. Am I missing any method in ResourceUtilization? I also updated the interfaces and made them Unstable and Private (which I think matches our scope). Regarding the unit test, how would you check? Would you check against context.getContainers()? This relates to your original question of where we should store this information (ContainerMetrics or ContainerStatus). NM should report resource utilization of running containers to RM in heartbeat -- Key: YARN-1012 URL: https://issues.apache.org/jira/browse/YARN-1012 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Reporter: Arun C Murthy Assignee: Inigo Goiri Attachments: YARN-1012-1.patch, YARN-1012-2.patch, YARN-1012-3.patch, YARN-1012-4.patch, YARN-1012-5.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
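For readers following the ResourceUtilization discussion above, a bare-bones value object carrying the heartbeat utilization fields could look like this; the real record added by the patch is a PB-backed YARN API class, and its exact fields are not reproduced here.
{code}
// Bare-bones illustration of the shape of the data; field choices are assumptions.
final class ResourceUtilizationSketch {
  private final int physicalMemoryMb;  // pmem actually used by the node's containers
  private final int virtualMemoryMb;   // vmem actually used by the node's containers
  private final float cpuFraction;     // CPU used, as a fraction of the node's capacity

  ResourceUtilizationSketch(int physicalMemoryMb, int virtualMemoryMb, float cpuFraction) {
    this.physicalMemoryMb = physicalMemoryMb;
    this.virtualMemoryMb = virtualMemoryMb;
    this.cpuFraction = cpuFraction;
  }

  int getPhysicalMemoryMb() { return physicalMemoryMb; }
  int getVirtualMemoryMb() { return virtualMemoryMb; }
  float getCpuFraction() { return cpuFraction; }
}
{code}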
[jira] [Commented] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559366#comment-14559366 ] Abin Shahab commented on YARN-3712: --- You can try changing the file system to aufs or overlayfs from the default devmapper. ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously - Key: YARN-3712 URL: https://issues.apache.org/jira/browse/YARN-3712 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong Attachments: YARN-3712.01.patch, YARN-3712.02.patch It will save some time by handling event CLEANUP_CONTAINER asynchronously. This improvement will be useful for cases that cleaning up container cost a little long time(e.g. for our case: we are running Docker container on NM, it will take above 1 seconds to clean up one docker container. ) and many containers to clean up(e.g. NM need clean up all running containers when NM shutdown). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3518) default rm/am expire interval should not be less than default resourcemanager connect wait time
[ https://issues.apache.org/jira/browse/YARN-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sandflee updated YARN-3518: --- Attachment: YARN-3518.003.patch default rm/am expire interval should not be less than default resourcemanager connect wait time Key: YARN-3518 URL: https://issues.apache.org/jira/browse/YARN-3518 Project: Hadoop YARN Issue Type: Bug Components: nodemanager, resourcemanager Reporter: sandflee Assignee: sandflee Labels: BB2015-05-TBR, configuration, newbie Attachments: YARN-3518.001.patch, YARN-3518.002.patch, YARN-3518.003.patch Take the AM for example: if the AM can't connect to the RM, then after the AM expiry interval (600s) the RM relaunches the AM, and there will be two AMs at the same time until the resourcemanager connect max wait time (900s) has passed. DEFAULT_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS = 15 * 60 * 1000; DEFAULT_RM_AM_EXPIRY_INTERVAL_MS = 600000; DEFAULT_RM_NM_EXPIRY_INTERVAL_MS = 600000; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
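To make the mismatch concrete, a hedged, standalone sanity check (not part of any attached patch) that compares the RM connect max-wait against the AM expiry interval using the existing YarnConfiguration keys. If the expiry interval is the shorter of the two, a second AM can be launched while the first is still retrying its RM connection.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Illustrative check only: warn when the AM expiry interval is shorter than
// the time a client keeps retrying the RM connection.
public class ExpiryIntervalCheck {
  public static void main(String[] args) {
    Configuration conf = new YarnConfiguration();
    long connectMaxWait = conf.getLong(
        YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS,
        YarnConfiguration.DEFAULT_RESOURCEMANAGER_CONNECT_MAX_WAIT_MS);
    long amExpiry = conf.getLong(
        YarnConfiguration.RM_AM_EXPIRY_INTERVAL_MS,
        YarnConfiguration.DEFAULT_RM_AM_EXPIRY_INTERVAL_MS);
    if (amExpiry < connectMaxWait) {
      System.err.println("AM expiry interval (" + amExpiry
          + " ms) is less than RM connect max wait (" + connectMaxWait
          + " ms); a second AM may run while the first is still retrying.");
    }
  }
}
{code}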
[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
[ https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559260#comment-14559260 ] Hudson commented on YARN-2336: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2155 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2155/]) YARN-2336. Fair scheduler's REST API returns a missing '[' bracket JSON for deep queue tree. Contributed by Kenji Kikushima and Akira Ajisaka. (ozawa: rev 9a3d617b6325d8918f2833c3e9ce329ecada9242) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfoList.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesCapacitySched.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/TestRMWebServicesFairScheduler.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/dao/FairSchedulerQueueInfo.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/webapp/JAXBContextResolver.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree -- Key: YARN-2336 URL: https://issues.apache.org/jira/browse/YARN-2336 Project: Hadoop YARN Issue Type: Bug Components: fairscheduler Affects Versions: 2.4.1, 2.6.0 Reporter: Kenji Kikushima Assignee: Akira AJISAKA Labels: BB2015-05-RFC Fix For: 2.8.0 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch When we have sub-queues in the Fair Scheduler, the REST API returns JSON with a missing '[' bracket for childQueues. This issue was found by [~ajisakaa] at YARN-1050. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
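A rough sketch of the wrapper-list idea suggested by the new FairSchedulerQueueInfoList DAO in the commit: exposing child queues through a dedicated list type gives the JSON marshaller an explicitly collection-valued element, so childQueues is emitted as a proper array even when there is only one child. The class and field names below are placeholders, not the committed code.
{code:java}
import java.util.ArrayList;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

// Illustrative wrapper-list sketch; the real DAO holds FairSchedulerQueueInfo
// objects rather than strings.
@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
public class QueueInfoListSketch {
  private ArrayList<String> queue = new ArrayList<String>();

  public ArrayList<String> getQueueInfoList() {
    return queue;
  }

  public void add(String q) {
    queue.add(q);
  }
}
{code}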
[jira] [Commented] (YARN-2238) filtering on UI sticks even if I move away from the page
[ https://issues.apache.org/jira/browse/YARN-2238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559262#comment-14559262 ] Hudson commented on YARN-2238: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2155 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2155/]) YARN-2238. Filtering on UI sticks even if I move away from the page. (xgong: rev 39077dba2e877420e7470df253f6154f6ecc64ec) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/view/JQueryUI.java filtering on UI sticks even if I move away from the page Key: YARN-2238 URL: https://issues.apache.org/jira/browse/YARN-2238 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 2.4.0 Reporter: Sangjin Lee Assignee: Jian He Labels: usability Fix For: 2.7.1 Attachments: YARN-2238.patch, YARN-2238.png, filtered.png The main data table in many web pages (RM, AM, etc.) seems to show an unexpected filtering behavior. If I filter the table by typing something in the key or value field (or I suspect any search field), the data table gets filtered. The example I used is the job configuration page for a MR job. That is expected. However, when I move away from that page and visit any other web page of the same type (e.g. a job configuration page), the page is rendered with the filtering! That is unexpected. What's even stranger is that it does not render the filtering term. As a result, I have a page that's mysteriously filtered but doesn't tell me what it's filtering on. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1012) NM should report resource utilization of running containers to RM in heartbeat
[ https://issues.apache.org/jira/browse/YARN-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Inigo Goiri updated YARN-1012: -- Attachment: YARN-1012-6.patch Changed annotations for ResourceUtilization. NM should report resource utilization of running containers to RM in heartbeat -- Key: YARN-1012 URL: https://issues.apache.org/jira/browse/YARN-1012 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Affects Versions: 2.7.0 Reporter: Arun C Murthy Assignee: Inigo Goiri Attachments: YARN-1012-1.patch, YARN-1012-2.patch, YARN-1012-3.patch, YARN-1012-4.patch, YARN-1012-5.patch, YARN-1012-6.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559362#comment-14559362 ] Sidharta Seethana commented on YARN-3712: - [~hex108] Are you referring to cleaning up the Docker image or the container instance itself? Which of these takes 1 second? If I remember correctly, the docker container executor uses a docker run option that automatically cleans up the container once it exits, so that cleanup becomes part of the container lifetime as far as the node manager is concerned. thanks, -Sidharta ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously - Key: YARN-3712 URL: https://issues.apache.org/jira/browse/YARN-3712 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong Attachments: YARN-3712.01.patch, YARN-3712.02.patch Handling the CLEANUP_CONTAINER event asynchronously will save some time. This improvement is useful when cleaning up a container takes a fairly long time (e.g. in our case we run Docker containers on the NM, and it takes more than 1 second to clean up one Docker container) and when there are many containers to clean up (e.g. the NM needs to clean up all running containers when it shuts down). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3685) NodeManager unnecessarily knows about classpath-jars due to Windows limitations
[ https://issues.apache.org/jira/browse/YARN-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559309#comment-14559309 ] Vinod Kumar Vavilapalli commented on YARN-3685: --- bq. Perhaps it's possible to move the classpath jar generation to the MR client or AM. It's not immediately obvious to me which of those 2 choices is better. For the AM container, the client is the right place. For the rest of the tasks, the AM is. bq. We'd need to change the manifest to use relative paths in the Class-Path attribute instead of absolute paths. (The client and AM are not aware of the exact layout of the NodeManager's yarn.nodemanager.local-dirs, so the client can't predict the absolute paths at time of container launch.) I think this was one of the chief issues in the original patches - we need to investigate whether the manifest file can have relative paths or not. Otherwise, it's ugly but we can still get YARN to replace some sort of markers only in specific files like the manifest. bq. Some classpath entries are defined in terms of environment variables. These environment variables are expanded at the NodeManager via the container launch scripts. This was true of Linux even before YARN-316, so in that sense, YARN did already have some classpath logic indirectly. Which ones are these? bq. If we do move classpath handling out of the NodeManager, then it would be a backwards-incompatible change, and so it could not be shipped in the 2.x release line. It's not clear whether this is true. We'd have to see the final solution/patch to realistically reason about this. NodeManager unnecessarily knows about classpath-jars due to Windows limitations --- Key: YARN-3685 URL: https://issues.apache.org/jira/browse/YARN-3685 Project: Hadoop YARN Issue Type: Sub-task Reporter: Vinod Kumar Vavilapalli Assignee: Vinod Kumar Vavilapalli Found this while looking at cleaning up ContainerExecutor via YARN-3648, making it a sub-task. YARN *should not* know about classpaths. Our original design was modeled around this. But when we added Windows support, due to classpath issues, we ended up breaking this abstraction via YARN-316. We should clean this up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
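On the open question of relative paths in the manifest, a small hypothetical sketch (not from any patch) of writing a classpath jar whose Class-Path attribute uses relative entries. The jar specification resolves Class-Path entries relative to the jar's own location, so relative entries are allowed in principle; whether that works with YARN's localization layout is exactly what would need to be verified.
{code:java}
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Illustrative only: a classpath jar that exists solely to carry a manifest
// with relative Class-Path entries.
public class ClasspathJarSketch {
  public static void main(String[] args) throws IOException {
    Manifest manifest = new Manifest();
    Attributes attrs = manifest.getMainAttributes();
    attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
    // Hypothetical relative layout under the container's working directory.
    attrs.put(Attributes.Name.CLASS_PATH, "lib/dep1.jar lib/dep2.jar conf/");
    try (JarOutputStream jos =
             new JarOutputStream(new FileOutputStream("classpath.jar"), manifest)) {
      // No entries needed; the jar only carries the manifest.
    }
  }
}
{code}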
[jira] [Commented] (YARN-160) nodemanagers should obtain cpu/memory values from underlying OS
[ https://issues.apache.org/jira/browse/YARN-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559338#comment-14559338 ] Hadoop QA commented on YARN-160: \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 40s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 5 new or modified test files. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 38s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 6s | The applied patch generated 1 new checkstyle issues (total was 214, now 215). | | {color:green}+1{color} | whitespace | 0m 28s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | tools/hadoop tests | 14m 39s | Tests passed in hadoop-gridmix. | | {color:green}+1{color} | yarn tests | 0m 24s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 1m 58s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 6m 8s | Tests passed in hadoop-yarn-server-nodemanager. | | | | 65m 20s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12735336/YARN-160.008.patch | | Optional Tests | javac unit findbugs checkstyle javadoc | | git revision | trunk / 022f49d | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8085/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-gridmix test log | https://builds.apache.org/job/PreCommit-YARN-Build/8085/artifact/patchprocess/testrun_hadoop-gridmix.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8085/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8085/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8085/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8085/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8085/console | This message was automatically generated. 
nodemanagers should obtain cpu/memory values from underlying OS --- Key: YARN-160 URL: https://issues.apache.org/jira/browse/YARN-160 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.0.3-alpha Reporter: Alejandro Abdelnur Assignee: Varun Vasudev Labels: BB2015-05-TBR Attachments: YARN-160.005.patch, YARN-160.006.patch, YARN-160.007.patch, YARN-160.008.patch, apache-yarn-160.0.patch, apache-yarn-160.1.patch, apache-yarn-160.2.patch, apache-yarn-160.3.patch As mentioned in YARN-2 *NM memory and CPU configs*: currently these values come from the NM's config, but we should be able to obtain them from the OS (i.e., in the case of Linux, from /proc/meminfo and /proc/cpuinfo). As this is highly OS dependent, we should have an interface that obtains this information. In addition, implementations of this interface should be able to specify a mem/cpu offset (the amount of mem/cpu not to be made available as a YARN resource); this would allow reserving mem/cpu for the OS and other services outside of YARN containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
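A hypothetical sketch of the kind of interface the description asks for: OS-specific detection of totals plus a configurable offset reserved for the OS and other daemons. The names and the offset handling below are assumptions for illustration, not the API from the attached patches.
{code:java}
// Hypothetical interface sketch; an OS-specific implementation would read
// totals from the platform (e.g. /proc/meminfo and /proc/cpuinfo on Linux).
public interface NodeResourceDetectorSketch {

  /** Total physical memory reported by the OS, in MB. */
  long getTotalMemoryMB();

  /** Number of processors reported by the OS. */
  int getNumProcessors();

  /** Memory to keep out of YARN's hands, in MB. */
  long getMemoryOffsetMB();

  /** Processors to keep out of YARN's hands. */
  int getProcessorOffset();

  /** Memory the NM should advertise to the RM. */
  default long getAvailableMemoryMB() {
    return Math.max(0, getTotalMemoryMB() - getMemoryOffsetMB());
  }

  /** Vcores the NM should advertise to the RM. */
  default int getAvailableVcores() {
    return Math.max(0, getNumProcessors() - getProcessorOffset());
  }
}
{code}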
[jira] [Commented] (YARN-3652) A SchedulerMetrics may be needed for evaluating the scheduler's performance
[ https://issues.apache.org/jira/browse/YARN-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559347#comment-14559347 ] Vinod Kumar Vavilapalli commented on YARN-3652: --- I haven't looked at the original SchedulerMetrics patches, but pointed them out as they seemed relevant. [~vvasudev], can you please comment on this? A SchedulerMetrics may be needed for evaluating the scheduler's performance - Key: YARN-3652 URL: https://issues.apache.org/jira/browse/YARN-3652 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager, scheduler Reporter: Xianyin Xin As discussed in YARN-3630, a {{SchedulerMetrics}} may be needed for evaluating the scheduler's performance. The performance indexes include the number of events waiting to be handled by the scheduler, the throughput, the scheduling delay, and/or other indicators. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
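To make the proposal concrete, a hypothetical sketch using Hadoop's metrics2 library of the indicators the description lists (pending scheduler events, per-event handling time, scheduling delay). All class, metric, and method names here are invented for illustration and are not from any attached patch.
{code:java}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableGaugeInt;
import org.apache.hadoop.metrics2.lib.MutableRate;

// Illustrative metrics source only.
@Metrics(context = "yarn")
public class SchedulerMetricsSketch {

  @Metric("Events waiting in the scheduler dispatcher queue")
  MutableGaugeInt pendingSchedulerEvents;

  @Metric("Time taken to handle one scheduler event (ms)")
  MutableRate eventHandlingTime;

  @Metric("Delay between a request arriving and a container being allocated (ms)")
  MutableRate schedulingDelay;

  public static SchedulerMetricsSketch create() {
    return DefaultMetricsSystem.instance().register(
        "SchedulerMetricsSketch", "Scheduler performance metrics",
        new SchedulerMetricsSketch());
  }

  public void setPendingEvents(int n) { pendingSchedulerEvents.set(n); }
  public void recordEventHandlingTime(long ms) { eventHandlingTime.add(ms); }
  public void recordSchedulingDelay(long ms) { schedulingDelay.add(ms); }
}
{code}
Throughput could then be derived from the event-handling rate, while the gauge exposes backlog directly, which covers the indicators mentioned in the description.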