[jira] [Updated] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.
[ https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated YARN-221: - Attachment: YARN-221-trunk-v4.patch Updated patch to fix warnings. > NM should provide a way for AM to tell it not to aggregate logs. > > > Key: YARN-221 > URL: https://issues.apache.org/jira/browse/YARN-221 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Robert Joseph Evans >Assignee: Ming Ma > Labels: BB2015-05-TBR > Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, > YARN-221-trunk-v3.patch, YARN-221-trunk-v4.patch > > > The NodeManager should provide a way for an AM to tell it that either the > logs should not be aggregated, that they should be aggregated with a high > priority, or that they should be aggregated but with a lower priority. The > AM should be able to do this in the ContainerLaunch context to provide a > default value, but should also be able to update the value when the container > is released. > This would allow for the NM to not aggregate logs in some cases, and avoid > connection to the NN at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3521) Support return structured NodeLabel objects in REST API when call getClusterNodeLabels
[ https://issues.apache.org/jira/browse/YARN-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536241#comment-14536241 ] Hadoop QA commented on YARN-3521: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 49s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 43s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 50s | The applied patch generated 5 new checkstyle issues (total was 61, now 54). | | {color:red}-1{color} | whitespace | 0m 6s | The patch has 22 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 15s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 52m 25s | Tests passed in hadoop-yarn-server-resourcemanager. 
| | | | 89m 15s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731679/0007-YARN-3521.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 6471d18 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/7844/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/7844/artifact/patchprocess/whitespace.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7844/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7844/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7844/console | This message was automatically generated. > Support return structured NodeLabel objects in REST API when call > getClusterNodeLabels > -- > > Key: YARN-3521 > URL: https://issues.apache.org/jira/browse/YARN-3521 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, client, resourcemanager >Reporter: Wangda Tan >Assignee: Sunil G > Attachments: 0001-YARN-3521.patch, 0002-YARN-3521.patch, > 0003-YARN-3521.patch, 0004-YARN-3521.patch, 0005-YARN-3521.patch, > 0006-YARN-3521.patch, 0007-YARN-3521.patch > > > In YARN-3413, the yarn cluster CLI returns NodeLabel instead of String; we should > make the same change on the REST API side to make them consistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536236#comment-14536236 ] Hadoop QA commented on YARN-3044: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 52s | Pre-patch YARN-2928 compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 7m 40s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 48s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 24s | The applied patch generated 3 new checkstyle issues (total was 273, now 275). | | {color:red}-1{color} | checkstyle | 1m 46s | The applied patch generated 1 new checkstyle issues (total was 9, now 10). | | {color:green}+1{color} | whitespace | 0m 3s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 41s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 39s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 16s | The patch appears to introduce 14 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 53m 13s | Tests passed in hadoop-yarn-server-resourcemanager. | | {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-server-timelineservice. 
| | | | 94m 12s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-server-resourcemanager | | | Unchecked/unconfirmed cast from org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsEvent to org.apache.hadoop.yarn.server.resourcemanager.metrics.AppAttemptFinishedEvent in org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent) At TimelineServiceV1Publisher.java:org.apache.hadoop.yarn.server.resourcemanager.metrics.AppAttemptFinishedEvent in org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent) At TimelineServiceV1Publisher.java:[line 103] | | | Unchecked/unconfirmed cast from org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsEvent to org.apache.hadoop.yarn.server.resourcemanager.metrics.AppAttemptRegisteredEvent in org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent) At TimelineServiceV1Publisher.java:org.apache.hadoop.yarn.server.resourcemanager.metrics.AppAttemptRegisteredEvent in org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent) At TimelineServiceV1Publisher.java:[line 100] | | | Unchecked/unconfirmed cast from org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsEvent to org.apache.hadoop.yarn.server.resourcemanager.metrics.ApplicationACLsUpdatedEvent in org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent) At TimelineServiceV1Publisher.java:org.apache.hadoop.yarn.server.resourcemanager.metrics.ApplicationACLsUpdatedEvent in org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent) At TimelineServiceV1Publisher.java:[line 97] | | | Unchecked/unconfirmed cast from org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsEvent to 
org.apache.hadoop.yarn.server.resourcemanager.metrics.ApplicationCreatedEvent in org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent) At TimelineServiceV1Publisher.java:org.apache.hadoop.yarn.server.resourcemanager.metrics.ApplicationCreatedEvent in org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent) At TimelineServiceV1Publisher.java:[line 91] | | | Unchecked/unconfirmed cast from org.apache.hadoop.yarn.server.resourcemanager.metrics.SystemMetricsEvent to org.apache.hadoop.yarn.server.resourcemanager.metrics.ApplicationFinishedEvent in org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent) At TimelineServiceV1Publisher.java:org.apache.hadoop.yarn.server.resourcemanager.metrics.ApplicationFinishedEvent in org.apache.hadoop.yarn.server.resourcemanager.metrics.TimelineServiceV1Publisher.handle(SystemMetricsEvent) At TimelineServiceV1Publisher.java:[line 94] | | | Unchecked/unconfirmed c
[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage
[ https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536225#comment-14536225 ] Sangjin Lee commented on YARN-3411: --- Thanks [~vrushalic] for the latest patch. I took a quick look at it, and have a few comments. I think we should fill in those couple of to-dos and get a patch ready.
(TimelineServicePerformanceV2.java)
- you might need to update the patch to be based on the latest from the branch; e.g. addTimeSeries() -> addValues()
(pom.xml)
- it might be good to follow the standard practice (AFAIK) and specify the hbase dependency version in hadoop-project/pom.xml
(CreateSchema.java)
- l.57: it's not necessary to add hbase-site.xml manually if you're creating an explicit HBaseConfiguration. It does it by default
- l.58: the connection should be closed at the end
- We don't have to do this right now, but it might be good to expand this tool to be able to drop the schema and recreate it (drop and create)
(HBaseTimelineWriterImpl.java)
- l.71: let's make this a debug logging statement (also with an isDebugEnabled() call)
- l.75: no need to override init() (the base implementation calls serviceInit() anyway)
- l.146: since you're getting both key and value, using entrySet() vs. keySet() is a little more efficient (the same for other iterations)
- l.225: style nit: you can simply define String key and Object value inside the loop (there is no real savings for reusing the variables):
{code}
for (Map.Entry<String, ?> entry : configs.entrySet()) {
  String key = entry.getKey();
  Object value = entry.getValue();
  ...
}
{code}
- l.244: you want to check the type of the metric (single value vs. time series) first and handle them differently, right?
(TimelineWriterUtils.java) - l.123: it seems like Range should be a top level class rather than an inner class for EntityTableDetails; it's shared by several classes already > [Storage implementation] explore the native HBase write schema for storage > -- > > Key: YARN-3411 > URL: https://issues.apache.org/jira/browse/YARN-3411 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Vrushali C >Priority: Critical > Attachments: ATSv2BackendHBaseSchemaproposal.pdf, > YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, YARN-3411.poc.4.txt, > YARN-3411.poc.5.txt, YARN-3411.poc.txt > > > There is work that's in progress to implement the storage based on a Phoenix > schema (YARN-3134). > In parallel, we would like to explore an implementation based on a native > HBase schema for the write path. Such a schema does not exclude using > Phoenix, especially for reads and offline queries. > Once we have basic implementations of both options, we could evaluate them in > terms of performance, scalability, usability, etc. and make a call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
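The entrySet() point in the review above can be illustrated with a standalone sketch (the map contents and the sumLengths helper are hypothetical, not taken from the patch):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EntrySetDemo {
    // Iterating entrySet() retrieves key and value in a single pass,
    // avoiding the extra per-key get() lookup that a keySet() loop needs.
    public static int sumLengths(Map<String, Object> configs) {
        int total = 0;
        for (Map.Entry<String, Object> entry : configs.entrySet()) {
            String key = entry.getKey();
            Object value = entry.getValue();
            total += key.length() + String.valueOf(value).length();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Object> configs = new LinkedHashMap<>();
        configs.put("a", "xy"); // 1 + 2
        configs.put("bb", 7);   // 2 + 1
        System.out.println(sumLengths(configs)); // prints 6
    }
}
```

The same shape applies to any hot-path iteration over a map where both key and value are needed.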
[jira] [Commented] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.
[ https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536205#comment-14536205 ] Hadoop QA commented on YARN-221: \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 38s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:red}-1{color} | javac | 7m 31s | The applied patch generated 122 additional warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 58s | There were no new checkstyle issues. | | {color:red}-1{color} | whitespace | 1m 2s | The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 46s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 0m 22s | Tests passed in hadoop-yarn-api. | | {color:red}-1{color} | yarn tests | 1m 55s | Tests failed in hadoop-yarn-common. | | {color:red}-1{color} | yarn tests | 5m 59s | Tests failed in hadoop-yarn-server-nodemanager. 
| | | | 49m 26s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.conf.TestYarnConfigurationFields | | | hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731667/YARN-221-trunk-v3.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 6471d18 | | javac | https://builds.apache.org/job/PreCommit-YARN-Build/7841/artifact/patchprocess/diffJavacWarnings.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/7841/artifact/patchprocess/whitespace.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/7841/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/7841/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7841/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7841/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7841/console | This message was automatically generated. > NM should provide a way for AM to tell it not to aggregate logs. 
> > > Key: YARN-221 > URL: https://issues.apache.org/jira/browse/YARN-221 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Robert Joseph Evans >Assignee: Ming Ma > Labels: BB2015-05-TBR > Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, > YARN-221-trunk-v3.patch > > > The NodeManager should provide a way for an AM to tell it that either the > logs should not be aggregated, that they should be aggregated with a high > priority, or that they should be aggregated but with a lower priority. The > AM should be able to do this in the ContainerLaunch context to provide a > default value, but should also be able to update the value when the container > is released. > This would allow for the NM to not aggregate logs in some cases, and avoid > connection to the NN at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
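The per-container behavior the description asks for can be sketched as follows; all names here (Policy, ContainerLogContext, updateOnRelease) are hypothetical illustrations, not the actual YARN API or the attached patches:

```java
public class LogAggregationPolicyDemo {
    // Hypothetical policy values mirroring the three cases in the
    // description: don't aggregate, aggregate high, aggregate low.
    enum Policy { DO_NOT_AGGREGATE, AGGREGATE_HIGH_PRIORITY, AGGREGATE_LOW_PRIORITY }

    static class ContainerLogContext {
        private Policy policy;

        // Default supplied by the AM in the container launch context.
        ContainerLogContext(Policy launchDefault) {
            this.policy = launchDefault;
        }

        // The AM may revise the decision when the container is released.
        void updateOnRelease(Policy updated) {
            this.policy = updated;
        }

        // When false, the NM can skip aggregation and never contact the NN.
        boolean shouldAggregate() {
            return policy != Policy.DO_NOT_AGGREGATE;
        }
    }

    public static void main(String[] args) {
        ContainerLogContext ctx =
            new ContainerLogContext(Policy.AGGREGATE_LOW_PRIORITY);
        ctx.updateOnRelease(Policy.DO_NOT_AGGREGATE);
        System.out.println(ctx.shouldAggregate()); // prints false
    }
}
```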
[jira] [Commented] (YARN-1912) ResourceLocalizer started without any jvm memory control
[ https://issues.apache.org/jira/browse/YARN-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536203#comment-14536203 ] Hudson commented on YARN-1912: -- FAILURE: Integrated in Hadoop-trunk-Commit #7781 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7781/]) YARN-1912. ResourceLocalizer started without any jvm memory control. (xgong: rev 6471d18bc72bc6c83ce31a03b5c5f5737847bb6d) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ContainerLocalizer.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutorWithMocks.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/WindowsSecureContainerExecutor.java > ResourceLocalizer started without any jvm memory control > > > Key: YARN-1912 > URL: https://issues.apache.org/jira/browse/YARN-1912 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.2.0 >Reporter: stanley shi >Assignee: Masatake Iwasaki > Labels: BB2015-05-RFC > Fix For: 2.8.0 > > Attachments: YARN-1912-0.patch, YARN-1912-1.patch, > YARN-1912.003.patch, YARN-1912.004.patch > > > In the LinuxContainerExecutor.java#startLocalizer, it does not specify any > "-Xmx" configurations in the command, this caused the ResourceLocalizer to be > started with default memory setting. 
> On server-level hardware, the JVM will use 25% of the system memory as the max > heap size, which can cause memory issues in some cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
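A minimal sketch of the kind of fix the issue calls for: explicitly bounding the localizer JVM heap when building its launch command. The buildCommand helper and the 512 MB figure are illustrative assumptions, not the committed patch:

```java
import java.util.ArrayList;
import java.util.List;

public class LocalizerCommand {
    // Without an explicit -Xmx, the JVM on server-class machines defaults
    // its max heap to roughly 25% of physical memory, which is the problem
    // described above. Passing a bounded -Xmx avoids that default.
    public static List<String> buildCommand(int heapMb) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Xmx" + heapMb + "m");
        cmd.add("org.apache.hadoop.yarn.server.nodemanager"
                + ".containermanager.localizer.ContainerLocalizer");
        return cmd;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", buildCommand(512)));
    }
}
```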
[jira] [Commented] (YARN-3505) Node's Log Aggregation Report with SUCCEED should not cached in RMApps
[ https://issues.apache.org/jira/browse/YARN-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536199#comment-14536199 ] Hadoop QA commented on YARN-3505: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 4s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 42s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 4s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 50s | The applied patch generated 1 new checkstyle issues (total was 1, now 2). | | {color:red}-1{color} | checkstyle | 2m 7s | The applied patch generated 2 new checkstyle issues (total was 70, now 63). | | {color:green}+1{color} | whitespace | 0m 19s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 43s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 34s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 0m 30s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-server-common. | | {color:red}-1{color} | yarn tests | 5m 53s | Tests failed in hadoop-yarn-server-nodemanager. | | {color:red}-1{color} | yarn tests | 27m 33s | Tests failed in hadoop-yarn-server-resourcemanager. 
| | | | 77m 15s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService | | | hadoop.yarn.server.resourcemanager.security.TestAMRMTokens | | | hadoop.yarn.server.resourcemanager.TestResourceTrackerService | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestWorkPreservingRMRestartForNodeLabel | | | hadoop.yarn.server.resourcemanager.TestContainerResourceUsage | | | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCNodeUpdates | | | hadoop.yarn.server.resourcemanager.TestResourceManager | | | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRMRPCResponseId | | | hadoop.yarn.server.resourcemanager.TestRM | | | hadoop.yarn.server.resourcemanager.TestApplicationMasterService | | | hadoop.yarn.server.resourcemanager.TestClientRMService | | | hadoop.yarn.server.resourcemanager.webapp.TestAppPage | | | hadoop.yarn.server.resourcemanager.resourcetracker.TestRMNMRPCResponseId | | | hadoop.yarn.server.resourcemanager.TestApplicationMasterLauncher | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | | | hadoop.yarn.server.resourcemanager.logaggregationstatus.TestRMAppLogAggregationStatus | | | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | | | hadoop.yarn.server.resourcemanager.security.TestClientToAMTokens | | | hadoop.yarn.server.resourcemanager.TestRMRestart | | | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | | | hadoop.yarn.server.resourcemanager.webapp.TestRMWebServicesHttpStaticUserPermissions | | | hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter | | | hadoop.yarn.server.resourcemanager.scheduler.TestSchedulerUtils | | | hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA | | | hadoop.yarn.server.resourcemanager.TestApplicationCleanup | | | hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler | | | 
hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerNodeLabelUpdate | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731665/YARN-3505.3.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 6471d18 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/7836/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt https://builds.apache.org/job/PreCommit-YARN-Build/7836/artifact/patchprocess/diffcheckstylehadoop-yarn-server-common.txt | | hadoop-yarn-api test log | https://builds.apache.org/
[jira] [Commented] (YARN-3344) procfs stat file is not in the expected format warning
[ https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536197#comment-14536197 ] Hadoop QA commented on YARN-3344: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 2s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 47s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 50s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 50s | The applied patch generated 1 new checkstyle issues (total was 43, now 43). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 25s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 1m 59s | Tests passed in hadoop-yarn-common. 
| | | | 39m 29s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12704909/YARN-3344-branch-trunk.003.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 6471d18 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/7838/artifact/patchprocess/diffcheckstylehadoop-yarn-common.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/7838/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7838/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7838/console | This message was automatically generated. > procfs stat file is not in the expected format warning > -- > > Key: YARN-3344 > URL: https://issues.apache.org/jira/browse/YARN-3344 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Jon Bringhurst >Assignee: Ravindra Kumar Naik > Labels: BB2015-05-RFC > Attachments: YARN-3344-branch-trunk.001.patch, > YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch > > > Although this doesn't appear to be causing any functional issues, it is > spamming our log files quite a bit. :) > It appears that the regex in ProcfsBasedProcessTree doesn't work for all > /proc/<pid>/stat files.
> Here's the error I'm seeing: > {noformat} > "source_host": "asdf", > "method": "constructProcessInfo", > "level": "WARN", > "message": "Unexpected: procfs stat file is not in the expected format > for process with pid 6953" > "file": "ProcfsBasedProcessTree.java", > "line_number": "514", > "class": "org.apache.hadoop.yarn.util.ProcfsBasedProcessTree", > {noformat} > And here's the basic info on process with pid 6953: > {noformat} > [asdf ~]$ cat /proc/6953/stat > 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 > 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 > 2 18446744073709551615 0 0 17 13 0 0 0 0 0 > [asdf ~]$ ps aux|grep 6953 > root 6953 0.0 0.0 200484 23424 ?S21:44 0:00 python2.6 > /export/apps/salt/minion-scripts/module-sync.py > jbringhu 13481 0.0 0.0 105312 872 pts/0S+ 22:13 0:00 grep -i 6953 > [asdf ~]$ > {noformat} > This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
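One robust way to handle command names that contain spaces, as in the stat line above ("python2.6 /expo"), is to isolate the name between the first '(' and the *last* ')' rather than splitting on whitespace. This is an illustrative sketch, not the actual regex used by ProcfsBasedProcessTree:

```java
public class StatLineParser {
    // /proc/<pid>/stat wraps the command name in parentheses, and the name
    // itself may contain spaces and even ')' characters, which breaks a
    // naive whitespace split. Using lastIndexOf(')') is safe because the
    // comm field is the only parenthesized field in the line.
    public static String commandName(String statLine) {
        int open = statLine.indexOf('(');
        int close = statLine.lastIndexOf(')');
        if (open < 0 || close < open) {
            return null; // not in the expected format
        }
        return statLine.substring(open + 1, close);
    }

    public static void main(String[] args) {
        String line = "6953 (python2.6 /expo) S 1871 1871 1871 0 -1";
        System.out.println(commandName(line)); // prints: python2.6 /expo
    }
}
```

The remaining numeric fields can then be split on whitespace after the closing parenthesis.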
[jira] [Commented] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done
[ https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536192#comment-14536192 ] Hadoop QA commented on YARN-3324: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 5m 14s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 19s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 21s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 31s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 1s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:red}-1{color} | yarn tests | 5m 56s | Tests failed in hadoop-yarn-server-nodemanager. 
| | | | 22m 34s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12703955/YARN-3324-branch-2.6.0.002.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / 6471d18 | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7842/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7842/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7842/console | This message was automatically generated. > TestDockerContainerExecutor should clean test docker image from local > repository after test is done > --- > > Key: YARN-3324 > URL: https://issues.apache.org/jira/browse/YARN-3324 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.6.0 >Reporter: Chen He > Labels: BB2015-05-TBR > Attachments: YARN-3324-branch-2.6.0.002.patch, > YARN-3324-trunk.002.patch > > > Current TestDockerContainerExecutor only cleans the temp directory in local > file system but leaves the test docker image in local docker repository. It > should be cleaned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3595) Performance optimization using connection cache of Phoenix timeline writer
[ https://issues.apache.org/jira/browse/YARN-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536187#comment-14536187 ] Sangjin Lee commented on YARN-3595: --- The guava loading cache in YARN-3134 doesn't really work here because the cache semantics are different from the pooling semantics. The only thing you can do with a cache is "get", but the cache has no way of knowing the connection is checked out and in active use. So either the size or the "idleness" from the cache point of view can always kick in and cause havoc. I believe what we need is basically connection pooling. JDBC itself has the connection pooling API (ConnectionPoolDataSource), into which pooling implementations can plug (e.g. Apache DBCP). That sounds to me like the most natural way of doing this. > Performance optimization using connection cache of Phoenix timeline writer > -- > > Key: YARN-3595 > URL: https://issues.apache.org/jira/browse/YARN-3595 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > > The story about the connection cache in Phoenix timeline storage is a little > bit long. In YARN-3033 we planned to have a shared writer layer for all > collectors in the same collector manager. In this way we can better reuse the > same heavy-weight storage layer connection, therefore it's more friendly to > conventional storage layer connections which are typically heavy-weight. > Phoenix, on the other hand, implements its own connection interface layer to > be light-weight and thread-unsafe. To make these connections work with our > "multiple collector, single writer" model, we're adding a thread-indexed > connection cache. However, many performance critical factors are yet to be > tested. > In this JIRA we're tracking performance optimization efforts using this > connection cache. 
Previously we had a draft, but there was one implementation > challenge with cache evictions: there may be races between the Guava cache's > removal listener calls (which close the connection) and normal references to > the connection. We need to carefully define how they synchronize. > Performance-wise, at this early stage we need to understand: > # whether the current thread-based indexing is an appropriate approach, or whether we can > use better ways to index the connections; > # the best size of the cache, presumably as the proposed default value of a > configuration option; > # how long we need to preserve a connection in the cache. > Please feel free to add to this list. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
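The checkout/checkin semantics Sangjin contrasts with a cache "get" can be sketched with a plain BlockingQueue. This is a minimal illustration of the pooling idea only — not the Phoenix writer code, not DBCP, and the class name is hypothetical: unlike a cache entry, a checked-out connection is invisible to other threads until it is returned, so the pool can never evict a connection that is in active use.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Minimal checkout/checkin pool sketch (hypothetical class, for illustration). */
public class SimpleConnectionPool<T> {
  private final BlockingQueue<T> idle;

  public SimpleConnectionPool(Iterable<T> connections, int capacity) {
    idle = new ArrayBlockingQueue<>(capacity);
    for (T c : connections) {
      idle.add(c);
    }
  }

  /** Blocks until an idle connection is available, then checks it out. */
  public T checkout() {
    try {
      return idle.take();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for a connection", e);
    }
  }

  /** Returns a connection to the pool, making it visible to other threads again. */
  public void checkin(T connection) {
    idle.add(connection);
  }

  /** Number of connections currently idle (not checked out). */
  public int idleCount() {
    return idle.size();
  }
}
```

A real implementation would plug into ConnectionPoolDataSource or reuse DBCP rather than hand-rolling this, but the checkout state is the key difference from a Guava cache: the pool knows exactly which connections are in use.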
[jira] [Commented] (YARN-1917) Add "waitForApplicationState" interface to YarnClient
[ https://issues.apache.org/jira/browse/YARN-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536182#comment-14536182 ] Zhijie Shen commented on YARN-1917: --- Wangda, sorry for raising one more issue after +1, but I just noticed that the new API is not marked with \@Stable, which would be good to have at the beginning. Thoughts? BTW, some lines are over 80 chars; you may want to fix them. > Add "waitForApplicationState" interface to YarnClient > - > > Key: YARN-1917 > URL: https://issues.apache.org/jira/browse/YARN-1917 > Project: Hadoop YARN > Issue Type: New Feature > Components: client >Affects Versions: 2.4.0 >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-1917.20150501.1.patch, YARN-1917.20150508.1.patch, > YARN-1917.patch, YARN-1917.patch, YARN-1917.patch > > > Currently, YARN doesn't have this method. Users need to write > implementations like UnmanagedAMLauncher.monitorApplication or > mapreduce.Job.monitorAndPrintJob on their own. This feature should be helpful > to end users. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
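The monitoring loop that users currently hand-roll (and that waitForApplicationState would wrap) boils down to polling a state supplier until it reaches a desired state or a deadline passes. The sketch below is a generic, self-contained version of that loop — the class and method names are illustrative and are not the actual YarnClient API from the patch:

```java
import java.util.Set;
import java.util.function.Supplier;

/** Generic poll-until-state helper (illustrative, not the YarnClient API). */
public final class StateWaiter {
  private StateWaiter() {
  }

  /**
   * Polls currentState until it returns one of the desired states or the
   * timeout elapses. Returns the matching state, or null on timeout or
   * interruption.
   */
  public static <S> S waitForState(Supplier<S> currentState, Set<S> desired,
      long timeoutMs, long pollIntervalMs) {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (true) {
      S state = currentState.get();
      if (desired.contains(state)) {
        return state;
      }
      if (System.currentTimeMillis() >= deadline) {
        return null;  // timed out before reaching a desired state
      }
      try {
        Thread.sleep(pollIntervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return null;  // treat interruption like a timeout
      }
    }
  }
}
```

In the real client the supplier would call getApplicationReport and extract YarnApplicationState; the value of putting this in YarnClient is that every caller gets the deadline and interrupt handling for free.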
[jira] [Commented] (YARN-3261) rewrite resourcemanager restart doc to remove roadmap bits
[ https://issues.apache.org/jira/browse/YARN-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536179#comment-14536179 ] Hadoop QA commented on YARN-3261: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 2m 50s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 2m 55s | Site still builds. | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix. | | | | 6m 8s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12704385/YARN-3261.01.patch | | Optional Tests | site | | git revision | trunk / 6471d18 | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/7840/artifact/patchprocess/whitespace.txt | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7840/console | This message was automatically generated. > rewrite resourcemanager restart doc to remove roadmap bits > --- > > Key: YARN-3261 > URL: https://issues.apache.org/jira/browse/YARN-3261 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Reporter: Allen Wittenauer >Assignee: Gururaj Shetty > Labels: BB2015-05-TBR > Attachments: YARN-3261.01.patch > > > Another mixture of roadmap and instruction manual that seems to be ever > present in a lot of the recently written documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3276) Refactor and fix null casting in some map cast for TimelineEntity (old and new) and fix findbug warnings
[ https://issues.apache.org/jira/browse/YARN-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536177#comment-14536177 ] Zhijie Shen commented on YARN-3276: --- Some comments about the latest patch. 1. TimelineServiceUtils -> TimelineServiceHelper? 2. Is mapreduce using it? Maybe simply \@Private {code} @LimitedPrivate({ "MapReduce", "YARN" }) public final class TimelineServiceUtils { {code} 3. TimelineEvent is not covered? 4. Is the AllocateResponsePBImpl change unrelated? > Refactor and fix null casting in some map cast for TimelineEntity (old and > new) and fix findbug warnings > > > Key: YARN-3276 > URL: https://issues.apache.org/jira/browse/YARN-3276 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Junping Du >Assignee: Junping Du > Attachments: YARN-3276-YARN-2928.v3.patch, > YARN-3276-YARN-2928.v4.patch, YARN-3276-v2.patch, YARN-3276-v3.patch, > YARN-3276.patch > > > Per discussion in YARN-3087, we need to refactor some similar logic to cast > a map to a HashMap and get rid of the NPE issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
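The "cast map to HashMap and get rid of NPE" refactoring mentioned in the description can be sketched as a single null-safe helper. The class and method names below are hypothetical, not the ones from the patch; the point is that a defensive copy avoids both the NullPointerException on a null map and the ClassCastException on a non-HashMap implementation:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper sketching the null-safe map-to-HashMap conversion. */
public final class MapCastHelper {
  private MapCastHelper() {
  }

  /** Returns a HashMap view of {@code map}, or an empty HashMap if it is null. */
  @SuppressWarnings("unchecked")
  public static <K, V> HashMap<K, V> toHashMap(Map<K, V> map) {
    if (map == null) {
      return new HashMap<>();  // avoid NPE for absent maps
    }
    if (map instanceof HashMap) {
      return (HashMap<K, V>) map;  // already the right type, no copy needed
    }
    return new HashMap<>(map);  // avoid ClassCastException for other Map types
  }
}
```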
[jira] [Commented] (YARN-3360) Add JMX metrics to TimelineDataManager
[ https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536172#comment-14536172 ] Hadoop QA commented on YARN-3360: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12705199/YARN-3360.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 6471d18 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7839/console | This message was automatically generated. > Add JMX metrics to TimelineDataManager > -- > > Key: YARN-3360 > URL: https://issues.apache.org/jira/browse/YARN-3360 > Project: Hadoop YARN > Issue Type: Improvement > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Jason Lowe > Labels: BB2015-05-TBR > Attachments: YARN-3360.001.patch > > > The TimelineDataManager currently has no metrics, outside of the standard JVM > metrics. It would be very useful to at least log basic counts of method > calls, time spent in those calls, and number of entities/events involved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1519) check if sysconf is implemented before using it
[ https://issues.apache.org/jira/browse/YARN-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536170#comment-14536170 ] Hadoop QA commented on YARN-1519: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 5m 13s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | yarn tests | 6m 0s | Tests passed in hadoop-yarn-server-nodemanager. 
| | | | 21m 17s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731625/YARN-1519.002.patch | | Optional Tests | javac unit | | git revision | trunk / 6471d18 | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7837/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7837/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7837/console | This message was automatically generated. > check if sysconf is implemented before using it > --- > > Key: YARN-1519 > URL: https://issues.apache.org/jira/browse/YARN-1519 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0, 2.3.0 >Reporter: Radim Kolar >Assignee: Radim Kolar > Labels: BB2015-05-TBR > Attachments: YARN-1519.002.patch, nodemgr-sysconf.txt > > > If sysconf value _SC_GETPW_R_SIZE_MAX is not implemented, it leads to > segfault because invalid pointer gets passed to libc function. > fix: enforce minimum value 1024, same method is used in hadoop-common native > code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1287) Consolidate MockClocks
[ https://issues.apache.org/jira/browse/YARN-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536164#comment-14536164 ] Hadoop QA commented on YARN-1287: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 5m 12s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 10 new or modified test files. | | {color:green}+1{color} | javac | 7m 29s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 2m 29s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 38s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 31s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 30s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:red}-1{color} | mapreduce tests | 8m 42s | Tests failed in hadoop-mapreduce-client-app. | | {color:green}+1{color} | yarn tests | 1m 55s | Tests passed in hadoop-yarn-common. | | {color:red}-1{color} | yarn tests | 20m 55s | Tests failed in hadoop-yarn-server-nodemanager. | | {color:red}-1{color} | yarn tests | 62m 28s | Tests failed in hadoop-yarn-server-resourcemanager. 
| | | | 116m 15s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.mapreduce.v2.app.TestRuntimeEstimators | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairSchedulerPreemption | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestContinuousScheduling | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppAttempt | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestMaxRunningAppsEnforcer | | Timed out tests | org.apache.hadoop.yarn.server.nodemanager.util.TestCgroupsLCEResourcesHandler | | | org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731642/YARN-1287.004.patch | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / 333f9a8 | | hadoop-mapreduce-client-app test log | https://builds.apache.org/job/PreCommit-YARN-Build/7835/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/7835/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7835/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7835/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7835/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/7835/console | This message was automatically generated. > Consolidate MockClocks > -- > > Key: YARN-1287 > URL: https://issues.apache.org/jira/browse/YARN-1287 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Sandy Ryza >Assignee: Sebastian Wong > Labels: newbie > Attachments: YARN-1287-3.patch, YARN-1287.004.patch > > > A bunch of different tests have near-identical implementations of MockClock. > TestFairScheduler, TestFSSchedulerApp, and TestCgroupsLCEResourcesHandler for > example. They should be consolidated into a single MockClock. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3521) Support return structured NodeLabel objects in REST API when call getClusterNodeLabels
[ https://issues.apache.org/jira/browse/YARN-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-3521: -- Attachment: 0007-YARN-3521.patch Thank you [~leftnoteasy]. Uploading a new patch addressing the comments. > Support return structured NodeLabel objects in REST API when call > getClusterNodeLabels > -- > > Key: YARN-3521 > URL: https://issues.apache.org/jira/browse/YARN-3521 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, client, resourcemanager >Reporter: Wangda Tan >Assignee: Sunil G > Attachments: 0001-YARN-3521.patch, 0002-YARN-3521.patch, > 0003-YARN-3521.patch, 0004-YARN-3521.patch, 0005-YARN-3521.patch, > 0006-YARN-3521.patch, 0007-YARN-3521.patch > > > In YARN-3413, the yarn cluster CLI returns NodeLabel instead of String; we should > make the same change on the REST API side to make them consistent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1912) ResourceLocalizer started without any jvm memory control
[ https://issues.apache.org/jira/browse/YARN-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536154#comment-14536154 ] Xuan Gong commented on YARN-1912: - Committed into trunk/branch-2. Thanks, Masatake Iwasaki > ResourceLocalizer started without any jvm memory control > > > Key: YARN-1912 > URL: https://issues.apache.org/jira/browse/YARN-1912 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.2.0 >Reporter: stanley shi >Assignee: Masatake Iwasaki > Labels: BB2015-05-RFC > Attachments: YARN-1912-0.patch, YARN-1912-1.patch, > YARN-1912.003.patch, YARN-1912.004.patch > > > LinuxContainerExecutor.java#startLocalizer does not specify any > "-Xmx" setting in the command, which causes the ResourceLocalizer to be > started with the default memory settings. > On server-level hardware, the JVM will use 25% of the system memory as the max > heap size, which can cause memory issues in some cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1912) ResourceLocalizer started without any jvm memory control
[ https://issues.apache.org/jira/browse/YARN-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536149#comment-14536149 ] Xuan Gong commented on YARN-1912: - +1 LGTM. Will commit > ResourceLocalizer started without any jvm memory control > > > Key: YARN-1912 > URL: https://issues.apache.org/jira/browse/YARN-1912 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.2.0 >Reporter: stanley shi >Assignee: Masatake Iwasaki > Labels: BB2015-05-RFC > Attachments: YARN-1912-0.patch, YARN-1912-1.patch, > YARN-1912.003.patch, YARN-1912.004.patch > > > LinuxContainerExecutor.java#startLocalizer does not specify any > "-Xmx" setting in the command, which causes the ResourceLocalizer to be > started with the default memory settings. > On server-level hardware, the JVM will use 25% of the system memory as the max > heap size, which can cause memory issues in some cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
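The shape of the fix — building the localizer launch command with an explicit heap cap instead of relying on the JVM's default (roughly 25% of physical memory on server-class machines) — can be sketched as follows. The class, method, and main-class names here are illustrative only, not the actual patch code:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of launching a helper JVM with an explicit -Xmx cap. */
public final class LocalizerCommand {
  private LocalizerCommand() {
  }

  /** Builds a java command line that caps the heap at heapMb megabytes. */
  public static List<String> build(String javaHome, String mainClass, int heapMb) {
    List<String> command = new ArrayList<>();
    command.add(javaHome + "/bin/java");
    command.add("-Xmx" + heapMb + "m");  // explicit cap instead of the JVM default
    command.add(mainClass);
    return command;
  }
}
```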
[jira] [Updated] (YARN-3176) In Fair Scheduler, child queue should inherit maxApp from its parent
[ https://issues.apache.org/jira/browse/YARN-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3176: - Labels: (was: BB2015-05-TBR) > In Fair Scheduler, child queue should inherit maxApp from its parent > > > Key: YARN-3176 > URL: https://issues.apache.org/jira/browse/YARN-3176 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Siqi Li >Assignee: Siqi Li > Attachments: YARN-3176.v1.patch > > > if the child queue does not have a maxRunningApp limit, it will use the > queueMaxAppsDefault. This behavior is not quite right, since > queueMaxAppsDefault is normally a small number, whereas some parent queues do > have maxRunningApp set to be more than the default -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3261) rewrite resourcemanager restart doc to remove roadmap bits
[ https://issues.apache.org/jira/browse/YARN-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536145#comment-14536145 ] Junping Du commented on YARN-3261: -- [~jianhe], can you kindly review attached doc patch against RM restart work preserving? Thanks! > rewrite resourcemanager restart doc to remove roadmap bits > --- > > Key: YARN-3261 > URL: https://issues.apache.org/jira/browse/YARN-3261 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Reporter: Allen Wittenauer >Assignee: Gururaj Shetty > Labels: BB2015-05-TBR > Attachments: YARN-3261.01.patch > > > Another mixture of roadmap and instruction manual that seems to be ever > present in a lot of the recently written documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3324) TestDockerContainerExecutor should clean test docker image from local repository after test is done
[ https://issues.apache.org/jira/browse/YARN-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3324: - Target Version/s: (was: 2.6.0) > TestDockerContainerExecutor should clean test docker image from local > repository after test is done > --- > > Key: YARN-3324 > URL: https://issues.apache.org/jira/browse/YARN-3324 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.6.0 >Reporter: Chen He > Labels: BB2015-05-TBR > Attachments: YARN-3324-branch-2.6.0.002.patch, > YARN-3324-trunk.002.patch > > > Current TestDockerContainerExecutor only cleans the temp directory in local > file system but leaves the test docker image in local docker repository. It > should be cleaned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3360) Add JMX metrics to TimelineDataManager
[ https://issues.apache.org/jira/browse/YARN-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536142#comment-14536142 ] Junping Du commented on YARN-3360: -- It sounds like Jenkins isn't kicked off. Manually kick off test again. > Add JMX metrics to TimelineDataManager > -- > > Key: YARN-3360 > URL: https://issues.apache.org/jira/browse/YARN-3360 > Project: Hadoop YARN > Issue Type: Improvement > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Jason Lowe > Labels: BB2015-05-TBR > Attachments: YARN-3360.001.patch > > > The TimelineDataManager currently has no metrics, outside of the standard JVM > metrics. It would be very useful to at least log basic counts of method > calls, time spent in those calls, and number of entities/events involved. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3344) procfs stat file is not in the expected format warning
[ https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536139#comment-14536139 ] Junping Du commented on YARN-3344: -- Manually kick off Jenkins test again. > procfs stat file is not in the expected format warning > -- > > Key: YARN-3344 > URL: https://issues.apache.org/jira/browse/YARN-3344 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Jon Bringhurst >Assignee: Ravindra Kumar Naik > Labels: BB2015-05-RFC > Attachments: YARN-3344-branch-trunk.001.patch, > YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch > > > Although this doesn't appear to be causing any functional issues, it is > spamming our log files quite a bit. :) > It appears that the regex in ProcfsBasedProcessTree doesn't work for all > /proc/<pid>/stat files. > Here's the error I'm seeing: > {noformat} > "source_host": "asdf", > "method": "constructProcessInfo", > "level": "WARN", > "message": "Unexpected: procfs stat file is not in the expected format > for process with pid 6953" > "file": "ProcfsBasedProcessTree.java", > "line_number": "514", > "class": "org.apache.hadoop.yarn.util.ProcfsBasedProcessTree", > {noformat} > And here's the basic info on process with pid 6953: > {noformat} > [asdf ~]$ cat /proc/6953/stat > 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 > 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 > 2 18446744073709551615 0 0 17 13 0 0 0 0 0 > [asdf ~]$ ps aux|grep 6953 > root 6953 0.0 0.0 200484 23424 ?S21:44 0:00 python2.6 > /export/apps/salt/minion-scripts/module-sync.py > jbringhu 13481 0.0 0.0 105312 872 pts/0S+ 22:13 0:00 grep -i 6953 > [asdf ~]$ > {noformat} > This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
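The stat line above shows why a naive whitespace-based regex breaks: per proc(5), the second field (comm) is parenthesized and may itself contain spaces, as in "(python2.6 /expo)". A robust parse splits on the last ')' instead. The sketch below illustrates that approach only — it is not the actual ProcfsBasedProcessTree code:

```java
/** Illustrative /proc/<pid>/stat parser that tolerates spaces in comm. */
public final class StatLineParser {
  private StatLineParser() {
  }

  /**
   * Splits a stat line into fields: [0]=pid, [1]=comm (without parentheses),
   * [2..]=the remaining space-separated fields. Returns null if the line is
   * not in the expected format.
   */
  public static String[] parse(String statLine) {
    int open = statLine.indexOf('(');
    int close = statLine.lastIndexOf(')');  // last ')' — comm may contain ')'
    if (open < 0 || close < open) {
      return null;  // not in the expected format
    }
    String pid = statLine.substring(0, open).trim();
    String comm = statLine.substring(open + 1, close);
    String[] rest = statLine.substring(close + 1).trim().split("\\s+");
    String[] fields = new String[rest.length + 2];
    fields[0] = pid;
    fields[1] = comm;
    System.arraycopy(rest, 0, fields, 2, rest.length);
    return fields;
  }
}
```

Run against the reporter's line, this yields pid 6953, comm "python2.6 /expo", and state S, where a regex assuming a space-free comm field would reject the line.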
[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning
[ https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3344: - Assignee: Ravindra Kumar Naik > procfs stat file is not in the expected format warning > -- > > Key: YARN-3344 > URL: https://issues.apache.org/jira/browse/YARN-3344 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Jon Bringhurst >Assignee: Ravindra Kumar Naik > Labels: BB2015-05-RFC > Attachments: YARN-3344-branch-trunk.001.patch, > YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch > > > Although this doesn't appear to be causing any functional issues, it is > spamming our log files quite a bit. :) > It appears that the regex in ProcfsBasedProcessTree doesn't work for all > /proc//stat files. > Here's the error I'm seeing: > {noformat} > "source_host": "asdf", > "method": "constructProcessInfo", > "level": "WARN", > "message": "Unexpected: procfs stat file is not in the expected format > for process with pid 6953" > "file": "ProcfsBasedProcessTree.java", > "line_number": "514", > "class": "org.apache.hadoop.yarn.util.ProcfsBasedProcessTree", > {noformat} > And here's the basic info on process with pid 6953: > {noformat} > [asdf ~]$ cat /proc/6953/stat > 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 > 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 > 2 18446744073709551615 0 0 17 13 0 0 0 0 0 > [asdf ~]$ ps aux|grep 6953 > root 6953 0.0 0.0 200484 23424 ?S21:44 0:00 python2.6 > /export/apps/salt/minion-scripts/module-sync.py > jbringhu 13481 0.0 0.0 105312 872 pts/0S+ 22:13 0:00 grep -i 6953 > [asdf ~]$ > {noformat} > This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning
[ https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3344: - Target Version/s: (was: 2.6.0) > procfs stat file is not in the expected format warning > -- > > Key: YARN-3344 > URL: https://issues.apache.org/jira/browse/YARN-3344 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Jon Bringhurst > Labels: BB2015-05-RFC > Attachments: YARN-3344-branch-trunk.001.patch, > YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch > > > Although this doesn't appear to be causing any functional issues, it is > spamming our log files quite a bit. :) > It appears that the regex in ProcfsBasedProcessTree doesn't work for all > /proc//stat files. > Here's the error I'm seeing: > {noformat} > "source_host": "asdf", > "method": "constructProcessInfo", > "level": "WARN", > "message": "Unexpected: procfs stat file is not in the expected format > for process with pid 6953" > "file": "ProcfsBasedProcessTree.java", > "line_number": "514", > "class": "org.apache.hadoop.yarn.util.ProcfsBasedProcessTree", > {noformat} > And here's the basic info on process with pid 6953: > {noformat} > [asdf ~]$ cat /proc/6953/stat > 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 > 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 > 2 18446744073709551615 0 0 17 13 0 0 0 0 0 > [asdf ~]$ ps aux|grep 6953 > root 6953 0.0 0.0 200484 23424 ?S21:44 0:00 python2.6 > /export/apps/salt/minion-scripts/module-sync.py > jbringhu 13481 0.0 0.0 105312 872 pts/0S+ 22:13 0:00 grep -i 6953 > [asdf ~]$ > {noformat} > This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1917) Add "waitForApplicationState" interface to YarnClient
[ https://issues.apache.org/jira/browse/YARN-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536131#comment-14536131 ] Hadoop QA commented on YARN-1917: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 41s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 34s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 40s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 56s | The applied patch generated 3 new checkstyle issues (total was 54, now 57). | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 39s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 27s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | mapreduce tests | 105m 46s | Tests passed in hadoop-mapreduce-client-jobclient. | | {color:green}+1{color} | yarn tests | 6m 55s | Tests passed in hadoop-yarn-client. 
| | | | 149m 40s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731569/YARN-1917.20150508.1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / ed0f4db | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/7832/artifact/patchprocess/diffcheckstylehadoop-yarn-client.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/7832/artifact/patchprocess/whitespace.txt | | hadoop-mapreduce-client-jobclient test log | https://builds.apache.org/job/PreCommit-YARN-Build/7832/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/7832/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7832/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7832/console | This message was automatically generated. > Add "waitForApplicationState" interface to YarnClient > - > > Key: YARN-1917 > URL: https://issues.apache.org/jira/browse/YARN-1917 > Project: Hadoop YARN > Issue Type: New Feature > Components: client >Affects Versions: 2.4.0 >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-1917.20150501.1.patch, YARN-1917.20150508.1.patch, > YARN-1917.patch, YARN-1917.patch, YARN-1917.patch > > > Currently, YARN dosen't have this method. Users needs to write > implementations like UnmanagedAMLauncher.monitorApplication or > mapreduce.Job.monitorAndPrintJob on their own. This feature should be helpful > to end users. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3423) RM HA setup, "Cluster" tab links populated with AM hostname instead of RM
[ https://issues.apache.org/jira/browse/YARN-3423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536116#comment-14536116 ] Hadoop QA commented on YARN-3423: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 45s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 33s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 52s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 33s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 40s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 57s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | mapreduce tests | 9m 26s | Tests passed in hadoop-mapreduce-client-app. 
| | | | 45m 44s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12708597/YARN-3423.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 333f9a8 | | hadoop-mapreduce-client-app test log | https://builds.apache.org/job/PreCommit-YARN-Build/7834/artifact/patchprocess/testrun_hadoop-mapreduce-client-app.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7834/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7834/console | This message was automatically generated. > RM HA setup, "Cluster" tab links populated with AM hostname instead of RM > -- > > Key: YARN-3423 > URL: https://issues.apache.org/jira/browse/YARN-3423 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0 > Environment: centOS-6.x >Reporter: Aroop Maliakkal >Assignee: zhaoyunjiong >Priority: Minor > Labels: BB2015-05-TBR > Attachments: YARN-3423.patch > > > In RM HA setup ( e.g. > http://rm-1.vip.abc.com:50030/proxy/application_1427789305393_0002/ ), go to > the job details and click on the "Cluster tab" on left top side. Click on any > of the links , "About", Applications" , "Scheduler". You can see that the > hyperlink is pointing to http://am-1.vip.abc.com:port/cluster ). > The port details for secure and unsecure cluster is given below :- > 8088 ( DEFAULT_RM_WEBAPP_PORT = 8088 ) > 8090 ( DEFAULT_RM_WEBAPP_HTTPS_PORT = 8090 ) > Ideally, it should have pointed to resourcemanager hostname instead of AM > hostname. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3480) Recovery may get very slow with lots of services with lots of app-attempts
[ https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536112#comment-14536112 ] Jun Gong commented on YARN-3480: [~vinodkv] Thanks for the suggestions. {quote} Part of why you are seeing the problem today itself is precisely because you don't have YARN-611. Once you have YARN-611, assuming a validity interval in the order of 10s of minutes, to reach 10K objects, you need consistent failures for >100 days to see what you are seeing. {quote} Yes, YARN-611 will benefit us a lot. Our own AM fails under certain conditions, which also makes the number of retried attempts very large. {quote} Assuming some history is important, we can have a limit the amount of completed app-attempts' history that the platform remembers. Apps can control how much they want the platform to remember but they cannot specify more than a cluster configured global limit. {quote} One detail to clarify: we may need to keep the failed attempts that fall within the validity window, so that count is a minimum number of attempts we must retain. When apps specify how much they want the platform to remember, we should treat that as another lower bound on the number of attempts to keep. {quote} instead of throwing away all history, I'd instead also do the recovery of very old attempts outside of the recovery path. That way recovery can still be fast (only recovering few of the most recent attempts synchronously) and given enough time, older history will get read offline. {quote} That makes recovery faster and does not lose any attempt history, but it also makes the recovery process a little more complicated. The former method (removing old attempts) is more concise and works much like logrotate; if we can accept losing some attempts' history information, I would prefer it. 
> Recovery may get very slow with lots of services with lots of app-attempts > -- > > Key: YARN-3480 > URL: https://issues.apache.org/jira/browse/YARN-3480 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Jun Gong >Assignee: Jun Gong > Attachments: YARN-3480.01.patch, YARN-3480.02.patch, > YARN-3480.03.patch, YARN-3480.04.patch > > > When RM HA is enabled and running containers are kept across attempts, apps > are more likely to finish successfully with more retries (attempts), so it > is better to set 'yarn.resourcemanager.am.max-attempts' larger. However > this makes the RMStateStore (FileSystem/HDFS/ZK) store more attempts, and makes > the RM recovery process much slower. It might be better to cap the number of attempts > stored in the RMStateStore. > BTW: When 'attemptFailuresValidityInterval' (introduced in YARN-611) is set to > a small value, the number of retried attempts might be very large. So we need to delete > some of the attempts stored in the RMStateStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
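The retention rule being discussed in this thread can be sketched as a small policy function. This is an illustrative sketch only, not the attached patch: the class and method names (`AttemptRetentionPolicy`, `attemptsToKeep`) are hypothetical, and it assumes an attempt is described by nothing more than a finish timestamp.

```java
import java.util.List;

// Hypothetical sketch (not the attached patch) of the retention rule discussed
// above: keep at least (a) every attempt still inside the failure-validity
// window and (b) the app-requested history size, capped by a cluster-wide limit.
public class AttemptRetentionPolicy {

  /**
   * @param finishTimes      finish timestamps of completed attempts
   * @param now              current time, in the same unit as finishTimes
   * @param validityInterval attemptFailuresValidityInterval: attempts newer
   *                         than (now - validityInterval) must be kept
   * @param appRequested     attempts the app asked the platform to remember
   * @param clusterLimit     cluster-configured global maximum
   * @return number of most-recent attempts to keep in the state store
   */
  public static int attemptsToKeep(List<Long> finishTimes, long now,
      long validityInterval, int appRequested, int clusterLimit) {
    int inWindow = 0;
    for (long t : finishTimes) {
      if (t > now - validityInterval) {
        inWindow++;
      }
    }
    // The app's request is another lower bound, but never above the cluster cap.
    return Math.max(inWindow, Math.min(appRequested, clusterLimit));
  }
}
```

The key point from the comment is that both the validity window and the app's request act as lower bounds on retention, while the cluster setting caps only the latter.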
[jira] [Commented] (YARN-3489) RMServerUtils.validateResourceRequests should only obtain queue info once
[ https://issues.apache.org/jira/browse/YARN-3489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536095#comment-14536095 ] Hadoop QA commented on YARN-3489: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 41s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 41s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 52s | The applied patch generated 1 new checkstyle issues (total was 76, now 77). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 38s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 16s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:red}-1{color} | yarn tests | 52m 6s | Tests failed in hadoop-yarn-server-resourcemanager. 
| | | | 88m 45s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.resourcemanager.TestRMRestart | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731613/YARN-3489.03.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / ed0f4db | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/7833/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7833/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7833/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7833/console | This message was automatically generated. > RMServerUtils.validateResourceRequests should only obtain queue info once > - > > Key: YARN-3489 > URL: https://issues.apache.org/jira/browse/YARN-3489 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Varun Saxena > Labels: BB2015-05-RFC > Attachments: YARN-3489.01.patch, YARN-3489.02.patch, > YARN-3489.03.patch > > > Since the label support was added we now get the queue info for each request > being validated in SchedulerUtils.validateResourceRequest. If > validateResourceRequests needs to validate a lot of requests at a time (e.g.: > large cluster with lots of varied locality in the requests) then it will get > the queue info for each request. Since we build the queue info this > generates a lot of unnecessary garbage, as the queue isn't changing between > requests. 
We should grab the queue info once and pass it down rather than > building it again for each request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
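The change Jason Lowe describes — fetch the queue info once and pass it down, instead of rebuilding it for every request — can be sketched as follows. The types here (`QueueView`, `Request`) are hypothetical stand-ins, not the actual YARN classes touched by the YARN-3489 patches; the counter only exists to make the "once per batch" property visible.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: look the queue info up once per validation batch and
// reuse it for every request, rather than rebuilding it per request. The
// types below are hypothetical stand-ins, not the real YARN classes.
public class ValidateOnceSketch {
  public static final AtomicInteger lookups = new AtomicInteger();

  static class QueueView {
    final int maxMemoryMb = 8192;
  }

  static class Request {
    final int memoryMb;
    Request(int memoryMb) { this.memoryMb = memoryMb; }
  }

  // Stands in for the scheduler's getQueueInfo call; building this object per
  // request is what generated the unnecessary garbage.
  static QueueView fetchQueueInfo() {
    lookups.incrementAndGet();
    return new QueueView();
  }

  public static boolean validateAll(List<Request> requests) {
    QueueView queue = fetchQueueInfo(); // fetched once, passed down
    for (Request r : requests) {
      if (r.memoryMb > queue.maxMemoryMb) {
        return false;
      }
    }
    return true;
  }
}
```

Since the queue does not change between requests in one batch, hoisting the lookup preserves behavior while eliminating the per-request allocation.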
[jira] [Commented] (YARN-3344) procfs stat file is not in the expected format warning
[ https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536093#comment-14536093 ] Rajesh Kartha commented on YARN-3344: - I tried the patch in my env and did not see any new failures. > procfs stat file is not in the expected format warning > -- > > Key: YARN-3344 > URL: https://issues.apache.org/jira/browse/YARN-3344 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Jon Bringhurst > Labels: BB2015-05-RFC > Attachments: YARN-3344-branch-trunk.001.patch, > YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch > > > Although this doesn't appear to be causing any functional issues, it is > spamming our log files quite a bit. :) > It appears that the regex in ProcfsBasedProcessTree doesn't work for all > /proc//stat files. > Here's the error I'm seeing: > {noformat} > "source_host": "asdf", > "method": "constructProcessInfo", > "level": "WARN", > "message": "Unexpected: procfs stat file is not in the expected format > for process with pid 6953" > "file": "ProcfsBasedProcessTree.java", > "line_number": "514", > "class": "org.apache.hadoop.yarn.util.ProcfsBasedProcessTree", > {noformat} > And here's the basic info on process with pid 6953: > {noformat} > [asdf ~]$ cat /proc/6953/stat > 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 > 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 > 2 18446744073709551615 0 0 17 13 0 0 0 0 0 > [asdf ~]$ ps aux|grep 6953 > root 6953 0.0 0.0 200484 23424 ?S21:44 0:00 python2.6 > /export/apps/salt/minion-scripts/module-sync.py > jbringhu 13481 0.0 0.0 105312 872 pts/0S+ 22:13 0:00 grep -i 6953 > [asdf ~]$ > {noformat} > This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning
[ https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Kartha updated YARN-3344: Labels: BB2015-05-RFC (was: BB2015-05-TBR) > procfs stat file is not in the expected format warning > -- > > Key: YARN-3344 > URL: https://issues.apache.org/jira/browse/YARN-3344 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.6.0 >Reporter: Jon Bringhurst > Labels: BB2015-05-RFC > Attachments: YARN-3344-branch-trunk.001.patch, > YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch > > > Although this doesn't appear to be causing any functional issues, it is > spamming our log files quite a bit. :) > It appears that the regex in ProcfsBasedProcessTree doesn't work for all > /proc//stat files. > Here's the error I'm seeing: > {noformat} > "source_host": "asdf", > "method": "constructProcessInfo", > "level": "WARN", > "message": "Unexpected: procfs stat file is not in the expected format > for process with pid 6953" > "file": "ProcfsBasedProcessTree.java", > "line_number": "514", > "class": "org.apache.hadoop.yarn.util.ProcfsBasedProcessTree", > {noformat} > And here's the basic info on process with pid 6953: > {noformat} > [asdf ~]$ cat /proc/6953/stat > 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 > 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 > 2 18446744073709551615 0 0 17 13 0 0 0 0 0 > [asdf ~]$ ps aux|grep 6953 > root 6953 0.0 0.0 200484 23424 ?S21:44 0:00 python2.6 > /export/apps/salt/minion-scripts/module-sync.py > jbringhu 13481 0.0 0.0 105312 872 pts/0S+ 22:13 0:00 grep -i 6953 > [asdf ~]$ > {noformat} > This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3044) [Event producers] Implement RM writing app lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536087#comment-14536087 ] Sangjin Lee commented on YARN-3044: --- [~Naganarasimha], the latest patch looks good to me. Could you please file a JIRA on handling the child entity as mentioned? I'd love to commit it soon as this has been open for quite a while, but I recognize the timing is rather awkward (Friday evening US time). I'll wait until next Monday before I commit your patch. I would greatly appreciate it if I could get others' feedback on the latest patch until then. Thanks! > [Event producers] Implement RM writing app lifecycle events to ATS > -- > > Key: YARN-3044 > URL: https://issues.apache.org/jira/browse/YARN-3044 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Sangjin Lee >Assignee: Naganarasimha G R > Labels: BB2015-05-TBR > Attachments: YARN-3044-YARN-2928.004.patch, > YARN-3044-YARN-2928.005.patch, YARN-3044-YARN-2928.006.patch, > YARN-3044-YARN-2928.007.patch, YARN-3044.20150325-1.patch, > YARN-3044.20150406-1.patch, YARN-3044.20150416-1.patch > > > Per design in YARN-2928, implement RM writing app lifecycle events to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-401) ClientRMService.getQueueInfo can return stale application reports
[ https://issues.apache.org/jira/browse/YARN-401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536086#comment-14536086 ] Rohith commented on YARN-401: - [~jlowe] In trunk, I see that the schedulers (both CS and FS) create a new instance of QueueInfo. So even multiple clients requesting QueueInfo for the same queue, or from a parent queue, should not get stale application reports. > ClientRMService.getQueueInfo can return stale application reports > - > > Key: YARN-401 > URL: https://issues.apache.org/jira/browse/YARN-401 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.0.2-alpha, 0.23.6 >Reporter: Jason Lowe >Priority: Minor > > ClientRMService.getQueueInfo is modifying a QueueInfo object when application > reports are requested. Unfortunately this QueueInfo object could be a > persisting object in the scheduler, and modifying it in this way can lead to > stale application reports being returned to the client. Here's an example > scenario with CapacityScheduler: > # A client asks for queue info on queue X with application reports > # ClientRMService.getQueueInfo modifies the queue's QueueInfo object and sets > application reports on it > # Another client asks for recursive queue info from the root queue without > application reports > # Since the old application reports are still attached to queue X's QueueInfo > object, these stale reports appear in the QueueInfo data for queue X in the > results > Normally if the client is not asking for application reports it won't be > looking for and acting upon any application reports that happen to appear in the > queue info result. However we shouldn't be returning application reports in > the first place, and when we do, they shouldn't be stale. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
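The fix Rohith points to — the scheduler building a new QueueInfo per request instead of handing out a long-lived object — can be sketched with a hypothetical stand-in type (this is not the real YARN QueueInfo or ClientRMService code). If the service mutated a shared cached object to attach application reports, those reports would leak into later calls that did not ask for them; a fresh instance per request cannot go stale.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in (not the real YARN QueueInfo): building a fresh
// instance per request means attaching reports for one caller can never
// leak stale reports into another caller's result.
public class QueueInfoCopySketch {

  public static class SimpleQueueInfo {
    public final List<String> applicationReports = new ArrayList<>();
  }

  public static SimpleQueueInfo getQueueInfo(boolean includeApplications) {
    SimpleQueueInfo fresh = new SimpleQueueInfo(); // new instance per request
    if (includeApplications) {
      fresh.applicationReports.add("app_0001");    // hypothetical report id
    }
    return fresh;
  }
}
```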
[jira] [Commented] (YARN-3608) Apps submitted to MiniYarnCluster always stay in ACCEPTED state.
[ https://issues.apache.org/jira/browse/YARN-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536076#comment-14536076 ] Spandan Dutta commented on YARN-3608: - Yes please. > Apps submitted to MiniYarnCluster always stay in ACCEPTED state. > > > Key: YARN-3608 > URL: https://issues.apache.org/jira/browse/YARN-3608 > Project: Hadoop YARN > Issue Type: Bug > Components: applications >Affects Versions: 2.6.0 >Reporter: Spandan Dutta > > So I adapted a test case to submit a yarn app to a MiniYarnCluster and wait > for it to reach running state. Turns out that the app gets stuck in > "ACCEPTED" state. > {noformat} > @Test > public void testGetAllQueues() throws IOException, YarnException, > InterruptedException { > MiniYARNCluster cluster = new MiniYARNCluster("testMRAMTokens", 1, 1, 1); > YarnClient rmClient = null; > try { > cluster.init(new YarnConfiguration()); > cluster.start(); > final Configuration yarnConf = cluster.getConfig(); > rmClient = YarnClient.createYarnClient(); > rmClient.init(yarnConf); > rmClient.start(); > YarnClientApplication newApp = rmClient.createApplication(); > ApplicationId appId = > newApp.getNewApplicationResponse().getApplicationId(); > // Create launch context for app master > ApplicationSubmissionContext appContext > = Records.newRecord(ApplicationSubmissionContext.class); > // set the application id > appContext.setApplicationId(appId); > // set the application name > appContext.setApplicationName("test"); > // Set up the container launch context for the application master > ContainerLaunchContext amContainer > = Records.newRecord(ContainerLaunchContext.class); > appContext.setAMContainerSpec(amContainer); > appContext.setResource(Resource.newInstance(1024, 1)); > // Submit the application to the applications manager > rmClient.submitApplication(appContext); > ApplicationReport applicationReport = > rmClient.getApplicationReport(appContext.getApplicationId()); > int timeout = 10; > 
while(timeout > 0 && applicationReport.getYarnApplicationState() != > YarnApplicationState.RUNNING) { > Thread.sleep(5 * 1000); > timeout--; > } > Assert.assertTrue(timeout != 0); > Assert.assertTrue(applicationReport.getYarnApplicationState() > == YarnApplicationState.RUNNING); > List queues = rmClient.getAllQueues(); > Assert.assertNotNull(queues); > Assert.assertTrue(!queues.isEmpty()); > QueueInfo queue = queues.get(0); > List queueApplications = queue.getApplications(); > Assert.assertFalse(queueApplications.isEmpty()); > } catch (YarnException e) { > Assert.assertTrue(e.getMessage().contains("Failed to submit")); > } finally { > if (rmClient != null) { > rmClient.stop(); > } > cluster.stop(); > } > } > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
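One thing worth noting about the quoted test: applicationReport is fetched once before the loop and never refreshed inside it, so the loop polls a stale object and can never observe a state transition even if the app does start running. A polling loop has to re-fetch the report each iteration; a minimal sketch follows, with a hypothetical ReportSource interface standing in for YarnClient's report lookup.

```java
// The quoted test fetches applicationReport once and then polls the same
// stale object. A poll loop must re-fetch the report on every iteration.
// ReportSource is a hypothetical stand-in for the client's report lookup.
public class PollUntilRunning {

  public interface ReportSource {
    String currentState(); // e.g. "ACCEPTED", "RUNNING"
  }

  /** Returns true if the app reached RUNNING within maxPolls iterations. */
  public static boolean waitForRunning(ReportSource source, int maxPolls) {
    for (int i = 0; i < maxPolls; i++) {
      // Re-fetch on every pass -- the step the quoted test skips.
      // (Real code would also sleep between polls, as the test does.)
      if ("RUNNING".equals(source.currentState())) {
        return true;
      }
    }
    return false;
  }
}
```

This does not explain why the app stays ACCEPTED in the first place (likely a cluster-side cause), but it does mean the test as written would fail its timeout check even on a healthy cluster.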
[jira] [Commented] (YARN-3271) FairScheduler: Move tests related to max-runnable-apps from TestFairScheduler to TestAppRunnability
[ https://issues.apache.org/jira/browse/YARN-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536063#comment-14536063 ] Hudson commented on YARN-3271: -- FAILURE: Integrated in Hadoop-trunk-Commit #7780 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7780/]) YARN-3271. FairScheduler: Move tests related to max-runnable-apps from TestFairScheduler to TestAppRunnability. (nijel via kasha) (kasha: rev 2fb44c8aaf7f35f425d80b133a28b1c1f305c3e6) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestAppRunnability.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * hadoop-yarn-project/CHANGES.txt > FairScheduler: Move tests related to max-runnable-apps from TestFairScheduler > to TestAppRunnability > --- > > Key: YARN-3271 > URL: https://issues.apache.org/jira/browse/YARN-3271 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Karthik Kambatla >Assignee: nijel > Fix For: 2.8.0 > > Attachments: YARN-3271.1.patch, YARN-3271.2.patch, YARN-3271.3.patch, > YARN-3271.4.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3473) Fix RM Web UI configuration for some properties
[ https://issues.apache.org/jira/browse/YARN-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536069#comment-14536069 ] Hudson commented on YARN-3473: -- FAILURE: Integrated in Hadoop-trunk-Commit #7780 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7780/]) YARN-3473. Fix RM Web UI configuration for some properties (rchiang via rkanter) (rkanter: rev 5658998845bbeb3f09037a891f6b254585848de7) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java * hadoop-yarn-project/CHANGES.txt > Fix RM Web UI configuration for some properties > --- > > Key: YARN-3473 > URL: https://issues.apache.org/jira/browse/YARN-3473 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.0 >Reporter: Ray Chiang >Assignee: Ray Chiang >Priority: Minor > Labels: supportability > Fix For: 2.8.0 > > Attachments: YARN-3473.001.patch > > > Using the RM Web UI, the Tools->Configuration page shows some properties as > something like "BufferedInputStream" instead of the appropriate .xml file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2206) Update document for applications REST API response examples
[ https://issues.apache.org/jira/browse/YARN-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536070#comment-14536070 ] Hudson commented on YARN-2206: -- FAILURE: Integrated in Hadoop-trunk-Commit #7780 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7780/]) YARN-2206. Updated document for applications REST API response examples. Contributed by Kenji Kikushima and Brahma Reddy Battula. (zjshen: rev 08f0ae403c649b28925d3b339b7b6de1d7ec2c0c) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md * hadoop-yarn-project/CHANGES.txt > Update document for applications REST API response examples > --- > > Key: YARN-2206 > URL: https://issues.apache.org/jira/browse/YARN-2206 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 2.4.0 >Reporter: Kenji Kikushima >Assignee: Brahma Reddy Battula >Priority: Minor > Labels: newbie > Fix For: 2.8.0 > > Attachments: YARN-2206-002.patch, YARN-2206.patch > > > In ResourceManagerRest.apt.vm, Applications API responses are missing some > elements. > - JSON response should have "applicationType" and "applicationTags". > - XML response should have "applicationTags". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1050) Document the Fair Scheduler REST API
[ https://issues.apache.org/jira/browse/YARN-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536072#comment-14536072 ] Hudson commented on YARN-1050: -- FAILURE: Integrated in Hadoop-trunk-Commit #7780 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7780/]) YARN-1050. Document the Fair Scheduler REST API. (Kenji Kikushima and Roman Shaposhnik via kasha) (kasha: rev 96473bdc2b090c13708fd467fd626621ef1d47eb) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/ResourceManagerRest.md * hadoop-yarn-project/CHANGES.txt > Document the Fair Scheduler REST API > > > Key: YARN-1050 > URL: https://issues.apache.org/jira/browse/YARN-1050 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation, fairscheduler >Reporter: Sandy Ryza >Assignee: Kenji Kikushima > Fix For: 2.8.0 > > Attachments: YARN-1050-2.patch, YARN-1050-3.patch, YARN-1050-4.patch, > YARN-1050.patch > > > The documentation should be placed here along with the Capacity Scheduler > documentation: > http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Scheduler_API -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3602) TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails Intermittently due to IOException from cleanup
[ https://issues.apache.org/jira/browse/YARN-3602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536068#comment-14536068 ] Hudson commented on YARN-3602: -- FAILURE: Integrated in Hadoop-trunk-Commit #7780 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7780/]) YARN-3602. TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails Intermittently due to IOException from cleanup. Contributed by zhihai xu (xgong: rev 333f9a896d8a4407ce69cfd0dc8314587a339233) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java > TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails > Intermittently due to IOException from cleanup > -- > > Key: YARN-3602 > URL: https://issues.apache.org/jira/browse/YARN-3602 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Minor > Labels: BB2015-05-RFC > Fix For: 2.8.0 > > Attachments: YARN-3602.000.patch > > > ResourceLocalizationService.testPublicResourceInitializesLocalDir fails > Intermittently due to IOException from cleanup. The stack trace is the > following from test report at > https://builds.apache.org/job/PreCommit-YARN-Build/7729/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer/TestResourceLocalizationService/testPublicResourceInitializesLocalDir/ > {code} > Error Message > Unable to delete directory > target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService/2/filecache. > Stacktrace > java.io.IOException: Unable to delete directory > target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService/2/filecache. 
> at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1541) > at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270) > at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) > at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) > at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270) > at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) > at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.cleanup(TestResourceLocalizationService.java:187) > {code} > It looks like we can safely ignore the IOException in cleanup which is called > after test. > The IOException may be due to the test machine environment because > TestResourceLocalizationService/2/filecache is created by > ResourceLocalizationService#initializeLocalDir. > testPublicResourceInitializesLocalDir created 0/filecache, 1/filecache, > 2/filecache and 3/filecache > {code} > for (int i = 0; i < 4; ++i) { > localDirs.add(lfs.makeQualified(new Path(basedir, i + ""))); > sDirs[i] = localDirs.get(i).toString(); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3476) Nodemanager can fail to delete local logs if log aggregation fails
[ https://issues.apache.org/jira/browse/YARN-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536066#comment-14536066 ] Hudson commented on YARN-3476: -- FAILURE: Integrated in Hadoop-trunk-Commit #7780 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7780/]) YARN-3476. Nodemanager can fail to delete local logs if log aggregation fails. Contributed by Rohith (jlowe: rev 25e2b02122c4ed760227ab33c49d3445c23b9276) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java * hadoop-yarn-project/CHANGES.txt > Nodemanager can fail to delete local logs if log aggregation fails > -- > > Key: YARN-3476 > URL: https://issues.apache.org/jira/browse/YARN-3476 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation, nodemanager >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Rohith > Labels: BB2015-05-TBR > Fix For: 2.7.1 > > Attachments: 0001-YARN-3476.patch, 0001-YARN-3476.patch, > 0002-YARN-3476.patch > > > If log aggregation encounters an error trying to upload the file then the > underlying TFile can throw an illegalstateexception which will bubble up > through the top of the thread and prevent the application logs from being > deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
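The failure mode described here — an exception escaping the upload step and preventing local-log deletion — is the classic case for a finally block. The sketch below illustrates that shape only; the method and type names are hypothetical, not the actual AppLogAggregatorImpl code in the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the YARN-3476 failure mode: if deletion is only reached on the
// upload success path, an IllegalStateException from the TFile writer leaks
// local logs. Running the deletion in a finally block guarantees cleanup
// either way. Names are illustrative, not the real AppLogAggregatorImpl.
public class AggregateThenDelete {

  public static final List<String> deleted = new ArrayList<>();

  interface Uploader {
    void upload(String localLogDir) throws Exception;
  }

  public static void aggregate(Uploader uploader, String localLogDir) {
    try {
      uploader.upload(localLogDir);
    } catch (Exception e) {
      // Log and continue: an upload failure must not block local cleanup.
      System.err.println("aggregation failed for " + localLogDir + ": " + e);
    } finally {
      deleted.add(localLogDir); // stand-in for scheduling the local-log delete
    }
  }
}
```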
[jira] [Commented] (YARN-3608) Apps submitted to MiniYarnCluster always stay in ACCEPTED state.
[ https://issues.apache.org/jira/browse/YARN-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536057#comment-14536057 ] Tsuyoshi Ozawa commented on YARN-3608: -- I faced similar issue when I was taking on YARN-2921... Can I take a look at this issue? > Apps submitted to MiniYarnCluster always stay in ACCEPTED state. > > > Key: YARN-3608 > URL: https://issues.apache.org/jira/browse/YARN-3608 > Project: Hadoop YARN > Issue Type: Bug > Components: applications >Affects Versions: 2.6.0 >Reporter: Spandan Dutta > > So I adapted a test case to submit a yarn app to a MiniYarnCluster and wait > for it to reach running state. Turns out that the app gets stuck in > "ACCEPTED" state. > {noformat} > @Test > public void testGetAllQueues() throws IOException, YarnException, > InterruptedException { > MiniYARNCluster cluster = new MiniYARNCluster("testMRAMTokens", 1, 1, 1); > YarnClient rmClient = null; > try { > cluster.init(new YarnConfiguration()); > cluster.start(); > final Configuration yarnConf = cluster.getConfig(); > rmClient = YarnClient.createYarnClient(); > rmClient.init(yarnConf); > rmClient.start(); > YarnClientApplication newApp = rmClient.createApplication(); > ApplicationId appId = > newApp.getNewApplicationResponse().getApplicationId(); > // Create launch context for app master > ApplicationSubmissionContext appContext > = Records.newRecord(ApplicationSubmissionContext.class); > // set the application id > appContext.setApplicationId(appId); > // set the application name > appContext.setApplicationName("test"); > // Set up the container launch context for the application master > ContainerLaunchContext amContainer > = Records.newRecord(ContainerLaunchContext.class); > appContext.setAMContainerSpec(amContainer); > appContext.setResource(Resource.newInstance(1024, 1)); > // Submit the application to the applications manager > rmClient.submitApplication(appContext); > ApplicationReport applicationReport = > 
rmClient.getApplicationReport(appContext.getApplicationId()); > int timeout = 10; > while(timeout > 0 && applicationReport.getYarnApplicationState() != > YarnApplicationState.RUNNING) { > Thread.sleep(5 * 1000); > timeout--; > } > Assert.assertTrue(timeout != 0); > Assert.assertTrue(applicationReport.getYarnApplicationState() > == YarnApplicationState.RUNNING); > List queues = rmClient.getAllQueues(); > Assert.assertNotNull(queues); > Assert.assertTrue(!queues.isEmpty()); > QueueInfo queue = queues.get(0); > List queueApplications = queue.getApplications(); > Assert.assertFalse(queueApplications.isEmpty()); > } catch (YarnException e) { > Assert.assertTrue(e.getMessage().contains("Failed to submit")); > } finally { > if (rmClient != null) { > rmClient.stop(); > } > cluster.stop(); > } > } > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.
[ https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated YARN-221: - Attachment: YARN-221-trunk-v3.patch Thanks [~gtCarrera9]. Here is the rebased patch. > NM should provide a way for AM to tell it not to aggregate logs. > > > Key: YARN-221 > URL: https://issues.apache.org/jira/browse/YARN-221 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Robert Joseph Evans >Assignee: Ming Ma > Labels: BB2015-05-TBR > Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch, > YARN-221-trunk-v3.patch > > > The NodeManager should provide a way for an AM to tell it that either the > logs should not be aggregated, that they should be aggregated with a high > priority, or that they should be aggregated but with a lower priority. The > AM should be able to do this in the ContainerLaunch context to provide a > default value, but should also be able to update the value when the container > is released. > This would allow for the NM to not aggregate logs in some cases, and avoid > connection to the NN at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3282) DockerContainerExecutor should support environment variables setting
[ https://issues.apache.org/jira/browse/YARN-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3282: - Labels: (was: BB2015-05-TBR) > DockerContainerExecutor should support environment variables setting > > > Key: YARN-3282 > URL: https://issues.apache.org/jira/browse/YARN-3282 > Project: Hadoop YARN > Issue Type: Improvement > Components: applications, nodemanager >Affects Versions: 2.6.0 >Reporter: Leitao Guo > Attachments: YARN-3282.01.patch > > > Currently, DockerContainerExecutor will mount "yarn.nodemanager.local-dirs" > and "yarn.nodemanager.log-dirs" to containers automatically. However > applications maybe need set more environment variables before launching > containers. > In our applications, just as the following command, we need to attach several > directories and set some environment variables to docker containers. > {code} > docker run -i -t -v /data/transcode:/data/tmp -v /etc/qcs:/etc/qcs -v > /mnt:/mnt -e VTC_MQTYPE=rabbitmq -e VTC_APP=ugc -e VTC_LOCATION=sh -e > VTC_RUNTIME=vtc sequenceiq/hadoop-docker:2.6.0 /bin/bash > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3602) TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails Intermittently due to IOException from cleanup
[ https://issues.apache.org/jira/browse/YARN-3602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536050#comment-14536050 ] Xuan Gong commented on YARN-3602: - Committed into trunk/branch-2. Thanks, zhihai. > TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails > Intermittently due to IOException from cleanup > -- > > Key: YARN-3602 > URL: https://issues.apache.org/jira/browse/YARN-3602 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Minor > Labels: BB2015-05-RFC > Fix For: 2.8.0 > > Attachments: YARN-3602.000.patch > > > ResourceLocalizationService.testPublicResourceInitializesLocalDir fails > Intermittently due to IOException from cleanup. The stack trace is the > following from test report at > https://builds.apache.org/job/PreCommit-YARN-Build/7729/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer/TestResourceLocalizationService/testPublicResourceInitializesLocalDir/ > {code} > Error Message > Unable to delete directory > target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService/2/filecache. > Stacktrace > java.io.IOException: Unable to delete directory > target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService/2/filecache. 
> at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1541) > at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270) > at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) > at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) > at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270) > at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) > at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.cleanup(TestResourceLocalizationService.java:187) > {code} > It looks like we can safely ignore the IOException in cleanup which is called > after test. > The IOException may be due to the test machine environment because > TestResourceLocalizationService/2/filecache is created by > ResourceLocalizationService#initializeLocalDir. > testPublicResourceInitializesLocalDir created 0/filecache, 1/filecache, > 2/filecache and 3/filecache > {code} > for (int i = 0; i < 4; ++i) { > localDirs.add(lfs.makeQualified(new Path(basedir, i + ""))); > sDirs[i] = localDirs.get(i).toString(); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
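The fix suggested above — ignoring the IOException thrown by the post-test cleanup — can be sketched as below. The test's actual cleanup uses commons-io's FileUtils; this stand-in uses only the JDK, and the helper name is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Sketch: best-effort recursive delete that never fails the surrounding test. */
class QuietCleanup {
    static void deleteQuietly(Path dir) {
        try (Stream<Path> walk = Files.walk(dir)) {
            // Sort deepest-first so children are deleted before their parents.
            walk.sorted(Comparator.reverseOrder())
                .forEach(p -> {
                    try {
                        Files.deleteIfExists(p);
                    } catch (IOException ignored) {
                        // Best effort only.
                    }
                });
        } catch (IOException ignored) {
            // Cleanup runs after the test; a failed delete here (e.g. a
            // directory still being populated) should not fail the test.
        }
    }
}
```

Catching and dropping the exception at both levels mirrors the "safely ignore the IOException in cleanup" observation, since the assertion phase of the test has already completed by then.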
[jira] [Commented] (YARN-3602) TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails Intermittently due to IOException from cleanup
[ https://issues.apache.org/jira/browse/YARN-3602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536046#comment-14536046 ] Xuan Gong commented on YARN-3602: - +1 LGTM. Will commit > TestResourceLocalizationService.testPublicResourceInitializesLocalDir fails > Intermittently due to IOException from cleanup > -- > > Key: YARN-3602 > URL: https://issues.apache.org/jira/browse/YARN-3602 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Minor > Labels: BB2015-05-RFC > Attachments: YARN-3602.000.patch > > > ResourceLocalizationService.testPublicResourceInitializesLocalDir fails > Intermittently due to IOException from cleanup. The stack trace is the > following from test report at > https://builds.apache.org/job/PreCommit-YARN-Build/7729/testReport/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer/TestResourceLocalizationService/testPublicResourceInitializesLocalDir/ > {code} > Error Message > Unable to delete directory > target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService/2/filecache. > Stacktrace > java.io.IOException: Unable to delete directory > target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService/2/filecache. 
> at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1541) > at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270) > at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) > at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) > at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270) > at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) > at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.cleanup(TestResourceLocalizationService.java:187) > {code} > It looks like we can safely ignore the IOException in cleanup which is called > after test. > The IOException may be due to the test machine environment because > TestResourceLocalizationService/2/filecache is created by > ResourceLocalizationService#initializeLocalDir. > testPublicResourceInitializesLocalDir created 0/filecache, 1/filecache, > 2/filecache and 3/filecache > {code} > for (int i = 0; i < 4; ++i) { > localDirs.add(lfs.makeQualified(new Path(basedir, i + ""))); > sDirs[i] = localDirs.get(i).toString(); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2517) Implement TimelineClientAsync
[ https://issues.apache.org/jira/browse/YARN-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2517: -- Labels: (was: BB2015-05-TBR) > Implement TimelineClientAsync > - > > Key: YARN-2517 > URL: https://issues.apache.org/jira/browse/YARN-2517 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Zhijie Shen >Assignee: Tsuyoshi Ozawa > Attachments: YARN-2517.1.patch, YARN-2517.2.patch > > > In some scenarios, we'd like to put timeline entities in another thread so as not to > block the current one. > It's good to have a TimelineClientAsync like AMRMClientAsync and > NMClientAsync. It can buffer entities, put them in a separate thread, and > have callbacks to handle the responses. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3505) Node's Log Aggregation Report with SUCCEED should not be cached in RMApps
[ https://issues.apache.org/jira/browse/YARN-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536037#comment-14536037 ] Xuan Gong commented on YARN-3505: - bq. In addition, what would happen if (this.logAggregationSucceed + this.logAggregationFailed) != this.logAggregationStatus.size()? If this happens, it means that log aggregation is still in progress on some of the NMs. Uploaded a new patch to address the other comments. > Node's Log Aggregation Report with SUCCEED should not be cached in RMApps > -- > > Key: YARN-3505 > URL: https://issues.apache.org/jira/browse/YARN-3505 > Project: Hadoop YARN > Issue Type: Sub-task > Components: log-aggregation >Affects Versions: 2.8.0 >Reporter: Junping Du >Assignee: Xuan Gong >Priority: Critical > Attachments: YARN-3505.1.patch, YARN-3505.2.patch, > YARN-3505.2.rebase.patch, YARN-3505.3.patch > > > Per discussions in YARN-1402, we shouldn't cache every node's log aggregation > reports in RMApps forever, especially those that finished with SUCCEED. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3505) Node's Log Aggregation Report with SUCCEED should not be cached in RMApps
[ https://issues.apache.org/jira/browse/YARN-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-3505: Attachment: YARN-3505.3.patch > Node's Log Aggregation Report with SUCCEED should not be cached in RMApps > -- > > Key: YARN-3505 > URL: https://issues.apache.org/jira/browse/YARN-3505 > Project: Hadoop YARN > Issue Type: Sub-task > Components: log-aggregation >Affects Versions: 2.8.0 >Reporter: Junping Du >Assignee: Xuan Gong >Priority: Critical > Attachments: YARN-3505.1.patch, YARN-3505.2.patch, > YARN-3505.2.rebase.patch, YARN-3505.3.patch > > > Per discussions in YARN-1402, we shouldn't cache every node's log aggregation > reports in RMApps forever, especially those that finished with SUCCEED. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend
[ https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536033#comment-14536033 ] Junping Du commented on YARN-3134: -- +1. Latest patch LGTM. > [Storage implementation] Exploiting the option of using Phoenix to access > HBase backend > --- > > Key: YARN-3134 > URL: https://issues.apache.org/jira/browse/YARN-3134 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Li Lu > Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, > YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, > YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, > YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, > YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, > YARN-3134-YARN-2928.005.patch, YARN-3134-YARN-2928.006.patch, > YARN-3134-YARN-2928.007.patch, YARN-3134DataSchema.pdf, > hadoop-zshen-nodemanager-d-128-95-184-84.dhcp4.washington.edu.out > > > Quote the introduction on Phoenix web page: > {code} > Apache Phoenix is a relational database layer over HBase delivered as a > client-embedded JDBC driver targeting low latency queries over HBase data. > Apache Phoenix takes your SQL query, compiles it into a series of HBase > scans, and orchestrates the running of those scans to produce regular JDBC > result sets. The table metadata is stored in an HBase table and versioned, > such that snapshot queries over prior versions will automatically use the > correct schema. Direct use of the HBase API, along with coprocessors and > custom filters, results in performance on the order of milliseconds for small > queries, or seconds for tens of millions of rows. > {code} > It may simplify how our implementation reads/writes data from/to HBase, and can > easily build indexes and compose complex queries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend
[ https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536024#comment-14536024 ] Hadoop QA commented on YARN-3134: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 51s | Pre-patch YARN-2928 compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 39s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 35s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-server-timelineservice. | | | | 26m 12s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731615/YARN-3134-YARN-2928.007.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | YARN-2928 / d4a2362 | | hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/7831/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7831/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7831/console | This message was automatically generated. 
> [Storage implementation] Exploiting the option of using Phoenix to access > HBase backend > --- > > Key: YARN-3134 > URL: https://issues.apache.org/jira/browse/YARN-3134 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Li Lu > Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, > YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, > YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, > YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, > YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, > YARN-3134-YARN-2928.005.patch, YARN-3134-YARN-2928.006.patch, > YARN-3134-YARN-2928.007.patch, YARN-3134DataSchema.pdf, > hadoop-zshen-nodemanager-d-128-95-184-84.dhcp4.washington.edu.out > > > Quote the introduction on Phoenix web page: > {code} > Apache Phoenix is a relational database layer over HBase delivered as a > client-embedded JDBC driver targeting low latency queries over HBase data. > Apache Phoenix takes your SQL query, compiles it into a series of HBase > scans, and orchestrates the running of those scans to produce regular JDBC > result sets. The table metadata is stored in an HBase table and versioned, > such that snapshot queries over prior versions will automatically use the > correct schema. Direct use of the HBase API, along with coprocessors and > custom filters, results in performance on the order of milliseconds for small > queries, or seconds for tens of millions of rows. > {code} > It may simplify how our implementation reads/writes data from/to HBase, and can > easily build indexes and compose complex queries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3611) Support Docker Containers In LinuxContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536023#comment-14536023 ] Sidharta Seethana commented on YARN-3611: - /cc [~ashahab], [~vinodkv] : Please chime in > Support Docker Containers In LinuxContainerExecutor > --- > > Key: YARN-3611 > URL: https://issues.apache.org/jira/browse/YARN-3611 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sidharta Seethana >Assignee: Sidharta Seethana > > Support Docker Containers In LinuxContainerExecutor > LinuxContainerExecutor provides useful functionality today with respect to > localization, cgroups based resource management and isolation for CPU, > network, disk etc. as well as security with a well-defined mechanism to > execute privileged operations using the container-executor utility. Bringing > docker support to LinuxContainerExecutor lets us use all of this > functionality when running docker containers under YARN, while not requiring > users and admins to configure and use a different ContainerExecutor. > There are several aspects here that need to be worked through : > * Mechanism(s) to let clients request docker-specific functionality - we > could initially implement this via environment variables without impacting > the client API. > * Security - both docker daemon as well as application > * Docker image localization > * Running a docker container via container-executor as a specified user > * “Isolate” the docker container in terms of CPU/network/disk/etc > * Communicating with and/or signaling the running container (ensure correct > pid handling) > * Figure out workarounds for certain performance-sensitive scenarios like > HDFS short-circuit reads > * All of these need to be achieved without changing the current behavior of > LinuxContainerExecutor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3611) Support Docker Containers In LinuxContainerExecutor
Sidharta Seethana created YARN-3611: --- Summary: Support Docker Containers In LinuxContainerExecutor Key: YARN-3611 URL: https://issues.apache.org/jira/browse/YARN-3611 Project: Hadoop YARN Issue Type: Bug Components: yarn Reporter: Sidharta Seethana Assignee: Sidharta Seethana Support Docker Containers In LinuxContainerExecutor LinuxContainerExecutor provides useful functionality today with respect to localization, cgroups based resource management and isolation for CPU, network, disk etc. as well as security with a well-defined mechanism to execute privileged operations using the container-executor utility. Bringing docker support to LinuxContainerExecutor lets us use all of this functionality when running docker containers under YARN, while not requiring users and admins to configure and use a different ContainerExecutor. There are several aspects here that need to be worked through : * Mechanism(s) to let clients request docker-specific functionality - we could initially implement this via environment variables without impacting the client API. * Security - both docker daemon as well as application * Docker image localization * Running a docker container via container-executor as a specified user * “Isolate” the docker container in terms of CPU/network/disk/etc * Communicating with and/or signaling the running container (ensure correct pid handling) * Figure out workarounds for certain performance-sensitive scenarios like HDFS short-circuit reads * All of these need to be achieved without changing the current behavior of LinuxContainerExecutor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
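The first bullet above — letting clients request docker-specific functionality via environment variables without changing the client API — could be prototyped as below. The environment variable name and the returned runtime labels are hypothetical placeholders, not the names the implementation eventually uses.

```java
import java.util.Map;

/** Sketch: choose a container runtime from the container launch environment. */
class RuntimeSelector {
    // Hypothetical variable name, for illustration only.
    static final String RUNTIME_ENV = "YARN_CONTAINER_RUNTIME_TYPE";

    static String select(Map<String, String> launchEnv) {
        String requested = launchEnv.get(RUNTIME_ENV);
        // Default to the existing LinuxContainerExecutor behavior unless the
        // client explicitly asked for docker -- preserving current behavior,
        // as the last bullet requires.
        return "docker".equals(requested) ? "docker" : "default";
    }
}
```

Because the selection key rides in the launch environment, an unmodified AM keeps today's behavior by simply not setting the variable.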
[jira] [Assigned] (YARN-2206) Update document for applications REST API response examples
[ https://issues.apache.org/jira/browse/YARN-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula reassigned YARN-2206: -- Assignee: Brahma Reddy Battula (was: Kenji Kikushima) > Update document for applications REST API response examples > --- > > Key: YARN-2206 > URL: https://issues.apache.org/jira/browse/YARN-2206 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 2.4.0 >Reporter: Kenji Kikushima >Assignee: Brahma Reddy Battula >Priority: Minor > Labels: newbie > Fix For: 2.8.0 > > Attachments: YARN-2206-002.patch, YARN-2206.patch > > > In ResourceManagerRest.apt.vm, Applications API responses are missing some > elements. > - JSON response should have "applicationType" and "applicationTags". > - XML response should have "applicationTags". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3395) [Fair Scheduler] Handle the user name correctly when user name is used as default queue name.
[ https://issues.apache.org/jira/browse/YARN-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535959#comment-14535959 ] zhihai xu commented on YARN-3395: - TestNodeLabelContainerAllocation failure is not related to the patch. Also the testReport https://builds.apache.org/job/PreCommit-YARN-Build/7826/testReport/ doesn't show this test failure. > [Fair Scheduler] Handle the user name correctly when user name is used as > default queue name. > - > > Key: YARN-3395 > URL: https://issues.apache.org/jira/browse/YARN-3395 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: zhihai xu >Assignee: zhihai xu > Attachments: YARN-3395.000.patch, YARN-3395.001.patch > > > Handle the user name correctly when user name is used as default queue name > in fair scheduler. > It will be better to remove the trailing and leading whitespace of the user > name when we use user name as default queue name, otherwise it will be > rejected by InvalidQueueNameException from QueueManager. I think it is > reasonable to make this change, because we already did special handling for > '.' in user name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
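A minimal sketch of the handling this issue proposes, assuming the name is trimmed before the existing special handling for '.' that the description mentions (the literal replacement string below is illustrative, not necessarily the scheduler's):

```java
/** Sketch: derive a default queue name from a user name. */
class DefaultQueueName {
    static String fromUser(String user) {
        // Strip leading/trailing whitespace so QueueManager does not reject
        // the derived name with InvalidQueueNameException.
        String cleaned = user.trim();
        // Stand-in for the existing special handling of '.' in user names.
        return cleaned.replace(".", "_dot_");
    }
}
```

Doing the trim first means a name like {code} first.last {code} survives both steps cleanly.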
[jira] [Commented] (YARN-1519) check if sysconf is implemented before using it
[ https://issues.apache.org/jira/browse/YARN-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535954#comment-14535954 ] Ravi Prakash commented on YARN-1519: Nitpick: We don't set an upper limit for something we are going to malloc. Earlier it was at least limited to INT_MAX. Now it's LONG_MAX. I'd rather keep typecasting the long to int. Otherwise +1. Please change that and I'll be happy to commit. > check if sysconf is implemented before using it > --- > > Key: YARN-1519 > URL: https://issues.apache.org/jira/browse/YARN-1519 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0, 2.3.0 >Reporter: Radim Kolar >Assignee: Radim Kolar > Labels: BB2015-05-TBR > Attachments: YARN-1519.002.patch, nodemgr-sysconf.txt > > > If sysconf value _SC_GETPW_R_SIZE_MAX is not implemented, it leads to > segfault because invalid pointer gets passed to libc function. > fix: enforce minimum value 1024, same method is used in hadoop-common native > code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
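The bounds discussed above amount to a clamp: a floor of 1024 when sysconf is unimplemented (or returns a tiny value), and INT_MAX as the ceiling before the value is handed to malloc. The real fix lives in the NodeManager's native C code; this Java sketch only illustrates the logic.

```java
/** Sketch of the buffer-size clamp discussed in the review (the real fix is C). */
class SysconfClamp {
    static final int MIN_BUF = 1024;   // floor also used in hadoop-common native code

    static int bufferSize(long sysconfValue) {
        if (sysconfValue < MIN_BUF) {
            // Covers sysconf returning -1 (value not implemented) as well as
            // implausibly small values.
            return MIN_BUF;
        }
        // Keep the upper limit Ravi asks for: never exceed INT_MAX before the
        // narrowing cast that feeds the allocation size.
        return (int) Math.min(sysconfValue, (long) Integer.MAX_VALUE);
    }
}
```

Clamping before the cast is the point of the nitpick: a raw `(int)` cast of a huge long would silently wrap, while the clamp keeps the allocation request bounded.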
[jira] [Updated] (YARN-1519) check if sysconf is implemented before using it
[ https://issues.apache.org/jira/browse/YARN-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated YARN-1519: --- Labels: BB2015-05-TBR (was: ) > check if sysconf is implemented before using it > --- > > Key: YARN-1519 > URL: https://issues.apache.org/jira/browse/YARN-1519 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0, 2.3.0 >Reporter: Radim Kolar >Assignee: Radim Kolar > Labels: BB2015-05-TBR > Attachments: YARN-1519.002.patch, nodemgr-sysconf.txt > > > If sysconf value _SC_GETPW_R_SIZE_MAX is not implemented, it leads to > segfault because invalid pointer gets passed to libc function. > fix: enforce minimum value 1024, same method is used in hadoop-common native > code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3395) [Fair Scheduler] Handle the user name correctly when user name is used as default queue name.
[ https://issues.apache.org/jira/browse/YARN-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535953#comment-14535953 ] Hadoop QA commented on YARN-3395: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 37s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 33s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 42s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 46s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 37s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 17s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:red}-1{color} | yarn tests | 62m 28s | Tests failed in hadoop-yarn-server-resourcemanager. 
| | | | 98m 58s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731543/YARN-3395.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / effcc5c | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7826/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7826/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7826/console | This message was automatically generated. > [Fair Scheduler] Handle the user name correctly when user name is used as > default queue name. > - > > Key: YARN-3395 > URL: https://issues.apache.org/jira/browse/YARN-3395 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: zhihai xu >Assignee: zhihai xu > Attachments: YARN-3395.000.patch, YARN-3395.001.patch > > > Handle the user name correctly when user name is used as default queue name > in fair scheduler. > It will be better to remove the trailing and leading whitespace of the user > name when we use user name as default queue name, otherwise it will be > rejected by InvalidQueueNameException from QueueManager. I think it is > reasonable to make this change, because we already did special handling for > '.' in user name. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3595) Performance optimization using connection cache of Phoenix timeline writer
[ https://issues.apache.org/jira/browse/YARN-3595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535945#comment-14535945 ] Li Lu commented on YARN-3595: - I put some thoughts on this. The problem with simply letting the write process and the removal listener calls synchronize with each other is the visibility of the connection object. Before we get the connection from the cache it's not possible for us to synchronize on it. However, after we get the connection from the cache and right before we start the synchronized block, the connection may have already been evicted from the cache and been closed. To address this problem, we need to do a speculative read on these connections, and "roll back" if we notice we had a stale connection from the cache. I think we can have a thin layer of "connection wrapper" for each connection. The wrapper stores a flag to indicate if the connection inside it is still valid. On a speculative get call (from our write(TimelineEntity) method), we keep trying: # get a connection wrapper # synchronize on the wrapper # if the wrapper is invalid, try the next round. # do normal write operations with the connection inside the wrapper On a removal call, we do the following: # synchronize on the wrapper # mark the wrapper as invalid # close the connection We can think about the case when a removal call's synchronization block is serialized just before a write's. In this case, even if the write call got a stale connection wrapper that contains a connection about to be closed, it will notice the flag and attempt to obtain a newer connection. Concurrent modifications to the same connection are of course not possible between write calls and removal calls, as they work on the same lock. Given the fine-grained synchronization pattern (only synchronizing between one writer thread and one Guava cache cleanup thread), contention should not be a big problem. 
Overhead for acquiring the lock for the synchronization statement for each writer is also acceptable I assume. The only problem may be, since we're performing JDBC operations inside a synchronized statement, we may block the cache removal process for a long time. Ideally we may make this algorithm obstruction-free, but maybe we can firstly understand how severe the problem is before we make more complex algorithm changes. Also, it appears to be possible to make the removal methods asynchronous. > Performance optimization using connection cache of Phoenix timeline writer > -- > > Key: YARN-3595 > URL: https://issues.apache.org/jira/browse/YARN-3595 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > > The story about the connection cache in Phoenix timeline storage is a little > bit long. In YARN-3033 we planned to have shared writer layer for all > collectors in the same collector manager. In this way we can better reuse the > same heavy-weight storage layer connection, therefore it's more friendly to > conventional storage layer connections which are typically heavy-weight. > Phoenix, on the other hand, implements its own connection interface layer to > be light-weight, thread-unsafe. To make these connections work with our > "multiple collector, single writer" model, we're adding a thread indexed > connection cache. However, many performance critical factors are yet to be > tested. > In this JIRA we're tracing performance optimization efforts using this > connection cache. Previously we had a draft, but there was one implementation > challenge on cache evictions: There may be races between Guava cache's > removal listener calls (which close the connection) and normal references to > the connection. We need to carefully define the way they synchronize. 
> Performance-wise, at the very beginning stage we may need to understand: > # whether the current, thread-based indexing is an appropriate approach, or whether we can > use better ways to index the connections. > # the best size of the cache, presumably as the proposed default value of a > configuration. > # how long we need to preserve a connection in the cache. > Please feel free to add to this list. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
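The speculative-get / invalidate protocol described in the comment above can be sketched as below. Everything here is a stand-in (no real Phoenix connection or Guava cache is involved); only the locking pattern between the writer's speculative read and the removal listener is the point.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Stand-in for a thread-unsafe Phoenix connection. */
class FakeConnection {
    private boolean closed = false;
    void close() { closed = true; }
    void write(String entity) {
        if (closed) throw new IllegalStateException("write on closed connection");
    }
}

/** Wrapper with a validity flag; writer and removal listener lock on it. */
class ConnWrapper {
    private final FakeConnection conn = new FakeConnection();
    private boolean valid = true;

    /** Writer side: speculative use; false means the wrapper was stale. */
    synchronized boolean tryWrite(String entity) {
        if (!valid) {
            return false;        // evicted and closed: caller must retry
        }
        conn.write(entity);      // safe: invalidateAndClose() cannot interleave here
        return true;
    }

    /** Removal-listener side: mark invalid, then close, under the same lock. */
    synchronized void invalidateAndClose() {
        valid = false;
        conn.close();
    }
}

/** Stand-in for the thread-indexed cache plus the write path. */
class CacheSketch {
    private final AtomicReference<ConnWrapper> slot =
            new AtomicReference<>(new ConnWrapper());

    ConnWrapper current() { return slot.get(); }

    /** write(TimelineEntity) analogue: keep trying until a live wrapper is used. */
    void write(String entity) {
        while (true) {
            ConnWrapper w = slot.get();
            if (w.tryWrite(entity)) {
                return;
            }
            // Stale wrapper: replace it, as a cache miss would load a new connection.
            slot.compareAndSet(w, new ConnWrapper());
        }
    }
}
```

Because both `tryWrite` and `invalidateAndClose` hold the wrapper's monitor, a removal serialized just before a write leaves the writer holding a stale wrapper, which it detects via the flag and retries — exactly the scenario the comment walks through.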
[jira] [Commented] (YARN-1287) Consolidate MockClocks
[ https://issues.apache.org/jira/browse/YARN-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535937#comment-14535937 ] Karthik Kambatla commented on YARN-1287: +1, pending Jenkins. > Consolidate MockClocks > -- > > Key: YARN-1287 > URL: https://issues.apache.org/jira/browse/YARN-1287 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Sandy Ryza >Assignee: Sebastian Wong > Labels: newbie > Attachments: YARN-1287-3.patch, YARN-1287.004.patch > > > A bunch of different tests have near-identical implementations of MockClock. > TestFairScheduler, TestFSSchedulerApp, and TestCgroupsLCEResourcesHandler for > example. They should be consolidated into a single MockClock. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2748) Upload logs in the sub-folders under the local log dir when aggregating logs
[ https://issues.apache.org/jira/browse/YARN-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535935#comment-14535935 ] Hadoop QA commented on YARN-2748: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 39s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 53s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 38s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 23s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | yarn tests | 1m 55s | Tests passed in hadoop-yarn-common. 
| | | | 38m 40s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731620/YARN-2748.04.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 315074b | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/7828/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7828/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7828/console | This message was automatically generated. > Upload logs in the sub-folders under the local log dir when aggregating logs > > > Key: YARN-2748 > URL: https://issues.apache.org/jira/browse/YARN-2748 > Project: Hadoop YARN > Issue Type: Sub-task > Components: log-aggregation >Affects Versions: 2.6.0 >Reporter: Zhijie Shen >Assignee: Varun Saxena > Labels: BB2015-05-RFC > Attachments: YARN-2748.001.patch, YARN-2748.002.patch, > YARN-2748.03.patch, YARN-2748.04.patch > > > YARN-2734 has a temporal fix to skip sub folders to avoid exception. Ideally, > if the app is creating a sub folder and putting its rolling logs there, we > need to upload these logs as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
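The YARN-2748 description above is about aggregating rolling logs that an app writes into sub-folders of its local log dir. As a hedged, self-contained sketch (plain java.io, not the actual NodeManager log-aggregation code, and the class/method names here are invented for illustration), the heart of such a change is a recursive walk instead of a flat directory listing:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only, not the YARN-2748 patch: recursively collect
// log files under a container's local log dir so files that an app rolls
// into sub-folders are picked up for aggregation instead of being skipped.
class LogFileCollector {
  static List<File> collectLogFiles(File dir) {
    List<File> result = new ArrayList<>();
    File[] entries = dir.listFiles();
    if (entries == null) {
      return result; // not a directory, or an I/O error
    }
    for (File entry : entries) {
      if (entry.isDirectory()) {
        result.addAll(collectLogFiles(entry)); // descend into sub-folders
      } else {
        result.add(entry);
      }
    }
    return result;
  }
}
```

The real patch works against Hadoop's FileContext abstractions rather than java.io.File, but the recursion-versus-flat-listing distinction is the same.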
[jira] [Commented] (YARN-3473) Fix RM Web UI configuration for some properties
[ https://issues.apache.org/jira/browse/YARN-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535932#comment-14535932 ] Ray Chiang commented on YARN-3473: -- Thanks for the review and the commit! > Fix RM Web UI configuration for some properties > --- > > Key: YARN-3473 > URL: https://issues.apache.org/jira/browse/YARN-3473 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.0 >Reporter: Ray Chiang >Assignee: Ray Chiang >Priority: Minor > Labels: supportability > Fix For: 2.8.0 > > Attachments: YARN-3473.001.patch > > > Using the RM Web UI, the Tools->Configuration page shows some properties as > something like "BufferedInputStream" instead of the appropriate .xml file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3381) A typographical error in "InvalidStateTransitonException"
[ https://issues.apache.org/jira/browse/YARN-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535931#comment-14535931 ] Ray Chiang commented on YARN-3381: -- If we want to avoid the Incompatible flag, then I think it would be a good idea since a renamed class means you will be generating incompatible jars. [~aw] or [~vinodkv], would you agree? > A typographical error in "InvalidStateTransitonException" > - > > Key: YARN-3381 > URL: https://issues.apache.org/jira/browse/YARN-3381 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.6.0 >Reporter: Xiaoshuang LU >Assignee: Brahma Reddy Battula > Labels: BB2015-05-TBR > Attachments: YARN-3381-002.patch, YARN-3381-003.patch, YARN-3381.patch > > > Appears that "InvalidStateTransitonException" should be > "InvalidStateTransitionException". Transition was misspelled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
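The compatibility concern Ray raises above — renaming a public exception class breaks previously compiled consumers — is commonly handled by keeping the old name alive as a deprecated alias. A minimal sketch of that approach (class bodies simplified; this is one possible shape, not the actual YARN-3381 patch):

```java
// Hedged sketch: introduce the correctly spelled exception and keep the
// misspelled one as a deprecated subclass, so existing catch blocks and
// already-compiled jars that reference the old name keep working.
class InvalidStateTransitionException extends Exception {
  InvalidStateTransitionException(String message) {
    super(message);
  }
}

/** @deprecated Use {@link InvalidStateTransitionException} instead. */
@Deprecated
class InvalidStateTransitonException extends InvalidStateTransitionException {
  InvalidStateTransitonException(String message) {
    super(message);
  }
}
```

Because the old class is a subtype of the new one, code that throws the misspelled exception is still caught by handlers written against the corrected name.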
[jira] [Updated] (YARN-1519) check if sysconf is implemented before using it
[ https://issues.apache.org/jira/browse/YARN-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravi Prakash updated YARN-1519: --- Labels: (was: BB2015-05-RFC) > check if sysconf is implemented before using it > --- > > Key: YARN-1519 > URL: https://issues.apache.org/jira/browse/YARN-1519 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0, 2.3.0 >Reporter: Radim Kolar >Assignee: Radim Kolar > Attachments: YARN-1519.002.patch, nodemgr-sysconf.txt > > > If sysconf value _SC_GETPW_R_SIZE_MAX is not implemented, it leads to > segfault because invalid pointer gets passed to libc function. > fix: enforce minimum value 1024, same method is used in hadoop-common native > code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1050) Document the Fair Scheduler REST API
[ https://issues.apache.org/jira/browse/YARN-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535923#comment-14535923 ] Hadoop QA commented on YARN-1050: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731598/YARN-1050-4.patch | | Optional Tests | site | | git revision | trunk / 08f0ae4 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7830/console | This message was automatically generated. > Document the Fair Scheduler REST API > > > Key: YARN-1050 > URL: https://issues.apache.org/jira/browse/YARN-1050 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation, fairscheduler >Reporter: Sandy Ryza >Assignee: Kenji Kikushima > Fix For: 2.8.0 > > Attachments: YARN-1050-2.patch, YARN-1050-3.patch, YARN-1050-4.patch, > YARN-1050.patch > > > The documentation should be placed here along with the Capacity Scheduler > documentation: > http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Scheduler_API -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3271) FairScheduler: Move tests related to max-runnable-apps from TestFairScheduler to TestAppRunnability
[ https://issues.apache.org/jira/browse/YARN-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535907#comment-14535907 ] Karthik Kambatla commented on YARN-3271: +1 > FairScheduler: Move tests related to max-runnable-apps from TestFairScheduler > to TestAppRunnability > --- > > Key: YARN-3271 > URL: https://issues.apache.org/jira/browse/YARN-3271 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Karthik Kambatla >Assignee: nijel > Attachments: YARN-3271.1.patch, YARN-3271.2.patch, YARN-3271.3.patch, > YARN-3271.4.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3426) Add jdiff support to YARN
[ https://issues.apache.org/jira/browse/YARN-3426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535903#comment-14535903 ] Li Lu commented on YARN-3426: - Sure. I'll cancel the patch. > Add jdiff support to YARN > - > > Key: YARN-3426 > URL: https://issues.apache.org/jira/browse/YARN-3426 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Li Lu >Assignee: Li Lu >Priority: Blocker > Labels: BB2015-05-TBR > Attachments: YARN-3426-040615-1.patch, YARN-3426-040615.patch, > YARN-3426-040715.patch, YARN-3426-040815.patch > > > Maybe we'd like to extend our current jdiff tool for hadoop-common and hdfs > to YARN as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3604) removeApplication in ZKRMStateStore should also disable watch.
[ https://issues.apache.org/jira/browse/YARN-3604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-3604: - Labels: (was: BB2015-05-RFC) > removeApplication in ZKRMStateStore should also disable watch. > -- > > Key: YARN-3604 > URL: https://issues.apache.org/jira/browse/YARN-3604 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Reporter: zhihai xu >Assignee: zhihai xu >Priority: Minor > Fix For: 2.8.0 > > Attachments: YARN-3604.000.patch > > > removeApplication in ZKRMStateStore should also disable watch. > Function removeApplication is added from YARN-3410. > YARN-3469 is to disable watch for all function in ZKRMStateStore. > So it looks like YARN-3410 missed the change from YARN-3469 because YARN-3410 > added removeApplication after YARN-3469 is committed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
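The point of the YARN-3469/YARN-3604 change above is that reads performed while removing state should not register ZooKeeper watches: a watch left on a node that is about to be deleted is never useful and just accumulates server-side. A self-contained sketch (using a tiny stand-in interface instead of the real org.apache.zookeeper.ZooKeeper client, with method names chosen to mirror its exists(path, watch) signature — not the ZKRMStateStore code itself):

```java
// Stand-in for the ZooKeeper client surface used here; the boolean mirrors
// ZooKeeper#exists(String path, boolean watch).
interface ZkOps {
  boolean exists(String path, boolean watch) throws Exception;
  void delete(String path) throws Exception;
}

class RemoveAppSketch {
  // Illustrative only: pass watch=false when checking a node we are about
  // to remove, so no watch is left behind on the deleted path.
  static void removeApplication(ZkOps zk, String appPath) throws Exception {
    if (zk.exists(appPath, false)) {
      zk.delete(appPath);
    }
  }
}
```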
[jira] [Updated] (YARN-1050) Document the Fair Scheduler REST API
[ https://issues.apache.org/jira/browse/YARN-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-1050: --- Labels: (was: BB2015-05-TBR) > Document the Fair Scheduler REST API > > > Key: YARN-1050 > URL: https://issues.apache.org/jira/browse/YARN-1050 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation, fairscheduler >Reporter: Sandy Ryza >Assignee: Kenji Kikushima > Fix For: 2.8.0 > > Attachments: YARN-1050-2.patch, YARN-1050-3.patch, YARN-1050-4.patch, > YARN-1050.patch > > > The documentation should be placed here along with the Capacity Scheduler > documentation: > http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Scheduler_API -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2206) Update document for applications REST API response examples
[ https://issues.apache.org/jira/browse/YARN-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535888#comment-14535888 ] Zhijie Shen commented on YARN-2206: --- +1 will commit the patch. > Update document for applications REST API response examples > --- > > Key: YARN-2206 > URL: https://issues.apache.org/jira/browse/YARN-2206 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 2.4.0 >Reporter: Kenji Kikushima >Assignee: Kenji Kikushima >Priority: Minor > Labels: newbie > Attachments: YARN-2206-002.patch, YARN-2206.patch > > > In ResourceManagerRest.apt.vm, Applications API responses are missing some > elements. > - JSON response should have "applicationType" and "applicationTags". > - XML response should have "applicationTags". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1050) Document the Fair Scheduler REST API
[ https://issues.apache.org/jira/browse/YARN-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535886#comment-14535886 ] Karthik Kambatla commented on YARN-1050: +1. Committed this to trunk and branch-2. > Document the Fair Scheduler REST API > > > Key: YARN-1050 > URL: https://issues.apache.org/jira/browse/YARN-1050 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation, fairscheduler >Reporter: Sandy Ryza >Assignee: Kenji Kikushima > Labels: BB2015-05-TBR > Attachments: YARN-1050-2.patch, YARN-1050-3.patch, YARN-1050-4.patch, > YARN-1050.patch > > > The documentation should be placed here along with the Capacity Scheduler > documentation: > http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Scheduler_API -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3423) RM HA setup, "Cluster" tab links populated with AM hostname instead of RM
[ https://issues.apache.org/jira/browse/YARN-3423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-3423: -- Assignee: zhaoyunjiong > RM HA setup, "Cluster" tab links populated with AM hostname instead of RM > -- > > Key: YARN-3423 > URL: https://issues.apache.org/jira/browse/YARN-3423 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0 > Environment: centOS-6.x >Reporter: Aroop Maliakkal >Assignee: zhaoyunjiong >Priority: Minor > Labels: BB2015-05-TBR > Attachments: YARN-3423.patch > > > In RM HA setup ( e.g. > http://rm-1.vip.abc.com:50030/proxy/application_1427789305393_0002/ ), go to > the job details and click on the "Cluster tab" on left top side. Click on any > of the links , "About", Applications" , "Scheduler". You can see that the > hyperlink is pointing to http://am-1.vip.abc.com:port/cluster ). > The port details for secure and unsecure cluster is given below :- > 8088 ( DEFAULT_RM_WEBAPP_PORT = 8088 ) > 8090 ( DEFAULT_RM_WEBAPP_HTTPS_PORT = 8090 ) > Ideally, it should have pointed to resourcemanager hostname instead of AM > hostname. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3426) Add jdiff support to YARN
[ https://issues.apache.org/jira/browse/YARN-3426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535881#comment-14535881 ] Junping Du commented on YARN-3426: -- bq. The problem for the current solution is we're duplicating many maven code for hadoop-common/hdfs and yarn. We're also introducing duplications to mapreduce in the current approach. The next step for this work should be removing the duplications for those maven code. Hi [~gtCarrera9], shall we cancel the patch here until we figured out these problems? > Add jdiff support to YARN > - > > Key: YARN-3426 > URL: https://issues.apache.org/jira/browse/YARN-3426 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Li Lu >Assignee: Li Lu >Priority: Blocker > Labels: BB2015-05-TBR > Attachments: YARN-3426-040615-1.patch, YARN-3426-040615.patch, > YARN-3426-040715.patch, YARN-3426-040815.patch > > > Maybe we'd like to extend our current jdiff tool for hadoop-common and hdfs > to YARN as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3271) FairScheduler: Move tests related to max-runnable-apps from TestFairScheduler to TestAppRunnability
[ https://issues.apache.org/jira/browse/YARN-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kambatla updated YARN-3271: --- Labels: (was: BB2015-05-RFC) > FairScheduler: Move tests related to max-runnable-apps from TestFairScheduler > to TestAppRunnability > --- > > Key: YARN-3271 > URL: https://issues.apache.org/jira/browse/YARN-3271 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Karthik Kambatla >Assignee: nijel > Attachments: YARN-3271.1.patch, YARN-3271.2.patch, YARN-3271.3.patch, > YARN-3271.4.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3608) Apps submitted to MiniYarnCluster always stay in ACCEPTED state.
[ https://issues.apache.org/jira/browse/YARN-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535875#comment-14535875 ] Karthik Kambatla commented on YARN-3608: Not sure if it was a MiniYarnCluster bug, but definitely TestRMContainerAllocator runs into a similar issue. > Apps submitted to MiniYarnCluster always stay in ACCEPTED state. > > > Key: YARN-3608 > URL: https://issues.apache.org/jira/browse/YARN-3608 > Project: Hadoop YARN > Issue Type: Bug > Components: applications >Affects Versions: 2.6.0 >Reporter: Spandan Dutta > > So I adapted a test case to submit a yarn app to a MiniYarnCluster and wait > for it to reach running state. Turns out that the app gets stuck in > "ACCEPTED" state. > {noformat} > @Test > public void testGetAllQueues() throws IOException, YarnException, > InterruptedException { > MiniYARNCluster cluster = new MiniYARNCluster("testMRAMTokens", 1, 1, 1); > YarnClient rmClient = null; > try { > cluster.init(new YarnConfiguration()); > cluster.start(); > final Configuration yarnConf = cluster.getConfig(); > rmClient = YarnClient.createYarnClient(); > rmClient.init(yarnConf); > rmClient.start(); > YarnClientApplication newApp = rmClient.createApplication(); > ApplicationId appId = > newApp.getNewApplicationResponse().getApplicationId(); > // Create launch context for app master > ApplicationSubmissionContext appContext > = Records.newRecord(ApplicationSubmissionContext.class); > // set the application id > appContext.setApplicationId(appId); > // set the application name > appContext.setApplicationName("test"); > // Set up the container launch context for the application master > ContainerLaunchContext amContainer > = Records.newRecord(ContainerLaunchContext.class); > appContext.setAMContainerSpec(amContainer); > appContext.setResource(Resource.newInstance(1024, 1)); > // Submit the application to the applications manager > rmClient.submitApplication(appContext); > ApplicationReport applicationReport 
= > rmClient.getApplicationReport(appContext.getApplicationId()); > int timeout = 10; > while(timeout > 0 && applicationReport.getYarnApplicationState() != > YarnApplicationState.RUNNING) { > Thread.sleep(5 * 1000); > timeout--; > } > Assert.assertTrue(timeout != 0); > Assert.assertTrue(applicationReport.getYarnApplicationState() > == YarnApplicationState.RUNNING); > List queues = rmClient.getAllQueues(); > Assert.assertNotNull(queues); > Assert.assertTrue(!queues.isEmpty()); > QueueInfo queue = queues.get(0); > List queueApplications = queue.getApplications(); > Assert.assertFalse(queueApplications.isEmpty()); > } catch (YarnException e) { > Assert.assertTrue(e.getMessage().contains("Failed to submit")); > } finally { > if (rmClient != null) { > rmClient.stop(); > } > cluster.stop(); > } > } > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
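One thing worth noting about the quoted test, separate from whatever MiniYARNCluster is doing: the report is fetched once before the loop and never refreshed inside it, so the loop can never observe a state change even if the app does start running. A polling loop normally re-fetches on every iteration, sketched here with a generic supplier (names invented for illustration, not YARN API):

```java
import java.util.function.Supplier;

// Illustrative polling helper: re-evaluate the current state on EVERY
// iteration, rather than capturing it once before the loop as the quoted
// test does.
class PollUntil {
  static boolean pollUntil(Supplier<String> currentState, String desired,
                           int attempts, long sleepMillis)
      throws InterruptedException {
    for (int i = 0; i < attempts; i++) {
      if (desired.equals(currentState.get())) { // refreshed each pass
        return true;
      }
      Thread.sleep(sleepMillis);
    }
    return false;
  }
}
```

In the quoted test, the supplier would be a call to rmClient.getApplicationReport(...) made inside the loop body.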
[jira] [Commented] (YARN-3608) Apps submitted to MiniYarnCluster always stay in ACCEPTED state.
[ https://issues.apache.org/jira/browse/YARN-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535877#comment-14535877 ] Karthik Kambatla commented on YARN-3608: May be, something to do with the dispatcher? > Apps submitted to MiniYarnCluster always stay in ACCEPTED state. > > > Key: YARN-3608 > URL: https://issues.apache.org/jira/browse/YARN-3608 > Project: Hadoop YARN > Issue Type: Bug > Components: applications >Affects Versions: 2.6.0 >Reporter: Spandan Dutta > > So I adapted a test case to submit a yarn app to a MiniYarnCluster and wait > for it to reach running state. Turns out that the app gets stuck in > "ACCEPTED" state. > {noformat} > @Test > public void testGetAllQueues() throws IOException, YarnException, > InterruptedException { > MiniYARNCluster cluster = new MiniYARNCluster("testMRAMTokens", 1, 1, 1); > YarnClient rmClient = null; > try { > cluster.init(new YarnConfiguration()); > cluster.start(); > final Configuration yarnConf = cluster.getConfig(); > rmClient = YarnClient.createYarnClient(); > rmClient.init(yarnConf); > rmClient.start(); > YarnClientApplication newApp = rmClient.createApplication(); > ApplicationId appId = > newApp.getNewApplicationResponse().getApplicationId(); > // Create launch context for app master > ApplicationSubmissionContext appContext > = Records.newRecord(ApplicationSubmissionContext.class); > // set the application id > appContext.setApplicationId(appId); > // set the application name > appContext.setApplicationName("test"); > // Set up the container launch context for the application master > ContainerLaunchContext amContainer > = Records.newRecord(ContainerLaunchContext.class); > appContext.setAMContainerSpec(amContainer); > appContext.setResource(Resource.newInstance(1024, 1)); > // Submit the application to the applications manager > rmClient.submitApplication(appContext); > ApplicationReport applicationReport = > 
rmClient.getApplicationReport(appContext.getApplicationId()); > int timeout = 10; > while(timeout > 0 && applicationReport.getYarnApplicationState() != > YarnApplicationState.RUNNING) { > Thread.sleep(5 * 1000); > timeout--; > } > Assert.assertTrue(timeout != 0); > Assert.assertTrue(applicationReport.getYarnApplicationState() > == YarnApplicationState.RUNNING); > List queues = rmClient.getAllQueues(); > Assert.assertNotNull(queues); > Assert.assertTrue(!queues.isEmpty()); > QueueInfo queue = queues.get(0); > List queueApplications = queue.getApplications(); > Assert.assertFalse(queueApplications.isEmpty()); > } catch (YarnException e) { > Assert.assertTrue(e.getMessage().contains("Failed to submit")); > } finally { > if (rmClient != null) { > rmClient.stop(); > } > cluster.stop(); > } > } > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3423) RM HA setup, "Cluster" tab links populated with AM hostname instead of RM
[ https://issues.apache.org/jira/browse/YARN-3423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535863#comment-14535863 ] Junping Du commented on YARN-3423: -- bq. So we need to review all uses on a case by case basis. Agree. Just check all other places call getResolvedRMWebAppURLWithoutScheme(), all (except this one) consider HA case so no need to replace. +1 on latest patch. Will commit it shortly. > RM HA setup, "Cluster" tab links populated with AM hostname instead of RM > -- > > Key: YARN-3423 > URL: https://issues.apache.org/jira/browse/YARN-3423 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.4.0 > Environment: centOS-6.x >Reporter: Aroop Maliakkal >Priority: Minor > Labels: BB2015-05-TBR > Attachments: YARN-3423.patch > > > In RM HA setup ( e.g. > http://rm-1.vip.abc.com:50030/proxy/application_1427789305393_0002/ ), go to > the job details and click on the "Cluster tab" on left top side. Click on any > of the links , "About", Applications" , "Scheduler". You can see that the > hyperlink is pointing to http://am-1.vip.abc.com:port/cluster ). > The port details for secure and unsecure cluster is given below :- > 8088 ( DEFAULT_RM_WEBAPP_PORT = 8088 ) > 8090 ( DEFAULT_RM_WEBAPP_HTTPS_PORT = 8090 ) > Ideally, it should have pointed to resourcemanager hostname instead of AM > hostname. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2306) leak of reservation metrics (fair scheduler)
[ https://issues.apache.org/jira/browse/YARN-2306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-2306: - Labels: BB2015-05-TBR (was: ) > leak of reservation metrics (fair scheduler) > > > Key: YARN-2306 > URL: https://issues.apache.org/jira/browse/YARN-2306 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Hong Zhiguo >Assignee: Hong Zhiguo >Priority: Minor > Labels: BB2015-05-TBR > Attachments: YARN-2306-2.patch, YARN-2306.patch > > > This only applies to the fair scheduler; the capacity scheduler is OK. > When an appAttempt or node is removed, the reservation metrics > (reservedContainers, reservedMB, reservedVCores) are not reduced > back. > These are important metrics for administrators, and the wrong values may > confuse them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)

[jira] [Commented] (YARN-3473) Fix RM Web UI configuration for some properties
[ https://issues.apache.org/jira/browse/YARN-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535856#comment-14535856 ] Robert Kanter commented on YARN-3473: - +1 > Fix RM Web UI configuration for some properties > --- > > Key: YARN-3473 > URL: https://issues.apache.org/jira/browse/YARN-3473 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.0 >Reporter: Ray Chiang >Assignee: Ray Chiang >Priority: Minor > Labels: supportability > Attachments: YARN-3473.001.patch > > > Using the RM Web UI, the Tools->Configuration page shows some properties as > something like "BufferedInputStream" instead of the appropriate .xml file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1050) Document the Fair Scheduler REST API
[ https://issues.apache.org/jira/browse/YARN-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535853#comment-14535853 ] Karthik Kambatla commented on YARN-1050: Thanks Roman for updating the patch, and Ray for the review. Given this patch covers most of the REST API and has been sitting for a while, I ll go ahead and commit this. Filed YARN-3610 for the follow-up, so it doesn't fall through the cracks. > Document the Fair Scheduler REST API > > > Key: YARN-1050 > URL: https://issues.apache.org/jira/browse/YARN-1050 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation, fairscheduler >Reporter: Sandy Ryza >Assignee: Kenji Kikushima > Labels: BB2015-05-TBR > Attachments: YARN-1050-2.patch, YARN-1050-3.patch, YARN-1050-4.patch, > YARN-1050.patch > > > The documentation should be placed here along with the Capacity Scheduler > documentation: > http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Scheduler_API -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3473) Fix RM Web UI configuration for some properties
[ https://issues.apache.org/jira/browse/YARN-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter updated YARN-3473: Labels: supportability (was: BB2015-05-RFC supportability) > Fix RM Web UI configuration for some properties > --- > > Key: YARN-3473 > URL: https://issues.apache.org/jira/browse/YARN-3473 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.0 >Reporter: Ray Chiang >Assignee: Ray Chiang >Priority: Minor > Labels: supportability > Attachments: YARN-3473.001.patch > > > Using the RM Web UI, the Tools->Configuration page shows some properties as > something like "BufferedInputStream" instead of the appropriate .xml file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-3610) FairScheduler: Add steady-fair-shares to the REST API documentation
Karthik Kambatla created YARN-3610: -- Summary: FairScheduler: Add steady-fair-shares to the REST API documentation Key: YARN-3610 URL: https://issues.apache.org/jira/browse/YARN-3610 Project: Hadoop YARN Issue Type: Improvement Components: documentation, fairscheduler Affects Versions: 2.7.0 Reporter: Karthik Kambatla Assignee: Ray Chiang YARN-1050 adds documentation for FairScheduler REST API, but is missing the steady-fair-share. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1297) Miscellaneous Fair Scheduler speedups
[ https://issues.apache.org/jira/browse/YARN-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535842#comment-14535842 ] Karthik Kambatla commented on YARN-1297: The test failures may be because of how the patch changes the resource-usage calculations. [~asuresh] - can you look into the failures? > Miscellaneous Fair Scheduler speedups > - > > Key: YARN-1297 > URL: https://issues.apache.org/jira/browse/YARN-1297 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Labels: BB2015-05-TBR > Attachments: YARN-1297-1.patch, YARN-1297-2.patch, YARN-1297.3.patch, > YARN-1297.patch, YARN-1297.patch > > > I ran the Fair Scheduler's core scheduling loop through a profiler tool and > identified a bunch of minimally invasive changes that can shave off a few > milliseconds. > The main one is demoting a couple INFO log messages to DEBUG, which brought > my benchmark down from 16000 ms to 6000. > A few others (which had way less of an impact) were > * Most of the time in comparisons was being spent in Math.signum. I switched > this to direct ifs and elses and it halved the percent of time spent in > comparisons. > * I removed some unnecessary instantiations of Resource objects > * I made it so that queues' usage wasn't calculated from the applications up > each time getResourceUsage was called. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
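The Math.signum change Sandy describes above is easy to illustrate. Math.signum only has float/double overloads, so using it to compare integral usage values forces a floating-point round-trip (and a subtraction that can overflow for extreme values); direct branches stay in integer code. A hedged, simplified sketch — not the FairScheduler comparator itself:

```java
// Illustrative only. Before (per the description above), comparisons went
// through something like:
//   int result = (int) Math.signum(a - b);   // float round-trip, and a - b
//                                            // can overflow for extreme values
// After: branch directly on the values.
class CompareSketch {
  static int compareUsage(long a, long b) {
    if (a < b) {
      return -1;
    } else if (a > b) {
      return 1;
    }
    return 0;
  }
}
```

This is also exactly what Long.compare(a, b) does, which is the idiomatic form in current Java.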
[jira] [Commented] (YARN-1050) Document the Fair Scheduler REST API
[ https://issues.apache.org/jira/browse/YARN-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535832#comment-14535832 ] Ray Chiang commented on YARN-1050: -- +1 (nonbinding) The fields are missing, but that can be done as a follow up JIRA. > Document the Fair Scheduler REST API > > > Key: YARN-1050 > URL: https://issues.apache.org/jira/browse/YARN-1050 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation, fairscheduler >Reporter: Sandy Ryza >Assignee: Kenji Kikushima > Labels: BB2015-05-TBR > Attachments: YARN-1050-2.patch, YARN-1050-3.patch, YARN-1050-4.patch, > YARN-1050.patch > > > The documentation should be placed here along with the Capacity Scheduler > documentation: > http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Scheduler_API -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3381) A typographical error in "InvalidStateTransitonException"
[ https://issues.apache.org/jira/browse/YARN-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535823#comment-14535823 ] Brahma Reddy Battula commented on YARN-3381: do you mean , I need to upload patch based on [~vinodkv]..? > A typographical error in "InvalidStateTransitonException" > - > > Key: YARN-3381 > URL: https://issues.apache.org/jira/browse/YARN-3381 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.6.0 >Reporter: Xiaoshuang LU >Assignee: Brahma Reddy Battula > Labels: BB2015-05-TBR > Attachments: YARN-3381-002.patch, YARN-3381-003.patch, YARN-3381.patch > > > Appears that "InvalidStateTransitonException" should be > "InvalidStateTransitionException". Transition was misspelled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1287) Consolidate MockClocks
[ https://issues.apache.org/jira/browse/YARN-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-1287: Attachment: YARN-1287.004.patch Added a patch that uses ControlledClock instead of MockClock and updated ControlledClock to add those convenience methods > Consolidate MockClocks > -- > > Key: YARN-1287 > URL: https://issues.apache.org/jira/browse/YARN-1287 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Sandy Ryza >Assignee: Sebastian Wong > Labels: newbie > Attachments: YARN-1287-3.patch, YARN-1287.004.patch > > > A bunch of different tests have near-identical implementations of MockClock. > TestFairScheduler, TestFSSchedulerApp, and TestCgroupsLCEResourcesHandler for > example. They should be consolidated into a single MockClock. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
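The consolidation above replaces per-test MockClock copies with a single controllable clock. A minimal sketch of what such a ControlledClock-style test clock looks like (method names assumed for illustration; the actual Hadoop class implements the Clock interface):

```java
// Illustrative sketch of a controllable test clock: tests set or advance
// the time deterministically instead of each suite keeping its own
// near-identical MockClock implementation.
class ControlledClockSketch {
  private long time;

  synchronized long getTime() {
    return time;
  }

  synchronized void setTime(long t) {
    time = t;
  }

  // Convenience method of the kind the updated patch adds: advance the
  // clock by a delta rather than recomputing an absolute timestamp.
  synchronized void tickMsec(long millis) {
    time += millis;
  }
}
```

A test can then simulate, say, a scheduler timeout by calling tickMsec(timeoutMs + 1) instead of sleeping.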
[jira] [Commented] (YARN-3381) A typographical error in "InvalidStateTransitonException"
[ https://issues.apache.org/jira/browse/YARN-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535809#comment-14535809 ] Ray Chiang commented on YARN-3381: -- Heh. I've avoided all the typos in class names. I didn't think of that approach to fixing a typo in a class name. Putting back TBR label until we get an updated patch. > A typographical error in "InvalidStateTransitonException" > - > > Key: YARN-3381 > URL: https://issues.apache.org/jira/browse/YARN-3381 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.6.0 >Reporter: Xiaoshuang LU >Assignee: Brahma Reddy Battula > Labels: BB2015-05-TBR > Attachments: YARN-3381-002.patch, YARN-3381-003.patch, YARN-3381.patch > > > Appears that "InvalidStateTransitonException" should be > "InvalidStateTransitionException". Transition was misspelled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3381) A typographical error in "InvalidStateTransitonException"
[ https://issues.apache.org/jira/browse/YARN-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-3381: - Labels: BB2015-05-TBR (was: ) > A typographical error in "InvalidStateTransitonException" > - > > Key: YARN-3381 > URL: https://issues.apache.org/jira/browse/YARN-3381 > Project: Hadoop YARN > Issue Type: Improvement > Components: api >Affects Versions: 2.6.0 >Reporter: Xiaoshuang LU >Assignee: Brahma Reddy Battula > Labels: BB2015-05-TBR > Attachments: YARN-3381-002.patch, YARN-3381-003.patch, YARN-3381.patch > > > Appears that "InvalidStateTransitonException" should be > "InvalidStateTransitionException". Transition was misspelled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2206) Update document for applications REST API response examples
[ https://issues.apache.org/jira/browse/YARN-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhijie Shen updated YARN-2206: -- Labels: newbie (was: ) > Update document for applications REST API response examples > --- > > Key: YARN-2206 > URL: https://issues.apache.org/jira/browse/YARN-2206 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 2.4.0 >Reporter: Kenji Kikushima >Assignee: Kenji Kikushima >Priority: Minor > Labels: newbie > Attachments: YARN-2206-002.patch, YARN-2206.patch > > > In ResourceManagerRest.apt.vm, Applications API responses are missing some > elements. > - JSON response should have "applicationType" and "applicationTags". > - XML response should have "applicationTags". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (YARN-3608) Apps submitted to MiniYarnCluster always stay in ACCEPTED state.
[ https://issues.apache.org/jira/browse/YARN-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Spandan Dutta reopened YARN-3608: - > Apps submitted to MiniYarnCluster always stay in ACCEPTED state. > > > Key: YARN-3608 > URL: https://issues.apache.org/jira/browse/YARN-3608 > Project: Hadoop YARN > Issue Type: Bug > Components: applications >Affects Versions: 2.6.0 >Reporter: Spandan Dutta > > So I adapted a test case to submit a yarn app to a MiniYarnCluster and wait > for it to reach running state. Turns out that the app gets stuck in > "ACCEPTED" state. > {noformat} > @Test > public void testGetAllQueues() throws IOException, YarnException, > InterruptedException { > MiniYARNCluster cluster = new MiniYARNCluster("testMRAMTokens", 1, 1, 1); > YarnClient rmClient = null; > try { > cluster.init(new YarnConfiguration()); > cluster.start(); > final Configuration yarnConf = cluster.getConfig(); > rmClient = YarnClient.createYarnClient(); > rmClient.init(yarnConf); > rmClient.start(); > YarnClientApplication newApp = rmClient.createApplication(); > ApplicationId appId = > newApp.getNewApplicationResponse().getApplicationId(); > // Create launch context for app master > ApplicationSubmissionContext appContext > = Records.newRecord(ApplicationSubmissionContext.class); > // set the application id > appContext.setApplicationId(appId); > // set the application name > appContext.setApplicationName("test"); > // Set up the container launch context for the application master > ContainerLaunchContext amContainer > = Records.newRecord(ContainerLaunchContext.class); > appContext.setAMContainerSpec(amContainer); > appContext.setResource(Resource.newInstance(1024, 1)); > // Submit the application to the applications manager > rmClient.submitApplication(appContext); > ApplicationReport applicationReport = > rmClient.getApplicationReport(appContext.getApplicationId()); > int timeout = 10; > while(timeout > 0 && applicationReport.getYarnApplicationState() != > 
YarnApplicationState.RUNNING) { > Thread.sleep(5 * 1000); > timeout--; > } > Assert.assertTrue(timeout != 0); > Assert.assertTrue(applicationReport.getYarnApplicationState() > == YarnApplicationState.RUNNING); > List queues = rmClient.getAllQueues(); > Assert.assertNotNull(queues); > Assert.assertTrue(!queues.isEmpty()); > QueueInfo queue = queues.get(0); > List queueApplications = queue.getApplications(); > Assert.assertFalse(queueApplications.isEmpty()); > } catch (YarnException e) { > Assert.assertTrue(e.getMessage().contains("Failed to submit")); > } finally { > if (rmClient != null) { > rmClient.stop(); > } > cluster.stop(); > } > } > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
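One detail worth noting in the quoted test: applicationReport is fetched once before the loop and never refreshed inside it, so the loop polls a stale snapshot and will time out even if the application does transition to RUNNING. A generic sketch of a poll loop that re-fetches the state on every iteration (the names and the simulated state source are illustrative, not the YarnClient API):

```java
import java.util.function.Supplier;

// Generic poll loop that re-fetches state each pass, unlike the quoted test,
// which checks a report captured once before the loop.
public class PollSketch {
    static boolean waitFor(Supplier<String> state, String target, int attempts)
            throws InterruptedException {
        for (int i = 0; i < attempts; i++) {
            if (target.equals(state.get())) return true;  // fresh fetch every iteration
            Thread.sleep(10);
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        final long start = System.nanoTime();
        // Simulated state source that flips to RUNNING after ~30ms; in the real
        // test this would be rmClient.getApplicationReport(...).getYarnApplicationState().
        Supplier<String> report = () ->
            (System.nanoTime() - start) > 30_000_000L ? "RUNNING" : "ACCEPTED";
        boolean reached = waitFor(report, "RUNNING", 100);
        if (!reached) throw new AssertionError("state never reached RUNNING");
        System.out.println("reached RUNNING: " + reached);
    }
}
```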
[jira] [Commented] (YARN-3608) Apps submitted to MiniYarnCluster always stay in ACCEPTED state.
[ https://issues.apache.org/jira/browse/YARN-3608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535802#comment-14535802 ] Spandan Dutta commented on YARN-3608: - I think this is a bug in the MiniYarnCluster implementation. The app gets stuck only when running against a MiniYarnCluster. > Apps submitted to MiniYarnCluster always stay in ACCEPTED state. > > > Key: YARN-3608 > URL: https://issues.apache.org/jira/browse/YARN-3608 > Project: Hadoop YARN > Issue Type: Bug > Components: applications >Affects Versions: 2.6.0 >Reporter: Spandan Dutta > > So I adapted a test case to submit a yarn app to a MiniYarnCluster and wait > for it to reach running state. Turns out that the app gets stuck in > "ACCEPTED" state. > {noformat} > @Test > public void testGetAllQueues() throws IOException, YarnException, > InterruptedException { > MiniYARNCluster cluster = new MiniYARNCluster("testMRAMTokens", 1, 1, 1); > YarnClient rmClient = null; > try { > cluster.init(new YarnConfiguration()); > cluster.start(); > final Configuration yarnConf = cluster.getConfig(); > rmClient = YarnClient.createYarnClient(); > rmClient.init(yarnConf); > rmClient.start(); > YarnClientApplication newApp = rmClient.createApplication(); > ApplicationId appId = > newApp.getNewApplicationResponse().getApplicationId(); > // Create launch context for app master > ApplicationSubmissionContext appContext > = Records.newRecord(ApplicationSubmissionContext.class); > // set the application id > appContext.setApplicationId(appId); > // set the application name > appContext.setApplicationName("test"); > // Set up the container launch context for the application master > ContainerLaunchContext amContainer > = Records.newRecord(ContainerLaunchContext.class); > appContext.setAMContainerSpec(amContainer); > appContext.setResource(Resource.newInstance(1024, 1)); > // Submit the application to the applications manager > rmClient.submitApplication(appContext); > ApplicationReport applicationReport = 
> rmClient.getApplicationReport(appContext.getApplicationId()); > int timeout = 10; > while(timeout > 0 && applicationReport.getYarnApplicationState() != > YarnApplicationState.RUNNING) { > Thread.sleep(5 * 1000); > timeout--; > } > Assert.assertTrue(timeout != 0); > Assert.assertTrue(applicationReport.getYarnApplicationState() > == YarnApplicationState.RUNNING); > List queues = rmClient.getAllQueues(); > Assert.assertNotNull(queues); > Assert.assertTrue(!queues.isEmpty()); > QueueInfo queue = queues.get(0); > List queueApplications = queue.getApplications(); > Assert.assertFalse(queueApplications.isEmpty()); > } catch (YarnException e) { > Assert.assertTrue(e.getMessage().contains("Failed to submit")); > } finally { > if (rmClient != null) { > rmClient.stop(); > } > cluster.stop(); > } > } > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2306) leak of reservation metrics (fair scheduler)
[ https://issues.apache.org/jira/browse/YARN-2306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535799#comment-14535799 ] Ray Chiang commented on YARN-2306: -- Running against trunk, I got failures 3 times in 3 runs. > leak of reservation metrics (fair scheduler) > > > Key: YARN-2306 > URL: https://issues.apache.org/jira/browse/YARN-2306 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Hong Zhiguo >Assignee: Hong Zhiguo >Priority: Minor > Attachments: YARN-2306-2.patch, YARN-2306.patch > > > This only applies to the fair scheduler; the capacity scheduler is OK. > When an appAttempt or node is removed, the metrics for > reservations (reservedContainers, reservedMB, reservedVCores) are not reduced > back. > These are important metrics for administrators, and the wrong values may > confuse them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3134) [Storage implementation] Exploiting the option of using Phoenix to access HBase backend
[ https://issues.apache.org/jira/browse/YARN-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535797#comment-14535797 ] Zhijie Shen commented on YARN-3134: --- bq. Looked into the concurrency bug. The problem is caused by concurrent operations on Connections and the Guava cache removalListener calls. So on cache evictions, active connections may be mistakenly closed. I believe a concurrent algorithm to resolve this is possible, but not quite trivial. For now, I'm removing the connection cache to make the first step right. I'll change the description of YARN-3595 for the connection cache. Yeah, let's optimize the connection separately. And currently, Phoenix dependency version is 4.3.0. I assume we're going to change it to 4.4+ in YARN-3529. The last patch looks good to me overall. I suggest deferring further code optimization as a follow up task if necessary. Let's use this version for POC. I'll commit the patch late today if no more comments come in. 
> [Storage implementation] Exploiting the option of using Phoenix to access > HBase backend > --- > > Key: YARN-3134 > URL: https://issues.apache.org/jira/browse/YARN-3134 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Zhijie Shen >Assignee: Li Lu > Attachments: SettingupPhoenixstorageforatimelinev2end-to-endtest.pdf, > YARN-3134-040915_poc.patch, YARN-3134-041015_poc.patch, > YARN-3134-041415_poc.patch, YARN-3134-042115.patch, YARN-3134-042715.patch, > YARN-3134-YARN-2928.001.patch, YARN-3134-YARN-2928.002.patch, > YARN-3134-YARN-2928.003.patch, YARN-3134-YARN-2928.004.patch, > YARN-3134-YARN-2928.005.patch, YARN-3134-YARN-2928.006.patch, > YARN-3134-YARN-2928.007.patch, YARN-3134DataSchema.pdf, > hadoop-zshen-nodemanager-d-128-95-184-84.dhcp4.washington.edu.out > > > Quote the introduction on Phoenix web page: > {code} > Apache Phoenix is a relational database layer over HBase delivered as a > client-embedded JDBC driver targeting low latency queries over HBase data. > Apache Phoenix takes your SQL query, compiles it into a series of HBase > scans, and orchestrates the running of those scans to produce regular JDBC > result sets. The table metadata is stored in an HBase table and versioned, > such that snapshot queries over prior versions will automatically use the > correct schema. Direct use of the HBase API, along with coprocessors and > custom filters, results in performance on the order of milliseconds for small > queries, or seconds for tens of millions of rows. > {code} > It may simplify our implementation of reading/writing data from/to HBase, and > make it easy to build indexes and compose complex queries. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
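The connection-cache hazard discussed in the comment above (evictions closing connections that are still in use) can be illustrated with a self-contained, single-threaded analogue. This sketch uses a plain LinkedHashMap LRU in place of Guava's cache and removalListener; all names are illustrative:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Single-threaded analogue of the hazard: an LRU cache that closes
// connections on eviction can close a handle a caller still holds.
public class ConnectionCacheSketch {
    static class Connection {
        volatile boolean closed = false;
        void close() { closed = true; }
    }

    public static void main(String[] args) {
        final int MAX = 2;
        // Access-ordered LinkedHashMap acting as a tiny LRU cache.
        Map<String, Connection> cache =
            new LinkedHashMap<String, Connection>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Connection> eldest) {
                    if (size() > MAX) {
                        eldest.getValue().close();  // plays the role of the removal listener
                        return true;
                    }
                    return false;
                }
            };

        Connection inUse = new Connection();
        cache.put("writer-1", inUse);
        // The caller still holds 'inUse' while other writers fill the cache...
        cache.put("writer-2", new Connection());
        cache.put("writer-3", new Connection());  // evicts writer-1

        // ...and the handle it holds has now been closed underneath it.
        if (!inUse.closed) throw new AssertionError("expected eviction to close the in-use connection");
        System.out.println("in-use connection closed by eviction: " + inUse.closed);
    }
}
```

Removing the cache, as the comment proposes, sidesteps this entirely at the cost of opening a connection per writer.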
[jira] [Updated] (YARN-1426) YARN Components need to unregister their beans upon shutdown
[ https://issues.apache.org/jira/browse/YARN-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated YARN-1426: -- Attachment: YARN-1426.2.patch Removed extra whitespace. Checkstyle is regarding method with too many lines. This patch doesn't add any lines to the method, only modifies one line. > YARN Components need to unregister their beans upon shutdown > > > Key: YARN-1426 > URL: https://issues.apache.org/jira/browse/YARN-1426 > Project: Hadoop YARN > Issue Type: Improvement >Affects Versions: 3.0.0, 2.3.0 >Reporter: Jonathan Eagles >Assignee: Jonathan Eagles > Labels: BB2015-05-TBR > Attachments: YARN-1426.2.patch, YARN-1426.patch, YARN-1426.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3587) Fix the javadoc of DelegationTokenSecretManager in yarn project
[ https://issues.apache.org/jira/browse/YARN-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535791#comment-14535791 ] Junping Du commented on YARN-3587: -- Thanks [~gliptak] for updating the patch. Latest patch LGTM. Also, I checked that the other reviewers' comments in YARN-3599 are addressed as well. +1 pending on Jenkins result. There could be one or two lines slightly longer than 80 characters, but if that is the only complaint, and given that no unit test is needed here, I am fine with it. > Fix the javadoc of DelegationTokenSecretManager in yarn project > --- > > Key: YARN-3587 > URL: https://issues.apache.org/jira/browse/YARN-3587 > Project: Hadoop YARN > Issue Type: Bug > Components: documentation >Affects Versions: 2.7.0 >Reporter: Akira AJISAKA >Assignee: Gabor Liptak >Priority: Minor > Labels: newbie > Attachments: YARN-3587.1.patch, YARN-3587.patch > > > In RMDelegationTokenSecretManager and TimelineDelegationTokenSecretManager, > the javadoc of the constructor is as follows: > {code} > /** >* Create a secret manager >* @param delegationKeyUpdateInterval the number of seconds for rolling new >*secret keys. >* @param delegationTokenMaxLifetime the maximum lifetime of the delegation >*tokens >* @param delegationTokenRenewInterval how often the tokens must be renewed >* @param delegationTokenRemoverScanInterval how often the tokens are > scanned >*for expired tokens >*/ > {code} > 1. "the number of seconds" should be "the number of milliseconds". > 2. It's better to add time unit to the description of other parameters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
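Following the two points in the description, the corrected javadoc could read roughly as below; the exact wording is an assumption, not the committed patch:

```java
/**
 * Create a secret manager.
 * @param delegationKeyUpdateInterval the number of milliseconds for rolling
 *        new secret keys
 * @param delegationTokenMaxLifetime the maximum lifetime of the delegation
 *        tokens, in milliseconds
 * @param delegationTokenRenewInterval how often the tokens must be renewed,
 *        in milliseconds
 * @param delegationTokenRemoverScanInterval how often the tokens are scanned
 *        for expired tokens, in milliseconds
 */
```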
[jira] [Updated] (YARN-3569) YarnClient.getAllQueues returns a list of queues that do not display running apps.
[ https://issues.apache.org/jira/browse/YARN-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Spandan Dutta updated YARN-3569: Attachment: YARN-3569.patch > YarnClient.getAllQueues returns a list of queues that do not display running > apps. > -- > > Key: YARN-3569 > URL: https://issues.apache.org/jira/browse/YARN-3569 > Project: Hadoop YARN > Issue Type: Bug > Components: api >Affects Versions: 2.8.0 >Reporter: Spandan Dutta >Assignee: Spandan Dutta > Attachments: YARN-3569.patch > > > YarnClient.getAllQueues() returns a list of queues. If we pick a queue from > this list and call getApplications on it, we always get an empty list > even-though applications are running on that queue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3476) Nodemanager can fail to delete local logs if log aggregation fails
[ https://issues.apache.org/jira/browse/YARN-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535784#comment-14535784 ] Jason Lowe commented on YARN-3476: -- +1 lgtm. Test failure is unrelated, and I'll fix whitespace nit on commit. > Nodemanager can fail to delete local logs if log aggregation fails > -- > > Key: YARN-3476 > URL: https://issues.apache.org/jira/browse/YARN-3476 > Project: Hadoop YARN > Issue Type: Bug > Components: log-aggregation, nodemanager >Affects Versions: 2.6.0 >Reporter: Jason Lowe >Assignee: Rohith > Labels: BB2015-05-TBR > Attachments: 0001-YARN-3476.patch, 0001-YARN-3476.patch, > 0002-YARN-3476.patch > > > If log aggregation encounters an error trying to upload the file then the > underlying TFile can throw an illegalstateexception which will bubble up > through the top of the thread and prevent the application logs from being > deleted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2306) leak of reservation metrics (fair scheduler)
[ https://issues.apache.org/jira/browse/YARN-2306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-2306: - Labels: (was: BB2015-05-TBR) > leak of reservation metrics (fair scheduler) > > > Key: YARN-2306 > URL: https://issues.apache.org/jira/browse/YARN-2306 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Hong Zhiguo >Assignee: Hong Zhiguo >Priority: Minor > Attachments: YARN-2306-2.patch, YARN-2306.patch > > > This only applies to the fair scheduler; the capacity scheduler is OK. > When an appAttempt or node is removed, the metrics for > reservations (reservedContainers, reservedMB, reservedVCores) are not reduced > back. > These are important metrics for administrators, and the wrong values may > confuse them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2206) Update document for applications REST API response examples
[ https://issues.apache.org/jira/browse/YARN-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535780#comment-14535780 ] Hadoop QA commented on YARN-2206: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 2m 52s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | release audit | 0m 20s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 2m 56s | Site still builds. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | | | 6m 11s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12731602/YARN-2206-002.patch | | Optional Tests | site | | git revision | trunk / effcc5c | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7827/console | This message was automatically generated. > Update document for applications REST API response examples > --- > > Key: YARN-2206 > URL: https://issues.apache.org/jira/browse/YARN-2206 > Project: Hadoop YARN > Issue Type: Improvement > Components: documentation >Affects Versions: 2.4.0 >Reporter: Kenji Kikushima >Assignee: Kenji Kikushima >Priority: Minor > Attachments: YARN-2206-002.patch, YARN-2206.patch > > > In ResourceManagerRest.apt.vm, Applications API responses are missing some > elements. > - JSON response should have "applicationType" and "applicationTags". > - XML response should have "applicationTags". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3473) Fix RM Web UI configuration for some properties
[ https://issues.apache.org/jira/browse/YARN-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ray Chiang updated YARN-3473: - Labels: BB2015-05-RFC supportability (was: BB2015-05-RFC) > Fix RM Web UI configuration for some properties > --- > > Key: YARN-3473 > URL: https://issues.apache.org/jira/browse/YARN-3473 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.7.0 >Reporter: Ray Chiang >Assignee: Ray Chiang >Priority: Minor > Labels: BB2015-05-RFC, supportability > Attachments: YARN-3473.001.patch > > > Using the RM Web UI, the Tools->Configuration page shows some properties as > something like "BufferedInputStream" instead of the appropriate .xml file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)