[jira] [Commented] (YARN-3975) WebAppProxyServlet should not redirect to RM page if AHS is enabled
[ https://issues.apache.org/jira/browse/YARN-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700271#comment-14700271 ] Hadoop QA commented on YARN-3975: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 4s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 0s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 10s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 50s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 25s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 42s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 58s | Tests passed in hadoop-yarn-client. | | {color:green}+1{color} | yarn tests | 0m 22s | Tests passed in hadoop-yarn-server-web-proxy. 
| | | | 47m 34s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750866/YARN-3975.6.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / c77bd6a | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/8867/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | hadoop-yarn-server-web-proxy test log | https://builds.apache.org/job/PreCommit-YARN-Build/8867/artifact/patchprocess/testrun_hadoop-yarn-server-web-proxy.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8867/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8867/console | This message was automatically generated. WebAppProxyServlet should not redirect to RM page if AHS is enabled --- Key: YARN-3975 URL: https://issues.apache.org/jira/browse/YARN-3975 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.1 Reporter: Mit Desai Assignee: Mit Desai Attachments: YARN-3975.2.b2.patch, YARN-3975.3.patch, YARN-3975.4.patch, YARN-3975.5.patch, YARN-3975.6.patch WebAppProxyServlet should be updated to handle the case where the app report doesn't have a tracking URL and the Application History Server is enabled. Since we would already have tried the RM and received an ApplicationNotFoundException, we should not direct the user to the RM app page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
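The redirect decision described in YARN-3975 can be sketched as follows. This is an illustrative outline only, not the actual WebAppProxyServlet code; the class, method, and parameter names here are hypothetical.

```java
// Illustrative sketch (hypothetical names, not the real WebAppProxyServlet):
// when the app report has no tracking URL, fall back to the AHS app page if
// the Application History Server is enabled, instead of the RM app page.
public class ProxyRedirectSketch {

    // Returns the page to redirect to, or null to report "application not found".
    public static String chooseRedirect(boolean rmHasApp, boolean ahsEnabled,
                                        String rmAppPage, String ahsAppPage) {
        if (rmHasApp) {
            // The RM still knows the application: its page is authoritative.
            return rmAppPage;
        }
        // The RM already threw ApplicationNotFoundException; don't bounce the
        // user back to the RM app page.
        return ahsEnabled ? ahsAppPage : null;
    }
}
```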
[jira] [Commented] (YARN-679) add an entry point that can start any Yarn service
[ https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700290#comment-14700290 ] Hadoop QA commented on YARN-679: \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 56s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 31 new or modified test files. | | {color:red}-1{color} | javac | 7m 51s | The applied patch generated 8 additional warning messages. | | {color:green}+1{color} | javadoc | 9m 41s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 6s | The applied patch generated 165 new checkstyle issues (total was 140, now 302). | | {color:red}-1{color} | whitespace | 0m 9s | The patch has 5 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 20s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 51s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 22m 29s | Tests failed in hadoop-common. 
| | | | 62m 25s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.net.TestNetUtils | | | hadoop.ha.TestZKFailoverController | | | hadoop.service.launcher.TestServiceLaunchedRunning | | | hadoop.service.launcher.TestServiceLaunchNoArgsAllowed | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750874/YARN-679-004.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / c77bd6a | | javac | https://builds.apache.org/job/PreCommit-YARN-Build/8866/artifact/patchprocess/diffJavacWarnings.txt | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8866/artifact/patchprocess/diffcheckstylehadoop-common.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/8866/artifact/patchprocess/whitespace.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8866/artifact/patchprocess/testrun_hadoop-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8866/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8866/console | This message was automatically generated. 
add an entry point that can start any Yarn service -- Key: YARN-679 URL: https://issues.apache.org/jira/browse/YARN-679 Project: Hadoop YARN Issue Type: New Feature Components: api Affects Versions: 2.4.0 Reporter: Steve Loughran Assignee: Steve Loughran Labels: BB2015-05-TBR Attachments: YARN-679-001.patch, YARN-679-002.patch, YARN-679-002.patch, YARN-679-003.patch, YARN-679-004.patch, org.apache.hadoop.servic...mon 3.0.0-SNAPSHOT API).pdf Time Spent: 72h Remaining Estimate: 0h There's no need to write separate .main classes for every Yarn service, given that the startup mechanism should be identical: create, init, start, wait for stopped, with an interrupt handler to trigger a clean shutdown on a control-C interrupt. Provide one entry point that takes any class name and a list of config files/options -- This message was sent by Atlassian JIRA (v6.3.4#6332)
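The startup mechanism described (create, init, start, wait for stopped, with a clean shutdown on interrupt) can be sketched as below. The Service interface and class names are simplified stand-ins, not the actual org.apache.hadoop.service or service-launcher API.

```java
// Simplified sketch of a generic service entry point (hypothetical names,
// not the actual org.apache.hadoop.service.launcher API).
public class ServiceLauncherSketch {

    public interface Service {
        void init(String[] configArgs);
        void start();
        void stop();
    }

    // Records lifecycle calls; used only to demonstrate the launcher below.
    public static final StringBuilder TRACE = new StringBuilder();

    public static class EchoService implements Service {
        public void init(String[] configArgs) { TRACE.append("init,"); }
        public void start()                   { TRACE.append("start,"); }
        public void stop()                    { TRACE.append("stop"); }
    }

    // Generic entry point: instantiate any Service by class name, drive the
    // create/init/start lifecycle, and register a shutdown hook so a
    // control-C interrupt triggers a clean stop.
    public static Service launch(String className, String[] configArgs)
            throws Exception {
        Service svc = (Service) Class.forName(className)
            .getDeclaredConstructor().newInstance();
        Runtime.getRuntime().addShutdownHook(new Thread(svc::stop));
        svc.init(configArgs);
        svc.start();
        return svc; // a real launcher would now block until the service stops
    }
}
```

A real launcher would also map lifecycle failures to process exit codes; that is omitted here for brevity.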
[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700755#comment-14700755 ] Rohith Sharma K S commented on YARN-4014: - Updated the patch, fixing a race condition between updating the priority and SchedulerApplicationAttempt creation, which would pick up the old priority rather than the updated priority. Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4014.patch, 0003-YARN-4014.patch, 0004-YARN-4014.patch Track the changes and discussions for the user-RM client protocol, i.e. ApplicationClientProtocol, in this JIRA. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
[ https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699424#comment-14699424 ] Hadoop QA commented on YARN-4024: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 27s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 44s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 44s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 21s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 29s | The applied patch generated 1 new checkstyle issues (total was 211, now 211). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 22s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 2m 59s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 0m 19s | Tests failed in hadoop-yarn-api. | | {color:red}-1{color} | yarn tests | 52m 58s | Tests failed in hadoop-yarn-server-resourcemanager. 
| | | | 95m 12s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-server-resourcemanager | | Failed unit tests | hadoop.yarn.conf.TestYarnConfigurationFields | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation | | | hadoop.yarn.server.resourcemanager.rmapp.TestNodesListManager | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750780/YARN-4024-draft.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 13604bd | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8861/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8861/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8861/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8861/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8861/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8861/console | This message was automatically generated. YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat -- Key: YARN-4024 URL: https://issues.apache.org/jira/browse/YARN-4024 Project: Hadoop YARN Issue Type: Improvement Reporter: Wangda Tan Assignee: Hong Zhiguo Attachments: YARN-4024-draft.patch Currently, the YARN RM NodesListManager resolves the IP address every time a node sends a heartbeat. When the DNS server becomes slow, NM heartbeats are blocked and cannot make progress.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3534) Collect memory/cpu usage on the node
[ https://issues.apache.org/jira/browse/YARN-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699363#comment-14699363 ] Hudson commented on YARN-3534: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #290 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/290/]) YARN-3534. Collect memory/cpu usage on the node. (Inigo Goiri via kasha) (kasha: rev def12933b38efd5e47c5144b729c1a1496f09229) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeResourceMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeResourceMonitorImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeResourceMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml Collect memory/cpu usage on the node Key: YARN-3534 URL: https://issues.apache.org/jira/browse/YARN-3534 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Affects Versions: 2.7.0 Reporter: Inigo Goiri 
Assignee: Inigo Goiri Fix For: 2.8.0 Attachments: YARN-3534-1.patch, YARN-3534-10.patch, YARN-3534-11.patch, YARN-3534-12.patch, YARN-3534-14.patch, YARN-3534-15.patch, YARN-3534-16.patch, YARN-3534-16.patch, YARN-3534-17.patch, YARN-3534-17.patch, YARN-3534-18.patch, YARN-3534-2.patch, YARN-3534-3.patch, YARN-3534-3.patch, YARN-3534-4.patch, YARN-3534-5.patch, YARN-3534-6.patch, YARN-3534-7.patch, YARN-3534-8.patch, YARN-3534-9.patch Original Estimate: 336h Remaining Estimate: 336h YARN should be aware of the resource utilization of the nodes when scheduling containers. To that end, this task implements the collection of memory/cpu usage on the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4055) Report node resource utilization in heartbeat
[ https://issues.apache.org/jira/browse/YARN-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699364#comment-14699364 ] Hudson commented on YARN-4055: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #290 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/290/]) YARN-4055. Report node resource utilization in heartbeat. (Inigo Goiri via kasha) (kasha: rev 13604bd5f119fc81b9942190dfa366afad61bc92) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/Context.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/impl/pb/NodeStatusPBImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_protos.proto * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/NodeStatus.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestResourceTrackerOnHA.java Report node resource utilization in heartbeat - Key: YARN-4055 URL: https://issues.apache.org/jira/browse/YARN-4055 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.1 Reporter: Inigo Goiri Assignee: Inigo Goiri Fix For: 2.8.0 Attachments: YARN-4055-v0.patch, YARN-4055-v1.patch Send the resource utilization from the node (obtained in the NodeResourceMonitor) to the RM in the heartbeat. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup
[ https://issues.apache.org/jira/browse/YARN-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699328#comment-14699328 ] Hadoop QA commented on YARN-2923: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 19m 11s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 7m 57s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 2s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 44s | The applied patch generated 1 new checkstyle issues (total was 211, now 211). | | {color:green}+1{color} | whitespace | 0m 7s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 22s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 44s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 22s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-api. | | {color:red}-1{color} | yarn tests | 1m 56s | Tests failed in hadoop-yarn-common. | | {color:red}-1{color} | yarn tests | 6m 16s | Tests failed in hadoop-yarn-server-nodemanager. 
| | | | 55m 9s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.util.TestRackResolver | | | hadoop.yarn.server.nodemanager.TestNodeStatusUpdaterForLabels | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750760/YARN-2923.20150817-1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 13604bd | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8859/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8859/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8859/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8859/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8859/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8859/console | This message was automatically generated. Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup - Key: YARN-2923 URL: https://issues.apache.org/jira/browse/YARN-2923 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Naganarasimha G R Assignee: Naganarasimha G R Fix For: 2.8.0 Attachments: YARN-2923.20141204-1.patch, YARN-2923.20141210-1.patch, YARN-2923.20150328-1.patch, YARN-2923.20150404-1.patch, YARN-2923.20150517-1.patch, YARN-2923.20150817-1.patch As part of Distributed Node Labels configuration we need to support Node labels to be configured in Yarn-site.xml. 
On modification of the node label configuration in yarn-site.xml, the NM should be able to pick up the modified node labels from this NodeLabelsProvider service without an NM restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3980) Plumb resource-utilization info in node heartbeat through to the scheduler
[ https://issues.apache.org/jira/browse/YARN-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699244#comment-14699244 ] Hadoop QA commented on YARN-3980: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 21m 31s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 4 new or modified test files. | | {color:green}+1{color} | javac | 10m 12s | There were no new javac warning messages. | | {color:red}-1{color} | javadoc | 12m 47s | The applied patch generated 1 additional warning messages. | | {color:green}+1{color} | release audit | 0m 31s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 40s | The applied patch generated 6 new checkstyle issues (total was 263, now 269). | | {color:green}+1{color} | whitespace | 0m 4s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 45s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 37s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 48s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | tools/hadoop tests | 0m 56s | Tests passed in hadoop-sls. | | {color:red}-1{color} | yarn tests | 47m 50s | Tests failed in hadoop-yarn-server-resourcemanager. 
| | | | 100m 47s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation | | | hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens | | Timed out tests | org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750755/YARN-3980-v2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 13604bd | | javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/8858/artifact/patchprocess/diffJavadocWarnings.txt | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8858/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt | | hadoop-sls test log | https://builds.apache.org/job/PreCommit-YARN-Build/8858/artifact/patchprocess/testrun_hadoop-sls.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8858/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8858/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8858/console | This message was automatically generated. Plumb resource-utilization info in node heartbeat through to the scheduler -- Key: YARN-3980 URL: https://issues.apache.org/jira/browse/YARN-3980 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.7.1 Reporter: Karthik Kambatla Assignee: Inigo Goiri Attachments: YARN-3980-v0.patch, YARN-3980-v1.patch, YARN-3980-v2.patch YARN-1012 and YARN-3534 collect resource utilization information for all containers and the node respectively and send it to the RM on node heartbeat. 
We should plumb it through to the scheduler so the scheduler can make use of it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4057) If ContainersMonitor is not enabled, only print related log info one time
[ https://issues.apache.org/jira/browse/YARN-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-4057: --- Attachment: YARN-4057.01.patch If ContainersMonitor is not enabled, only print related log info one time - Key: YARN-4057 URL: https://issues.apache.org/jira/browse/YARN-4057 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong Priority: Minor Attachments: YARN-4057.01.patch ContainersMonitorImpl will check whether it is enabled when handling every event, and it will print the following message again and again if it is not enabled: {quote} 2015-08-17 13:20:13,792 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Neither virutal-memory nor physical-memory is needed. Not running the monitor-thread {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
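The fix being proposed (log the message once rather than on every event) boils down to a guard flag. A minimal illustrative sketch of the pattern, not the actual YARN-4057 patch:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of the "log once" pattern (not the actual YARN-4057 patch).
public class LogOnceSketch {
    private final AtomicBoolean warned = new AtomicBoolean(false);

    // Counter standing in for a real logger, visible for demonstration only.
    public static int logCount = 0;

    // Called on every event; emits the "monitoring not needed" message only
    // the first time, instead of once per event.
    public void onEvent(boolean monitoringEnabled) {
        if (!monitoringEnabled && warned.compareAndSet(false, true)) {
            logCount++;
            System.out.println("Neither virtual-memory nor physical-memory "
                + "is needed. Not running the monitor-thread");
        }
    }
}
```

AtomicBoolean.compareAndSet makes the guard safe even if events are dispatched from multiple threads.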
[jira] [Commented] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
[ https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699285#comment-14699285 ] Hong Zhiguo commented on YARN-4024: --- In this patch, both positive and negative lookup results are cached, with the same expiry interval. YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat -- Key: YARN-4024 URL: https://issues.apache.org/jira/browse/YARN-4024 Project: Hadoop YARN Issue Type: Improvement Reporter: Wangda Tan Assignee: Hong Zhiguo Attachments: YARN-4024-draft.patch Currently, the YARN RM NodesListManager resolves the IP address every time a node sends a heartbeat. When the DNS server becomes slow, NM heartbeats are blocked and cannot make progress. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
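The caching scheme described, where both successful and failed lookups are cached under one expiry interval, can be sketched roughly as follows. Names and structure are illustrative assumptions, not the actual NodesListManager change.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Rough sketch of an IP-resolution cache with a shared expiry interval for
// positive and negative results (illustrative only, not the actual patch).
public class NodeIpCacheSketch {

    public interface Resolver {
        String resolve(String host); // returns null when resolution fails
    }

    private static final class Entry {
        final String ip;       // null records a negative (failed) lookup
        final long cachedAtMs;
        Entry(String ip, long cachedAtMs) { this.ip = ip; this.cachedAtMs = cachedAtMs; }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final long expiryMs; // a negative value disables caching

    public NodeIpCacheSketch(long expiryMs) { this.expiryMs = expiryMs; }

    // Returns the cached IP (or null for a cached failure) while the entry is
    // fresh; otherwise re-resolves and caches the new result.
    public String lookup(String host, Resolver resolver, long nowMs) {
        if (expiryMs < 0) {
            return resolver.resolve(host); // caching disabled
        }
        Entry e = cache.get(host);
        if (e == null || nowMs - e.cachedAtMs >= expiryMs) {
            e = new Entry(resolver.resolve(host), nowMs);
            cache.put(host, e);
        }
        return e.ip;
    }
}
```

Under this scheme a slow or failing DNS server is consulted at most once per host per expiry interval, so heartbeats are not blocked on every resolution.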
[jira] [Commented] (YARN-4055) Report node resource utilization in heartbeat
[ https://issues.apache.org/jira/browse/YARN-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699361#comment-14699361 ] Hudson commented on YARN-4055: -- FAILURE: Integrated in Hadoop-Yarn-trunk #1020 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1020/]) YARN-4055. Report node resource utilization in heartbeat. (Inigo Goiri via kasha) (kasha: rev 13604bd5f119fc81b9942190dfa366afad61bc92) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/Context.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_protos.proto * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/NodeStatus.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/impl/pb/NodeStatusPBImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestResourceTrackerOnHA.java Report node resource utilization in heartbeat - Key: YARN-4055 URL: https://issues.apache.org/jira/browse/YARN-4055 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.1 Reporter: Inigo Goiri Assignee: Inigo Goiri Fix For: 2.8.0 Attachments: YARN-4055-v0.patch, YARN-4055-v1.patch Send the resource utilization from the node (obtained in the NodeResourceMonitor) to the RM in the heartbeat. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3534) Collect memory/cpu usage on the node
[ https://issues.apache.org/jira/browse/YARN-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699360#comment-14699360 ] Hudson commented on YARN-3534: -- FAILURE: Integrated in Hadoop-Yarn-trunk #1020 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1020/]) YARN-3534. Collect memory/cpu usage on the node. (Inigo Goiri via kasha) (kasha: rev def12933b38efd5e47c5144b729c1a1496f09229) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeResourceMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeResourceMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeResourceMonitorImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml Collect memory/cpu usage on the node Key: YARN-3534 URL: https://issues.apache.org/jira/browse/YARN-3534 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Affects Versions: 2.7.0 Reporter: Inigo Goiri Assignee: Inigo 
Goiri Fix For: 2.8.0 Attachments: YARN-3534-1.patch, YARN-3534-10.patch, YARN-3534-11.patch, YARN-3534-12.patch, YARN-3534-14.patch, YARN-3534-15.patch, YARN-3534-16.patch, YARN-3534-16.patch, YARN-3534-17.patch, YARN-3534-17.patch, YARN-3534-18.patch, YARN-3534-2.patch, YARN-3534-3.patch, YARN-3534-3.patch, YARN-3534-4.patch, YARN-3534-5.patch, YARN-3534-6.patch, YARN-3534-7.patch, YARN-3534-8.patch, YARN-3534-9.patch Original Estimate: 336h Remaining Estimate: 336h YARN should be aware of the resource utilization of the nodes when scheduling containers. To that end, this task implements the collection of memory/cpu usage on the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
[ https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Zhiguo updated YARN-4024: -- Attachment: YARN-4024-draft.patch Add a configuration option, yarn.resourcemanager.node-ip-cache.expiry-interval-secs; setting it to -1 disables caching. YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat -- Key: YARN-4024 URL: https://issues.apache.org/jira/browse/YARN-4024 Project: Hadoop YARN Issue Type: Improvement Reporter: Wangda Tan Assignee: Hong Zhiguo Attachments: YARN-4024-draft.patch Currently, the YARN RM NodesListManager resolves the IP address every time a node sends a heartbeat. When the DNS server becomes slow, NM heartbeats are blocked and cannot make progress. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4057) If ContainersMonitor is not enabled, only print related log info one time
[ https://issues.apache.org/jira/browse/YARN-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699375#comment-14699375 ] Hadoop QA commented on YARN-4057: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 2s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 49s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 46s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 36s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 22s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 15s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 17s | Tests passed in hadoop-yarn-server-nodemanager. 
| | | | 44m 7s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750783/YARN-4057.01.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 13604bd | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8860/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8860/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8860/console | This message was automatically generated. If ContainersMonitor is not enabled, only print related log info one time - Key: YARN-4057 URL: https://issues.apache.org/jira/browse/YARN-4057 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong Priority: Minor Attachments: YARN-4057.01.patch ContainersMonitorImpl will check whether it is enabled when handling every event, and it will print following messages again and again if not enabled: {quote} 2015-08-17 13:20:13,792 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Neither virutal-memory nor physical-memory is needed. Not running the monitor-thread {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
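The "log only once" improvement tested above reduces to a tiny guard around the repeated log call. A minimal sketch of one common way to do it (the actual YARN-4057 patch may differ; the class name is illustrative):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard: the first caller wins the compare-and-set and logs;
// every later caller sees false and skips the log statement.
class LogOnce {
    private final AtomicBoolean logged = new AtomicBoolean(false);

    /** Returns true only on the first invocation. */
    boolean shouldLog() {
        return logged.compareAndSet(false, true);
    }
}
```

The event handler would then wrap the "monitoring is not enabled" INFO message in `if (guard.shouldLog()) { ... }` so it is printed once instead of per event.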
[jira] [Commented] (YARN-4054) Fix findbugs warnings in YARN-2928 branch
[ https://issues.apache.org/jira/browse/YARN-4054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699418#comment-14699418 ] Junping Du commented on YARN-4054: -- Hi [~varun_saxena], thanks for your comments! bq. We can keep on closing it and reopening it whenever new findbugs warnings come. In general, we don't recommend tracking multiple commits within the same JIRA. The reason is that, at release time, it brings extra difficulty in tracking down commits that share the same JIRA number. I saw you already resolved this. My additional comment on fixing minor things (findbugs, checkstyle, etc.) on a dev branch is: if it is related to trunk, leave it to a fix on trunk, or it will cause unnecessary merge conflicts in the future; if there are many things to fix, creating a separate JIRA, or merging a single-line fix into an existing patch, sounds like a better way. Fix findbugs warnings in YARN-2928 branch - Key: YARN-4054 URL: https://issues.apache.org/jira/browse/YARN-4054 Project: Hadoop YARN Issue Type: Bug Reporter: Varun Saxena Assignee: Varun Saxena -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3857) Memory leak in ResourceManager with SIMPLE mode
[ https://issues.apache.org/jira/browse/YARN-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700376#comment-14700376 ] zhihai xu commented on YARN-3857: - +1 for the latest patch. If there is no objection, I will commit it tomorrow. Memory leak in ResourceManager with SIMPLE mode --- Key: YARN-3857 URL: https://issues.apache.org/jira/browse/YARN-3857 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: mujunchao Assignee: mujunchao Priority: Critical Labels: patch Fix For: 2.7.2 Attachments: YARN-3857-1.patch, YARN-3857-2.patch, YARN-3857-3.patch, YARN-3857-4.patch, hadoop-yarn-server-resourcemanager.patch We register the ClientTokenMasterKey so that a client does not hold an invalid ClientToken after the RM restarts. In SIMPLE mode we register a Pair<ApplicationAttemptId, null>, but we never remove it from the HashMap, because unregister only runs in secure mode, so the memory leak occurs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
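The leak mechanism described above can be reduced to a small sketch: registration stores an entry in both modes, but removal only happened in secure mode, so SIMPLE-mode entries accumulate forever. This is an illustration under assumed names, not the actual RM secret-manager code:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the leak and its fix: the map entry must be
// removed on unregister regardless of security mode, even when the
// stored key is null (as it is in SIMPLE mode).
class AttemptKeyStore {
    private final Map<String, byte[]> masterKeys = new HashMap<>();

    void registerApplication(String attemptId, byte[] keyOrNull) {
        masterKeys.put(attemptId, keyOrNull); // SIMPLE mode stores a null key
    }

    void unregisterApplication(String attemptId) {
        masterKeys.remove(attemptId); // fix: run in SIMPLE and secure mode alike
    }

    int size() { return masterKeys.size(); }
}
```

Guarding the `remove` behind a "security enabled" check, as the pre-patch code effectively did, is what left the null-keyed entries behind.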
[jira] [Commented] (YARN-4025) Deal with byte representations of Longs in writer code
[ https://issues.apache.org/jira/browse/YARN-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700524#comment-14700524 ] Li Lu commented on YARN-4025: - Latest patch LGTM. Thanks [~sjlee0]! Deal with byte representations of Longs in writer code -- Key: YARN-4025 URL: https://issues.apache.org/jira/browse/YARN-4025 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Vrushali C Assignee: Sangjin Lee Attachments: YARN-4025-YARN-2928.001.patch, YARN-4025-YARN-2928.002.patch, YARN-4025-YARN-2928.003.patch Timestamps are being stored as Longs in hbase by the HBaseTimelineWriterImpl code. There seem to be some places in the code where there are conversions between Long to byte[] to String for easier argument passing between function calls. Then these values end up being converted back to byte[] while storing in hbase. It would be better to pass around byte[] or the Longs themselves as applicable. This may result in some api changes (store function) as well in adding a few more function calls like getColumnQualifier which accepts a pre-encoded byte array. It will be in addition to the existing api which accepts a String and the ColumnHelper to return a byte[] column name instead of a String one. Filing jira to track these changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4058) Miscellaneous issues in NodeManager project
Naganarasimha G R created YARN-4058: --- Summary: Miscellaneous issues in NodeManager project Key: YARN-4058 URL: https://issues.apache.org/jira/browse/YARN-4058 Project: Hadoop YARN Issue Type: Sub-task Reporter: Naganarasimha G R Assignee: Naganarasimha G R Priority: Minor # TestSystemMetricsPublisherForV2.testPublishApplicationMetrics is failing # Unused ApplicationACLsManager in ContainerManagerImpl # In ContainerManagerImpl.startContainerInternal an ApplicationImpl instance is created and then checked for existence in context.getApplications(). Every time an ApplicationImpl is created, its state machine is initialized and a TimelineClient is created, which is required only if it is added to the context. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4025) Deal with byte representations of Longs in writer code
[ https://issues.apache.org/jira/browse/YARN-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700628#comment-14700628 ] Hadoop QA commented on YARN-4025: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 17m 25s | Findbugs (version ) appears to be broken on YARN-2928. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 29s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 17s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 6s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 30s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 43s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 55s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 1m 27s | Tests passed in hadoop-yarn-server-timelineservice. 
| | | | 41m 57s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750907/YARN-4025-YARN-2928.003.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | YARN-2928 / a029ce1 | | hadoop-yarn-server-timelineservice test log | https://builds.apache.org/job/PreCommit-YARN-Build/8870/artifact/patchprocess/testrun_hadoop-yarn-server-timelineservice.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8870/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8870/console | This message was automatically generated. Deal with byte representations of Longs in writer code -- Key: YARN-4025 URL: https://issues.apache.org/jira/browse/YARN-4025 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Vrushali C Assignee: Sangjin Lee Attachments: YARN-4025-YARN-2928.001.patch, YARN-4025-YARN-2928.002.patch, YARN-4025-YARN-2928.003.patch Timestamps are being stored as Longs in hbase by the HBaseTimelineWriterImpl code. There seem to be some places in the code where there are conversions between Long to byte[] to String for easier argument passing between function calls. Then these values end up being converted back to byte[] while storing in hbase. It would be better to pass around byte[] or the Longs themselves as applicable. This may result in some api changes (store function) as well in adding a few more function calls like getColumnQualifier which accepts a pre-encoded byte array. It will be in addition to the existing api which accepts a String and the ColumnHelper to return a byte[] column name instead of a String one. Filing jira to track these changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1644) RM-NM protocol changes and NodeStatusUpdater implementation to support container resizing
[ https://issues.apache.org/jira/browse/YARN-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700676#comment-14700676 ] Jian He commented on YARN-1644: --- After thinking more, we may not need a version Id. Below is my suggestion: 1) still keep the separate increasedContainers in the protocol 2) If increased-container RMContainer, RM ignores this. 3) For the race condition on RM recovery, we may fix this by synchronizing ContainerManagerImpl#increaseContainer and the NM register call. RM-NM protocol changes and NodeStatusUpdater implementation to support container resizing - Key: YARN-1644 URL: https://issues.apache.org/jira/browse/YARN-1644 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Wangda Tan Assignee: MENG DING Attachments: YARN-1644-YARN-1197.4.patch, YARN-1644-YARN-1197.5.patch, YARN-1644.1.patch, YARN-1644.2.patch, YARN-1644.3.patch, yarn-1644.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700381#comment-14700381 ] Sangjin Lee commented on YARN-4053: --- And I do think that we need to support floating-point values. Another scenario to think about is what happens if users write metric values in an inconsistent manner. Suppose the user stored an integral value for a metric initially, but later attempted to store a floating-point value for the same metric. It sounds like it could be a silent failure? This should be a rare occurrence, but I think we need to give it some thought... Change the way metric values are stored in HBase Storage Key: YARN-4053 URL: https://issues.apache.org/jira/browse/YARN-4053 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena Attachments: YARN-4053-YARN-2928.01.patch Currently the HBase implementation uses GenericObjectMapper to convert and store values in the backend HBase storage. This converts everything into a string representation (ASCII/UTF-8 encoded byte array). While this is fine in most cases, it does not quite serve our use case for metrics. So we need to decide how we are going to encode and decode metric values and store them in HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
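One alternative to the string encoding discussed above is a fixed-width numeric encoding, where both longs and doubles occupy eight big-endian bytes. The sketch below is illustrative only (it is not the patch under review, and the class name is hypothetical); it shows why such an encoding round-trips both numeric types exactly:

```java
import java.nio.ByteBuffer;

// Hypothetical fixed-width codec for metric values: 8 big-endian bytes
// per value, instead of the ASCII/UTF-8 string form produced by
// GenericObjectMapper. Doubles are stored via their IEEE-754 bit pattern.
final class MetricValueCodec {
    static byte[] encodeLong(long v) {
        return ByteBuffer.allocate(8).putLong(v).array();
    }
    static long decodeLong(byte[] b) {
        return ByteBuffer.wrap(b).getLong();
    }
    static byte[] encodeDouble(double v) {
        return ByteBuffer.allocate(8).putLong(Double.doubleToLongBits(v)).array();
    }
    static double decodeDouble(byte[] b) {
        return Double.longBitsToDouble(ByteBuffer.wrap(b).getLong());
    }
}
```

Note this sketch does not by itself solve the mixed-type scenario Sangjin raises: a cell written as a long and later read as a double would silently decode to garbage unless the type is recorded somewhere (schema, column qualifier, or a tag).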
[jira] [Updated] (YARN-4057) If ContainersMonitor is not enabled, only print related log info one time
[ https://issues.apache.org/jira/browse/YARN-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-4057: Hadoop Flags: Reviewed If ContainersMonitor is not enabled, only print related log info one time - Key: YARN-4057 URL: https://issues.apache.org/jira/browse/YARN-4057 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong Priority: Minor Attachments: YARN-4057.01.patch ContainersMonitorImpl will check whether it is enabled when handling every event, and it will print following messages again and again if not enabled: {quote} 2015-08-17 13:20:13,792 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Neither virutal-memory nor physical-memory is needed. Not running the monitor-thread {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3545) Investigate the concurrency issue with the map of timeline collector
[ https://issues.apache.org/jira/browse/YARN-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700542#comment-14700542 ] Li Lu commented on YARN-3545: - Hi [~xgong], you're right that right now this critical section can only be accessed by the AM and RM, and there's no concurrency for the same application id (unless I'm missing something, which is possible :) ). The fix here is to be consistent with the original TimelineCollectorManager, which has a synchronized map to store appId-collector mappings. We may want to preserve this thread-safe semantic for collector managers for future usages. Right now this patch is stale and I would personally assign it with a low priority. Investigate the concurrency issue with the map of timeline collector Key: YARN-3545 URL: https://issues.apache.org/jira/browse/YARN-3545 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Li Lu Attachments: YARN-3545-YARN-2928.000.patch See the discussion in YARN-3390 for details. Let's continue the discussion here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
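The thread-safe appId-to-collector mapping the comment above describes can be sketched with a `ConcurrentHashMap`, where `putIfAbsent` guarantees exactly one collector wins per application id even if two callers race. A minimal illustration with hypothetical names (the real TimelineCollectorManager differs):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: only one collector is ever registered per app id,
// regardless of how many threads attempt registration concurrently.
class CollectorMap<C> {
    private final Map<String, C> collectors = new ConcurrentHashMap<>();

    /** Returns whichever collector ends up registered for this app id. */
    C register(String appId, C candidate) {
        C existing = collectors.putIfAbsent(appId, candidate);
        return existing != null ? existing : candidate;
    }

    C get(String appId) { return collectors.get(appId); }
}
```

The loser of the race would then discard its candidate collector (and skip starting it), which is the thread-safe semantic worth preserving for future usages.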
[jira] [Updated] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup
[ https://issues.apache.org/jira/browse/YARN-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-2923: Attachment: YARN-2923.20150818-1.patch Test cases pass locally multiple times. Adding a patch to get more logs for TestNodeStatusUpdaterForLabels Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup - Key: YARN-2923 URL: https://issues.apache.org/jira/browse/YARN-2923 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Naganarasimha G R Assignee: Naganarasimha G R Fix For: 2.8.0 Attachments: YARN-2923.20141204-1.patch, YARN-2923.20141210-1.patch, YARN-2923.20150328-1.patch, YARN-2923.20150404-1.patch, YARN-2923.20150517-1.patch, YARN-2923.20150817-1.patch, YARN-2923.20150818-1.patch As part of distributed node label configuration we need to support node labels configured in yarn-site.xml. And on modification of the node labels configuration in yarn-site.xml, the NM should be able to get the modified node labels from this NodeLabelsProvider service without an NM restart -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3942) Timeline store to read events from HDFS
[ https://issues.apache.org/jira/browse/YARN-3942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700657#comment-14700657 ] Rajesh Balamohan commented on YARN-3942: Should this be resilient to cluster restarts? For e.g, when cluster restart happens, timeline server automatically gets killed with the following exception. {noformat} 2015-08-18 01:03:31,523 [EntityLogPluginWorker #6] ERROR org.apache.hadoop.yarn.server.timeline.EntityFileTimelineStore: Error scanning active files ... ... [EntityLogPluginWorker #0] ERROR org.apache.hadoop.yarn.server.timeline.EntityFileTimelineStore: Error scanning active files java.io.EOFException: End of File Exception between local host is: atsmachine; destination host is: m1:8020; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:422) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:765) at org.apache.hadoop.ipc.Client.call(Client.java:1444) at org.apache.hadoop.ipc.Client.call(Client.java:1371) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) at com.sun.proxy.$Proxy26.getListing(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:574) at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) at com.sun.proxy.$Proxy27.getListing(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1748) at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.init(DistributedFileSystem.java:973) at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.init(DistributedFileSystem.java:984) at org.apache.hadoop.hdfs.DistributedFileSystem$DirListingIterator.init(DistributedFileSystem.java:956) at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:935) at org.apache.hadoop.hdfs.DistributedFileSystem$21.doCall(DistributedFileSystem.java:931) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusIterator(DistributedFileSystem.java:943) at org.apache.hadoop.yarn.server.timeline.EntityFileTimelineStore.scanActiveLogs(EntityFileTimelineStore.java:314) at org.apache.hadoop.yarn.server.timeline.EntityFileTimelineStore.access$1300(EntityFileTimelineStore.java:79) at org.apache.hadoop.yarn.server.timeline.EntityFileTimelineStore$EntityLogScanner.run(EntityFileTimelineStore.java:771) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: java.io.EOFException at 
java.io.DataInputStream.readInt(DataInputStream.java:392) at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1098) at org.apache.hadoop.ipc.Client$Connection.run(Client.java:993) 2015-08-18 01:03:35,600 [SIGTERM handler] ERROR org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer: RECEIVED SIGNAL 15: SIGTERM 2015-08-18 01:03:35,608 [Thread-1] INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@atsmachine:8188 2015-08-18 01:03:35,710 [Thread-1] INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping ApplicationHistoryServer metrics system... 2015-08-18 01:03:35,712 [Thread-1] INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl:
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700378#comment-14700378 ] Sangjin Lee commented on YARN-4053: --- Thanks [~varun_saxena] for pointing out an important issue. I would agree with [~gtCarrera9] that this is a bit lower in priority compared to YARN-3814, but it's an important issue nonetheless. I'm just curious (and perhaps this is a totally dumb question from an HBase newbie), is there a way to specify that the value type is a numeric type when we create the table or the column family? Does HBase itself support something like that? Change the way metric values are stored in HBase Storage Key: YARN-4053 URL: https://issues.apache.org/jira/browse/YARN-4053 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena Attachments: YARN-4053-YARN-2928.01.patch Currently the HBase implementation uses GenericObjectMapper to convert and store values in the backend HBase storage. This converts everything into a string representation (ASCII/UTF-8 encoded byte array). While this is fine in most cases, it does not quite serve our use case for metrics. So we need to decide how we are going to encode and decode metric values and store them in HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1644) RM-NM protocol changes and NodeStatusUpdater implementation to support container resizing
[ https://issues.apache.org/jira/browse/YARN-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700423#comment-14700423 ] Wangda Tan commented on YARN-1644: -- Discussed with [~jianhe], some thoughts: There are 3 corner cases we need to handle: 1. AM sends a decrease-container request to RM before sending the increase-container request to NM 2. RM crashes after it issued an increase, and AM increases the container on NM while NM is registering 3. Same as 2, but AM sends a decrease-container request to RM before RM receives the NM-reported increased container. What we may need to consider is a version for each container: RM adds 1 to the container version whenever it increases/decreases a container. The container-version will be added to ContainerTokenIdentifier, to the NM-reported increased containers, and to NMContainerStatus while registering. From RM's view, it should keep the latest updated container resource. So for the above corner cases: 1. Result: container decreased 2. Result: container increased 3. Result: container decreased (because the latest resource AM sent to RM is a decrease). So on the RM side, it will check: {code} if (rm.version >= nm.version) { // keep existing container in RM unchanged, and tell NM about this // == is included because if rm.version == nm.version, corner case #3 happened } else { // change container in RM } {code} So in summary what we need in the protocol is: - Container-version in ContainerTokenIdentifier - Container-version in NMContainerStatus - an IncreasedContainer in the NM-RM heartbeat, including the container-version in IncreasedContainer. Thoughts? 
[~mding] RM-NM protocol changes and NodeStatusUpdater implementation to support container resizing - Key: YARN-1644 URL: https://issues.apache.org/jira/browse/YARN-1644 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Wangda Tan Assignee: MENG DING Attachments: YARN-1644-YARN-1197.4.patch, YARN-1644-YARN-1197.5.patch, YARN-1644.1.patch, YARN-1644.2.patch, YARN-1644.3.patch, yarn-1644.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
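The reconciliation rule in the comment above boils down to a single comparison, sketched here as runnable code (illustrative only; the real RM logic involves RMContainer and NMContainerStatus objects, not bare ints):

```java
// Hypothetical sketch of the version-based reconciliation: RM keeps its own
// container state when its version is >= the NM-reported one. The == case
// covers corner case 3, where a later decrease at the RM must win over the
// NM-reported increase.
class ContainerVersionCheck {
    static boolean keepRmContainer(int rmVersion, int nmVersion) {
        return rmVersion >= nmVersion;
    }
}
```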
[jira] [Commented] (YARN-3901) Populate flow run data in the flow_run table
[ https://issues.apache.org/jira/browse/YARN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700546#comment-14700546 ] Sangjin Lee commented on YARN-3901: --- This may be a good idea. The only thing I'd add is that {{equals()}} may not be the best name. The {{equals()}} method has a specific contract, and we shouldn't implement it to support equality between a string and the timeline entity type. And since {{TimelineEntityType}} is an enum, you can't override it anyway. Any other method name would be just fine. For example, {code} public enum TimelineEntityType { ... public boolean typeMatches(String type) { return toString().equals(type); } } {code} Populate flow run data in the flow_run table Key: YARN-3901 URL: https://issues.apache.org/jira/browse/YARN-3901 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Vrushali C Assignee: Vrushali C Attachments: YARN-3901-YARN-2928.WIP.patch As per the schema proposed in YARN-3815 in https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf filing jira to track creation and population of data in the flow run table. Some points that are being considered: - Stores per flow run information aggregated across applications and flow version. RM's collector writes to it on app creation and app completion - Per-app collector writes to it for metric updates at a slower frequency than the metric updates to the application table. primary key: cluster ! user ! flow ! flow run id - Only the latest version of flow-level aggregated metrics will be kept, even if the entity and application level keep a timeseries. - The running_apps column will be incremented on app creation, and decremented on app completion. - For min_start_time the RM writer will simply write a value with the tag for the applicationId. A coprocessor will return the min value of all written values. 
- Upon flush and compactions, the min value among all the cells of this column will be written to the cell without any tag (empty tag) and all the other cells will be discarded. - Ditto for the max_end_time, but then the max will be kept. - Tags are represented as #type:value. The type can be not set (0), or can indicate running (1) or complete (2). In those cases (for metrics) only complete app metrics are collapsed on compaction. - The m! values are aggregated (summed) upon read. Only when applications are completed (indicated by tag type 2) can the values be collapsed. - The application ids that have completed and been aggregated into the flow numbers are retained in a separate column for historical tracking: we don't want to re-aggregate for those upon replay -- This message was sent by Atlassian JIRA (v6.3.4#6332)
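The `typeMatches` suggestion in the comment above can be exercised as a tiny runnable example. The enum constants here are placeholders, not the real TimelineEntityType constants; the point is only that an enum (whose `equals()` is final and cannot be overridden) can still expose string matching via a separate method:

```java
// Placeholder enum demonstrating the typeMatches idea: toString() of an
// enum constant is its name, so string comparison works without touching
// the equals() contract.
enum EntityType {
    YARN_APPLICATION, YARN_FLOW_RUN;

    boolean typeMatches(String type) {
        return toString().equals(type);
    }
}
```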
[jira] [Updated] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
[ https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Zhiguo updated YARN-4024: -- Attachment: YARN-4024-draft-v2.patch Updated the patch to flush the cache when a node's state transitions between USABLE and UNUSABLE. YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat -- Key: YARN-4024 URL: https://issues.apache.org/jira/browse/YARN-4024 Project: Hadoop YARN Issue Type: Improvement Reporter: Wangda Tan Assignee: Hong Zhiguo Attachments: YARN-4024-draft-v2.patch, YARN-4024-draft.patch Currently, YARN RM NodesListManager resolves the IP address every time a node sends a heartbeat. When the DNS server becomes slow, NM heartbeats are blocked and cannot make progress. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4058) Miscellaneous issues in NodeManager project
[ https://issues.apache.org/jira/browse/YARN-4058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700679#comment-14700679 ] Naganarasimha G R commented on YARN-4058: - Hi [~djp] [~vinodkv], I have created this jira for the issues identified during the YARN-2928 reset and other minor issues. Please add other issues (if any) found during the reset which need to be taken care of. Miscellaneous issues in NodeManager project --- Key: YARN-4058 URL: https://issues.apache.org/jira/browse/YARN-4058 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Naganarasimha G R Assignee: Naganarasimha G R Priority: Minor # TestSystemMetricsPublisherForV2.testPublishApplicationMetrics is failing # Unused ApplicationACLsManager in ContainerManagerImpl # In ContainerManagerImpl.startContainerInternal an ApplicationImpl instance is created and then checked for existence in context.getApplications(). Every time an ApplicationImpl is created, its state machine is initialized and a TimelineClient is created, which is required only if it is added to the context. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (YARN-3857) Memory leak in ResourceManager with SIMPLE mode
[ https://issues.apache.org/jira/browse/YARN-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu reopened YARN-3857: - Memory leak in ResourceManager with SIMPLE mode --- Key: YARN-3857 URL: https://issues.apache.org/jira/browse/YARN-3857 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: mujunchao Assignee: mujunchao Priority: Critical Labels: patch Fix For: 2.7.2 Attachments: YARN-3857-1.patch, YARN-3857-2.patch, YARN-3857-3.patch, YARN-3857-4.patch, hadoop-yarn-server-resourcemanager.patch We register the ClientTokenMasterKey so that a client does not hold an invalid ClientToken after the RM restarts. In SIMPLE mode we register a Pair<ApplicationAttemptId, null>, but we never remove it from the HashMap, because unregister only runs in secure mode, so the memory leak occurs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3857) Memory leak in ResourceManager with SIMPLE mode
[ https://issues.apache.org/jira/browse/YARN-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-3857: Fix Version/s: (was: 2.7.2) Memory leak in ResourceManager with SIMPLE mode --- Key: YARN-3857 URL: https://issues.apache.org/jira/browse/YARN-3857 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: mujunchao Assignee: mujunchao Priority: Critical Labels: patch Attachments: YARN-3857-1.patch, YARN-3857-2.patch, YARN-3857-3.patch, YARN-3857-4.patch, hadoop-yarn-server-resourcemanager.patch We register the ClientTokenMasterKey so that a client does not hold an invalid ClientToken after the RM restarts. In SIMPLE mode we register a Pair<ApplicationAttemptId, null>, but we never remove it from the HashMap, because unregister only runs in secure mode, so the memory leak occurs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3857) Memory leak in ResourceManager with SIMPLE mode
[ https://issues.apache.org/jira/browse/YARN-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhihai xu updated YARN-3857: Hadoop Flags: Reviewed Memory leak in ResourceManager with SIMPLE mode --- Key: YARN-3857 URL: https://issues.apache.org/jira/browse/YARN-3857 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: mujunchao Assignee: mujunchao Priority: Critical Labels: patch Attachments: YARN-3857-1.patch, YARN-3857-2.patch, YARN-3857-3.patch, YARN-3857-4.patch, hadoop-yarn-server-resourcemanager.patch We register the ClientTokenMasterKey so that a client does not hold an invalid ClientToken after the RM restarts. In SIMPLE mode we register a Pair of (ApplicationAttemptId, null), but we never remove it from the HashMap, since unregister only runs in secure mode, so the map leaks memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
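The leak pattern described in YARN-3857 can be sketched as follows. This is an illustrative reduction, not the actual ResourceManager code; the class and method names are hypothetical stand-ins for the ClientToAMTokenSecretManager registration path.

```java
import java.util.HashMap;
import java.util.Map;

public class ClientTokenRegistry {
    private final Map<String, byte[]> masterKeys = new HashMap<>();
    private final boolean securityEnabled;

    public ClientTokenRegistry(boolean securityEnabled) {
        this.securityEnabled = securityEnabled;
    }

    public void register(String attemptId, byte[] key) {
        // In SIMPLE mode the key is null, but an entry is still added.
        masterKeys.put(attemptId, key);
    }

    public void unregister(String attemptId) {
        // The bug pattern: removal only happens in secure mode, so in
        // SIMPLE mode the map grows by one entry per finished attempt.
        if (securityEnabled) {
            masterKeys.remove(attemptId);
        }
    }

    public int size() {
        return masterKeys.size();
    }
}
```

In SIMPLE mode every register/unregister cycle leaves one entry behind; the fix is to make removal unconditional.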
[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700532#comment-14700532 ] Hadoop QA commented on YARN-4014: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 24m 36s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 8m 0s | There were no new javac warning messages. | | {color:red}-1{color} | javadoc | 10m 3s | The applied patch generated 2 additional warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 40s | The applied patch generated 6 new checkstyle issues (total was 31, now 37). | | {color:green}+1{color} | whitespace | 0m 11s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 28s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 6m 19s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | mapreduce tests | 103m 8s | Tests passed in hadoop-mapreduce-client-jobclient. | | {color:green}+1{color} | yarn tests | 0m 31s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 7m 8s | Tests passed in hadoop-yarn-client. | | {color:red}-1{color} | yarn tests | 2m 8s | Tests failed in hadoop-yarn-common. | | {color:red}-1{color} | yarn tests | 56m 16s | Tests failed in hadoop-yarn-server-resourcemanager. 
| | | | 224m 3s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.util.TestRackResolver | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750880/0003-YARN-4014.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / c77bd6a | | javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/8869/artifact/patchprocess/diffJavadocWarnings.txt | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8869/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-mapreduce-client-jobclient test log | https://builds.apache.org/job/PreCommit-YARN-Build/8869/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8869/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/8869/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8869/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8869/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8869/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8869/console | This message was automatically generated. 
Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4014.patch, 0003-YARN-4014.patch Track the changes for user-RM client protocol i.e ApplicationClientProtocol changes and discussions in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-4014: Attachment: 0002-YARN-4017.patch Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4017.patch Track the changes for user-RM client protocol i.e ApplicationClientProtocol changes and discussions in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3045) [Event producers] Implement NM writing container lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699557#comment-14699557 ] Junping Du commented on YARN-3045: -- Latest patch LGTM. But will confirm YARN-2928 branch status before committing this. [Event producers] Implement NM writing container lifecycle events to ATS Key: YARN-3045 URL: https://issues.apache.org/jira/browse/YARN-3045 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Naganarasimha G R Attachments: YARN-3045-YARN-2928.002.patch, YARN-3045-YARN-2928.003.patch, YARN-3045-YARN-2928.004.patch, YARN-3045-YARN-2928.005.patch, YARN-3045-YARN-2928.006.patch, YARN-3045-YARN-2928.007.patch, YARN-3045-YARN-2928.008.patch, YARN-3045-YARN-2928.009.patch, YARN-3045-YARN-2928.010.patch, YARN-3045-YARN-2928.011.patch, YARN-3045.20150420-1.patch Per design in YARN-2928, implement NM writing container lifecycle events and container system metrics to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699591#comment-14699591 ] Rohith Sharma K S commented on YARN-4014: - bq. we can make updateApplicationPriority throw an ApplicationNotRunningException and let client catch the exception and prints “Application not running “ msg In {{ClientRMService#updateApplicationPriority}}, the priority update is not passed to the scheduler if the application is in NEW or NEW_SAVING either. So I feel a new ApplicationNotRunningException would lead to confusion. I think we can throw a YarnException with the message "Application in <app-state> state cannot have its priority updated". Any thoughts? Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch Track the changes for user-RM client protocol i.e ApplicationClientProtocol changes and discussions in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700759#comment-14700759 ] Naganarasimha G R commented on YARN-4053: - bq. place a restriction on client that it should send values in floating point format at all times if it wants to store some metric value as floating point. We can mention this in our documentation. I think this approach is better, as we will be able to have filters based on values too, with lower processing costs. Change the way metric values are stored in HBase Storage Key: YARN-4053 URL: https://issues.apache.org/jira/browse/YARN-4053 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena Attachments: YARN-4053-YARN-2928.01.patch Currently HBase implementation uses GenericObjectMapper to convert and store values in backend HBase storage. This converts everything into a string representation(ASCII/UTF-8 encoded byte array). While this is fine in most cases, it does not quite serve our use case for metrics. So we need to decide how are we going to encode and decode metric values and store them in HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3880) Writing more RM side app-level metrics
[ https://issues.apache.org/jira/browse/YARN-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700713#comment-14700713 ] Naganarasimha G R commented on YARN-3880: - Hi [~zjshen], Shall I handle this issue if you have not started on it? Writing more RM side app-level metrics -- Key: YARN-3880 URL: https://issues.apache.org/jira/browse/YARN-3880 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Zhijie Shen Assignee: Zhijie Shen In YARN-3044, we implemented an analog of the metrics publisher for ATS v1. While it helps to write app/attempt/container life cycle events, it really doesn't write many of the app-level system metrics that the RM now has. Just to list the metrics that I found missing: * runningContainers * memorySeconds * vcoreSeconds * preemptedResourceMB * preemptedResourceVCores * numNonAMContainerPreempted * numAMContainerPreempted Please feel free to add more to the list if you find something is not covered. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700757#comment-14700757 ] Varun Saxena commented on YARN-4053: [~sjlee0], bq. I'm just curious (and perhaps this is a totally dumb question for a HBase newbie), is there a way to specify that the value type is a numeric type when we create the table or the column family? Does HBase itself support something like that? AFAIK, no, there is no way to attach a type to a column qualifier or column family. HBase treats everything as just a sequence of bytes. It is up to the user how to encode and decode it. bq. Another scenario to think about is what if users write metric values in an inconsistent manner. Suppose the user stored an integral value for a metric initially, but later attempted to store a floating value for the same metric. It sounds like it could be a silent failure? This should be a rare occurrence, but I think we need to give it some thought... Yes, I did consider this scenario. That is why I said we can place a restriction on the client that it should send values in floating point format at all times if it wants to store some metric value as floating point. We can mention this in our documentation. Change the way metric values are stored in HBase Storage Key: YARN-4053 URL: https://issues.apache.org/jira/browse/YARN-4053 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena Attachments: YARN-4053-YARN-2928.01.patch Currently HBase implementation uses GenericObjectMapper to convert and store values in backend HBase storage. This converts everything into a string representation(ASCII/UTF-8 encoded byte array). While this is fine in most cases, it does not quite serve our use case for metrics. So we need to decide how are we going to encode and decode metric values and store them in HBase. 
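The "silent failure" scenario discussed above can be made concrete with plain fixed-width byte encoding (java.nio.ByteBuffer here as a stand-in for HBase's Bytes utility, since the actual encoding in the patch may differ): once a value is stored as raw bytes, nothing in the bytes says whether they hold a long or a double, so decoding with the wrong type returns a garbage number rather than an error.

```java
import java.nio.ByteBuffer;

public class MetricValueCodec {
    // Fixed-width 8-byte big-endian encodings, as a raw byte store like
    // HBase would hold them. No type information survives in the bytes.
    public static byte[] encodeLong(long v) {
        return ByteBuffer.allocate(8).putLong(v).array();
    }

    public static byte[] encodeDouble(double v) {
        return ByteBuffer.allocate(8).putDouble(v).array();
    }

    public static long decodeLong(byte[] b) {
        return ByteBuffer.wrap(b).getLong();
    }

    public static double decodeDouble(byte[] b) {
        return ByteBuffer.wrap(b).getDouble();
    }
}
```

Decoding `encodeDouble(5.0)` as a long yields the raw IEEE-754 bit pattern of the double, not 5 — exactly the silent corruption that motivates the "always send floating point as floating point" contract (or an explicit type marker).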
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3045) [Event producers] Implement NM writing container lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700717#comment-14700717 ] Naganarasimha G R commented on YARN-3045: - Thanks [~djp] for reviewing this jira. In continuation of this jira, I think we need to make some progress on YARN-3367 (as discussed earlier in this jira). So shall I handle the YARN-3367 jira and then revisit the missing NM container and application events? I think similar modifications are required on the RM side too, and we also need to handle other RM-side events, so I was thinking of working on YARN-3880 and including the changes there. Please share your opinion. [Event producers] Implement NM writing container lifecycle events to ATS Key: YARN-3045 URL: https://issues.apache.org/jira/browse/YARN-3045 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Naganarasimha G R Attachments: YARN-3045-YARN-2928.002.patch, YARN-3045-YARN-2928.003.patch, YARN-3045-YARN-2928.004.patch, YARN-3045-YARN-2928.005.patch, YARN-3045-YARN-2928.006.patch, YARN-3045-YARN-2928.007.patch, YARN-3045-YARN-2928.008.patch, YARN-3045-YARN-2928.009.patch, YARN-3045-YARN-2928.010.patch, YARN-3045-YARN-2928.011.patch, YARN-3045.20150420-1.patch Per design in YARN-2928, implement NM writing container lifecycle events and container system metrics to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-4014: Attachment: 0004-YARN-4014.patch Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4014.patch, 0003-YARN-4014.patch, 0004-YARN-4014.patch Track the changes for user-RM client protocol i.e ApplicationClientProtocol changes and discussions in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2005) Blacklisting support for scheduling AMs
[ https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-2005: Attachment: YARN-2005.005.patch Added a patch that maintains a separate system blacklist for launching AMs, distinct from the user blacklist. This avoids accidentally affecting the user's blacklist for launching containers. Blacklisting support for scheduling AMs --- Key: YARN-2005 URL: https://issues.apache.org/jira/browse/YARN-2005 Project: Hadoop YARN Issue Type: Improvement Components: resourcemanager Affects Versions: 0.23.10, 2.4.0 Reporter: Jason Lowe Assignee: Anubhav Dhoot Attachments: YARN-2005.001.patch, YARN-2005.002.patch, YARN-2005.003.patch, YARN-2005.004.patch, YARN-2005.005.patch It would be nice if the RM supported blacklisting a node for an AM launch after the same node fails a configurable number of AM attempts. This would be similar to the blacklisting support for scheduling task attempts in the MapReduce AM but for scheduling AM attempts on the RM side. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700753#comment-14700753 ] Rohith Sharma K S commented on YARN-4014: - bq. That means the updated priority is lost Discussed offline with Jian He; the updated priority won't be lost if the application is in the ACCEPTED state. Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4014.patch, 0003-YARN-4014.patch, 0004-YARN-4014.patch Track the changes for user-RM client protocol i.e ApplicationClientProtocol changes and discussions in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3868) ContainerManager recovery for container resizing
[ https://issues.apache.org/jira/browse/YARN-3868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699797#comment-14699797 ] Hadoop QA commented on YARN-3868: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 15m 13s | Findbugs (version ) appears to be broken on YARN-1197. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 40s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 43s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 20s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 5s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 23s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 15s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 52s | Tests passed in hadoop-yarn-server-nodemanager. 
| | | | 43m 35s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12748036/YARN-3868-YARN-1197.4.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | YARN-1197 / 23f28df | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8863/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8863/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8863/console | This message was automatically generated. ContainerManager recovery for container resizing Key: YARN-3868 URL: https://issues.apache.org/jira/browse/YARN-3868 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: MENG DING Assignee: MENG DING Attachments: YARN-3868-YARN-1197.3.patch, YARN-3868-YARN-1197.4.patch, YARN-3868.1.patch, YARN-3868.2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699872#comment-14699872 ] Li Lu commented on YARN-4053: - Hi [~varun_saxena], thanks for the patch! With regard to the POC, I thought we agreed on the general plan of the POC on a web UI connected to the timeline reader during our weekly standup discussion? If this is the case, I would personally give the RESTful API patch slightly higher priority since that is critical to the whole workflow of the reader/webUI interface? About this patch, I totally agree that we should directly store the byte representation of the numbers instead of using generic object mapper. Having looked at the patch, I have some general comments here. Maybe what we want here is a way to model the types of the timeline metrics, so that the type information can be carried over from TimelineMetric objects to the storage layer? We may have something like TimelineData.FLOAT, TimelineData.LONG, TimelineData.GENERIC_OBJECT, etc., so that we can easily transfer those messages? My main concern is on the ColumnPrefix descriptions, where we now use a boolean flag to indicate if the column is numeric or not. This will also help us to better organize the serialization and deserialization helper methods and all related tests. Let me know if this idea works here, thanks! Change the way metric values are stored in HBase Storage Key: YARN-4053 URL: https://issues.apache.org/jira/browse/YARN-4053 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena Attachments: YARN-4053-YARN-2928.01.patch Currently HBase implementation uses GenericObjectMapper to convert and store values in backend HBase storage. This converts everything into a string representation(ASCII/UTF-8 encoded byte array). While this is fine in most cases, it does not quite serve our use case for metrics. 
So we need to decide how are we going to encode and decode metric values and store them in HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699918#comment-14699918 ] Jian He commented on YARN-4014: --- bq. I think we can throw YarnException with message Application in app-state state cannot be update priority. Any thoughts? Sounds good to me. Could we just check whether the app is in the RUNNING state instead of the if condition below?
{code}
if (EnumSet.of(RMAppState.NEW, RMAppState.NEW_SAVING,
    RMAppState.FINAL_SAVING, RMAppState.FINISHING, RMAppState.FINISHED,
    RMAppState.KILLED, RMAppState.KILLING, RMAppState.FAILED)
    .contains(application.getState())) {
{code}
Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4014.patch Track the changes for user-RM client protocol i.e ApplicationClientProtocol changes and discussions in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699930#comment-14699930 ] Rohith Sharma K S commented on YARN-4014: - I did the above check leaving out the SUBMITTED, ACCEPTED and RUNNING states because I think application priority should be updatable in those states. Should we allow the update only for RUNNING? I feel all of these states should be allowed to change priority. What do you think? Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4014.patch Track the changes for user-RM client protocol i.e ApplicationClientProtocol changes and discussions in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
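The EnumSet check under discussion, rewritten as the positive whitelist Rohith describes (SUBMITTED, ACCEPTED, RUNNING allowed), can be sketched like this; AppState here is a hypothetical stand-in for RMAppState, not the actual ClientRMService code:

```java
import java.util.EnumSet;

public class PriorityUpdateCheck {
    // Hypothetical mirror of the RMAppState values in the discussion.
    public enum AppState { NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING,
        FINAL_SAVING, FINISHING, FINISHED, FAILED, KILLING, KILLED }

    // Positive whitelist: priority may be updated only while the app can
    // still be (re)scheduled.
    private static final EnumSet<AppState> UPDATABLE =
        EnumSet.of(AppState.SUBMITTED, AppState.ACCEPTED, AppState.RUNNING);

    public static boolean canUpdatePriority(AppState s) {
        return UPDATABLE.contains(s);
    }
}
```

A whitelist is also more robust than the exclusion list quoted earlier: a newly added state is rejected by default rather than silently allowed.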
[jira] [Commented] (YARN-3901) Populate flow run data in the flow_run table
[ https://issues.apache.org/jira/browse/YARN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699937#comment-14699937 ] Joep Rottinghuis commented on YARN-3901: It seems that in general, in order to get information from the client to the coprocessor, we have two mechanisms: 1) use put attributes 2) use Cell tags. We know that on the read side tags are stripped off. Did you have a specific reason to choose one or the other? Populate flow run data in the flow_run table Key: YARN-3901 URL: https://issues.apache.org/jira/browse/YARN-3901 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Vrushali C Assignee: Vrushali C Attachments: YARN-3901-YARN-2928.WIP.patch As per the schema proposed in YARN-3815 in https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf filing jira to track creation and population of data in the flow run table. Some points that are being considered:
- Stores per-flow-run information aggregated across applications, per flow version; the RM's collector writes to it on app creation and app completion
- The per-app collector writes to it for metric updates at a slower frequency than the metric updates to the application table; primary key: cluster ! user ! flow ! flow run id
- Only the latest version of flow-level aggregated metrics will be kept, even if the entity and application level keep a timeseries.
- The running_apps column will be incremented on app creation, and decremented on app completion.
- For min_start_time the RM writer will simply write a value with the tag for the applicationId. A coprocessor will return the min value of all written values.
  - Upon flush and compactions, the min value between all the cells of this column will be written to the cell without any tag (empty tag) and all the other cells will be discarded.
- Ditto for the max_end_time, but then the max will be kept.
- Tags are represented as #type:value. The type can be not set (0), or can indicate running (1) or complete (2). In those cases (for metrics) only complete app metrics are collapsed on compaction.
- The m! values are aggregated (summed) upon read. Only when applications are completed (indicated by tag type 2) can the values be collapsed.
- The application ids that have completed and been aggregated into the flow numbers are retained in a separate column for historical tracking: we don't want to re-aggregate for those upon replay
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
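The min_start_time collapse described above (minimum over all written cells, computed by the coprocessor on flush/compaction) reduces to taking a minimum over the cell values. Here is a minimal sketch over plain longs, ignoring the HBase Cell and tag machinery; the class name is a hypothetical illustration:

```java
import java.util.Arrays;

public class FlowRunMinCollapse {
    // Collapse all written cells of the min_start_time column into the
    // single value that survives the compaction. For max_end_time the
    // same idea applies with max() instead of min().
    public static long collapseMin(long[] cellValues) {
        if (cellValues.length == 0) {
            throw new IllegalArgumentException("no cells to collapse");
        }
        return Arrays.stream(cellValues).min().getAsLong();
    }
}
```

After the collapse, only this single untagged cell remains, so later reads need no further aggregation for that column.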
[jira] [Commented] (YARN-3975) WebAppProxyServlet should not redirect to RM page if AHS is enabled
[ https://issues.apache.org/jira/browse/YARN-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699816#comment-14699816 ] Hadoop QA commented on YARN-3975: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 50s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 8m 0s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 57s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 47s | The applied patch generated 5 new checkstyle issues (total was 17, now 22). | | {color:green}+1{color} | whitespace | 0m 1s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 21s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 36s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 7m 0s | Tests passed in hadoop-yarn-client. | | {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-server-web-proxy. 
| | | | 46m 56s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750843/YARN-3975.5.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 13604bd | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8864/artifact/patchprocess/diffcheckstylehadoop-yarn-server-web-proxy.txt | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/8864/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | hadoop-yarn-server-web-proxy test log | https://builds.apache.org/job/PreCommit-YARN-Build/8864/artifact/patchprocess/testrun_hadoop-yarn-server-web-proxy.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8864/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8864/console | This message was automatically generated. WebAppProxyServlet should not redirect to RM page if AHS is enabled --- Key: YARN-3975 URL: https://issues.apache.org/jira/browse/YARN-3975 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.1 Reporter: Mit Desai Assignee: Mit Desai Attachments: YARN-3975.2.b2.patch, YARN-3975.3.patch, YARN-3975.4.patch, YARN-3975.5.patch WebAppProxyServlet should be updated to handle the case when the app report doesn't have a tracking URL and the Application History Server is enabled. Since we would have already tried the RM and got an ApplicationNotFoundException, we should not direct the user to the RM app page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
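The redirect decision described in YARN-3975 boils down to the following helper. This is an illustrative sketch, not the actual WebAppProxyServlet code; the page-URL parameters are hypothetical:

```java
public class ProxyRedirect {
    // Choose where to send the user when proxying an application URL:
    // prefer the app's tracking URL; if there is none and the Application
    // History Server is enabled, go to the AHS app page instead of
    // bouncing the user back to the RM app page.
    public static String chooseTarget(String trackingUrl, boolean ahsEnabled,
                                      String rmAppPage, String ahsAppPage) {
        if (trackingUrl != null && !trackingUrl.isEmpty()) {
            return trackingUrl;
        }
        return ahsEnabled ? ahsAppPage : rmAppPage;
    }
}
```

With AHS enabled and no tracking URL, the user lands on the AHS page, matching the behavior the fix asks for.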
[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering
[ https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699814#comment-14699814 ] Hadoop QA commented on YARN-4051: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 16m 26s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 57s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 59s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 37s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 21s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 13s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 20s | Tests passed in hadoop-yarn-server-nodemanager. 
| | | | 44m 54s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750841/YARN-4051.03.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 13604bd | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8865/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8865/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8865/console | This message was automatically generated. ContainerKillEvent is lost when container is In New State and is recovering Key: YARN-4051 URL: https://issues.apache.org/jira/browse/YARN-4051 Project: Hadoop YARN Issue Type: Bug Reporter: sandflee Assignee: sandflee Priority: Critical Attachments: YARN-4051.01.patch, YARN-4051.02.patch, YARN-4051.03.patch As in YARN-4050, NM event dispatcher is blocked, and container is in New state, when we finish application, the container still alive even after NM event dispatcher is unblocked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699885#comment-14699885 ] Varun Saxena commented on YARN-4053: [~gtCarrera9], yes, will update a patch by tomorrow for YARN-3814 if no further comments come. Do you want it today ? Change the way metric values are stored in HBase Storage Key: YARN-4053 URL: https://issues.apache.org/jira/browse/YARN-4053 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena Attachments: YARN-4053-YARN-2928.01.patch Currently HBase implementation uses GenericObjectMapper to convert and store values in backend HBase storage. This converts everything into a string representation(ASCII/UTF-8 encoded byte array). While this is fine in most cases, it does not quite serve our use case for metrics. So we need to decide how are we going to encode and decode metric values and store them in HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
[ https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699908#comment-14699908 ] Wangda Tan commented on YARN-4024: -- Hi [~zhiguohong], Thanks for working on this. For your comments: bq. I think that's too complicated... Agree, I changed my idea, see https://issues.apache.org/jira/browse/YARN-4024?focusedCommentId=14660607page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14660607. The approach in your patch generally looks good to me; a few suggestions: 1) When a node becomes NODE_UNUSABLE/NODE_USABLE, I suggest removing it from the cache to force an update of its IP, since a node status change will (likely) update its IP. This may require updating the Resolver interface. 2) seconds - something like normalizedHostnameCacheTimeout [~wilsoncraft], bq. When a nodemanager is decommissioned, is the IP cached for that host flushed out of the cache? Normally when a host gets a new IP it's because it gets moved or undergoes some other deliberate maintenance, which would normally be preceded by a decommission. If the IP is flushed when decommissioned, or an IP is always resolved from the host name when a new or recommissioned nodemanager is added to the cluster, I think that would be adequate IMHO. I'm not quite sure what you meant; does my comment solve the problem you mentioned? bq. 1) When a node becomes NODE_UNUSABLE/NODE_USABLE, I suggest removing it from the cache to force an update of its IP, since a node status change will (likely) update its IP. This may require updating the Resolver interface. bq. Also, it may be worthwhile or adequate to expose the method in a yarn rmadmin command to force a flush of the IP cache. Is this IP cache the same one used for Rack Awareness by the RM? I prefer to keep this an internal behavior; this won't be used to determine the rack, IIUC. Please let me know your thoughts. 
Thanks, Wangda YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat -- Key: YARN-4024 URL: https://issues.apache.org/jira/browse/YARN-4024 Project: Hadoop YARN Issue Type: Improvement Reporter: Wangda Tan Assignee: Hong Zhiguo Attachments: YARN-4024-draft.patch Currently, YARN RM NodesListManager will resolve the IP address every time a node heartbeats. When the DNS server becomes slow, NM heartbeats will be blocked and cannot make progress. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
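The caching scheme discussed above can be sketched as follows. This is a hypothetical toy model, not the actual NodesListManager/Resolver code: the class name, the `normalizedHostnameCacheTimeout` field (borrowed from Wangda's suggested name), and the `invalidate` hook called on NODE_USABLE/NODE_UNUSABLE transitions are all illustrative assumptions.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Sketch of a hostname->IP resolver that caches lookups so NM heartbeats
 * do not block on a slow DNS server, with explicit eviction when a node's
 * status changes (a state change likely means a new IP).
 */
public class CachingResolver {
    private static class Entry {
        final String ip;
        final long resolvedAtMs;
        Entry(String ip, long resolvedAtMs) { this.ip = ip; this.resolvedAtMs = resolvedAtMs; }
    }

    private final ConcurrentMap<String, Entry> cache = new ConcurrentHashMap<>();
    private final long normalizedHostnameCacheTimeoutMs;

    public CachingResolver(long timeoutMs) {
        this.normalizedHostnameCacheTimeoutMs = timeoutMs;
    }

    /** Resolve through the cache; fall back to DNS only on a miss or expiry. */
    public String resolve(String hostname) {
        long now = System.currentTimeMillis();
        Entry e = cache.get(hostname);
        if (e != null && now - e.resolvedAtMs < normalizedHostnameCacheTimeoutMs) {
            return e.ip; // fresh enough: no DNS round trip on this heartbeat
        }
        String ip = dnsLookup(hostname);
        cache.put(hostname, new Entry(ip, now));
        return ip;
    }

    /** Called when a node becomes NODE_USABLE/NODE_UNUSABLE: force re-resolution. */
    public void invalidate(String hostname) {
        cache.remove(hostname);
    }

    // Stand-in for a real lookup such as InetAddress.getByName(host).getHostAddress().
    protected String dnsLookup(String hostname) {
        return "0.0.0.0";
    }
}
```

Keeping eviction internal (rather than exposing it via an rmadmin command) matches Wangda's preference above; the state-change hook covers the decommission case Allan raised, since decommissioning makes the node unusable.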
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699907#comment-14699907 ] Varun Saxena commented on YARN-4053: Oh sorry in my original comment, I meant that we can carry this logic over to other columns(not only metrics). I agree we can include type information in TimelineMetric object itself. That will be better. By the way do you envisage metric values having anything other than float or long ? I think TimelineData.FLOAT and TimelineData.LONG should be enough. Thoughts ? Change the way metric values are stored in HBase Storage Key: YARN-4053 URL: https://issues.apache.org/jira/browse/YARN-4053 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena Attachments: YARN-4053-YARN-2928.01.patch Currently HBase implementation uses GenericObjectMapper to convert and store values in backend HBase storage. This converts everything into a string representation(ASCII/UTF-8 encoded byte array). While this is fine in most cases, it does not quite serve our use case for metrics. So we need to decide how are we going to encode and decode metric values and store them in HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
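The typed encoding discussed here (TimelineData.FLOAT / TimelineData.LONG instead of GenericObjectMapper's string form) could look roughly like the sketch below. The class name, the one-byte type tag, and the fixed-width layout are assumptions for illustration, not the project's actual codec; floats are stored as doubles for simplicity.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical typed codec for metric values: a one-byte type tag followed
 * by a fixed-width numeric encoding, instead of an ASCII/UTF-8 string.
 * Fixed-width encodings also keep values comparable/aggregatable in HBase.
 */
public class MetricValueCodec {
    public static final byte TYPE_LONG = 1;
    public static final byte TYPE_FLOAT = 2; // stored as a double

    public static byte[] encode(Number value) {
        ByteBuffer buf = ByteBuffer.allocate(1 + Long.BYTES); // tag + 8 bytes
        if (value instanceof Float || value instanceof Double) {
            buf.put(TYPE_FLOAT).putDouble(value.doubleValue());
        } else {
            buf.put(TYPE_LONG).putLong(value.longValue());
        }
        return buf.array();
    }

    public static Number decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        byte type = buf.get();
        switch (type) {
            case TYPE_LONG:  return buf.getLong();
            case TYPE_FLOAT: return buf.getDouble();
            default: throw new IllegalArgumentException("unknown type tag " + type);
        }
    }
}
```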
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699883#comment-14699883 ] Varun Saxena commented on YARN-4053: bq. We may have something like TimelineData.FLOAT, TimelineData.LONG, TimelineData.GENERIC_OBJECT, etc., so that we can easily transfer those messages Yes this can be done. I meant the same when I mentioned we can extend the logic in the patch attached. Depends on what everyone agrees to. Change the way metric values are stored in HBase Storage Key: YARN-4053 URL: https://issues.apache.org/jira/browse/YARN-4053 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena Attachments: YARN-4053-YARN-2928.01.patch Currently HBase implementation uses GenericObjectMapper to convert and store values in backend HBase storage. This converts everything into a string representation(ASCII/UTF-8 encoded byte array). While this is fine in most cases, it does not quite serve our use case for metrics. So we need to decide how are we going to encode and decode metric values and store them in HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699888#comment-14699888 ] Li Lu commented on YARN-4053: - Thanks! Tomorrow LGTM. Change the way metric values are stored in HBase Storage Key: YARN-4053 URL: https://issues.apache.org/jira/browse/YARN-4053 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena Attachments: YARN-4053-YARN-2928.01.patch Currently HBase implementation uses GenericObjectMapper to convert and store values in backend HBase storage. This converts everything into a string representation(ASCII/UTF-8 encoded byte array). While this is fine in most cases, it does not quite serve our use case for metrics. So we need to decide how are we going to encode and decode metric values and store them in HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3975) WebAppProxyServlet should not redirect to RM page if AHS is enabled
[ https://issues.apache.org/jira/browse/YARN-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated YARN-3975: Attachment: YARN-3975.6.patch Fixed the Checkstyle issues. WebAppProxyServlet should not redirect to RM page if AHS is enabled --- Key: YARN-3975 URL: https://issues.apache.org/jira/browse/YARN-3975 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.1 Reporter: Mit Desai Assignee: Mit Desai Attachments: YARN-3975.2.b2.patch, YARN-3975.3.patch, YARN-3975.4.patch, YARN-3975.5.patch, YARN-3975.6.patch WebAppProxyServlet should be updated to handle the case when the app report doesn't have a tracking URL and the Application History Server is enabled. As we would have already tried the RM and got an ApplicationNotFoundException, we should not direct the user to the RM app page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699934#comment-14699934 ] Sunil G commented on YARN-4014: --- IMHO, we can change the priority of an app in the ACCEPTED state as well, because it is yet to be activated and we would be increasing the priority for it. Thoughts? Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4014.patch Track the changes for user-RM client protocol i.e. ApplicationClientProtocol changes and discussions in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1644) RM-NM protocol changes and NodeStatusUpdater implementation to support container resizing
[ https://issues.apache.org/jira/browse/YARN-1644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699939#comment-14699939 ] MENG DING commented on YARN-1644: - I had an offline discussion with [~jianhe] a while ago, and we thought that the race condition in scenario 3 can be handled in a separate JIRA, as it applies to both increase container size and start container. For this ticket, we are exploring the idea of getting rid of the {{increasedContainers}} list from NM. The {{increasedContainers}} was originally introduced as a way to let NM inform RM that an increase action has been completed in NM. However, it seems that we may achieve the same result by checking {{containerStatuses}}. In particular, RM will keep checking the difference of container sizes between heartbeats. For each container: * If the container size reported from this heartbeat is larger than the size reported from previous heartbeat: ** If the reported size is the same as RM's bookkeeping for this container, then this is a confirmation of container resource increase. ** If the reported size is larger than RM's bookkeeping for this container, then this is due to an RM recovery during container resource increase in NM. RM should increase its bookkeeping of this container to match the reported size. ** If the reported size is smaller than RM's bookkeeping for this container, it should be an error. * If the container size reported from this heartbeat is smaller than the size reported from previous heartbeat: ** If the reported size is the same as RM's bookkeeping for this container, then this is a confirmation of container resource decrease. ** Any other case should be an error. The validity of this approach is still being decided. Any comments/concerns are welcome. 
RM-NM protocol changes and NodeStatusUpdater implementation to support container resizing - Key: YARN-1644 URL: https://issues.apache.org/jira/browse/YARN-1644 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Wangda Tan Assignee: MENG DING Attachments: YARN-1644-YARN-1197.4.patch, YARN-1644-YARN-1197.5.patch, YARN-1644.1.patch, YARN-1644.2.patch, YARN-1644.3.patch, yarn-1644.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
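The heartbeat-comparison procedure described in the comment above can be sketched as a simple decision function. This is a toy model, not the actual RM code: a plain `long` stands in for the Resource size, and the class, enum, and parameter names are illustrative assumptions.

```java
/**
 * Sketch of reconciling container sizes without an increasedContainers list:
 * the RM compares the size reported in the current heartbeat against the
 * previous report and against its own bookkeeping for the container.
 */
public class ContainerSizeReconciler {
    public enum Outcome {
        NO_CHANGE,           // sizes match between heartbeats
        INCREASE_CONFIRMED,  // NM confirms an increase the RM already recorded
        RM_RECOVERED_BEHIND, // RM recovered during an NM-side increase; catch up
        DECREASE_CONFIRMED,  // NM confirms a decrease the RM already recorded
        ERROR                // any other combination is unexpected
    }

    public static Outcome reconcile(long previousReported, long currentReported,
                                    long rmBookkeeping) {
        if (currentReported > previousReported) {
            if (currentReported == rmBookkeeping) return Outcome.INCREASE_CONFIRMED;
            // Reported size exceeds the RM's view: increase completed on the NM
            // while the RM was recovering, so the RM should raise its bookkeeping.
            if (currentReported > rmBookkeeping) return Outcome.RM_RECOVERED_BEHIND;
            return Outcome.ERROR; // grew since last heartbeat yet below RM's view
        }
        if (currentReported < previousReported) {
            if (currentReported == rmBookkeeping) return Outcome.DECREASE_CONFIRMED;
            return Outcome.ERROR; // any other shrink is an error per the comment
        }
        return Outcome.NO_CHANGE;
    }
}
```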
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699951#comment-14699951 ] Varun Saxena commented on YARN-4053: On second thoughts, all three types may make sense if we include filters as part of our object model and make the client create and send them. Let's discuss this on Wednesday in the weekly meeting. Change the way metric values are stored in HBase Storage Key: YARN-4053 URL: https://issues.apache.org/jira/browse/YARN-4053 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena Attachments: YARN-4053-YARN-2928.01.patch Currently HBase implementation uses GenericObjectMapper to convert and store values in backend HBase storage. This converts everything into a string representation(ASCII/UTF-8 encoded byte array). While this is fine in most cases, it does not quite serve our use case for metrics. So we need to decide how are we going to encode and decode metric values and store them in HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-4014: Attachment: 0002-YARN-4014.patch Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4014.patch, 0002-YARN-4017.patch Track the changes for user-RM client protocol i.e ApplicationClientProtocol changes and discussions in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-4014: Attachment: (was: 0002-YARN-4017.patch) Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4014.patch Track the changes for user-RM client protocol i.e ApplicationClientProtocol changes and discussions in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
[ https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699645#comment-14699645 ] Allan Wilson commented on YARN-4024: When a nodemanager is decommissioned, is the IP cached for that host flushed out of the cache? Normally when a host gets a new IP it's because it gets moved or undergoes some other deliberate maintenance, which would normally be preceded by a decommission. If the IP is flushed when decommissioned, or an IP is always resolved from the host name when a new or recommissioned nodemanager is added to the cluster, I think that would be adequate IMHO. Also, it may be worthwhile or adequate to expose the method in a yarn rmadmin command to force a flush of the IP cache. Is this IP cache the same one used for Rack Awareness by the RM? Thanks YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat -- Key: YARN-4024 URL: https://issues.apache.org/jira/browse/YARN-4024 Project: Hadoop YARN Issue Type: Improvement Reporter: Wangda Tan Assignee: Hong Zhiguo Attachments: YARN-4024-draft.patch Currently, YARN RM NodesListManager will resolve the IP address every time a node heartbeats. When the DNS server becomes slow, NM heartbeats will be blocked and cannot make progress. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4055) Report node resource utilization in heartbeat
[ https://issues.apache.org/jira/browse/YARN-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699641#comment-14699641 ] Hudson commented on YARN-4055: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2217 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2217/]) YARN-4055. Report node resource utilization in heartbeat. (Inigo Goiri via kasha) (kasha: rev 13604bd5f119fc81b9942190dfa366afad61bc92) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/Context.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_protos.proto * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestResourceTrackerOnHA.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/NodeStatus.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/impl/pb/NodeStatusPBImpl.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java Report node resource utilization in heartbeat - Key: YARN-4055 URL: https://issues.apache.org/jira/browse/YARN-4055 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.1 Reporter: Inigo Goiri Assignee: Inigo Goiri Fix For: 2.8.0 Attachments: YARN-4055-v0.patch, YARN-4055-v1.patch Send the resource utilization from the node (obtained in the NodeResourceMonitor) to the RM in the heartbeat. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3975) WebAppProxyServlet should not redirect to RM page if AHS is enabled
[ https://issues.apache.org/jira/browse/YARN-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mit Desai updated YARN-3975: Attachment: YARN-3975.5.patch Thanks for the review, [~jlowe]. I have modified the patch based on your comments. WebAppProxyServlet should not redirect to RM page if AHS is enabled --- Key: YARN-3975 URL: https://issues.apache.org/jira/browse/YARN-3975 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.1 Reporter: Mit Desai Assignee: Mit Desai Attachments: YARN-3975.2.b2.patch, YARN-3975.3.patch, YARN-3975.4.patch, YARN-3975.5.patch WebAppProxyServlet should be updated to handle the case when the app report doesn't have a tracking URL and the Application History Server is enabled. As we would have already tried the RM and got an ApplicationNotFoundException, we should not direct the user to the RM app page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3901) Populate flow run data in the flow_run table
[ https://issues.apache.org/jira/browse/YARN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699804#comment-14699804 ] Joep Rottinghuis commented on YARN-3901: Presumably the storeWithTag method is just temporary until we have this worked out and then the store method would take a list of tags (and for those cases not using tags would end up passing null)? Populate flow run data in the flow_run table Key: YARN-3901 URL: https://issues.apache.org/jira/browse/YARN-3901 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Vrushali C Assignee: Vrushali C Attachments: YARN-3901-YARN-2928.WIP.patch As per the schema proposed in YARN-3815 in https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf filing jira to track creation and population of data in the flow run table. Some points that are being considered: - Stores per flow run information aggregated across applications, flow version RM’s collector writes to on app creation and app completion - Per App collector writes to it for metric updates at a slower frequency than the metric updates to application table primary key: cluster ! user ! flow ! flow run id - Only the latest version of flow-level aggregated metrics will be kept, even if the entity and application level keep a timeseries. - The running_apps column will be incremented on app creation, and decremented on app completion. - For min_start_time the RM writer will simply write a value with the tag for the applicationId. A coprocessor will return the min value of all written values. - - Upon flush and compactions, the min value between all the cells of this column will be written to the cell without any tag (empty tag) and all the other cells will be discarded. - Ditto for the max_end_time, but then the max will be kept. - Tags are represented as #type:value. The type can be not set (0), or can indicate running (1) or complete (2). 
In those cases (for metrics) only complete app metrics are collapsed on compaction. - The m! values are aggregated (summed) upon read. Only when applications are completed (indicated by tag type 2) can the values be collapsed. - The application ids that have completed and been aggregated into the flow numbers are retained in a separate column for historical tracking: we don’t want to re-aggregate for those upon replay -- This message was sent by Atlassian JIRA (v6.3.4#6332)
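The min_start_time rule from the schema proposal (each writer leaves a cell tagged with its applicationId; on flush/compaction the minimum across all cells is kept as a single untagged cell) can be illustrated with a toy model. This is not the HBase coprocessor itself: `Cell` here is a minimal stand-in, and all names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Toy illustration of the flow_run min_start_time compaction rule:
 * collapse all tagged cells of the column into one untagged cell holding
 * the minimum value, discarding the per-application cells.
 * (max_end_time would be identical with Math.max.)
 */
public class MinStartTimeCompactor {
    public static class Cell {
        public final String tag;  // applicationId tag, or "" once collapsed
        public final long value;  // start time written by that application
        public Cell(String tag, long value) { this.tag = tag; this.value = value; }
    }

    /** Return the compacted column: a single untagged min-valued cell. */
    public static List<Cell> compact(List<Cell> cells) {
        long min = Long.MAX_VALUE;
        for (Cell c : cells) {
            min = Math.min(min, c.value);
        }
        List<Cell> out = new ArrayList<>();
        out.add(new Cell("", min)); // empty tag marks the aggregated value
        return out;
    }
}
```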
[jira] [Commented] (YARN-3534) Collect memory/cpu usage on the node
[ https://issues.apache.org/jira/browse/YARN-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699678#comment-14699678 ] Hudson commented on YARN-3534: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2236 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2236/]) YARN-3534. Collect memory/cpu usage on the node. (Inigo Goiri via kasha) (kasha: rev def12933b38efd5e47c5144b729c1a1496f09229) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeResourceMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeResourceMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeResourceMonitorImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java Collect memory/cpu usage on the node Key: YARN-3534 URL: https://issues.apache.org/jira/browse/YARN-3534 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Affects Versions: 2.7.0 Reporter: Inigo Goiri 
Assignee: Inigo Goiri Fix For: 2.8.0 Attachments: YARN-3534-1.patch, YARN-3534-10.patch, YARN-3534-11.patch, YARN-3534-12.patch, YARN-3534-14.patch, YARN-3534-15.patch, YARN-3534-16.patch, YARN-3534-16.patch, YARN-3534-17.patch, YARN-3534-17.patch, YARN-3534-18.patch, YARN-3534-2.patch, YARN-3534-3.patch, YARN-3534-3.patch, YARN-3534-4.patch, YARN-3534-5.patch, YARN-3534-6.patch, YARN-3534-7.patch, YARN-3534-8.patch, YARN-3534-9.patch Original Estimate: 336h Remaining Estimate: 336h YARN should be aware of the resource utilization of the nodes when scheduling containers. For this, this task will implement the collection of memory/cpu usage on the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering
[ https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sandflee updated YARN-4051: --- Attachment: YARN-4051.03.patch Pend the kill event while the container is recovering, then act as if it was recovered as killed: if the container is recovered as COMPLETE, go to the DONE state; if recovered as LAUNCHED, try to reacquire the container and kill it; if recovered as REQUESTED, try to clean up the container state and go to the DONE state. ContainerKillEvent is lost when container is In New State and is recovering Key: YARN-4051 URL: https://issues.apache.org/jira/browse/YARN-4051 Project: Hadoop YARN Issue Type: Bug Reporter: sandflee Assignee: sandflee Priority: Critical Attachments: YARN-4051.01.patch, YARN-4051.02.patch, YARN-4051.03.patch As in YARN-4050, NM event dispatcher is blocked, and container is in New state, when we finish application, the container still alive even after NM event dispatcher is unblocked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
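The handling described in this update can be sketched as a mapping from the recovered state to an action. All names are hypothetical (this is not the patch's code), and "reacquire" is an assumed reading of "require container" in the comment.

```java
/**
 * Sketch of handling a kill event that was pended while a container in the
 * NEW state was still recovering: once the recovered state is known, act as
 * if the container had been recovered as killed.
 */
public class RecoveredKillHandler {
    public enum RecoveredState { COMPLETE, LAUNCHED, REQUESTED }
    public enum Action {
        GO_TO_DONE,         // nothing left to kill
        REACQUIRE_AND_KILL, // process still running: reacquire, then kill it
        CLEANUP_AND_DONE    // launch requested but not running: clean up state
    }

    public static Action onPendedKill(RecoveredState state) {
        switch (state) {
            case COMPLETE:  return Action.GO_TO_DONE;
            case LAUNCHED:  return Action.REACQUIRE_AND_KILL;
            case REQUESTED: return Action.CLEANUP_AND_DONE;
            default: throw new IllegalStateException("unexpected state " + state);
        }
    }
}
```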
[jira] [Commented] (YARN-3901) Populate flow run data in the flow_run table
[ https://issues.apache.org/jira/browse/YARN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699795#comment-14699795 ] Joep Rottinghuis commented on YARN-3901: Starting to go through. Just as a note (not necessarily related to this patch) when we do things like {noformat} te.getType().equals(TimelineEntityType.YARN_APPLICATION.toString()) {noformat} I'm wondering if we should create an equals method on TimelineEntityType so we can simply write: {noformat} TimelineEntityType.YARN_APPLICATION.equals(te.getType()) {noformat} That would be shorter and could also deal with nulls without needing to check whether te.getType() could ever return null. Populate flow run data in the flow_run table Key: YARN-3901 URL: https://issues.apache.org/jira/browse/YARN-3901 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Vrushali C Assignee: Vrushali C Attachments: YARN-3901-YARN-2928.WIP.patch As per the schema proposed in YARN-3815 in https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf filing jira to track creation and population of data in the flow run table. Some points that are being considered: - Stores per flow run information aggregated across applications, flow version RM’s collector writes to on app creation and app completion - Per App collector writes to it for metric updates at a slower frequency than the metric updates to application table primary key: cluster ! user ! flow ! flow run id - Only the latest version of flow-level aggregated metrics will be kept, even if the entity and application level keep a timeseries. - The running_apps column will be incremented on app creation, and decremented on app completion. - For min_start_time the RM writer will simply write a value with the tag for the applicationId. A coprocessor will return the min value of all written values. 
- - Upon flush and compactions, the min value between all the cells of this column will be written to the cell without any tag (empty tag) and all the other cells will be discarded. - Ditto for the max_end_time, but then the max will be kept. - Tags are represented as #type:value. The type can be not set (0), or can indicate running (1) or complete (2). In those cases (for metrics) only complete app metrics are collapsed on compaction. - The m! values are aggregated (summed) upon read. Only when applications are completed (indicated by tag type 2) can the values be collapsed. - The application ids that have completed and been aggregated into the flow numbers are retained in a separate column for historical tracking: we don’t want to re-aggregate for those upon replay -- This message was sent by Atlassian JIRA (v6.3.4#6332)
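Joep's suggested null-safe helper above could be sketched as below. One wrinkle: `Enum.equals(Object)` is final in Java, and an overloaded `equals(String)` would be easy to confuse with it, so this sketch uses a differently named `matches` method. The enum here is a local stand-in, not the real TimelineEntityType.

```java
/**
 * Sketch of a null-safe type check on the enum, so callers can write
 * TimelineEntityType.YARN_APPLICATION.matches(te.getType()) instead of
 * te.getType().equals(TimelineEntityType.YARN_APPLICATION.toString()).
 */
public class EntityTypeMatch {
    public enum TimelineEntityType {
        YARN_APPLICATION, YARN_CONTAINER;

        /** True iff the given string names this constant; false for null. */
        public boolean matches(String typeString) {
            return this.toString().equals(typeString);
        }
    }
}
```

Putting the constant on the left is the usual trick for null safety: the known-non-null enum drives the comparison, so a null `te.getType()` simply yields false.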
[jira] [Commented] (YARN-3534) Collect memory/cpu usage on the node
[ https://issues.apache.org/jira/browse/YARN-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699661#comment-14699661 ] Hudson commented on YARN-3534: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #279 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/279/]) YARN-3534. Collect memory/cpu usage on the node. (Inigo Goiri via kasha) (kasha: rev def12933b38efd5e47c5144b729c1a1496f09229) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeResourceMonitorImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeResourceMonitor.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeResourceMonitor.java Collect memory/cpu usage on the node Key: YARN-3534 URL: https://issues.apache.org/jira/browse/YARN-3534 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager, resourcemanager Affects Versions: 2.7.0 Reporter: Inigo Goiri 
Assignee: Inigo Goiri Fix For: 2.8.0 Attachments: YARN-3534-1.patch, YARN-3534-10.patch, YARN-3534-11.patch, YARN-3534-12.patch, YARN-3534-14.patch, YARN-3534-15.patch, YARN-3534-16.patch, YARN-3534-16.patch, YARN-3534-17.patch, YARN-3534-17.patch, YARN-3534-18.patch, YARN-3534-2.patch, YARN-3534-3.patch, YARN-3534-3.patch, YARN-3534-4.patch, YARN-3534-5.patch, YARN-3534-6.patch, YARN-3534-7.patch, YARN-3534-8.patch, YARN-3534-9.patch Original Estimate: 336h Remaining Estimate: 336h YARN should be aware of the resource utilization of the nodes when scheduling containers. For this, this task will implement the collection of memory/cpu usage on the node. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4055) Report node resource utilization in heartbeat
[ https://issues.apache.org/jira/browse/YARN-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699662#comment-14699662 ] Hudson commented on YARN-4055: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #279 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/279/]) YARN-4055. Report node resource utilization in heartbeat. (Inigo Goiri via kasha) (kasha: rev 13604bd5f119fc81b9942190dfa366afad61bc92) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/impl/pb/NodeStatusPBImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/NodeStatus.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_protos.proto * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestResourceTrackerOnHA.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/Context.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java Report node resource utilization in heartbeat - Key: YARN-4055 URL: https://issues.apache.org/jira/browse/YARN-4055 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.1 Reporter: Inigo Goiri Assignee: Inigo Goiri Fix For: 2.8.0 Attachments: YARN-4055-v0.patch, YARN-4055-v1.patch Send the resource utilization from the node (obtained in the NodeResourceMonitor) to the RM in the heartbeat. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4051) ContainerKillEvent is lost when container is In New State and is recovering
[ https://issues.apache.org/jira/browse/YARN-4051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699745#comment-14699745 ] sandflee commented on YARN-4051: If the container is recovered as REQUESTED, try to clean up the container's resources and go to the DONE state. ContainerKillEvent is lost when container is In New State and is recovering Key: YARN-4051 URL: https://issues.apache.org/jira/browse/YARN-4051 Project: Hadoop YARN Issue Type: Bug Reporter: sandflee Assignee: sandflee Priority: Critical Attachments: YARN-4051.01.patch, YARN-4051.02.patch, YARN-4051.03.patch As in YARN-4050, the NM event dispatcher is blocked and the container is in the NEW state; when we finish the application, the container stays alive even after the NM event dispatcher is unblocked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699628#comment-14699628 ] Rohith Sharma K S commented on YARN-4014: - Updated the modified patch; kindly review. Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4014.patch Track the changes for user-RM client protocol, i.e. ApplicationClientProtocol changes and discussions, in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4055) Report node resource utilization in heartbeat
[ https://issues.apache.org/jira/browse/YARN-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699659#comment-14699659 ] Hudson commented on YARN-4055: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #287 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/287/]) YARN-4055. Report node resource utilization in heartbeat. (Inigo Goiri via kasha) (kasha: rev 13604bd5f119fc81b9942190dfa366afad61bc92) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestResourceTrackerOnHA.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/impl/pb/NodeStatusPBImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/Context.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_protos.proto * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/NodeStatus.java Report node resource utilization in heartbeat - Key: YARN-4055 URL: https://issues.apache.org/jira/browse/YARN-4055 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.1 Reporter: Inigo Goiri Assignee: Inigo Goiri Fix For: 2.8.0 Attachments: YARN-4055-v0.patch, YARN-4055-v1.patch Send the resource utilization from the node (obtained in the NodeResourceMonitor) to the RM in the heartbeat. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4055) Report node resource utilization in heartbeat
[ https://issues.apache.org/jira/browse/YARN-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699679#comment-14699679 ] Hudson commented on YARN-4055: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2236 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2236/]) YARN-4055. Report node resource utilization in heartbeat. (Inigo Goiri via kasha) (kasha: rev 13604bd5f119fc81b9942190dfa366afad61bc92) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/NodeStatus.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestResourceTrackerOnHA.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/Context.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_protos.proto * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/records/impl/pb/NodeStatusPBImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeManager.java Report node resource utilization in heartbeat - Key: YARN-4055 URL: https://issues.apache.org/jira/browse/YARN-4055 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Affects Versions: 2.7.1 Reporter: Inigo Goiri Assignee: Inigo Goiri Fix For: 2.8.0 Attachments: YARN-4055-v0.patch, YARN-4055-v1.patch Send the resource utilization from the node (obtained in the NodeResourceMonitor) to the RM in the heartbeat. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-679) add an entry point that can start any Yarn service
[ https://issues.apache.org/jira/browse/YARN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-679: Attachment: YARN-679-004.patch add an entry point that can start any Yarn service -- Key: YARN-679 URL: https://issues.apache.org/jira/browse/YARN-679 Project: Hadoop YARN Issue Type: New Feature Components: api Affects Versions: 2.4.0 Reporter: Steve Loughran Assignee: Steve Loughran Labels: BB2015-05-TBR Attachments: YARN-679-001.patch, YARN-679-002.patch, YARN-679-002.patch, YARN-679-003.patch, YARN-679-004.patch, org.apache.hadoop.servic...mon 3.0.0-SNAPSHOT API).pdf Time Spent: 72h Remaining Estimate: 0h There's no need to write separate main classes for every Yarn service, given that the startup mechanism should be identical: create, init, start, wait for stopped, with an interrupt handler to trigger a clean shutdown on a control-c interrupt. Provide one that takes any classname and a list of config files/options. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
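The lifecycle described above (create, init, start, wait for stopped, with a clean shutdown on Ctrl-C) can be sketched roughly as below. This is an illustrative stand-in only: `Service`, `DemoService`, and `ServiceLauncher` are simplified hypothetical classes, not the actual `org.apache.hadoop.service` API or the attached patch.

```java
import java.util.concurrent.CountDownLatch;

// Simplified stand-in for a service lifecycle: init -> start -> stop,
// with a latch so a launcher can block until the service is stopped.
abstract class Service {
    private final CountDownLatch stopped = new CountDownLatch(1);
    public abstract void init(String[] confArgs);
    public abstract void start();
    public void stop() { stopped.countDown(); }                 // idempotent
    public void waitForStopped() throws InterruptedException { stopped.await(); }
}

// Trivial service used only to demonstrate the launcher.
class DemoService extends Service {
    volatile boolean inited, started;
    public void init(String[] confArgs) { inited = true; }
    public void start() { started = true; }
}

public class ServiceLauncher {
    // Generic entry point: reflectively create the named service, register an
    // interrupt handler, then run init -> start. The caller blocks on
    // waitForStopped() until stop() is invoked (e.g. by the shutdown hook).
    public static Service launch(String className, String[] confArgs) throws Exception {
        Service svc = (Service) Class.forName(className)
                .getDeclaredConstructor().newInstance();
        Runtime.getRuntime().addShutdownHook(new Thread(svc::stop));
        svc.init(confArgs);
        svc.start();
        return svc;
    }

    public static void main(String[] args) throws Exception {
        // First argument is the service classname; the rest are config options.
        Service svc = launch(args[0], java.util.Arrays.copyOfRange(args, 1, args.length));
        svc.waitForStopped();   // block until stopped; Ctrl-C triggers the hook
    }
}
```

The key design point is that the interrupt handler only signals `stop()`; the waiting thread owns the actual teardown ordering.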
[jira] [Updated] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-4014: Attachment: 0003-YARN-4014.patch Updating the patch to check only for the ACCEPTED and RUNNING application states before updating the priority of an application. Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4014.patch, 0003-YARN-4014.patch Track the changes for user-RM client protocol, i.e. ApplicationClientProtocol changes and discussions, in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700041#comment-14700041 ] Hadoop QA commented on YARN-4014: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 20m 39s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 7m 54s | There were no new javac warning messages. | | {color:red}-1{color} | javadoc | 9m 55s | The applied patch generated 2 additional warning messages. | | {color:green}+1{color} | release audit | 0m 21s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 2m 35s | The applied patch generated 6 new checkstyle issues (total was 31, now 37). | | {color:green}+1{color} | whitespace | 0m 11s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 25s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 6m 18s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | mapreduce tests | 107m 10s | Tests passed in hadoop-mapreduce-client-jobclient. | | {color:green}+1{color} | yarn tests | 0m 27s | Tests passed in hadoop-yarn-api. | | {color:green}+1{color} | yarn tests | 6m 56s | Tests passed in hadoop-yarn-client. | | {color:red}-1{color} | yarn tests | 2m 0s | Tests failed in hadoop-yarn-common. | | {color:red}-1{color} | yarn tests | 53m 11s | Tests failed in hadoop-yarn-server-resourcemanager. 
| | | | 220m 12s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.util.TestRackResolver | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750823/0002-YARN-4014.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 13604bd | | javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/8862/artifact/patchprocess/diffJavadocWarnings.txt | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8862/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | hadoop-mapreduce-client-jobclient test log | https://builds.apache.org/job/PreCommit-YARN-Build/8862/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8862/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/8862/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8862/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8862/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8862/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8862/console | This message was automatically generated. 
Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4014.patch, 0003-YARN-4014.patch Track the changes for user-RM client protocol i.e ApplicationClientProtocol changes and discussions in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3545) Investigate the concurrency issue with the map of timeline collector
[ https://issues.apache.org/jira/browse/YARN-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699964#comment-14699964 ] Xuan Gong commented on YARN-3545: - [~gtCarrera9], Thanks for the patch. I have a question about it: I am not sure why we need this check {code} TimelineCollector prevCollectorInTable = collectors.putIfAbsent(appId, collector); // if a previous (as in synchronization order) collector exists, we // should shut down the newly created collector since it will not be // published. if (prevCollectorInTable != null) { collector.stop(); collectorInTable = prevCollectorInTable; initializationBarrier(collectorInTable); } else { {code} ? I think the question is whether it is possible for multiple threads to call putIfAbsent at the same time. It looks like it will only be called if the container is the AM. Investigate the concurrency issue with the map of timeline collector Key: YARN-3545 URL: https://issues.apache.org/jira/browse/YARN-3545 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Li Lu Attachments: YARN-3545-YARN-2928.000.patch See the discussion in YARN-3390 for details. Let's continue the discussion here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
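For context, the pattern the quoted check implements can be reduced to the sketch below: even if only AM containers trigger collector creation, two callers could in principle construct a collector for the same appId concurrently; `putIfAbsent` admits exactly one into the map, and the loser must be stopped so it does not leak. `Collector` and `CollectorMapSketch` here are hypothetical stand-ins, not the real `TimelineCollector` classes from the patch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the race the quoted code defends against: putIfAbsent lets
// exactly one collector per appId into the map; losers are stopped.
public class CollectorMapSketch {
    static class Collector {
        final AtomicInteger stopCalls = new AtomicInteger();
        void stop() { stopCalls.incrementAndGet(); }
    }

    private final ConcurrentHashMap<String, Collector> collectors = new ConcurrentHashMap<>();

    Collector getOrCreate(String appId) {
        Collector created = new Collector();               // may turn out redundant
        Collector prev = collectors.putIfAbsent(appId, created);
        if (prev != null) {
            created.stop();   // lost the race: discard the extra collector
            return prev;      // every caller ends up sharing the winner
        }
        return created;
    }

    int size() { return collectors.size(); }
}
```

Whether such concurrent callers actually exist in the NM is exactly the question being asked; the guard is cheap either way.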
[jira] [Commented] (YARN-4053) Change the way metric values are stored in HBase Storage
[ https://issues.apache.org/jira/browse/YARN-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699971#comment-14699971 ] Li Lu commented on YARN-4053: - bq. Oh sorry in my original comment, I meant that we can carry this logic over to other columns(not only metrics). This looks good. bq. I agree we can include type information in TimelineMetric object itself. That will be better. Actually I believe we *are* carrying type information in TimelineMetrics, in the original (boxed) Java form. For now I think we're fine to move forward with float and long. Change the way metric values are stored in HBase Storage Key: YARN-4053 URL: https://issues.apache.org/jira/browse/YARN-4053 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena Attachments: YARN-4053-YARN-2928.01.patch Currently the HBase implementation uses GenericObjectMapper to convert and store values in the backend HBase storage. This converts everything into a string representation (ASCII/UTF-8 encoded byte array). While this is fine in most cases, it does not quite serve our use case for metrics. So we need to decide how we are going to encode and decode metric values and store them in HBase. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
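A minimal sketch of why string-encoded numbers are a poor fit for metric values, as the issue description argues: byte-wise comparison of decimal strings disagrees with numeric order, while fixed-width big-endian longs compare correctly byte-by-byte for non-negative values and can be aggregated without parsing. This is plain Java for illustration, not the patch's code or HBase's `Bytes` utility.

```java
import java.nio.ByteBuffer;

// Compares the two encodings of a long under raw-byte (lexicographic)
// ordering, which is the order HBase applies to stored byte arrays.
public class MetricEncodingSketch {
    static byte[] asString(long v) { return Long.toString(v).getBytes(); }
    static byte[] asBigEndianLong(long v) { return ByteBuffer.allocate(8).putLong(v).array(); }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = (a[i] & 0xff) - (b[i] & 0xff);
            if (c != 0) return c;
        }
        return a.length - b.length;
    }
}
```

For example, 9 < 100 numerically, yet the bytes of "9" sort after the bytes of "100", while the 8-byte big-endian encodings preserve the numeric order.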
[jira] [Commented] (YARN-3901) Populate flow run data in the flow_run table
[ https://issues.apache.org/jira/browse/YARN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699978#comment-14699978 ] Vrushali C commented on YARN-3901: -- Thanks [~jrottinghuis] for looking into my WIP patch. I also thought about adding the equals method and will do it in the next patch. I believe the type can be unset for a TimelineEntity, so we do need to handle nulls; I ran into that in my test case. Regarding using tags or Put attributes, here is my take on it: - Accessing attributes of a Put (or Get, or any mutation) is much cleaner than accessing cell tags. - But attributes are common to the entire Put (or that mutation), whereas tags are specific to a cell (exactly one cell). - Attributes of a mutation are not persisted into HBase; cell tags are. So I would like to use cell tags as often as possible, so that it is perfectly clear that a tag belongs to exactly one cell. I think I need to clean up the code to remove the Put attributes in the ColumnHelper; will do it in the next patch. Populate flow run data in the flow_run table Key: YARN-3901 URL: https://issues.apache.org/jira/browse/YARN-3901 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Vrushali C Assignee: Vrushali C Attachments: YARN-3901-YARN-2928.WIP.patch As per the schema proposed in YARN-3815 in https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf filing jira to track creation and population of data in the flow run table. Some points that are being considered: - Stores per flow run information aggregated across applications, flow version RM’s collector writes to on app creation and app completion - Per App collector writes to it for metric updates at a slower frequency than the metric updates to application table primary key: cluster ! user ! flow !
flow run id - Only the latest version of flow-level aggregated metrics will be kept, even if the entity and application level keep a timeseries. - The running_apps column will be incremented on app creation, and decremented on app completion. - For min_start_time the RM writer will simply write a value with the tag for the applicationId. A coprocessor will return the min value of all written values. - Upon flush and compactions, the min value between all the cells of this column will be written to the cell without any tag (empty tag) and all the other cells will be discarded. - Ditto for the max_end_time, but then the max will be kept. - Tags are represented as #type:value. The type can be not set (0), or can indicate running (1) or complete (2). In those cases (for metrics) only complete app metrics are collapsed on compaction. - The m! values are aggregated (summed) upon read. Only when applications are completed (indicated by tag type 2) can the values be collapsed. - The application ids that have completed and been aggregated into the flow numbers are retained in a separate column for historical tracking: we don’t want to re-aggregate for those upon replay -- This message was sent by Atlassian JIRA (v6.3.4#6332)
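The min_start_time scheme in the points above can be sketched in plain Java as follows. This is a hypothetical stand-in for the coprocessor logic, assuming the behavior described in the bullets (each app writes a cell tagged with its applicationId; reads return the minimum; flush/compaction keeps only the minimum as a single untagged cell), not the actual HBase implementation.

```java
import java.util.Collections;
import java.util.List;

// Illustrative model of the tagged-cell min aggregation described above.
public class MinStartTimeSketch {
    static final class Cell {
        final String tag;   // applicationId, or "" once collapsed
        final long value;
        Cell(String tag, long value) { this.tag = tag; this.value = value; }
    }

    // Read path: aggregate the min across all cells of the column.
    static long readMin(List<Cell> cells) {
        long min = Long.MAX_VALUE;
        for (Cell c : cells) min = Math.min(min, c.value);
        return min;
    }

    // Flush/compaction path: keep only the min as a single cell with an
    // empty tag; all other cells are discarded.
    static List<Cell> collapse(List<Cell> cells) {
        return Collections.singletonList(new Cell("", readMin(cells)));
    }
}
```

The max_end_time column would follow the same shape with `Math.max` in place of `Math.min`.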
[jira] [Commented] (YARN-4025) Deal with byte representations of Longs in writer code
[ https://issues.apache.org/jira/browse/YARN-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1472#comment-1472 ] Sangjin Lee commented on YARN-4025: --- That's a good point [~gtCarrera9]. Let me see if I can improve on that point. Deal with byte representations of Longs in writer code -- Key: YARN-4025 URL: https://issues.apache.org/jira/browse/YARN-4025 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Vrushali C Assignee: Sangjin Lee Attachments: YARN-4025-YARN-2928.001.patch, YARN-4025-YARN-2928.002.patch Timestamps are being stored as Longs in hbase by the HBaseTimelineWriterImpl code. There seem to be some places in the code where there are conversions between Long to byte[] to String for easier argument passing between function calls. Then these values end up being converted back to byte[] while storing in hbase. It would be better to pass around byte[] or the Longs themselves as applicable. This may result in some api changes (store function) as well in adding a few more function calls like getColumnQualifier which accepts a pre-encoded byte array. It will be in addition to the existing api which accepts a String and the ColumnHelper to return a byte[] column name instead of a String one. Filing jira to track these changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1461#comment-1461 ] Rohith Sharma K S commented on YARN-4014: - If the application is in the SUBMITTED state, update priority should not be called, because the application would not yet be added to the scheduler. In the ACCEPTED state, update priority can be called. One doubt Jian He has is that if the application is in the ACCEPTED state, the application attempt would not be created. I rechecked the code flow; we can do the update in the ACCEPTED state even though the attempt is not created. IIRC, while doing YARN-3887, we discussed this specific scenario and handled adding a *null* entry to SchedulableEntity. Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4014.patch Track the changes for user-RM client protocol, i.e. ApplicationClientProtocol changes and discussions, in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
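The state rule discussed in this thread (reject priority updates before the application is in the scheduler, allow them from ACCEPTED onward while still running) amounts to a simple guard. The sketch below is illustrative only: the enum is a trimmed-down hypothetical, not the full `YarnApplicationState`, and the rule reflects this discussion rather than any final patch.

```java
import java.util.EnumSet;

// Guard for application priority updates: only ACCEPTED and RUNNING apps
// are in the scheduler, so only they can have their priority changed.
public class PriorityUpdateGuard {
    enum AppState { NEW, SUBMITTED, ACCEPTED, RUNNING, FINISHED }

    private static final EnumSet<AppState> UPDATABLE =
            EnumSet.of(AppState.ACCEPTED, AppState.RUNNING);

    static boolean canUpdatePriority(AppState state) {
        return UPDATABLE.contains(state);
    }
}
```

Note the subtlety raised above: in ACCEPTED the application exists in the scheduler even though its attempt may not, which is why the attempt-less case still needs handling.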
[jira] [Commented] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup
[ https://issues.apache.org/jira/browse/YARN-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699111#comment-14699111 ] Naganarasimha G R commented on YARN-2923: - Hi [~leftnoteasy], thanks for the comments. bq. I prefer to keep it unchanged instead of reset labels to empty. IAW, reset invalid labels to empty and send to RM seems a little over-kill to me. Yes, this is a debatable topic: when we consider partition labels it seems unnecessary, but when it comes to constraints I think it would be required, as per the earlier examples given. One thought I had was that for constraints we could group them, allow one value in the group to be given, and on invalid labels remove the labels related to that group. As there is one more jira (YARN-3506) for error handling, I thought of discussing this topic further there; as far as this jira goes, I will send nulls on NM label validation failure. I have also handled {{NMDistributedNodeLabelsHandler}} as discussed. The {{TestYarnConfigurationFields}} test case failure has been corrected; {{TestRackResolver}} seems unrelated to this jira and passes locally. The checkstyle issue is due to the number of lines in YarnConfiguration. Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup - Key: YARN-2923 URL: https://issues.apache.org/jira/browse/YARN-2923 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Naganarasimha G R Assignee: Naganarasimha G R Fix For: 2.8.0 Attachments: YARN-2923.20141204-1.patch, YARN-2923.20141210-1.patch, YARN-2923.20150328-1.patch, YARN-2923.20150404-1.patch, YARN-2923.20150517-1.patch, YARN-2923.20150817-1.patch As part of Distributed Node Labels configuration we need to support Node labels to be configured in yarn-site.xml.
And on modification of the Node Labels configuration in yarn-site.xml, the NM should be able to get the modified Node labels from this NodeLabelsProvider service without an NM restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3045) [Event producers] Implement NM writing container lifecycle events to ATS
[ https://issues.apache.org/jira/browse/YARN-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700123#comment-14700123 ] Junping Du commented on YARN-3045: -- Got confirmation from Vinod that the new YARN-2928 branch is good to go. Will commit it soon. [Event producers] Implement NM writing container lifecycle events to ATS Key: YARN-3045 URL: https://issues.apache.org/jira/browse/YARN-3045 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Sangjin Lee Assignee: Naganarasimha G R Attachments: YARN-3045-YARN-2928.002.patch, YARN-3045-YARN-2928.003.patch, YARN-3045-YARN-2928.004.patch, YARN-3045-YARN-2928.005.patch, YARN-3045-YARN-2928.006.patch, YARN-3045-YARN-2928.007.patch, YARN-3045-YARN-2928.008.patch, YARN-3045-YARN-2928.009.patch, YARN-3045-YARN-2928.010.patch, YARN-3045-YARN-2928.011.patch, YARN-3045.20150420-1.patch Per design in YARN-2928, implement NM writing container lifecycle events and container system metrics to ATS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4057) If ContainersMonitor is not enabled, only print related log info one time
[ https://issues.apache.org/jira/browse/YARN-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700068#comment-14700068 ] zhihai xu commented on YARN-4057: - Thanks for reporting and working on the issue. It is a good catch. This is also an optimization even if ContainersMonitor is enabled, because we don't need to call {{isEnabled}} every time the ContainersMonitorEvent handler is called. The patch is a minor optimization and also a very safe change. +1 for the patch. If there are no objections, I will commit it tomorrow. If ContainersMonitor is not enabled, only print related log info one time - Key: YARN-4057 URL: https://issues.apache.org/jira/browse/YARN-4057 Project: Hadoop YARN Issue Type: Improvement Components: nodemanager Reporter: Jun Gong Assignee: Jun Gong Priority: Minor Attachments: YARN-4057.01.patch ContainersMonitorImpl will check whether it is enabled when handling every event, and it will print the following message again and again if not enabled: {quote} 2015-08-17 13:20:13,792 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Neither virutal-memory nor physical-memory is needed. Not running the monitor-thread {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
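The shape of the change under review can be sketched as below: compute the enabled flag once at service init, log the message once at that point, and reduce the per-event check to a cheap field read. The class and field names are illustrative stand-ins, not the actual `ContainersMonitorImpl` code; the log call is replaced by a counter so the once-only behavior is observable.

```java
// Log-once / check-once sketch of the ContainersMonitor optimization.
public class MonitorEnabledSketch {
    private boolean enabled;
    int warningsLogged;    // stands in for LOG.info, so the effect is countable
    int eventsMonitored;

    void serviceInit(boolean pmemCheck, boolean vmemCheck) {
        enabled = pmemCheck || vmemCheck;
        if (!enabled) {
            warningsLogged++;  // "Neither ... is needed." logged exactly once, at init
        }
    }

    void handle(Object event) {
        if (!enabled) {
            return;            // cheap field read; no per-event recomputation or logging
        }
        eventsMonitored++;
    }
}
```

Whether the flag is derived from the vmem/pmem checks exactly this way is an assumption; the point is moving the derivation and the log out of the event path.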
[jira] [Commented] (YARN-3814) REST API implementation for getting raw entities in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700236#comment-14700236 ] Sangjin Lee commented on YARN-3814: --- I have no other comments. Thanks! REST API implementation for getting raw entities in TimelineReader -- Key: YARN-3814 URL: https://issues.apache.org/jira/browse/YARN-3814 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena Attachments: YARN-3814-YARN-2928.01.patch, YARN-3814-YARN-2928.02.patch, YARN-3814-YARN-2928.03.patch, YARN-3814-YARN-2928.04.patch, YARN-3814.reference.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700247#comment-14700247 ] Sangjin Lee commented on YARN-3904: --- The latest patch (v.9) LGTM. Any other comments? Refactor timelineservice.storage to add support to online and offline aggregation writers - Key: YARN-3904 URL: https://issues.apache.org/jira/browse/YARN-3904 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Li Lu Assignee: Li Lu Attachments: YARN-3904-YARN-2928.001.patch, YARN-3904-YARN-2928.002.patch, YARN-3904-YARN-2928.003.patch, YARN-3904-YARN-2928.004.patch, YARN-3904-YARN-2928.005.patch, YARN-3904-YARN-2928.006.patch, YARN-3904-YARN-2928.007.patch, YARN-3904-YARN-2928.008.patch, YARN-3904-YARN-2928.009.patch After we finished the design for time-based aggregation, we can adopt our existing Phoenix storage into the storage of the aggregated data. In this JIRA, I'm proposing to refactor writers to add support to aggregation writers. Offline aggregation writers typically have less contextual information. We can distinguish these writers by special naming. We can also use CollectorContexts to model all contextual information and use it in our writer interfaces. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3904) Refactor timelineservice.storage to add support to online and offline aggregation writers
[ https://issues.apache.org/jira/browse/YARN-3904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700250#comment-14700250 ] Vrushali C commented on YARN-3904: -- Thanks Li, the latest patch looks good to me. Refactor timelineservice.storage to add support to online and offline aggregation writers - Key: YARN-3904 URL: https://issues.apache.org/jira/browse/YARN-3904 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Li Lu Assignee: Li Lu Attachments: YARN-3904-YARN-2928.001.patch, YARN-3904-YARN-2928.002.patch, YARN-3904-YARN-2928.003.patch, YARN-3904-YARN-2928.004.patch, YARN-3904-YARN-2928.005.patch, YARN-3904-YARN-2928.006.patch, YARN-3904-YARN-2928.007.patch, YARN-3904-YARN-2928.008.patch, YARN-3904-YARN-2928.009.patch After we finished the design for time-based aggregation, we can adopt our existing Phoenix storage into the storage of the aggregated data. In this JIRA, I'm proposing to refactor writers to add support to aggregation writers. Offline aggregation writers typically have less contextual information. We can distinguish these writers by special naming. We can also use CollectorContexts to model all contextual information and use it in our writer interfaces. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4014) Support user cli interface in for Application Priority
[ https://issues.apache.org/jira/browse/YARN-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700252#comment-14700252 ] Jian He commented on YARN-4014: --- bq. handled null entry adding to SchedulableEntity. I think simply ignoring the null entry is not enough. That means the updated priority is lost. We need to handle this too. We may inherit the priority from the SchedulerApplication when schedulerApplicationAttempt is created. Support user cli interface in for Application Priority -- Key: YARN-4014 URL: https://issues.apache.org/jira/browse/YARN-4014 Project: Hadoop YARN Issue Type: Sub-task Components: client, resourcemanager Reporter: Rohith Sharma K S Assignee: Rohith Sharma K S Attachments: 0001-YARN-4014-V1.patch, 0001-YARN-4014.patch, 0002-YARN-4014.patch, 0003-YARN-4014.patch Track the changes for user-RM client protocol i.e ApplicationClientProtocol changes and discussions in this jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3901) Populate flow run data in the flow_run table
[ https://issues.apache.org/jira/browse/YARN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700255#comment-14700255 ] Joep Rottinghuis commented on YARN-3901: After discussing with [~vrushalic] we concluded the following:
- Let's keep the tag as an implementation detail in the coprocessor.
- Let's add a Map<String, byte[]> attributes argument to store for columns (and column prefixes) in order to pass values along.
- Columns themselves know how to add additional attributes, namely the operation if needed: MIN, MAX, AGG.
- The coprocessor will map these values to tags and store them.
- Given that pre-put is evaluated for multiple items in a batch, reading during pre-put will yield incorrect results (even though it appears safe with a flush of the BufferedMutator). Therefore we need to switch to just adding a tag to a cell in pre-put and collapsing min and max during read (flush and compactions).
- Add a Compact attribute in order to indicate that an app is done (thereby separating whether a value can be aggregated or not). Write this only for the last write, so that we don't store tags for default/common values, thereby keeping storage smaller.
- We don't need TimelineWriterUtils.join.
- We don't need TimelineWriterUtils.ONE_IN_BYTES.
- Collapse the WIP storeWithTags into simply store.
- The coprocessor needs to detect if it is going from one column qualifier to the next. The peek method just ensures that the iteration stays within the row. Need to sit and think through how to do that most cleanly, perhaps with peek being able to show only the same column based on an argument?
Populate flow run data in the flow_run table Key: YARN-3901 URL: https://issues.apache.org/jira/browse/YARN-3901 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Vrushali C Assignee: Vrushali C Attachments: YARN-3901-YARN-2928.WIP.patch As per the schema proposed in YARN-3815 in https://issues.apache.org/jira/secure/attachment/12743391/hbase-schema-proposal-for-aggregation.pdf filing jira to track creation and population of data in the flow run table. Some points that are being considered:
- Stores per flow run information aggregated across applications, plus the flow version.
- The RM’s collector writes to it on app creation and app completion.
- The per-app collector writes to it for metric updates at a slower frequency than the metric updates to the application table.
- Primary key: cluster ! user ! flow ! flow run id.
- Only the latest version of flow-level aggregated metrics will be kept, even if the entity and application level keep a timeseries.
- The running_apps column will be incremented on app creation, and decremented on app completion.
- For min_start_time the RM writer will simply write a value with the tag for the applicationId. A coprocessor will return the min value of all written values.
- Upon flush and compactions, the min value between all the cells of this column will be written to the cell without any tag (empty tag) and all the other cells will be discarded.
- Ditto for the max_end_time, but then the max will be kept.
- Tags are represented as #type:value. The type can be not set (0), or can indicate running (1) or complete (2). In those cases (for metrics) only complete app metrics are collapsed on compaction.
- The m! values are aggregated (summed) upon read. Only when applications are completed (indicated by tag type 2) can the values be collapsed.
- The application ids that have completed and been aggregated into the flow numbers are retained in a separate column for historical tracking: we don’t want to re-aggregate for those upon replay -- This message was sent by Atlassian JIRA (v6.3.4#6332)
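The min_start_time collapse described above can be modeled in a few lines of plain Java. This is an illustrative sketch only — the real implementation is an HBase coprocessor operating on tagged Cells during flush/compaction — with simplified stand-in types:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of the min_start_time collapse from YARN-3901: each
// writer stores a value tagged with its applicationId; on flush/compaction
// only a single untagged cell holding the minimum survives.
class TaggedCell {
    final long value;
    final String tag; // e.g. the applicationId; empty after collapse
    TaggedCell(long value, String tag) { this.value = value; this.tag = tag; }
}

class MinCollapser {
    // Keep only the minimum, written with an empty tag; all other cells
    // for the column are discarded.
    static List<TaggedCell> collapseMin(List<TaggedCell> cells) {
        List<TaggedCell> out = new ArrayList<>();
        if (cells.isEmpty()) {
            return out;
        }
        long min = Long.MAX_VALUE;
        for (TaggedCell c : cells) {
            min = Math.min(min, c.value);
        }
        out.add(new TaggedCell(min, "")); // single untagged cell survives
        return out;
    }
}
```

The max_end_time column would be handled identically with `Math.max`, per the "ditto" point above.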
[jira] [Commented] (YARN-3901) Populate flow run data in the flow_run table
[ https://issues.apache.org/jira/browse/YARN-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700264#comment-14700264 ] Joep Rottinghuis commented on YARN-3901: peekNext also has to deal with limit correctly. Populate flow run data in the flow_run table Key: YARN-3901 URL: https://issues.apache.org/jira/browse/YARN-3901 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Vrushali C Assignee: Vrushali C Attachments: YARN-3901-YARN-2928.WIP.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
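The peekNext behavior discussed in these comments — look ahead in the cell iteration, but only within the current row, while also respecting the scan limit — can be sketched as follows. The types are simplified stand-ins, not the actual coprocessor code:

```java
import java.util.List;

// Sketch of a peekNext that stays within the current row and never looks
// past the scan limit, per the YARN-3901 discussion. SimpleCell stands in
// for an HBase Cell.
class SimpleCell {
    final String row, qualifier;
    SimpleCell(String row, String qualifier) { this.row = row; this.qualifier = qualifier; }
}

class RowScanner {
    private final List<SimpleCell> cells;
    private final int limit; // max cells visible to the scan
    private int pos = 0;

    RowScanner(List<SimpleCell> cells, int limit) {
        this.cells = cells;
        this.limit = Math.min(limit, cells.size());
    }

    SimpleCell next() { return pos < limit ? cells.get(pos++) : null; }

    // Returns the upcoming cell only if it exists, is within the limit,
    // and belongs to the same row as the current cell; otherwise null.
    SimpleCell peekNext(SimpleCell current) {
        if (pos >= limit) return null;
        SimpleCell ahead = cells.get(pos);
        return ahead.row.equals(current.row) ? ahead : null;
    }
}
```

Detecting a change of column qualifier (the other point raised above) would compare `qualifier` fields the same way the row comparison works here.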
[jira] [Commented] (YARN-3980) Plumb resource-utilization info in node heartbeat through to the scheduler
[ https://issues.apache.org/jira/browse/YARN-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699068#comment-14699068 ] Hadoop QA commented on YARN-3980: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 26s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 4 new or modified test files. | | {color:green}+1{color} | javac | 8m 24s | There were no new javac warning messages. | | {color:red}-1{color} | javadoc | 10m 32s | The applied patch generated 1 additional warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 14s | The applied patch generated 7 new checkstyle issues (total was 263, now 270). | | {color:green}+1{color} | whitespace | 0m 4s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 26s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | tools/hadoop tests | 0m 53s | Tests passed in hadoop-sls. | | {color:red}-1{color} | yarn tests | 54m 15s | Tests failed in hadoop-yarn-server-resourcemanager. 
| | | | 98m 49s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12750748/YARN-3980-v1.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 13604bd | | javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/8857/artifact/patchprocess/diffJavadocWarnings.txt | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8857/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt | | hadoop-sls test log | https://builds.apache.org/job/PreCommit-YARN-Build/8857/artifact/patchprocess/testrun_hadoop-sls.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8857/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8857/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8857/console | This message was automatically generated. Plumb resource-utilization info in node heartbeat through to the scheduler -- Key: YARN-3980 URL: https://issues.apache.org/jira/browse/YARN-3980 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.7.1 Reporter: Karthik Kambatla Assignee: Inigo Goiri Attachments: YARN-3980-v0.patch, YARN-3980-v1.patch, YARN-3980-v2.patch YARN-1012 and YARN-3534 collect resource utilization information for all containers and the node respectively and send it to the RM on node heartbeat. We should plumb it through to the scheduler so the scheduler can make use of it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3863) Enhance filters in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699087#comment-14699087 ] Varun Saxena commented on YARN-3863: Sorry, I should say: if I trim down the *columns* I get from HBase, as I have done in YARN-3862, the *columns* required to apply those filters may not be available. Enhance filters in TimelineReader - Key: YARN-3863 URL: https://issues.apache.org/jira/browse/YARN-3863 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena Currently filters in the timeline reader will return an entity only if all the filter conditions hold true, i.e. only the AND operation is supported. We can support the OR operation for the filters as well. Additionally, as the primary backend implementation is HBase, we can design our filters in a manner where they closely resemble HBase Filters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
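The AND-only behavior described in the issue, and the proposed OR support, can be modeled with java.util.function.Predicate. This is a sketch of the semantics only; an HBase-backed design would presumably mirror HBase's FilterList with its MUST_PASS_ALL/MUST_PASS_ONE operators, and the Map-based entity below is a stand-in for TimelineEntity:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Model of combining timeline reader filters with AND or OR semantics,
// loosely mirroring HBase FilterList operators (YARN-3863).
class FilterCombiner {
    enum Operator { MUST_PASS_ALL, MUST_PASS_ONE } // AND vs OR

    static Predicate<Map<String, String>> combine(
            Operator op, List<Predicate<Map<String, String>>> filters) {
        return entity -> {
            for (Predicate<Map<String, String>> f : filters) {
                boolean match = f.test(entity);
                if (op == Operator.MUST_PASS_ALL && !match) return false; // AND short-circuit
                if (op == Operator.MUST_PASS_ONE && match) return true;   // OR short-circuit
            }
            // Empty or exhausted list: AND passes, OR fails.
            return op == Operator.MUST_PASS_ALL;
        };
    }
}
```

Nesting combined predicates would then give arbitrary AND/OR filter trees, which is the expressiveness HBase FilterList provides.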
[jira] [Updated] (YARN-2923) Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup
[ https://issues.apache.org/jira/browse/YARN-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-2923: Attachment: YARN-2923.20150817-1.patch Support configuration based NodeLabelsProvider Service in Distributed Node Label Configuration Setup - Key: YARN-2923 URL: https://issues.apache.org/jira/browse/YARN-2923 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Naganarasimha G R Assignee: Naganarasimha G R Fix For: 2.8.0 Attachments: YARN-2923.20141204-1.patch, YARN-2923.20141210-1.patch, YARN-2923.20150328-1.patch, YARN-2923.20150404-1.patch, YARN-2923.20150517-1.patch, YARN-2923.20150817-1.patch As part of the distributed node labels configuration we need to support node labels being configured in yarn-site.xml. And on modification of the node labels configuration in yarn-site.xml, the NM should be able to get the modified node labels from this NodeLabelsProvider service without an NM restart. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3980) Plumb resource-utilization info in node heartbeat through to the scheduler
[ https://issues.apache.org/jira/browse/YARN-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Inigo Goiri updated YARN-3980: -- Attachment: YARN-3980-v2.patch Packing elements of RMNodeStatusEvent into NodeStatus Plumb resource-utilization info in node heartbeat through to the scheduler -- Key: YARN-3980 URL: https://issues.apache.org/jira/browse/YARN-3980 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager, scheduler Affects Versions: 2.7.1 Reporter: Karthik Kambatla Assignee: Inigo Goiri Attachments: YARN-3980-v0.patch, YARN-3980-v1.patch, YARN-3980-v2.patch YARN-1012 and YARN-3534 collect resource utilization information for all containers and the node respectively and send it to the RM on node heartbeat. We should plumb it through to the scheduler so the scheduler can make use of it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3863) Enhance filters in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14699074#comment-14699074 ] Varun Saxena commented on YARN-3863: [~jrottinghuis], what I meant by saying YARN-3862 depends on this one is that the reader's get-entity flow will not work properly until this one goes in, in conjunction with YARN-3862. This is because the existing filters (not HBase filters) are currently applied after getting rows from HBase. So if I trim down the rows I get from HBase, as I have done in YARN-3862, the rows required to apply those filters may not be available. The reason I had segregated these JIRAs was that YARN-3862 adds additional filters (to determine which configs and metrics to fetch from HBase) and this one modifies existing ones. And as we will be modifying existing filters, there will be a lot of change in the FS reader implementation as well. Maybe we can realign these JIRAs: just have the filter model in one JIRA and the implementation in another, and if necessary break it up even further. Enhance filters in TimelineReader - Key: YARN-3863 URL: https://issues.apache.org/jira/browse/YARN-3863 Project: Hadoop YARN Issue Type: Sub-task Affects Versions: YARN-2928 Reporter: Varun Saxena Assignee: Varun Saxena -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4025) Deal with byte representations of Longs in writer code
[ https://issues.apache.org/jira/browse/YARN-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-4025: -- Attachment: YARN-4025-YARN-2928.003.patch v.3 patch posted. - added more javadoc to explain the format of the event column name - changed the signature of {{readResultsHavingCompoundQualifers()}} to make the key type ({{byte[][]}}) explicit - updated the event/application table to reflect the new separator value Deal with byte representations of Longs in writer code -- Key: YARN-4025 URL: https://issues.apache.org/jira/browse/YARN-4025 Project: Hadoop YARN Issue Type: Sub-task Components: timelineserver Reporter: Vrushali C Assignee: Sangjin Lee Attachments: YARN-4025-YARN-2928.001.patch, YARN-4025-YARN-2928.002.patch, YARN-4025-YARN-2928.003.patch Timestamps are being stored as Longs in HBase by the HBaseTimelineWriterImpl code. There seem to be some places in the code where there are conversions from Long to byte[] to String for easier argument passing between function calls. Then these values end up being converted back to byte[] while storing in HBase. It would be better to pass around byte[] or the Longs themselves as applicable. This may result in some API changes (the store function) as well as adding a few more function calls like getColumnQualifier which accepts a pre-encoded byte array. These will be in addition to the existing API which accepts a String, and to the ColumnHelper returning a byte[] column name instead of a String one. Filing this JIRA to track these changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
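The direct Long <-> byte[] conversion the issue calls for can be done with java.nio, avoiding the Long -> String -> byte[] detour; HBase's own Bytes.toBytes(long)/Bytes.toLong(byte[]) utilities play the same role in the actual writer code:

```java
import java.nio.ByteBuffer;

// Direct Long <-> byte[] conversion, avoiding the String round trip
// described in YARN-4025. Equivalent in spirit to HBase's
// Bytes.toBytes(long) / Bytes.toLong(byte[]).
class LongCodec {
    static byte[] toBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    static long toLong(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }
}
```

Big-endian encoding (ByteBuffer's default) also preserves the sort order of non-negative timestamps, which matters when the bytes become part of an HBase column qualifier.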
[jira] [Updated] (YARN-4033) In FairScheduler, parent queues should also display queue status
[ https://issues.apache.org/jira/browse/YARN-4033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anubhav Dhoot updated YARN-4033: Component/s: fairscheduler In FairScheduler, parent queues should also display queue status - Key: YARN-4033 URL: https://issues.apache.org/jira/browse/YARN-4033 Project: Hadoop YARN Issue Type: Task Components: fairscheduler Reporter: Siqi Li Assignee: Siqi Li Attachments: Screen Shot 2015-08-07 at 2.04.04 PM.png, YARN-4033.v1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3545) Investigate the concurrency issue with the map of timeline collector
[ https://issues.apache.org/jira/browse/YARN-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14700222#comment-14700222 ] Hadoop QA commented on YARN-3545: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12732071/YARN-3545-YARN-2928.000.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | YARN-2928 / f40c735 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8868/console | This message was automatically generated. Investigate the concurrency issue with the map of timeline collector Key: YARN-3545 URL: https://issues.apache.org/jira/browse/YARN-3545 Project: Hadoop YARN Issue Type: Sub-task Reporter: Zhijie Shen Assignee: Li Lu Attachments: YARN-3545-YARN-2928.000.patch See the discussion in YARN-3390 for details. Let's continue the discussion here. -- This message was sent by Atlassian JIRA (v6.3.4#6332)