[jira] [Commented] (YARN-3840) Resource Manager web ui issue when sorting application by id (with application having id 9999)
[ https://issues.apache.org/jira/browse/YARN-3840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604020#comment-14604020 ] Mohammad Shahid Khan commented on YARN-3840: The DataTables string sort algorithm has a limitation: it cannot properly sort strings that combine text and numeric values. The application id has the form application_numericValue, which is why the sort is not working properly. To fix this we can use the DataTables plugin's natural sort algorithm. {code} sb.append("[\n") .append("{'sType': 'natural', 'aTargets': [0]") .append(", 'mRender': parseHadoopID }") {code} plugin - ref: https://github.com/DataTables/Plugins/blob/1.10.7/sorting/natural.js Resource Manager web ui issue when sorting application by id (with application having id 9999) Key: YARN-3840 URL: https://issues.apache.org/jira/browse/YARN-3840 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Environment: Centos 6.6 Java 1.7 Reporter: LINTE Attachments: RMApps.png On the WEBUI, the global main view page : http://resourcemanager:8088/cluster/apps doesn't display applications over 9999. With the command line it works (# yarn application -list). Regards, Alexandre -- This message was sent by Atlassian JIRA (v6.3.4#6332)
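To make the limitation above concrete (an illustration only, not part of any patch): plain lexicographic comparison flips the order of application ids once the numeric suffix grows from four to five digits, which is exactly what the natural sort plugin avoids.
{code}
// Minimal demonstration of why the default string sort mis-orders app ids:
// the comparison reaches '9' vs '1' in the counter part, so the 10000th
// application is placed before the 9999th one.
public class AppIdSortSketch {
  public static void main(String[] args) {
    String a = "application_1435446241489_9999";
    String b = "application_1435446241489_10000";
    // Prints a positive number, i.e. "..._9999" sorts AFTER "..._10000"
    // lexicographically, although 9999 < 10000 numerically.
    System.out.println(a.compareTo(b));
  }
}
{code}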
[jira] [Commented] (YARN-3768) Index out of range exception with environment variables without values
[ https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604027#comment-14604027 ] zhihai xu commented on YARN-3768: - [~jira.shegalov], thanks for the suggestion! The current code only looks up the pattern {{getEnvironmentVariableRegex}} in the value ({{parts[1]}}) and replaces the matched substring with the stored env variable's value. I looked at the Java Matcher class, and I couldn't find a way to do capture groups and replacement at the same time with a single regex. Is it possible to use a single regex with capture groups to do both the split and the replacement with a different variable? If it is possible, could you tell me how to do that? Index out of range exception with environment variables without values -- Key: YARN-3768 URL: https://issues.apache.org/jira/browse/YARN-3768 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.5.0 Reporter: Joe Ferner Assignee: zhihai xu Attachments: YARN-3768.000.patch, YARN-3768.001.patch Looking at line 80 of org.apache.hadoop.yarn.util.Apps an index out of range exception occurs if an environment variable is encountered without a value. I believe this occurs because java will not return empty strings from the split method. Similar to this http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2871) TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk
[ https://issues.apache.org/jira/browse/YARN-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604173#comment-14604173 ] Hudson commented on YARN-2871: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #2169 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2169/]) YARN-2871. TestRMRestart#testRMRestartGetApplicationList sometime fails (xgong: rev fe6c1bd73aee188ed58df4d33bbc2d2fe0779a97) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java * hadoop-yarn-project/CHANGES.txt TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk - Key: YARN-2871 URL: https://issues.apache.org/jira/browse/YARN-2871 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu Assignee: zhihai xu Priority: Minor Fix For: 2.8.0 Attachments: YARN-2871.000.patch, YARN-2871.001.patch, YARN-2871.002.patch From trunk build #746 (https://builds.apache.org/job/Hadoop-Yarn-trunk/746): {code} Failed tests: TestRMRestart.testRMRestartGetApplicationList:957 rMAppManager.logApplicationSummary( isA(org.apache.hadoop.yarn.api.records.ApplicationId) ); Wanted 3 times: - at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:957) But was 2 times: - at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:66) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3850) NM fails to read files from full disks which can lead to container logs being lost and other issues
[ https://issues.apache.org/jira/browse/YARN-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604112#comment-14604112 ] Hudson commented on YARN-3850: -- FAILURE: Integrated in Hadoop-Yarn-trunk #971 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/971/]) YARN-3850. NM fails to read files from full disks which can lead to container logs being lost and other issues. Contributed by Varun Saxena (jlowe: rev 40b256949ad6f6e0dbdd248f2d257b05899f4332) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/RecoveredContainerLaunch.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ContainerLogsUtils.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java NM fails to read files from full disks which can lead to container logs being lost and other issues --- Key: YARN-3850 URL: https://issues.apache.org/jira/browse/YARN-3850 Project: Hadoop YARN Issue Type: Bug Components: log-aggregation, nodemanager Affects Versions: 2.7.0 Reporter: Varun Saxena Assignee: Varun Saxena Priority: Blocker Fix For: 2.7.1 Attachments: YARN-3850.01.patch, YARN-3850.02.patch *Container logs* can be lost if disk has become full(~90% full). When application finishes, we upload logs after aggregation by calling {{AppLogAggregatorImpl#uploadLogsForContainers}}. But this call in turns checks the eligible directories on call to {{LocalDirsHandlerService#getLogDirs}} which in case of disk full would return nothing. So none of the container logs are aggregated and uploaded. But on application finish, we also call {{AppLogAggregatorImpl#doAppLogAggregationPostCleanUp()}}. This deletes the application directory which contains container logs. This is because it calls {{LocalDirsHandlerService#getLogDirsForCleanup}} which returns the full disks as well. So we are left with neither aggregated logs for the app nor the individual container logs for the app. In addition to this, there are 2 more issues : # {{ContainerLogsUtil#getContainerLogDirs}} does not consider full disks so NM will fail to serve up logs from full disks from its web interfaces. # {{RecoveredContainerLaunch#locatePidFile}} also does not consider full disks so it is possible that on container recovery, PID file is not found. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
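A rough sketch of the direction such a fix can take (the class and method names below are assumptions for illustration, not the actual LocalDirsHandlerService API): keep separate views of the log directories, one that excludes full disks for new writes and one that still includes them for reads, so existing logs can be served and aggregated.
{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: distinguish "dirs safe for new writes" from
// "dirs to scan when reading existing logs", so a disk that went full
// is no longer invisible to log serving and aggregation.
class LogDirsSketch {
  private final List<String> goodLogDirs = new ArrayList<>(); // below the disk-full threshold
  private final List<String> fullLogDirs = new ArrayList<>(); // above the threshold

  /** Directories where new container logs may be created. */
  List<String> getLogDirsForWrite() {
    return new ArrayList<>(goodLogDirs);
  }

  /** Directories to consult when reading or aggregating existing logs. */
  List<String> getLogDirsForRead() {
    List<String> all = new ArrayList<>(goodLogDirs);
    all.addAll(fullLogDirs);
    return all;
  }
}
{code}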
[jira] [Commented] (YARN-2871) TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk
[ https://issues.apache.org/jira/browse/YARN-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604111#comment-14604111 ] Hudson commented on YARN-2871: -- FAILURE: Integrated in Hadoop-Yarn-trunk #971 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/971/]) YARN-2871. TestRMRestart#testRMRestartGetApplicationList sometime fails (xgong: rev fe6c1bd73aee188ed58df4d33bbc2d2fe0779a97) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java * hadoop-yarn-project/CHANGES.txt TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk - Key: YARN-2871 URL: https://issues.apache.org/jira/browse/YARN-2871 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu Assignee: zhihai xu Priority: Minor Fix For: 2.8.0 Attachments: YARN-2871.000.patch, YARN-2871.001.patch, YARN-2871.002.patch From trunk build #746 (https://builds.apache.org/job/Hadoop-Yarn-trunk/746): {code} Failed tests: TestRMRestart.testRMRestartGetApplicationList:957 rMAppManager.logApplicationSummary( isA(org.apache.hadoop.yarn.api.records.ApplicationId) ); Wanted 3 times: - at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:957) But was 2 times: - at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:66) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3846) RM Web UI queue filter not working
[ https://issues.apache.org/jira/browse/YARN-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604131#comment-14604131 ] Mohammad Shahid Khan commented on YARN-3846: https://issues.apache.org/jira/browse/YARN-2238 is for a different fix. The label Queue: was added before the queue name in https://issues.apache.org/jira/browse/YARN-3362. After that the search was not working, and that has been handled in https://issues.apache.org/jira/browse/YARN-3707. The fix is OK for the first child of the queue but will not work for deeper children, for example Queue: b.x. Given the hierarchy Queue: root with children Queue: a and Queue: b, where Queue: b has children Queue: b.x and Queue: b.y: _*for queue a it will work fine*_ _*but for queues b.x and b.y this will not work*_. *My question:* What is the significance of adding the label *Queue:* before the queue name? RM Web UI queue filter not working --- Key: YARN-3846 URL: https://issues.apache.org/jira/browse/YARN-3846 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.7.0 Reporter: Mohammad Shahid Khan Assignee: Mohammad Shahid Khan Clicking on the root queue shows all applications, but clicking on a leaf queue does not filter the applications belonging to the clicked queue. The regular expression seems to be wrong: {code} q = '^' + q.substr(q.lastIndexOf(':') + 2) + '$'; {code} For example: 1. Suppose the queue name is b; then the above expression will substr at index 1, because q.lastIndexOf(':') = -1 and -1 + 2 = 1, which is wrong; it should look at index 0. 2. If the queue name is ab.x then it will parse it to .x but it should be x. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3850) NM fails to read files from full disks which can lead to container logs being lost and other issues
[ https://issues.apache.org/jira/browse/YARN-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604166#comment-14604166 ] Hudson commented on YARN-3850: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #230 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/230/]) YARN-3850. NM fails to read files from full disks which can lead to container logs being lost and other issues. Contributed by Varun Saxena (jlowe: rev 40b256949ad6f6e0dbdd248f2d257b05899f4332) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/RecoveredContainerLaunch.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ContainerLogsUtils.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java NM fails to read files from full disks which can lead to container logs being lost and other issues --- Key: YARN-3850 URL: https://issues.apache.org/jira/browse/YARN-3850 Project: Hadoop YARN Issue Type: Bug Components: log-aggregation, nodemanager Affects Versions: 2.7.0 Reporter: Varun Saxena Assignee: Varun Saxena Priority: Blocker Fix For: 2.7.1 Attachments: YARN-3850.01.patch, YARN-3850.02.patch *Container logs* can be lost if disk has become full(~90% full). When application finishes, we upload logs after aggregation by calling {{AppLogAggregatorImpl#uploadLogsForContainers}}. But this call in turns checks the eligible directories on call to {{LocalDirsHandlerService#getLogDirs}} which in case of disk full would return nothing. So none of the container logs are aggregated and uploaded. But on application finish, we also call {{AppLogAggregatorImpl#doAppLogAggregationPostCleanUp()}}. This deletes the application directory which contains container logs. This is because it calls {{LocalDirsHandlerService#getLogDirsForCleanup}} which returns the full disks as well. So we are left with neither aggregated logs for the app nor the individual container logs for the app. In addition to this, there are 2 more issues : # {{ContainerLogsUtil#getContainerLogDirs}} does not consider full disks so NM will fail to serve up logs from full disks from its web interfaces. # {{RecoveredContainerLaunch#locatePidFile}} also does not consider full disks so it is possible that on container recovery, PID file is not found. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3846) RM Web UI queue filter not working
[ https://issues.apache.org/jira/browse/YARN-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604132#comment-14604132 ] Mohammad Shahid Khan commented on YARN-3846: Please confirm whether we have to keep the label Queue: or not; then I will submit the patch accordingly. RM Web UI queue filter not working --- Key: YARN-3846 URL: https://issues.apache.org/jira/browse/YARN-3846 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.7.0 Reporter: Mohammad Shahid Khan Assignee: Mohammad Shahid Khan Clicking on the root queue shows all applications, but clicking on a leaf queue does not filter the applications belonging to the clicked queue. The regular expression seems to be wrong: {code} q = '^' + q.substr(q.lastIndexOf(':') + 2) + '$'; {code} For example: 1. Suppose the queue name is b; then the above expression will substr at index 1, because q.lastIndexOf(':') = -1 and -1 + 2 = 1, which is wrong; it should look at index 0. 2. If the queue name is ab.x then it will parse it to .x but it should be x. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2871) TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk
[ https://issues.apache.org/jira/browse/YARN-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604165#comment-14604165 ] Hudson commented on YARN-2871: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #230 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/230/]) YARN-2871. TestRMRestart#testRMRestartGetApplicationList sometime fails (xgong: rev fe6c1bd73aee188ed58df4d33bbc2d2fe0779a97) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java * hadoop-yarn-project/CHANGES.txt TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk - Key: YARN-2871 URL: https://issues.apache.org/jira/browse/YARN-2871 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu Assignee: zhihai xu Priority: Minor Fix For: 2.8.0 Attachments: YARN-2871.000.patch, YARN-2871.001.patch, YARN-2871.002.patch From trunk build #746 (https://builds.apache.org/job/Hadoop-Yarn-trunk/746): {code} Failed tests: TestRMRestart.testRMRestartGetApplicationList:957 rMAppManager.logApplicationSummary( isA(org.apache.hadoop.yarn.api.records.ApplicationId) ); Wanted 3 times: - at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:957) But was 2 times: - at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:66) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3850) NM fails to read files from full disks which can lead to container logs being lost and other issues
[ https://issues.apache.org/jira/browse/YARN-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604105#comment-14604105 ] Hudson commented on YARN-3850: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #241 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/241/]) YARN-3850. NM fails to read files from full disks which can lead to container logs being lost and other issues. Contributed by Varun Saxena (jlowe: rev 40b256949ad6f6e0dbdd248f2d257b05899f4332) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ContainerLogsUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/RecoveredContainerLaunch.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java NM fails to read files from full disks which can lead to container logs being lost and other issues --- Key: YARN-3850 URL: https://issues.apache.org/jira/browse/YARN-3850 Project: Hadoop YARN Issue Type: Bug Components: log-aggregation, nodemanager Affects Versions: 2.7.0 Reporter: Varun Saxena Assignee: Varun Saxena Priority: Blocker Fix For: 2.7.1 Attachments: YARN-3850.01.patch, YARN-3850.02.patch *Container logs* can be lost if disk has become full(~90% full). When application finishes, we upload logs after aggregation by calling {{AppLogAggregatorImpl#uploadLogsForContainers}}. But this call in turns checks the eligible directories on call to {{LocalDirsHandlerService#getLogDirs}} which in case of disk full would return nothing. So none of the container logs are aggregated and uploaded. But on application finish, we also call {{AppLogAggregatorImpl#doAppLogAggregationPostCleanUp()}}. This deletes the application directory which contains container logs. This is because it calls {{LocalDirsHandlerService#getLogDirsForCleanup}} which returns the full disks as well. So we are left with neither aggregated logs for the app nor the individual container logs for the app. In addition to this, there are 2 more issues : # {{ContainerLogsUtil#getContainerLogDirs}} does not consider full disks so NM will fail to serve up logs from full disks from its web interfaces. # {{RecoveredContainerLaunch#locatePidFile}} also does not consider full disks so it is possible that on container recovery, PID file is not found. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2871) TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk
[ https://issues.apache.org/jira/browse/YARN-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604104#comment-14604104 ] Hudson commented on YARN-2871: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #241 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/241/]) YARN-2871. TestRMRestart#testRMRestartGetApplicationList sometime fails (xgong: rev fe6c1bd73aee188ed58df4d33bbc2d2fe0779a97) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java * hadoop-yarn-project/CHANGES.txt TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk - Key: YARN-2871 URL: https://issues.apache.org/jira/browse/YARN-2871 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu Assignee: zhihai xu Priority: Minor Fix For: 2.8.0 Attachments: YARN-2871.000.patch, YARN-2871.001.patch, YARN-2871.002.patch From trunk build #746 (https://builds.apache.org/job/Hadoop-Yarn-trunk/746): {code} Failed tests: TestRMRestart.testRMRestartGetApplicationList:957 rMAppManager.logApplicationSummary( isA(org.apache.hadoop.yarn.api.records.ApplicationId) ); Wanted 3 times: - at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:957) But was 2 times: - at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:66) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3848) TestNodeLabelContainerAllocation is timing out
[ https://issues.apache.org/jira/browse/YARN-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3848: --- Attachment: YARN-3848.01.patch I could have put a sleep in the test, but checked for the dispatcher queue being drained instead. TestNodeLabelContainerAllocation is timing out -- Key: YARN-3848 URL: https://issues.apache.org/jira/browse/YARN-3848 Project: Hadoop YARN Issue Type: Bug Components: test Reporter: Jason Lowe Assignee: Varun Saxena Attachments: YARN-3848.01.patch, test_output.txt A number of builds, pre-commit and otherwise, have been failing recently because TestNodeLabelContainerAllocation has timed out. See https://builds.apache.org/job/Hadoop-Yarn-trunk/969/, YARN-3830, YARN-3802, or YARN-3826 for examples. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
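For illustration of that approach (a sketch under assumptions; the isDrained condition below stands in for whatever the real dispatcher exposes and is not taken from the patch): poll until the dispatcher reports an empty event queue, with a bounded wait, instead of sleeping for a fixed time before stopping the MockRM.
{code}
import java.util.function.BooleanSupplier;

// Hypothetical test helper: wait until the supplied condition (e.g. the
// dispatcher's event queue being empty) holds, or fail after a timeout.
final class DrainWaitSketch {
  static void waitForDrained(BooleanSupplier isDrained, long timeoutMs)
      throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (!isDrained.getAsBoolean()) {
      if (System.currentTimeMillis() > deadline) {
        throw new IllegalStateException("dispatcher not drained within " + timeoutMs + " ms");
      }
      Thread.sleep(50);
    }
  }
}
{code}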
[jira] [Created] (YARN-3859) LeafQueue doesn't print user properly for application add
Devaraj K created YARN-3859: --- Summary: LeafQueue doesn't print user properly for application add Key: YARN-3859 URL: https://issues.apache.org/jira/browse/YARN-3859 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.7.0 Reporter: Devaraj K Priority: Minor {code:xml} 2015-06-28 04:36:22,721 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1435446241489_0003 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8, leaf-queue: default #user-pending-applications: 2 #user-active-applications: 1 #queue-pending-applications: 2 #queue-active-applications: 1 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
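A minimal sketch of the kind of change this calls for (identifiers are assumed, not copied from any patch): log the user name string passed to the method instead of the LeafQueue.User object, whose default Object.toString() is what produces the LeafQueue$User@e8fb7a8 seen above.
{code}
// Hypothetical sketch: identifiers are illustrative only, not the actual patch.
// Before: " user: " + user        (a LeafQueue.User object, prints a hash)
// After:  " user: " + userName    (the plain user name String)
static String applicationAddedMessage(String applicationId, String userName,
    String queueName) {
  return "Application added -" +
      " appId: " + applicationId +
      " user: " + userName + "," +
      " leaf-queue: " + queueName;
}
{code}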
[jira] [Assigned] (YARN-3859) LeafQueue doesn't print user properly for application add
[ https://issues.apache.org/jira/browse/YARN-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena reassigned YARN-3859: -- Assignee: Varun Saxena LeafQueue doesn't print user properly for application add - Key: YARN-3859 URL: https://issues.apache.org/jira/browse/YARN-3859 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.7.0 Reporter: Devaraj K Assignee: Varun Saxena Priority: Minor {code:xml} 2015-06-28 04:36:22,721 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1435446241489_0003 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8, leaf-queue: default #user-pending-applications: 2 #user-active-applications: 1 #queue-pending-applications: 2 #queue-active-applications: 1 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3859) LeafQueue doesn't print user properly for application add
[ https://issues.apache.org/jira/browse/YARN-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3859: --- Attachment: YARN-3859.01.patch [~devaraj.k], kindly review LeafQueue doesn't print user properly for application add - Key: YARN-3859 URL: https://issues.apache.org/jira/browse/YARN-3859 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.7.0 Reporter: Devaraj K Assignee: Varun Saxena Priority: Minor Attachments: YARN-3859.01.patch {code:xml} 2015-06-28 04:36:22,721 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1435446241489_0003 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8, leaf-queue: default #user-pending-applications: 2 #user-active-applications: 1 #queue-pending-applications: 2 #queue-active-applications: 1 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3830) AbstractYarnScheduler.createReleaseCache may try to clean a null attempt
[ https://issues.apache.org/jira/browse/YARN-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604299#comment-14604299 ] Devaraj K commented on YARN-3830: - Nice catch [~nijel]. Thanks for working on this. Can you add a test to cover this change? Thanks. AbstractYarnScheduler.createReleaseCache may try to clean a null attempt Key: YARN-3830 URL: https://issues.apache.org/jira/browse/YARN-3830 Project: Hadoop YARN Issue Type: Bug Reporter: nijel Assignee: nijel Attachments: YARN-3830_1.patch, YARN-3830_2.patch, YARN-3830_3.patch org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.createReleaseCache() {code} protected void createReleaseCache() { // Cleanup the cache after nm expire interval. new Timer().schedule(new TimerTask() { @Override public void run() { for (SchedulerApplication<T> app : applications.values()) { T attempt = app.getCurrentAppAttempt(); synchronized (attempt) { for (ContainerId containerId : attempt.getPendingRelease()) { RMAuditLogger.logFailure( {code} Here the attempt can be null since the attempt is created later, so a NullPointerException will occur: {code} 2015-06-19 09:29:16,195 | ERROR | Timer-3 | Thread Thread[Timer-3,5,main] threw an Exception. | YarnUncaughtExceptionHandler.java:68 java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler$1.run(AbstractYarnScheduler.java:457) at java.util.TimerThread.mainLoop(Timer.java:555) at java.util.TimerThread.run(Timer.java:505) {code} This will skip the other applications in this run. We can add a null check and continue with the other applications. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
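A sketch of the suggested guard, based on the snippet above (the audit-log call is left elided exactly as in the description; this is not the committed patch): check the current attempt for null and continue with the remaining applications instead of letting the timer task die with a NullPointerException.
{code}
for (SchedulerApplication<T> app : applications.values()) {
  T attempt = app.getCurrentAppAttempt();
  if (attempt == null) {
    // The attempt is created later; skip this app and continue with the
    // other applications instead of dying with a NullPointerException.
    continue;
  }
  synchronized (attempt) {
    for (ContainerId containerId : attempt.getPendingRelease()) {
      // RMAuditLogger.logFailure(...) exactly as in the existing code
    }
  }
}
{code}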
[jira] [Commented] (YARN-3850) NM fails to read files from full disks which can lead to container logs being lost and other issues
[ https://issues.apache.org/jira/browse/YARN-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604191#comment-14604191 ] Hudson commented on YARN-3850: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #239 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/239/]) YARN-3850. NM fails to read files from full disks which can lead to container logs being lost and other issues. Contributed by Varun Saxena (jlowe: rev 40b256949ad6f6e0dbdd248f2d257b05899f4332) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/RecoveredContainerLaunch.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ContainerLogsUtils.java * hadoop-yarn-project/CHANGES.txt NM fails to read files from full disks which can lead to container logs being lost and other issues --- Key: YARN-3850 URL: https://issues.apache.org/jira/browse/YARN-3850 Project: Hadoop YARN Issue Type: Bug Components: log-aggregation, nodemanager Affects Versions: 2.7.0 Reporter: Varun Saxena Assignee: Varun Saxena Priority: Blocker Fix For: 2.7.1 Attachments: YARN-3850.01.patch, YARN-3850.02.patch *Container logs* can be lost if disk has become full(~90% full). When application finishes, we upload logs after aggregation by calling {{AppLogAggregatorImpl#uploadLogsForContainers}}. But this call in turns checks the eligible directories on call to {{LocalDirsHandlerService#getLogDirs}} which in case of disk full would return nothing. So none of the container logs are aggregated and uploaded. But on application finish, we also call {{AppLogAggregatorImpl#doAppLogAggregationPostCleanUp()}}. This deletes the application directory which contains container logs. This is because it calls {{LocalDirsHandlerService#getLogDirsForCleanup}} which returns the full disks as well. So we are left with neither aggregated logs for the app nor the individual container logs for the app. In addition to this, there are 2 more issues : # {{ContainerLogsUtil#getContainerLogDirs}} does not consider full disks so NM will fail to serve up logs from full disks from its web interfaces. # {{RecoveredContainerLaunch#locatePidFile}} also does not consider full disks so it is possible that on container recovery, PID file is not found. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2871) TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk
[ https://issues.apache.org/jira/browse/YARN-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604190#comment-14604190 ] Hudson commented on YARN-2871: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #239 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/239/]) YARN-2871. TestRMRestart#testRMRestartGetApplicationList sometime fails (xgong: rev fe6c1bd73aee188ed58df4d33bbc2d2fe0779a97) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk - Key: YARN-2871 URL: https://issues.apache.org/jira/browse/YARN-2871 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu Assignee: zhihai xu Priority: Minor Fix For: 2.8.0 Attachments: YARN-2871.000.patch, YARN-2871.001.patch, YARN-2871.002.patch From trunk build #746 (https://builds.apache.org/job/Hadoop-Yarn-trunk/746): {code} Failed tests: TestRMRestart.testRMRestartGetApplicationList:957 rMAppManager.logApplicationSummary( isA(org.apache.hadoop.yarn.api.records.ApplicationId) ); Wanted 3 times: - at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:957) But was 2 times: - at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:66) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3850) NM fails to read files from full disks which can lead to container logs being lost and other issues
[ https://issues.apache.org/jira/browse/YARN-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604200#comment-14604200 ] Hudson commented on YARN-3850: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2187 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2187/]) YARN-3850. NM fails to read files from full disks which can lead to container logs being lost and other issues. Contributed by Varun Saxena (jlowe: rev 40b256949ad6f6e0dbdd248f2d257b05899f4332) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LocalDirsHandlerService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/TestLogAggregationService.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/RecoveredContainerLaunch.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/TestContainerLogsPage.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/logaggregation/AppLogAggregatorImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ContainerLogsUtils.java * hadoop-yarn-project/CHANGES.txt NM fails to read files from full disks which can lead to container logs being lost and other issues --- Key: YARN-3850 URL: https://issues.apache.org/jira/browse/YARN-3850 Project: Hadoop YARN Issue Type: Bug Components: log-aggregation, nodemanager Affects Versions: 2.7.0 Reporter: Varun Saxena Assignee: Varun Saxena Priority: Blocker Fix For: 2.7.1 Attachments: YARN-3850.01.patch, YARN-3850.02.patch *Container logs* can be lost if disk has become full(~90% full). When application finishes, we upload logs after aggregation by calling {{AppLogAggregatorImpl#uploadLogsForContainers}}. But this call in turns checks the eligible directories on call to {{LocalDirsHandlerService#getLogDirs}} which in case of disk full would return nothing. So none of the container logs are aggregated and uploaded. But on application finish, we also call {{AppLogAggregatorImpl#doAppLogAggregationPostCleanUp()}}. This deletes the application directory which contains container logs. This is because it calls {{LocalDirsHandlerService#getLogDirsForCleanup}} which returns the full disks as well. So we are left with neither aggregated logs for the app nor the individual container logs for the app. In addition to this, there are 2 more issues : # {{ContainerLogsUtil#getContainerLogDirs}} does not consider full disks so NM will fail to serve up logs from full disks from its web interfaces. # {{RecoveredContainerLaunch#locatePidFile}} also does not consider full disks so it is possible that on container recovery, PID file is not found. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3857) Memory leak in ResourceManager with SIMPLE mode
[ https://issues.apache.org/jira/browse/YARN-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604236#comment-14604236 ] Devaraj K commented on YARN-3857: - Thanks [~mujunchao] for reporting and working on this. I am assigning this issue to you. Adding to [~zxu]'s comment, can you also take care of the naming convention for the patch, like JIRA-ID-patch-version.patch? Memory leak in ResourceManager with SIMPLE mode --- Key: YARN-3857 URL: https://issues.apache.org/jira/browse/YARN-3857 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: mujunchao Assignee: mujunchao Priority: Critical Attachments: hadoop-yarn-server-resourcemanager.patch We register the ClientTokenMasterKey so that a client does not hold an invalid ClientToken after the RM restarts. In SIMPLE mode, we register a Pair<ApplicationAttemptId, null>, but we never remove it from the HashMap, as unregistration only runs in secure mode, so the memory leak occurs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
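As a rough sketch of the shape of such a fix (the map and method names here are assumptions, not the attached patch): since registration inserts an entry keyed by the attempt id even in SIMPLE mode, unregistration has to remove that entry unconditionally rather than only when security is enabled.
{code}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;

// Hypothetical sketch: identifiers are illustrative only, not the attached patch.
class ClientTokenKeyRegistrySketch {
  // In SIMPLE mode the stored value is null, but an entry still exists per attempt.
  private final Map<ApplicationAttemptId, Object> appAttemptToMasterKey = new HashMap<>();

  void registerApplication(ApplicationAttemptId attemptId, Object masterKey) {
    appAttemptToMasterKey.put(attemptId, masterKey);
  }

  void unregisterApplication(ApplicationAttemptId attemptId) {
    // Must also run in SIMPLE mode; removing only when security is enabled
    // is what leaves the entries behind and leaks memory.
    appAttemptToMasterKey.remove(attemptId);
  }
}
{code}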
[jira] [Updated] (YARN-3857) Memory leak in ResourceManager with SIMPLE mode
[ https://issues.apache.org/jira/browse/YARN-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-3857: Assignee: mujunchao Memory leak in ResourceManager with SIMPLE mode --- Key: YARN-3857 URL: https://issues.apache.org/jira/browse/YARN-3857 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: mujunchao Assignee: mujunchao Priority: Critical Attachments: hadoop-yarn-server-resourcemanager.patch We register the ClientTokenMasterKey so that a client does not hold an invalid ClientToken after the RM restarts. In SIMPLE mode, we register a Pair<ApplicationAttemptId, null>, but we never remove it from the HashMap, as unregistration only runs in secure mode, so the memory leak occurs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3857) Memory leak in ResourceManager with SIMPLE mode
[ https://issues.apache.org/jira/browse/YARN-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604240#comment-14604240 ] Brahma Reddy Battula commented on YARN-3857: Nice Catch!! Memory leak in ResourceManager with SIMPLE mode --- Key: YARN-3857 URL: https://issues.apache.org/jira/browse/YARN-3857 Project: Hadoop YARN Issue Type: Bug Components: resourcemanager Affects Versions: 2.7.0 Reporter: mujunchao Assignee: mujunchao Priority: Critical Attachments: hadoop-yarn-server-resourcemanager.patch We register the ClientTokenMasterKey so that a client does not hold an invalid ClientToken after the RM restarts. In SIMPLE mode, we register a Pair<ApplicationAttemptId, null>, but we never remove it from the HashMap, as unregistration only runs in secure mode, so the memory leak occurs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-3848) TestNodeLabelContainerAllocation is timing out
[ https://issues.apache.org/jira/browse/YARN-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena reassigned YARN-3848: -- Assignee: Varun Saxena TestNodeLabelContainerAllocation is timing out -- Key: YARN-3848 URL: https://issues.apache.org/jira/browse/YARN-3848 Project: Hadoop YARN Issue Type: Bug Components: test Reporter: Jason Lowe Assignee: Varun Saxena A number of builds, pre-commit and otherwise, have been failing recently because TestNodeLabelContainerAllocation has timed out. See https://builds.apache.org/job/Hadoop-Yarn-trunk/969/, YARN-3830, YARN-3802, or YARN-3826 for examples. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3848) TestNodeLabelContainerAllocation is timing out
[ https://issues.apache.org/jira/browse/YARN-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604255#comment-14604255 ] Varun Saxena commented on YARN-3848: I mean the test does not have a timeout. TestNodeLabelContainerAllocation is timing out -- Key: YARN-3848 URL: https://issues.apache.org/jira/browse/YARN-3848 Project: Hadoop YARN Issue Type: Bug Components: test Reporter: Jason Lowe Assignee: Varun Saxena Attachments: test_output.txt A number of builds, pre-commit and otherwise, have been failing recently because TestNodeLabelContainerAllocation has timed out. See https://builds.apache.org/job/Hadoop-Yarn-trunk/969/, YARN-3830, YARN-3802, or YARN-3826 for examples. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2871) TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk
[ https://issues.apache.org/jira/browse/YARN-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604199#comment-14604199 ] Hudson commented on YARN-2871: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2187 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2187/]) YARN-2871. TestRMRestart#testRMRestartGetApplicationList sometime fails (xgong: rev fe6c1bd73aee188ed58df4d33bbc2d2fe0779a97) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java * hadoop-yarn-project/CHANGES.txt TestRMRestart#testRMRestartGetApplicationList sometime fails in trunk - Key: YARN-2871 URL: https://issues.apache.org/jira/browse/YARN-2871 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu Assignee: zhihai xu Priority: Minor Fix For: 2.8.0 Attachments: YARN-2871.000.patch, YARN-2871.001.patch, YARN-2871.002.patch From trunk build #746 (https://builds.apache.org/job/Hadoop-Yarn-trunk/746): {code} Failed tests: TestRMRestart.testRMRestartGetApplicationList:957 rMAppManager.logApplicationSummary( isA(org.apache.hadoop.yarn.api.records.ApplicationId) ); Wanted 3 times: - at org.apache.hadoop.yarn.server.resourcemanager.TestRMRestart.testRMRestartGetApplicationList(TestRMRestart.java:957) But was 2 times: - at org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.handle(RMAppManager.java:66) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3848) TestNodeLabelContainerAllocation is timing out
[ https://issues.apache.org/jira/browse/YARN-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3848: --- Attachment: test_output.txt TestNodeLabelContainerAllocation is timing out -- Key: YARN-3848 URL: https://issues.apache.org/jira/browse/YARN-3848 Project: Hadoop YARN Issue Type: Bug Components: test Reporter: Jason Lowe Assignee: Varun Saxena Attachments: test_output.txt A number of builds, pre-commit and otherwise, have been failing recently because TestNodeLabelContainerAllocation has timed out. See https://builds.apache.org/job/Hadoop-Yarn-trunk/969/, YARN-3830, YARN-3802, or YARN-3826 for examples. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3848) TestNodeLabelContainerAllocation is timing out
[ https://issues.apache.org/jira/browse/YARN-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604254#comment-14604254 ] Varun Saxena commented on YARN-3848: Test which is failing is {{testQueueMaxCapacitiesWillNotBeHonoredWhenNotRespectingExclusivity}}. Test output has been attached. Basically MockRM is being stopped while dispatcher still has events in its queue which leads to InterruptedException. JUnit wrongly interprets this as timeout, even though it isn't. TestNodeLabelContainerAllocation is timing out -- Key: YARN-3848 URL: https://issues.apache.org/jira/browse/YARN-3848 Project: Hadoop YARN Issue Type: Bug Components: test Reporter: Jason Lowe Assignee: Varun Saxena Attachments: test_output.txt A number of builds, pre-commit and otherwise, have been failing recently because TestNodeLabelContainerAllocation has timed out. See https://builds.apache.org/job/Hadoop-Yarn-trunk/969/, YARN-3830, YARN-3802, or YARN-3826 for examples. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3848) TestNodeLabelContainerAllocation is timing out
[ https://issues.apache.org/jira/browse/YARN-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604309#comment-14604309 ] Hadoop QA commented on YARN-3848: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 20s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 38s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 34s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 30s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 59s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 1m 57s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 50m 53s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 94m 27s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12742338/YARN-3848.01.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 79ed0f9 | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8362/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8362/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8362/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8362/console | This message was automatically generated. TestNodeLabelContainerAllocation is timing out -- Key: YARN-3848 URL: https://issues.apache.org/jira/browse/YARN-3848 Project: Hadoop YARN Issue Type: Bug Components: test Reporter: Jason Lowe Assignee: Varun Saxena Attachments: YARN-3848.01.patch, test_output.txt A number of builds, pre-commit and otherwise, have been failing recently because TestNodeLabelContainerAllocation has timed out. See https://builds.apache.org/job/Hadoop-Yarn-trunk/969/, YARN-3830, YARN-3802, or YARN-3826 for examples. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3859) LeafQueue doesn't print user properly for application add
[ https://issues.apache.org/jira/browse/YARN-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604322#comment-14604322 ] Hadoop QA commented on YARN-3859: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 51s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 7m 33s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 34s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 46s | The applied patch generated 1 new checkstyle issues (total was 151, now 151). | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 25s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 50m 49s | Tests passed in hadoop-yarn-server-resourcemanager. | | | | 88m 29s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12742339/YARN-3859.01.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 79ed0f9 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8363/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8363/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8363/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8363/console | This message was automatically generated. LeafQueue doesn't print user properly for application add - Key: YARN-3859 URL: https://issues.apache.org/jira/browse/YARN-3859 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.7.0 Reporter: Devaraj K Assignee: Varun Saxena Priority: Minor Attachments: YARN-3859.01.patch {code:xml} 2015-06-28 04:36:22,721 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1435446241489_0003 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8, leaf-queue: default #user-pending-applications: 2 #user-active-applications: 1 #queue-pending-applications: 2 #queue-active-applications: 1 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3859) LeafQueue doesn't print user properly for application add
[ https://issues.apache.org/jira/browse/YARN-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604323#comment-14604323 ] Varun Saxena commented on YARN-3859: Checkstyle issue related to file length LeafQueue doesn't print user properly for application add - Key: YARN-3859 URL: https://issues.apache.org/jira/browse/YARN-3859 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.7.0 Reporter: Devaraj K Assignee: Varun Saxena Priority: Minor Attachments: YARN-3859.01.patch {code:xml} 2015-06-28 04:36:22,721 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1435446241489_0003 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8, leaf-queue: default #user-pending-applications: 2 #user-active-applications: 1 #queue-pending-applications: 2 #queue-active-applications: 1 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3768) Index out of range exception with environment variables without values
[ https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gera Shegalov updated YARN-3768: Attachment: YARN-3768.002.patch You are right [~zxu], and I actually meant to combine matching k=v pairs and capturing k and v in one shot. Index out of range exception with environment variables without values -- Key: YARN-3768 URL: https://issues.apache.org/jira/browse/YARN-3768 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.5.0 Reporter: Joe Ferner Assignee: zhihai xu Attachments: YARN-3768.000.patch, YARN-3768.001.patch, YARN-3768.002.patch Looking at line 80 of org.apache.hadoop.yarn.util.Apps an index out of range exception occurs if an environment variable is encountered without a value. I believe this occurs because java will not return empty strings from the split method. Similar to this http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values -- This message was sent by Atlassian JIRA (v6.3.4#6332)
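A rough, self-contained illustration of that idea (this is not the attached 002 patch; the pattern and class here are assumptions): one regex with two capture groups both recognizes a NAME=VALUE entry and hands back its parts, so an entry with an empty value never triggers the out-of-range access that String.split causes.
{code}
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EnvParseSketch {
  // Group 1 captures the variable name, group 2 the (possibly empty) value.
  private static final Pattern ENV_PAIR =
      Pattern.compile("([A-Za-z_][A-Za-z0-9_]*)=([^,]*)");

  static Map<String, String> parse(String envString) {
    Map<String, String> env = new HashMap<>();
    Matcher m = ENV_PAIR.matcher(envString);
    while (m.find()) {
      // "EMPTY_VAR=" matches with group(2) == "", so there is no missing
      // parts[1] as with envString.split("=").
      env.put(m.group(1), m.group(2));
    }
    return env;
  }

  public static void main(String[] args) {
    // Prints {JAVA_HOME=/usr/java, EMPTY_VAR=, NAME=value} (order may vary)
    System.out.println(parse("JAVA_HOME=/usr/java,EMPTY_VAR=,NAME=value"));
  }
}
{code}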
[jira] [Commented] (YARN-3768) Index out of range exception with environment variables without values
[ https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604372#comment-14604372 ] Gera Shegalov commented on YARN-3768: - 002 attached, with this idea and proper name validation. Index out of range exception with environment variables without values -- Key: YARN-3768 URL: https://issues.apache.org/jira/browse/YARN-3768 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.5.0 Reporter: Joe Ferner Assignee: zhihai xu Attachments: YARN-3768.000.patch, YARN-3768.001.patch, YARN-3768.002.patch Looking at line 80 of org.apache.hadoop.yarn.util.Apps an index out of range exception occurs if an environment variable is encountered without a value. I believe this occurs because java will not return empty strings from the split method. Similar to this http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3768) Index out of range exception with environment variables without values
[ https://issues.apache.org/jira/browse/YARN-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604401#comment-14604401 ] Hadoop QA commented on YARN-3768: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 45s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 33s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 1m 47s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 25s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | common tests | 22m 1s | Tests passed in hadoop-common. | | {color:green}+1{color} | yarn tests | 1m 56s | Tests passed in hadoop-yarn-common. | | | | 66m 36s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12742353/YARN-3768.002.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 79ed0f9 | | hadoop-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8364/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8364/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8364/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8364/console | This message was automatically generated. Index out of range exception with environment variables without values -- Key: YARN-3768 URL: https://issues.apache.org/jira/browse/YARN-3768 Project: Hadoop YARN Issue Type: Bug Components: yarn Affects Versions: 2.5.0 Reporter: Joe Ferner Assignee: zhihai xu Attachments: YARN-3768.000.patch, YARN-3768.001.patch, YARN-3768.002.patch Looking at line 80 of org.apache.hadoop.yarn.util.Apps an index out of range exception occurs if an environment variable is encountered without a value. I believe this occurs because java will not return empty strings from the split method. Similar to this http://stackoverflow.com/questions/14602062/java-string-split-removed-empty-values -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3859) LeafQueue doesn't print user properly for application add
[ https://issues.apache.org/jira/browse/YARN-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604528#comment-14604528 ] Hudson commented on YARN-3859: -- FAILURE: Integrated in Hadoop-trunk-Commit #8078 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8078/]) YARN-3859. LeafQueue doesn't print user properly for application add. (devaraj: rev b543d1a390a67e5e92fea67d3a2635058c29e9da) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java LeafQueue doesn't print user properly for application add - Key: YARN-3859 URL: https://issues.apache.org/jira/browse/YARN-3859 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.7.0 Reporter: Devaraj K Assignee: Varun Saxena Priority: Minor Fix For: 2.8.0 Attachments: YARN-3859.01.patch {code:xml} 2015-06-28 04:36:22,721 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1435446241489_0003 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8, leaf-queue: default #user-pending-applications: 2 #user-active-applications: 1 #queue-pending-applications: 2 #queue-active-applications: 1 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
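The symptom in the log snippet above is the usual default-toString pitfall. A self-contained sketch (illustrative stand-in class, not the actual LeafQueue code or the committed patch) of why the RM log printed an object hash instead of the user name, and the general shape of the fix:
{code}
public class ToStringDemo {
  // Stand-in for LeafQueue.User, which does not override toString().
  static class User {
    private final String userName;
    User(String userName) { this.userName = userName; }
    String getUserName() { return userName; }
  }

  public static void main(String[] args) {
    User user = new User("alice");
    // Concatenating the object prints ClassName@hashCode, e.g. User@1b6d3586,
    // which is what showed up as "LeafQueue$User@e8fb7a8" in the RM log.
    System.out.println("Application added - user: " + user);
    // Logging the name (or the original user string) prints what operators expect.
    System.out.println("Application added - user: " + user.getUserName());
  }
}
{code}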
[jira] [Updated] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node
[ https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3860: --- Attachment: YARN-3860.002.patch I fixed the inappropriate name of the argument variable of getTargetIds in 002. rmadmin -transitionToActive should check the state of non-target node - Key: YARN-3860 URL: https://issues.apache.org/jira/browse/YARN-3860 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3860.001.patch, YARN-3860.002.patch Users can make both ResouceManagers active by {{rmadmin -transitionToActive}} even if {{\--forceactive}} option is not given. {{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are already active but {{rmadmin -transitionToActive}} does not do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node
[ https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604506#comment-14604506 ] Hadoop QA commented on YARN-3860: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 41s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 33s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 35s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 29s | The applied patch generated 1 new checkstyle issues (total was 38, now 39). | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 51s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 19m 52s | Tests failed in hadoop-yarn-client. | | | | 56m 34s | | \\ \\ || Reason || Tests || | Timed out tests | org.apache.hadoop.yarn.client.TestApplicationClientProtocolOnHA | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12742365/YARN-3860.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 79ed0f9 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8365/artifact/patchprocess/diffcheckstylehadoop-yarn-client.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/8365/artifact/patchprocess/whitespace.txt | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/8365/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8365/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8365/console | This message was automatically generated. rmadmin -transitionToActive should check the state of non-target node - Key: YARN-3860 URL: https://issues.apache.org/jira/browse/YARN-3860 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3860.001.patch, YARN-3860.002.patch Users can make both ResouceManagers active by {{rmadmin -transitionToActive}} even if {{\--forceactive}} option is not given. {{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are already active but {{rmadmin -transitionToActive}} does not do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3859) LeafQueue doesn't print user properly for application add
[ https://issues.apache.org/jira/browse/YARN-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj K updated YARN-3859: Hadoop Flags: Reviewed +1 for the patch, committing it. LeafQueue doesn't print user properly for application add - Key: YARN-3859 URL: https://issues.apache.org/jira/browse/YARN-3859 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.7.0 Reporter: Devaraj K Assignee: Varun Saxena Priority: Minor Attachments: YARN-3859.01.patch {code:xml} 2015-06-28 04:36:22,721 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1435446241489_0003 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8, leaf-queue: default #user-pending-applications: 2 #user-active-applications: 1 #queue-pending-applications: 2 #queue-active-applications: 1 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3859) LeafQueue doesn't print user properly for application add
[ https://issues.apache.org/jira/browse/YARN-3859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604540#comment-14604540 ] Varun Saxena commented on YARN-3859: Thanks [~devaraj.k] for the review and commit. LeafQueue doesn't print user properly for application add - Key: YARN-3859 URL: https://issues.apache.org/jira/browse/YARN-3859 Project: Hadoop YARN Issue Type: Bug Components: capacityscheduler Affects Versions: 2.7.0 Reporter: Devaraj K Assignee: Varun Saxena Priority: Minor Fix For: 2.8.0 Attachments: YARN-3859.01.patch {code:xml} 2015-06-28 04:36:22,721 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application added - appId: application_1435446241489_0003 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@e8fb7a8, leaf-queue: default #user-pending-applications: 2 #user-active-applications: 1 #queue-pending-applications: 2 #queue-active-applications: 1 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node
[ https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3860: --- Attachment: YARN-3860.001.patch rmadmin -transitionToActive should check the state of non-target node - Key: YARN-3860 URL: https://issues.apache.org/jira/browse/YARN-3860 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3860.001.patch {{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are already active but {{rmadmin -transitionToActive}} does not do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node
[ https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3860: --- Attachment: YARN-3860.003.patch addressing checkstyle and whitespace warnings. rmadmin -transitionToActive should check the state of non-target node - Key: YARN-3860 URL: https://issues.apache.org/jira/browse/YARN-3860 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3860.001.patch, YARN-3860.002.patch, YARN-3860.003.patch Users can make both ResouceManagers active by {{rmadmin -transitionToActive}} even if {{\--forceactive}} option is not given. {{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are already active but {{rmadmin -transitionToActive}} does not do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node
[ https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604531#comment-14604531 ] Hadoop QA commented on YARN-3860: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 15s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 31s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 35s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 28s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 52s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 55s | Tests passed in hadoop-yarn-client. | | | | 43m 9s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12742371/YARN-3860.003.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 79ed0f9 | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/8367/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8367/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8367/console | This message was automatically generated. rmadmin -transitionToActive should check the state of non-target node - Key: YARN-3860 URL: https://issues.apache.org/jira/browse/YARN-3860 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3860.001.patch, YARN-3860.002.patch, YARN-3860.003.patch Users can make both ResouceManagers active by {{rmadmin -transitionToActive}} even if {{\--forceactive}} option is not given. {{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are already active but {{rmadmin -transitionToActive}} does not do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node
[ https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Masatake Iwasaki updated YARN-3860: --- Description: Users can make both ResouceManagers active by {{rmadmin -transitionToActive}} even if {{\--forceactive}} option is not given. {{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are already active but {{rmadmin -transitionToActive}} does not do. (was: {{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are already active but {{rmadmin -transitionToActive}} does not do.) rmadmin -transitionToActive should check the state of non-target node - Key: YARN-3860 URL: https://issues.apache.org/jira/browse/YARN-3860 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3860.001.patch Users can make both ResouceManagers active by {{rmadmin -transitionToActive}} even if {{\--forceactive}} option is not given. {{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are already active but {{rmadmin -transitionToActive}} does not do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node
[ https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604509#comment-14604509 ] Hadoop QA commented on YARN-3860: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 37s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 7m 36s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 39s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 0m 30s | The applied patch generated 1 new checkstyle issues (total was 38, now 39). | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 52s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 6m 55s | Tests passed in hadoop-yarn-client. | | | | 43m 42s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12742367/YARN-3860.002.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 79ed0f9 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8366/artifact/patchprocess/diffcheckstylehadoop-yarn-client.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/8366/artifact/patchprocess/whitespace.txt | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/8366/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8366/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8366/console | This message was automatically generated. rmadmin -transitionToActive should check the state of non-target node - Key: YARN-3860 URL: https://issues.apache.org/jira/browse/YARN-3860 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3860.001.patch, YARN-3860.002.patch Users can make both ResouceManagers active by {{rmadmin -transitionToActive}} even if {{\--forceactive}} option is not given. {{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are already active but {{rmadmin -transitionToActive}} does not do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node
[ https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604539#comment-14604539 ] zhihai xu commented on YARN-3860: - [~iwasakims], thanks for working on this issue. This looks like a good catch. One nit: I think times(1) is used by default, so can we just use {{verify(haadmin).getServiceStatus();}}, since none of the other tests pass times(1) to verify? rmadmin -transitionToActive should check the state of non-target node - Key: YARN-3860 URL: https://issues.apache.org/jira/browse/YARN-3860 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor Attachments: YARN-3860.001.patch, YARN-3860.002.patch, YARN-3860.003.patch Users can make both ResouceManagers active by {{rmadmin -transitionToActive}} even if {{\--forceactive}} option is not given. {{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are already active but {{rmadmin -transitionToActive}} does not do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
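On the times(1) nit: Mockito's one-argument verify is documented as equivalent to verify(mock, times(1)), so the two forms are interchangeable. A minimal sketch against a mocked HAServiceProtocol (illustrative only, not the actual TestRMAdminCLI code):
{code}
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.times;
import static org.mockito.Mockito.verify;

import org.apache.hadoop.ha.HAServiceProtocol;

public class VerifyDefaultDemo {
  public static void main(String[] args) throws Exception {
    HAServiceProtocol haadmin = mock(HAServiceProtocol.class);

    haadmin.getServiceStatus();

    // verify(mock) defaults to times(1); both checks below pass and mean
    // exactly the same thing, so the shorter form is sufficient.
    verify(haadmin).getServiceStatus();
    verify(haadmin, times(1)).getServiceStatus();
  }
}
{code}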
[jira] [Commented] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node
[ https://issues.apache.org/jira/browse/YARN-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604474#comment-14604474 ] Masatake Iwasaki commented on YARN-3860: HAAdmin#isOtherTargetNodeActive cannot check whether the other nodes are active unless HAAdmin#getTargetIds is overridden, because the base implementation returns a list containing only the given target id. RMAdminCLI should have a getTargetIds method that returns the list of all node ids, as DFSHAAdmin does. rmadmin -transitionToActive should check the state of non-target node - Key: YARN-3860 URL: https://issues.apache.org/jira/browse/YARN-3860 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor {{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are already active but {{rmadmin -transitionToActive}} does not do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
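A rough sketch of the override being described, under the assumption that RMAdminCLI extends org.apache.hadoop.ha.HAAdmin and that the configured RM ids can be read via HAUtil.getRMHAIds; the attached patches may differ in details:
{code}
import java.util.ArrayList;
import java.util.Collection;

import org.apache.hadoop.ha.HAAdmin;
import org.apache.hadoop.ha.HAServiceTarget;
import org.apache.hadoop.yarn.conf.HAUtil;

class RMAdminCLISketch extends HAAdmin {
  @Override
  protected Collection<String> getTargetIds(String targetNodeToActivate) {
    // Return every configured RM id, not just the requested target, so that
    // HAAdmin#isOtherTargetNodeActive can inspect the non-target ResourceManager.
    return new ArrayList<String>(HAUtil.getRMHAIds(getConf()));
  }

  @Override
  protected HAServiceTarget resolveTarget(String rmId) {
    // Not relevant to this sketch; the real CLI resolves an RM-specific target.
    throw new UnsupportedOperationException("sketch only");
  }
}
{code}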
[jira] [Created] (YARN-3860) rmadmin -transitionToActive should check the state of non-target node
Masatake Iwasaki created YARN-3860: -- Summary: rmadmin -transitionToActive should check the state of non-target node Key: YARN-3860 URL: https://issues.apache.org/jira/browse/YARN-3860 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.7.0 Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Priority: Minor {{haadmin -transitionToActive}} of HDFS checks whether non-target nodes are already active but {{rmadmin -transitionToActive}} does not do. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3017) ContainerID in ResourceManager Log Has Slightly Different Format From AppAttemptID
[ https://issues.apache.org/jira/browse/YARN-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14604503#comment-14604503 ] zhihai xu commented on YARN-3017: - I just found that this change may cause a problem in LogAggregation during a rolling upgrade with supervised NM recovery enabled. The following code in {{AggregatedLogFormat#getPendingLogFilesToUploadForThisContainer}} uploads the logs based on the containerId string, so we may miss uploading the old-format log files after an upgrade.
{code}
File containerLogDir =
    new File(appLogDir, ConverterUtils.toString(this.containerId));
if (!containerLogDir.isDirectory()) {
  continue; // ContainerDir may have been deleted by the user.
}
pendingUploadFiles
    .addAll(getPendingLogFilesToUpload(containerLogDir));
{code}
To support this, we would also need to change {{getPendingLogFilesToUploadForThisContainer}} to compare containerIds using {{ContainerId#fromString}}. It looks like it makes sense to keep the old format for compatibility. ContainerID in ResourceManager Log Has Slightly Different Format From AppAttemptID -- Key: YARN-3017 URL: https://issues.apache.org/jira/browse/YARN-3017 Project: Hadoop YARN Issue Type: Improvement Affects Versions: 2.8.0 Reporter: MUFEED USMAN Assignee: Mohammad Shahid Khan Priority: Minor Labels: PatchAvailable Attachments: YARN-3017.patch, YARN-3017_1.patch, YARN-3017_2.patch, YARN-3017_3.patch Not sure if this should be filed as a bug or not. In the ResourceManager log in the events surrounding the creation of a new application attempt, ... ... 2014-11-14 17:45:37,258 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching masterappattempt_1412150883650_0001_02 ... ... The application attempt has the ID format _1412150883650_0001_02. Whereas the associated ContainerID goes by _1412150883650_0001_02_. ... ... 2014-11-14 17:45:37,260 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Setting up container Container: [ContainerId: container_1412150883650_0001_02_01, NodeId: n67:55933, NodeHttpAddress: n67:8042, Resource: memory:2048, vCores:1, disks:0.0, Priority: 0, Token: Token { kind: ContainerToken, service: 10.10.70.67:55933 }, ] for AM appattempt_1412150883650_0001_02 ... ... Curious to know if this is kept like that for a reason. If not, while using filtering tools to, say, grep events surrounding a specific attempt by the numeric ID part, information may slip out during troubleshooting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
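A minimal sketch of the comparison being suggested above (an illustrative helper, not the AggregatedLogFormat code; it also assumes {{ContainerId#fromString}} can parse both the old and the new rendering):
{code}
import java.io.File;

import org.apache.hadoop.yarn.api.records.ContainerId;

public class ContainerDirMatch {
  // Match a container log directory by parsing its name back into a
  // ContainerId instead of comparing rendered strings, so directories
  // written in the old format still match after a rolling upgrade.
  static boolean isDirForContainer(File candidateDir, ContainerId containerId) {
    try {
      return containerId.equals(ContainerId.fromString(candidateDir.getName()));
    } catch (IllegalArgumentException e) {
      return false; // not a container log directory
    }
  }
}
{code}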