[jira] [Updated] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun C Murthy updated YARN-1696: Target Version/s: 2.4.1 (was: 2.4.0) Document RM HA -- Key: YARN-1696 URL: https://issues.apache.org/jira/browse/YARN-1696 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.3.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: YARN-1696.2.patch, yarn-1696-1.patch Add documentation for RM HA. Marking this a blocker for 2.4 as this is required to call RM HA Stable and ready for public consumption. -- This message was sent by Atlassian JIRA (v6.2#6252)
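For context, the ResourceManager HA setup that this documentation targets comes down to a handful of yarn-site.xml properties. A minimal sketch, assuming two ResourceManagers with ids rm1/rm2 and an external ZooKeeper ensemble; the host names and cluster id below are placeholders, not values from this issue:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RmHaConfigSketch {
  public static Configuration create() {
    // Minimal RM HA configuration, expressed programmatically for illustration;
    // the same keys would normally go into yarn-site.xml.
    Configuration conf = new YarnConfiguration();
    conf.setBoolean("yarn.resourcemanager.ha.enabled", true);
    conf.set("yarn.resourcemanager.cluster-id", "cluster1");              // placeholder
    conf.set("yarn.resourcemanager.ha.rm-ids", "rm1,rm2");
    conf.set("yarn.resourcemanager.hostname.rm1", "master1.example.com"); // placeholder
    conf.set("yarn.resourcemanager.hostname.rm2", "master2.example.com"); // placeholder
    conf.set("yarn.resourcemanager.zk-address", "zk1.example.com:2181");  // placeholder
    return conf;
  }
}
{code}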
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13955015#comment-13955015 ] Arun C Murthy commented on YARN-1696: - [~kasha] - I'm almost done with rc0, moving this to 2.4.1 - if we need to spin rc1 we can get this in. Else, we can manually put this doc on the site when ready for 2.4.0. Thanks. Document RM HA -- Key: YARN-1696 URL: https://issues.apache.org/jira/browse/YARN-1696 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.3.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: YARN-1696.2.patch, yarn-1696-1.patch Add documentation for RM HA. Marking this a blocker for 2.4 as this is required to call RM HA Stable and ready for public consumption. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1879) Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol
[ https://issues.apache.org/jira/browse/YARN-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsuyoshi OZAWA updated YARN-1879: - Attachment: YARN-1879.1.patch Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol --- Key: YARN-1879 URL: https://issues.apache.org/jira/browse/YARN-1879 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jian He Assignee: Tsuyoshi OZAWA Priority: Critical Attachments: YARN-1879.1.patch, YARN-1879.1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1893) Make ApplicationMasterProtocol#allocate AtMostOnce
[ https://issues.apache.org/jira/browse/YARN-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13955097#comment-13955097 ] Hudson commented on YARN-1893: -- FAILURE: Integrated in Hadoop-Yarn-trunk #525 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/525/]) YARN-1893. Mark AtMostOnce annotation to ApplicationMasterProtocol#allocate. Contributed by Xuan Gong. (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1583203) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationMasterProtocol.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestApplicationClientProtocolOnHA.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestApplicationMasterServiceOnHA.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestResourceTrackerOnHA.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java Make ApplicationMasterProtocol#allocate AtMostOnce -- Key: YARN-1893 URL: https://issues.apache.org/jira/browse/YARN-1893 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Xuan Gong Assignee: Xuan Gong Priority: Blocker Fix For: 2.4.0 Attachments: YARN-1893.1.patch, YARN-1893.1.patch, YARN-1893.2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
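For context, the change integrated above is essentially an RPC retry annotation on the protocol method plus HA tests. A rough sketch of what marking allocate() with @AtMostOnce looks like (only one method shown, on an illustrative interface rather than the exact committed one):
{code}
import java.io.IOException;

import org.apache.hadoop.io.retry.AtMostOnce;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest;
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.exceptions.YarnException;

public interface ApplicationMasterProtocolSketch {
  // @AtMostOnce tells the RPC retry layer that the call is not idempotent and
  // must not simply be re-executed after an RM failover; duplicate delivery
  // has to be handled on the server side instead.
  @AtMostOnce
  AllocateResponse allocate(AllocateRequest request)
      throws YarnException, IOException;
}
{code}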
[jira] [Commented] (YARN-1893) Make ApplicationMasterProtocol#allocate AtMostOnce
[ https://issues.apache.org/jira/browse/YARN-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13955190#comment-13955190 ] Hudson commented on YARN-1893: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1717 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1717/]) YARN-1893. Mark AtMostOnce annotation to ApplicationMasterProtocol#allocate. Contributed by Xuan Gong. (jianhe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1583203) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationMasterProtocol.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/ProtocolHATestBase.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestApplicationClientProtocolOnHA.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestApplicationMasterServiceOnHA.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestResourceTrackerOnHA.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java Make ApplicationMasterProtocol#allocate AtMostOnce -- Key: YARN-1893 URL: https://issues.apache.org/jira/browse/YARN-1893 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Xuan Gong Assignee: Xuan Gong Priority: Blocker Fix For: 2.4.0 Attachments: YARN-1893.1.patch, YARN-1893.1.patch, YARN-1893.2.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1696) Document RM HA
[ https://issues.apache.org/jira/browse/YARN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13955331#comment-13955331 ] Karthik Kambatla commented on YARN-1696: [~acmurthy] - sorry, I was not checking email over the weekend. I can get to this today. Was caught up with other things and given there were other blockers, didn't rush on this. Document RM HA -- Key: YARN-1696 URL: https://issues.apache.org/jira/browse/YARN-1696 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Affects Versions: 2.3.0 Reporter: Karthik Kambatla Assignee: Karthik Kambatla Priority: Blocker Attachments: YARN-1696.2.patch, yarn-1696-1.patch Add documentation for RM HA. Marking this a blocker for 2.4 as this is required to call RM HA Stable and ready for public consumption. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (YARN-808) ApplicationReport does not clearly tell that the attempt is running or not
[ https://issues.apache.org/jira/browse/YARN-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong resolved YARN-808. Resolution: Won't Fix Closing this ticket as Won't Fix: we already have APIs/CLI to get the ApplicationAttemptReport, so we do not need to expose it through ApplicationReport. ApplicationReport does not clearly tell that the attempt is running or not -- Key: YARN-808 URL: https://issues.apache.org/jira/browse/YARN-808 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-808.1.patch When an app attempt fails and is being retried, ApplicationReport immediately gives the new attemptId and non-null values of host etc. There is no way for clients to know that the attempt is running other than connecting to it and timing out on an invalid host. A solution would be to expose the attempt state or return a null value for host instead of N/A -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1763) Handle RM failovers during the submitApplication call.
[ https://issues.apache.org/jira/browse/YARN-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13955426#comment-13955426 ] Xuan Gong commented on YARN-1763: - already fixed with YARN-1521 Handle RM failovers during the submitApplication call. -- Key: YARN-1763 URL: https://issues.apache.org/jira/browse/YARN-1763 Project: Hadoop YARN Issue Type: Sub-task Reporter: Xuan Gong Assignee: Xuan Gong -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-808) ApplicationReport does not clearly tell that the attempt is running or not
[ https://issues.apache.org/jira/browse/YARN-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955438#comment-13955438 ] Bikas Saha commented on YARN-808: - We should at least change the app report response to include invalid values for host and port when the host and port are not ready. Currently we return a value of N/A for the host, which is confusing since the non-null string could be a valid host. We should return null for the host and a negative number for the port. ApplicationReport does not clearly tell that the attempt is running or not -- Key: YARN-808 URL: https://issues.apache.org/jira/browse/YARN-808 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-808.1.patch When an app attempt fails and is being retried, ApplicationReport immediately gives the new attemptId and non-null values of host etc. There is no way for clients to know that the attempt is running other than connecting to it and timing out on an invalid host. A solution would be to expose the attempt state or return a null value for host instead of N/A -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (YARN-1892) Excessive logging in RM
[ https://issues.apache.org/jira/browse/YARN-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He reassigned YARN-1892: - Assignee: Jian He Excessive logging in RM --- Key: YARN-1892 URL: https://issues.apache.org/jira/browse/YARN-1892 Project: Hadoop YARN Issue Type: Bug Reporter: Siddharth Seth Assignee: Jian He Priority: Minor Mostly in the CS I believe {code} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: Application application_1395435468498_0011 reserved container container_1395435468498_0011_01_000213 on node host: #containers=5 available=4096 used=20960, currently has 1 at priority 4; currentReservation 4096 {code} {code} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: hive2 usedResources: memory:20480, vCores:5 clusterResources: memory:81920, vCores:16 currentCapacity 0.25 required memory:4096, vCores:1 potentialNewCapacity: 0.255 ( max-capacity: 0.25) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1892) Excessive logging in RM
[ https://issues.apache.org/jira/browse/YARN-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jian He updated YARN-1892: -- Attachment: YARN-1892.1.patch Simple patch to clean up some CapacityScheduler logging. These logs become even more excessive if async scheduling is enabled, which logs every 5ms cycle by default. The patch moves a few redundant logs regarding container reservation to debug level and fixes a few log formats and contents. Excessive logging in RM --- Key: YARN-1892 URL: https://issues.apache.org/jira/browse/YARN-1892 Project: Hadoop YARN Issue Type: Bug Reporter: Siddharth Seth Assignee: Jian He Priority: Minor Attachments: YARN-1892.1.patch Mostly in the CS I believe {code} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: Application application_1395435468498_0011 reserved container container_1395435468498_0011_01_000213 on node host: #containers=5 available=4096 used=20960, currently has 1 at priority 4; currentReservation 4096 {code} {code} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: hive2 usedResources: memory:20480, vCores:5 clusterResources: memory:81920, vCores:16 currentCapacity 0.25 required memory:4096, vCores:1 potentialNewCapacity: 0.255 ( max-capacity: 0.25) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
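The standard way to demote such messages, which is what the patch description above is getting at, is to drop them to debug level and guard them so the message string is not even built on the hot scheduling path. A minimal sketch (hypothetical class, not the actual patch hunk):
{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class ReservationLoggingSketch {
  private static final Log LOG = LogFactory.getLog(ReservationLoggingSketch.class);

  void logReservation(String appId, String containerId, String node) {
    // The isDebugEnabled() guard avoids the string concatenation on every
    // scheduling cycle when debug logging is switched off.
    if (LOG.isDebugEnabled()) {
      LOG.debug("Application " + appId + " reserved container " + containerId
          + " on node " + node);
    }
  }
}
{code}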
[jira] [Updated] (YARN-1870) FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo()
[ https://issues.apache.org/jira/browse/YARN-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated YARN-1870: -- Assignee: Fengdong Yu FileInputStream is not closed in ProcfsBasedProcessTree#constructProcessSMAPInfo() -- Key: YARN-1870 URL: https://issues.apache.org/jira/browse/YARN-1870 Project: Hadoop YARN Issue Type: Bug Reporter: Ted Yu Assignee: Fengdong Yu Priority: Minor Attachments: YARN-1870.patch {code} List<String> lines = IOUtils.readLines(new FileInputStream(file)); {code} FileInputStream is not closed. -- This message was sent by Atlassian JIRA (v6.2#6252)
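The fix itself is mechanical: close the stream whether or not readLines() throws. A sketch of one way to do it, using Java 7 try-with-resources (the attached patch may instead use try/finally or IOUtils.closeQuietly; the wrapper class here is made up for illustration):
{code}
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.List;

import org.apache.commons.io.IOUtils;

public class SmapsReaderSketch {
  static List<String> readSmapLines(File file) throws IOException {
    // try-with-resources guarantees the FileInputStream is closed even when
    // IOUtils.readLines throws, which is the leak the issue describes.
    try (FileInputStream in = new FileInputStream(file)) {
      return IOUtils.readLines(in);
    }
  }
}
{code}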
[jira] [Commented] (YARN-808) ApplicationReport does not clearly tell that the attempt is running or not
[ https://issues.apache.org/jira/browse/YARN-808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955513#comment-13955513 ] Zhijie Shen commented on YARN-808: -- IMHO, it's a different issue. The N/A string is not restricted to the host field; it also appears in diagnostics, tracking URL, etc., and not only in ApplicationReport but also in ApplicationAttemptReport and ContainerReport. I think the right thing is to decouple the data from the display. These string fields should stay null or empty so that callers who fetch the reports programmatically can easily validate them, while the web UI and CLI, which are the actual consumers of the reports, should check whether these fields are null or empty and display N/A when necessary. ApplicationReport does not clearly tell that the attempt is running or not -- Key: YARN-808 URL: https://issues.apache.org/jira/browse/YARN-808 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Bikas Saha Assignee: Xuan Gong Attachments: YARN-808.1.patch When an app attempt fails and is being retried, ApplicationReport immediately gives the new attemptId and non-null values of host etc. There is no way for clients to know that the attempt is running other than connecting to it and timing out on an invalid host. A solution would be to expose the attempt state or return a null value for host instead of N/A -- This message was sent by Atlassian JIRA (v6.2#6252)
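A small sketch of the separation described above: keep the report field null or empty in the record itself, and let only the CLI/web layer substitute the placeholder. The helper name here is illustrative, not from any patch on this issue:
{code}
public final class ReportDisplaySketch {
  private static final String NOT_AVAILABLE = "N/A";

  // Programmatic consumers read the raw field and can test for null/empty;
  // only the display layer (CLI, web UI) renders the N/A placeholder.
  public static String forDisplay(String field) {
    return (field == null || field.isEmpty()) ? NOT_AVAILABLE : field;
  }
}
{code}
A CLI printer would then emit forDisplay(report.getHost()) rather than having the RM store the literal string N/A in the report.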
[jira] [Commented] (YARN-1892) Excessive logging in RM
[ https://issues.apache.org/jira/browse/YARN-1892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13955544#comment-13955544 ] Hadoop QA commented on YARN-1892: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12637891/YARN-1892.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3492//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3492//console This message is automatically generated. Excessive logging in RM --- Key: YARN-1892 URL: https://issues.apache.org/jira/browse/YARN-1892 Project: Hadoop YARN Issue Type: Bug Reporter: Siddharth Seth Assignee: Jian He Priority: Minor Attachments: YARN-1892.1.patch Mostly in the CS I believe {code} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: Application application_1395435468498_0011 reserved container container_1395435468498_0011_01_000213 on node host: #containers=5 available=4096 used=20960, currently has 1 at priority 4; currentReservation 4096 {code} {code} INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: hive2 usedResources: memory:20480, vCores:5 clusterResources: memory:81920, vCores:16 currentCapacity 0.25 required memory:4096, vCores:1 potentialNewCapacity: 0.255 ( max-capacity: 0.25) {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.
[ https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated YARN-221: - Attachment: YARN-221-trunk-v2.patch Here is the patch to support log aggregation sampling at the YARN layer. YARN applications can choose to override the default behavior. Without any change at the MR layer to specify a per-container log aggregation policy, the cluster-level YARN log aggregation sampling policy will be applied. NM should provide a way for AM to tell it not to aggregate logs. Key: YARN-221 URL: https://issues.apache.org/jira/browse/YARN-221 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Robert Joseph Evans Assignee: Chris Trezzo Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch The NodeManager should provide a way for an AM to tell it that either the logs should not be aggregated, that they should be aggregated with a high priority, or that they should be aggregated but with a lower priority. The AM should be able to do this in the ContainerLaunch context to provide a default value, but should also be able to update the value when the container is released. This would allow for the NM to not aggregate logs in some cases, and avoid connection to the NN at all. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-85) Allow per job log aggregation configuration
[ https://issues.apache.org/jira/browse/YARN-85?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955601#comment-13955601 ] Ming Ma commented on YARN-85: - Regarding Seth's comment that the container exit status is not necessarily an indication of whether a task completed successfully, https://issues.apache.org/jira/browse/MAPREDUCE-5465 should fix that issue. Allow per job log aggregation configuration --- Key: YARN-85 URL: https://issues.apache.org/jira/browse/YARN-85 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Siddharth Seth Assignee: Chris Trezzo Priority: Critical Currently, if log aggregation is enabled for a cluster - logs for all jobs will be aggregated - leading to a whole bunch of files on hdfs which users may not want. Users should be able to control this along with the aggregation policy - failed only, all, etc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-221) NM should provide a way for AM to tell it not to aggregate logs.
[ https://issues.apache.org/jira/browse/YARN-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13955678#comment-13955678 ] Hadoop QA commented on YARN-221: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12637905/YARN-221-trunk-v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3493//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3493//console This message is automatically generated. NM should provide a way for AM to tell it not to aggregate logs. Key: YARN-221 URL: https://issues.apache.org/jira/browse/YARN-221 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Robert Joseph Evans Assignee: Chris Trezzo Attachments: YARN-221-trunk-v1.patch, YARN-221-trunk-v2.patch The NodeManager should provide a way for an AM to tell it that either the logs should not be aggregated, that they should be aggregated with a high priority, or that they should be aggregated but with a lower priority. The AM should be able to do this in the ContainerLaunch context to provide a default value, but should also be able to update the value when the container is released. This would allow for the NM to not aggregate logs in some cases, and avoid connection to the NN at all. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator
[ https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955762#comment-13955762 ] Sandy Ryza commented on YARN-1889: -- +1 avoid creating new objects on each fair scheduler call to AppSchedulable comparator --- Key: YARN-1889 URL: https://issues.apache.org/jira/browse/YARN-1889 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Reporter: Hong Zhiguo Priority: Minor Labels: reviewed Attachments: YARN-1889.patch, YARN-1889.patch In fair scheduler, in each scheduling attempt, a full sort is performed on List of AppSchedulable, which invokes Comparator.compare method many times. Both FairShareComparator and DRFComparator call AppSchedulable.getWeights, and AppSchedulable.getPriority. A new ResourceWeights object is allocated on each call of getWeights, and the same for getPriority. This introduces a lot of pressure to GC because these methods are called very very frequently. Below test case shows improvement on performance and GC behaviour. The results show that the GC pressure during processing NodeUpdate is reduced by half by this patch. The code to show the improvement: (Add it to TestFairScheduler.java) import java.lang.management.GarbageCollectorMXBean; import java.lang.management.ManagementFactory; public void printGCStats() { long totalGarbageCollections = 0; long garbageCollectionTime = 0; for(GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) { long count = gc.getCollectionCount(); if(count >= 0) { totalGarbageCollections += count; } long time = gc.getCollectionTime(); if(time >= 0) { garbageCollectionTime += time; } } System.out.println("Total Garbage Collections: " + totalGarbageCollections); System.out.println("Total Garbage Collection Time (ms): " + garbageCollectionTime); } @Test public void testImpactOnGC() throws Exception { scheduler.reinitialize(conf, resourceManager.getRMContext()); // Add nodes int numNode = 1; for (int i = 0; i < numNode; ++i) { String host = String.format("192.1.%d.%d", i/256, i%256); RMNode node = MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), i, host); NodeAddedSchedulerEvent nodeEvent = new NodeAddedSchedulerEvent(node); scheduler.handle(nodeEvent); assertEquals(1024 * 64 * (i+1), scheduler.getClusterCapacity().getMemory()); } assertEquals(numNode, scheduler.getNumClusterNodes()); assertEquals(1024 * 64 * numNode, scheduler.getClusterCapacity().getMemory()); // add apps, each app has 100 containers. 
int minReqSize = FairSchedulerConfiguration.DEFAULT_RM_SCHEDULER_INCREMENT_ALLOCATION_MB; int numApp = 8000; int priority = 1; for (int i = 1; i < numApp + 1; ++i) { ApplicationAttemptId attemptId = createAppAttemptId(i, 1); AppAddedSchedulerEvent appAddedEvent = new AppAddedSchedulerEvent( attemptId.getApplicationId(), "queue1", "user1"); scheduler.handle(appAddedEvent); AppAttemptAddedSchedulerEvent attemptAddedEvent = new AppAttemptAddedSchedulerEvent(attemptId, false); scheduler.handle(attemptAddedEvent); createSchedulingRequestExistingApplication(minReqSize * 2, 1, priority, attemptId); } scheduler.update(); assertEquals(numApp, scheduler.getQueueManager().getLeafQueue("queue1", true) .getRunnableAppSchedulables().size()); System.out.println("GC stats before NodeUpdate processing:"); printGCStats(); int hb_num = 5000; long start = System.nanoTime(); for (int i = 0; i < hb_num; ++i) { String host = String.format("192.1.%d.%d", i/256, i%256); RMNode node = MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), 5000, host); NodeUpdateSchedulerEvent nodeEvent = new NodeUpdateSchedulerEvent(node); scheduler.handle(nodeEvent); } long end = System.nanoTime(); System.out.printf("processing time for a NodeUpdate in average: %d us\n", (end - start)/(hb_num * 1000)); System.out.println("GC stats after NodeUpdate processing:"); printGCStats(); } -- This message was sent by Atlassian JIRA (v6.2#6252)
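The eventual fix (see the YARN-1889 commit later in this digest) follows the obvious remedy for the allocation pattern described above: stop creating a fresh ResourceWeights on every getWeights() call and reuse a single instance. A self-contained sketch of the idea, with stand-in classes rather than the real scheduler types:
{code}
// Stand-in for the real ResourceWeights; included only so the sketch compiles.
class ResourceWeightsSketch {
  float weight;
}

class AppSchedulableSketch {
  // One mutable instance reused across comparator calls, instead of a new
  // object per call, which is what was feeding the garbage collector.
  private final ResourceWeightsSketch cachedWeights = new ResourceWeightsSketch();

  ResourceWeightsSketch getWeights() {
    cachedWeights.weight = computeWeight();
    return cachedWeights;
  }

  private float computeWeight() {
    return 1.0f; // placeholder; the real value derives from app priority/demand
  }
}
{code}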
[jira] [Updated] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator
[ https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1889: - Assignee: Hong Zhiguo avoid creating new objects on each fair scheduler call to AppSchedulable comparator --- Key: YARN-1889 URL: https://issues.apache.org/jira/browse/YARN-1889 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Reporter: Hong Zhiguo Assignee: Hong Zhiguo Priority: Minor Labels: reviewed Attachments: YARN-1889.patch, YARN-1889.patch In fair scheduler, in each scheduling attempt, a full sort is performed on List of AppSchedulable, which invokes Comparator.compare method many times. Both FairShareComparator and DRFComparator call AppSchedulable.getWeights, and AppSchedulable.getPriority. A new ResourceWeights object is allocated on each call of getWeights, and the same for getPriority. This introduces a lot of pressure to GC because these methods are called very very frequently. Below test case shows improvement on performance and GC behaviour. The results show that the GC pressure during processing NodeUpdate is reduced by half by this patch. The code to show the improvement: (Add it to TestFairScheduler.java) import java.lang.management.GarbageCollectorMXBean; import java.lang.management.ManagementFactory; public void printGCStats() { long totalGarbageCollections = 0; long garbageCollectionTime = 0; for(GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) { long count = gc.getCollectionCount(); if(count >= 0) { totalGarbageCollections += count; } long time = gc.getCollectionTime(); if(time >= 0) { garbageCollectionTime += time; } } System.out.println("Total Garbage Collections: " + totalGarbageCollections); System.out.println("Total Garbage Collection Time (ms): " + garbageCollectionTime); } @Test public void testImpactOnGC() throws Exception { scheduler.reinitialize(conf, resourceManager.getRMContext()); // Add nodes int numNode = 1; for (int i = 0; i < numNode; ++i) { String host = String.format("192.1.%d.%d", i/256, i%256); RMNode node = MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), i, host); NodeAddedSchedulerEvent nodeEvent = new NodeAddedSchedulerEvent(node); scheduler.handle(nodeEvent); assertEquals(1024 * 64 * (i+1), scheduler.getClusterCapacity().getMemory()); } assertEquals(numNode, scheduler.getNumClusterNodes()); assertEquals(1024 * 64 * numNode, scheduler.getClusterCapacity().getMemory()); // add apps, each app has 100 containers. 
int minReqSize = FairSchedulerConfiguration.DEFAULT_RM_SCHEDULER_INCREMENT_ALLOCATION_MB; int numApp = 8000; int priority = 1; for (int i = 1; i < numApp + 1; ++i) { ApplicationAttemptId attemptId = createAppAttemptId(i, 1); AppAddedSchedulerEvent appAddedEvent = new AppAddedSchedulerEvent( attemptId.getApplicationId(), "queue1", "user1"); scheduler.handle(appAddedEvent); AppAttemptAddedSchedulerEvent attemptAddedEvent = new AppAttemptAddedSchedulerEvent(attemptId, false); scheduler.handle(attemptAddedEvent); createSchedulingRequestExistingApplication(minReqSize * 2, 1, priority, attemptId); } scheduler.update(); assertEquals(numApp, scheduler.getQueueManager().getLeafQueue("queue1", true) .getRunnableAppSchedulables().size()); System.out.println("GC stats before NodeUpdate processing:"); printGCStats(); int hb_num = 5000; long start = System.nanoTime(); for (int i = 0; i < hb_num; ++i) { String host = String.format("192.1.%d.%d", i/256, i%256); RMNode node = MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), 5000, host); NodeUpdateSchedulerEvent nodeEvent = new NodeUpdateSchedulerEvent(node); scheduler.handle(nodeEvent); } long end = System.nanoTime(); System.out.printf("processing time for a NodeUpdate in average: %d us\n", (end - start)/(hb_num * 1000)); System.out.println("GC stats after NodeUpdate processing:"); printGCStats(); } -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1889) avoid creating new objects on each fair scheduler call to AppSchedulable comparator
[ https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-1889: - Description: In fair scheduler, in each scheduling attempt, a full sort is performed on List of AppSchedulable, which invokes Comparator.compare method many times. Both FairShareComparator and DRFComparator call AppSchedulable.getWeights, and AppSchedulable.getPriority. A new ResourceWeights object is allocated on each call of getWeights, and the same for getPriority. This introduces a lot of pressure to GC because these methods are called very very frequently. Below test case shows improvement on performance and GC behaviour. The results show that the GC pressure during processing NodeUpdate is reduced by half by this patch. The code to show the improvement: (Add it to TestFairScheduler.java) {code} import java.lang.management.GarbageCollectorMXBean; import java.lang.management.ManagementFactory; public void printGCStats() { long totalGarbageCollections = 0; long garbageCollectionTime = 0; for(GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) { long count = gc.getCollectionCount(); if(count >= 0) { totalGarbageCollections += count; } long time = gc.getCollectionTime(); if(time >= 0) { garbageCollectionTime += time; } } System.out.println("Total Garbage Collections: " + totalGarbageCollections); System.out.println("Total Garbage Collection Time (ms): " + garbageCollectionTime); } @Test public void testImpactOnGC() throws Exception { scheduler.reinitialize(conf, resourceManager.getRMContext()); // Add nodes int numNode = 1; for (int i = 0; i < numNode; ++i) { String host = String.format("192.1.%d.%d", i/256, i%256); RMNode node = MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), i, host); NodeAddedSchedulerEvent nodeEvent = new NodeAddedSchedulerEvent(node); scheduler.handle(nodeEvent); assertEquals(1024 * 64 * (i+1), scheduler.getClusterCapacity().getMemory()); } assertEquals(numNode, scheduler.getNumClusterNodes()); assertEquals(1024 * 64 * numNode, scheduler.getClusterCapacity().getMemory()); // add apps, each app has 100 containers. 
int minReqSize = FairSchedulerConfiguration.DEFAULT_RM_SCHEDULER_INCREMENT_ALLOCATION_MB; int numApp = 8000; int priority = 1; for (int i = 1; i < numApp + 1; ++i) { ApplicationAttemptId attemptId = createAppAttemptId(i, 1); AppAddedSchedulerEvent appAddedEvent = new AppAddedSchedulerEvent( attemptId.getApplicationId(), "queue1", "user1"); scheduler.handle(appAddedEvent); AppAttemptAddedSchedulerEvent attemptAddedEvent = new AppAttemptAddedSchedulerEvent(attemptId, false); scheduler.handle(attemptAddedEvent); createSchedulingRequestExistingApplication(minReqSize * 2, 1, priority, attemptId); } scheduler.update(); assertEquals(numApp, scheduler.getQueueManager().getLeafQueue("queue1", true) .getRunnableAppSchedulables().size()); System.out.println("GC stats before NodeUpdate processing:"); printGCStats(); int hb_num = 5000; long start = System.nanoTime(); for (int i = 0; i < hb_num; ++i) { String host = String.format("192.1.%d.%d", i/256, i%256); RMNode node = MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), 5000, host); NodeUpdateSchedulerEvent nodeEvent = new NodeUpdateSchedulerEvent(node); scheduler.handle(nodeEvent); } long end = System.nanoTime(); System.out.printf("processing time for a NodeUpdate in average: %d us\n", (end - start)/(hb_num * 1000)); System.out.println("GC stats after NodeUpdate processing:"); printGCStats(); } {code} was: In fair scheduler, in each scheduling attempt, a full sort is performed on List of AppSchedulable, which invokes Comparator.compare method many times. Both FairShareComparator and DRFComparator call AppSchedulable.getWeights, and AppSchedulable.getPriority. A new ResourceWeights object is allocated on each call of getWeights, and the same for getPriority. This introduces a lot of pressure to GC because these methods are called very very frequently. Below test case shows improvement on performance and GC behaviour. The results show that the GC pressure during processing NodeUpdate is reduced by half by this patch. The code to show the improvement: (Add it to TestFairScheduler.java) import java.lang.management.GarbageCollectorMXBean; import java.lang.management.ManagementFactory; public void printGCStats() { long totalGarbageCollections = 0; long garbageCollectionTime = 0; for(GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) { long
[jira] [Commented] (YARN-1879) Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol
[ https://issues.apache.org/jira/browse/YARN-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955817#comment-13955817 ] Tsuyoshi OZAWA commented on YARN-1879: -- I'm adding RetryCache support and tests to registerApplicationMaster()/unregisterApplicationMaster(). Please let me know if you have a more appropriate idea. Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol --- Key: YARN-1879 URL: https://issues.apache.org/jira/browse/YARN-1879 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jian He Assignee: Tsuyoshi OZAWA Priority: Critical Attachments: YARN-1879.1.patch, YARN-1879.1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
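For readers unfamiliar with the RetryCache pattern mentioned here, the server-side shape (modeled on how the HDFS NameNode uses org.apache.hadoop.ipc.RetryCache) looks roughly like the following. This is a sketch of the general pattern, not the YARN-1879 patch; the cache name, numeric values, and helper methods are placeholders:
{code}
import org.apache.hadoop.ipc.RetryCache;
import org.apache.hadoop.ipc.RetryCache.CacheEntry;

public class RetryCachePatternSketch {
  // cache name, heap percentage and entry expiry below are illustrative values
  private final RetryCache retryCache =
      new RetryCache("ApplicationMasterService", 0.03f, 600000);

  Object handleNonIdempotentCall() {
    // If this callId already completed (e.g. the AM retried after an RM
    // failover), short-circuit instead of re-executing the operation.
    CacheEntry cacheEntry = RetryCache.waitForCompletion(retryCache);
    if (cacheEntry != null && cacheEntry.isSuccess()) {
      return previousResult(); // real code would use CacheEntryWithPayload
    }
    boolean success = false;
    try {
      Object result = doTheRealWork();
      success = true;
      return result;
    } finally {
      RetryCache.setState(cacheEntry, success);
    }
  }

  private Object previousResult() { return null; }        // placeholder
  private Object doTheRealWork() { return new Object(); } // placeholder
}
{code}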
[jira] [Resolved] (YARN-904) Enable multiple QOP for ResourceManager
[ https://issues.apache.org/jira/browse/YARN-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benoy Antony resolved YARN-904. --- Resolution: Duplicate Resolved via HDFS-5910 and HADOOP-10221. Enable multiple QOP for ResourceManager --- Key: YARN-904 URL: https://issues.apache.org/jira/browse/YARN-904 Project: Hadoop YARN Issue Type: Improvement Reporter: Benoy Antony Attachments: yarn-904.patch Currently ResourceManager supports only a single QOP. The feature makes ResourceManager listen on two ports for RPC. One RPC port supports only authentication, the other RPC port supports privacy. Please see HADOOP-9709 for general requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1889) In Fair Scheduler, avoid creating objects on each call to AppSchedulable comparator
[ https://issues.apache.org/jira/browse/YARN-1889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955912#comment-13955912 ] Hudson commented on YARN-1889: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5440 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5440/]) YARN-1889. In Fair Scheduler, avoid creating objects on each call to AppSchedulable comparator (Hong Zhiguo via Sandy Ryza) (sandy: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1583491) * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/resource/ResourceWeights.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/AppSchedulable.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java In Fair Scheduler, avoid creating objects on each call to AppSchedulable comparator --- Key: YARN-1889 URL: https://issues.apache.org/jira/browse/YARN-1889 Project: Hadoop YARN Issue Type: Improvement Components: scheduler Reporter: Hong Zhiguo Assignee: Hong Zhiguo Priority: Minor Labels: reviewed Fix For: 2.5.0 Attachments: YARN-1889.patch, YARN-1889.patch In fair scheduler, in each scheduling attempt, a full sort is performed on List of AppSchedulable, which invokes Comparator.compare method many times. Both FairShareComparator and DRFComparator call AppSchedulable.getWeights, and AppSchedulable.getPriority. A new ResourceWeights object is allocated on each call of getWeights, and the same for getPriority. This introduces a lot of pressure to GC because these methods are called very very frequently. Below test case shows improvement on performance and GC behaviour. The results show that the GC pressure during processing NodeUpdate is reduced by half by this patch. The code to show the improvement: (Add it to TestFairScheduler.java) {code} import java.lang.management.GarbageCollectorMXBean; import java.lang.management.ManagementFactory; public void printGCStats() { long totalGarbageCollections = 0; long garbageCollectionTime = 0; for(GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) { long count = gc.getCollectionCount(); if(count >= 0) { totalGarbageCollections += count; } long time = gc.getCollectionTime(); if(time >= 0) { garbageCollectionTime += time; } } System.out.println("Total Garbage Collections: " + totalGarbageCollections); System.out.println("Total Garbage Collection Time (ms): " + garbageCollectionTime); } @Test public void testImpactOnGC() throws Exception { scheduler.reinitialize(conf, resourceManager.getRMContext()); // Add nodes int numNode = 1; for (int i = 0; i < numNode; ++i) { String host = String.format("192.1.%d.%d", i/256, i%256); RMNode node = MockNodes.newNodeInfo(1, Resources.createResource(1024 * 64), i, host); NodeAddedSchedulerEvent nodeEvent = new NodeAddedSchedulerEvent(node); scheduler.handle(nodeEvent); assertEquals(1024 * 64 * (i+1), scheduler.getClusterCapacity().getMemory()); } assertEquals(numNode, scheduler.getNumClusterNodes()); assertEquals(1024 * 64 * numNode, scheduler.getClusterCapacity().getMemory()); // add apps, each app has 100 containers. 
int minReqSize = FairSchedulerConfiguration.DEFAULT_RM_SCHEDULER_INCREMENT_ALLOCATION_MB; int numApp = 8000; int priority = 1; for (int i = 1; i < numApp + 1; ++i) { ApplicationAttemptId attemptId = createAppAttemptId(i, 1); AppAddedSchedulerEvent appAddedEvent = new AppAddedSchedulerEvent( attemptId.getApplicationId(), "queue1", "user1"); scheduler.handle(appAddedEvent); AppAttemptAddedSchedulerEvent attemptAddedEvent = new AppAttemptAddedSchedulerEvent(attemptId, false); scheduler.handle(attemptAddedEvent); createSchedulingRequestExistingApplication(minReqSize * 2, 1, priority, attemptId); } scheduler.update(); assertEquals(numApp, scheduler.getQueueManager().getLeafQueue("queue1", true) .getRunnableAppSchedulables().size()); System.out.println("GC stats before NodeUpdate processing:");
[jira] [Commented] (YARN-1879) Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol
[ https://issues.apache.org/jira/browse/YARN-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13955925#comment-13955925 ] Jian He commented on YARN-1879: --- +1 with the RetryCache approach Mark Idempotent/AtMostOnce annotations to ApplicationMasterProtocol --- Key: YARN-1879 URL: https://issues.apache.org/jira/browse/YARN-1879 Project: Hadoop YARN Issue Type: Sub-task Components: resourcemanager Reporter: Jian He Assignee: Tsuyoshi OZAWA Priority: Critical Attachments: YARN-1879.1.patch, YARN-1879.1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (YARN-1896) For FairScheduler expose MinimumQueueResource of each queue in QueueMetrics
Siqi Li created YARN-1896: - Summary: For FairScheduler expose MinimumQueueResource of each queue in QueueMetrics Key: YARN-1896 URL: https://issues.apache.org/jira/browse/YARN-1896 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (YARN-1896) For FairScheduler expose MinimumQueueResource of each queue in QueueMetrics
[ https://issues.apache.org/jira/browse/YARN-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siqi Li updated YARN-1896: -- Attachment: YARN-1896.v1.patch For FairScheduler expose MinimumQueueResource of each queue in QueueMetrics -- Key: YARN-1896 URL: https://issues.apache.org/jira/browse/YARN-1896 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Attachments: YARN-1896.v1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
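What the issue asks for is usually exposed as a gauge on the queue's metrics source. A hedged sketch of the general shape such a metric could take with the hadoop metrics2 library (class and metric names here are illustrative, not from the attached patch, and the annotated fields are only populated once the source is registered with the MetricsSystem):
{code}
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.MutableGaugeInt;

@Metrics(context = "yarn")
public class FSQueueMetricsSketch {
  @Metric("Minimum share of memory in MB") MutableGaugeInt minShareMB;
  @Metric("Minimum share of CPU in vcores") MutableGaugeInt minShareVCores;

  // Called whenever the queue's configured minimum resources change, so the
  // value shows up alongside the other QueueMetrics.
  public void setMinShare(int memoryMB, int vcores) {
    minShareMB.set(memoryMB);
    minShareVCores.set(vcores);
  }
}
{code}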
[jira] [Commented] (YARN-1896) For FairScheduler expose MinimumQueueResource of each queu in QueueMetrics
[ https://issues.apache.org/jira/browse/YARN-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13955952#comment-13955952 ] Hadoop QA commented on YARN-1896: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12637959/YARN-1896.v1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3494//console This message is automatically generated. For FairScheduler expose MinimumQueueResource of each queu in QueueMetrics -- Key: YARN-1896 URL: https://issues.apache.org/jira/browse/YARN-1896 Project: Hadoop YARN Issue Type: Bug Reporter: Siqi Li Attachments: YARN-1896.v1.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-596) In fair scheduler, intra-application container priorities affect inter-application preemption decisions
[ https://issues.apache.org/jira/browse/YARN-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955960#comment-13955960 ] Sandy Ryza commented on YARN-596: - Thanks. The patch is looking almost done. I like that you replaced the O(n log n) sort call in preemptContainer with an O(n) iteration. Just a few more nits: {code} + LOG.debug("Queue " + getName() + " is going to preempt a container " + + " from its childQueues."); {code} This doesn't make sense in FSLeafQueue, which can't have child queues. {code} +// Let the selected queue to preempt +if (candidateQueue != null) { + toBePreempted = candidateQueue.preemptContainer(); +} {code} Did you mean "Let the selected queue choose which of its containers to preempt"? For preemptContainerPreCheck, it would be good to take multiple resources into account (using the DefaultResourceCalculator will only apply to memory). Resources.fitsIn(getResourceUsage(), getFairShare()) can be used to determine whether a Schedulable is safe from preemption. Lastly, can you add a test that makes sure that containers from apps that are further over their fair share get preempted first, even when containers from other apps that are over their fair share have lower priorities? In fair scheduler, intra-application container priorities affect inter-application preemption decisions --- Key: YARN-596 URL: https://issues.apache.org/jira/browse/YARN-596 Project: Hadoop YARN Issue Type: Bug Components: scheduler Affects Versions: 2.0.3-alpha Reporter: Sandy Ryza Assignee: Sandy Ryza Attachments: YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch, YARN-596.patch In the fair scheduler, containers are chosen for preemption in the following way: All containers for all apps that are in queues that are over their fair share are put in a list. The list is sorted in order of the priority that the container was requested in. This means that an application can shield itself from preemption by requesting its containers at higher priorities, which doesn't really make sense. Also, an application that is not over its fair share, but that is in a queue that is over its fair share is just as likely to have containers preempted as an application that is over its fair share. -- This message was sent by Atlassian JIRA (v6.2#6252)
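The multi-resource pre-check suggested above would look roughly like the following (placement and naming are guesses; the point is only that Resources.fitsIn compares every resource dimension rather than just memory):
{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

public class PreemptionPreCheckSketch {
  // A Schedulable whose total usage still fits within its fair share is safe
  // from preemption; only schedulables over their share may lose containers.
  static boolean mayPreemptFrom(Resource usage, Resource fairShare) {
    return !Resources.fitsIn(usage, fairShare);
  }
}
{code}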
[jira] [Created] (YARN-1897) Define SignalContainerRequest and SignalContainerResponse
Ming Ma created YARN-1897: - Summary: Define SignalContainerRequest and SignalContainerResponse Key: YARN-1897 URL: https://issues.apache.org/jira/browse/YARN-1897 Project: Hadoop YARN Issue Type: Sub-task Components: api Reporter: Ming Ma We need to define SignalContainerRequest and SignalContainerResponse first as they are needed by other sub tasks. SignalContainerRequest should use OS-independent commands and provide a way to application to specify reason for diagnosis. SignalContainerResponse might be empty. -- This message was sent by Atlassian JIRA (v6.2#6252)
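Since this sub-task only defines the records, here is a purely hypothetical sketch of what an OS-independent request/response pair could look like; the enum values and field names are invented for illustration and will be superseded by whatever the patch actually defines:
{code}
// Hypothetical shapes only; not actual YARN API.
enum SignalContainerCommandSketch {
  OUTPUT_THREAD_DUMP,  // e.g. SIGQUIT on Linux
  GRACEFUL_SHUTDOWN,   // e.g. SIGTERM
  FORCEFUL_SHUTDOWN    // e.g. SIGKILL
}

class SignalContainerRequestSketch {
  private String containerId;
  private SignalContainerCommandSketch command;
  private String diagnostics; // the application's stated reason, kept for diagnosis
  // getters/setters omitted in this sketch
}

class SignalContainerResponseSketch {
  // intentionally empty, per the issue description
}
{code}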
[jira] [Updated] (YARN-1726) ResourceSchedulerWrapper failed due to the AbstractYarnScheduler introduced in YARN-1041
[ https://issues.apache.org/jira/browse/YARN-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Yan updated YARN-1726: -- Attachment: YARN-1726.patch A new patch: (1) Fixes the problem caused by AbstractYarnScheduler. (2) Adds test cases for AMSimulator and NMSimulator. (3) Updates TestSLSRunner to catch possible exceptions from child threads during the run; the old test case could not catch the child threads' exceptions. ResourceSchedulerWrapper failed due to the AbstractYarnScheduler introduced in YARN-1041 Key: YARN-1726 URL: https://issues.apache.org/jira/browse/YARN-1726 Project: Hadoop YARN Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan Priority: Minor Attachments: YARN-1726.patch, YARN-1726.patch The YARN scheduler simulator failed when running Fair Scheduler, due to AbstractYarnScheduler introduced in YARN-1041. The ResourceSchedulerWrapper should inherit AbstractYarnScheduler, instead of implementing ResourceScheduler interface directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1726) ResourceSchedulerWrapper failed due to the AbstractYarnScheduler introduced in YARN-1041
[ https://issues.apache.org/jira/browse/YARN-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13956107#comment-13956107 ] Hadoop QA commented on YARN-1726: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12637996/YARN-1726.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-tools/hadoop-sls: org.apache.hadoop.yarn.sls.appmaster.TestAMSimulator org.apache.hadoop.yarn.sls.TestSLSRunner {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/3495//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/3495//console This message is automatically generated. ResourceSchedulerWrapper failed due to the AbstractYarnScheduler introduced in YARN-1041 Key: YARN-1726 URL: https://issues.apache.org/jira/browse/YARN-1726 Project: Hadoop YARN Issue Type: Bug Reporter: Wei Yan Assignee: Wei Yan Priority: Minor Attachments: YARN-1726.patch, YARN-1726.patch The YARN scheduler simulator failed when running Fair Scheduler, due to AbstractYarnScheduler introduced in YARN-1041. The ResourceSchedulerWrapper should inherit AbstractYarnScheduler, instead of implementing ResourceScheduler interface directly. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-1872) TestDistributedShell occasionally fails in trunk
[ https://issues.apache.org/jira/browse/YARN-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13956122#comment-13956122 ] Hong Zhiguo commented on YARN-1872: --- I met the timeout too. But I can't reproduce it. Could you reproduce the timeout? Can you attach the TestDistributedShell-output.txt under surefire-reports? TestDistributedShell occasionally fails in trunk Key: YARN-1872 URL: https://issues.apache.org/jira/browse/YARN-1872 Project: Hadoop YARN Issue Type: Test Reporter: Ted Yu Attachments: TestDistributedShell.out From https://builds.apache.org/job/Hadoop-Yarn-trunk/520/console : TestDistributedShell#testDSShellWithCustomLogPropertyFile failed and TestDistributedShell#testDSShell timed out. -- This message was sent by Atlassian JIRA (v6.2#6252)