[jira] [Commented] (YARN-3573) MiniMRYarnCluster constructor that starts the timeline server using a boolean should be marked deprecated
[ https://issues.apache.org/jira/browse/YARN-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975798#comment-14975798 ] Hudson commented on YARN-3573: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #601 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/601/]) YARN-3573. MiniMRYarnCluster constructor that starts the timeline server (ozawa: rev 96677bef00b03057038157efeb3c2ad4702914da) * hadoop-yarn-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java > MiniMRYarnCluster constructor that starts the timeline server using a boolean > should be marked deprecated > - > > Key: YARN-3573 > URL: https://issues.apache.org/jira/browse/YARN-3573 > Project: Hadoop YARN > Issue Type: Test > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Brahma Reddy Battula > Fix For: 2.8.0 > > Attachments: YARN-3573-002.patch, YARN-3573.patch > > > {code}MiniMRYarnCluster(String testName, int noOfNMs, boolean enableAHS){code} > starts the timeline server using *boolean enableAHS*. It is better to have > the timelineserver started based on the config value. > We should mark this constructor as deprecated to avoid its future use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
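A minimal sketch of the deprecation pattern YARN-3573 asks for: the boolean constructor is marked @Deprecated and the timeline-server decision moves to a config value. Class, field, and method names here are simplified stand-ins, not the actual Hadoop source; only the config key yarn.timeline-service.enabled comes from YARN.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the MiniMRYarnCluster constructors discussed in
// YARN-3573; illustrative only, not the actual Hadoop source.
class MiniClusterSketch {
    static final String TIMELINE_ENABLED = "yarn.timeline-service.enabled";

    final boolean timelineServerStarted;

    // Preferred: whether the timeline server starts is read from the config.
    MiniClusterSketch(String testName, int numNodeManagers,
                      Map<String, String> conf) {
        this.timelineServerStarted =
            Boolean.parseBoolean(conf.getOrDefault(TIMELINE_ENABLED, "false"));
    }

    /**
     * @deprecated set yarn.timeline-service.enabled in the configuration
     * and use {@link #MiniClusterSketch(String, int, Map)} instead.
     */
    @Deprecated
    MiniClusterSketch(String testName, int numNodeManagers,
                      Map<String, String> conf, boolean enableAHS) {
        // The boolean silently overrides the config -- the behavior the
        // JIRA wants to retire.
        this.timelineServerStarted = enableAHS;
    }
}
```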
[jira] [Updated] (YARN-4302) SLS not able start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4302: --- Attachment: 0001-YARN-4302.patch This is an impact of YARN-4285. Attaching a patch for the same; please review. > SLS not able start due to NPE in > SchedulerApplicationAttempt#getResourceUsageReport > --- > > Key: YARN-4302 > URL: https://issues.apache.org/jira/browse/YARN-4302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: 0001-YARN-4302.patch > > > Configure the samples from tools/sls > yarn-site.xml > capacityscheduler.xml > sls-runner.xml > to /etc/hadoop > Start sls using > > bin/slsrun.sh --input-rumen=sample-data/2jobs2min-rumen-jh.json > --output-dir=out > {noformat} > 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling > event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 > java.lang.NullPointerException > at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) > at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) > at > org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) > at > 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:839) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:820) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:801) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
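The trace above bottoms out in Resources.clone being handed a null Resource while building the attempt's usage report. A defensive pattern for that failure mode can be sketched as follows; these are simplified stand-in types, not the actual YARN classes, and this is not the actual YARN-4302 patch.

```java
// Simplified stand-ins for the Resource/Resources types in the trace above;
// illustrative only, not the actual YARN classes or fix.
class Resource {
    final long memory;
    final int vcores;

    Resource(long memory, int vcores) {
        this.memory = memory;
        this.vcores = vcores;
    }
}

class SafeResources {
    static final Resource NONE = new Resource(0, 0);

    // Substituting a zero resource for null avoids the NullPointerException
    // that clone()/multiply() throw in the stack trace above.
    static Resource multiply(Resource r, double by) {
        Resource safe = (r == null) ? NONE : r;
        return new Resource((long) (safe.memory * by), (int) (safe.vcores * by));
    }
}
```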
[jira] [Commented] (YARN-4164) Retrospect update ApplicationPriority API return type
[ https://issues.apache.org/jira/browse/YARN-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976047#comment-14976047 ] Rohith Sharma K S commented on YARN-4164: - bq. rather than keeping a success flag. Right, the patch returns the priority only. bq. we can set this return value as "null" Currently "null" is sent back to the client if the application is already in a completing state. > Retrospect update ApplicationPriority API return type > - > > Key: YARN-4164 > URL: https://issues.apache.org/jira/browse/YARN-4164 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4164.patch, 0002-YARN-4164.patch > > > Currently the {{ApplicationClientProtocol#updateApplicationPriority()}} API > returns an empty UpdateApplicationPriorityResponse. > But the RM updates the priority to cluster.max-priority if the given priority is > greater than cluster.max-priority. In this scenario, the updated priority needs to be > reported back to the client rather than staying quiet, where the client > assumes the given priority itself was taken. > The same scenario can also happen during application submission, but I feel that when > it is > explicitly invoked via ApplicationClientProtocol#updateApplicationPriority(), > the response should carry the updated priority. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
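The return-type shape discussed in this thread can be sketched as follows: the update call reports back the priority actually applied (clamped to the cluster maximum), and null when the application is already completing. All names and types here are illustrative, not the YARN protocol.

```java
// Illustrative sketch of an update call that reports the effective priority
// back to the caller; not the actual updateApplicationPriority() API.
class PrioritySketch {
    final int clusterMaxPriority;

    PrioritySketch(int clusterMaxPriority) {
        this.clusterMaxPriority = clusterMaxPriority;
    }

    /** Returns the priority actually applied, or null if the app is completing. */
    Integer updatePriority(int requested, boolean appCompleting) {
        if (appCompleting) {
            return null; // nothing was updated; the caller sees null, not a guess
        }
        // Requests above the cluster maximum are clamped, and the clamped
        // value is returned so the client is not left assuming its own value
        // was taken.
        return Math.min(requested, clusterMaxPriority);
    }
}
```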
[jira] [Commented] (YARN-4164) Retrospect update ApplicationPriority API return type
[ https://issues.apache.org/jira/browse/YARN-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976049#comment-14976049 ] Rohith Sharma K S commented on YARN-4164: - Updated the patch; kindly review. > Retrospect update ApplicationPriority API return type > - > > Key: YARN-4164 > URL: https://issues.apache.org/jira/browse/YARN-4164 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4164.patch, 0002-YARN-4164.patch > > > Currently the {{ApplicationClientProtocol#updateApplicationPriority()}} API > returns an empty UpdateApplicationPriorityResponse. > But the RM updates the priority to cluster.max-priority if the given priority is > greater than cluster.max-priority. In this scenario, the updated priority needs to be > reported back to the client rather than staying quiet, where the client > assumes the given priority itself was taken. > The same scenario can also happen during application submission, but I feel that when > it is > explicitly invoked via ApplicationClientProtocol#updateApplicationPriority(), > the response should carry the updated priority. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4302) SLS not able starting
Bibin A Chundatt created YARN-4302: -- Summary: SLS not able starting Key: YARN-4302 URL: https://issues.apache.org/jira/browse/YARN-4302 Project: Hadoop YARN Issue Type: Bug Reporter: Bibin A Chundatt Assignee: Bibin A Chundatt Configure the samples from tools/sls yarn-site.xml capacityscheduler.xml sls-runner.xml {quote} 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 java.lang.NullPointerException at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) at org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) at 
org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:839) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:820) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:801) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) at java.lang.Thread.run(Thread.java:745) {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4263) Capacity scheduler 60%-40% formatting floating point issue
[ https://issues.apache.org/jira/browse/YARN-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Kalaszi updated YARN-4263: - Attachment: (was: YARN-4263.005.patch) > Capacity scheduler 60%-40% formatting floating point issue > -- > > Key: YARN-4263 > URL: https://issues.apache.org/jira/browse/YARN-4263 > Project: Hadoop YARN > Issue Type: Bug > Components: client >Affects Versions: 2.7.1 >Reporter: Adrian Kalaszi >Priority: Trivial > Labels: easyfix > Attachments: YARN-4263.001.patch, YARN-4263.002.patch, > YARN-4263.003.patch, YARN-4263.004.patch, YARN-4263.005.patch > > > If the capacity scheduler is set with two queues at 60% and 40% capacity, then due to > a Java float representation issue > {code} > > hadoop queue -list > == > Queue Name : default > Queue State : running > Scheduling Info : Capacity: 40.0, MaximumCapacity: 100.0, CurrentCapacity: > 0.0 > == > Queue Name : large > Queue State : running > Scheduling Info : Capacity: 60.000004, MaximumCapacity: 100.0, > CurrentCapacity: 0.0 > {code} > Because > {code} System.err.println((0.6f) * 100); {code} > results in 60.000004. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
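The arithmetic behind this report can be reproduced in isolation: 0.6f has no exact binary representation, so scaling it to a percentage in float arithmetic exposes the error, while formatting to a fixed number of decimals hides it for display. The helper names below are mine; only the 0.6f * 100 expression comes from the issue.

```java
import java.util.Locale;

// Reproduces the float artifact from the issue and one display-side fix.
class FloatCapacityDemo {
    // Raw float arithmetic exposes the representation error.
    static float rawPercent(float fraction) {
        return fraction * 100;
    }

    // Formatting with a fixed precision yields the expected "60.0".
    static String displayPercent(float fraction) {
        return String.format(Locale.ROOT, "%.1f", fraction * 100);
    }

    public static void main(String[] args) {
        System.out.println(rawPercent(0.6f));     // 60.000004, not 60.0
        System.out.println(displayPercent(0.6f)); // 60.0
    }
}
```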
[jira] [Updated] (YARN-4263) Capacity scheduler 60%-40% formatting floating point issue
[ https://issues.apache.org/jira/browse/YARN-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrian Kalaszi updated YARN-4263: - Attachment: YARN-4263.005.patch > Capacity scheduler 60%-40% formatting floating point issue > -- > > Key: YARN-4263 > URL: https://issues.apache.org/jira/browse/YARN-4263 > Project: Hadoop YARN > Issue Type: Bug > Components: client >Affects Versions: 2.7.1 >Reporter: Adrian Kalaszi >Priority: Trivial > Labels: easyfix > Attachments: YARN-4263.001.patch, YARN-4263.002.patch, > YARN-4263.003.patch, YARN-4263.004.patch, YARN-4263.005.patch > > > If the capacity scheduler is set with two queues at 60% and 40% capacity, then due to > a Java float representation issue > {code} > > hadoop queue -list > == > Queue Name : default > Queue State : running > Scheduling Info : Capacity: 40.0, MaximumCapacity: 100.0, CurrentCapacity: > 0.0 > == > Queue Name : large > Queue State : running > Scheduling Info : Capacity: 60.000004, MaximumCapacity: 100.0, > CurrentCapacity: 0.0 > {code} > Because > {code} System.err.println((0.6f) * 100); {code} > results in 60.000004. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4302) SLS not able start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4302: --- Description: Configure the samples from tools/sls yarn-site.xml capacityscheduler.xml sls-runner.xml to /etc/hadoop Start sls using bin/slsrun.sh --input-rumen=sample-data/2jobs2min-rumen-jh.json --output-dir=out {noformat} 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 java.lang.NullPointerException at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) at org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) at 
org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:839) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:820) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:801) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) at java.lang.Thread.run(Thread.java:745) {noformat} was: Configure the samples from tools/sls yarn-site.xml capacityscheduler.xml sls-runner.xml {quote} 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 java.lang.NullPointerException at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) at org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) at 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) at org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at
[jira] [Updated] (YARN-4164) Retrospect update ApplicationPriority API return type
[ https://issues.apache.org/jira/browse/YARN-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-4164: Attachment: 0002-YARN-4164.patch > Retrospect update ApplicationPriority API return type > - > > Key: YARN-4164 > URL: https://issues.apache.org/jira/browse/YARN-4164 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4164.patch, 0002-YARN-4164.patch > > > Currently the {{ApplicationClientProtocol#updateApplicationPriority()}} API > returns an empty UpdateApplicationPriorityResponse. > But the RM updates the priority to cluster.max-priority if the given priority is > greater than cluster.max-priority. In this scenario, the updated priority needs to be > reported back to the client rather than staying quiet, where the client > assumes the given priority itself was taken. > The same scenario can also happen during application submission, but I feel that when > it is > explicitly invoked via ApplicationClientProtocol#updateApplicationPriority(), > the response should carry the updated priority. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3216) Max-AM-Resource-Percentage should respect node labels
[ https://issues.apache.org/jira/browse/YARN-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975857#comment-14975857 ] Hudson commented on YARN-3216: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #541 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/541/]) YARN-3216. Max-AM-Resource-Percentage should respect node labels. (Sunil (wangda: rev 56e4f6237ae8b1852e82b186e08db3934f79a9db) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CSQueueUtils.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/QueueCapacities.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerApplicationAttempt.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestUtils.java * 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerNodeLabelUpdate.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java > Max-AM-Resource-Percentage should respect node labels > - > > Key: YARN-3216 > URL: https://issues.apache.org/jira/browse/YARN-3216 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Wangda Tan >Assignee: Sunil G >Priority: Critical > Fix For: 2.8.0 > > Attachments: 0001-YARN-3216.patch, 0002-YARN-3216.patch, > 0003-YARN-3216.patch, 0004-YARN-3216.patch, 0005-YARN-3216.patch, > 0006-YARN-3216.patch, 0007-YARN-3216.patch, 0008-YARN-3216.patch, > 0009-YARN-3216.patch, 0010-YARN-3216.patch, 0011-YARN-3216.patch > > > Currently, max-am-resource-percentage considers default_partition only. When > a queue can access multiple partitions, we should be able to compute > max-am-resource-percentage based on that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3573) MiniMRYarnCluster constructor that starts the timeline server using a boolean should be marked deprecated
[ https://issues.apache.org/jira/browse/YARN-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975858#comment-14975858 ] Hudson commented on YARN-3573: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #541 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/541/]) YARN-3573. MiniMRYarnCluster constructor that starts the timeline server (ozawa: rev 96677bef00b03057038157efeb3c2ad4702914da) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java * hadoop-yarn-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java > MiniMRYarnCluster constructor that starts the timeline server using a boolean > should be marked deprecated > - > > Key: YARN-3573 > URL: https://issues.apache.org/jira/browse/YARN-3573 > Project: Hadoop YARN > Issue Type: Test > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Brahma Reddy Battula > Fix For: 2.8.0 > > Attachments: YARN-3573-002.patch, YARN-3573.patch > > > {code}MiniMRYarnCluster(String testName, int noOfNMs, boolean enableAHS){code} > starts the timeline server using *boolean enableAHS*. It is better to have > the timelineserver started based on the config value. > We should mark this constructor as deprecated to avoid its future use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4169) Fix racing condition of TestNodeStatusUpdaterForLabels
[ https://issues.apache.org/jira/browse/YARN-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975856#comment-14975856 ] Hudson commented on YARN-4169: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #541 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/541/]) YARN-4169. Fix racing condition of TestNodeStatusUpdaterForLabels. (wangda: rev 6f606214e734d9600bc0f25a63142714f0fea633) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/TestRMNodeLabelsManager.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/TestCommonNodeLabelsManager.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdaterForLabels.java > Fix racing condition of TestNodeStatusUpdaterForLabels > -- > > Key: YARN-4169 > URL: https://issues.apache.org/jira/browse/YARN-4169 > Project: Hadoop YARN > Issue Type: Bug > Components: test >Affects Versions: 3.0.0 > Environment: Jenkins >Reporter: Steve Loughran >Assignee: Naganarasimha G R >Priority: Critical > Fix For: 2.8.0 > > Attachments: YARN-4162.v1.006.patch, YARN-4162.v2.005.patch, > YARN-4169.v1.001.patch, YARN-4169.v1.002.patch, YARN-4169.v1.003.patch, > YARN-4169.v1.004.patch, YARN-4169.v1.007.patch > > > Test failing in [[Jenkins build > 
402|https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-Yarn-trunk-Java8/402/testReport/junit/org.apache.hadoop.yarn.server.nodemanager/TestNodeStatusUpdaterForLabels/testNodeStatusUpdaterForNodeLabels/] > {code} > java.lang.NullPointerException: null > at java.util.HashSet.<init>(HashSet.java:118) > at > org.apache.hadoop.yarn.nodelabels.NodeLabelTestBase.assertNLCollectionEquals(NodeLabelTestBase.java:103) > at > org.apache.hadoop.yarn.server.nodemanager.TestNodeStatusUpdaterForLabels.testNodeStatusUpdaterForNodeLabels(TestNodeStatusUpdaterForLabels.java:268) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3573) MiniMRYarnCluster constructor that starts the timeline server using a boolean should be marked deprecated
[ https://issues.apache.org/jira/browse/YARN-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975929#comment-14975929 ] Hudson commented on YARN-3573: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2479 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2479/]) YARN-3573. MiniMRYarnCluster constructor that starts the timeline server (ozawa: rev 96677bef00b03057038157efeb3c2ad4702914da) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java * hadoop-yarn-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java > MiniMRYarnCluster constructor that starts the timeline server using a boolean > should be marked deprecated > - > > Key: YARN-3573 > URL: https://issues.apache.org/jira/browse/YARN-3573 > Project: Hadoop YARN > Issue Type: Test > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Brahma Reddy Battula > Fix For: 2.8.0 > > Attachments: YARN-3573-002.patch, YARN-3573.patch > > > {code}MiniMRYarnCluster(String testName, int noOfNMs, boolean enableAHS){code} > starts the timeline server using *boolean enableAHS*. It is better to have > the timelineserver started based on the config value. > We should mark this constructor as deprecated to avoid its future use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2859) ApplicationHistoryServer binds to default port 8188 in MiniYARNCluster
[ https://issues.apache.org/jira/browse/YARN-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975975#comment-14975975 ] Hadoop QA commented on YARN-2859: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | patch | 0m 1s | The patch file was not named according to hadoop's naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute for instructions. | | {color:blue}0{color} | pre-patch | 6m 15s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 7m 55s | There were no new javac warning messages. | | {color:green}+1{color} | release audit | 0m 21s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 23s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 29s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 45s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | yarn tests | 1m 49s | Tests passed in hadoop-yarn-server-tests. 
| | | | 19m 33s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12768895/YARN-2859.txt | | Optional Tests | javac unit findbugs checkstyle | | git revision | trunk / 96677be | | hadoop-yarn-server-tests test log | https://builds.apache.org/job/PreCommit-YARN-Build/9585/artifact/patchprocess/testrun_hadoop-yarn-server-tests.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9585/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9585/console | This message was automatically generated. > ApplicationHistoryServer binds to default port 8188 in MiniYARNCluster > -- > > Key: YARN-2859 > URL: https://issues.apache.org/jira/browse/YARN-2859 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Hitesh Shah >Assignee: Vinod Kumar Vavilapalli >Priority: Critical > Attachments: YARN-2859.txt > > > In mini cluster, a random port should be used. > Also, the config is not updated to the host that the process got bound to. > {code} > 2014-11-13 13:07:01,905 INFO [main] server.MiniYARNCluster > (MiniYARNCluster.java:serviceStart(722)) - MiniYARN ApplicationHistoryServer > address: localhost:10200 > 2014-11-13 13:07:01,905 INFO [main] server.MiniYARNCluster > (MiniYARNCluster.java:serviceStart(724)) - MiniYARN ApplicationHistoryServer > web address: 0.0.0.0:8188 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
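The fix pattern YARN-2859 asks for can be sketched generically: bind to port 0 so the OS picks a free ephemeral port, then write the actual bound address back into the configuration so test clients read the real endpoint instead of the 0.0.0.0:8188 default. The class name and config key below are illustrative, not the YARN API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: bind to an ephemeral port and record the real bound
// address in the config, as the issue suggests a mini cluster should.
class EphemeralBindSketch {
    static String bindAndRecord(Map<String, String> conf, String addressKey) {
        try (ServerSocket socket = new ServerSocket(0)) { // 0 = any free port
            String bound = "localhost:" + socket.getLocalPort();
            conf.put(addressKey, bound); // update config to the bound address
            return bound;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```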
[jira] [Updated] (YARN-4302) SLS not able start
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4302: --- Summary: SLS not able start (was: SLS not able starting ) > SLS not able start > -- > > Key: YARN-4302 > URL: https://issues.apache.org/jira/browse/YARN-4302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > > Configure the samples from tools/sls > yarn-site.xml > capacityscheduler.xml > sls-runner.xml > {quote} > 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling > event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 > java.lang.NullPointerException > at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) > at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) > at > org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) > at > 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:839) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:820) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:801) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > at java.lang.Thread.run(Thread.java:745) > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
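The NPE in the trace above starts at Resources.clone, which dereferences a null Resource handed up through the SLS ResourceSchedulerWrapper. A minimal Java sketch of that failure mode with a hypothetical null guard; the simplified Resource class and the safeMultiply helper are illustrative stand-ins, not the actual Hadoop code or the fix in 0001-YARN-4302.patch:

```java
// Sketch of the failure mode behind the stack trace, with a hypothetical guard.
public class NullSafeResources {
    /** Simplified stand-in for org.apache.hadoop.yarn.api.records.Resource. */
    static class Resource {
        final int memory;
        final int vcores;
        Resource(int memory, int vcores) { this.memory = memory; this.vcores = vcores; }
    }

    /** Mirrors the shape of Resources.clone: dereferences its argument, so null => NPE. */
    static Resource clone(Resource r) {
        return new Resource(r.memory, r.vcores);
    }

    /**
     * Hypothetical guard: treat a null resource as all-zero instead of letting
     * clone() throw deep inside the scheduler's report path.
     */
    static Resource safeMultiply(Resource r, int by) {
        Resource base = (r == null) ? new Resource(0, 0) : clone(r);
        return new Resource(base.memory * by, base.vcores * by);
    }

    public static void main(String[] args) {
        Resource out = safeMultiply(null, 3); // would be an NPE without the guard
        System.out.println(out.memory + "MB, " + out.vcores + " vcores");
    }
}
```

Whether the real fix guards in the wrapper or in SchedulerApplicationAttempt#getResourceUsageReport is up to the patch; the point is that the null has to be handled before Resources.clone runs.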
[jira] [Commented] (YARN-4300) [JDK8] Fix javadoc errors caused by wrong tags
[ https://issues.apache.org/jira/browse/YARN-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975862#comment-14975862 ] Hudson commented on YARN-4300: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #541 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/541/]) YARN-4300. [JDK8] Fix javadoc errors caused by wrong tags. (aajisaka) (aajisaka: rev 8a68630dd126348b859e0fbe2d14079965016772) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/ScriptBasedNodeLabelsProvider.java > [JDK8] Fix javadoc errors caused by wrong tags > -- > > Key: YARN-4300 > URL: https://issues.apache.org/jira/browse/YARN-4300 > Project: Hadoop YARN > Issue Type: Bug > Components: build, documentation >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA >Priority: Blocker > Fix For: 2.8.0 > > Attachments: YARN-4300.00.patch > > > "mvn package -Pdist -Dtar -DskipTests" fails on JDK8. > {noformat} > [ERROR] > /Users/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/ScriptBasedNodeLabelsProvider.java:116: > error: exception not thrown: java.lang.Exception > [ERROR] * @throws Exception > [ERROR] ^ > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
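JDK8's javadoc enables doclint, which rejects a {{@throws}} tag naming a checked exception the method neither declares nor can throw; JDK7's javadoc tolerated the stray tag. A minimal compilable illustration of both the bad and the legitimate case (the parseLabel methods are hypothetical, not the ScriptBasedNodeLabelsProvider code):

```java
// JDK8 javadoc fails with "error: exception not thrown: java.lang.Exception"
// when a @throws tag names an exception the method cannot throw.
public class JavadocThrowsExample {

    /**
     * Parses a node-label string.
     *
     * A stray "@throws Exception" tag here would pass under JDK7 javadoc but
     * fail doclint under JDK8, since the method throws nothing checked.
     *
     * @param raw the raw label text
     * @return the trimmed label
     */
    public static String parseLabel(String raw) {
        return raw.trim();
    }

    /**
     * Variant that really does throw, so the tag is legitimate.
     *
     * @param raw the raw label text
     * @return the trimmed label
     * @throws IllegalArgumentException if the label is empty after trimming
     */
    public static String parseLabelStrict(String raw) {
        String label = raw.trim();
        if (label.isEmpty()) {
            throw new IllegalArgumentException("empty node label");
        }
        return label;
    }
}
```

The fix in such cases is simply to delete the unthrowable tag or correct it to the exception actually thrown.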
[jira] [Commented] (YARN-3573) MiniMRYarnCluster constructor that starts the timeline server using a boolean should be marked deprecated
[ https://issues.apache.org/jira/browse/YARN-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975900#comment-14975900 ] Hudson commented on YARN-3573: -- FAILURE: Integrated in Hadoop-Yarn-trunk #1325 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1325/]) YARN-3573. MiniMRYarnCluster constructor that starts the timeline server (ozawa: rev 96677bef00b03057038157efeb3c2ad4702914da) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java * hadoop-yarn-project/CHANGES.txt > MiniMRYarnCluster constructor that starts the timeline server using a boolean > should be marked deprecated > - > > Key: YARN-3573 > URL: https://issues.apache.org/jira/browse/YARN-3573 > Project: Hadoop YARN > Issue Type: Test > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Brahma Reddy Battula > Fix For: 2.8.0 > > Attachments: YARN-3573-002.patch, YARN-3573.patch > > > {code}MiniMRYarnCluster(String testName, int noOfNMs, boolean enableAHS){code} > starts the timeline server using *boolean enableAHS*. It is better to have > the timelineserver started based on the config value. > We should mark this constructor as deprecated to avoid its future use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
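The requested pattern, keeping the boolean constructor for compatibility while steering callers toward a configuration value, can be sketched roughly as follows. Class, field, and config-key names here are illustrative, not the actual MiniMRYarnCluster implementation:

```java
import java.util.Map;

// Sketch of deprecating a boolean-flag constructor in favor of config-driven
// behavior. The config key below is a made-up example, not a real YARN key.
public class MiniClusterSketch {
    static final String ENABLE_AHS_KEY = "yarn.minicluster.enable-ahs"; // hypothetical key

    private final String testName;
    private final int numNodeManagers;
    private boolean enableAHS;

    /** Preferred: whether the timeline server starts is decided by config at start(). */
    public MiniClusterSketch(String testName, int numNodeManagers) {
        this.testName = testName;
        this.numNodeManagers = numNodeManagers;
    }

    /**
     * @deprecated use {@link #MiniClusterSketch(String, int)} and set the
     * enable-AHS configuration key instead of passing a boolean.
     */
    @Deprecated
    public MiniClusterSketch(String testName, int numNodeManagers, boolean enableAHS) {
        this(testName, numNodeManagers);
        this.enableAHS = enableAHS;
    }

    /** Returns whether the AHS would be started; config wins over the legacy flag. */
    public boolean start(Map<String, String> conf) {
        String v = conf.get(ENABLE_AHS_KEY);
        if (v != null) {
            enableAHS = Boolean.parseBoolean(v);
        }
        return enableAHS;
    }
}
```

With this shape, existing callers of the three-argument constructor keep working (with a deprecation warning at compile time), while new code expresses the choice purely through configuration.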
[jira] [Commented] (YARN-3877) YarnClientImpl.submitApplication swallows exceptions
[ https://issues.apache.org/jira/browse/YARN-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976052#comment-14976052 ] Steve Loughran commented on YARN-3877: -- LGTM +1 > YarnClientImpl.submitApplication swallows exceptions > > > Key: YARN-3877 > URL: https://issues.apache.org/jira/browse/YARN-3877 > Project: Hadoop YARN > Issue Type: Improvement > Components: client >Affects Versions: 2.7.2 >Reporter: Steve Loughran >Assignee: Varun Saxena >Priority: Minor > Attachments: YARN-3877.01.patch, YARN-3877.02.patch, > YARN-3877.03.patch > > > When {{YarnClientImpl.submitApplication}} spins waiting for the application > to be accepted, any interruption during its Sleep() calls are logged and > swallowed. > this makes it hard to interrupt the thread during shutdown. Really it should > throw some form of exception and let the caller deal with it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
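The swallowed-interrupt problem and the proposed behavior can be sketched as below. The poll methods are illustrative stand-ins for the wait loop inside YarnClientImpl.submitApplication, not the actual patch:

```java
// Sketch of the interrupt-handling change: instead of logging and swallowing
// InterruptedException inside the submit/poll loop, restore the interrupt
// flag and surface the exception to the caller.
public class SubmitLoopSketch {

    /** Stand-in for the "is the application accepted yet?" check. */
    interface StatusCheck {
        boolean accepted();
    }

    // Anti-pattern described in the issue: the interrupt is dropped and the
    // loop keeps spinning, so interrupting this thread during shutdown
    // cannot stop the wait.
    static void pollSwallowing(StatusCheck check, long intervalMs) {
        while (!check.accepted()) {
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                // swallowed
            }
        }
    }

    // Suggested shape of the fix: re-assert the interrupt status and rethrow,
    // letting the caller decide how to shut down.
    static void pollInterruptibly(StatusCheck check, long intervalMs)
            throws InterruptedException {
        while (!check.accepted()) {
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the flag set for callers
                throw e;
            }
        }
    }

    public static void main(String[] args) {
        // Simulate a shutdown hook interrupting the submitting thread:
        // sleep() throws immediately when the interrupt flag is already set.
        Thread.currentThread().interrupt();
        try {
            pollInterruptibly(() -> false, 10);
        } catch (InterruptedException e) {
            System.out.println("interrupt propagated to caller"); // prints this
        }
    }
}
```

Note that pollSwallowing is never called here; with a check that stays false it would loop forever, which is exactly the shutdown hazard the issue describes.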
[jira] [Updated] (YARN-4303) Confusing help message if AM logs cant be retrieved via yarn logs command
[ https://issues.apache.org/jira/browse/YARN-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4303: --- Description: {noformat} yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs --applicationId application_1445832014581_0028 -am ALL Can not get AMContainers logs for the application:application_1445832014581_0028 This application:application_1445832014581_0028 is finished. Please enable the application history service. Or Using yarn logs -applicationId -containerId --nodeAddress to get the container logs {noformat} The command mentioned above is {{yarn logs -applicationId -containerId --nodeAddress }}. It asks you to specify nodeHttpAddress which makes it sound like we have to connect to Nodemanager's http port and may lead user to think that they have to connect to nodemanager's webapp address. This help message should be changed to include command as {{yarn logs -applicationId -containerId --nodeAddress }} was: {noformat} yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs --applicationId application_1445832014581_0028 -am ALL Can not get AMContainers logs for the application:application_1445832014581_0028 This application:application_1445832014581_0028 is finished. Please enable the application history service. Or Using yarn logs -applicationId -containerId --nodeAddress to get the container logs {noformat} The command mentioned above is {{yarn logs -applicationId -containerId --nodeAddress }}. It asks you to specify nodeHttpAddress which makes it sound like we have to connect to Node's http port and may lead user to think that they have to connect to nodemanager's webapp address. 
This help message should be changed to include command as {{yarn logs -applicationId -containerId --nodeAddress }} > Confusing help message if AM logs cant be retrieved via yarn logs command > - > > Key: YARN-4303 > URL: https://issues.apache.org/jira/browse/YARN-4303 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Saxena >Priority: Minor > > {noformat} > yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs > --applicationId application_1445832014581_0028 -am ALL > Can not get AMContainers logs for the > application:application_1445832014581_0028 > This application:application_1445832014581_0028 is finished. Please enable > the application history service. Or Using yarn logs -applicationId > -containerId --nodeAddress to get the > container logs > {noformat} > The command mentioned above is {{yarn logs -applicationId > -containerId --nodeAddress }}. It asks you to > specify nodeHttpAddress which makes it sound like we have to connect to > Nodemanager's http port and may lead user to think that they have to connect > to nodemanager's webapp address. > This help message should be changed to include command as {{yarn logs > -applicationId -containerId --nodeAddress Address>}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
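For concreteness, the corrected command shape the description proposes might look like the sketch below. All IDs and the host:port value are placeholders, and the exact help-text wording is whatever the eventual patch chooses; the key point is that the option takes the NodeManager's address, not its HTTP webapp address:

```shell
# Placeholder values; substitute your own application/container IDs and the
# NodeManager's address (host:port), not its webapp (HTTP) address.
APP_ID="application_1445832014581_0028"
CONTAINER_ID="container_1445832014581_0028_01_000001"
NODE_ADDRESS="nm-host:45454"

CMD="yarn logs -applicationId ${APP_ID} -containerId ${CONTAINER_ID} --nodeAddress ${NODE_ADDRESS}"
echo "${CMD}"
```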
[jira] [Updated] (YARN-4303) Confusing help message if AM logs cant be retrieved via yarn logs command
[ https://issues.apache.org/jira/browse/YARN-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4303: --- Description: {noformat} yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs --applicationId application_1445832014581_0028 -am ALL Can not get AMContainers logs for the application:application_1445832014581_0028 This application:application_1445832014581_0028 is finished. Please enable the application history service. Or Using yarn logs -applicationId -containerId --nodeAddress to get the container logs {noformat} The command mentioned above is {{yarn logs -applicationId -containerId --nodeAddress }}. It asks you to specify nodeHttpAddress which makes it sound like we have to connect to nodemanager's webapp address. This help message should be changed to include command as {{yarn logs -applicationId -containerId --nodeAddress }} was: {noformat} yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs --applicationId application_1445832014581_0028 -am ALL Can not get AMContainers logs for the application:application_1445832014581_0028 This application:application_1445832014581_0028 is finished. Please enable the application history service. Or Using yarn logs -applicationId -containerId --nodeAddress to get the container logs {noformat} The command mentioned above is {{yarn logs -applicationId -containerId --nodeAddress }}. It asks you to specify nodeHttpAddress which makes it sound like we have to connect to Nodemanager's http port and may lead user to think that they have to connect to nodemanager's webapp address. 
This help message should be changed to include command as {{yarn logs -applicationId -containerId --nodeAddress }} > Confusing help message if AM logs cant be retrieved via yarn logs command > - > > Key: YARN-4303 > URL: https://issues.apache.org/jira/browse/YARN-4303 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Saxena >Priority: Minor > > {noformat} > yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs > --applicationId application_1445832014581_0028 -am ALL > Can not get AMContainers logs for the > application:application_1445832014581_0028 > This application:application_1445832014581_0028 is finished. Please enable > the application history service. Or Using yarn logs -applicationId > -containerId --nodeAddress to get the > container logs > {noformat} > The command mentioned above is {{yarn logs -applicationId > -containerId --nodeAddress }}. It asks you to > specify nodeHttpAddress which makes it sound like we have to connect to > nodemanager's webapp address. > This help message should be changed to include command as {{yarn logs > -applicationId -containerId --nodeAddress Address>}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4302) SLS not able start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976076#comment-14976076 ] Sunil G commented on YARN-4302: --- Yes [~bibinchundatt]. It's coming after YARN-4285. [~vvasudev], could you please take a look? > SLS not able start due to NPE in > SchedulerApplicationAttempt#getResourceUsageReport > --- > > Key: YARN-4302 > URL: https://issues.apache.org/jira/browse/YARN-4302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: 0001-YARN-4302.patch > > > Configure the samples from tools/sls > yarn-site.xml > capacityscheduler.xml > sls-runner.xml > to /etc/hadoop > Start sls using > > bin/slsrun.sh --input-rumen=sample-data/2jobs2min-rumen-jh.json > --output-dir=out > {noformat} > 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling > event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 > java.lang.NullPointerException > at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) > at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) > at > org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) > at > 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:839) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:820) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:801) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3573) MiniMRYarnCluster constructor that starts the timeline server using a boolean should be marked deprecated
[ https://issues.apache.org/jira/browse/YARN-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976095#comment-14976095 ] Hudson commented on YARN-3573: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2532 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2532/]) YARN-3573. MiniMRYarnCluster constructor that starts the timeline server (ozawa: rev 96677bef00b03057038157efeb3c2ad4702914da) * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java * hadoop-yarn-project/CHANGES.txt > MiniMRYarnCluster constructor that starts the timeline server using a boolean > should be marked deprecated > - > > Key: YARN-3573 > URL: https://issues.apache.org/jira/browse/YARN-3573 > Project: Hadoop YARN > Issue Type: Test > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Brahma Reddy Battula > Fix For: 2.8.0 > > Attachments: YARN-3573-002.patch, YARN-3573.patch > > > {code}MiniMRYarnCluster(String testName, int noOfNMs, boolean enableAHS){code} > starts the timeline server using *boolean enableAHS*. It is better to have > the timelineserver started based on the config value. > We should mark this constructor as deprecated to avoid its future use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4302) SLS not able start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976082#comment-14976082 ] Varun Vasudev commented on YARN-4302: - My apologies for not catching this [~bibinchundatt]. Thank you for catching this. +1 pending Jenkins. > SLS not able start due to NPE in > SchedulerApplicationAttempt#getResourceUsageReport > --- > > Key: YARN-4302 > URL: https://issues.apache.org/jira/browse/YARN-4302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: 0001-YARN-4302.patch > > > Configure the samples from tools/sls > yarn-site.xml > capacityscheduler.xml > sls-runner.xml > to /etc/hadoop > Start sls using > > bin/slsrun.sh --input-rumen=sample-data/2jobs2min-rumen-jh.json > --output-dir=out > {noformat} > 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling > event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 > java.lang.NullPointerException > at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) > at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) > at > org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) > 
at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:839) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:820) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:801) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4303) Confusing help message if AM logs cant be retrieved via yarn logs command
[ https://issues.apache.org/jira/browse/YARN-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4303: --- Assignee: (was: Varun Saxena) > Confusing help message if AM logs cant be retrieved via yarn logs command > - > > Key: YARN-4303 > URL: https://issues.apache.org/jira/browse/YARN-4303 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Saxena >Priority: Minor > > {noformat} > yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs > --applicationId application_1445832014581_0028 -am ALL > Can not get AMContainers logs for the > application:application_1445832014581_0028 > This application:application_1445832014581_0028 is finished. Please enable > the application history service. Or Using yarn logs -applicationId > -containerId --nodeAddress to get the > container logs > {noformat} > The command mentioned above is {{yarn logs -applicationId > -containerId --nodeAddress }}. It asks you to > specify nodeHttpAddress which makes it sound like we have to connect to > Node's http port and may lead user to think that they have to connect to > nodemanager's webapp address. > This help message should be changed to include command as {{yarn logs > -applicationId -containerId --nodeAddress Address>}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4303) Confusing help message if AM logs cant be retrieved via yarn logs command
[ https://issues.apache.org/jira/browse/YARN-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4303: --- Description: {noformat} yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs --applicationId application_1445832014581_0028 -am ALL Can not get AMContainers logs for the application:application_1445832014581_0028 This application:application_1445832014581_0028 is finished. Please enable the application history service. Or Using yarn logs -applicationId -containerId --nodeAddress to get the container logs {noformat} The command mentioned above is {{yarn logs -applicationId -containerId --nodeAddress }}. It asks you to specify nodeHttpAddress which makes it sound like we have to connect to Node's http port and may lead user to think that they have to connect to nodemanager's webapp address. This command should be changed to {{yarn logs -applicationId -containerId --nodeAddress }} was: {noformat} yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs --applicationId application_1445832014581_0028 -am ALL Can not get AMContainers logs for the application:application_1445832014581_0028 This application:application_1445832014581_0028 is finished. Please enable the application history service. Or Using yarn logs -applicationId -containerId --nodeAddress to get the container logs {noformat} The command mentioned above is {{yarn logs -applicationId -containerId --nodeAddress }}. It asks you to specify nodeHttpAddress which makes it sound like we have to connect to Node's http port which may confuse user to think that we have to connect to nodemanager's webapp address. 
This command should be changed to {{yarn logs -applicationId -containerId --nodeAddress }} > Confusing help message if AM logs cant be retrieved via yarn logs command > - > > Key: YARN-4303 > URL: https://issues.apache.org/jira/browse/YARN-4303 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Saxena >Assignee: Varun Saxena >Priority: Minor > > {noformat} > yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs > --applicationId application_1445832014581_0028 -am ALL > Can not get AMContainers logs for the > application:application_1445832014581_0028 > This application:application_1445832014581_0028 is finished. Please enable > the application history service. Or Using yarn logs -applicationId > -containerId --nodeAddress to get the > container logs > {noformat} > The command mentioned above is {{yarn logs -applicationId > -containerId --nodeAddress }}. It asks you to > specify nodeHttpAddress which makes it sound like we have to connect to > Node's http port and may lead user to think that they have to connect to > nodemanager's webapp address. > This command should be changed to {{yarn logs -applicationId > -containerId --nodeAddress }} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4303) Confusing help message if AM logs cant be retrieved via yarn logs command
Varun Saxena created YARN-4303: -- Summary: Confusing help message if AM logs cant be retrieved via yarn logs command Key: YARN-4303 URL: https://issues.apache.org/jira/browse/YARN-4303 Project: Hadoop YARN Issue Type: Bug Reporter: Varun Saxena Assignee: Varun Saxena Priority: Minor {noformat} yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs --applicationId application_1445832014581_0028 -am ALL Can not get AMContainers logs for the application:application_1445832014581_0028 This application:application_1445832014581_0028 is finished. Please enable the application history service. Or Using yarn logs -applicationId -containerId --nodeAddress to get the container logs {noformat} The command mentioned above is {{yarn logs -applicationId -containerId --nodeAddress }}. It asks you to specify nodeHttpAddress which makes it sound like we have to connect to Node's http port which may confuse user to think that we have to connect to nodemanager's webapp address. This command should be changed to {{yarn logs -applicationId -containerId --nodeAddress }} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4303) Confusing help message if AM logs cant be retrieved via yarn logs command
[ https://issues.apache.org/jira/browse/YARN-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4303: --- Description: {noformat} yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs --applicationId application_1445832014581_0028 -am ALL Can not get AMContainers logs for the application:application_1445832014581_0028 This application:application_1445832014581_0028 is finished. Please enable the application history service. Or Using yarn logs -applicationId -containerId --nodeAddress to get the container logs {noformat} The command mentioned above is {{yarn logs -applicationId -containerId --nodeAddress }}. It asks you to specify nodeHttpAddress which makes it sound like we have to connect to Node's http port and may lead user to think that they have to connect to nodemanager's webapp address. This help message should be changed to include command as {{yarn logs -applicationId -containerId --nodeAddress }} was: {noformat} yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs --applicationId application_1445832014581_0028 -am ALL Can not get AMContainers logs for the application:application_1445832014581_0028 This application:application_1445832014581_0028 is finished. Please enable the application history service. Or Using yarn logs -applicationId -containerId --nodeAddress to get the container logs {noformat} The command mentioned above is {{yarn logs -applicationId -containerId --nodeAddress }}. It asks you to specify nodeHttpAddress which makes it sound like we have to connect to Node's http port and may lead user to think that they have to connect to nodemanager's webapp address. 
This command should be changed to {{yarn logs -applicationId -containerId --nodeAddress }} > Confusing help message if AM logs cant be retrieved via yarn logs command > - > > Key: YARN-4303 > URL: https://issues.apache.org/jira/browse/YARN-4303 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Saxena >Assignee: Varun Saxena >Priority: Minor > > {noformat} > yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs > --applicationId application_1445832014581_0028 -am ALL > Can not get AMContainers logs for the > application:application_1445832014581_0028 > This application:application_1445832014581_0028 is finished. Please enable > the application history service. Or Using yarn logs -applicationId > -containerId --nodeAddress to get the > container logs > {noformat} > The command mentioned above is {{yarn logs -applicationId > -containerId --nodeAddress }}. It asks you to > specify nodeHttpAddress which makes it sound like we have to connect to > Node's http port and may lead user to think that they have to connect to > nodemanager's webapp address. > This help message should be changed to include command as {{yarn logs > -applicationId -containerId --nodeAddress Address>}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4098) Document ApplicationPriority feature
[ https://issues.apache.org/jira/browse/YARN-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-4098: Attachment: 0001-YARN-4098.patch Updating the initial patch. This patch mainly covers the server-side changes; more documentation still needs to be updated for the client interfaces. > Document ApplicationPriority feature > > > Key: YARN-4098 > URL: https://issues.apache.org/jira/browse/YARN-4098 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4098.patch > > > This JIRA is to track documentation of application priority and its user, > admin and REST interfaces. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3573) MiniMRYarnCluster constructor that starts the timeline server using a boolean should be marked deprecated
[ https://issues.apache.org/jira/browse/YARN-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975783#comment-14975783 ] Hudson commented on YARN-3573: -- FAILURE: Integrated in Hadoop-trunk-Commit #8712 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8712/]) YARN-3573. MiniMRYarnCluster constructor that starts the timeline server (ozawa: rev 96677bef00b03057038157efeb3c2ad4702914da) * hadoop-yarn-project/CHANGES.txt * hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/MiniMRYarnCluster.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java > MiniMRYarnCluster constructor that starts the timeline server using a boolean > should be marked deprecated > - > > Key: YARN-3573 > URL: https://issues.apache.org/jira/browse/YARN-3573 > Project: Hadoop YARN > Issue Type: Test > Components: timelineserver >Affects Versions: 2.6.0 >Reporter: Mit Desai >Assignee: Brahma Reddy Battula > Fix For: 2.8.0 > > Attachments: YARN-3573-002.patch, YARN-3573.patch > > > {code}MiniMRYarnCluster(String testName, int noOfNMs, boolean enableAHS){code} > starts the timeline server using *boolean enableAHS*. It is better to have > the timelineserver started based on the config value. > We should mark this constructor as deprecated to avoid its future use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
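The commit above applies a standard Java deprecation pattern; a minimal illustrative sketch (class and method names are invented stand-ins, not the actual MiniMRYarnCluster source):

```java
// Sketch of the deprecation pattern: keep the boolean constructor for binary
// compatibility but mark it @Deprecated, steering callers to the form that
// would read the timeline-server setting from configuration instead.
class MiniClusterSketch {
    private final boolean enableAHS;

    /** Preferred form: whether to start the timeline server comes from config. */
    MiniClusterSketch(String testName, int numNodeManagers) {
        // In the real cluster this would consult a config key such as
        // yarn.timeline-service.enabled rather than hard-coding false.
        this(testName, numNodeManagers, false);
    }

    /** @deprecated start the timeline server via the config value instead. */
    @Deprecated
    MiniClusterSketch(String testName, int numNodeManagers, boolean enableAHS) {
        this.enableAHS = enableAHS;
    }

    boolean isAHSEnabled() {
        return enableAHS;
    }
}
```

Existing callers keep compiling (with a deprecation warning), while new code picks up the config-driven path.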
[jira] [Commented] (YARN-4300) [JDK8] Fix javadoc errors caused by wrong tags
[ https://issues.apache.org/jira/browse/YARN-4300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14975785#comment-14975785 ] Hudson commented on YARN-4300: -- FAILURE: Integrated in Hadoop-trunk-Commit #8712 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8712/]) YARN-4300. [JDK8] Fix javadoc errors caused by wrong tags. (aajisaka) (aajisaka: rev 8a68630dd126348b859e0fbe2d14079965016772) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/ScriptBasedNodeLabelsProvider.java > [JDK8] Fix javadoc errors caused by wrong tags > -- > > Key: YARN-4300 > URL: https://issues.apache.org/jira/browse/YARN-4300 > Project: Hadoop YARN > Issue Type: Bug > Components: build, documentation >Reporter: Akira AJISAKA >Assignee: Akira AJISAKA >Priority: Blocker > Fix For: 2.8.0 > > Attachments: YARN-4300.00.patch > > > "mvn package -Pdist -Dtar -DskipTests" fails on JDK8. > {noformat} > [ERROR] > /Users/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/ScriptBasedNodeLabelsProvider.java:116: > error: exception not thrown: java.lang.Exception > [ERROR] * @throws Exception > [ERROR] ^ > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
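The build error above is JDK8's stricter javadoc checking: a {{@throws}} tag names a checked exception the method cannot actually throw. A small sketch of the offending shape and the fix (method names invented, not the ScriptBasedNodeLabelsProvider source):

```java
// JDK8 javadoc fails with "error: exception not thrown: java.lang.Exception"
// when a @throws tag documents a checked exception the method does not declare.
class DoclintSketch {
    // BROKEN under JDK8 javadoc (shown as a plain comment so this compiles):
    //   /** Runs cleanup. @throws Exception */
    //   void cleanup() {}   // declares no checked exception -> javadoc error

    /** Runs cleanup. (The stale "@throws Exception" tag was simply removed.) */
    boolean cleanup() {
        return true; // placeholder work; nothing checked is thrown
    }
}
```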
[jira] [Updated] (YARN-4302) SLS not able start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4302: --- Attachment: 0001-YARN-4302.patch Uploading again to trigger CI. The failure is not related to the attached patch; a local run in Eclipse passes. > SLS not able start due to NPE in > SchedulerApplicationAttempt#getResourceUsageReport > --- > > Key: YARN-4302 > URL: https://issues.apache.org/jira/browse/YARN-4302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: 0001-YARN-4302.patch, 0001-YARN-4302.patch > > > Configure the samples from tools/sls > yarn-site.xml > capacityscheduler.xml > sls-runner.xml > to /etc/hadoop > Start sls using > > bin/slsrun.sh --input-rumen=sample-data/2jobs2min-rumen-jh.json > --output-dir=out > {noformat} > 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling > event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 > java.lang.NullPointerException > at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) > at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) > at > org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) > at > 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:839) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:820) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:801) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
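The trace above bottoms out in {{Resources.clone}} being handed a null {{Resource}}. A self-contained sketch of the defensive pattern involved, using a plain stand-in type (this is not the actual YARN-4302 patch, which lands in ResourceSchedulerWrapper):

```java
// Stand-in for YARN's Resource type, to keep the sketch self-contained. The
// point is the null guard before the clone/multiply step, which is exactly
// where the stack trace above throws NullPointerException.
final class Res {
    final long memoryMB;
    final int vcores;

    Res(long memoryMB, int vcores) {
        this.memoryMB = memoryMB;
        this.vcores = vcores;
    }

    /** Mirrors Resources.multiply(Resources.clone(r), by), but null-safe. */
    static Res multiplySafe(Res r, double by) {
        Res base = (r != null) ? r : new Res(0, 0); // guard: null -> zero resource
        return new Res((long) (base.memoryMB * by), (int) (base.vcores * by));
    }
}
```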
[jira] [Updated] (YARN-4303) Confusing help message if AM logs cant be retrieved via yarn logs command
[ https://issues.apache.org/jira/browse/YARN-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4303: --- Description: {noformat} yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs --applicationId application_1445832014581_0028 -am ALL Can not get AMContainers logs for the application:application_1445832014581_0028 This application:application_1445832014581_0028 is finished. Please enable the application history service. Or Using yarn logs -applicationId -containerId --nodeAddress to get the container logs {noformat} Part of the command output mentioned above indicates that using {{yarn logs -applicationId -containerId --nodeAddress }} will fetch desired result. It asks you to specify nodeHttpAddress which makes it sound like we have to connect to nodemanager's webapp address. This help message should be changed to include command as {{yarn logs -applicationId -containerId --nodeAddress }} was: {noformat} yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs --applicationId application_1445832014581_0028 -am ALL Can not get AMContainers logs for the application:application_1445832014581_0028 This application:application_1445832014581_0028 is finished. Please enable the application history service. Or Using yarn logs -applicationId -containerId --nodeAddress to get the container logs {noformat} The command mentioned above is {{yarn logs -applicationId -containerId --nodeAddress }}. It asks you to specify nodeHttpAddress which makes it sound like we have to connect to nodemanager's webapp address. 
This help message should be changed to include command as {{yarn logs -applicationId -containerId --nodeAddress }} > Confusing help message if AM logs cant be retrieved via yarn logs command > - > > Key: YARN-4303 > URL: https://issues.apache.org/jira/browse/YARN-4303 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Saxena >Priority: Minor > > {noformat} > yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs > --applicationId application_1445832014581_0028 -am ALL > Can not get AMContainers logs for the > application:application_1445832014581_0028 > This application:application_1445832014581_0028 is finished. Please enable > the application history service. Or Using yarn logs -applicationId > -containerId --nodeAddress to get the > container logs > {noformat} > Part of the command output mentioned above indicates that using {{yarn logs > -applicationId -containerId --nodeAddress > }} will fetch desired result. It asks you to specify > nodeHttpAddress which makes it sound like we have to connect to nodemanager's > webapp address. > This help message should be changed to include command as {{yarn logs > -applicationId -containerId --nodeAddress Address>}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4302) SLS not able start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976245#comment-14976245 ] Hadoop QA commented on YARN-4302: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 23m 29s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 11m 20s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 14m 15s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 30s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 30s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 2m 0s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 49s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 9s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | tools/hadoop tests | 0m 30s | Tests failed in hadoop-sls. 
| | | | 54m 38s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.sls.appmaster.TestAMSimulator | | | hadoop.yarn.sls.TestSLSRunner | | | hadoop.yarn.sls.nodemanager.TestNMSimulator | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12768933/0001-YARN-4302.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 96677be | | hadoop-sls test log | https://builds.apache.org/job/PreCommit-YARN-Build/9588/artifact/patchprocess/testrun_hadoop-sls.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9588/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9588/console | This message was automatically generated. > SLS not able start due to NPE in > SchedulerApplicationAttempt#getResourceUsageReport > --- > > Key: YARN-4302 > URL: https://issues.apache.org/jira/browse/YARN-4302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: 0001-YARN-4302.patch > > > Configure the samples from tools/sls > yarn-site.xml > capacityscheduler.xml > sls-runner.xml > to /etc/hadoop > Start sls using > > bin/slsrun.sh --input-rumen=sample-data/2jobs2min-rumen-jh.json > --output-dir=out > {noformat} > 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling > event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 > java.lang.NullPointerException > at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) > at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) > at > org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) > at >
[jira] [Created] (YARN-4304) AM max resource configuration per partition need not be displayed properly in UI
Sunil G created YARN-4304: - Summary: AM max resource configuration per partition need not be displayed properly in UI Key: YARN-4304 URL: https://issues.apache.org/jira/browse/YARN-4304 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 2.7.1 Reporter: Sunil G Assignee: Sunil G As we now support configuring the max AM resource percentage per partition, the UI also needs to display the corresponding configuration correctly. The current UI still shows the AM resource percentage at the queue level; this should be updated when a label configuration is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4296) DistributedShell Log.info is not friendly
[ https://issues.apache.org/jira/browse/YARN-4296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira AJISAKA updated YARN-4296: Assignee: Xiaowei Wang > DistributedShell Log.info is not friendly > - > > Key: YARN-4296 > URL: https://issues.apache.org/jira/browse/YARN-4296 > Project: Hadoop YARN > Issue Type: Bug > Components: applications/distributed-shell >Affects Versions: 2.5.2, 2.7.1 >Reporter: Xiaowei Wang >Assignee: Xiaowei Wang > Fix For: 2.8.0 > > Attachments: YARN-4296.0.patch > > > In java/org/apache/hadoop/yarn/applications/distributedshell/Client.java, > line 453,Log.info is not friendly ,I think,nodeAddress and its content should > be separated with "=" > The origin Log.info is > {noformat} > 15/10/24 10:20:26 INFO distributedshell.Client: Got node report from ASM for, > nodeId=cloud101414328.wd.nm.ss.nop:45454, > nodeAddresscloud101414328.wd.nm.ss.nop:23999, nodeRackName/default-rack, > nodeNumContainers19 > 15/10/24 10:20:26 INFO distributedshell.Client: Got node report from ASM for, > nodeId=rsync.cloud10141032027.wd.nm.nop:45454, > nodeAddressrsync.cloud10141032027.wd.nm.nop:23999, nodeRackName/default-rack, > nodeNumContainers19 > {noformat} > A better way is > {noformat} > 15/10/24 10:20:26 INFO distributedshell.Client: Got node report from ASM for, > nodeId=cloud101414328.wd.nm.ss.nop:45454, > nodeAddress=cloud101414328.wd.nm.ss.nop:23999, nodeRackName=/default-rack, > nodeNumContainers=19 > 15/10/24 10:20:26 INFO distributedshell.Client: Got node report from ASM for, > nodeId=rsync.cloud10141032027.wd.nm.nop:45454, > nodeAddress=rsync.cloud10141032027.wd.nm.nop:23999, > nodeRackName=/default-rack, nodeNumContainers=19 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
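The fix described above amounts to inserting the missing {{=}} separators; a sketch of the corrected line construction (the helper name is invented, since the actual patch edits the LOG.info call in distributedshell's Client.java directly):

```java
// Invented helper showing the corrected key=value node-report line: every
// field gets an '=' separator so nodeAddress, nodeRackName and
// nodeNumContainers are unambiguous in the log output.
class NodeReportFormat {
    static String nodeReportLine(String nodeId, String nodeAddress,
                                 String nodeRackName, int nodeNumContainers) {
        return String.format(
            "Got node report from ASM for, nodeId=%s, nodeAddress=%s, "
                + "nodeRackName=%s, nodeNumContainers=%d",
            nodeId, nodeAddress, nodeRackName, nodeNumContainers);
    }
}
```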
[jira] [Commented] (YARN-4304) AM max resource configuration per partition need not be displayed properly in UI
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976309#comment-14976309 ] Bibin A Chundatt commented on YARN-4304: Hi [~sunilg], the cluster metrics also need to be updated along with the Scheduler page. Currently, {{Total Memory & Total Vcores}} in the cluster metrics show only the DEFAULT_PARTITION resources. Should I raise a separate JIRA for this? > AM max resource configuration per partition need not be displayed properly in > UI > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > > As we are supporting per-partition level max AM resource percentage > configuration, UI also need to display correct configurations related to > same. Current UI still shows am-resource percentage per queue level. This is > to be updated correctly when label config is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
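The metrics concern raised above is an aggregation question: cluster totals should span every partition rather than only DEFAULT_PARTITION. An illustrative aggregation (the per-partition map is an assumption, not the actual ClusterMetrics API):

```java
import java.util.Map;

// Hypothetical aggregation: the "Total Memory" metric should sum the
// resources of every partition, not report only DEFAULT_PARTITION.
class ClusterMetricsSketch {
    static long totalMemoryMB(Map<String, Long> memoryMBByPartition) {
        long total = 0;
        for (long mb : memoryMBByPartition.values()) {
            total += mb;
        }
        return total;
    }
}
```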
[jira] [Commented] (YARN-4263) Capacity scheduler 60%-40% formatting floating point issue
[ https://issues.apache.org/jira/browse/YARN-4263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976230#comment-14976230 ] Hadoop QA commented on YARN-4263: - \\ \\ | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 19m 20s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 9m 9s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 11m 36s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 26s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 44s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 46s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 39s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 1m 24s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | mapreduce tests | 0m 53s | Tests passed in hadoop-mapreduce-client-common. 
| | | | 46m 2s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12768924/YARN-4263.005.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 96677be | | hadoop-mapreduce-client-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/9586/artifact/patchprocess/testrun_hadoop-mapreduce-client-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9586/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9586/console | This message was automatically generated. > Capacity scheduler 60%-40% formatting floating point issue > -- > > Key: YARN-4263 > URL: https://issues.apache.org/jira/browse/YARN-4263 > Project: Hadoop YARN > Issue Type: Bug > Components: client >Affects Versions: 2.7.1 >Reporter: Adrian Kalaszi >Priority: Trivial > Labels: easyfix > Attachments: YARN-4263.001.patch, YARN-4263.002.patch, > YARN-4263.003.patch, YARN-4263.004.patch, YARN-4263.005.patch > > > If capacity scheduler is set with two queues to 60% and 40% capacity, due to > a Java float representation issue > {code} > > hadoop queue -list > == > Queue Name : default > Queue State : running > Scheduling Info : Capacity: 40.0, MaximumCapacity: 100.0, CurrentCapacity: > 0.0 > == > Queue Name : large > Queue State : running > Scheduling Info : Capacity: 60.000004, MaximumCapacity: 100.0, > CurrentCapacity: 0.0 > {code} > Because > {code} System.err.println((0.6f) * 100); {code} > results in 60.000004. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
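The root cause generalizes: 0.6 has no exact binary {{float}} representation, so {{0.6f * 100}} surfaces the representation error when printed. A small demonstration, plus the common remedy of rounding at display time (the {{%.1f}} formatting here is an illustration, not necessarily what the YARN-4263 patch does):

```java
import java.util.Locale;

// 0.6f is actually ~0.60000002 in binary, and multiplying by 100 rounds to
// the float nearest 60.0000038, which Float.toString renders as "60.000004".
class CapacityFormatSketch {
    /** Raw float arithmetic leaks the representation error into the string. */
    static String raw(float capacityFraction) {
        return String.valueOf(capacityFraction * 100);
    }

    /** Rounding at display time hides the artifact. */
    static String display(float capacityFraction) {
        return String.format(Locale.ROOT, "%.1f", capacityFraction * 100);
    }
}
```

Note that 0.4f happens to round back onto 40.0 exactly, which is why only the 60% queue shows the artifact in the report above.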
[jira] [Updated] (YARN-4304) AM max resource configuration per partition need not be displayed properly in UI
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4304: -- Issue Type: Sub-task (was: Bug) Parent: YARN-2492 > AM max resource configuration per partition need not be displayed properly in > UI > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > > As we are supporting per-partition level max AM resource percentage > configuration, UI also need to display correct configurations related to > same. Current UI still shows am-resource percentage per queue level. This is > to be updated correctly when label config is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-1509) Make AMRMClient support send increase container request and get increased/decreased containers
[ https://issues.apache.org/jira/browse/YARN-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] MENG DING updated YARN-1509: Attachment: YARN-1509.8.patch Thanks [~leftnoteasy] for the comments. Your concern is valid. I have updated the patch to use {{AbstractMap.SimpleEntry}} instead. > Make AMRMClient support send increase container request and get > increased/decreased containers > -- > > Key: YARN-1509 > URL: https://issues.apache.org/jira/browse/YARN-1509 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Wangda Tan (No longer used) >Assignee: MENG DING > Attachments: YARN-1509.1.patch, YARN-1509.2.patch, YARN-1509.3.patch, > YARN-1509.4.patch, YARN-1509.5.patch, YARN-1509.6.patch, YARN-1509.7.patch, > YARN-1509.8.patch > > > As described in YARN-1197, we need add API in AMRMClient to support > 1) Add increase request > 2) Can get successfully increased/decreased containers from RM -- This message was sent by Atlassian JIRA (v6.3.4#6332)
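{{AbstractMap.SimpleEntry}}, which the updated patch swaps in per the comment above, is the JDK's lightweight pair type. A generic usage sketch; pairing a container id with a memory size is purely an assumption for illustration:

```java
import java.util.AbstractMap;
import java.util.Map;

// AbstractMap.SimpleEntry is a concrete Map.Entry that can stand alone as a
// simple key/value pair, avoiding a custom tuple class in the client API.
class PairSketch {
    static Map.Entry<String, Integer> increaseRequest(String containerId, int memoryMB) {
        return new AbstractMap.SimpleEntry<>(containerId, memoryMB);
    }
}
```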
[jira] [Commented] (YARN-4164) Retrospect update ApplicationPriority API return type
[ https://issues.apache.org/jira/browse/YARN-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976656#comment-14976656 ] Naganarasimha G R commented on YARN-4164: - Also, can you take a look at the checkstyle and whitespace issues? The findbugs warning seems unrelated to this patch. > Retrospect update ApplicationPriority API return type > - > > Key: YARN-4164 > URL: https://issues.apache.org/jira/browse/YARN-4164 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4164.patch, 0002-YARN-4164.patch > > > Currently, the {{ApplicationClientProtocol#updateApplicationPriority()}} API > returns an empty UpdateApplicationPriorityResponse. > But the RM updates the priority to cluster.max-priority if the given priority is > greater than cluster.max-priority. In this scenario, the client should be informed of > the updated priority rather than being left to > assume that the requested priority was applied. > The same scenario can also occur during application submission, but I feel that when > updateApplicationPriority() is invoked explicitly via ApplicationClientProtocol, the > response should carry the updated priority. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
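The behavior requested in the description above reduces to: clamp, then echo the effective value. A hedged sketch with invented names (not the actual ApplicationClientProtocol types, which wrap this in a response object):

```java
// Hypothetical core of the requested API behavior: the RM clamps the
// requested priority to the cluster maximum and returns the priority it
// actually applied, so the caller never wrongly assumes the request stuck.
class PrioritySketch {
    static int updatePriority(int requested, int clusterMaxPriority) {
        return Math.min(requested, clusterMaxPriority);
    }
}
```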
[jira] [Commented] (YARN-4304) AM max resource configuration per partition need not be displayed properly in UI
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976668#comment-14976668 ] Sunil G commented on YARN-4304: --- Thanks Naga. Marked as sub task. > AM max resource configuration per partition need not be displayed properly in > UI > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > > As we are supporting per-partition level max AM resource percentage > configuration, UI also need to display correct configurations related to > same. Current UI still shows am-resource percentage per queue level. This is > to be updated correctly when label config is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4302) SLS not able start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976697#comment-14976697 ] Hudson commented on YARN-4302: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #603 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/603/]) YARN-4302. SLS not able start due to NPE in SchedulerApplicationAttempt. (vvasudev: rev c28e16b40caf1e22f72cf2214ebc2fe2eaca4d03) * hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java * hadoop-yarn-project/CHANGES.txt > SLS not able start due to NPE in > SchedulerApplicationAttempt#getResourceUsageReport > --- > > Key: YARN-4302 > URL: https://issues.apache.org/jira/browse/YARN-4302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Fix For: 2.8.0 > > Attachments: 0001-YARN-4302.patch, 0001-YARN-4302.patch > > > Configure the samples from tools/sls > yarn-site.xml > capacityscheduler.xml > sls-runner.xml > to /etc/hadoop > Start sls using > > bin/slsrun.sh --input-rumen=sample-data/2jobs2min-rumen-jh.json > --output-dir=out > {noformat} > 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling > event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 > java.lang.NullPointerException > at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) > at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) > at > org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) > at > 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:839) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:820) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:801) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by 
Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4164) Retrospect update ApplicationPriority API return type
[ https://issues.apache.org/jira/browse/YARN-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976557#comment-14976557 ] Hadoop QA commented on YARN-4164: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 30m 20s | Pre-patch trunk has 3 extant Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | javac | 11m 9s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 14m 10s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 31s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 3m 34s | The applied patch generated 1 new checkstyle issues (total was 2, now 3). | | {color:red}-1{color} | whitespace | 0m 1s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 2m 1s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 48s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 8m 58s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | mapreduce tests | 147m 4s | Tests failed in hadoop-mapreduce-client-jobclient. | | {color:green}+1{color} | yarn tests | 0m 39s | Tests passed in hadoop-yarn-api. | | {color:red}-1{color} | yarn tests | 7m 36s | Tests failed in hadoop-yarn-client. | | {color:green}+1{color} | yarn tests | 2m 37s | Tests passed in hadoop-yarn-common. | | {color:green}+1{color} | yarn tests | 66m 9s | Tests passed in hadoop-yarn-server-resourcemanager. 
| | | | 296m 31s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.mapreduce.v2.TestMRJobsWithProfiler | | | hadoop.mapred.TestMiniMRClientCluster | | | hadoop.mapreduce.v2.TestMRJobs | | | hadoop.mapreduce.v2.TestNonExistentJob | | | hadoop.mapreduce.v2.TestUberAM | | | hadoop.mapreduce.v2.TestMRJobsWithHistoryService | | | hadoop.yarn.client.api.impl.TestYarnClient | | Timed out tests | org.apache.hadoop.mapreduce.TestLargeSort | | | org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12768938/0002-YARN-4164.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 96677be | | Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/9587/artifact/patchprocess/trunkFindbugsWarningshadoop-yarn-common.html | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/9587/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/9587/artifact/patchprocess/whitespace.txt | | hadoop-mapreduce-client-jobclient test log | https://builds.apache.org/job/PreCommit-YARN-Build/9587/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt | | hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/9587/artifact/patchprocess/testrun_hadoop-yarn-api.txt | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/9587/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/9587/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/9587/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9587/testReport/ | | 
Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9587/console | This message was automatically generated. > Retrospect update ApplicationPriority API return type > - > > Key: YARN-4164 > URL: https://issues.apache.org/jira/browse/YARN-4164 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4164.patch, 0002-YARN-4164.patch > > > Currently {{ApplicationClientProtocol#updateApplicationPriority()}} API > returns empty UpdateApplicationPriorityResponse response. > But RM update priority to the cluster.max-priority
[jira] [Commented] (YARN-1565) Add a way for YARN clients to get critical YARN system properties from the RM
[ https://issues.apache.org/jira/browse/YARN-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976669#comment-14976669 ] Pradeep Subrahmanion commented on YARN-1565: Gentle reminder for review. > Add a way for YARN clients to get critical YARN system properties from the RM > - > > Key: YARN-1565 > URL: https://issues.apache.org/jira/browse/YARN-1565 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.2.0 >Reporter: Steve Loughran > Attachments: YARN-1565-001.patch, YARN-1565-002.patch, > YARN-1565-003.patch, YARN-1565-004.patch > > > If you are trying to build up an AM request, you need to know > # the limits of memory, core for the chosen queue > # the existing YARN classpath > # the path separator for the target platform (so your classpath comes out > right) > # cluster OS: in case you need some OS-specific changes > The classpath can be in yarn-site.xml, but a remote client may not have that. > The site-xml file doesn't list Queue resource limits, cluster OS or the path > separator. > A way to query the RM for these values would make it easier for YARN clients > to build up AM submissions with less guesswork and client-side config. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4302) SLS not able start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976643#comment-14976643 ] Hudson commented on YARN-4302: -- FAILURE: Integrated in Hadoop-trunk-Commit #8714 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8714/]) YARN-4302. SLS not able start due to NPE in SchedulerApplicationAttempt. (vvasudev: rev c28e16b40caf1e22f72cf2214ebc2fe2eaca4d03) * hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java * hadoop-yarn-project/CHANGES.txt > SLS not able start due to NPE in > SchedulerApplicationAttempt#getResourceUsageReport > --- > > Key: YARN-4302 > URL: https://issues.apache.org/jira/browse/YARN-4302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Fix For: 2.8.0 > > Attachments: 0001-YARN-4302.patch, 0001-YARN-4302.patch > > > Configure the samples from tools/sls > yarn-site.xml > capacityscheduler.xml > sls-runner.xml > to /etc/hadoop > Start sls using > > bin/slsrun.sh --input-rumen=sample-data/2jobs2min-rumen-jh.json > --output-dir=out > {noformat} > 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling > event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 > java.lang.NullPointerException > at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) > at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) > at > org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) > at > 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:839) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:820) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:801) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by 
Atlassian JIRA (v6.3.4#6332)
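The trace above dies inside Resources.clone because the wrapper hands RMAppAttemptMetrics a usage report whose resource is null. The committed fix lives in ResourceSchedulerWrapper and is not reproduced here; the sketch below only illustrates the null-guard pattern, using stand-in classes rather than the real org.apache.hadoop.yarn.api.records types.

```java
// Hypothetical null-guard sketch for the NPE above; stand-in classes only,
// not the actual YARN types or the committed patch.
public class UsageReportGuard {

    /** Simplified stand-in for a YARN Resource (memory MB, virtual cores). */
    public static final class Resource {
        public final int memory;
        public final int vcores;
        public Resource(int memory, int vcores) {
            this.memory = memory;
            this.vcores = vcores;
        }
    }

    /**
     * Resources.multiply clones its argument first, so passing the null
     * resource from an empty usage report triggers the NPE in the trace.
     * Substituting a zero resource keeps the report path alive.
     */
    public static Resource multiply(Resource r, double by) {
        Resource safe = (r == null) ? new Resource(0, 0) : r;
        return new Resource((int) (safe.memory * by), (int) (safe.vcores * by));
    }

    public static void main(String[] args) {
        Resource out = multiply(null, 2.0); // would NPE without the guard
        System.out.println(out.memory + "MB, " + out.vcores + " vcores");
    }
}
```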
[jira] [Commented] (YARN-4304) AM max resource configuration per partition need not be displayed properly in UI
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976529#comment-14976529 ] Sunil G commented on YARN-4304: --- Thanks [~bibinchundatt] for pointing out. As Naga mentioned, I will handle this case in this patch as well. > AM max resource configuration per partition need not be displayed properly in > UI > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > > As we are supporting per-partition max AM resource percentage configuration, the > UI also needs to display the correct configurations for the same. The current UI > still shows the AM-resource percentage at the queue level. This should be updated > when label configuration is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4304) AM max resource configuration per partition need not be displayed properly in UI
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976544#comment-14976544 ] Naganarasimha G R commented on YARN-4304: - Also, I think we can convert this into a subtask of YARN-2492, as it's not a bug and you just added the functionality of partition-specific AM resources. > AM max resource configuration per partition need not be displayed properly in > UI > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > > As we are supporting per-partition max AM resource percentage configuration, the > UI also needs to display the correct configurations for the same. The current UI > still shows the AM-resource percentage at the queue level. This should be updated > when label configuration is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4164) Retrospect update ApplicationPriority API return type
[ https://issues.apache.org/jira/browse/YARN-4164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976653#comment-14976653 ] Naganarasimha G R commented on YARN-4164: - Thanks for the patch [~rohithsharma]. Yes, as you mentioned, per the patch the priority will be returned as null if the app is in COMPLETED_APP_STATES. However, IMHO I would not return null, as it might lead to an NPE on the client side if not handled properly; how about just returning the app's priority itself? If proper logs are required, an additional message can be sent as part of the response. The behavior I have described is also what currently happens on the REST side. Thoughts? If you agree with my comment, I think we should make the same correction on the REST side too, since it makes use of {{rm.getClientRMService()}}, i.e. return the priority from the UpdateApplicationPriorityResponse. > Retrospect update ApplicationPriority API return type > - > > Key: YARN-4164 > URL: https://issues.apache.org/jira/browse/YARN-4164 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4164.patch, 0002-YARN-4164.patch > > > Currently the {{ApplicationClientProtocol#updateApplicationPriority()}} API > returns an empty UpdateApplicationPriorityResponse. > But the RM updates the priority to cluster.max-priority if the given priority is > greater than cluster.max-priority. In this scenario, the updated priority needs to > be communicated back to the client rather than silently applied, since the client > otherwise assumes the requested priority was taken. > The same scenario can also occur during application submission, but I feel that when > ApplicationClientProtocol#updateApplicationPriority() is explicitly invoked, the > response should carry the updated priority. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
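The response semantics being debated reduce to a one-line clamp: the RM lowers an over-large request to cluster.max-priority and should echo the applied value back. The sketch below is illustrative only; it uses plain ints rather than the real Priority and UpdateApplicationPriorityResponse records.

```java
// Simplified model of the discussed behavior: return the priority the RM
// actually applied instead of null. Types are illustrative, not the YARN API.
public class PriorityClamp {

    /** Returns the priority the RM actually applies for a request. */
    public static int effectivePriority(int requested, int clusterMax) {
        // The RM silently lowers anything above the cluster maximum; echoing
        // the clamped value back spares the client a null check and an NPE.
        return Math.min(requested, clusterMax);
    }

    public static void main(String[] args) {
        System.out.println(effectivePriority(50, 10)); // prints 10
        System.out.println(effectivePriority(5, 10));  // prints 5
    }
}
```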
[jira] [Updated] (YARN-4288) NodeManager restart should keep retrying to register to RM while connection exception happens during RM failed over.
[ https://issues.apache.org/jira/browse/YARN-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-4288: - Attachment: YARN-4288-v2.patch Updated v2 patch to fix the issue in RMProxy only. > NodeManager restart should keep retrying to register to RM while connection > exception happens during RM failed over. > > > Key: YARN-4288 > URL: https://issues.apache.org/jira/browse/YARN-4288 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Attachments: YARN-4288-v2.patch, YARN-4288.patch > > > When the NM is restarted, NodeStatusUpdaterImpl tries to register with the RM > over RPC, which can fail with exceptions like the following when the RM is > restarting at the same time: > {noformat} > 2015-08-17 14:35:59,434 ERROR nodemanager.NodeStatusUpdaterImpl > (NodeStatusUpdaterImpl.java:rebootNodeStatusUpdaterAndRegisterWithRM(222)) - > Unexpected error rebooting NodeStatusUpdater > java.io.IOException: Failed on local exception: java.io.IOException: > Connection reset by peer; Host Details : local host is: "172.27.62.28"; > destination host is: "172.27.62.57":8025; > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) > at org.apache.hadoop.ipc.Client.call(Client.java:1473) > at org.apache.hadoop.ipc.Client.call(Client.java:1400) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) > at com.sun.proxy.$Proxy36.registerNodeManager(Unknown Source) > at > org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:68) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at 
java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) > at com.sun.proxy.$Proxy37.registerNodeManager(Unknown Source) > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:257) > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.rebootNodeStatusUpdaterAndRegisterWithRM(NodeStatusUpdaterImpl.java:215) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager$2.run(NodeManager.java:304) > Caused by: java.io.IOException: Connection reset by peer > at sun.nio.ch.FileDispatcherImpl.read0(Native Method) > at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) > at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) > at sun.nio.ch.IOUtil.read(IOUtil.java:197) > at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) > at > org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57) > at > org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142) > at > org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161) > at > org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131) > at java.io.FilterInputStream.read(FilterInputStream.java:133) > at java.io.FilterInputStream.read(FilterInputStream.java:133) > at > org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:514) > at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) > at java.io.BufferedInputStream.read(BufferedInputStream.java:254) > at java.io.DataInputStream.readInt(DataInputStream.java:387) > at > org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1072) > at org.apache.hadoop.ipc.Client$Connection.run(Client.java:967) > 2015-08-17 14:35:59,436 FATAL nodemanager.NodeManager > (NodeManager.java:run(307)) 
- Error while rebooting NodeStatusUpdater. > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: > Failed on local exception: java.io.IOException: Connection reset by peer; > Host Details : local host is: "172.27.62.28"; destination host is: > "172.27.62.57":8025; > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.rebootNodeStatusUpdaterAndRegisterWithRM(NodeStatusUpdaterImpl.java:223) > at >
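The issue is that the connection-reset IOException escapes the registration path as a fatal YarnRuntimeException instead of being retried at the RMProxy layer. The patch itself is not shown in this thread; the generic retry shape it relies on looks roughly like the following, where the policy values and names are hypothetical, not the actual RMProxy retry policy.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Illustrative sketch of a bounded retry loop around NM registration, so a
// connection reset during RM failover does not become fatal. Hypothetical
// names and policy values; not the actual YARN-4288 patch.
public class RegisterRetry {

    /** Retries the call on IOException, sleeping between attempts. */
    public static <T> T withRetries(Callable<T> call, int maxAttempts, long sleepMs)
            throws Exception {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {     // e.g. "Connection reset by peer"
                last = e;
                Thread.sleep(sleepMs);    // back off before re-registering
            }
        }
        throw last;                       // give up after maxAttempts failures
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated RM: fails twice while failing over, then accepts.
        String result = withRetries(() -> {
            if (++calls[0] < 3) throw new IOException("Connection reset by peer");
            return "registered";
        }, 5, 1L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```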
[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state
[ https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976928#comment-14976928 ] Varun Saxena commented on YARN-2902: Thanks a lot [~jlowe] for the review. I was under the incorrect impression that a downloading resource would not be taken up by other containers again. You are correct that we should not FAIL the resource here; it will be taken up by an outstanding container when the next heartbeat comes. If we do not call handleDownloadingRsrcsOnCleanup, we won't need to synchronize the scheduled map either. Also, event.getResource().getLocalPath() can be used here too, which would preclude the need for the ScheduledResource class and hence the refactoring associated with it. However, as the resource would not be explicitly FAILED in this case, we should probably do some cleanup when the reference count of a downloading resource becomes 0. Otherwise the entry associated with the downloading resource will remain in LocalResourcesTrackerImpl#localResourceMap, and this may show up when cache cleanup is done. We could then end up with the same log {{LOG.error("Attempt to remove resource: " + rsrc + " with non-zero refcount");}} even though the resource is deleted on disk. I think that in LocalResourcesTrackerImpl#handle, after handling the RELEASE event, we should check whether the reference count is 0 and the state of the resource is DOWNLOADING, and if so, call LocalResourcesTrackerImpl#removeResource. Thoughts? 
> Killing a container that is localizing can orphan resources in the > DOWNLOADING state > > > Key: YARN-2902 > URL: https://issues.apache.org/jira/browse/YARN-2902 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Varun Saxena > Attachments: YARN-2902.002.patch, YARN-2902.03.patch, > YARN-2902.04.patch, YARN-2902.05.patch, YARN-2902.06.patch, > YARN-2902.07.patch, YARN-2902.08.patch, YARN-2902.patch > > > If a container is in the process of localizing when it is stopped/killed then > resources are left in the DOWNLOADING state. If no other container comes > along and requests these resources they linger around with no reference > counts but aren't cleaned up during normal cache cleanup scans since it will > never delete resources in the DOWNLOADING state even if their reference count > is zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
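The cleanup being proposed above — drop a DOWNLOADING resource from the tracker's map as soon as its reference count reaches zero after a RELEASE — can be modeled in a few lines. This is a minimal map-based sketch with hypothetical names, not the actual LocalResourcesTrackerImpl.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal model of the proposed check: after a RELEASE, a resource still in
// DOWNLOADING state with zero references is removed immediately, so no
// orphaned entry survives to trip the cache-cleanup scan. Not YARN code.
public class DownloadingTracker {

    public enum State { DOWNLOADING, LOCALIZED }

    public static final class Rsrc {
        public State state;
        public int refCount;
        public Rsrc(State state, int refCount) {
            this.state = state;
            this.refCount = refCount;
        }
    }

    /** Stand-in for LocalResourcesTrackerImpl#localResourceMap. */
    public final Map<String, Rsrc> localRsrcMap = new HashMap<>();

    /** Handles a RELEASE event for the given resource key. */
    public void release(String key) {
        Rsrc r = localRsrcMap.get(key);
        if (r == null) {
            return;
        }
        r.refCount--;
        // The proposed check: drop DOWNLOADING resources once unreferenced,
        // instead of leaving a zero-refcount entry behind.
        if (r.refCount == 0 && r.state == State.DOWNLOADING) {
            localRsrcMap.remove(key);
        }
    }
}
```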
[jira] [Commented] (YARN-4302) SLS not able start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976943#comment-14976943 ] Hudson commented on YARN-4302: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #543 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/543/]) YARN-4302. SLS not able start due to NPE in SchedulerApplicationAttempt. (vvasudev: rev c28e16b40caf1e22f72cf2214ebc2fe2eaca4d03) * hadoop-yarn-project/CHANGES.txt * hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1509) Make AMRMClient support send increase container request and get increased/decreased containers
[ https://issues.apache.org/jira/browse/YARN-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976887#comment-14976887 ] Hadoop QA commented on YARN-1509: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 23m 38s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 11m 14s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 15m 22s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 36s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 7s | The applied patch generated 3 new checkstyle issues (total was 79, now 75). | | {color:green}+1{color} | whitespace | 0m 14s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 2m 16s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 51s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 2m 22s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 6m 46s | Tests failed in hadoop-yarn-applications-distributedshell. | | {color:red}-1{color} | yarn tests | 8m 3s | Tests failed in hadoop-yarn-client. 
| | | | 72m 36s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.yarn.applications.distributedshell.TestDistributedShell | | | hadoop.yarn.applications.distributedshell.TestDistributedShellWithNodeLabels | | | hadoop.yarn.client.api.impl.TestYarnClient | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12769009/YARN-1509.8.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / ed9806e | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/9590/artifact/patchprocess/diffcheckstylehadoop-yarn-client.txt | | hadoop-yarn-applications-distributedshell test log | https://builds.apache.org/job/PreCommit-YARN-Build/9590/artifact/patchprocess/testrun_hadoop-yarn-applications-distributedshell.txt | | hadoop-yarn-client test log | https://builds.apache.org/job/PreCommit-YARN-Build/9590/artifact/patchprocess/testrun_hadoop-yarn-client.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9590/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9590/console | This message was automatically generated. 
> Make AMRMClient support send increase container request and get > increased/decreased containers > -- > > Key: YARN-1509 > URL: https://issues.apache.org/jira/browse/YARN-1509 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Wangda Tan (No longer used) >Assignee: MENG DING > Attachments: YARN-1509.1.patch, YARN-1509.2.patch, YARN-1509.3.patch, > YARN-1509.4.patch, YARN-1509.5.patch, YARN-1509.6.patch, YARN-1509.7.patch, > YARN-1509.8.patch > > > As described in YARN-1197, we need to add APIs in AMRMClient to support > 1) adding increase requests, and > 2) getting successfully increased/decreased containers from the RM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4302) SLS not able start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976952#comment-14976952 ] Hudson commented on YARN-4302: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #2534 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2534/]) YARN-4302. SLS not able start due to NPE in SchedulerApplicationAttempt. (vvasudev: rev c28e16b40caf1e22f72cf2214ebc2fe2eaca4d03) * hadoop-yarn-project/CHANGES.txt * hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state
[ https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977077#comment-14977077 ] Jason Lowe commented on YARN-2902: -- bq. I think in LocalResourcesTrackerImpl#handle, after handling RELEASE event, we should check if the reference count is 0 and whether state of resource is DOWNLOADING. And if this is so, call LocalResourcesTrackerImpl#removeResource. Agreed. We can automatically remove the resource if the refcount of a downloaded resource ever goes to zero. And if there's a race where another container is just trying to reference that resource just as we're releasing (and removing) it from a killed container then either we'll keep it because the refcount is nonzero (request comes before release) or we'll create a new resource to track the subsequent request (release comes before request). > Killing a container that is localizing can orphan resources in the > DOWNLOADING state > > > Key: YARN-2902 > URL: https://issues.apache.org/jira/browse/YARN-2902 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Varun Saxena > Attachments: YARN-2902.002.patch, YARN-2902.03.patch, > YARN-2902.04.patch, YARN-2902.05.patch, YARN-2902.06.patch, > YARN-2902.07.patch, YARN-2902.08.patch, YARN-2902.patch > > > If a container is in the process of localizing when it is stopped/killed then > resources are left in the DOWNLOADING state. If no other container comes > along and requests these resources they linger around with no reference > counts but aren't cleaned up during normal cache cleanup scans since it will > never delete resources in the DOWNLOADING state even if their reference count > is zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
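The race resolution described above hinges on request and release being handled under the same lock: if the request arrives first, the nonzero refcount keeps the resource; if the release arrives first, the entry is removed and the later request creates a fresh one. The following is an illustrative model with hypothetical names, not YARN code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of the request/release ordering discussed above: under
// one lock, either ordering leaves the map in a consistent state. Not the
// actual LocalResourcesTrackerImpl.
public class RequestReleaseRace {

    /** Resource key -> reference count; absent key means no tracked resource. */
    public final Map<String, Integer> refCounts = new HashMap<>();

    /** A container references the resource; re-creates the entry if a prior
     *  release already removed it. */
    public synchronized void request(String key) {
        refCounts.merge(key, 1, Integer::sum);
    }

    /** A (possibly killed) container releases its reference; the entry is
     *  dropped once the refcount reaches zero. */
    public synchronized void release(String key) {
        Integer c = refCounts.get(key);
        if (c == null) {
            return;                        // already removed; nothing to do
        }
        if (c <= 1) {
            refCounts.remove(key);         // refcount hits zero: drop entry
        } else {
            refCounts.put(key, c - 1);
        }
    }
}
```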
[jira] [Commented] (YARN-4302) SLS not able start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977039#comment-14977039 ] Hudson commented on YARN-4302: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #591 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/591/]) YARN-4302. SLS not able start due to NPE in SchedulerApplicationAttempt. (vvasudev: rev c28e16b40caf1e22f72cf2214ebc2fe2eaca4d03) * hadoop-yarn-project/CHANGES.txt * hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java > SLS not able start due to NPE in > SchedulerApplicationAttempt#getResourceUsageReport > --- > > Key: YARN-4302 > URL: https://issues.apache.org/jira/browse/YARN-4302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Fix For: 2.8.0 > > Attachments: 0001-YARN-4302.patch, 0001-YARN-4302.patch > > > Configure the samples from tools/sls > yarn-site.xml > capacityscheduler.xml > sls-runner.xml > to /etc/hadoop > Start sls using > > bin/slsrun.sh --input-rumen=sample-data/2jobs2min-rumen-jh.json > --output-dir=out > {noformat} > 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling > event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 > java.lang.NullPointerException > at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) > at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) > at > org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) > at > 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:839) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:820) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:801) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by 
Atlassian JIRA (v6.3.4#6332)
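The NPE above originates in Resources.clone being handed a null Resource when the SLS scheduler wrapper reports usage for a just-added attempt. A common defensive pattern for this class of failure (shown here as an illustrative sketch with a stand-in Resource class, not the actual org.apache.hadoop.yarn.api.records.Resource or the committed YARN-4302 fix) is to substitute a zero resource for null before cloning:

```java
// Minimal null-guard sketch: cloning a null resource yields a zero resource
// instead of throwing a NullPointerException. "Resource" is a stand-in class.
public class NullGuardSketch {
    static class Resource {
        final long memory;
        final int vcores;
        Resource(long memory, int vcores) {
            this.memory = memory;
            this.vcores = vcores;
        }
    }

    static final Resource NONE = new Resource(0, 0);

    // Defensive clone: treat null as "no resources used yet".
    static Resource cloneSafe(Resource r) {
        Resource src = (r == null) ? NONE : r;
        return new Resource(src.memory, src.vcores);
    }

    public static void main(String[] args) {
        Resource copy = cloneSafe(null);
        System.out.println(copy.memory + "MB " + copy.vcores + " vcores");  // 0MB 0 vcores
    }
}
```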
[jira] [Commented] (YARN-4283) hadoop-yarn Avoid unsafe split and append on fields that might be IPv6 literals
[ https://issues.apache.org/jira/browse/YARN-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976996#comment-14976996 ] Elliott Clark commented on YARN-4283: - +1 Patch looks good to me. Anyone else have comments? If not I'll commit in a little bit. > hadoop-yarn Avoid unsafe split and append on fields that might be IPv6 > literals > --- > > Key: YARN-4283 > URL: https://issues.apache.org/jira/browse/YARN-4283 > Project: Hadoop YARN > Issue Type: Task >Reporter: Nemanja Matkovic >Assignee: Nemanja Matkovic > Labels: ipv6 > Attachments: YARN-4283-HADOOP-11890.1.patch > > Original Estimate: 48h > Remaining Estimate: 48h > > hadoop-yarn part of HADOOP-12122 task -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4302) SLS not able start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977014#comment-14977014 ] Hudson commented on YARN-4302: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #1327 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/1327/]) YARN-4302. SLS not able start due to NPE in SchedulerApplicationAttempt. (vvasudev: rev c28e16b40caf1e22f72cf2214ebc2fe2eaca4d03) * hadoop-yarn-project/CHANGES.txt * hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java > SLS not able start due to NPE in > SchedulerApplicationAttempt#getResourceUsageReport > --- > > Key: YARN-4302 > URL: https://issues.apache.org/jira/browse/YARN-4302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Fix For: 2.8.0 > > Attachments: 0001-YARN-4302.patch, 0001-YARN-4302.patch > > > Configure the samples from tools/sls > yarn-site.xml > capacityscheduler.xml > sls-runner.xml > to /etc/hadoop > Start sls using > > bin/slsrun.sh --input-rumen=sample-data/2jobs2min-rumen-jh.json > --output-dir=out > {noformat} > 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling > event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 > java.lang.NullPointerException > at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) > at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) > at > org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) > at > 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:839) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:820) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:801) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by 
Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state
[ https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-2902: --- Attachment: YARN-2902.09.patch > Killing a container that is localizing can orphan resources in the > DOWNLOADING state > > > Key: YARN-2902 > URL: https://issues.apache.org/jira/browse/YARN-2902 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Varun Saxena > Attachments: YARN-2902.002.patch, YARN-2902.03.patch, > YARN-2902.04.patch, YARN-2902.05.patch, YARN-2902.06.patch, > YARN-2902.07.patch, YARN-2902.08.patch, YARN-2902.09.patch, YARN-2902.patch > > > If a container is in the process of localizing when it is stopped/killed then > resources are left in the DOWNLOADING state. If no other container comes > along and requests these resources they linger around with no reference > counts but aren't cleaned up during normal cache cleanup scans since it will > never delete resources in the DOWNLOADING state even if their reference count > is zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state
[ https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-2902: --- Attachment: (was: YARN-2902.09.patch) > Killing a container that is localizing can orphan resources in the > DOWNLOADING state > > > Key: YARN-2902 > URL: https://issues.apache.org/jira/browse/YARN-2902 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Varun Saxena > Attachments: YARN-2902.002.patch, YARN-2902.03.patch, > YARN-2902.04.patch, YARN-2902.05.patch, YARN-2902.06.patch, > YARN-2902.07.patch, YARN-2902.08.patch, YARN-2902.patch > > > If a container is in the process of localizing when it is stopped/killed then > resources are left in the DOWNLOADING state. If no other container comes > along and requests these resources they linger around with no reference > counts but aren't cleaned up during normal cache cleanup scans since it will > never delete resources in the DOWNLOADING state even if their reference count > is zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state
[ https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977247#comment-14977247 ] Varun Saxena commented on YARN-2902: [~jlowe], kindly review. The patch applies on branch-2.7 too. > Killing a container that is localizing can orphan resources in the > DOWNLOADING state > > > Key: YARN-2902 > URL: https://issues.apache.org/jira/browse/YARN-2902 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Varun Saxena > Attachments: YARN-2902.002.patch, YARN-2902.03.patch, > YARN-2902.04.patch, YARN-2902.05.patch, YARN-2902.06.patch, > YARN-2902.07.patch, YARN-2902.08.patch, YARN-2902.09.patch, YARN-2902.patch > > > If a container is in the process of localizing when it is stopped/killed then > resources are left in the DOWNLOADING state. If no other container comes > along and requests these resources they linger around with no reference > counts but aren't cleaned up during normal cache cleanup scans since it will > never delete resources in the DOWNLOADING state even if their reference count > is zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4287) Capacity Scheduler: Rack Locality improvement
[ https://issues.apache.org/jira/browse/YARN-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Roberts updated YARN-4287: - Attachment: YARN-4287-v4.patch V4 of patch. - I moved the calculation of locality delays out of canAssign() since this is a very hot path and the answer only changes when the size of the cluster changes. This caused a few unit tests to start failing because the number of nodes in the cluster was not always being mocked at the right time causing the LocalityDelays to be 0 which confused some of the assumptions. - I left the scaling approach in, but am willing to move to a rack-locality-delay that is specified as a percent. I absolutely want a node-locality-delay set to 5000, rack-locality-delay set to 5100, do something intelligent on a 3000 node cluster. - One argument for sticking with the scaling approach is the fact that we basically do it today in a simpler fashion. If you specify node-locality-delay of 5000 on a 3000 node cluster, it gets automatically scaled down to 3000 without informing the user. So I'd say scale it but don't try to explain it in user documentation. - Updated the documentation > Capacity Scheduler: Rack Locality improvement > - > > Key: YARN-4287 > URL: https://issues.apache.org/jira/browse/YARN-4287 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.7.1 >Reporter: Nathan Roberts >Assignee: Nathan Roberts > Attachments: YARN-4287-v2.patch, YARN-4287-v3.patch, > YARN-4287-v4.patch, YARN-4287.patch > > > YARN-4189 does an excellent job describing the issues with the current delay > scheduling algorithms within the capacity scheduler. The design proposal also > seems like a good direction. > This jira proposes a simple interim solution to the key issue we've been > experiencing on a regular basis: > - rackLocal assignments trickle out due to nodeLocalityDelay. 
This can have > significant impact on things like CombineFileInputFormat which targets very > specific nodes in its split calculations. > I'm not sure when YARN-4189 will become reality so I thought a simple interim > patch might make sense. The basic idea is simple: > 1) Separate delays for rackLocal, and OffSwitch (today there is only 1) > 2) When we're getting rackLocal assignments, subsequent rackLocal assignments > should not be delayed > Patch will be uploaded shortly. No big deal if the consensus is to go > straight to YARN-4189. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
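The scaling behavior Nathan describes (a configured node-locality-delay of 5000 being silently capped at the cluster size, e.g. 3000 on a 3000-node cluster) can be sketched as below. This is an illustration of the capping rule only; the method name is hypothetical, not the CapacityScheduler's actual code.

```java
// Illustrative sketch of delay capping: the effective number of scheduling
// opportunities to wait for a node-local assignment is bounded by the number
// of nodes in the cluster, since waiting longer than one pass over every node
// gains nothing.
public class LocalityDelaySketch {
    static int effectiveNodeLocalityDelay(int configuredDelay, int numClusterNodes) {
        if (configuredDelay < 0) {
            return 0;  // delay disabled
        }
        return Math.min(configuredDelay, numClusterNodes);
    }

    public static void main(String[] args) {
        // A configured delay of 5000 on a 3000-node cluster scales down to 3000.
        System.out.println(effectiveNodeLocalityDelay(5000, 3000));  // 3000
    }
}
```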
[jira] [Updated] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state
[ https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-2902: --- Attachment: YARN-2902.09.patch > Killing a container that is localizing can orphan resources in the > DOWNLOADING state > > > Key: YARN-2902 > URL: https://issues.apache.org/jira/browse/YARN-2902 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Varun Saxena > Attachments: YARN-2902.002.patch, YARN-2902.03.patch, > YARN-2902.04.patch, YARN-2902.05.patch, YARN-2902.06.patch, > YARN-2902.07.patch, YARN-2902.08.patch, YARN-2902.09.patch, YARN-2902.patch > > > If a container is in the process of localizing when it is stopped/killed then > resources are left in the DOWNLOADING state. If no other container comes > along and requests these resources they linger around with no reference > counts but aren't cleaned up during normal cache cleanup scans since it will > never delete resources in the DOWNLOADING state even if their reference count > is zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state
[ https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-2902: --- Attachment: (was: YARN-2902.09.patch) > Killing a container that is localizing can orphan resources in the > DOWNLOADING state > > > Key: YARN-2902 > URL: https://issues.apache.org/jira/browse/YARN-2902 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Varun Saxena > Attachments: YARN-2902.002.patch, YARN-2902.03.patch, > YARN-2902.04.patch, YARN-2902.05.patch, YARN-2902.06.patch, > YARN-2902.07.patch, YARN-2902.08.patch, YARN-2902.patch > > > If a container is in the process of localizing when it is stopped/killed then > resources are left in the DOWNLOADING state. If no other container comes > along and requests these resources they linger around with no reference > counts but aren't cleaned up during normal cache cleanup scans since it will > never delete resources in the DOWNLOADING state even if their reference count > is zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4288) NodeManager restart should keep retrying to register to RM while connection exception happens during RM failed over.
[ https://issues.apache.org/jira/browse/YARN-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977391#comment-14977391 ] Hadoop QA commented on YARN-4288: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | pre-patch | 19m 17s | Pre-patch trunk has 3 extant Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. | | {color:green}+1{color} | javac | 8m 15s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 10m 42s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 1m 45s | The applied patch generated 3 new checkstyle issues (total was 41, now 44). | | {color:red}-1{color} | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 35s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 29s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | common tests | 6m 40s | Tests failed in hadoop-common. | | {color:green}+1{color} | yarn tests | 2m 4s | Tests passed in hadoop-yarn-common. 
| | | | 55m 2s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.ipc.TestDecayRpcScheduler | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12769058/YARN-4288-v2.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 68ce93c | | Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/9591/artifact/patchprocess/trunkFindbugsWarningshadoop-yarn-common.html | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/9591/artifact/patchprocess/diffcheckstylehadoop-common.txt | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/9591/artifact/patchprocess/whitespace.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/9591/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/9591/artifact/patchprocess/testrun_hadoop-yarn-common.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9591/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9591/console | This message was automatically generated. > NodeManager restart should keep retrying to register to RM while connection > exception happens during RM failed over. 
> > > Key: YARN-4288 > URL: https://issues.apache.org/jira/browse/YARN-4288 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.6.0 >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Attachments: YARN-4288-v2.patch, YARN-4288.patch > > > When NM get restarted, NodeStatusUpdaterImpl will try to register to RM with > RPC which could throw following exceptions when RM get restarted at the same > time, like following exception shows: > {noformat} > 2015-08-17 14:35:59,434 ERROR nodemanager.NodeStatusUpdaterImpl > (NodeStatusUpdaterImpl.java:rebootNodeStatusUpdaterAndRegisterWithRM(222)) - > Unexpected error rebooting NodeStatusUpdater > java.io.IOException: Failed on local exception: java.io.IOException: > Connection reset by peer; Host Details : local host is: "172.27.62.28"; > destination host is: "172.27.62.57":8025; > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772) > at org.apache.hadoop.ipc.Client.call(Client.java:1473) > at org.apache.hadoop.ipc.Client.call(Client.java:1400) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) > at com.sun.proxy.$Proxy36.registerNodeManager(Unknown Source) > at > org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:68) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at >
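The behavior proposed in this issue (keep retrying RM registration across connection failures during failover, rather than aborting with "Unexpected error rebooting NodeStatusUpdater") can be sketched as a bounded retry loop with backoff. All names here are illustrative, assumed for the sketch; this is not the NodeStatusUpdaterImpl patch itself.

```java
// Hypothetical retry-with-backoff sketch: retry an operation that may fail
// with IOException (e.g. "Connection reset by peer" while the RM fails over)
// instead of giving up on the first failure.
import java.io.IOException;
import java.util.concurrent.Callable;

public class RegisterRetrySketch {
    static <T> T retry(Callable<T> call, int maxAttempts, long backoffMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {      // transient: RM mid-failover
                last = e;
                Thread.sleep(backoffMs);   // back off before retrying
            }
        }
        throw last;                        // exhausted all attempts
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulate registration failing twice during failover, then succeeding.
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("Connection reset by peer");
            }
            return "registered";
        }, 5, 1);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```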
[jira] [Commented] (YARN-4283) Avoid unsafe split and append on fields that might be IPv6 literals
[ https://issues.apache.org/jira/browse/YARN-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977546#comment-14977546 ] Elliott Clark commented on YARN-4283: - Pushed. > Avoid unsafe split and append on fields that might be IPv6 literals > --- > > Key: YARN-4283 > URL: https://issues.apache.org/jira/browse/YARN-4283 > Project: Hadoop YARN > Issue Type: Task >Reporter: Nemanja Matkovic >Assignee: Nemanja Matkovic > Labels: ipv6 > Attachments: YARN-4283-HADOOP-11890.1.patch > > Original Estimate: 48h > Remaining Estimate: 48h > > hadoop-yarn part of HADOOP-12122 task -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4283) Avoid unsafe split and append on fields that might be IPv6 literals
[ https://issues.apache.org/jira/browse/YARN-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated YARN-4283: Summary: Avoid unsafe split and append on fields that might be IPv6 literals (was: hadoop-yarn Avoid unsafe split and append on fields that might be IPv6 literals) > Avoid unsafe split and append on fields that might be IPv6 literals > --- > > Key: YARN-4283 > URL: https://issues.apache.org/jira/browse/YARN-4283 > Project: Hadoop YARN > Issue Type: Task >Reporter: Nemanja Matkovic >Assignee: Nemanja Matkovic > Labels: ipv6 > Attachments: YARN-4283-HADOOP-11890.1.patch > > Original Estimate: 48h > Remaining Estimate: 48h > > hadoop-yarn part of HADOOP-12122 task -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4302) SLS not able start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977341#comment-14977341 ] Hudson commented on YARN-4302: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2481 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2481/]) YARN-4302. SLS not able start due to NPE in SchedulerApplicationAttempt. (vvasudev: rev c28e16b40caf1e22f72cf2214ebc2fe2eaca4d03) * hadoop-tools/hadoop-sls/src/main/java/org/apache/hadoop/yarn/sls/scheduler/ResourceSchedulerWrapper.java * hadoop-yarn-project/CHANGES.txt > SLS not able start due to NPE in > SchedulerApplicationAttempt#getResourceUsageReport > --- > > Key: YARN-4302 > URL: https://issues.apache.org/jira/browse/YARN-4302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Fix For: 2.8.0 > > Attachments: 0001-YARN-4302.patch, 0001-YARN-4302.patch > > > Configure the samples from tools/sls > yarn-site.xml > capacityscheduler.xml > sls-runner.xml > to /etc/hadoop > Start sls using > > bin/slsrun.sh --input-rumen=sample-data/2jobs2min-rumen-jh.json > --output-dir=out > {noformat} > 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling > event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 > java.lang.NullPointerException > at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) > at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) > at > org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) > at > 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:839) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:820) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:801) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:183) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:109) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by 
Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4305) NodeManager hang after namenode failover
He Tianyi created YARN-4305: --- Summary: NodeManager hang after namenode failover Key: YARN-4305 URL: https://issues.apache.org/jira/browse/YARN-4305 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.6.0 Reporter: He Tianyi Observed that there is a chance NodeManager got stuck after NameNode failing over, while seeing the following messages: {noformat} 2015-10-28 05:35:58,318 INFO org.apache.hadoop.io.retry.RetryInvocationHandler: Exception while invoking getFileInfo of class ClientNamenodeProtocolTranslatorPB over /10.6.128.152:5060. Trying to fail over immediately. org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87) at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1773) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1387) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getFileInfo(FSNamesystem.java:4282) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getFileInfo(NameNodeRpcServer.java:845) at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getFileInfo(AuthorizationProviderProxyClientProtocol.java:519) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getFileInfo(ClientNamenodeProtocolServerSideTranslatorPB.java:840) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040) at 
java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038) at org.apache.hadoop.ipc.Client.call(Client.java:1468) at org.apache.hadoop.ipc.Client.call(Client.java:1399) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232) at com.sun.proxy.$Proxy33.getFileInfo(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:735) at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) at com.sun.proxy.$Proxy34.getFileInfo(Unknown Source) at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1970) at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1128) at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1124) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1124) at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl$1.run(AppLogAggregatorImpl.java:300) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainers(AppLogAggregatorImpl.java:296) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:415) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:380) at org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$2.run(LogAggregationService.java:384) at
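The trace above ends in the HDFS client's failover retry path: the log-aggregation thread keeps invoking getFileInfo, hits StandbyException, fails over, and tries again. As a minimal, self-contained sketch (not Hadoop's actual RetryInvocationHandler; every name below is made up for illustration), this shows why a retry-on-standby loop with no attempt bound can hang the calling thread, and how a bounded attempt count fails fast instead:

```java
// Illustrative sketch only -- not Hadoop's RetryInvocationHandler; all names
// here are invented. It models a "retry on standby, fail over, retry again"
// policy: with no attempt bound the loop never exits if neither NameNode
// becomes active, which is the hang seen in the trace above.
public class FailoverRetrySketch {
    /** Stands in for the StandbyException thrown by a standby NameNode. */
    static class StandbyException extends RuntimeException {}

    interface NameNodeCall<T> { T invoke(); }

    /** Retries until success; maxAttempts < 0 means retry forever (the hang). */
    static <T> T invokeWithFailover(NameNodeCall<T> call, int maxAttempts) {
        int attempts = 0;
        while (true) {
            try {
                return call.invoke();
            } catch (StandbyException e) {
                attempts++;
                if (maxAttempts >= 0 && attempts >= maxAttempts) {
                    throw new RuntimeException(
                        "gave up after " + attempts + " attempts", e);
                }
                // The real client would switch to the other NameNode here; if
                // neither side ever becomes active, the loop never exits.
            }
        }
    }

    public static void main(String[] args) {
        // Succeeds on the third attempt, as if the failover eventually completed.
        final int[] calls = {0};
        String result = invokeWithFailover(() -> {
            if (++calls[0] < 3) throw new StandbyException();
            return "fileInfo";
        }, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In a real deployment the analogous bound lives in the HDFS client's failover retry configuration rather than in the calling code.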
[jira] [Assigned] (YARN-4303) Confusing help message if AM logs cant be retrieved via yarn logs command
[ https://issues.apache.org/jira/browse/YARN-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] nijel reassigned YARN-4303: --- Assignee: nijel > Confusing help message if AM logs cant be retrieved via yarn logs command > - > > Key: YARN-4303 > URL: https://issues.apache.org/jira/browse/YARN-4303 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Varun Saxena >Assignee: nijel >Priority: Minor > > {noformat} > yarn@BLR102525:~/test/install/hadoop/resourcemanager/bin> ./yarn logs > --applicationId application_1445832014581_0028 -am ALL > Can not get AMContainers logs for the > application:application_1445832014581_0028 > This application:application_1445832014581_0028 is finished. Please enable > the application history service. Or Using yarn logs -applicationId > -containerId --nodeAddress to get the > container logs > {noformat} > Part of the command output mentioned above indicates that using {{yarn logs > -applicationId -containerId --nodeAddress > }} will fetch desired result. It asks you to specify > nodeHttpAddress which makes it sound like we have to connect to nodemanager's > webapp address. > This help message should be changed to include command as {{yarn logs > -applicationId -containerId --nodeAddress Address>}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4306) Test failure: TestClientRMTokens
Sunil G created YARN-4306: - Summary: Test failure: TestClientRMTokens Key: YARN-4306 URL: https://issues.apache.org/jira/browse/YARN-4306 Project: Hadoop YARN Issue Type: Bug Components: test Reporter: Sunil G Assignee: Sunil G Tests are failing locally as well. As part of the HADOOP-12321 jenkins run, I see the same error: {noformat}testShortCircuitRenewCancelDifferentHostSamePort(org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens) Time elapsed: 0.638 sec <<< FAILURE! java.lang.AssertionError: expected: but was: at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens.checkShortCircuitRenewCancel(TestClientRMTokens.java:363) at org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens.testShortCircuitRenewCancelDifferentHostSamePort(TestClientRMTokens.java:316) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2902) Killing a container that is localizing can orphan resources in the DOWNLOADING state
[ https://issues.apache.org/jira/browse/YARN-2902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14977604#comment-14977604 ] Hadoop QA commented on YARN-2902: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 57s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. | | {color:green}+1{color} | javac | 9m 1s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 11m 28s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 28s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 42s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 6s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 43s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 37s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 1m 25s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. | | {color:red}-1{color} | yarn tests | 9m 21s | Tests failed in hadoop-yarn-server-nodemanager. 
| | | | 53m 52s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-yarn-server-nodemanager | | Failed unit tests | hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService | | | hadoop.yarn.server.nodemanager.containermanager.localizer.TestLocalResourcesTrackerImpl | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12769131/YARN-2902.09.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 68ce93c | | Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/9592/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-nodemanager.html | | hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/9592/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9592/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9592/console | This message was automatically generated. > Killing a container that is localizing can orphan resources in the > DOWNLOADING state > > > Key: YARN-2902 > URL: https://issues.apache.org/jira/browse/YARN-2902 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.5.0 >Reporter: Jason Lowe >Assignee: Varun Saxena > Attachments: YARN-2902.002.patch, YARN-2902.03.patch, > YARN-2902.04.patch, YARN-2902.05.patch, YARN-2902.06.patch, > YARN-2902.07.patch, YARN-2902.08.patch, YARN-2902.09.patch, YARN-2902.patch > > > If a container is in the process of localizing when it is stopped/killed then > resources are left in the DOWNLOADING state. 
If no other container comes > along and requests these resources they linger around with no reference > counts but aren't cleaned up during normal cache cleanup scans since it will > never delete resources in the DOWNLOADING state even if their reference count > is zero. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
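The description above boils down to a cleanup policy that treats DOWNLOADING entries as untouchable, so a zero-reference entry left by a killed container is never reclaimed. A toy model of that gap (assumed semantics only, not the NodeManager's actual LocalResourcesTracker code):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch, not the NodeManager's actual code. Models the cleanup
// gap described above: a scan that only evicts LOCALIZED entries leaves
// zero-reference DOWNLOADING entries behind forever.
public class DownloadingOrphanSketch {
    enum ResourceState { DOWNLOADING, LOCALIZED }

    static class CachedResource {
        final ResourceState state;
        final int refCount;
        CachedResource(ResourceState state, int refCount) {
            this.state = state;
            this.refCount = refCount;
        }
    }

    /** The buggy policy: DOWNLOADING entries are never eligible for cleanup. */
    static boolean evictableBuggy(CachedResource r) {
        return r.refCount == 0 && r.state == ResourceState.LOCALIZED;
    }

    /** A policy in the spirit of the fix: zero-ref DOWNLOADING entries
     *  (e.g. left behind by a killed container) are also reclaimed. */
    static boolean evictableFixed(CachedResource r) {
        return r.refCount == 0;
    }

    /** Runs one cleanup scan and returns what survives it. */
    static List<CachedResource> sweep(List<CachedResource> cache, boolean fixed) {
        List<CachedResource> remaining = new ArrayList<>();
        for (CachedResource r : cache) {
            if (!(fixed ? evictableFixed(r) : evictableBuggy(r))) {
                remaining.add(r);
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        List<CachedResource> cache = new ArrayList<>();
        cache.add(new CachedResource(ResourceState.DOWNLOADING, 0)); // the orphan
        cache.add(new CachedResource(ResourceState.LOCALIZED, 0));
        System.out.println("buggy sweep leaves " + sweep(cache, false).size());
        System.out.println("fixed sweep leaves " + sweep(cache, true).size());
    }
}
```

With the buggy policy the orphan survives every scan; with the fixed policy both zero-reference entries are reclaimed.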
[jira] [Commented] (YARN-4302) SLS not able start due to NPE in SchedulerApplicationAttempt#getResourceUsageReport
[ https://issues.apache.org/jira/browse/YARN-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976484#comment-14976484 ] Hadoop QA commented on YARN-4302: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 18m 47s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | javac | 8m 58s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 11m 44s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 26s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 0m 26s | There were no new checkstyle issues. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | install | 1m 44s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 39s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 0m 55s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. | | {color:green}+1{color} | tools/hadoop tests | 1m 0s | Tests passed in hadoop-sls. 
| | | | 44m 42s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12768982/0001-YARN-4302.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / bcb2386 | | hadoop-sls test log | https://builds.apache.org/job/PreCommit-YARN-Build/9589/artifact/patchprocess/testrun_hadoop-sls.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/9589/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/9589/console | This message was automatically generated. > SLS not able start due to NPE in > SchedulerApplicationAttempt#getResourceUsageReport > --- > > Key: YARN-4302 > URL: https://issues.apache.org/jira/browse/YARN-4302 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: 0001-YARN-4302.patch, 0001-YARN-4302.patch > > > Configure the samples from tools/sls > yarn-site.xml > capacityscheduler.xml > sls-runner.xml > to /etc/hadoop > Start sls using > > bin/slsrun.sh --input-rumen=sample-data/2jobs2min-rumen-jh.json > --output-dir=out > {noformat} > 15/10/27 14:43:36 ERROR resourcemanager.ResourceManager: Error in handling > event type ATTEMPT_ADDED for applicationAttempt application_1445937212593_0001 > java.lang.NullPointerException > at org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java:117) > at org.apache.hadoop.yarn.util.resource.Resources.multiply(Resources.java:151) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceUsageReport(SchedulerApplicationAttempt.java:692) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.getAppResourceUsageReport(AbstractYarnScheduler.java:326) > at > 
org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.getAppResourceUsageReport(ResourceSchedulerWrapper.java:912) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptMetrics.getAggregateAppResourceUsage(RMAppAttemptMetrics.java:121) > at > org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore.storeNewApplicationAttempt(RMStateStore.java:819) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.storeAttempt(RMAppAttemptImpl.java:2011) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.access$2700(RMAppAttemptImpl.java:109) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:1021) > at > org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$ScheduleTransition.transition(RMAppAttemptImpl.java:974) > at > org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at >
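The NPE above originates in Resources.clone being handed a null Resource while the usage report is built before the SLS wrapper has populated it. A hedged sketch of the defensive pattern such a trace suggests (the names below mirror the YARN ones but are simplified stand-ins, not the actual YARN-4302 patch):

```java
// Hedged sketch of a null guard for the failure path shown in the trace.
// The Resource type here is a simplified stand-in, not YARN's Resource.
public class NullSafeUsageSketch {
    static class Resource {
        final long memory;
        final int vcores;
        Resource(long memory, int vcores) {
            this.memory = memory;
            this.vcores = vcores;
        }
    }

    static final Resource NONE = new Resource(0, 0);

    /** The real Resources.multiply clones its argument first, so a null input
     *  NPEs. Substituting a zero Resource keeps the report computable. */
    static Resource multiplySafe(Resource r, double by) {
        Resource base = (r == null) ? NONE : r;  // the missing null guard
        return new Resource((long) (base.memory * by), (int) (base.vcores * by));
    }

    public static void main(String[] args) {
        // This call is the shape that previously threw the NullPointerException.
        Resource report = multiplySafe(null, 3.0);
        System.out.println(report.memory + " " + report.vcores);
    }
}
```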
[jira] [Commented] (YARN-4304) AM max resource configuration per partition need not be displayed properly in UI
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976483#comment-14976483 ] Bibin A Chundatt commented on YARN-4304: Not a problem at all. [~sunilg] please also consider metrics. > AM max resource configuration per partition need not be displayed properly in > UI > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > > As we now support per-partition max AM resource percentage > configuration, the UI also needs to display the correct configuration for > the same. The current UI still shows the am-resource percentage at the queue > level. This should be updated to show the correct values when label > configuration is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4287) Capacity Scheduler: Rack Locality improvement
[ https://issues.apache.org/jira/browse/YARN-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976386#comment-14976386 ] Nathan Roberts commented on YARN-4287: -- Thanks [~leftnoteasy] for the quick responses. {quote} I think instead of scaling, I suggest to simply cap rack/offswitch delay by the cluster size, so: rack-delay = min(offswitch, node-locality-delay, clusterSize) offswitch-delay = min(offswitch, clusterSize) The scaling behavior could be hard to explain to end users. {quote} I agree that it's not as easy to describe. BUT, the problem I have is that I don't know how to deal with the common case of someone wanting node-locality-delay to be based on the size of the cluster. What we do is set node-locality-delay to something guaranteed to be larger than the cluster, knowing the scheduler will automatically lower it to the size of the cluster. This works great for a single delay on any size cluster. However, it's impossible to describe two different delays using this same approach. For example, I might always want node-locality-delay to be 10% less than rack-locality-delay. Maybe we should specify rack-locality-delay as a percentage above node-locality-delay ( 10%)? Still a little hard to describe though. > Capacity Scheduler: Rack Locality improvement > - > > Key: YARN-4287 > URL: https://issues.apache.org/jira/browse/YARN-4287 > Project: Hadoop YARN > Issue Type: Improvement > Components: capacityscheduler >Affects Versions: 2.7.1 >Reporter: Nathan Roberts >Assignee: Nathan Roberts > Attachments: YARN-4287-v2.patch, YARN-4287-v3.patch, YARN-4287.patch > > > YARN-4189 does an excellent job describing the issues with the current delay > scheduling algorithms within the capacity scheduler. The design proposal also > seems like a good direction. > This jira proposes a simple interim solution to the key issue we've been > experiencing on a regular basis: > - rackLocal assignments trickle out due to nodeLocalityDelay. 
This can have > significant impact on things like CombineFileInputFormat which targets very > specific nodes in its split calculations. > I'm not sure when YARN-4189 will become reality so I thought a simple interim > patch might make sense. The basic idea is simple: > 1) Separate delays for rackLocal, and OffSwitch (today there is only 1) > 2) When we're getting rackLocal assignments, subsequent rackLocal assignments > should not be delayed > Patch will be uploaded shortly. No big deal if the consensus is to go > straight to YARN-4189. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
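The two schemes discussed in the comment above can be sketched numerically (assumed semantics only; this is not CapacityScheduler code, and the parameter names are invented for illustration): capping an absolute delay by the cluster size, versus deriving the rack delay as a percentage above the node delay.

```java
// Sketch of the two ideas above (assumed semantics; not CapacityScheduler code).
public class LocalityDelaySketch {
    /** Capping a configured delay at the cluster size lets operators set the
     *  delay "larger than any cluster" and have it degrade gracefully. */
    static int cappedDelay(int configuredDelay, int clusterSize) {
        return Math.min(configuredDelay, clusterSize);
    }

    /** The alternative floated at the end of the comment: the rack delay is a
     *  percentage above the node delay, then capped by cluster size. */
    static int rackDelayFromPercent(int nodeLocalityDelay, int pctAbove,
                                    int clusterSize) {
        int raw = nodeLocalityDelay + (nodeLocalityDelay * pctAbove) / 100;
        return Math.min(raw, clusterSize);
    }

    public static void main(String[] args) {
        // A 40-opportunity delay on a 25-node cluster is capped to 25.
        System.out.println(cappedDelay(40, 25));
        // Rack delay 10% above a node delay of 40, on a large cluster.
        System.out.println(rackDelayFromPercent(40, 10, 2000));
    }
}
```

The percentage form keeps the two delays in a fixed relationship on any cluster size, which is the property the absolute-cap approach cannot express.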
[jira] [Commented] (YARN-4304) AM max resource configuration per partition need not be displayed properly in UI
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14976396#comment-14976396 ] Naganarasimha G R commented on YARN-4304: - Hi [~bibinchundatt], good point. I think we can sum it up and show it along with this patch; since it is a small change, it is better done here itself. What do you say, [~sunilg]? > AM max resource configuration per partition need not be displayed properly in > UI > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > > As we now support per-partition max AM resource percentage > configuration, the UI also needs to display the correct configuration for > the same. The current UI still shows the am-resource percentage at the queue > level. This should be updated to show the correct values when label > configuration is used. -- This message was sent by Atlassian JIRA (v6.3.4#6332)