[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495481#comment-15495481 ] Naganarasimha G R commented on YARN-5545: - Thanks [~sunilg], [~wangda] & [~jlowe], for taking the discussion forward. I still have a few queries: # GlobalMaximumApplicationsPerQueue doesn't have any default set, right? If it is set, then there is no need for {{maxSystemApps * queueCapacities.getAbsoluteCapacity()}}, as that limit will never be reached. # IMO the approach captured by Sunil in his earlier [comment|https://issues.apache.org/jira/browse/YARN-5545?focusedCommentId=15494147&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15494147] does not solve the base problem completely. The problem started with {{maxSystemApps * queueCapacities.getAbsoluteCapacity()}}: which partition's absolute capacity needs to be considered when a given queue does not override max applications and the default-partition capacity of the queue is zero? So based on your approach, the only way to avoid it is to set {{GlobalMaximumApplicationsPerQueue}}, which would imply that this value is taken for all queues and the earlier approach of {{maxSystemApps * queueCapacities.getAbsoluteCapacity()}} is never considered. # I feel that {{enforce strict checking}} should be an implicit requirement, with the assumption that the admin would not have configured queue max apps to exceed system max apps. And we need not validate in the configuration that every queue's max apps stays below the system max apps; instead, while submitting an app, first validate that the system-level max apps is not violated and then that the queue-level max apps is not violated. Thoughts? > App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml > > > Configure capacity scheduler > yarn.scheduler.capacity.root.default.capacity=0 > yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50 > yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50 > Submit application as below > ./yarn jar > ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar > sleep -Dmapreduce.job.node-label-expression=labelx > -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1000 -rt 1 > {noformat} > 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging > area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001 > java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed > to submit application_1471670113386_0001 to YARN : > org.apache.hadoop.security.AccessControlException: Queue root.default already > has 0 applications, cannot accept submission of application: > application_1471670113386_0001 > at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790) > at 
org.apache.hadoop.mapreduce.Job.submit(Job.java:1341) > at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362) > at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71) > at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) > at > org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136) > at > org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:144) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at >
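For context, the per-queue limit under discussion is derived roughly as sketched below — a simplified illustration, not the exact CapacityScheduler code. When the default-partition capacity of a queue is zero, its absolute capacity is zero, so the derived limit collapses to zero applications:
{code}
// Simplified sketch of how a queue's max applications is derived when no
// per-queue override is configured (illustrative only; the real logic lives
// in LeafQueue and differs in detail).
int maxSystemApps = 10000;      // yarn.scheduler.capacity.maximum-applications
float absoluteCapacity = 0.0f;  // root.default has capacity 0 in the default partition
int maxApplications = (int) (maxSystemApps * absoluteCapacity);
// maxApplications == 0: every submission fails with "Queue root.default
// already has 0 applications, cannot accept submission of application".
{code}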
[jira] [Commented] (YARN-3140) Improve locks in AbstractCSQueue/LeafQueue/ParentQueue
[ https://issues.apache.org/jira/browse/YARN-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495424#comment-15495424 ] Jian He commented on YARN-3140: --- lgtm, committing in a day if no comments from others. > Improve locks in AbstractCSQueue/LeafQueue/ParentQueue > -- > > Key: YARN-3140 > URL: https://issues.apache.org/jira/browse/YARN-3140 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3140.1.patch, YARN-3140.2.patch, YARN-3140.3.patch > > > Enhance locks in AbstractCSQueue/LeafQueue/ParentQueue. As mentioned in > YARN-3091, a possible solution is using read/write locks. Other fine-grained > locks for specific purposes / bugs should be addressed in separate tickets.
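As a rough illustration of the read/write-lock direction mentioned in the description — a sketch with illustrative names, not the actual YARN-3140 patch — the idea is to replace coarse {{synchronized}} methods so that read-mostly operations no longer serialize behind each other:
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the read/write-lock pattern discussed for AbstractCSQueue;
// field and method names here are illustrative.
class QueueSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private float capacity;

  float getCapacity() {
    lock.readLock().lock();   // many readers may hold the lock concurrently
    try {
      return capacity;
    } finally {
      lock.readLock().unlock();
    }
  }

  void reinitialize(float newCapacity) {
    lock.writeLock().lock();  // writers get exclusive access
    try {
      capacity = newCapacity;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}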
[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495350#comment-15495350 ] Bibin A Chundatt commented on YARN-5545: [~sunilg] {quote} However I think we do not need another config to enforce strict checking. It can be done in today's form. {quote} To keep the old behavior, we can keep the value false by default. > App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml
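A sketch of what such an opt-in property could look like in capacity-scheduler.xml. The property name below is hypothetical — the discussion has not settled on one — and it defaults to false to keep the old behavior, as suggested above:
{code}
<!-- Hypothetical property name, for illustration only. -->
<property>
  <name>yarn.scheduler.capacity.enforce-system-max-applications</name>
  <value>false</value>
  <description>If true, reject submission to any queue once the
    system-wide maximum-applications limit has been reached.</description>
</property>
{code}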
[jira] [Updated] (YARN-5635) Better handling when bad script is configured as Node's HealthScript
[ https://issues.apache.org/jira/browse/YARN-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-5635: --- Assignee: (was: Yufei Gu) > Better handling when bad script is configured as Node's HealthScript > > > Key: YARN-5635 > URL: https://issues.apache.org/jira/browse/YARN-5635 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Allen Wittenauer > > The earlier fix in YARN-5567 was reverted because it is not ideal to bring the > whole cluster down because of a bad script. At the same time, it is important > to report that the script configured as the node health script is erroneous, > as it might otherwise fail to detect bad health of a node.
[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495189#comment-15495189 ] Sunil G commented on YARN-5545: --- Thanks [~jlowe] for the valuable thoughts and suggestions. Thanks [~leftnoteasy]. It makes sense to me. [~bibinchundatt], However I think we do not need another config to enforce strict checking. It can be done in today's form. I will file a follow-up jira for the same. In that, we can check and reject app submission to any queue if the system-wide limit is met. Thoughts? > App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml
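A minimal sketch of the follow-up check described above — validate the system-wide limit first, then the queue-level limit, at submission time. Method and parameter names are illustrative, not the eventual patch:
{code}
import org.apache.hadoop.security.AccessControlException;

// Illustrative only: order the checks system-wide first, then per-queue.
class SubmissionLimitCheck {
  static void validate(int clusterApps, int systemMaxApps, String queueName,
      int queueApps, int queueMaxApps) throws AccessControlException {
    if (clusterApps >= systemMaxApps) {
      throw new AccessControlException("System-wide maximum applications ("
          + systemMaxApps + ") reached, cannot accept submission");
    }
    if (queueApps >= queueMaxApps) {
      throw new AccessControlException("Queue " + queueName + " already has "
          + queueApps + " applications, cannot accept submission");
    }
  }
}
{code}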
[jira] [Commented] (YARN-5655) TestContainerManagerSecurity is failing
[ https://issues.apache.org/jira/browse/YARN-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495169#comment-15495169 ] Robert Kanter commented on YARN-5655: - This only seems to be a problem in branch-2.8 and branch-2; trunk seems to be fine. However, I'm getting a different failure than you: {noformat} Running org.apache.hadoop.yarn.server.TestContainerManagerSecurity Tests run: 2, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 44.517 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.TestContainerManagerSecurity testContainerManager[0](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) Time elapsed: 23.939 sec <<< FAILURE! java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:360) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) Time elapsed: 19.823 sec <<< FAILURE! java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:360) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) {noformat} (The {{null}} is misleading; it comes from a JUnit quirk that occurs when an {{assertTrue}} or {{assertFalse}} is given no (optional) message.) > TestContainerManagerSecurity is failing > --- > > Key: YARN-5655 > URL: https://issues.apache.org/jira/browse/YARN-5655 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.0 >Reporter: Jason Lowe >Assignee: Robert Kanter > > TestContainerManagerSecurity has been failing recently in 2.8: > {noformat} > Running org.apache.hadoop.yarn.server.TestContainerManagerSecurity > Tests run: 2, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 80.928 sec > <<< FAILURE! - in org.apache.hadoop.yarn.server.TestContainerManagerSecurity > testContainerManager[0](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) > Time elapsed: 44.478 sec <<< ERROR! > java.lang.NullPointerException: null > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.waitForContainerToFinishOnNM(TestContainerManagerSecurity.java:394) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:337) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) > testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) > Time elapsed: 34.964 sec <<< FAILURE! 
> java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:333) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) > {noformat}
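To illustrate the JUnit behavior mentioned above with a generic example (unrelated to the actual test code): omitting the optional message argument makes a failed assertion surface as {{java.lang.AssertionError: null}}, while supplying it makes the failure self-describing:
{code}
import static org.junit.Assert.assertTrue;
import org.junit.Test;

public class AssertMessageExample {
  @Test
  public void withoutMessage() {
    // Reports "java.lang.AssertionError: null" on failure.
    assertTrue(2 + 2 == 5);
  }

  @Test
  public void withMessage() {
    // Reports the supplied message on failure instead.
    assertTrue("expected container to finish cleanly", 2 + 2 == 5);
  }
}
{code}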
[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495161#comment-15495161 ] Bibin A Chundatt commented on YARN-5545: {quote} Since the maximum-applications limit is mainly used to cap memory consumed by apps in the RM, I think at least in a follow-up JIRA, system-level maximum applications should be enforced. {quote} +1 for the same. Similar to cgroups, we can add a configuration for strict mode to be enabled. > App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495060#comment-15495060 ] Eric Payne commented on YARN-4945: -- [~sunilg], I noticed in the resourcemanager log that the metrics were not as I would expect after running applications. For example, after 1 application has completed running, the {{#queue-active-applications}} metric remains 1 instead of 0:
{code}
2016-09-16 01:11:10,189 [SchedulerEventDispatcher:Event Processor] INFO capacity.LeafQueue: Application removed - appId: application_1473988192446_0001 user: hadoop1 queue: glamdring #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 1
{code}
After 3 applications have run, the metrics are even more unexpected:
{code}
2016-09-16 01:12:34,622 [SchedulerEventDispatcher:Event Processor] INFO capacity.LeafQueue: Application removed - appId: application_1473988192446_0003 user: hadoop1 queue: glamdring #user-pending-applications: -4 #user-active-applications: 4 #queue-pending-applications: 0 #queue-active-applications: 3
{code}
I believe the cause of this is in {{LeafQueue#getAllApplications}}:
{code}
public Collection<FiCaSchedulerApp> getAllApplications() {
  Collection<FiCaSchedulerApp> apps =
      pendingOrderingPolicy.getSchedulableEntities();
  apps.addAll(orderingPolicy.getSchedulableEntities());
  return Collections.unmodifiableCollection(apps);
}
{code}
The call to {{pendingOrderingPolicy.getSchedulableEntities()}} returns the {{AbstractComparatorOrderingPolicy#schedulableEntities}} object, and then the call to {{apps.addAll(orderingPolicy.getSchedulableEntities())}} adds additional {{FiCaSchedulerApp}}s to {{schedulableEntities}}. By creating a copy of the return value of {{pendingOrderingPolicy.getSchedulableEntities()}}, I have been able to verify that {{schedulableEntities}} does not have extra entries. For example:
{code}
public Collection<FiCaSchedulerApp> getAllApplications() {
  Collection<FiCaSchedulerApp> apps = new TreeSet<FiCaSchedulerApp>(
      pendingOrderingPolicy.getSchedulableEntities());
  apps.addAll(orderingPolicy.getSchedulableEntities());
  return Collections.unmodifiableCollection(apps);
}
{code}
> [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch, YARN-2009.v2.patch > > > This is umbrella ticket to track efforts of preemption within a queue to > support features like: > YARN-2009. YARN-2113. YARN-4781.
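The aliasing problem Eric describes is easy to reproduce in isolation — if a getter returns its internal collection, a subsequent {{addAll}} mutates the owner's state. A generic, runnable illustration (not YARN code):
{code}
import java.util.ArrayList;
import java.util.List;

public class AliasingDemo {
  private final List<String> pending = new ArrayList<>(List.of("app1"));

  // Returns the internal list directly -- the same mistake as returning
  // schedulableEntities from the ordering policy.
  List<String> getPendingLeaky() {
    return pending;
  }

  public static void main(String[] args) {
    AliasingDemo demo = new AliasingDemo();
    List<String> apps = demo.getPendingLeaky();
    apps.addAll(List.of("app2", "app3"));  // silently grows demo.pending
    System.out.println(demo.pending);      // [app1, app2, app3] -- corrupted

    // Defensive copy, as in the suggested fix above:
    List<String> safe = new ArrayList<>(demo.getPendingLeaky());
    safe.add("app4");                      // demo.pending is unchanged
  }
}
{code}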
[jira] [Commented] (YARN-5642) Typos in 9 log messages
[ https://issues.apache.org/jira/browse/YARN-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495042#comment-15495042 ] Mehran Hassani commented on YARN-5642: -- Does this mean my patch has conflicts with trunk? > Typos in 9 log messages > > > Key: YARN-5642 > URL: https://issues.apache.org/jira/browse/YARN-5642 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Mehran Hassani >Priority: Trivial > Labels: newbie > Attachments: YARN-5642.001.patch > > > I am conducting research on log related bugs. I tried to make a tool to fix > repetitive yet simple patterns of bugs that are related to logs. Typos in log > messages are one of the recurring bugs. Therefore, I made a tool to find typos > in log statements. During my experiments, I managed to find the following > typos in Hadoop YARN: > In file > /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java, > LOG.info("AsyncDispatcher is draining to stop igonring any new events."), > igonring should be ignoring > In file > /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/YarnAuthorizationProvider.java, > LOG.info(authorizerClass.getName() + " is instiantiated."), > instiantiated should be instantiated > In file > /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/FileSystemApplicationHistoryStore.java, > LOG.info("Completed reading history information of all conatiners"+ " of > application attempt " + appAttemptId), > conatiners should be containers > In file > /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java, > LOG.info("Neither virutal-memory nor physical-memory monitoring is " > +"needed. 
Not running the monitor-thread"), > virutal should be virtual > In file > /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java, > LOG.info("Intialized plan {} based on reservable queue {}" plan.toString() > planQueueName), > Intialized should be Initialized > In file > /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java, > LOG.info("Initializing " + queueName + "\n" +"capacity = " + > queueCapacities.getCapacity() +" [= (float) configuredCapacity / 100 ]" + > "\n" +"asboluteCapacity = " + queueCapacities.getAbsoluteCapacity() +" [= > parentAbsoluteCapacity * capacity ]" + "\n" +"maxCapacity = " + > queueCapacities.getMaximumCapacity() +" [= configuredMaxCapacity ]" + "\n" > +"absoluteMaxCapacity = " + queueCapacities.getAbsoluteMaximumCapacity() +" > [= 1.0 maximumCapacity undefined " +"(parentAbsoluteMaxCapacity * > maximumCapacity) / 100 otherwise ]" +"\n" +"userLimit = " + userLimit +" [= > configuredUserLimit ]" + "\n" +"userLimitFactor = " + userLimitFactor +" [= > configuredUserLimitFactor ]" + "\n" +"maxApplications = " + maxApplications > +" [= configuredMaximumSystemApplicationsPerQueue or" +" > (int)(configuredMaximumSystemApplications * absoluteCapacity)]" +"\n" > +"maxApplicationsPerUser = " + maxApplicationsPerUser +" [= > (int)(maxApplications * (userLimit / 100.0f) * " +"userLimitFactor) ]" + "\n" > +"usedCapacity = " + queueCapacities.getUsedCapacity() +" [= > usedResourcesMemory / " +"(clusterResourceMemory * absoluteCapacity)]" + "\n" > +"absoluteUsedCapacity = " + absoluteUsedCapacity +" [= usedResourcesMemory / > clusterResourceMemory]" + "\n" +"maxAMResourcePerQueuePercent = " + > maxAMResourcePerQueuePercent +" [= configuredMaximumAMResourcePercent ]" + > "\n" +"minimumAllocationFactor = " + minimumAllocationFactor +" [= > (float)(maximumAllocationMemory - minimumAllocationMemory) / " > +"maximumAllocationMemory ]" + "\n" +"maximumAllocation = " + > maximumAllocation +" [= configuredMaxAllocation ]" + "\n" +"numContainers = " > + numContainers +" [= currentNumContainers ]" + "\n" +"state = " + state +" > [= configuredState ]" + "\n" +"acls = " + aclsString +" [= configuredAcls ]" > + "\n" +"nodeLocalityDelay = " + nodeLocalityDelay + "\n" +"labels=" + > labelStrBuilder.toString() + "\n" +"reservationsContinueLooking = " > +reservationsContinueLooking + "\n" +"preemptionDisabled = " + > getPreemptionDisabled() + "\n" +"defaultAppPriorityPerQueue = " + > defaultAppPriorityPerQueue), > asbolute should be absolute > In file >
[jira] [Commented] (YARN-5642) Typos in 9 log messages
[ https://issues.apache.org/jira/browse/YARN-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495027#comment-15495027 ] Hadoop QA commented on YARN-5642: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} | {color:red} YARN-5642 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828778/YARN-5642.001.patch | | JIRA Issue | YARN-5642 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13119/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Typos in 9 log messages > > > Key: YARN-5642 > URL: https://issues.apache.org/jira/browse/YARN-5642 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Mehran Hassani >Priority: Trivial > Labels: newbie > Attachments: YARN-5642.001.patch
[jira] [Updated] (YARN-5642) Typos in 9 log messages
[ https://issues.apache.org/jira/browse/YARN-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mehran Hassani updated YARN-5642: - Attachment: YARN-5642.001.patch > Typos in 9 log messages > > > Key: YARN-5642 > URL: https://issues.apache.org/jira/browse/YARN-5642 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Mehran Hassani >Priority: Trivial > Labels: newbie > Attachments: YARN-5642.001.patch
[jira] [Commented] (YARN-5145) [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15494957#comment-15494957 ] Kai Sasaki commented on YARN-5145: -- [~sunilg] I see. Sorry for my misunderstanding; I missed the documentation. I'll update it to use {{configs.env}} under {{HADOOP_CONF_DIR}} as described initially. Thank you so much for the clear explanation! > [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR > - > > Key: YARN-5145 > URL: https://issues.apache.org/jira/browse/YARN-5145 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Kai Sasaki > Attachments: YARN-5145-YARN-3368.01.patch > > > The existing YARN UI configuration is under the Hadoop package's directory: > $HADOOP_PREFIX/share/hadoop/yarn/webapps/; we should move it to > $HADOOP_CONF_DIR like other configurations.
[jira] [Updated] (YARN-5642) Typos in 9 log messages
[ https://issues.apache.org/jira/browse/YARN-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mehran Hassani updated YARN-5642: - Summary: Typos in 9 log messages (was: Typos in 11 log messages ) > Typos in 9 log messages > > > Key: YARN-5642 > URL: https://issues.apache.org/jira/browse/YARN-5642 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Mehran Hassani >Priority: Trivial > Labels: newbie
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15494825#comment-15494825 ] Wangda Tan commented on YARN-4945: -- 1) YarnConfiguration: - Instead of having a separate SELECT_CANDIDATES_FOR_INTRAQUEUE_PREEMPTION, should we only have a "queue.intra-queue-preemption-enabled"? I cannot clearly see what it means semantically. One example: after we have user-limit preemption support, what happens if we only enable user-limit preemption (without priority preemption enabled)? 2) PCPP: - Unused imports / methods - getPartitionResource: avoid cloning resources? Because we will clone the resource twice for every app. If you are concerned about consistency, you can clone it once before starting the preemption calculation. - It seems to me partitionToUnderServedQueues can be kept in AbstractPreemptableResourceCalculator. In addition, Map<String, LinkedHashSet<TempQueuePerPartition>> could be Map<String, Set<TempQueuePerPartition>>. (LinkedHashSet is not necessarily needed, because we won't have two TempQueuePerPartition with the same queueName and same partition.) 3) CapacitySchedulerPreemptionUtils: - deductPreemptableResourcePerApp: is the following a valid comment? bq. // High priority app is coming first - Remove the unnecessary param in the method and the new generic type (like new HashMap<>(...)); better to move to IntelliJ? :p - {{getResToObtainByPartitionForApps}} can be removed; we can directly use policy.getResourceDemandFromAppsPerQueue 4) FiCaSchedulerApp: Move getTotalPendingRequestsPerPartition to ResourceUsage? I can see we could have requirements for getUsedResourceByPartition, getReservedResourceByPartition, etc. in the future. 5) PreemptionCandidatesSelector: - All non-abstract methods can be static, correct? - All TODOs in comments are done, correct? 6) IntraQueuePreemptionPolicy and PriorityIntraQueuePreemptionPolicy: - Overall: do you think the "-Policy" name is too big? What it essentially does is compute how much resource to preempt from each app; how about calling it something like IntraQueuePreemptionComputePlugin? Would like to hear thoughts from you and Eric on this as well. - Rename PriorityIntraQueuePreemptionPolicy to FifoIntraQueuePreemptionPolicy if you agree with [my comment|https://issues.apache.org/jira/browse/YARN-4945?focusedCommentId=15494454&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15494454] - PriorityIntraQueuePreemptionPolicy#getResourceDemandFromAppsPerQueue: a. resToObtainByPartition can be removed from the parameters b. IIUC, it gets resourceToObtain for each app instead of resourceDemand for each app; rename it properly? c. This logic is not correct:
{code}
// If demand is from same priority level, skip the same.
if (!tq.isPendingDemandForHigherPriorityApps(a1.getPriority())) {
  continue;
}
{code}
It can only prevent the highest-priority applications in a queue from preempting each other; it cannot prevent the 2nd-highest-priority applications from preempting each other. And the performance can be improved as well; I believe in some settings maxAppPriority can be as much as MAX_INT. Please look at the comments/pseudocode below for details. - computeAppsIdealAllocation: a. Calling getUserLimitHeadRoomPerApp is too expensive; instead we can add one method in LeafQueue to get the UserLimit by userName. With a Map of username to headroom inside the method, we can compute the user limit at most once per user. And this logic can be reused to compute user-limit preemption. b. {{tq.addPendinResourcePerPriority(tmpApp.getPriority(), tmpApp.pending);}} could be changed if you agree with the above. c.
I think we should move the {{skip the same priority demand}} logic into this method. One approach in my mind is:
{code}
// General idea:
// Use two pointers, one from the most prioritized app, one from the least prioritized app.
// Each app has two quotas: one is how much resource it requires (ideal - used),
// the other is how much resource can be preempted from it.
// Move the two pointers and update the two quotas to answer:
// for application X, is there any app with higher priority that needs the resource?
p1 = most-prioritized-app.iterator
p2 = least-prioritized-app.iterator

// For each app, we have:
// - "toPreemptFromOther", which is initialized to (ideal - (used - selected)).
// - "actuallyToBePreempted", initialized to 0.
while (p1.getPriority() > p2.getPriority() && p1 != p2) {
  Resource rest = p2.toBePreempt - p2.actuallyToBePreempted;
  if (rest > 0) {
    if (p1.toBePreemptFromOther > 0) {
      Resource toPreempt = min(p1.toBePreemptFromOther, rest);
      p1.toBePreemptFromOther -= toPreempt
      p2.actuallyToBePreempted += toPreempt
    }
  }
  if (p2.toBePreempt - p2.actuallyToBePreempted == 0) {
    // Nothing more can be preempted from p2, move to the next
    p2--;
  }
  if (p1.toBePreemptFromOther == 0) {
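To make the two-pointer idea above concrete, here is a runnable simplification — plain ints instead of {{Resource}}, illustrative names, and not the actual YARN-2009 patch:
{code}
import java.util.ArrayList;
import java.util.List;

public class TwoPointerPreemption {
  static class App {
    final String name;
    final int priority;
    int toPreemptFromOther;     // resource this app still needs from others
    int toBePreempted;          // resource that may be taken from this app
    int actuallyToBePreempted;  // resource selected for preemption so far

    App(String name, int priority, int need, int preemptable) {
      this.name = name;
      this.priority = priority;
      this.toPreemptFromOther = need;
      this.toBePreempted = preemptable;
    }
  }

  public static void main(String[] args) {
    // Apps sorted by priority, highest first.
    List<App> apps = new ArrayList<>(List.of(
        new App("high", 10, 5, 0),
        new App("mid", 5, 2, 3),
        new App("low", 1, 0, 6)));
    int i = 0, j = apps.size() - 1;
    while (i < j && apps.get(i).priority > apps.get(j).priority) {
      App p1 = apps.get(i), p2 = apps.get(j);
      int rest = p2.toBePreempted - p2.actuallyToBePreempted;
      if (rest > 0 && p1.toPreemptFromOther > 0) {
        int take = Math.min(p1.toPreemptFromOther, rest);
        p1.toPreemptFromOther -= take;
        p2.actuallyToBePreempted += take;
      }
      if (p2.toBePreempted - p2.actuallyToBePreempted == 0) {
        j--;  // nothing more can be preempted from p2
      }
      if (p1.toPreemptFromOther == 0) {
        i++;  // p1's demand is satisfied, move to the next app
      }
    }
    for (App a : apps) {
      System.out.println(a.name + ": preempt " + a.actuallyToBePreempted);
    }
  }
}
{code}
Running this prints {{low: preempt 6}} (5 toward "high", 1 toward "mid") with nothing taken from the higher-priority apps, and higher-priority demand is never satisfied by preempting same- or higher-priority apps.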
[jira] [Updated] (YARN-5638) Introduce a collector timestamp to uniquely identify collectors creation order in collector discovery
[ https://issues.apache.org/jira/browse/YARN-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-5638: Attachment: YARN-5638-trunk.v1.patch First version of the patch to demonstrate the idea. Right now I've split the collector discovery process into two steps: in the first step, the collector manager reports the collector to the NM, and the NM sends the collector data to the RM for registration. In the second step, the RM (synchronously) assigns the collector a timestamp (the RM's timestamp and a version number) and stores it in memory. The RM then updates known collector data via heartbeats to all NMs as before. The only difference is that the RM attaches timestamp information for each collector sent to the NMs, so that once there is a rebuild process, the NMs can report this information. > Introduce a collector timestamp to uniquely identify collectors creation > order in collector discovery > - > > Key: YARN-5638 > URL: https://issues.apache.org/jira/browse/YARN-5638 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > Attachments: YARN-5638-trunk.v1.patch > > > As discussed in YARN-3359, we need to further identify timeline collectors' > creation order to rebuild collector discovery data in the RM. This JIRA > proposes to use an <RM timestamp, version number> pair to order collectors > for each application in the RM. This timestamp can then be used when a > standby RM becomes active and rebuilds collector discovery data.
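A minimal sketch of the two-step flow described above, with hypothetical class and method names (the real changes are in YARN-5638-trunk.v1.patch):
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: an RM-side registry that stamps each reported
// collector with <RM timestamp, version> before redistributing it.
class RmCollectorRegistry {
  private final long rmTimestamp = System.currentTimeMillis();
  private final AtomicLong version = new AtomicLong();
  private final Map<String, String> knownCollectors = new ConcurrentHashMap<>();

  // Step 1: an NM reports a collector for an application; the RM
  // synchronously assigns the stamp and stores it in memory.
  String register(String appId, String collectorAddr) {
    String stamped =
        collectorAddr + "@" + rmTimestamp + ":" + version.incrementAndGet();
    knownCollectors.put(appId, stamped);
    // Step 2: the stamped entry is then pushed to all NMs on heartbeats.
    return stamped;
  }
}
{code}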
[jira] [Commented] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15494699#comment-15494699 ] Hadoop QA commented on YARN-3141: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 15 new + 48 unchanged - 23 fixed = 63 total (was 71) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 37m 55s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 54m 1s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828731/YARN-3141.4.patch | | JIRA Issue | YARN-3141 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 4b1c39304449 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fcbac00 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13118/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/13118/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/13118/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results |
[jira] [Commented] (YARN-5638) Introduce a collector timestamp to uniquely identify collectors creation order in collector discovery
[ https://issues.apache.org/jira/browse/YARN-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494688#comment-15494688 ] Li Lu commented on YARN-5638: - Updated the description of this JIRA. What we need here is not a new type of "collector id", but to store timestamp data in the RMs and NMs for the collectors. This can address the problem of rebuilding collector status for a new active RM, as discussed in YARN-3359: bq. when one application has two different attempts running (due to some network problems, for example) and the RM is trying to rebuild collector status, the RM needs to know which collector is for the latest app attempt and which one is for the stale attempt. We do not necessarily need to associate collectors with application attempts. Actually, according to the timeline server v2 design, we should only associate app collectors with applications. However, when maintaining collector data in RMs and NMs, we can store the timestamp of each collector. In this way, when the RM needs to rebuild collector status, it can gather all known collector data from the NMs, use the timestamp to decide the most recent state of the collectors, and then rebuild all states. > Introduce a collector timestamp to uniquely identify collectors creation > order in collector discovery > - > > Key: YARN-5638 > URL: https://issues.apache.org/jira/browse/YARN-5638 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > > As discussed in YARN-3359, we need to further identify timeline collectors' > creation order to rebuild collector discovery data in the RM. This JIRA > proposes to use a collector timestamp to order collectors > for each application in the RM. This timestamp can then be used when a > standby RM becomes active and rebuilds collector discovery data. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
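For illustration, the rebuild step described above might look roughly like the following. This is a sketch of the idea, not code from a patch: {{CollectorReport}} is a hypothetical holder of (appId, address, timestamp), and only the timestamp comparison matters here.

{code}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ApplicationId;

// Sketch: on failover, keep only the most recently created collector
// reported for each application, using the timestamp to decide.
class CollectorRebuild {
  Map<ApplicationId, CollectorReport> rebuild(
      Iterable<CollectorReport> reportsFromAllNMs) {
    Map<ApplicationId, CollectorReport> latest = new HashMap<>();
    for (CollectorReport r : reportsFromAllNMs) {
      latest.merge(r.getAppId(), r,
          (oldR, newR) ->
              newR.getTimestamp() > oldR.getTimestamp() ? newR : oldR);
    }
    return latest;
  }
}
{code}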
[jira] [Updated] (YARN-5638) Introduce a collector timestamp to uniquely identify collectors creation order in collector discovery
[ https://issues.apache.org/jira/browse/YARN-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-5638: Description: As discussed in YARN-3359, we need to further identify timeline collectors' creation order to rebuild collector discovery data in the RM. This JIRA proposes to use a collector timestamp to order collectors for each application in the RM. This timestamp can then be used when a standby RM becomes active and rebuilds collector discovery data. (was: As discussed in YARN-3359, we need to further identify timeline collectors and their creation order for better service discovery and resource isolation. This JIRA proposes to use a collector ID to accurately identify each timeline collector.) > Introduce a collector timestamp to uniquely identify collectors creation > order in collector discovery > - > > Key: YARN-5638 > URL: https://issues.apache.org/jira/browse/YARN-5638 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > > As discussed in YARN-3359, we need to further identify timeline collectors' > creation order to rebuild collector discovery data in the RM. This JIRA > proposes to use a collector timestamp to order collectors > for each application in the RM. This timestamp can then be used when a > standby RM becomes active and rebuilds collector discovery data. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5638) Introduce a collector timestamp to uniquely identify collectors creation order in collector discovery
[ https://issues.apache.org/jira/browse/YARN-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-5638: Summary: Introduce a collector timestamp to uniquely identify collectors creation order in collector discovery (was: Introduce a collector Id to uniquely identify collectors and their creation order) > Introduce a collector timestamp to uniquely identify collectors creation > order in collector discovery > - > > Key: YARN-5638 > URL: https://issues.apache.org/jira/browse/YARN-5638 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > > As discussed in YARN-3359, we need to further identify timeline collectors > and their creation order for better service discovery and resource isolation. > This JIRA proposes to use a collector ID to accurately identify > each timeline collector. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5336) Put in some limit for accepting key-values in hbase writer
[ https://issues.apache.org/jira/browse/YARN-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vrushali C updated YARN-5336: - Assignee: Haibo Chen (was: Vrushali C) > Put in some limit for accepting key-values in hbase writer > -- > > Key: YARN-5336 > URL: https://issues.apache.org/jira/browse/YARN-5336 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Vrushali C >Assignee: Haibo Chen > Labels: YARN-5355 > > As recommended by [~jrottinghuis], we need to add a limit (default and > configurable) for accepting key-values to be written to the backend. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5336) Put in some limit for accepting key-values in hbase writer
[ https://issues.apache.org/jira/browse/YARN-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494615#comment-15494615 ] Vrushali C commented on YARN-5336: -- Assigning to [~haibochen] > Put in some limit for accepting key-values in hbase writer > -- > > Key: YARN-5336 > URL: https://issues.apache.org/jira/browse/YARN-5336 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Vrushali C >Assignee: Haibo Chen > Labels: YARN-5355 > > As recommended by [~jrottinghuis], we need to add a limit (default and > configurable) for accepting key-values to be written to the backend. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494592#comment-15494592 ] Wangda Tan commented on YARN-5545: -- Thanks [~jlowe], [~sunilg] for the suggestions. I generally agree with the approach at https://issues.apache.org/jira/browse/YARN-5545?focusedCommentId=15494147=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15494147. Since maximum-applications is mainly used to cap the memory consumed by apps in the RM, I think that, at least in a follow-up JIRA, system-level maximum applications should be enforced: we should not allow the number of pending + running apps to go beyond the system-level maximum. Without this, it is going to be hard to estimate how many apps are in the RM (a rough sketch of such a check follows this message). Thoughts? > App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml > > > Configure capacity scheduler > yarn.scheduler.capacity.root.default.capacity=0 > yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50 > yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50 > Submit application as below > ./yarn jar > ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar > sleep -Dmapreduce.job.node-label-expression=labelx > -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1000 -rt 1 > {noformat} > 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging > area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001 > java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed > to submit application_1471670113386_0001 to YARN : > org.apache.hadoop.security.AccessControlException: Queue root.default already > has 0 applications, cannot accept submission of application: > application_1471670113386_0001 > at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341) > at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362) > at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71) > at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) > at > org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136) > at > org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:144) > at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at org.apache.hadoop.util.RunJar.run(RunJar.java:239) > at org.apache.hadoop.util.RunJar.main(RunJar.java:153) > Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit > application_1471670113386_0001 to YARN : > org.apache.hadoop.security.AccessControlException: Queue root.default already > has 0 applications, cannot accept submission of application: > application_1471670113386_0001 > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:286) > at > org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:296) > at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:301) > ... 25 more > {noformat} -- This message was
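As a rough illustration of the system-level enforcement proposed in the comment above, a check at submission time might look like the following. The class, method, and parameter names here are hypothetical and not from any attached patch; only ApplicationId and YarnException are existing YARN types.

{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.exceptions.YarnException;

// Hypothetical sketch of a system-level check at submission time:
// reject a submission once pending + running apps reach the global cap.
class SystemMaxAppsCheck {
  void validate(int pendingApps, int runningApps, int systemMaxApps,
      ApplicationId appId) throws YarnException {
    if (pendingApps + runningApps >= systemMaxApps) {
      throw new YarnException("Cannot accept " + appId
          + ": system already has " + (pendingApps + runningApps)
          + " applications, limit is " + systemMaxApps);
    }
  }
}
{code}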
[jira] [Updated] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3141: - Attachment: YARN-3141.4.patch Attached ver.4 patch. > Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp > -- > > Key: YARN-3141 > URL: https://issues.apache.org/jira/browse/YARN-3141 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3141.1.patch, YARN-3141.2.patch, YARN-3141.3.patch, > YARN-3141.4.patch > > > Enhance locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp; > as mentioned in YARN-3091, a possible solution is using read/write locks. > Other fine-grained locks for specific purposes / bugs should be addressed in > separate tickets. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494547#comment-15494547 ] Wangda Tan commented on YARN-3141: -- Thanks [~templedf]. However, I think for most of the comments we should keep things as-is: making a variable volatile doesn't mean we don't need an *extra lock to protect consistency between variables*. For the simplest example:

{code}
volatile boolean a;
volatile int b;

void updateB(int newB) { if (a) { b = newB; } }
void updateA(boolean newA) { a = newA; }
boolean getA() { return a; }
int getB() { return b; }
{code}

If two separate threads run concurrently, with thread #1 calling updateB and thread #2 calling updateA, the write to {{a}} can land between updateB's read of {{a}} and its write to {{b}}. A reader that then looks at both variables can observe an inconsistent pair, even though each field is individually volatile. So I would rather be more conservative: if a method reads some fields and writes some fields, the safest way is to take a single write lock that protects all of them. Likewise, if a method reads multiple fields, it should hold a single read lock across them. Most of the comments fall into this category; we could not shorten those critical sections. What I have addressed in this patch: bq. SchedulerApplicationAttempt.getLiveContainersMap() should be default visibility and @VisibleForTesting bq. In FSAppAttempt.getAllowedLocalityLevel(), FSAppAttempt.getAllowedLocalityLevelByTime(), FSAppAttempt.setReservation(), and FSAppAttempt.clearReservation() the write lock acquisition can be delayed until after the arg validation bq. There's an unused import in FiCaSchedulerApp > Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp > -- > > Key: YARN-3141 > URL: https://issues.apache.org/jira/browse/YARN-3141 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3141.1.patch, YARN-3141.2.patch, YARN-3141.3.patch > > > Enhance locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp; > as mentioned in YARN-3091, a possible solution is using read/write locks. > Other fine-grained locks for specific purposes / bugs should be addressed in > separate tickets. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
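A short sketch of the conservative pattern advocated above, using the standard java.util.concurrent ReadWriteLock; the class and fields are illustrative only, not code from the patch:

{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: one lock keeps reads and writes of the two related fields
// mutually consistent, which volatile alone cannot guarantee.
class ConsistentPair {
  private boolean a;
  private int b;
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  void updateB(int newB) {
    lock.writeLock().lock();
    try {
      if (a) {       // the read of a and the write of b happen atomically
        b = newB;    // with respect to other lock holders
      }
    } finally {
      lock.writeLock().unlock();
    }
  }

  int readBIfA() {   // reads both fields under a single read lock
    lock.readLock().lock();
    try {
      return a ? b : -1;
    } finally {
      lock.readLock().unlock();
    }
  }
}
{code}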
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494454#comment-15494454 ] Wangda Tan commented on YARN-4945: -- [~eepayne], bq. I need to understand what it would mean to combine all intra-queue priority policies into one. To clarify, we may not combine *all* intra-queue policies into one, but if you look at the queue-internal policies, there are mainly two groups: 1) Fair + user-limit + priority, and 2) FIFO + user-limit + priority. User-limit and priority will always be on, while the ordering policy (Fair/FIFO) is a changeable config. So it makes sense to me to have two different policies, one for FIFO (plus priority/UL) and one for Fair (plus priority/UL). bq. If they are combined, then is it still necessary to make IntraQueuePreemptionPolicy an interface? As I mentioned above, we can have a Fair intra-queue policy. To be honest, I haven't thought of a good way that a list of policies can better solve the priority + user-limit preemption problem. Could you share some ideas about it? For example, how to better consider both in the final decision? > [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch, YARN-2009.v2.patch > > > This is an umbrella ticket to track efforts on preemption within a queue to > support features like YARN-2009, YARN-2113, and YARN-4781. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3692) Allow REST API to set a user generated message when killing an application
[ https://issues.apache.org/jira/browse/YARN-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494437#comment-15494437 ] Hadoop QA commented on YARN-3692: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 44s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 58s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 48s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 0s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 2s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 2s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 2s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 27s {color} | {color:red} root: The patch generated 1 new + 233 unchanged - 1 fixed = 234 total (was 234) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 22s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 16s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client generated 1 new + 157 unchanged - 0 fixed = 158 total (was 157) {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s {color} | {color:green} hadoop-yarn-api in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 19s {color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 37m 29s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 0s {color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 115m 17s {color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 220m 50s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828683/0005-YARN-3692.1.patch | | JIRA Issue | YARN-3692 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux 1355118415f6 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | |
[jira] [Commented] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494440#comment-15494440 ] Daniel Templeton commented on YARN-3141: Continuing with more comments on v2. Sorry, I started these before you uploaded v3. These comments are a little more speculative. I'm not 100% certain that everything I'm recommending is safe. :) * {{SchedulerApplicationAttempt.getLiveContainersMap()}} should be default visibility and {{@VisibleForTesting}} * {{SchedulerApplicationAttempt.addRMContainer()}}, {{SchedulerApplicationAttempt.removeRMContainer()}}, {{SchedulerApplicationAttempt.updateResourceRequests()}}, {{SchedulerApplicationAttempt.recoverResourceRequestsForContainer()}}, {{SchedulerApplicationAttempt.reserve()}}, and {{SchedulerApplicationAttempt.updateBlacklist()}} should have the write locks pushed down to inside the _if_ * {{SchedulerApplicationAttempt.getHeadroom()}} and {{SchedulerApplicationAttempt.getResourceLimit()}} are identical. {{SchedulerApplicationAttempt.getResourceLimit()}} is not used outside {{SchedulerApplicationAttempt}} * In {{SchedulerApplicationAttempt.resetSchedulingOpportunities()}}, is the write lock needed? * In {{SchedulerApplicationAttempt.getLiveContainers()}} is the read lock needed? * In {{SchedulerApplicationAttempt.stop()}} the {{isStopped}} update can happen before the lock * In {{SchedulerApplicationAttempt.getReservedContainers()}} the lock should only cover the _for_ loop * In {{FSAppAttempt.getAllowedLocalityLevel()}}, {{FSAppAttempt.getAllowedLocalityLevelByTime()}}, {{FSAppAttempt.setReservation()}}, and {{FSAppAttempt.clearReservation()}} the write lock acquisition can be delayed until after the arg validation * There's an unused import in {{FiCaSchedulerApp}} * In {{FiCaSchedulerApp.containerCompleted()}} the write lock acquisition can be delayed until after removing from {{liveContainers}} * In {{FiCaSchedulerApp.allocate()}} the write lock acquisition can be delayed until after the stop check, and maybe after the sanity check * In {{FiCaSchedulerApp.unreserve()}} the write lock acquisition can be delayed until after canceling the increase request * In {{FiCaSchedulerApp.markContainerForPreemption()}} the write lock acquisition can be pushed down inside the _if_ > Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp > -- > > Key: YARN-3141 > URL: https://issues.apache.org/jira/browse/YARN-3141 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3141.1.patch, YARN-3141.2.patch, YARN-3141.3.patch > > > Enhance locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp; > as mentioned in YARN-3091, a possible solution is using read/write locks. > Other fine-grained locks for specific purposes / bugs should be addressed in > separate tickets. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
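The recurring "delay the write lock until after arg validation" suggestion above amounts to the generic pattern below. This is a sketch with illustrative class and field names, not code from the patch:

{code}
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Generic pattern behind the "delay the write lock" comments: validate
// arguments before acquiring the lock, so the critical section covers
// only the actual mutation.
class AttemptSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Set<String> reservations = new HashSet<>();

  void setReservation(String reservationId) {
    if (reservationId == null || reservationId.isEmpty()) {
      // argument validation needs no lock
      throw new IllegalArgumentException("invalid reservation id");
    }
    lock.writeLock().lock(); // acquired only once we know we will mutate
    try {
      reservations.add(reservationId);
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}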
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494394#comment-15494394 ] Eric Payne commented on YARN-4945: -- bq. I would say it may not be necessary to have two separate policies to consider priority and user-limit. [~leftnoteasy], I'm not sure how I feel about that yet. I need to understand what it would mean to combine all intra-queue priority policies into one. Whatever the design, I want to make sure it is not cumbersome to solve the user-limit-percent inversion that we often see. If they are combined, then is it still necessary to make {{IntraQueuePreemptionPolicy}} an interface? Wouldn't this just be the implementation class, and then there would be no need for {{PriorityIntraQueuePreemptionPolicy}} or other derivative classes? > [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch, YARN-2009.v2.patch > > > This is an umbrella ticket to track efforts on preemption within a queue to > support features like YARN-2009, YARN-2113, and YARN-4781. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3140) Improve locks in AbstractCSQueue/LeafQueue/ParentQueue
[ https://issues.apache.org/jira/browse/YARN-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494387#comment-15494387 ] Hadoop QA commented on YARN-3140: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 8s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 46s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 49s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 24s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 41s {color} | {color:red} hadoop-yarn-project/hadoop-yarn: The patch generated 40 new + 91 unchanged - 57 fixed = 131 total (was 148) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 59s {color} | {color:red} hadoop-yarn in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 37m 59s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 91m 30s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager | | | hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor | | | hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12827903/YARN-3140.2.patch | | JIRA Issue | YARN-3140 | | Optional Tests | asflicense findbugs xml compile javac javadoc mvninstall mvnsite unit checkstyle | |
[jira] [Assigned] (YARN-5655) TestContainerManagerSecurity is failing
[ https://issues.apache.org/jira/browse/YARN-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter reassigned YARN-5655: --- Assignee: Robert Kanter Sure. I'll take a look later today. > TestContainerManagerSecurity is failing > --- > > Key: YARN-5655 > URL: https://issues.apache.org/jira/browse/YARN-5655 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.0 >Reporter: Jason Lowe >Assignee: Robert Kanter > > TestContainerManagerSecurity has been failing recently in 2.8: > {noformat} > Running org.apache.hadoop.yarn.server.TestContainerManagerSecurity > Tests run: 2, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 80.928 sec > <<< FAILURE! - in org.apache.hadoop.yarn.server.TestContainerManagerSecurity > testContainerManager[0](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) > Time elapsed: 44.478 sec <<< ERROR! > java.lang.NullPointerException: null > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.waitForContainerToFinishOnNM(TestContainerManagerSecurity.java:394) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:337) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) > testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) > Time elapsed: 34.964 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:333) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5655) TestContainerManagerSecurity is failing
[ https://issues.apache.org/jira/browse/YARN-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494375#comment-15494375 ] Jason Lowe commented on YARN-5655: -- git bisect points to this commit when it started failing in branch-2.8: {noformat} commit f9016dfec33f1d6486c03a54f0a479ed08aff34f Author: Karthik Kambatla Date: Tue Sep 6 16:23:06 2016 -0700 YARN-5566. Client-side NM graceful decom is not triggered when jobs finish. (Robert Kanter via kasha) {noformat} [~rkanter] [~kasha] could you take a look? > TestContainerManagerSecurity is failing > --- > > Key: YARN-5655 > URL: https://issues.apache.org/jira/browse/YARN-5655 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.0 >Reporter: Jason Lowe > > TestContainerManagerSecurity has been failing recently in 2.8: > {noformat} > Running org.apache.hadoop.yarn.server.TestContainerManagerSecurity > Tests run: 2, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 80.928 sec > <<< FAILURE! - in org.apache.hadoop.yarn.server.TestContainerManagerSecurity > testContainerManager[0](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) > Time elapsed: 44.478 sec <<< ERROR! > java.lang.NullPointerException: null > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.waitForContainerToFinishOnNM(TestContainerManagerSecurity.java:394) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:337) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) > testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) > Time elapsed: 34.964 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:333) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5655) TestContainerManagerSecurity is failing
Jason Lowe created YARN-5655: Summary: TestContainerManagerSecurity is failing Key: YARN-5655 URL: https://issues.apache.org/jira/browse/YARN-5655 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.8.0 Reporter: Jason Lowe TestContainerManagerSecurity has been failing recently in 2.8: {noformat} Running org.apache.hadoop.yarn.server.TestContainerManagerSecurity Tests run: 2, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 80.928 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.TestContainerManagerSecurity testContainerManager[0](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) Time elapsed: 44.478 sec <<< ERROR! java.lang.NullPointerException: null at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.waitForContainerToFinishOnNM(TestContainerManagerSecurity.java:394) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:337) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) Time elapsed: 34.964 sec <<< FAILURE! java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:333) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494307#comment-15494307 ] Hadoop QA commented on YARN-3141: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 19s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 16 new + 49 unchanged - 23 fixed = 65 total (was 72) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 33m 45s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 51m 1s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828699/YARN-3141.3.patch | | JIRA Issue | YARN-3141 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux af9636090cb8 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fcbac00 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13117/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13117/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13117/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Improve locks in
[jira] [Comment Edited] (YARN-5585) [Atsv2] Add a new filter fromId in REST endpoints
[ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494251#comment-15494251 ] Varun Saxena edited comment on YARN-5585 at 9/15/16 7:08 PM: - Just to summarise the suggestions given for folks to refer to. * Applications (like Tez) would know best how to interpret their entity IDs and how they can be sorted in descending order. Most entity IDs seem to have some sort of monotonically increasing sequence like app ID. We can hence open up a PUBLIC interface which ATSv2 users like Tez can implement to decide how to encode and decode a particular entity type so that it is stored in descending sorted fashion (based on creation time) in ATSv2. Encoding and decoding similar to AppIDConverter written in our code. Because if row keys themselves can be sorted, this will be performance-wise the best possible solution. Refer to [comment | https://issues.apache.org/jira/browse/YARN-5585?focusedCommentId=15470803=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15470803] ** _Pros of the approach:_ **# Lookup will be fast. ** _Cons of the approach:_ **# We are depending on the application to provide some code for this to work. The corresponding JAR will have to be placed on the classpath. Folks in other projects may not be pleased to not have inbuilt support for this in ATS. **# Entity IDs may not always have a monotonically increasing sequence like App IDs. * We can keep another table, say EntityCreationTable or EntityIndexTable, with row key as {{cluster!user!flow!flowrun!app!entitytype!reverse entity creation time!entityid}}. We will make an entry into this table whenever created time is reported for the entity. The real data would still reside in the main entity table. Entities in this table will be sorted in descending order. On the read side, we can first peek into this table to get relevant records in descending fashion (based on limit and/or fromId) and then use this info to query the entity table. We can do this in two ways. We can get created times from querying this index table and apply a filter of created time range. Or alternatively we can try out MultiRowRangeFilter. That, from the HBase javadoc, seems to be efficient. We will have to do some processing to determine these multiple row key ranges. Refer to [comment | https://issues.apache.org/jira/browse/YARN-5585?focusedCommentId=15472669=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15472669] ** _Note:_ The client should not send different created times for the same entity, otherwise that will lead to an additional row. If different created times are reported more than once, we will have to consider the latest one. ** _Pros of the approach:_ **# Solution provided within ATS. **# Extra write only when created time is reported. ** _Cons of the approach:_ **# Extra peek into the index table on the read side. Single entity read can still be served directly from the entity table though. * Another option would be to change the row key of the entity table to {{cluster!user!flow!flowrun!app!entitytype!reverse entity creation time!entityid}} and have another table to map {{cluster!user!flow!flowrun!app!entitytype!entityid}} to entity created time. So for a single entity call (HBase Get) we will have to first peek into the new table and then get records from the entity table. ** _Cons of the approach:_ **# On the write side, we will have to first look up the index table which has the entity created time, or the client should supply the entity created time on every write. 
First would impact write performance and latter may not be feasible for client to send. **# What should be the row key if client does not supply created time on first write but supplies the created time on a subsequent write. cc [~sjlee0], [~vrushalic], [~rohithsharma], [~gtCarrera9] was (Author: varun_saxena): Just to summarise the suggestions given for folks to refer to. * Applications (like Tez) would know best how to interpret their entity IDs' and how they can be descendingly sorted. Most entity IDs' seem to have some sort of monotonically increasing sequence like app ID. We can hence open up a PUBLIC interface which ATSv2 users like Tez can implement to decide how to encode and decode a particular entity type so that it is stored in descending sorted fashion (based on creation time) in ATSv2. Encoding and decoding similar to AppIDConverter written in our code.Because if row keys themselves can be sorted, this will be performance wise the best possible solution. Refer to [comment | https://issues.apache.org/jira/browse/YARN-5585?focusedCommentId=15470803=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15470803] ** _Pros of the approach:_ **# Lookup will be fast. ** _Cons of the approach:_ **# We are depending on application to provide some code for this to work. Corresponding JAR
[jira] [Commented] (YARN-5585) [Atsv2] Add a new filter fromId in REST endpoints
[ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494251#comment-15494251 ] Varun Saxena commented on YARN-5585: Just to summarise the suggestions given for folks to refer to. * Applications (like Tez) would know best how to interpret their entity IDs and how they can be sorted in descending order. Most entity IDs seem to have some sort of monotonically increasing sequence like app ID. We can hence open up a PUBLIC interface which ATSv2 users like Tez can implement to decide how to encode and decode a particular entity type so that it is stored in descending sorted fashion (based on creation time) in ATSv2. Encoding and decoding similar to AppIDConverter written in our code. Because if row keys themselves can be sorted, this will be performance-wise the best possible solution. Refer to [comment | https://issues.apache.org/jira/browse/YARN-5585?focusedCommentId=15470803=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15470803] ** _Pros of the approach:_ **# Lookup will be fast. ** _Cons of the approach:_ **# We are depending on the application to provide some code for this to work. The corresponding JAR will have to be placed on the classpath. Folks in other projects may not be pleased to not have inbuilt support for this in ATS. **# Entity IDs may not always have a monotonically increasing sequence like App IDs. * We can keep another table, say EntityCreationTable or EntityIndexTable, with row key as {{cluster!user!flow!flowrun!app!entitytype!reverse entity creation time!entityid}}. We will make an entry into this table whenever created time is reported for the entity. The real data would still reside in the main entity table. Entities in this table will be sorted in descending order. On the read side, we can first peek into this table to get relevant records in descending fashion (based on limit and/or fromId) and then use this info to query the entity table. We can do this in two ways. We can get created times from querying this index table and apply a filter of created time range. Or alternatively we can try out MultiRowRangeFilter. That, from the HBase javadoc, seems to be efficient. We will have to do some processing to determine these multiple row key ranges. Refer to [comment | https://issues.apache.org/jira/browse/YARN-5585?focusedCommentId=15472669=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15472669] ** _Note:_ The client should not send different created times for the same entity, otherwise that will lead to an additional row. If different created times are reported more than once, we will have to consider the latest one. ** _Pros of the approach:_ **# Solution provided within ATS. **# Extra write only when created time is reported. ** _Cons of the approach:_ **# Extra peek into the index table on the read side. Single entity read can still be served directly from the entity table though. * Another option would be to change the row key of the entity table to cluster!user!flow!flowrun!app!entitytype!reverse entity creation time!entityid and have another table to map cluster!user!flow!flowrun!app!entitytype!entityid to entity created time. So for a single entity call (HBase Get) we will have to first peek into the new table and then get records from the entity table. ** _Cons of the approach:_ **# On the write side, we will have to first look up the index table which has the entity created time, or the client should supply the entity created time on every write. 
The first would impact write performance and the latter may not be feasible for the client to send. **# What should the row key be if the client does not supply the created time on the first write but supplies it on a subsequent write? cc [~sjlee0], [~vrushalic], [~rohithsharma], [~gtCarrera9] > [Atsv2] Add a new filter fromId in REST endpoints > - > > Key: YARN-5585 > URL: https://issues.apache.org/jira/browse/YARN-5585 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelinereader >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S >Priority: Critical > Attachments: YARN-5585.v0.patch > > > The TimelineReader REST APIs provide a lot of filters to retrieve > applications. Along with those, it would be good to add a new filter, i.e. fromId, > so that entities can be retrieved after the fromId. > Current behavior: the default limit is set to 100. If there are 1000 entities, > then the REST call gives the first/last 100 entities. How to retrieve the next set of 100 > entities, i.e. 101 to 200 OR 900 to 801? > Example: if applications are stored in the database as app-1, app-2 ... app-10, > *getApps?limit=5* gives app-1 to app-5. But to retrieve the next 5 apps, there is > no way to achieve this. > So the proposal is to have fromId in the filter like >
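To make the "reverse entity creation time" row-key component concrete: a common HBase idiom is to store {{Long.MAX_VALUE - timestamp}}, so that lexicographic row-key order equals descending creation order. A minimal sketch follows; Bytes is the standard org.apache.hadoop.hbase.util.Bytes helper, and the surrounding key layout is simplified here, not taken from any patch.

{code}
import org.apache.hadoop.hbase.util.Bytes;

// Sketch: encode the creation time so that newer entities sort first.
// The rest of the row key (cluster!user!flow!flowrun!app!entitytype!...)
// is elided for brevity.
class RowKeySketch {
  static byte[] reverseTimestampPart(long createdTime) {
    return Bytes.toBytes(Long.MAX_VALUE - createdTime);
  }
}
{code}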
[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494248#comment-15494248 ] Jason Lowe commented on YARN-5545: -- Yes, that's essentially the idea. Users can work around the initially reported issue today by setting a queue-specific max-apps value. All the new global queue max-apps setting does is let users easily specify a default max-apps value for all queues that don't have a specific setting, rather than manually setting it on each queue. > App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
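To make the workaround and the proposed knob concrete, here is a minimal sketch in capacity-scheduler property form. The per-queue maximum-applications property is the existing CapacityScheduler one; the global property name below is only illustrative of what this jira proposes, not a settled name:
{noformat}
# Existing per-queue override: sidesteps deriving max apps from a zero
# absolute capacity on the default partition
yarn.scheduler.capacity.root.default.maximum-applications=1000

# Proposed: a default max-apps for every queue without a per-queue override
# (illustrative name; the final property is whatever the patch introduces)
yarn.scheduler.capacity.global-queue-max-application=1000
{noformat}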
[jira] [Resolved] (YARN-5640) Issue while accessing resource manager webapp rest service
[ https://issues.apache.org/jira/browse/YARN-5640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Loknath Priyatham Teja Singamsetty resolved YARN-5640. --- Resolution: Not A Bug > Issue while accessing resource manager webapp rest service > -- > > Key: YARN-5640 > URL: https://issues.apache.org/jira/browse/YARN-5640 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.5.1 >Reporter: Loknath Priyatham Teja Singamsetty > > I am running an E2E test in Phoenix which starts the mini mapreduce cluster using > MapreduceTestingShim.java from HBaseTestingUtility of the hbase codebase and > makes a rest call to get all the submitted yarn map reduce jobs > (http://localhost:63996/ws/v1/cluster/apps?states=NEW,ACCEPTED,SUBMITTED,RUNNING), > which is failing with the following stack trace. Tried debugging but > couldn't get to the bottom of the issue. > Also, I couldn't find the DelegationTokenAuthenticationHandler in the source > code. Any clue where to find the same? > Setup: > Phoenix - 4.8.0 > HBase - 1.20 > Hadoop - 2.5.1 > Stack Trace: > HTTP ERROR 500 > Problem accessing /ws/v1/cluster/apps. Reason: > > org.apache.http.client.utils.URLEncodedUtils.parse(Ljava/lang/String;Ljava/nio/charset/Charset;)Ljava/util/List; > Caused by: > java.lang.NoSuchMethodError: > org.apache.http.client.utils.URLEncodedUtils.parse(Ljava/lang/String;Ljava/nio/charset/Charset;)Ljava/util/List; > at > org.apache.hadoop.security.token.delegation.web.ServletUtils.getParameter(ServletUtils.java:48) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.managementOperation(DelegationTokenAuthenticationHandler.java:171) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:514) > at > org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter.doFilter(RMAuthenticationFilter.java:82) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) > at > org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1243) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
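For what it's worth, this NoSuchMethodError is the usual signature of an older httpclient jar shadowing a newer one on the classpath (IIRC the URLEncodedUtils.parse(String, Charset) overload only appeared around httpclient 4.2), which would explain the Not A Bug resolution. One way to confirm, assuming a Maven build:
{noformat}
# List every httpclient version the test classpath actually pulls in
mvn dependency:tree -Dincludes=org.apache.httpcomponents:httpclient
{noformat}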
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494187#comment-15494187 ] Wangda Tan commented on YARN-4945: -- [~eepayne] / [~sunilg], For the suggestion from [~eepayne]: bq. I think that the objects that implement the IntraQueuePreemptionPolicy interface should be in a List, and then IntraQueueCandidatesSelector#selectCandidates should loop over the list to process the different policies. I would say it may not be necessary to have two separate policies to consider priority and user-limit. In my current rough thinking, only minor changes are required to support FIFO + Priority + user-limit intra-queue preemption; if it is really required, we can refactor this part when we move to user-limit preemption. The other reason is that the two intra-queue preemption policies (user-limit / priority) can affect each other: we cannot do priority preemption without considering user-limit, and vice versa. So if we can consider both with a reasonable code complexity, why not :)? > [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch, YARN-2009.v2.patch > > > This is an umbrella ticket to track efforts of preemption within a queue to > support features like: > YARN-2009. YARN-2113. YARN-4781. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3140) Improve locks in AbstractCSQueue/LeafQueue/ParentQueue
[ https://issues.apache.org/jira/browse/YARN-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494172#comment-15494172 ] Wangda Tan commented on YARN-3140: -- Thanks [~jianhe] for the review. Addressed all comments, except: bq. private synchronized CSAssignment assignContainersToChildQueues It is already protected by the write lock inside assignContainers, so there's no need to keep the write lock inside assignContainersToChildQueues. Any other comments? (Uploaded ver.3 patch) > Improve locks in AbstractCSQueue/LeafQueue/ParentQueue > -- > > Key: YARN-3140 > URL: https://issues.apache.org/jira/browse/YARN-3140 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3140.1.patch, YARN-3140.2.patch, YARN-3140.3.patch > > > Enhance locks in AbstractCSQueue/LeafQueue/ParentQueue; as mentioned in > YARN-3091, a possible solution is using a read/write lock. Other fine-grained > locks for specific purposes / bugs should be addressed in separate tickets. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
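For readers following along, a compilable sketch of the locking pattern being discussed (illustrative only, not the actual YARN-3140 patch; class and method bodies are placeholders): the public entry point takes the write lock, so private helpers invoked under it need no lock of their own.
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ParentQueueLockingSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  public String assignContainers(String node) {
    lock.writeLock().lock();
    try {
      // Parent-queue bookkeeping happens here, under the write lock.
      return assignContainersToChildQueues(node);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Deliberately neither synchronized nor locked: every caller already
  // holds the write lock taken in assignContainers.
  private String assignContainersToChildQueues(String node) {
    return "assigned-on-" + node; // placeholder for the real child-queue walk
  }
}
{code}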
[jira] [Updated] (YARN-3140) Improve locks in AbstractCSQueue/LeafQueue/ParentQueue
[ https://issues.apache.org/jira/browse/YARN-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3140: - Attachment: YARN-3140.3.patch > Improve locks in AbstractCSQueue/LeafQueue/ParentQueue > -- > > Key: YARN-3140 > URL: https://issues.apache.org/jira/browse/YARN-3140 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3140.1.patch, YARN-3140.2.patch, YARN-3140.3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3141: - Attachment: YARN-3141.3.patch > Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp > -- > > Key: YARN-3141 > URL: https://issues.apache.org/jira/browse/YARN-3141 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3141.1.patch, YARN-3141.2.patch, YARN-3141.3.patch > > > Enhance locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp; > as mentioned in YARN-3091, a possible solution is using a read/write lock. > Other fine-grained locks for specific purposes / bugs should be addressed in > separate tickets. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494159#comment-15494159 ] Wangda Tan commented on YARN-3141: -- Thanks for the review, [~templedf]. I addressed all your suggestions except: bq. You axed the javadoc for SchedulerApplicationAttempt.isReserved() isReserved is not used by anyone, so I removed that method. bq. It would be nice in the javadoc for all the methods that are no longer synchronized to note that they're MT safe. This is a good suggestion, but I think it's better to come in a separate patch, since we have to update almost every method in the scheduler. Any other thoughts? (Attached ver.3 patch) > Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp > -- > > Key: YARN-3141 > URL: https://issues.apache.org/jira/browse/YARN-3141 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3141.1.patch, YARN-3141.2.patch, YARN-3141.3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494147#comment-15494147 ] Sunil G commented on YARN-5545: --- Thank you very much [~jlowe] for sharing the use case and detailed analysis. I think I now understand the intent here. We will be sticking with the existing configuration set here, and introducing a much more flexible global queue max-apps. So those queues which are not configured at the per-queue level and do not have any capacity configured (in the case of node labels, the problem mentioned in this jira) will fall back to this new config (global queue max-apps). So, more or less, we could have the below pseudo code to represent this behavior.
{code}
maxApplications = conf.getMaximumApplicationsPerQueue(getQueuePath());
if (maxApplications < 0) {
  int maxGlobalPerQueueApps = conf.getGlobalMaximumApplicationsPerQueue();
  if (maxGlobalPerQueueApps > 0) {
    maxApplications = maxGlobalPerQueueApps;
  } else {
    int maxSystemApps = conf.getMaximumSystemApplications();
    maxApplications =
        (int) (maxSystemApps * queueCapacities.getAbsoluteCapacity());
  }
}
{code}
So in cases where there is no capacity configured for some labels in a queue, we could make use of the global queue max-apps configuration. > App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5145) [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494121#comment-15494121 ] Sunil G commented on YARN-5145: --- As per the below doc, {{under hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnUI2.md}}
{noformat}
*In $HADOOP_PREFIX/share/hadoop/yarn/webapps/rm/config/configs.env*
- Update timelineWebAddress and rmWebAddress to the actual addresses run resource manager and timeline server
- If you run RM locally in you computer just for test purpose, you need to keep `corsproxy` running. Otherwise, you need to set `localBaseAddress` to empty.
{noformat}
This is the help/readme doc which explains how to configure the YARN UI in a real production cluster. We will not be editing any .js files to change config; it should be done via *configs.env* itself. You can also refer to the TEZ project for the same. I could say that default-config.js is no longer needed. I can try to remove it in another ticket. > [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR > - > > Key: YARN-5145 > URL: https://issues.apache.org/jira/browse/YARN-5145 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Kai Sasaki > Attachments: YARN-5145-YARN-3368.01.patch > > > Existing YARN UI configuration is under the Hadoop package's directory > ($HADOOP_PREFIX/share/hadoop/yarn/webapps/); we should move it to > $HADOOP_CONF_DIR like other configurations. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
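For reference, a filled-in configs.env for a real cluster would look roughly like the below; the ENV/hosts structure follows the shipped template as I understand it, and the hostnames are placeholders:
{code}
ENV = {
  hosts: {
    // Leave empty when not proxying through a local corsproxy
    localBaseAddress: "",
    // Actual addresses where the timeline server and RM web UIs run
    timelineWebAddress: "timeline-host.example.com:8188",
    rmWebAddress: "rm-host.example.com:8088",
  },
};
{code}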
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494103#comment-15494103 ] Sunil G commented on YARN-4945: --- I think I missed the previous comment from [~eepayne]. Let me share another patch after addressing the comments. > [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch, YARN-2009.v2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4945: -- Attachment: YARN-2009.v2.patch Attaching v2 patch addressing the mentioned TODOs. [~leftnoteasy] and [~eepayne], please help review the same. > [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch, YARN-2009.v2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5637) Changes in NodeManager to support Container upgrade and rollback/commit
[ https://issues.apache.org/jira/browse/YARN-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494031#comment-15494031 ] Hadoop QA commented on YARN-5637: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 5s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 47s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 8 new + 267 unchanged - 0 fixed = 275 total (was 267) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 15m 36s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 30m 49s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager | | | hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828684/YARN-5637.005.patch | | JIRA Issue | YARN-5637 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 3f446ba56641 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 7cad7b7 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13115/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/13115/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/13115/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13115/testReport/
[jira] [Commented] (YARN-4329) Allow fetching exact reason as to why a submitted app is in ACCEPTED state in Fair Scheduler
[ https://issues.apache.org/jira/browse/YARN-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493961#comment-15493961 ] Yufei Gu commented on YARN-4329: Thanks [~Naganarasimha]! I found you've done the basic framework in YARN-3946. That's great! Please comment if you have any heads-up or concerns. > Allow fetching exact reason as to why a submitted app is in ACCEPTED state in > Fair Scheduler > > > Key: YARN-4329 > URL: https://issues.apache.org/jira/browse/YARN-4329 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, resourcemanager >Reporter: Naganarasimha G R >Assignee: Yufei Gu > > Similar to YARN-3946, it would be useful to capture the possible reason why the > Application is in the ACCEPTED state in FairScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5637) Changes in NodeManager to support Container upgrade and rollback/commit
[ https://issues.apache.org/jira/browse/YARN-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5637: -- Attachment: YARN-5637.005.patch Fixing checkstyle, javadocs and tests > Changes in NodeManager to support Container upgrade and rollback/commit > --- > > Key: YARN-5637 > URL: https://issues.apache.org/jira/browse/YARN-5637 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5637.001.patch, YARN-5637.002.patch, > YARN-5637.003.patch, YARN-5637.004.patch, YARN-5637.005.patch > > > YARN-5620 added support for re-initialization of Containers using a new > launch Context. > This JIRA proposes to use the above feature to support upgrade and subsequent > rollback or commit of the upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5637) Changes in NodeManager to support Container upgrade and rollback/commit
[ https://issues.apache.org/jira/browse/YARN-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493862#comment-15493862 ] Hadoop QA commented on YARN-5637: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 44s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 3 new + 267 unchanged - 0 fixed = 270 total (was 267) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 16s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 5 new + 240 unchanged - 0 fixed = 245 total (was 240) {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 42s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 28m 25s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor | | | hadoop.yarn.server.nodemanager.TestContainerManagerWithLCE | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828663/YARN-5637.004.patch | | JIRA Issue | YARN-5637 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux f7d665720cf6 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 7cad7b7 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13113/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/13113/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | unit |
[jira] [Updated] (YARN-3692) Allow REST API to set a user generated message when killing an application
[ https://issues.apache.org/jira/browse/YARN-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-3692: Attachment: 0005-YARN-3692.1.patch Removed an unused import and updated the patch > Allow REST API to set a user generated message when killing an application > -- > > Key: YARN-3692 > URL: https://issues.apache.org/jira/browse/YARN-3692 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Rajat Jain >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-3692.patch, 0002-YARN-3692.patch, > 0003-YARN-3692.patch, 0004-YARN-3692.patch, 0005-YARN-3692.1.patch, > 0005-YARN-3692.patch > > > Currently YARN's REST API supports killing an application without setting a > diagnostic message. It would be good to provide that support. > *Use Case* > Usually this helps in workflow management in a multi-tenant environment when > the workflow scheduler (or the hadoop admin) wants to kill a job and let > the user know the reason why the job was killed. Killing the job by setting a > diagnostic message is a very good solution for that. Ideally, we can set the > diagnostic message on all such interfaces: > yarn kill -applicationId ... -diagnosticMessage "some message added by > admin/workflow" > REST API { 'state': 'KILLED', 'diagnosticMessage': 'some message added by > admin/workflow'} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
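Sketched out, the proposal would extend the existing Cluster Application State REST endpoint roughly as below. The endpoint and the state field are the current RM REST API; the diagnosticMessage field is only the name proposed in the description above, pending whatever the patch finally settles on:
{noformat}
PUT http://<rm-address:port>/ws/v1/cluster/apps/{appid}/state
Content-Type: application/json

{ "state": "KILLED", "diagnosticMessage": "some message added by admin/workflow" }
{noformat}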
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493893#comment-15493893 ] Eric Payne commented on YARN-4945: -- Thanks again, [~sunilg]. I will look closely at the patch, but one thing I wanted to bring out before too much time passes is that some of the IntraQueue classes seem priority-centric and do not lend themselves to adding multiple intra-queue policies. - The constructor for {{IntraQueueCandidatesSelector}} passes {{priorityBasedPolicy}} as a parameter directly to the constructor for {{IntraQueuePreemptableResourceCalculator}} - {{IntraQueueCandidatesSelector#selectCandidates}} passes {{priorityBasedPolicy}} as a parameter directly to {{CapacitySchedulerPreemptionUtils.getResToObtainByPartitionForApps}}. I think that the objects that implement the {{IntraQueuePreemptionPolicy}} interface should be in a {{List}}, and then {{IntraQueueCandidatesSelector#selectCandidates}} should loop over the list to process the different policies (see the sketch below). Please change the name of variables in classes that need to be independent of the specific intra-queue policy: - {{CapacitySchedulerPreemptionUtils#getResToObtainByPartitionForApps}} has a parameter named {{priorityBasedPolicy}}, but this should be generic, like {{intraQueuePolicy}} - {{IntraQueuePreemptableResourceCalculator}} also has a variable named {{priorityBasedPolicy}}, which I think should be more generic. - {{CapacitySchedulerConfiguration#SELECT_CANDIDATES_FOR_INTRAQUEUE_PREEMPTION}}: since the value for this property is the switch to turn on intra-queue preemption, the name should be something more generic. Currently, it is {{yarn.resourcemanager.monitor.capacity.preemption.select_based_on_priority_of_applications}}, but it should be something like {{enable_intra_queue_preemption}}. > [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
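A compilable sketch of the shape suggested above; the type names follow the comment, while the method signatures are illustrative and not the actual YARN-2009 patch:
{code}
import java.util.Arrays;
import java.util.List;

interface IntraQueuePreemptionPolicy {
  // Each policy (priority-based, user-limit-based, ...) computes what to
  // preempt for the given queue partition.
  void computeResToObtain(String queuePartition);
}

class IntraQueueCandidatesSelector {
  private final List<IntraQueuePreemptionPolicy> policies;

  IntraQueueCandidatesSelector(IntraQueuePreemptionPolicy... policies) {
    this.policies = Arrays.asList(policies);
  }

  void selectCandidates(String queuePartition) {
    // Loop over all configured intra-queue policies instead of wiring a
    // single priority-based policy directly into the selector.
    for (IntraQueuePreemptionPolicy policy : policies) {
      policy.computeResToObtain(queuePartition);
    }
  }
}
{code}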
[jira] [Commented] (YARN-3692) Allow REST API to set a user generated message when killing an application
[ https://issues.apache.org/jira/browse/YARN-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493832#comment-15493832 ] Hadoop QA commented on YARN-3692: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 9s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 45s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 0s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 56s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 28s {color} | {color:red} root: The patch generated 3 new + 233 unchanged - 1 fixed = 236 total (was 234) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 25s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 14s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client generated 1 new + 157 unchanged - 0 fixed = 158 total (was 157) {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s {color} | {color:green} hadoop-yarn-api in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 19s {color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 37m 28s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 5s {color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 114m 21s {color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 217m 47s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828632/0005-YARN-3692.patch | | JIRA Issue | YARN-3692 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux fa021b97284e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | |
[jira] [Updated] (YARN-5637) Changes in NodeManager to support Container upgrade and rollback/commit
[ https://issues.apache.org/jira/browse/YARN-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5637: -- Attachment: YARN-5637.004.patch Rebasing patch with latest YARN-5620 > Changes in NodeManager to support Container upgrade and rollback/commit > --- > > Key: YARN-5637 > URL: https://issues.apache.org/jira/browse/YARN-5637 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5637.001.patch, YARN-5637.002.patch, > YARN-5637.003.patch, YARN-5637.004.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5577) [Atsv2] Document object passing in infofilters with an example
[ https://issues.apache.org/jira/browse/YARN-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493714#comment-15493714 ] Rohith Sharma K S commented on YARN-5577: - ping [~varun_saxena]!! > [Atsv2] Document object passing in infofilters with an example > -- > > Key: YARN-5577 > URL: https://issues.apache.org/jira/browse/YARN-5577 > Project: Hadoop YARN > Issue Type: Bug > Components: timelinereader, timelineserver >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Labels: documentation > Attachments: YARN-5577.patch > > > In HierarchicalTimelineEntity, setParent/addChild allow setting parent/child > entities at the INFO level. The key is a string and the value is an object. > Like below, for a YARN_CONTAINER entity the parent entity is set to the application. > {code} > "SYSTEM_INFO_PARENT_ENTITY": { >"type": "YARN_APPLICATION", >"id": "application_1471931266232_0024" > } > {code} > But to use an infofilter on entity type YARN_CONTAINER for a specific > applicationId, IIUC there is no way to pass an object as the value in an infofilter. > To make retrieval easier, either > # publish the parent/child entity id and type as strings rather than an object, like > below > {code} > "SYSTEM_INFO_PARENT_ENTITY_TYPE": "YARN_APPLICATION" > "SYSTEM_INFO_PARENT_ENTITY_ID":"application_1471931266232_0024" > {code} > OR > # Add the ability to provide an object as a filter with a format like > {{infofilters=SYSTEM_INFO_PARENT_ENTITY eq ((type eq YARN_APPLICATION) AND > (id eq application_1471931266232_0024))}} > I believe the 2nd approach will be applicable to any entity. But I am not > sure whether HBase supports such custom filters while scanning a table. > The 1st approach will be much easier to change. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
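To show why the 1st approach makes reader queries straightforward, a sketch of the resulting call; the reader URL shape is illustrative, and the flattened SYSTEM_INFO_* keys are the ones proposed above:
{noformat}
GET /ws/v2/timeline/apps/application_1471931266232_0024/entities/YARN_CONTAINER
    ?infofilters=SYSTEM_INFO_PARENT_ENTITY_TYPE eq YARN_APPLICATION AND
                 SYSTEM_INFO_PARENT_ENTITY_ID eq application_1471931266232_0024
{noformat}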
[jira] [Comment Edited] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493609#comment-15493609 ] Arun Suresh edited comment on YARN-5620 at 9/15/16 3:13 PM: Committed this to trunk and branch-2 Thanks again for the review [~jianhe] and [~vvasudev] was (Author: asuresh): Committed this to trunk and branch-2 > Core changes in NodeManager to support re-initialization of Containers with > new launchContext > - > > Key: YARN-5620 > URL: https://issues.apache.org/jira/browse/YARN-5620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-5620.001.patch, YARN-5620.002.patch, > YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, > YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, > YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, > YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, > YARN-5620.015.patch, YARN-5620.016.patch > > > JIRA proposes to modify the ContainerManager (and other core classes) to > support upgrade of a running container with a new {{ContainerLaunchContext}} > as well as the ability to rollback the upgrade if the container is not able > to restart using the new launch Context. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493609#comment-15493609 ] Arun Suresh commented on YARN-5620: --- Committed this to trunk and branch-2 > Core changes in NodeManager to support re-initialization of Containers with > new launchContext > - > > Key: YARN-5620 > URL: https://issues.apache.org/jira/browse/YARN-5620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-5620.001.patch, YARN-5620.002.patch, > YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, > YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, > YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, > YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, > YARN-5620.015.patch, YARN-5620.016.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5620: -- Fix Version/s: 3.0.0-alpha2 2.9.0 > Core changes in NodeManager to support re-initialization of Containers with > new launchContext > - > > Key: YARN-5620 > URL: https://issues.apache.org/jira/browse/YARN-5620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-5620.001.patch, YARN-5620.002.patch, > YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, > YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, > YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, > YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, > YARN-5620.015.patch, YARN-5620.016.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493503#comment-15493503 ] Hudson commented on YARN-5620: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10443 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10443/]) YARN-5620. Core changes in NodeManager to support re-initialization of (arun suresh: rev 40b5a59b726733df456330a26f03d5174cc0bc1c) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/Container.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerEventType.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerState.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncher.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceSet.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerManagerWithLCE.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerReInitEvent.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/event/ContainerLocalizationRequestEvent.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncherEventType.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/MockContainer.java > Core changes 
in NodeManager to support re-initialization of Containers with > new launchContext > - > > Key: YARN-5620 > URL: https://issues.apache.org/jira/browse/YARN-5620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5620.001.patch, YARN-5620.002.patch, > YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, > YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, > YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, > YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, > YARN-5620.015.patch, YARN-5620.016.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493431#comment-15493431 ] Arun Suresh commented on YARN-5620: --- Committing this shortly based on [~vvasudev]'s and [~jianhe]'s +1. Will take care of the unused imports when I check in. > Core changes in NodeManager to support re-initialization of Containers with > new launchContext > - > > Key: YARN-5620 > URL: https://issues.apache.org/jira/browse/YARN-5620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5620.001.patch, YARN-5620.002.patch, > YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, > YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, > YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, > YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, > YARN-5620.015.patch, YARN-5620.016.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493411#comment-15493411 ] Hadoop QA commented on YARN-5620: -
(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| +1 | mvninstall | 7m 7s | trunk passed |
| +1 | compile | 0m 27s | trunk passed |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvnsite | 0m 27s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 0m 41s | trunk passed |
| +1 | javadoc | 0m 17s | trunk passed |
| +1 | mvninstall | 0m 23s | the patch passed |
| +1 | compile | 0m 25s | the patch passed |
| +1 | javac | 0m 25s | the patch passed |
| -1 | checkstyle | 0m 21s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 13 new + 529 unchanged - 5 fixed = 542 total (was 534) |
| +1 | mvnsite | 0m 26s | the patch passed |
| +1 | mvneclipse | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 46s | the patch passed |
| +1 | javadoc | 0m 14s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242) |
| -1 | unit | 14m 11s | hadoop-yarn-server-nodemanager in the patch failed. |
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 27m 55s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828647/YARN-5620.016.patch |
| JIRA Issue | YARN-5620 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 45370375ce2a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2a8f55a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13112/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/13112/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/13112/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| Test Results |
[jira] [Commented] (YARN-5631) Missing refreshClusterMaxPriority usage in rmadmin help message
[ https://issues.apache.org/jira/browse/YARN-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493403#comment-15493403 ] Hadoop QA commented on YARN-5631: -
(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 25s | branch-2.8 passed |
| +1 | compile | 0m 18s | branch-2.8 passed with JDK v1.8.0_101 |
| +1 | compile | 0m 20s | branch-2.8 passed with JDK v1.7.0_111 |
| +1 | checkstyle | 0m 14s | branch-2.8 passed |
| +1 | mvnsite | 0m 24s | branch-2.8 passed |
| +1 | mvneclipse | 0m 15s | branch-2.8 passed |
| +1 | findbugs | 0m 36s | branch-2.8 passed |
| +1 | javadoc | 0m 14s | branch-2.8 passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 16s | branch-2.8 passed with JDK v1.7.0_111 |
| +1 | mvninstall | 0m 18s | the patch passed |
| +1 | compile | 0m 14s | the patch passed with JDK v1.8.0_101 |
| +1 | javac | 0m 14s | the patch passed |
| +1 | compile | 0m 18s | the patch passed with JDK v1.7.0_111 |
| +1 | javac | 0m 18s | the patch passed |
| -1 | checkstyle | 0m 12s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The patch generated 2 new + 28 unchanged - 17 fixed = 30 total (was 45) |
| +1 | mvnsite | 0m 21s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 45s | the patch passed |
| +1 | javadoc | 0m 11s | the patch passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 14s | the patch passed with JDK v1.7.0_111 |
| -1 | unit | 65m 53s | hadoop-yarn-client in the patch failed with JDK v1.8.0_101. |
| -1 | unit | 66m 13s | hadoop-yarn-client in the patch failed with JDK v1.7.0_111. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 145m 32s | |

|| Reason || Tests ||
| JDK v1.8.0_101 Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMProxy |
| | hadoop.yarn.client.TestGetGroups |
| JDK v1.8.0_101 Timed out junit tests | org.apache.hadoop.yarn.client.cli.TestYarnCLI |
| | org.apache.hadoop.yarn.client.api.impl.TestYarnClient |
| | org.apache.hadoop.yarn.client.api.impl.TestAMRMClient |
| | org.apache.hadoop.yarn.client.api.impl.TestNMClient |
| JDK v1.7.0_111 Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMProxy |
| |
[jira] [Updated] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5620: -- Attachment: YARN-5620.016.patch Thanks for the review, [~vvasudev]. Uploading the final patch with the comment changes and the class rename you suggested. Will commit after a good Jenkins run. > Core changes in NodeManager to support re-initialization of Containers with > new launchContext > - > > Key: YARN-5620 > URL: https://issues.apache.org/jira/browse/YARN-5620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5620.001.patch, YARN-5620.002.patch, > YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, > YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, > YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, > YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, > YARN-5620.015.patch, YARN-5620.016.patch > > > This JIRA proposes to modify the ContainerManager (and other core classes) to > support upgrade of a running container with a new {{ContainerLaunchContext}}, > as well as the ability to roll back the upgrade if the container is not able > to restart using the new launch context.
[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493271#comment-15493271 ] Jason Lowe commented on YARN-5545: -- bq. This could be configured to set max-apps per queue level in cluster level (queue won’t override this). A queue-level max-apps setting should always override the system-level setting. If a user explicitly sets max-apps for a particular queue then we cannot ignore that; we already have setups today where max-apps is tuned at the queue level for some queues, and today a queue-level max-apps limit overrides any system-level limit. That means even today users can configure RMs that accept more than the system-level app limit, by explicitly overriding the derived queue limits with larger specific limits. Therefore I'm tempted to have the global queue config completely override the old system-level max-apps config, because it is akin to setting the max-apps level for each queue explicitly. That means we operate in one of two modes (sketched in the code below): # If global queue max-apps is not set, we do what we do today and derive max-apps from relative capacities. Queues that override max-apps at their level continue to behave as they do today and get the override setting. # If global queue max-apps is set, yarn.scheduler.capacity.maximum-applications is completely ignored. Queues that override max-apps at their level still get their override setting; queues that do not override get the global queue setting as their max-apps limit. This preserves existing behavior when the new setting is unset, and is likely the least surprising behavior when it is used, especially if we document for both the old system max-apps config and the global queue max-apps config that the latter always overrides the former when set.
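To make the resolution order concrete, here is a minimal Java sketch of the two modes described above. All names ({{MaxAppsSketch}}, {{resolveMaxApps}}, {{perQueueMaxApps}}, {{globalQueueMaxApps}}) are hypothetical illustrations, not the actual CapacityScheduler fields or configuration keys:
{code}
// Hypothetical sketch only: illustrates the proposed resolution order,
// not the actual CapacityScheduler implementation.
public class MaxAppsSketch {
  static int resolveMaxApps(Integer perQueueMaxApps,    // explicit queue-level override, may be null
                            Integer globalQueueMaxApps, // proposed global per-queue limit, may be null
                            int maxSystemApps,          // system-level maximum-applications
                            float absoluteCapacity) {   // queue's absolute capacity in [0, 1]
    if (perQueueMaxApps != null) {
      return perQueueMaxApps;    // an explicit queue setting always wins, in both modes
    }
    if (globalQueueMaxApps != null) {
      return globalQueueMaxApps; // new mode: maximum-applications is ignored entirely
    }
    // Old mode: derived limit, which collapses to 0 when the queue's
    // default-partition capacity is 0 -- the failure reported in this JIRA.
    return (int) (maxSystemApps * absoluteCapacity);
  }
}
{code}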
> App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml > > > Configure capacity scheduler > yarn.scheduler.capacity.root.default.capacity=0 > yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50 > yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50 > Submit application as below > ./yarn jar > ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar > sleep -Dmapreduce.job.node-label-expression=labelx > -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1000 -rt 1 > {noformat} > 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging > area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001 > java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed > to submit application_1471670113386_0001 to YARN : > org.apache.hadoop.security.AccessControlException: Queue root.default already > has 0 applications, cannot accept submission of application: > application_1471670113386_0001 > at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341) > at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362) > at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71) > at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) > at > org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136) > at >
[jira] [Updated] (YARN-3692) Allow REST API to set a user generated message when killing an application
[ https://issues.apache.org/jira/browse/YARN-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-3692: Attachment: 0005-YARN-3692.patch Updated the patch addressing the review comments. > Allow REST API to set a user generated message when killing an application > -- > > Key: YARN-3692 > URL: https://issues.apache.org/jira/browse/YARN-3692 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Rajat Jain >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-3692.patch, 0002-YARN-3692.patch, > 0003-YARN-3692.patch, 0004-YARN-3692.patch, 0005-YARN-3692.patch > > > Currently YARN's REST API supports killing an application without setting a > diagnostic message. It would be good to provide that support. > *Use Case* > Usually this helps in workflow management in a multi-tenant environment when > the workflow scheduler (or the hadoop admin) wants to kill a job - and let > the user know the reason why the job was killed. Killing the job with a > diagnostic message is a very good solution for that. Ideally, we can set the > diagnostic message on all such interfaces: > yarn kill -applicationId ... -diagnosticMessage "some message added by > admin/workflow" > REST API { 'state': 'KILLED', 'diagnosticMessage': 'some message added by > admin/workflow'}
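To make the proposal quoted above concrete, here is a hedged usage sketch in Java against the RM's existing application-state REST endpoint ({{/ws/v1/cluster/apps/{appid}/state}}). The {{diagnosticMessage}} field is the addition this JIRA proposes (the name follows the issue description; the committed patch may name it differently), and the host and application id are placeholders:
{code}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hedged sketch: kill an application via the RM REST API while attaching
// the user-generated message proposed in this JIRA.
public class KillAppWithDiagnostics {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://rm-host:8088/ws/v1/cluster/apps/"
        + "application_1471670113386_0001/state");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);
    // "diagnosticMessage" is the field name from the issue description,
    // not a confirmed part of the committed API.
    String body = "{\"state\":\"KILLED\","
        + "\"diagnosticMessage\":\"some message added by admin/workflow\"}";
    try (OutputStream out = conn.getOutputStream()) {
      out.write(body.getBytes(StandardCharsets.UTF_8));
    }
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}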
[jira] [Updated] (YARN-5631) Missing refreshClusterMaxPriority usage in rmadmin help message
[ https://issues.apache.org/jira/browse/YARN-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Sasaki updated YARN-5631: - Attachment: YARN-5631-branch-2.8.02.patch > Missing refreshClusterMaxPriority usage in rmadmin help message > --- > > Key: YARN-5631 > URL: https://issues.apache.org/jira/browse/YARN-5631 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0-alpha2 >Reporter: Kai Sasaki >Assignee: Kai Sasaki >Priority: Minor > Attachments: YARN-5631-branch-2.8.01.patch, > YARN-5631-branch-2.8.02.patch, YARN-5631.01.patch, YARN-5631.02.patch > > > {{rmadmin -help}} does not show the {{-refreshClusterMaxPriority}} option in > the usage line. > {code} > $ bin/yarn rmadmin -help > rmadmin is the command to execute YARN administrative commands. > The full syntax is: > yarn rmadmin [-refreshQueues] [-refreshNodes [-g|graceful [timeout in > seconds] -client|server]] [-refreshNodesResources] > [-refreshSuperUserGroupsConfiguration] [-refreshUserToGroupsMappings] > [-refreshAdminAcls] [-refreshServiceAcl] [-getGroup [username]] > [-addToClusterNodeLabels > <"label1(exclusive=true),label2(exclusive=false),label3">] > [-removeFromClusterNodeLabels] [-replaceLabelsOnNode > <"node1[:port]=label1,label2 node2[:port]=label1">] > [-directlyAccessNodeLabelStore] [-updateNodeResource [NodeID] [MemSize] > [vCores] ([OvercommitTimeout]) [-help [cmd]] > {code}
[jira] [Commented] (YARN-3140) Improve locks in AbstractCSQueue/LeafQueue/ParentQueue
[ https://issues.apache.org/jira/browse/YARN-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493012#comment-15493012 ] Jian He commented on YARN-3140: --- - Is this method unused? If so, {{labelManager}} does not need to be volatile, and this method can be removed: {code} @VisibleForTesting public void setNodeLabelManager(RMNodeLabelsManager mgr) { this.labelManager = mgr; } {code} - {{pendingOrderingPolicy}}: does not need to be volatile. - The synchronized keyword is removed here, but no write lock is added: {code} private synchronized CSAssignment assignContainersToChildQueues( {code} > Improve locks in AbstractCSQueue/LeafQueue/ParentQueue > -- > > Key: YARN-3140 > URL: https://issues.apache.org/jira/browse/YARN-3140 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3140.1.patch, YARN-3140.2.patch > > > Enhance locks in AbstractCSQueue/LeafQueue/ParentQueue; as mentioned in > YARN-3091, a possible solution is using read/write locks. Other fine-grained > locks for specific purposes/bugs should be addressed in separate tickets.
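To illustrate the last review point above, here is a minimal, hypothetical sketch of the synchronized-to-write-lock conversion being asked for; the real {{ParentQueue#assignContainersToChildQueues}} has scheduler-specific parameters and logic that are omitted here:
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch only: shows the conversion pattern, not the actual
// ParentQueue implementation.
public class QueueLockSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  // Before: private synchronized CSAssignment assignContainersToChildQueues(...)
  // After removing "synchronized", the body must be wrapped in the write
  // lock explicitly; otherwise the old mutual exclusion is silently lost.
  void assignContainersToChildQueues() {
    lock.writeLock().lock();
    try {
      // ... child-queue assignment logic guarded by the write lock ...
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}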
[jira] [Commented] (YARN-5145) [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493000#comment-15493000 ] Kai Sasaki commented on YARN-5145: -- [~sunilg] Sorry for the lack of explanation. It seems that {{configs.env}} is no longer used, because all configurations are included in {{default-config.js}}, and the new YARN UI works fine without {{configs.env}}, so we can remove this file. After removing the {{configs.env}} file, there is no {{config}} directory under {{$HADOOP_PREFIX/share/hadoop/yarn/webapps/}}, because the configurations in {{default-config.js}} are built into the ember deployment. There are no configurations to be passed in externally, since they are all included in the ember deploy package. So removing {{configs.env}} is intended to: # remove a file that is no longer used, and # make it explicit, by removing the config directory from the deployed ember package, that no configuration is passed in from outside. Does that make sense? However, if any new configurations are introduced later that need to be changed externally, we would have to implement a way to pass values into the deployed ember package. Do you have such a future usage in mind? > [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR > - > > Key: YARN-5145 > URL: https://issues.apache.org/jira/browse/YARN-5145 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Kai Sasaki > Attachments: YARN-5145-YARN-3368.01.patch > > > The existing YARN UI configuration is under the Hadoop package's directory > $HADOOP_PREFIX/share/hadoop/yarn/webapps/; we should move it to > $HADOOP_CONF_DIR like other configurations.
[jira] [Commented] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492925#comment-15492925 ] Varun Vasudev commented on YARN-5620: - Thanks for the patch [~asuresh]. +1 except for some minor comment fixes. 1) {code} + * Resource is localized while the container is running - create symlinks. {code} The comment is the same for two transition handlers - maybe change it slightly to provide more context? 2) {code} + // If Container died during an upgrade, dont bother retrying. {code} What is this comment for? There is no change in the code, and it looks like we're just going through the regular retry mechanism. 3) {code} + static class KilledExternallyForReInitTransition extends ContainerTransition { {code} Maybe this should be renamed? My understanding is that this really is "Killed by the YARN framework to restart the container". > Core changes in NodeManager to support re-initialization of Containers with > new launchContext > - > > Key: YARN-5620 > URL: https://issues.apache.org/jira/browse/YARN-5620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5620.001.patch, YARN-5620.002.patch, > YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, > YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, > YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, > YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, > YARN-5620.015.patch > > > This JIRA proposes to modify the ContainerManager (and other core classes) to > support upgrade of a running container with a new {{ContainerLaunchContext}}, > as well as the ability to roll back the upgrade if the container is not able > to restart using the new launch context.
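One plausible reading of the code comment questioned in point 2 of the review above, sketched here with hypothetical names ({{ExitReason}}, {{RetryDecision}} are illustrative, not actual NodeManager classes): a container killed by the framework for re-initialization should bypass the regular retry policy, unlike a container that genuinely failed.
{code}
// Hypothetical sketch of the intended distinction only; the real
// NodeManager retry logic and class names differ.
enum ExitReason { FAILED, KILLED_BY_USER, KILLED_FOR_REINIT }

final class RetryDecision {
  static boolean shouldRetry(ExitReason reason, int remainingRetries) {
    if (reason == ExitReason.KILLED_FOR_REINIT) {
      // Death during an upgrade is handled by the re-init/rollback path,
      // not by the container retry mechanism.
      return false;
    }
    return reason == ExitReason.FAILED && remainingRetries > 0;
  }
}
{code}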
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492609#comment-15492609 ] Wangda Tan commented on YARN-4945: -- [~sunilg], I took a quick look at the patch; the overall approach looks good. For the TODO items, I think the reservation-logic support can be moved to a separate ticket: for apps running inside the same queue, resources are more likely to be homogeneous. The other two TODOs are better addressed in the same patch. And one minor comment: - The definition and initialization of IntraQueuePreemptionPolicy are in IntraQueueCandidatesSelector now, but I think it might be better to move them to IntraQueuePreemptableResourceCalculator. And I think we might not need the userLimitBasedPolicy; it could be part of the existing IntraQueuePreemptionPolicy. I will include more detailed reviews for the final patch :). Thanks, > [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch > > > This is an umbrella ticket to track efforts on preemption within a queue, to > support features like YARN-2009, YARN-2113 and YARN-4781.
[jira] [Commented] (YARN-4329) Allow fetching exact reason as to why a submitted app is in ACCEPTED state in Fair Scheduler
[ https://issues.apache.org/jira/browse/YARN-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492585#comment-15492585 ] Naganarasimha G R commented on YARN-4329: - Thanks [~yufeigu] for working on this! As I am not much acquainted with the Fair Scheduler (and was only aware of the first two in the list), I did not go ahead with working on it. And yes, it's better than logging (as per YARN-5563), as it will be available from REST/CLI/web UI etc. The basic framework is already in place as part of YARN-3946; if you have any concerns or queries you can reach me. > Allow fetching exact reason as to why a submitted app is in ACCEPTED state in > Fair Scheduler > > > Key: YARN-4329 > URL: https://issues.apache.org/jira/browse/YARN-4329 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, resourcemanager >Reporter: Naganarasimha G R >Assignee: Yufei Gu > > Similar to YARN-3946, it would be useful to capture the possible reason why > an application is in the ACCEPTED state in the FairScheduler
[jira] [Commented] (YARN-5631) Missing refreshClusterMaxPriority usage in rmadmin help message
[ https://issues.apache.org/jira/browse/YARN-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492551#comment-15492551 ] Hadoop QA commented on YARN-5631: -
(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 25s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 54s | branch-2.8 passed |
| +1 | compile | 0m 19s | branch-2.8 passed with JDK v1.8.0_101 |
| +1 | compile | 0m 20s | branch-2.8 passed with JDK v1.7.0_111 |
| +1 | checkstyle | 0m 15s | branch-2.8 passed |
| +1 | mvnsite | 0m 24s | branch-2.8 passed |
| +1 | mvneclipse | 0m 15s | branch-2.8 passed |
| +1 | findbugs | 0m 39s | branch-2.8 passed |
| +1 | javadoc | 0m 13s | branch-2.8 passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 17s | branch-2.8 passed with JDK v1.7.0_111 |
| +1 | mvninstall | 0m 19s | the patch passed |
| +1 | compile | 0m 14s | the patch passed with JDK v1.8.0_101 |
| +1 | javac | 0m 14s | the patch passed |
| +1 | compile | 0m 18s | the patch passed with JDK v1.7.0_111 |
| +1 | javac | 0m 18s | the patch passed |
| -1 | checkstyle | 0m 12s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The patch generated 1 new + 45 unchanged - 0 fixed = 46 total (was 45) |
| +1 | mvnsite | 0m 21s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 45s | the patch passed |
| +1 | javadoc | 0m 11s | the patch passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 14s | the patch passed with JDK v1.7.0_111 |
| -1 | unit | 65m 52s | hadoop-yarn-client in the patch failed with JDK v1.8.0_101. |
| -1 | unit | 66m 20s | hadoop-yarn-client in the patch failed with JDK v1.7.0_111. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 146m 22s | |

|| Reason || Tests ||
| JDK v1.8.0_101 Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMProxy |
| | hadoop.yarn.client.TestGetGroups |
| JDK v1.8.0_101 Timed out junit tests | org.apache.hadoop.yarn.client.cli.TestYarnCLI |
| | org.apache.hadoop.yarn.client.api.impl.TestYarnClient |
| | org.apache.hadoop.yarn.client.api.impl.TestAMRMClient |
| | org.apache.hadoop.yarn.client.api.impl.TestNMClient |
| JDK v1.7.0_111 Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMProxy |
| |