[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495481#comment-15495481 ] Naganarasimha G R commented on YARN-5545: - Thanks [~sunilg], [~wangda] & [~jlowe], for taking the discussion forward. I still have a few queries: # GlobalMaximumApplicationsPerQueue doesn't have any default set, right? If it is set, then there is no need for {{maxSystemApps * queueCapacities.getAbsoluteCapacity()}}, as that limit will never be reached. # IMO the approach captured by Sunil in his earlier [comment|https://issues.apache.org/jira/browse/YARN-5545?focusedCommentId=15494147&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15494147] does not solve the base problem completely. The problem started with {{maxSystemApps * queueCapacities.getAbsoluteCapacity()}}: which partition's absolute capacity needs to be considered when a given queue does not override max applications and the default-partition capacity of the queue is zero? So based on your approach, the only way to avoid it is to set {{GlobalMaximumApplicationsPerQueue}}, which would imply that this value is taken for all queues and the earlier approach of {{maxSystemApps * queueCapacities.getAbsoluteCapacity()}} is never considered. # I feel that {{enforce strict checking}} should be an implicit requirement, with the assumption that the admin would not have configured queue max apps to exceed system max apps. And we need not validate in the configuration that every queue's max apps stays below the system max apps; instead, while submitting an app, first validate that the system-level max apps is not violated and then that the queue-level max apps is not violated. Thoughts? > App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml > > > Configure capacity scheduler > yarn.scheduler.capacity.root.default.capacity=0 > yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50 > yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50 > Submit application as below > ./yarn jar > ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar > sleep -Dmapreduce.job.node-label-expression=labelx > -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1000 -rt 1 > {noformat} > 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging > area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001 > java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed > to submit application_1471670113386_0001 to YARN : > org.apache.hadoop.security.AccessControlException: Queue root.default already > has 0 applications, cannot accept submission of application: > application_1471670113386_0001 > at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790) > at 
org.apache.hadoop.mapreduce.Job.submit(Job.java:1341) > at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362) > at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71) > at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) > at > org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136) > at > org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:144) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at >
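For context, the per-queue limit under discussion is derived roughly as sketched below — a simplified illustration, not the exact CapacityScheduler code. When the default-partition capacity of a queue is zero, its absolute capacity is zero, so the derived limit collapses to zero applications:
{code}
// Simplified sketch of how a queue's max applications is derived when no
// per-queue override is configured (illustrative only; the real logic lives
// in LeafQueue and differs in detail).
int maxSystemApps = 10000;      // yarn.scheduler.capacity.maximum-applications
float absoluteCapacity = 0.0f;  // root.default has capacity 0 in the default partition
int maxApplications = (int) (maxSystemApps * absoluteCapacity);
// maxApplications == 0: every submission fails with "Queue root.default
// already has 0 applications, cannot accept submission of application".
{code}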
[jira] [Commented] (YARN-3140) Improve locks in AbstractCSQueue/LeafQueue/ParentQueue
[ https://issues.apache.org/jira/browse/YARN-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495424#comment-15495424 ] Jian He commented on YARN-3140: --- lgtm, committing in a day if no comments from others. > Improve locks in AbstractCSQueue/LeafQueue/ParentQueue > -- > > Key: YARN-3140 > URL: https://issues.apache.org/jira/browse/YARN-3140 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3140.1.patch, YARN-3140.2.patch, YARN-3140.3.patch > > > Enhance locks in AbstractCSQueue/LeafQueue/ParentQueue. As mentioned in > YARN-3091, a possible solution is using read/write locks. Other fine-grained > locks for specific purposes / bugs should be addressed in separate tickets.
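As a rough illustration of the read/write-lock direction mentioned in the description — a sketch with illustrative names, not the actual YARN-3140 patch — the idea is to replace coarse {{synchronized}} methods so that read-mostly operations no longer serialize behind each other:
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of the read/write-lock pattern discussed for AbstractCSQueue;
// field and method names here are illustrative.
class QueueSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private float capacity;

  float getCapacity() {
    lock.readLock().lock();   // many readers may hold the lock concurrently
    try {
      return capacity;
    } finally {
      lock.readLock().unlock();
    }
  }

  void reinitialize(float newCapacity) {
    lock.writeLock().lock();  // writers get exclusive access
    try {
      capacity = newCapacity;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}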
[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495350#comment-15495350 ] Bibin A Chundatt commented on YARN-5545: [~sunilg] {quote} However I think we do not need another config to enforce strict checking. It can be done in today's form. {quote} To keep the old behavior, we can keep the value false by default. > App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml
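A sketch of what such an opt-in property could look like in capacity-scheduler.xml. The property name below is hypothetical — the discussion has not settled on one — and it defaults to false to keep the old behavior, as suggested above:
{code}
<!-- Hypothetical property name, for illustration only. -->
<property>
  <name>yarn.scheduler.capacity.enforce-system-max-applications</name>
  <value>false</value>
  <description>If true, reject submission to any queue once the
    system-wide maximum-applications limit has been reached.</description>
</property>
{code}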
[jira] [Updated] (YARN-5635) Better handling when bad script is configured as Node's HealthScript
[ https://issues.apache.org/jira/browse/YARN-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-5635: --- Assignee: (was: Yufei Gu) > Better handling when bad script is configured as Node's HealthScript > > > Key: YARN-5635 > URL: https://issues.apache.org/jira/browse/YARN-5635 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Allen Wittenauer > > The earlier fix in YARN-5567 was reverted because it is not ideal to bring the > whole cluster down because of a bad script. At the same time, it is important > to report that the script configured as the node health script is erroneous, > as it might otherwise fail to detect bad health of a node.
[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495189#comment-15495189 ] Sunil G commented on YARN-5545: --- Thanks [~jlowe] for the valuable thoughts and suggestions. Thanks [~leftnoteasy]. It makes sense to me. [~bibinchundatt], However I think we do not need another config to enforce strict checking. It can be done in today's form. I will file a follow-up jira for the same. In that, we can check and reject app submission to any queue if the system-wide limit is met. Thoughts? > App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml
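A minimal sketch of the follow-up check described above — validate the system-wide limit first, then the queue-level limit, at submission time. Method and parameter names are illustrative, not the eventual patch:
{code}
import org.apache.hadoop.security.AccessControlException;

// Illustrative only: order the checks system-wide first, then per-queue.
class SubmissionLimitCheck {
  static void validate(int clusterApps, int systemMaxApps, String queueName,
      int queueApps, int queueMaxApps) throws AccessControlException {
    if (clusterApps >= systemMaxApps) {
      throw new AccessControlException("System-wide maximum applications ("
          + systemMaxApps + ") reached, cannot accept submission");
    }
    if (queueApps >= queueMaxApps) {
      throw new AccessControlException("Queue " + queueName + " already has "
          + queueApps + " applications, cannot accept submission");
    }
  }
}
{code}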
[jira] [Commented] (YARN-5655) TestContainerManagerSecurity is failing
[ https://issues.apache.org/jira/browse/YARN-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495169#comment-15495169 ] Robert Kanter commented on YARN-5655: - This only seems to be a problem in branch-2.8 and branch-2; trunk seems to be fine. However, I'm getting a different failure than you: {noformat} Running org.apache.hadoop.yarn.server.TestContainerManagerSecurity Tests run: 2, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 44.517 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.TestContainerManagerSecurity testContainerManager[0](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) Time elapsed: 23.939 sec <<< FAILURE! java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:360) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) Time elapsed: 19.823 sec <<< FAILURE! java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:360) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) {noformat} (The {{null}} is misleading; it comes from a JUnit quirk that occurs when an {{assertTrue}} or {{assertFalse}} is given no (optional) message.) > TestContainerManagerSecurity is failing > --- > > Key: YARN-5655 > URL: https://issues.apache.org/jira/browse/YARN-5655 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.0 >Reporter: Jason Lowe >Assignee: Robert Kanter > > TestContainerManagerSecurity has been failing recently in 2.8: > {noformat} > Running org.apache.hadoop.yarn.server.TestContainerManagerSecurity > Tests run: 2, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 80.928 sec > <<< FAILURE! - in org.apache.hadoop.yarn.server.TestContainerManagerSecurity > testContainerManager[0](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) > Time elapsed: 44.478 sec <<< ERROR! > java.lang.NullPointerException: null > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.waitForContainerToFinishOnNM(TestContainerManagerSecurity.java:394) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:337) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) > testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) > Time elapsed: 34.964 sec <<< FAILURE! 
> java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:333) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) > {noformat}
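To illustrate the JUnit behavior mentioned above with a generic example (unrelated to the actual test code): omitting the optional message argument makes a failed assertion surface as {{java.lang.AssertionError: null}}, while supplying it makes the failure self-describing:
{code}
import static org.junit.Assert.assertTrue;
import org.junit.Test;

public class AssertMessageExample {
  @Test
  public void withoutMessage() {
    // Reports "java.lang.AssertionError: null" on failure.
    assertTrue(2 + 2 == 5);
  }

  @Test
  public void withMessage() {
    // Reports the supplied message on failure instead.
    assertTrue("expected container to finish cleanly", 2 + 2 == 5);
  }
}
{code}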
[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495161#comment-15495161 ] Bibin A Chundatt commented on YARN-5545: {quote} Since the maximum-applications limit is mainly used to cap memory consumed by apps in the RM, I think at least in a follow-up JIRA, system-level maximum applications should be enforced. {quote} +1 for the same. Similar to cgroups, we can add a configuration for strict mode to be enabled. > App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495060#comment-15495060 ] Eric Payne commented on YARN-4945: -- [~sunilg], I noticed in the resourcemanager log that the metrics were not as I would expect after running applications. For example, after 1 application has completed running, the {{#queue-active-applications}} metric remains 1 instead of 0:
{code}
2016-09-16 01:11:10,189 [SchedulerEventDispatcher:Event Processor] INFO capacity.LeafQueue: Application removed - appId: application_1473988192446_0001 user: hadoop1 queue: glamdring #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 1
{code}
After 3 applications have run, the metrics are even more unexpected:
{code}
2016-09-16 01:12:34,622 [SchedulerEventDispatcher:Event Processor] INFO capacity.LeafQueue: Application removed - appId: application_1473988192446_0003 user: hadoop1 queue: glamdring #user-pending-applications: -4 #user-active-applications: 4 #queue-pending-applications: 0 #queue-active-applications: 3
{code}
I believe the cause of this is in {{LeafQueue#getAllApplications}}:
{code}
public Collection<FiCaSchedulerApp> getAllApplications() {
  Collection<FiCaSchedulerApp> apps =
      pendingOrderingPolicy.getSchedulableEntities();
  apps.addAll(orderingPolicy.getSchedulableEntities());
  return Collections.unmodifiableCollection(apps);
}
{code}
The call to {{pendingOrderingPolicy.getSchedulableEntities()}} returns the {{AbstractComparatorOrderingPolicy#schedulableEntities}} object, and then the call to {{apps.addAll(orderingPolicy.getSchedulableEntities())}} adds additional {{FiCaSchedulerApp}}s to {{schedulableEntities}}. By creating a copy of the return value of {{pendingOrderingPolicy.getSchedulableEntities()}}, I have been able to verify that {{schedulableEntities}} does not have extra entries. For example:
{code}
public Collection<FiCaSchedulerApp> getAllApplications() {
  Collection<FiCaSchedulerApp> apps = new TreeSet<FiCaSchedulerApp>(
      pendingOrderingPolicy.getSchedulableEntities());
  apps.addAll(orderingPolicy.getSchedulableEntities());
  return Collections.unmodifiableCollection(apps);
}
{code}
> [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch, YARN-2009.v2.patch > > > This is umbrella ticket to track efforts of preemption within a queue to > support features like: > YARN-2009. YARN-2113. YARN-4781.
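The aliasing problem Eric describes is easy to reproduce in isolation — if a getter returns its internal collection, a subsequent {{addAll}} mutates the owner's state. A generic, runnable illustration (not YARN code):
{code}
import java.util.ArrayList;
import java.util.List;

public class AliasingDemo {
  private final List<String> pending = new ArrayList<>(List.of("app1"));

  // Returns the internal list directly -- the same mistake as returning
  // schedulableEntities from the ordering policy.
  List<String> getPendingLeaky() {
    return pending;
  }

  public static void main(String[] args) {
    AliasingDemo demo = new AliasingDemo();
    List<String> apps = demo.getPendingLeaky();
    apps.addAll(List.of("app2", "app3"));  // silently grows demo.pending
    System.out.println(demo.pending);      // [app1, app2, app3] -- corrupted

    // Defensive copy, as in the suggested fix above:
    List<String> safe = new ArrayList<>(demo.getPendingLeaky());
    safe.add("app4");                      // demo.pending is unchanged
  }
}
{code}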
[jira] [Commented] (YARN-5642) Typos in 9 log messages
[ https://issues.apache.org/jira/browse/YARN-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495042#comment-15495042 ] Mehran Hassani commented on YARN-5642: -- Does this mean my patch has conflicts with trunk? > Typos in 9 log messages > > > Key: YARN-5642 > URL: https://issues.apache.org/jira/browse/YARN-5642 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Mehran Hassani >Priority: Trivial > Labels: newbie > Attachments: YARN-5642.001.patch > > > I am conducting research on log related bugs. I tried to make a tool to fix > repetitive yet simple patterns of bugs that are related to logs. Typos in log > messages are one of the recurring bugs. Therefore, I made a tool to find typos > in log statements. During my experiments, I managed to find the following > typos in Hadoop YARN: > In file > /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java, > LOG.info("AsyncDispatcher is draining to stop igonring any new events."), > igonring should be ignoring > In file > /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/YarnAuthorizationProvider.java, > LOG.info(authorizerClass.getName() + " is instiantiated."), > instiantiated should be instantiated > In file > /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/FileSystemApplicationHistoryStore.java, > LOG.info("Completed reading history information of all conatiners"+ " of > application attempt " + appAttemptId), > conatiners should be containers > In file > /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java, > LOG.info("Neither virutal-memory nor physical-memory monitoring is " > +"needed. 
Not running the monitor-thread"), > virutal should be virtual > In file > /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java, > LOG.info("Intialized plan {} based on reservable queue {}" plan.toString() > planQueueName), > Intialized should be Initialized > In file > /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java, > LOG.info("Initializing " + queueName + "\n" +"capacity = " + > queueCapacities.getCapacity() +" [= (float) configuredCapacity / 100 ]" + > "\n" +"asboluteCapacity = " + queueCapacities.getAbsoluteCapacity() +" [= > parentAbsoluteCapacity * capacity ]" + "\n" +"maxCapacity = " + > queueCapacities.getMaximumCapacity() +" [= configuredMaxCapacity ]" + "\n" > +"absoluteMaxCapacity = " + queueCapacities.getAbsoluteMaximumCapacity() +" > [= 1.0 maximumCapacity undefined " +"(parentAbsoluteMaxCapacity * > maximumCapacity) / 100 otherwise ]" +"\n" +"userLimit = " + userLimit +" [= > configuredUserLimit ]" + "\n" +"userLimitFactor = " + userLimitFactor +" [= > configuredUserLimitFactor ]" + "\n" +"maxApplications = " + maxApplications > +" [= configuredMaximumSystemApplicationsPerQueue or" +" > (int)(configuredMaximumSystemApplications * absoluteCapacity)]" +"\n" > +"maxApplicationsPerUser = " + maxApplicationsPerUser +" [= > (int)(maxApplications * (userLimit / 100.0f) * " +"userLimitFactor) ]" + "\n" > +"usedCapacity = " + queueCapacities.getUsedCapacity() +" [= > usedResourcesMemory / " +"(clusterResourceMemory * absoluteCapacity)]" + "\n" > +"absoluteUsedCapacity = " + absoluteUsedCapacity +" [= usedResourcesMemory / > clusterResourceMemory]" + "\n" +"maxAMResourcePerQueuePercent = " + > maxAMResourcePerQueuePercent +" [= configuredMaximumAMResourcePercent ]" + > "\n" +"minimumAllocationFactor = " + minimumAllocationFactor +" [= > (float)(maximumAllocationMemory - minimumAllocationMemory) / " > +"maximumAllocationMemory ]" + "\n" +"maximumAllocation = " + > maximumAllocation +" [= configuredMaxAllocation ]" + "\n" +"numContainers = " > + numContainers +" [= currentNumContainers ]" + "\n" +"state = " + state +" > [= configuredState ]" + "\n" +"acls = " + aclsString +" [= configuredAcls ]" > + "\n" +"nodeLocalityDelay = " + nodeLocalityDelay + "\n" +"labels=" + > labelStrBuilder.toString() + "\n" +"reservationsContinueLooking = " > +reservationsContinueLooking + "\n" +"preemptionDisabled = " + > getPreemptionDisabled() + "\n" +"defaultAppPriorityPerQueue = " + > defaultAppPriorityPerQueue), > asbolute should be absolute > In file >
[jira] [Commented] (YARN-5642) Typos in 9 log messages
[ https://issues.apache.org/jira/browse/YARN-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495027#comment-15495027 ] Hadoop QA commented on YARN-5642: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} | {color:red} YARN-5642 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828778/YARN-5642.001.patch | | JIRA Issue | YARN-5642 | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13119/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Typos in 9 log messages > > > Key: YARN-5642 > URL: https://issues.apache.org/jira/browse/YARN-5642 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Mehran Hassani >Priority: Trivial > Labels: newbie > Attachments: YARN-5642.001.patch
[jira] [Updated] (YARN-5642) Typos in 9 log messages
[ https://issues.apache.org/jira/browse/YARN-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mehran Hassani updated YARN-5642: - Attachment: YARN-5642.001.patch > Typos in 9 log messages > > > Key: YARN-5642 > URL: https://issues.apache.org/jira/browse/YARN-5642 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Mehran Hassani >Priority: Trivial > Labels: newbie > Attachments: YARN-5642.001.patch
[jira] [Commented] (YARN-5145) [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15494957#comment-15494957 ] Kai Sasaki commented on YARN-5145: -- [~sunilg] I see. Sorry for my misunderstanding; I missed the documentation. I'll update it to use {{configs.env}} under {{HADOOP_CONF_DIR}} as described initially. Thank you so much for the clear explanation! > [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR > - > > Key: YARN-5145 > URL: https://issues.apache.org/jira/browse/YARN-5145 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Kai Sasaki > Attachments: YARN-5145-YARN-3368.01.patch > > > The existing YARN UI configuration is under the Hadoop package's directory: > $HADOOP_PREFIX/share/hadoop/yarn/webapps/; we should move it to > $HADOOP_CONF_DIR like other configurations.
[jira] [Updated] (YARN-5642) Typos in 9 log messages
[ https://issues.apache.org/jira/browse/YARN-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mehran Hassani updated YARN-5642: - Summary: Typos in 9 log messages (was: Typos in 11 log messages ) > Typos in 9 log messages > > > Key: YARN-5642 > URL: https://issues.apache.org/jira/browse/YARN-5642 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Mehran Hassani >Priority: Trivial > Labels: newbie
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15494825#comment-15494825 ] Wangda Tan commented on YARN-4945: -- 1) YarnConfiguration: - Instead of having a separate SELECT_CANDIDATES_FOR_INTRAQUEUE_PREEMPTION, should we only have a "queue.intra-queue-preemption-enabled"? I cannot clearly see what it means semantically. One example: after we have user-limit preemption support, what happens if we only enable user-limit preemption (without priority preemption enabled)? 2) PCPP: - Unused imports / methods - getPartitionResource: avoid cloning resources? Because we will clone the resource twice for every app. If you are concerned about consistency, you can clone it once before starting the preemption calculation. - It seems to me partitionToUnderServedQueues can be kept in AbstractPreemptableResourceCalculator. In addition, Map<String, LinkedHashSet<TempQueuePerPartition>> could be Map<String, Set<TempQueuePerPartition>>. (LinkedHashSet is not necessarily needed, because we won't have two TempQueuePerPartition with the same queueName and same partition.) 3) CapacitySchedulerPreemptionUtils: - deductPreemptableResourcePerApp: is the following a valid comment? bq. // High priority app is coming first - Remove the unnecessary param in the method and the new generic type (like new HashMap<>(...)); better to move to IntelliJ? :p - {{getResToObtainByPartitionForApps}} can be removed; we can directly use policy.getResourceDemandFromAppsPerQueue 4) FiCaSchedulerApp: Move getTotalPendingRequestsPerPartition to ResourceUsage? I can see we could have requirements for getUsedResourceByPartition, getReservedResourceByPartition, etc. in the future. 5) PreemptionCandidatesSelector: - All non-abstract methods can be static, correct? - All TODOs in comments are done, correct? 6) IntraQueuePreemptionPolicy and PriorityIntraQueuePreemptionPolicy: - Overall: do you think the "-Policy" name is too big? What it essentially does is compute how much resource to preempt from each app; how about calling it something like IntraQueuePreemptionComputePlugin? Would like to hear thoughts from you and Eric on this as well. - Rename PriorityIntraQueuePreemptionPolicy to FifoIntraQueuePreemptionPolicy if you agree with [my comment|https://issues.apache.org/jira/browse/YARN-4945?focusedCommentId=15494454&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15494454] - PriorityIntraQueuePreemptionPolicy#getResourceDemandFromAppsPerQueue: a. resToObtainByPartition can be removed from the parameters b. IIUC, it gets resourceToObtain for each app instead of resourceDemand for each app; rename it properly? c. This logic is not correct:
{code}
// If demand is from same priority level, skip the same.
if (!tq.isPendingDemandForHigherPriorityApps(a1.getPriority())) {
  continue;
}
{code}
It can only prevent the highest-priority applications in a queue from preempting each other; it cannot prevent the 2nd-highest-priority applications from preempting each other. And the performance can be improved as well; I believe in some settings maxAppPriority can be as much as MAX_INT. Please look at the comments/pseudocode below for details. - computeAppsIdealAllocation: a. Calling getUserLimitHeadRoomPerApp is too expensive; instead we can add one method in LeafQueue to get the UserLimit by userName. With a Map of username to headroom inside the method, we can compute the user limit at most once per user. And this logic can be reused to compute user-limit preemption. b. {{tq.addPendinResourcePerPriority(tmpApp.getPriority(), tmpApp.pending);}} could be changed if you agree with the above. c.
I think we should move the {{skip the same priority demand}} logic into this method. One approach in my mind is:
{code}
// General idea:
// Use two pointers, one from the most prioritized app, one from the least prioritized app.
// Each app has two quotas: one is how much resource it requires (ideal - used),
// the other is how much resource can be preempted from it.
// Move the two pointers and update the two quotas to answer:
// for application X, is there any app with higher priority that needs the resource?
p1 = most-prioritized-app.iterator
p2 = least-prioritized-app.iterator

// For each app, we have:
// - "toPreemptFromOther", which is initialized to (ideal - (used - selected)).
// - "actuallyToBePreempted", initialized to 0.
while (p1.getPriority() > p2.getPriority() && p1 != p2) {
  Resource rest = p2.toBePreempt - p2.actuallyToBePreempted;
  if (rest > 0) {
    if (p1.toBePreemptFromOther > 0) {
      Resource toPreempt = min(p1.toBePreemptFromOther, rest);
      p1.toBePreemptFromOther -= toPreempt
      p2.actuallyToBePreempted += toPreempt
    }
  }
  if (p2.toBePreempt - p2.actuallyToBePreempted == 0) {
    // Nothing more can be preempted from p2, move to the next
    p2--;
  }
  if (p1.toBePreemptFromOther == 0) {
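To make the two-pointer idea above concrete, here is a runnable simplification — plain ints instead of {{Resource}}, illustrative names, and not the actual YARN-2009 patch:
{code}
import java.util.ArrayList;
import java.util.List;

public class TwoPointerPreemption {
  static class App {
    final String name;
    final int priority;
    int toPreemptFromOther;     // resource this app still needs from others
    int toBePreempted;          // resource that may be taken from this app
    int actuallyToBePreempted;  // resource selected for preemption so far

    App(String name, int priority, int need, int preemptable) {
      this.name = name;
      this.priority = priority;
      this.toPreemptFromOther = need;
      this.toBePreempted = preemptable;
    }
  }

  public static void main(String[] args) {
    // Apps sorted by priority, highest first.
    List<App> apps = new ArrayList<>(List.of(
        new App("high", 10, 5, 0),
        new App("mid", 5, 2, 3),
        new App("low", 1, 0, 6)));
    int i = 0, j = apps.size() - 1;
    while (i < j && apps.get(i).priority > apps.get(j).priority) {
      App p1 = apps.get(i), p2 = apps.get(j);
      int rest = p2.toBePreempted - p2.actuallyToBePreempted;
      if (rest > 0 && p1.toPreemptFromOther > 0) {
        int take = Math.min(p1.toPreemptFromOther, rest);
        p1.toPreemptFromOther -= take;
        p2.actuallyToBePreempted += take;
      }
      if (p2.toBePreempted - p2.actuallyToBePreempted == 0) {
        j--;  // nothing more can be preempted from p2
      }
      if (p1.toPreemptFromOther == 0) {
        i++;  // p1's demand is satisfied, move to the next app
      }
    }
    for (App a : apps) {
      System.out.println(a.name + ": preempt " + a.actuallyToBePreempted);
    }
  }
}
{code}
Running this prints {{low: preempt 6}} (5 toward "high", 1 toward "mid") with nothing taken from the higher-priority apps, and higher-priority demand is never satisfied by preempting same- or higher-priority apps.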
[jira] [Updated] (YARN-5638) Introduce a collector timestamp to uniquely identify collectors creation order in collector discovery
[ https://issues.apache.org/jira/browse/YARN-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-5638: Attachment: YARN-5638-trunk.v1.patch First version of the patch to demonstrate the idea. Right now I've split the collector discovery process into two steps: in the first step, the collector manager reports the collector to the NM, and the NM sends the collector data to the RM for registration. In the second step, the RM (synchronously) assigns the collector a timestamp (the RM's timestamp and a version number) and stores it in memory. The RM then updates known collector data via heartbeats to all NMs as before. The only difference is that the RM attaches timestamp information for each collector sent to the NMs, so that once there is a rebuild process, the NMs can report this information. > Introduce a collector timestamp to uniquely identify collectors creation > order in collector discovery > - > > Key: YARN-5638 > URL: https://issues.apache.org/jira/browse/YARN-5638 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > Attachments: YARN-5638-trunk.v1.patch > > > As discussed in YARN-3359, we need to further identify timeline collectors' > creation order to rebuild collector discovery data in the RM. This JIRA > proposes to use an <RM timestamp, version number> pair to order collectors > for each application in the RM. This timestamp can then be used when a > standby RM becomes active and rebuilds collector discovery data.
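A minimal sketch of the two-step flow described above, with hypothetical class and method names (the real changes are in YARN-5638-trunk.v1.patch):
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: an RM-side registry that stamps each reported
// collector with <RM timestamp, version> before redistributing it.
class RmCollectorRegistry {
  private final long rmTimestamp = System.currentTimeMillis();
  private final AtomicLong version = new AtomicLong();
  private final Map<String, String> knownCollectors = new ConcurrentHashMap<>();

  // Step 1: an NM reports a collector for an application; the RM
  // synchronously assigns the stamp and stores it in memory.
  String register(String appId, String collectorAddr) {
    String stamped =
        collectorAddr + "@" + rmTimestamp + ":" + version.incrementAndGet();
    knownCollectors.put(appId, stamped);
    // Step 2: the stamped entry is then pushed to all NMs on heartbeats.
    return stamped;
  }
}
{code}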
[jira] [Commented] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15494699#comment-15494699 ] Hadoop QA commented on YARN-3141: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 57s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 17s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 15 new + 48 unchanged - 23 fixed = 63 total (was 71) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 37m 55s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 54m 1s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.TestRMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828731/YARN-3141.4.patch | | JIRA Issue | YARN-3141 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 4b1c39304449 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fcbac00 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13118/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/13118/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/13118/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results |
[jira] [Commented] (YARN-5638) Introduce a collector timestamp to uniquely identify collectors creation order in collector discovery
[ https://issues.apache.org/jira/browse/YARN-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494688#comment-15494688 ] Li Lu commented on YARN-5638: - Updated the description of this JIRA. What we need here is not a new type of "collector id", but to store timestamp data in the RMs and NMs for the collectors. This can address the problem of rebuilding collector status for a new active RM, as discussed in YARN-3359: bq. when one application has two different attempts running (due to some network problems, for example) and the RM is trying to rebuild collector status, the RM needs to know which collector is for the latest app attempt and which one is for the stale attempt. We do not necessarily need to associate collectors with application attempts. Actually, according to the timeline server v2 design, we should only associate app collectors with applications. However, when maintaining collector data in RMs and NMs, we can store the timestamp of each collector. In this way, when the RM needs to rebuild collector status, it can gather all known collector data from the NMs, use the timestamp to decide the most recent state of the collectors, and then rebuild all states. > Introduce a collector timestamp to uniquely identify collectors creation > order in collector discovery > - > > Key: YARN-5638 > URL: https://issues.apache.org/jira/browse/YARN-5638 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > > As discussed in YARN-3359, we need to further identify timeline collectors' > creation order to rebuild collector discovery data in the RM. This JIRA > proposes to use a collector timestamp to order collectors > for each application in the RM. This timestamp can then be used when a > standby RM becomes active and rebuilds collector discovery data. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
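For illustration, the rebuild step described above might look roughly like the following. This is a sketch of the idea, not code from a patch: {{CollectorReport}} is a hypothetical holder of (appId, address, timestamp), and only the timestamp comparison matters here.

{code}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ApplicationId;

// Sketch: on failover, keep only the most recently created collector
// reported for each application, using the timestamp to decide.
class CollectorRebuild {
  Map<ApplicationId, CollectorReport> rebuild(
      Iterable<CollectorReport> reportsFromAllNMs) {
    Map<ApplicationId, CollectorReport> latest = new HashMap<>();
    for (CollectorReport r : reportsFromAllNMs) {
      latest.merge(r.getAppId(), r,
          (oldR, newR) ->
              newR.getTimestamp() > oldR.getTimestamp() ? newR : oldR);
    }
    return latest;
  }
}
{code}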
[jira] [Updated] (YARN-5638) Introduce a collector timestamp to uniquely identify collectors creation order in collector discovery
[ https://issues.apache.org/jira/browse/YARN-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-5638: Description: As discussed in YARN-3359, we need to further identify timeline collectors' creation order to rebuild collector discovery data in the RM. This JIRA proposes to use a collector timestamp to order collectors for each application in the RM. This timestamp can then be used when a standby RM becomes active and rebuilds collector discovery data. (was: As discussed in YARN-3359, we need to further identify timeline collectors and their creation order for better service discovery and resource isolation. This JIRA proposes to use a collector ID to accurately identify each timeline collector.) > Introduce a collector timestamp to uniquely identify collectors creation > order in collector discovery > - > > Key: YARN-5638 > URL: https://issues.apache.org/jira/browse/YARN-5638 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > > As discussed in YARN-3359, we need to further identify timeline collectors' > creation order to rebuild collector discovery data in the RM. This JIRA > proposes to use a collector timestamp to order collectors > for each application in the RM. This timestamp can then be used when a > standby RM becomes active and rebuilds collector discovery data. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5638) Introduce a collector timestamp to uniquely identify collectors creation order in collector discovery
[ https://issues.apache.org/jira/browse/YARN-5638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-5638: Summary: Introduce a collector timestamp to uniquely identify collectors creation order in collector discovery (was: Introduce a collector Id to uniquely identify collectors and their creation order) > Introduce a collector timestamp to uniquely identify collectors creation > order in collector discovery > - > > Key: YARN-5638 > URL: https://issues.apache.org/jira/browse/YARN-5638 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > > As discussed in YARN-3359, we need to further identify timeline collectors > and their creation order for better service discovery and resource isolation. > This JIRA proposes to use a collector ID to accurately identify > each timeline collector. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5336) Put in some limit for accepting key-values in hbase writer
[ https://issues.apache.org/jira/browse/YARN-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vrushali C updated YARN-5336: - Assignee: Haibo Chen (was: Vrushali C) > Put in some limit for accepting key-values in hbase writer > -- > > Key: YARN-5336 > URL: https://issues.apache.org/jira/browse/YARN-5336 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Vrushali C >Assignee: Haibo Chen > Labels: YARN-5355 > > As recommended by [~jrottinghuis], we need to add a limit (default and > configurable) for accepting key-values to be written to the backend. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5336) Put in some limit for accepting key-values in hbase writer
[ https://issues.apache.org/jira/browse/YARN-5336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494615#comment-15494615 ] Vrushali C commented on YARN-5336: -- Assigning to [~haibochen] > Put in some limit for accepting key-values in hbase writer > -- > > Key: YARN-5336 > URL: https://issues.apache.org/jira/browse/YARN-5336 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Vrushali C >Assignee: Haibo Chen > Labels: YARN-5355 > > As recommended by [~jrottinghuis], we need to add a limit (default and > configurable) for accepting key-values to be written to the backend. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494592#comment-15494592 ] Wangda Tan commented on YARN-5545: -- Thanks [~jlowe], [~sunilg] for the suggestions. I generally agree with the approach at https://issues.apache.org/jira/browse/YARN-5545?focusedCommentId=15494147=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15494147. Since maximum-applications is mainly used to cap the memory consumed by apps in the RM, I think that, at least in a follow-up JIRA, system-level maximum applications should be enforced: we should not allow the number of pending + running apps to go beyond the system-level maximum. Without this, it is going to be hard to estimate how many apps are in the RM (a rough sketch of such a check follows this message). Thoughts? > App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml > > > Configure capacity scheduler > yarn.scheduler.capacity.root.default.capacity=0 > yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50 > yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50 > Submit application as below > ./yarn jar > ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar > sleep -Dmapreduce.job.node-label-expression=labelx > -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1000 -rt 1 > {noformat} > 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging > area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001 > java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed > to submit application_1471670113386_0001 to YARN : > org.apache.hadoop.security.AccessControlException: Queue root.default already > has 0 applications, cannot accept submission of application: > application_1471670113386_0001 > at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341) > at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362) > at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71) > at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) > at > org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136) > at > org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:144) > at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at org.apache.hadoop.util.RunJar.run(RunJar.java:239) > at org.apache.hadoop.util.RunJar.main(RunJar.java:153) > Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit > application_1471670113386_0001 to YARN : > org.apache.hadoop.security.AccessControlException: Queue root.default already > has 0 applications, cannot accept submission of application: > application_1471670113386_0001 > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:286) > at > org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:296) > at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:301) > ... 25 more > {noformat} -- This message was
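As a rough illustration of the system-level enforcement proposed in the comment above, a check at submission time might look like the following. The class, method, and parameter names here are hypothetical and not from any attached patch; only ApplicationId and YarnException are existing YARN types.

{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.exceptions.YarnException;

// Hypothetical sketch of a system-level check at submission time:
// reject a submission once pending + running apps reach the global cap.
class SystemMaxAppsCheck {
  void validate(int pendingApps, int runningApps, int systemMaxApps,
      ApplicationId appId) throws YarnException {
    if (pendingApps + runningApps >= systemMaxApps) {
      throw new YarnException("Cannot accept " + appId
          + ": system already has " + (pendingApps + runningApps)
          + " applications, limit is " + systemMaxApps);
    }
  }
}
{code}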
[jira] [Updated] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3141: - Attachment: YARN-3141.4.patch Attached ver.4 patch. > Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp > -- > > Key: YARN-3141 > URL: https://issues.apache.org/jira/browse/YARN-3141 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3141.1.patch, YARN-3141.2.patch, YARN-3141.3.patch, > YARN-3141.4.patch > > > Enhance locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp; > as mentioned in YARN-3091, a possible solution is using read/write locks. > Other fine-grained locks for specific purposes / bugs should be addressed in > separate tickets. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494547#comment-15494547 ] Wangda Tan commented on YARN-3141: -- Thanks [~templedf]. However, I think for most of the comments we should keep things as-is: making a variable volatile doesn't mean we don't need an *extra lock to protect consistency between variables*. For the simplest example:

{code}
volatile boolean a;
volatile int b;

void updateB(int newB) { if (a) { b = newB; } }
void updateA(boolean newA) { a = newA; }
boolean getA() { return a; }
int getB() { return b; }
{code}

If two separate threads run concurrently, with thread #1 calling updateB and thread #2 calling updateA, the write to {{a}} can land between updateB's read of {{a}} and its write to {{b}}. A reader that then looks at both variables can observe an inconsistent pair, even though each field is individually volatile. So I would rather be more conservative: if a method reads some fields and writes some fields, the safest way is to take a single write lock that protects all of them. Likewise, if a method reads multiple fields, it should hold a single read lock across them. Most of the comments fall into this category; we could not shorten those critical sections. What I have addressed in this patch: bq. SchedulerApplicationAttempt.getLiveContainersMap() should be default visibility and @VisibleForTesting bq. In FSAppAttempt.getAllowedLocalityLevel(), FSAppAttempt.getAllowedLocalityLevelByTime(), FSAppAttempt.setReservation(), and FSAppAttempt.clearReservation() the write lock acquisition can be delayed until after the arg validation bq. There's an unused import in FiCaSchedulerApp > Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp > -- > > Key: YARN-3141 > URL: https://issues.apache.org/jira/browse/YARN-3141 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3141.1.patch, YARN-3141.2.patch, YARN-3141.3.patch > > > Enhance locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp; > as mentioned in YARN-3091, a possible solution is using read/write locks. > Other fine-grained locks for specific purposes / bugs should be addressed in > separate tickets. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
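A short sketch of the conservative pattern advocated above, using the standard java.util.concurrent ReadWriteLock; the class and fields are illustrative only, not code from the patch:

{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: one lock keeps reads and writes of the two related fields
// mutually consistent, which volatile alone cannot guarantee.
class ConsistentPair {
  private boolean a;
  private int b;
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  void updateB(int newB) {
    lock.writeLock().lock();
    try {
      if (a) {       // the read of a and the write of b happen atomically
        b = newB;    // with respect to other lock holders
      }
    } finally {
      lock.writeLock().unlock();
    }
  }

  int readBIfA() {   // reads both fields under a single read lock
    lock.readLock().lock();
    try {
      return a ? b : -1;
    } finally {
      lock.readLock().unlock();
    }
  }
}
{code}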
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494454#comment-15494454 ] Wangda Tan commented on YARN-4945: -- [~eepayne], bq. I need to understand what it would mean to combine all intra-queue priority policies into one. To clarify, we may not combine *all* intra-queue policies into one, but if you look at the queue-internal policies, there are mainly two groups: 1) Fair + user-limit + priority, and 2) FIFO + user-limit + priority. User-limit and priority will always be on, while the ordering policy (Fair/FIFO) is a changeable config. So it makes sense to me to have two different policies, one for FIFO (plus priority/UL) and one for Fair (plus priority/UL). bq. If they are combined, then is it still necessary to make IntraQueuePreemptionPolicy an interface? As I mentioned above, we can have a Fair intra-queue policy. To be honest, I haven't thought of a good way that a list of policies can better solve the priority + user-limit preemption problem. Could you share some ideas about it? For example, how to better consider both in the final decision? > [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch, YARN-2009.v2.patch > > > This is an umbrella ticket to track efforts on preemption within a queue to > support features like YARN-2009, YARN-2113, and YARN-4781. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3692) Allow REST API to set a user generated message when killing an application
[ https://issues.apache.org/jira/browse/YARN-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494437#comment-15494437 ] Hadoop QA commented on YARN-3692: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 44s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 58s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 48s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 39s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 0s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 2s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 2s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 2s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 27s {color} | {color:red} root: The patch generated 1 new + 233 unchanged - 1 fixed = 234 total (was 234) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 22s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 16s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client generated 1 new + 157 unchanged - 0 fixed = 158 total (was 157) {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s {color} | {color:green} hadoop-yarn-api in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 19s {color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 37m 29s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 0s {color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 115m 17s {color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 220m 50s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828683/0005-YARN-3692.1.patch | | JIRA Issue | YARN-3692 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux 1355118415f6 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | |
[jira] [Commented] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494440#comment-15494440 ] Daniel Templeton commented on YARN-3141: Continuing with more comments on v2. Sorry, I started these before you uploaded v3. These comments are a little more speculative. I'm not 100% certain that everything I'm recommending is safe. :) * {{SchedulerApplicationAttempt.getLiveContainersMap()}} should be default visibility and {{@VisibleForTesting}} * {{SchedulerApplicationAttempt.addRMContainer()}}, {{SchedulerApplicationAttempt.removeRMContainer()}}, {{SchedulerApplicationAttempt.updateResourceRequests()}}, {{SchedulerApplicationAttempt.recoverResourceRequestsForContainer()}}, {{SchedulerApplicationAttempt.reserve()}}, and {{SchedulerApplicationAttempt.updateBlacklist()}} should have the write locks pushed down to inside the _if_ * {{SchedulerApplicationAttempt.getHeadroom()}} and {{SchedulerApplicationAttempt.getResourceLimit()}} are identical. {{SchedulerApplicationAttempt.getResourceLimit()}} is not used outside {{SchedulerApplicationAttempt}} * In {{SchedulerApplicationAttempt.resetSchedulingOpportunities()}}, is the write lock needed? * In {{SchedulerApplicationAttempt.getLiveContainers()}} is the read lock needed? * In {{SchedulerApplicationAttempt.stop()}} the {{isStopped}} update can happen before the lock * In {{SchedulerApplicationAttempt.getReservedContainers()}} the lock should only cover the _for_ loop * In {{FSAppAttempt.getAllowedLocalityLevel()}}, {{FSAppAttempt.getAllowedLocalityLevelByTime()}}, {{FSAppAttempt.setReservation()}}, and {{FSAppAttempt.clearReservation()}} the write lock acquisition can be delayed until after the arg validation * There's an unused import in {{FiCaSchedulerApp}} * In {{FiCaSchedulerApp.containerCompleted()}} the write lock acquisition can be delayed until after removing from {{liveContainers}} * In {{FiCaSchedulerApp.allocate()}} the write lock acquisition can be delayed until after the stop check, and maybe after the sanity check * In {{FiCaSchedulerApp.unreserve()}} the write lock acquisition can be delayed until after canceling the increase request * In {{FiCaSchedulerApp.markContainerForPreemption()}} the write lock acquisition can be pushed down inside the _if_ > Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp > -- > > Key: YARN-3141 > URL: https://issues.apache.org/jira/browse/YARN-3141 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3141.1.patch, YARN-3141.2.patch, YARN-3141.3.patch > > > Enhance locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp; > as mentioned in YARN-3091, a possible solution is using read/write locks. > Other fine-grained locks for specific purposes / bugs should be addressed in > separate tickets. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
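The recurring "delay the write lock until after arg validation" suggestion above amounts to the generic pattern below. This is a sketch with illustrative class and field names, not code from the patch:

{code}
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Generic pattern behind the "delay the write lock" comments: validate
// arguments before acquiring the lock, so the critical section covers
// only the actual mutation.
class AttemptSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Set<String> reservations = new HashSet<>();

  void setReservation(String reservationId) {
    if (reservationId == null || reservationId.isEmpty()) {
      // argument validation needs no lock
      throw new IllegalArgumentException("invalid reservation id");
    }
    lock.writeLock().lock(); // acquired only once we know we will mutate
    try {
      reservations.add(reservationId);
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}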
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494394#comment-15494394 ] Eric Payne commented on YARN-4945: -- bq. I would say it may not be necessary to have two separate policies to consider priority and user-limit. [~leftnoteasy], I'm not sure how I feel about that yet. I need to understand what it would mean to combine all intra-queue priority policies into one. Whatever the design, I want to make sure it is not cumbersome to solve the user-limit-percent inversion that we often see. If they are combined, then is it still necessary to make {{IntraQueuePreemptionPolicy}} an interface? Wouldn't this just be the implementation class, and then there would be no need for {{PriorityIntraQueuePreemptionPolicy}} or other derivative classes? > [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch, YARN-2009.v2.patch > > > This is an umbrella ticket to track efforts on preemption within a queue to > support features like YARN-2009, YARN-2113, and YARN-4781. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3140) Improve locks in AbstractCSQueue/LeafQueue/ParentQueue
[ https://issues.apache.org/jira/browse/YARN-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494387#comment-15494387 ] Hadoop QA commented on YARN-3140: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 8s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 46s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 49s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 59s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 47s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 24s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 41s {color} | {color:red} hadoop-yarn-project/hadoop-yarn: The patch generated 40 new + 91 unchanged - 57 fixed = 131 total (was 148) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 5s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 46s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 59s {color} | {color:red} hadoop-yarn in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 37m 59s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 91m 30s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager | | | hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor | | | hadoop.yarn.server.applicationhistoryservice.webapp.TestAHSWebServices | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12827903/YARN-3140.2.patch | | JIRA Issue | YARN-3140 | | Optional Tests | asflicense findbugs xml compile javac javadoc mvninstall mvnsite unit checkstyle | |
[jira] [Assigned] (YARN-5655) TestContainerManagerSecurity is failing
[ https://issues.apache.org/jira/browse/YARN-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kanter reassigned YARN-5655: --- Assignee: Robert Kanter Sure. I'll take a look later today. > TestContainerManagerSecurity is failing > --- > > Key: YARN-5655 > URL: https://issues.apache.org/jira/browse/YARN-5655 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.0 >Reporter: Jason Lowe >Assignee: Robert Kanter > > TestContainerManagerSecurity has been failing recently in 2.8: > {noformat} > Running org.apache.hadoop.yarn.server.TestContainerManagerSecurity > Tests run: 2, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 80.928 sec > <<< FAILURE! - in org.apache.hadoop.yarn.server.TestContainerManagerSecurity > testContainerManager[0](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) > Time elapsed: 44.478 sec <<< ERROR! > java.lang.NullPointerException: null > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.waitForContainerToFinishOnNM(TestContainerManagerSecurity.java:394) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:337) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) > testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) > Time elapsed: 34.964 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:333) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5655) TestContainerManagerSecurity is failing
[ https://issues.apache.org/jira/browse/YARN-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494375#comment-15494375 ] Jason Lowe commented on YARN-5655: -- git bisect points to this commit when it started failing in branch-2.8: {noformat} commit f9016dfec33f1d6486c03a54f0a479ed08aff34f Author: Karthik Kambatla Date: Tue Sep 6 16:23:06 2016 -0700 YARN-5566. Client-side NM graceful decom is not triggered when jobs finish. (Robert Kanter via kasha) {noformat} [~rkanter] [~kasha] could you take a look? > TestContainerManagerSecurity is failing > --- > > Key: YARN-5655 > URL: https://issues.apache.org/jira/browse/YARN-5655 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.8.0 >Reporter: Jason Lowe > > TestContainerManagerSecurity has been failing recently in 2.8: > {noformat} > Running org.apache.hadoop.yarn.server.TestContainerManagerSecurity > Tests run: 2, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 80.928 sec > <<< FAILURE! - in org.apache.hadoop.yarn.server.TestContainerManagerSecurity > testContainerManager[0](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) > Time elapsed: 44.478 sec <<< ERROR! > java.lang.NullPointerException: null > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.waitForContainerToFinishOnNM(TestContainerManagerSecurity.java:394) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:337) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) > testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) > Time elapsed: 34.964 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:333) > at > org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-5655) TestContainerManagerSecurity is failing
Jason Lowe created YARN-5655: Summary: TestContainerManagerSecurity is failing Key: YARN-5655 URL: https://issues.apache.org/jira/browse/YARN-5655 Project: Hadoop YARN Issue Type: Bug Affects Versions: 2.8.0 Reporter: Jason Lowe TestContainerManagerSecurity has been failing recently in 2.8: {noformat} Running org.apache.hadoop.yarn.server.TestContainerManagerSecurity Tests run: 2, Failures: 1, Errors: 1, Skipped: 0, Time elapsed: 80.928 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.TestContainerManagerSecurity testContainerManager[0](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) Time elapsed: 44.478 sec <<< ERROR! java.lang.NullPointerException: null at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.waitForContainerToFinishOnNM(TestContainerManagerSecurity.java:394) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:337) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) testContainerManager[1](org.apache.hadoop.yarn.server.TestContainerManagerSecurity) Time elapsed: 34.964 sec <<< FAILURE! java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:333) at org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:157) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494307#comment-15494307 ] Hadoop QA commented on YARN-3141: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 19s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 16 new + 49 unchanged - 23 fixed = 65 total (was 72) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 33m 45s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 51m 1s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828699/YARN-3141.3.patch | | JIRA Issue | YARN-3141 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux af9636090cb8 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / fcbac00 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13117/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13117/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/13117/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Improve locks in
[jira] [Comment Edited] (YARN-5585) [Atsv2] Add a new filter fromId in REST endpoints
[ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494251#comment-15494251 ] Varun Saxena edited comment on YARN-5585 at 9/15/16 7:08 PM: - Just to summarise the suggestions given for folks to refer to. * Applications (like Tez) would know best how to interpret their entity IDs and how they can be sorted in descending order. Most entity IDs seem to have some sort of monotonically increasing sequence like app ID. We can hence open up a PUBLIC interface which ATSv2 users like Tez can implement to decide how to encode and decode a particular entity type so that it is stored in descending sorted fashion (based on creation time) in ATSv2. Encoding and decoding similar to AppIDConverter written in our code. Because if row keys themselves can be sorted, this will be performance-wise the best possible solution. Refer to [comment | https://issues.apache.org/jira/browse/YARN-5585?focusedCommentId=15470803=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15470803] ** _Pros of the approach:_ **# Lookup will be fast. ** _Cons of the approach:_ **# We are depending on the application to provide some code for this to work. The corresponding JAR will have to be placed on the classpath. Folks in other projects may not be pleased to not have inbuilt support for this in ATS. **# Entity IDs may not always have a monotonically increasing sequence like App IDs. * We can keep another table, say EntityCreationTable or EntityIndexTable, with row key as {{cluster!user!flow!flowrun!app!entitytype!reverse entity creation time!entityid}}. We will make an entry into this table whenever created time is reported for the entity. The real data would still reside in the main entity table. Entities in this table will be sorted in descending order. On the read side, we can first peek into this table to get relevant records in descending fashion (based on limit and/or fromId) and then use this info to query the entity table. We can do this in two ways. We can get created times from querying this index table and apply a filter of created time range. Or alternatively we can try out MultiRowRangeFilter. That, from the HBase javadoc, seems to be efficient. We will have to do some processing to determine these multiple row key ranges. Refer to [comment | https://issues.apache.org/jira/browse/YARN-5585?focusedCommentId=15472669=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15472669] ** _Note:_ The client should not send different created times for the same entity, otherwise that will lead to an additional row. If different created times are reported more than once, we will have to consider the latest one. ** _Pros of the approach:_ **# Solution provided within ATS. **# Extra write only when created time is reported. ** _Cons of the approach:_ **# Extra peek into the index table on the read side. Single entity read can still be served directly from the entity table though. * Another option would be to change the row key of the entity table to {{cluster!user!flow!flowrun!app!entitytype!reverse entity creation time!entityid}} and have another table to map {{cluster!user!flow!flowrun!app!entitytype!entityid}} to entity created time. So for a single entity call (HBase Get) we will have to first peek into the new table and then get records from the entity table. ** _Cons of the approach:_ **# On the write side, we will have to first look up the index table which has the entity created time, or the client should supply the entity created time on every write. 
First would impact write performance and latter may not be feasible for client to send. **# What should be the row key if client does not supply created time on first write but supplies the created time on a subsequent write. cc [~sjlee0], [~vrushalic], [~rohithsharma], [~gtCarrera9] was (Author: varun_saxena): Just to summarise the suggestions given for folks to refer to. * Applications (like Tez) would know best how to interpret their entity IDs' and how they can be descendingly sorted. Most entity IDs' seem to have some sort of monotonically increasing sequence like app ID. We can hence open up a PUBLIC interface which ATSv2 users like Tez can implement to decide how to encode and decode a particular entity type so that it is stored in descending sorted fashion (based on creation time) in ATSv2. Encoding and decoding similar to AppIDConverter written in our code.Because if row keys themselves can be sorted, this will be performance wise the best possible solution. Refer to [comment | https://issues.apache.org/jira/browse/YARN-5585?focusedCommentId=15470803=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15470803] ** _Pros of the approach:_ **# Lookup will be fast. ** _Cons of the approach:_ **# We are depending on application to provide some code for this to work. Corresponding JAR
[jira] [Commented] (YARN-5585) [Atsv2] Add a new filter fromId in REST endpoints
[ https://issues.apache.org/jira/browse/YARN-5585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494251#comment-15494251 ] Varun Saxena commented on YARN-5585: Just to summarise the suggestions given for folks to refer to. * Applications (like Tez) would know best how to interpret their entity IDs and how they can be sorted in descending order. Most entity IDs seem to have some sort of monotonically increasing sequence like app ID. We can hence open up a PUBLIC interface which ATSv2 users like Tez can implement to decide how to encode and decode a particular entity type so that it is stored in descending sorted fashion (based on creation time) in ATSv2. Encoding and decoding similar to AppIDConverter written in our code. Because if row keys themselves can be sorted, this will be performance-wise the best possible solution. Refer to [comment | https://issues.apache.org/jira/browse/YARN-5585?focusedCommentId=15470803=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15470803] ** _Pros of the approach:_ **# Lookup will be fast. ** _Cons of the approach:_ **# We are depending on the application to provide some code for this to work. The corresponding JAR will have to be placed on the classpath. Folks in other projects may not be pleased to not have inbuilt support for this in ATS. **# Entity IDs may not always have a monotonically increasing sequence like App IDs. * We can keep another table, say EntityCreationTable or EntityIndexTable, with row key as {{cluster!user!flow!flowrun!app!entitytype!reverse entity creation time!entityid}}. We will make an entry into this table whenever created time is reported for the entity. The real data would still reside in the main entity table. Entities in this table will be sorted in descending order. On the read side, we can first peek into this table to get relevant records in descending fashion (based on limit and/or fromId) and then use this info to query the entity table. We can do this in two ways. We can get created times from querying this index table and apply a filter of created time range. Or alternatively we can try out MultiRowRangeFilter. That, from the HBase javadoc, seems to be efficient. We will have to do some processing to determine these multiple row key ranges. Refer to [comment | https://issues.apache.org/jira/browse/YARN-5585?focusedCommentId=15472669=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15472669] ** _Note:_ The client should not send different created times for the same entity, otherwise that will lead to an additional row. If different created times are reported more than once, we will have to consider the latest one. ** _Pros of the approach:_ **# Solution provided within ATS. **# Extra write only when created time is reported. ** _Cons of the approach:_ **# Extra peek into the index table on the read side. Single entity read can still be served directly from the entity table though. * Another option would be to change the row key of the entity table to cluster!user!flow!flowrun!app!entitytype!reverse entity creation time!entityid and have another table to map cluster!user!flow!flowrun!app!entitytype!entityid to entity created time. So for a single entity call (HBase Get) we will have to first peek into the new table and then get records from the entity table. ** _Cons of the approach:_ **# On the write side, we will have to first look up the index table which has the entity created time, or the client should supply the entity created time on every write. 
The first would impact write performance and the latter may not be feasible for the client to send. **# What should the row key be if the client does not supply the created time on the first write but supplies it on a subsequent write? cc [~sjlee0], [~vrushalic], [~rohithsharma], [~gtCarrera9] > [Atsv2] Add a new filter fromId in REST endpoints > - > > Key: YARN-5585 > URL: https://issues.apache.org/jira/browse/YARN-5585 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelinereader >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S >Priority: Critical > Attachments: YARN-5585.v0.patch > > > The TimelineReader REST APIs provide a lot of filters to retrieve > applications. Along with those, it would be good to add a new filter, i.e. fromId, > so that entities can be retrieved after the fromId. > Current behavior: the default limit is set to 100. If there are 1000 entities, > then the REST call gives the first/last 100 entities. How to retrieve the next set of 100 > entities, i.e. 101 to 200 OR 900 to 801? > Example: if applications are stored in the database as app-1, app-2 ... app-10, > *getApps?limit=5* gives app-1 to app-5. But to retrieve the next 5 apps, there is > no way to achieve this. > So the proposal is to have fromId in the filter like >
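To make the "reverse entity creation time" row-key component concrete: a common HBase idiom is to store {{Long.MAX_VALUE - timestamp}}, so that lexicographic row-key order equals descending creation order. A minimal sketch follows; Bytes is the standard org.apache.hadoop.hbase.util.Bytes helper, and the surrounding key layout is simplified here, not taken from any patch.

{code}
import org.apache.hadoop.hbase.util.Bytes;

// Sketch: encode the creation time so that newer entities sort first.
// The rest of the row key (cluster!user!flow!flowrun!app!entitytype!...)
// is elided for brevity.
class RowKeySketch {
  static byte[] reverseTimestampPart(long createdTime) {
    return Bytes.toBytes(Long.MAX_VALUE - createdTime);
  }
}
{code}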
[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494248#comment-15494248 ] Jason Lowe commented on YARN-5545: -- Yes, that's essentially the idea. Users can work around the initially reported issue today by setting a queue-specific max-apps value. All the new global queue max-apps setting does is let users easily specify a default max-apps value for all queues that don't have a specific setting, rather than manually setting it on each queue. > App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
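To make the workaround and the proposed knob concrete, here is a minimal sketch in capacity-scheduler property form. The per-queue maximum-applications property is the existing CapacityScheduler one; the global property name below is only illustrative of what this jira proposes, not a settled name:
{noformat}
# Existing per-queue override: sidesteps deriving max apps from a zero
# absolute capacity on the default partition
yarn.scheduler.capacity.root.default.maximum-applications=1000

# Proposed: a default max-apps for every queue without a per-queue override
# (illustrative name; the final property is whatever the patch introduces)
yarn.scheduler.capacity.global-queue-max-application=1000
{noformat}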
[jira] [Resolved] (YARN-5640) Issue while accessing resource manager webapp rest service
[ https://issues.apache.org/jira/browse/YARN-5640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Loknath Priyatham Teja Singamsetty resolved YARN-5640. --- Resolution: Not A Bug > Issue while accessing resource manager webapp rest service > -- > > Key: YARN-5640 > URL: https://issues.apache.org/jira/browse/YARN-5640 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.5.1 >Reporter: Loknath Priyatham Teja Singamsetty > > I am running an E2E test in Phoenix which starts the mini mapreduce cluster using > MapreduceTestingShim.java from HBaseTestingUtility of the hbase codebase and > makes a rest call to get all the submitted yarn map reduce jobs > (http://localhost:63996/ws/v1/cluster/apps?states=NEW,ACCEPTED,SUBMITTED,RUNNING), > which is failing with the following stack trace. Tried debugging but > couldn't get to the bottom of the issue. > Also, I couldn't find the DelegationTokenAuthenticationHandler in the source > code. Any clue where to find the same? > Setup: > Phoenix - 4.8.0 > HBase - 1.20 > Hadoop - 2.5.1 > Stack Trace: > HTTP ERROR 500 > Problem accessing /ws/v1/cluster/apps. Reason: > > org.apache.http.client.utils.URLEncodedUtils.parse(Ljava/lang/String;Ljava/nio/charset/Charset;)Ljava/util/List; > Caused by: > java.lang.NoSuchMethodError: > org.apache.http.client.utils.URLEncodedUtils.parse(Ljava/lang/String;Ljava/nio/charset/Charset;)Ljava/util/List; > at > org.apache.hadoop.security.token.delegation.web.ServletUtils.getParameter(ServletUtils.java:48) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.managementOperation(DelegationTokenAuthenticationHandler.java:171) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:514) > at > org.apache.hadoop.yarn.server.security.http.RMAuthenticationFilter.doFilter(RMAuthenticationFilter.java:82) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) > at > org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1243) > at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
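For what it's worth, this NoSuchMethodError is the usual signature of an older httpclient jar shadowing a newer one on the classpath (IIRC the URLEncodedUtils.parse(String, Charset) overload only appeared around httpclient 4.2), which would explain the Not A Bug resolution. One way to confirm, assuming a Maven build:
{noformat}
# List every httpclient version the test classpath actually pulls in
mvn dependency:tree -Dincludes=org.apache.httpcomponents:httpclient
{noformat}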
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494187#comment-15494187 ] Wangda Tan commented on YARN-4945: -- [~eepayne] / [~sunilg], For the suggestion from [~eepayne]: bq. I think that the objects that implement the IntraQueuePreemptionPolicy interface should be in a List, and then IntraQueueCandidatesSelector#selectCandidates should loop over the list to process the different policies. I would say it may not be necessary to have two separate policies to consider priority and user-limit. In my current rough thinking, only minor changes are required to support FIFO + Priority + user-limit intra-queue preemption; if it is really required, we can refactor this part when we move to user-limit preemption. The other reason is that the two intra-queue preemption policies (user-limit / priority) can affect each other: we cannot do priority preemption without considering user-limit, and vice versa. So if we can consider both with a reasonable code complexity, why not :)? > [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch, YARN-2009.v2.patch > > > This is an umbrella ticket to track efforts of preemption within a queue to > support features like: > YARN-2009. YARN-2113. YARN-4781. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3140) Improve locks in AbstractCSQueue/LeafQueue/ParentQueue
[ https://issues.apache.org/jira/browse/YARN-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494172#comment-15494172 ] Wangda Tan commented on YARN-3140: -- Thanks [~jianhe] for the review. Addressed all comments, except: bq. private synchronized CSAssignment assignContainersToChildQueues It is already protected by the write lock inside assignContainers, so there's no need to keep the write lock inside assignContainersToChildQueues. Any other comments? (Uploaded ver.3 patch) > Improve locks in AbstractCSQueue/LeafQueue/ParentQueue > -- > > Key: YARN-3140 > URL: https://issues.apache.org/jira/browse/YARN-3140 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3140.1.patch, YARN-3140.2.patch, YARN-3140.3.patch > > > Enhance locks in AbstractCSQueue/LeafQueue/ParentQueue; as mentioned in > YARN-3091, a possible solution is using a read/write lock. Other fine-grained > locks for specific purposes / bugs should be addressed in separate tickets. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
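For readers following along, a compilable sketch of the locking pattern being discussed (illustrative only, not the actual YARN-3140 patch; class and method bodies are placeholders): the public entry point takes the write lock, so private helpers invoked under it need no lock of their own.
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ParentQueueLockingSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  public String assignContainers(String node) {
    lock.writeLock().lock();
    try {
      // Parent-queue bookkeeping happens here, under the write lock.
      return assignContainersToChildQueues(node);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Deliberately neither synchronized nor locked: every caller already
  // holds the write lock taken in assignContainers.
  private String assignContainersToChildQueues(String node) {
    return "assigned-on-" + node; // placeholder for the real child-queue walk
  }
}
{code}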
[jira] [Updated] (YARN-3140) Improve locks in AbstractCSQueue/LeafQueue/ParentQueue
[ https://issues.apache.org/jira/browse/YARN-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3140: - Attachment: YARN-3140.3.patch > Improve locks in AbstractCSQueue/LeafQueue/ParentQueue > -- > > Key: YARN-3140 > URL: https://issues.apache.org/jira/browse/YARN-3140 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3140.1.patch, YARN-3140.2.patch, YARN-3140.3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-3141: - Attachment: YARN-3141.3.patch > Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp > -- > > Key: YARN-3141 > URL: https://issues.apache.org/jira/browse/YARN-3141 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3141.1.patch, YARN-3141.2.patch, YARN-3141.3.patch > > > Enhance locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp; > as mentioned in YARN-3091, a possible solution is using a read/write lock. > Other fine-grained locks for specific purposes / bugs should be addressed in > separate tickets. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
[ https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494159#comment-15494159 ] Wangda Tan commented on YARN-3141: -- Thanks for the review, [~templedf]. I addressed all your suggestions except: bq. You axed the javadoc for SchedulerApplicationAttempt.isReserved() isReserved is not used by anyone, so I removed that method. bq. It would be nice in the javadoc for all the methods that are no longer synchronized to note that they're MT safe. This is a good suggestion, but I think it's better to come in a separate patch, since we have to update almost every method in the scheduler. Any other thoughts? (Attached ver.3 patch) > Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp > -- > > Key: YARN-3141 > URL: https://issues.apache.org/jira/browse/YARN-3141 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3141.1.patch, YARN-3141.2.patch, YARN-3141.3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494147#comment-15494147 ] Sunil G commented on YARN-5545: --- Thank you very much [~jlowe] for sharing the use case and detailed analysis. I think I now understand the intent here. We will be sticking with the existing configuration set here, and introducing a much more flexible global queue max-apps. So those queues which are not configured at the per-queue level and do not have any capacity configured (in the case of node labels, the problem mentioned in this jira) will fall back to this new config (global queue max-apps). So, more or less, we could have the below pseudo code to represent this behavior.
{code}
maxApplications = conf.getMaximumApplicationsPerQueue(getQueuePath());
if (maxApplications < 0) {
  int maxGlobalPerQueueApps = conf.getGlobalMaximumApplicationsPerQueue();
  if (maxGlobalPerQueueApps > 0) {
    maxApplications = maxGlobalPerQueueApps;
  } else {
    int maxSystemApps = conf.getMaximumSystemApplications();
    maxApplications =
        (int) (maxSystemApps * queueCapacities.getAbsoluteCapacity());
  }
}
{code}
So in cases where there is no capacity configured for some labels in a queue, we could make use of the global queue max-apps configuration. > App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5145) [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494121#comment-15494121 ] Sunil G commented on YARN-5145: --- As per the below doc, {{under hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/YarnUI2.md}}
{noformat}
*In $HADOOP_PREFIX/share/hadoop/yarn/webapps/rm/config/configs.env*
- Update timelineWebAddress and rmWebAddress to the actual addresses run resource manager and timeline server
- If you run RM locally in you computer just for test purpose, you need to keep `corsproxy` running. Otherwise, you need to set `localBaseAddress` to empty.
{noformat}
This is the help/readme doc which explains how to configure the YARN UI in a real production cluster. We will not be editing any .js files to change config; it should be done via *configs.env* itself. You can also refer to the TEZ project for the same. I could say that default-config.js is no longer needed. I can try to remove it in another ticket. > [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR > - > > Key: YARN-5145 > URL: https://issues.apache.org/jira/browse/YARN-5145 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Kai Sasaki > Attachments: YARN-5145-YARN-3368.01.patch > > > Existing YARN UI configuration is under the Hadoop package's directory > ($HADOOP_PREFIX/share/hadoop/yarn/webapps/); we should move it to > $HADOOP_CONF_DIR like other configurations. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
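For reference, a filled-in configs.env for a real cluster would look roughly like the below; the ENV/hosts structure follows the shipped template as I understand it, and the hostnames are placeholders:
{code}
ENV = {
  hosts: {
    // Leave empty when not proxying through a local corsproxy
    localBaseAddress: "",
    // Actual addresses where the timeline server and RM web UIs run
    timelineWebAddress: "timeline-host.example.com:8188",
    rmWebAddress: "rm-host.example.com:8088",
  },
};
{code}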
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494103#comment-15494103 ] Sunil G commented on YARN-4945: --- I think I missed the previous comment from [~eepayne]. Let me share another patch after addressing the comments. > [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch, YARN-2009.v2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4945: -- Attachment: YARN-2009.v2.patch Attaching v2 patch addressing the mentioned TODOs. [~leftnoteasy] and [~eepayne], please help review the same. > [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch, YARN-2009.v2.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5637) Changes in NodeManager to support Container upgrade and rollback/commit
[ https://issues.apache.org/jira/browse/YARN-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15494031#comment-15494031 ] Hadoop QA commented on YARN-5637: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 5s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 19s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 47s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 8 new + 267 unchanged - 0 fixed = 275 total (was 267) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 15m 36s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 16s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 30m 49s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.queuing.TestQueuingContainerManager | | | hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828684/YARN-5637.005.patch | | JIRA Issue | YARN-5637 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 3f446ba56641 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 7cad7b7 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13115/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | unit | https://builds.apache.org/job/PreCommit-YARN-Build/13115/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/13115/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/13115/testReport/
[jira] [Commented] (YARN-4329) Allow fetching exact reason as to why a submitted app is in ACCEPTED state in Fair Scheduler
[ https://issues.apache.org/jira/browse/YARN-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493961#comment-15493961 ] Yufei Gu commented on YARN-4329: Thanks [~Naganarasimha]! I found you've done the basic framework in YARN-3946. That's great! Please comment if you have any heads-up or concerns. > Allow fetching exact reason as to why a submitted app is in ACCEPTED state in > Fair Scheduler > > > Key: YARN-4329 > URL: https://issues.apache.org/jira/browse/YARN-4329 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, resourcemanager >Reporter: Naganarasimha G R >Assignee: Yufei Gu > > Similar to YARN-3946, it would be useful to capture the possible reason why the > Application is in the ACCEPTED state in FairScheduler -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5637) Changes in NodeManager to support Container upgrade and rollback/commit
[ https://issues.apache.org/jira/browse/YARN-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5637: -- Attachment: YARN-5637.005.patch Fixing checkstyle, javadocs and tests > Changes in NodeManager to support Container upgrade and rollback/commit > --- > > Key: YARN-5637 > URL: https://issues.apache.org/jira/browse/YARN-5637 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5637.001.patch, YARN-5637.002.patch, > YARN-5637.003.patch, YARN-5637.004.patch, YARN-5637.005.patch > > > YARN-5620 added support for re-initialization of Containers using a new > launch Context. > This JIRA proposes to use the above feature to support upgrade and subsequent > rollback or commit of the upgrade. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5637) Changes in NodeManager to support Container upgrade and rollback/commit
[ https://issues.apache.org/jira/browse/YARN-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493862#comment-15493862 ] Hadoop QA commented on YARN-5637: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 44s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 18s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 3 new + 267 unchanged - 0 fixed = 270 total (was 267) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 16s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 5 new + 240 unchanged - 0 fixed = 245 total (was 240) {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 42s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 28m 25s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor | | | hadoop.yarn.server.nodemanager.TestContainerManagerWithLCE | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828663/YARN-5637.004.patch | | JIRA Issue | YARN-5637 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux f7d665720cf6 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 7cad7b7 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13113/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | javadoc | https://builds.apache.org/job/PreCommit-YARN-Build/13113/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt | | unit |
[jira] [Updated] (YARN-3692) Allow REST API to set a user generated message when killing an application
[ https://issues.apache.org/jira/browse/YARN-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-3692: Attachment: 0005-YARN-3692.1.patch Removed an unused import and updated the patch > Allow REST API to set a user generated message when killing an application > -- > > Key: YARN-3692 > URL: https://issues.apache.org/jira/browse/YARN-3692 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Rajat Jain >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-3692.patch, 0002-YARN-3692.patch, > 0003-YARN-3692.patch, 0004-YARN-3692.patch, 0005-YARN-3692.1.patch, > 0005-YARN-3692.patch > > > Currently YARN's REST API supports killing an application without setting a > diagnostic message. It would be good to provide that support. > *Use Case* > Usually this helps in workflow management in a multi-tenant environment when > the workflow scheduler (or the hadoop admin) wants to kill a job and let > the user know the reason why the job was killed. Killing the job by setting a > diagnostic message is a very good solution for that. Ideally, we can set the > diagnostic message on all such interfaces: > yarn kill -applicationId ... -diagnosticMessage "some message added by > admin/workflow" > REST API { 'state': 'KILLED', 'diagnosticMessage': 'some message added by > admin/workflow'} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
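Sketched out, the proposal would extend the existing Cluster Application State REST endpoint roughly as below. The endpoint and the state field are the current RM REST API; the diagnosticMessage field is only the name proposed in the description above, pending whatever the patch finally settles on:
{noformat}
PUT http://<rm-address:port>/ws/v1/cluster/apps/{appid}/state
Content-Type: application/json

{ "state": "KILLED", "diagnosticMessage": "some message added by admin/workflow" }
{noformat}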
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493893#comment-15493893 ] Eric Payne commented on YARN-4945: -- Thanks again, [~sunilg]. I will look closely at the patch, but one thing I wanted to bring out before too much time passes is that some of the IntraQueue classes seem priority-centric and do not lend themselves to adding multiple intra-queue policies. - The constructor for {{IntraQueueCandidatesSelector}} passes {{priorityBasedPolicy}} as a parameter directly to the constructor for {{IntraQueuePreemptableResourceCalculator}} - {{IntraQueueCandidatesSelector#selectCandidates}} passes {{priorityBasedPolicy}} as a parameter directly to {{CapacitySchedulerPreemptionUtils.getResToObtainByPartitionForApps}}. I think that the objects that implement the {{IntraQueuePreemptionPolicy}} interface should be in a {{List}}, and then {{IntraQueueCandidatesSelector#selectCandidates}} should loop over the list to process the different policies (see the sketch below). Please change the name of variables in classes that need to be independent of the specific intra-queue policy: - {{CapacitySchedulerPreemptionUtils#getResToObtainByPartitionForApps}} has a parameter named {{priorityBasedPolicy}}, but this should be generic, like {{intraQueuePolicy}} - {{IntraQueuePreemptableResourceCalculator}} also has a variable named {{priorityBasedPolicy}}, which I think should be more generic. - {{CapacitySchedulerConfiguration#SELECT_CANDIDATES_FOR_INTRAQUEUE_PREEMPTION}}: since the value for this property is the switch to turn on intra-queue preemption, the name should be something more generic. Currently, it is {{yarn.resourcemanager.monitor.capacity.preemption.select_based_on_priority_of_applications}}, but it should be something like {{enable_intra_queue_preemption}}. > [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
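A compilable sketch of the shape suggested above; the type names follow the comment, while the method signatures are illustrative and not the actual YARN-2009 patch:
{code}
import java.util.Arrays;
import java.util.List;

interface IntraQueuePreemptionPolicy {
  // Each policy (priority-based, user-limit-based, ...) computes what to
  // preempt for the given queue partition.
  void computeResToObtain(String queuePartition);
}

class IntraQueueCandidatesSelector {
  private final List<IntraQueuePreemptionPolicy> policies;

  IntraQueueCandidatesSelector(IntraQueuePreemptionPolicy... policies) {
    this.policies = Arrays.asList(policies);
  }

  void selectCandidates(String queuePartition) {
    // Loop over all configured intra-queue policies instead of wiring a
    // single priority-based policy directly into the selector.
    for (IntraQueuePreemptionPolicy policy : policies) {
      policy.computeResToObtain(queuePartition);
    }
  }
}
{code}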
[jira] [Commented] (YARN-3692) Allow REST API to set a user generated message when killing an application
[ https://issues.apache.org/jira/browse/YARN-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493832#comment-15493832 ] Hadoop QA commented on YARN-3692: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 9s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 28s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 31s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 45s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s {color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 0s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 6m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 56s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 28s {color} | {color:red} root: The patch generated 3 new + 233 unchanged - 1 fixed = 236 total (was 234) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 25s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 14s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client generated 1 new + 157 unchanged - 0 fixed = 158 total (was 157) {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s {color} | {color:green} hadoop-yarn-api in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 19s {color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 37m 28s {color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 5s {color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 114m 21s {color} | {color:green} hadoop-mapreduce-client-jobclient in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 217m 47s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:9560f25 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828632/0005-YARN-3692.patch | | JIRA Issue | YARN-3692 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc | | uname | Linux fa021b97284e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | |
[jira] [Updated] (YARN-5637) Changes in NodeManager to support Container upgrade and rollback/commit
[ https://issues.apache.org/jira/browse/YARN-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5637: -- Attachment: YARN-5637.004.patch Rebasing patch with latest YARN-5620 > Changes in NodeManager to support Container upgrade and rollback/commit > --- > > Key: YARN-5637 > URL: https://issues.apache.org/jira/browse/YARN-5637 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5637.001.patch, YARN-5637.002.patch, > YARN-5637.003.patch, YARN-5637.004.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5577) [Atsv2] Document object passing in infofilters with an example
[ https://issues.apache.org/jira/browse/YARN-5577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493714#comment-15493714 ] Rohith Sharma K S commented on YARN-5577: - ping [~varun_saxena]!! > [Atsv2] Document object passing in infofilters with an example > -- > > Key: YARN-5577 > URL: https://issues.apache.org/jira/browse/YARN-5577 > Project: Hadoop YARN > Issue Type: Bug > Components: timelinereader, timelineserver >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Labels: documentation > Attachments: YARN-5577.patch > > > In HierarchicalTimelineEntity, setParent/addChild allow setting parent/child > entities at the INFO level. The key is a string and the value is an object. > Like below, for a YARN_CONTAINER entity the parent entity is set to the application. > {code} > "SYSTEM_INFO_PARENT_ENTITY": { >"type": "YARN_APPLICATION", >"id": "application_1471931266232_0024" > } > {code} > But to use an infofilter on entity type YARN_CONTAINER for a specific > applicationId, IIUC there is no way to pass an object as the value in an infofilter. > To make retrieval easier, either > # publish the parent/child entity id and type as strings rather than an object, like > below > {code} > "SYSTEM_INFO_PARENT_ENTITY_TYPE": "YARN_APPLICATION" > "SYSTEM_INFO_PARENT_ENTITY_ID":"application_1471931266232_0024" > {code} > OR > # Add the ability to provide an object as a filter with a format like > {{infofilters=SYSTEM_INFO_PARENT_ENTITY eq ((type eq YARN_APPLICATION) AND > (id eq application_1471931266232_0024))}} > I believe the 2nd approach will be applicable to any entity. But I am not > sure whether HBase supports such custom filters while scanning a table. > The 1st approach will be much easier to change. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
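To show why the 1st approach makes reader queries straightforward, a sketch of the resulting call; the reader URL shape is illustrative, and the flattened SYSTEM_INFO_* keys are the ones proposed above:
{noformat}
GET /ws/v2/timeline/apps/application_1471931266232_0024/entities/YARN_CONTAINER
    ?infofilters=SYSTEM_INFO_PARENT_ENTITY_TYPE eq YARN_APPLICATION AND
                 SYSTEM_INFO_PARENT_ENTITY_ID eq application_1471931266232_0024
{noformat}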
[jira] [Comment Edited] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493609#comment-15493609 ] Arun Suresh edited comment on YARN-5620 at 9/15/16 3:13 PM: Committed this to trunk and branch-2 Thanks again for the review [~jianhe] and [~vvasudev] was (Author: asuresh): Committed this to trunk and branch-2 > Core changes in NodeManager to support re-initialization of Containers with > new launchContext > - > > Key: YARN-5620 > URL: https://issues.apache.org/jira/browse/YARN-5620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-5620.001.patch, YARN-5620.002.patch, > YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, > YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, > YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, > YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, > YARN-5620.015.patch, YARN-5620.016.patch > > > JIRA proposes to modify the ContainerManager (and other core classes) to > support upgrade of a running container with a new {{ContainerLaunchContext}} > as well as the ability to rollback the upgrade if the container is not able > to restart using the new launch Context. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493609#comment-15493609 ] Arun Suresh commented on YARN-5620: --- Committed this to trunk and branch-2 > Core changes in NodeManager to support re-initialization of Containers with > new launchContext > - > > Key: YARN-5620 > URL: https://issues.apache.org/jira/browse/YARN-5620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-5620.001.patch, YARN-5620.002.patch, > YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, > YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, > YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, > YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, > YARN-5620.015.patch, YARN-5620.016.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5620: -- Fix Version/s: 3.0.0-alpha2 2.9.0 > Core changes in NodeManager to support re-initialization of Containers with > new launchContext > - > > Key: YARN-5620 > URL: https://issues.apache.org/jira/browse/YARN-5620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Fix For: 2.9.0, 3.0.0-alpha2 > > Attachments: YARN-5620.001.patch, YARN-5620.002.patch, > YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, > YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, > YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, > YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, > YARN-5620.015.patch, YARN-5620.016.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493503#comment-15493503 ] Hudson commented on YARN-5620: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10443 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10443/]) YARN-5620. Core changes in NodeManager to support re-initialization of (arun suresh: rev 40b5a59b726733df456330a26f03d5174cc0bc1c) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/Container.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerEventType.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerState.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncher.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceSet.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestContainerManagerWithLCE.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerReInitEvent.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/event/ContainerLocalizationRequestEvent.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainersLauncherEventType.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/MockContainer.java > Core changes 
in NodeManager to support re-initialization of Containers with > new launchContext > - > > Key: YARN-5620 > URL: https://issues.apache.org/jira/browse/YARN-5620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5620.001.patch, YARN-5620.002.patch, > YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, > YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, > YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, > YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, > YARN-5620.015.patch, YARN-5620.016.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493431#comment-15493431 ] Arun Suresh commented on YARN-5620: --- Committing this shortly based on [~vvasudev]'s and [~jianhe]'s +1. Will take care of the unused imports when I check in. > Core changes in NodeManager to support re-initialization of Containers with > new launchContext > - > > Key: YARN-5620 > URL: https://issues.apache.org/jira/browse/YARN-5620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5620.001.patch, YARN-5620.002.patch, > YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, > YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, > YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, > YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, > YARN-5620.015.patch, YARN-5620.016.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493411#comment-15493411 ] Hadoop QA commented on YARN-5620: -
(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| +1 | mvninstall | 7m 7s | trunk passed |
| +1 | compile | 0m 27s | trunk passed |
| +1 | checkstyle | 0m 22s | trunk passed |
| +1 | mvnsite | 0m 27s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 0m 41s | trunk passed |
| +1 | javadoc | 0m 17s | trunk passed |
| +1 | mvninstall | 0m 23s | the patch passed |
| +1 | compile | 0m 25s | the patch passed |
| +1 | javac | 0m 25s | the patch passed |
| -1 | checkstyle | 0m 21s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 13 new + 529 unchanged - 5 fixed = 542 total (was 534) |
| +1 | mvnsite | 0m 26s | the patch passed |
| +1 | mvneclipse | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 46s | the patch passed |
| +1 | javadoc | 0m 14s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager generated 0 new + 240 unchanged - 2 fixed = 240 total (was 242) |
| -1 | unit | 14m 11s | hadoop-yarn-server-nodemanager in the patch failed. |
| +1 | asflicense | 0m 16s | The patch does not generate ASF License warnings. |
| | | 27m 55s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.TestDefaultContainerExecutor |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828647/YARN-5620.016.patch |
| JIRA Issue | YARN-5620 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 45370375ce2a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2a8f55a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/13112/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| unit | https://builds.apache.org/job/PreCommit-YARN-Build/13112/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-YARN-Build/13112/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt |
| Test Results |
[jira] [Commented] (YARN-5631) Missing refreshClusterMaxPriority usage in rmadmin help message
[ https://issues.apache.org/jira/browse/YARN-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493403#comment-15493403 ] Hadoop QA commented on YARN-5631: -
(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 19s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 25s | branch-2.8 passed |
| +1 | compile | 0m 18s | branch-2.8 passed with JDK v1.8.0_101 |
| +1 | compile | 0m 20s | branch-2.8 passed with JDK v1.7.0_111 |
| +1 | checkstyle | 0m 14s | branch-2.8 passed |
| +1 | mvnsite | 0m 24s | branch-2.8 passed |
| +1 | mvneclipse | 0m 15s | branch-2.8 passed |
| +1 | findbugs | 0m 36s | branch-2.8 passed |
| +1 | javadoc | 0m 14s | branch-2.8 passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 16s | branch-2.8 passed with JDK v1.7.0_111 |
| +1 | mvninstall | 0m 18s | the patch passed |
| +1 | compile | 0m 14s | the patch passed with JDK v1.8.0_101 |
| +1 | javac | 0m 14s | the patch passed |
| +1 | compile | 0m 18s | the patch passed with JDK v1.7.0_111 |
| +1 | javac | 0m 18s | the patch passed |
| -1 | checkstyle | 0m 12s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The patch generated 2 new + 28 unchanged - 17 fixed = 30 total (was 45) |
| +1 | mvnsite | 0m 21s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 45s | the patch passed |
| +1 | javadoc | 0m 11s | the patch passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 14s | the patch passed with JDK v1.7.0_111 |
| -1 | unit | 65m 53s | hadoop-yarn-client in the patch failed with JDK v1.8.0_101. |
| -1 | unit | 66m 13s | hadoop-yarn-client in the patch failed with JDK v1.7.0_111. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 145m 32s | |

|| Reason || Tests ||
| JDK v1.8.0_101 Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMProxy |
| | hadoop.yarn.client.TestGetGroups |
| JDK v1.8.0_101 Timed out junit tests | org.apache.hadoop.yarn.client.cli.TestYarnCLI |
| | org.apache.hadoop.yarn.client.api.impl.TestYarnClient |
| | org.apache.hadoop.yarn.client.api.impl.TestAMRMClient |
| | org.apache.hadoop.yarn.client.api.impl.TestNMClient |
| JDK v1.7.0_111 Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMProxy |
| |
[jira] [Updated] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-5620: -- Attachment: YARN-5620.016.patch Thanks for the review, [~vvasudev]. Uploading the final patch with the comment changes and the class rename you suggested. Will commit after a good Jenkins run. > Core changes in NodeManager to support re-initialization of Containers with > new launchContext > - > > Key: YARN-5620 > URL: https://issues.apache.org/jira/browse/YARN-5620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5620.001.patch, YARN-5620.002.patch, > YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, > YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, > YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, > YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, > YARN-5620.015.patch, YARN-5620.016.patch > > > This JIRA proposes to modify the ContainerManager (and other core classes) to > support upgrade of a running container with a new {{ContainerLaunchContext}}, > as well as the ability to roll back the upgrade if the container is not able > to restart using the new launch context.
[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
[ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493271#comment-15493271 ] Jason Lowe commented on YARN-5545: -- bq. This could be configured to set max-apps per queue level in cluster level (queue won’t override this). A queue-level max-apps setting should always override the system-level setting. If a user explicitly sets max-apps for a particular queue then we cannot ignore that; we already have setups today where max-apps is tuned at the queue level for some queues, and today a queue-level max-apps limit overrides any system-level limit. That means even today users can configure RMs that accept more than the system-level app limit, by explicitly overriding the derived queue limits with larger specific limits. Therefore I'm tempted to have the global queue config completely override the old system-level max-apps config, because it is akin to setting the max-apps level for each queue explicitly. That means we operate in one of two modes (sketched in the code below): # If global queue max-apps is not set, we do what we do today and derive max-apps from relative capacities. Queues that override max-apps at their level continue to behave as they do today and get the override setting. # If global queue max-apps is set, yarn.scheduler.capacity.maximum-applications is completely ignored. Queues that override max-apps at their level still get their override setting; queues that do not override get the global queue setting as their max-apps limit. This preserves existing behavior when the new setting is unset, and is likely the least surprising behavior when it is used, especially if we document for both the old system max-apps config and the global queue max-apps config that the latter always overrides the former when set.
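To make the resolution order concrete, here is a minimal Java sketch of the two modes described above. All names ({{MaxAppsSketch}}, {{resolveMaxApps}}, {{perQueueMaxApps}}, {{globalQueueMaxApps}}) are hypothetical illustrations, not the actual CapacityScheduler fields or configuration keys:
{code}
// Hypothetical sketch only: illustrates the proposed resolution order,
// not the actual CapacityScheduler implementation.
public class MaxAppsSketch {
  static int resolveMaxApps(Integer perQueueMaxApps,    // explicit queue-level override, may be null
                            Integer globalQueueMaxApps, // proposed global per-queue limit, may be null
                            int maxSystemApps,          // system-level maximum-applications
                            float absoluteCapacity) {   // queue's absolute capacity in [0, 1]
    if (perQueueMaxApps != null) {
      return perQueueMaxApps;    // an explicit queue setting always wins, in both modes
    }
    if (globalQueueMaxApps != null) {
      return globalQueueMaxApps; // new mode: maximum-applications is ignored entirely
    }
    // Old mode: derived limit, which collapses to 0 when the queue's
    // default-partition capacity is 0 -- the failure reported in this JIRA.
    return (int) (maxSystemApps * absoluteCapacity);
  }
}
{code}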
> App submit failure on queue with label when default queue partition capacity > is zero > > > Key: YARN-5545 > URL: https://issues.apache.org/jira/browse/YARN-5545 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, > YARN-5545.0003.patch, capacity-scheduler.xml > > > Configure capacity scheduler > yarn.scheduler.capacity.root.default.capacity=0 > yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50 > yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50 > Submit application as below > ./yarn jar > ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar > sleep -Dmapreduce.job.node-label-expression=labelx > -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1000 -rt 1 > {noformat} > 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging > area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001 > java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed > to submit application_1471670113386_0001 to YARN : > org.apache.hadoop.security.AccessControlException: Queue root.default already > has 0 applications, cannot accept submission of application: > application_1471670113386_0001 > at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790) > at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341) > at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362) > at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71) > at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) > at > org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136) > at >
[jira] [Updated] (YARN-3692) Allow REST API to set a user generated message when killing an application
[ https://issues.apache.org/jira/browse/YARN-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-3692: Attachment: 0005-YARN-3692.patch Updated the patch addressing the review comments. > Allow REST API to set a user generated message when killing an application > -- > > Key: YARN-3692 > URL: https://issues.apache.org/jira/browse/YARN-3692 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Rajat Jain >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-3692.patch, 0002-YARN-3692.patch, > 0003-YARN-3692.patch, 0004-YARN-3692.patch, 0005-YARN-3692.patch > > > Currently YARN's REST API supports killing an application without setting a > diagnostic message. It would be good to provide that support. > *Use Case* > Usually this helps in workflow management in a multi-tenant environment when > the workflow scheduler (or the hadoop admin) wants to kill a job - and let > the user know the reason why the job was killed. Killing the job with a > diagnostic message is a very good solution for that. Ideally, we can set the > diagnostic message on all such interfaces: > yarn kill -applicationId ... -diagnosticMessage "some message added by > admin/workflow" > REST API { 'state': 'KILLED', 'diagnosticMessage': 'some message added by > admin/workflow'}
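To make the proposal quoted above concrete, here is a hedged usage sketch in Java against the RM's existing application-state REST endpoint ({{/ws/v1/cluster/apps/{appid}/state}}). The {{diagnosticMessage}} field is the addition this JIRA proposes (the name follows the issue description; the committed patch may name it differently), and the host and application id are placeholders:
{code}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hedged sketch: kill an application via the RM REST API while attaching
// the user-generated message proposed in this JIRA.
public class KillAppWithDiagnostics {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://rm-host:8088/ws/v1/cluster/apps/"
        + "application_1471670113386_0001/state");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);
    // "diagnosticMessage" is the field name from the issue description,
    // not a confirmed part of the committed API.
    String body = "{\"state\":\"KILLED\","
        + "\"diagnosticMessage\":\"some message added by admin/workflow\"}";
    try (OutputStream out = conn.getOutputStream()) {
      out.write(body.getBytes(StandardCharsets.UTF_8));
    }
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}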
[jira] [Updated] (YARN-5631) Missing refreshClusterMaxPriority usage in rmadmin help message
[ https://issues.apache.org/jira/browse/YARN-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Sasaki updated YARN-5631: - Attachment: YARN-5631-branch-2.8.02.patch > Missing refreshClusterMaxPriority usage in rmadmin help message > --- > > Key: YARN-5631 > URL: https://issues.apache.org/jira/browse/YARN-5631 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 3.0.0-alpha2 >Reporter: Kai Sasaki >Assignee: Kai Sasaki >Priority: Minor > Attachments: YARN-5631-branch-2.8.01.patch, > YARN-5631-branch-2.8.02.patch, YARN-5631.01.patch, YARN-5631.02.patch > > > {{rmadmin -help}} does not show the {{-refreshClusterMaxPriority}} option in > the usage line. > {code} > $ bin/yarn rmadmin -help > rmadmin is the command to execute YARN administrative commands. > The full syntax is: > yarn rmadmin [-refreshQueues] [-refreshNodes [-g|graceful [timeout in > seconds] -client|server]] [-refreshNodesResources] > [-refreshSuperUserGroupsConfiguration] [-refreshUserToGroupsMappings] > [-refreshAdminAcls] [-refreshServiceAcl] [-getGroup [username]] > [-addToClusterNodeLabels > <"label1(exclusive=true),label2(exclusive=false),label3">] > [-removeFromClusterNodeLabels] [-replaceLabelsOnNode > <"node1[:port]=label1,label2 node2[:port]=label1">] > [-directlyAccessNodeLabelStore] [-updateNodeResource [NodeID] [MemSize] > [vCores] ([OvercommitTimeout]) [-help [cmd]] > {code}
[jira] [Commented] (YARN-3140) Improve locks in AbstractCSQueue/LeafQueue/ParentQueue
[ https://issues.apache.org/jira/browse/YARN-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493012#comment-15493012 ] Jian He commented on YARN-3140: --- - Is this method unused? If so, {{labelManager}} does not need to be volatile, and this method can be removed: {code} @VisibleForTesting public void setNodeLabelManager(RMNodeLabelsManager mgr) { this.labelManager = mgr; } {code} - {{pendingOrderingPolicy}}: does not need to be volatile. - The synchronized keyword is removed here, but no write lock is added: {code} private synchronized CSAssignment assignContainersToChildQueues( {code} > Improve locks in AbstractCSQueue/LeafQueue/ParentQueue > -- > > Key: YARN-3140 > URL: https://issues.apache.org/jira/browse/YARN-3140 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-3140.1.patch, YARN-3140.2.patch > > > Enhance locks in AbstractCSQueue/LeafQueue/ParentQueue; as mentioned in > YARN-3091, a possible solution is using read/write locks. Other fine-grained > locks for specific purposes/bugs should be addressed in separate tickets.
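To illustrate the last review point above, here is a minimal, hypothetical sketch of the synchronized-to-write-lock conversion being asked for; the real {{ParentQueue#assignContainersToChildQueues}} has scheduler-specific parameters and logic that are omitted here:
{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch only: shows the conversion pattern, not the actual
// ParentQueue implementation.
public class QueueLockSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  // Before: private synchronized CSAssignment assignContainersToChildQueues(...)
  // After removing "synchronized", the body must be wrapped in the write
  // lock explicitly; otherwise the old mutual exclusion is silently lost.
  void assignContainersToChildQueues() {
    lock.writeLock().lock();
    try {
      // ... child-queue assignment logic guarded by the write lock ...
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}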
[jira] [Commented] (YARN-5145) [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR
[ https://issues.apache.org/jira/browse/YARN-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15493000#comment-15493000 ] Kai Sasaki commented on YARN-5145: -- [~sunilg] Sorry for the lack of explanation. It seems that {{configs.env}} is no longer used, because all configurations are included in {{default-config.js}}, and the new YARN UI works fine without {{configs.env}}, so we can remove this file. After removing the {{configs.env}} file, there is no {{config}} directory under {{$HADOOP_PREFIX/share/hadoop/yarn/webapps/}}, because the configurations in {{default-config.js}} are built into the ember deployment. There are no configurations to be passed in externally, since they are all included in the ember deploy package. So removing {{configs.env}} is intended to: # remove a file that is no longer used, and # make it explicit, by removing the config directory from the deployed ember package, that no configuration is passed in from outside. Does that make sense? However, if any new configurations are introduced later that need to be changed externally, we would have to implement a way to pass values into the deployed ember package. Do you have such a future usage in mind? > [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR > - > > Key: YARN-5145 > URL: https://issues.apache.org/jira/browse/YARN-5145 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Kai Sasaki > Attachments: YARN-5145-YARN-3368.01.patch > > > The existing YARN UI configuration is under the Hadoop package's directory > $HADOOP_PREFIX/share/hadoop/yarn/webapps/; we should move it to > $HADOOP_CONF_DIR like other configurations.
[jira] [Commented] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext
[ https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492925#comment-15492925 ] Varun Vasudev commented on YARN-5620: - Thanks for the patch [~asuresh]. +1 except for some minor comment fixes. 1) {code} + * Resource is localized while the container is running - create symlinks. {code} The comment is the same for two transition handlers - maybe change it slightly to provide more context? 2) {code} + // If Container died during an upgrade, dont bother retrying. {code} What is this comment for? There is no change in the code, and it looks like we're just going through the regular retry mechanism. 3) {code} + static class KilledExternallyForReInitTransition extends ContainerTransition { {code} Maybe this should be renamed? My understanding is that this really is "Killed by the YARN framework to restart the container". > Core changes in NodeManager to support re-initialization of Containers with > new launchContext > - > > Key: YARN-5620 > URL: https://issues.apache.org/jira/browse/YARN-5620 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Arun Suresh >Assignee: Arun Suresh > Attachments: YARN-5620.001.patch, YARN-5620.002.patch, > YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, > YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, > YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, > YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, > YARN-5620.015.patch > > > This JIRA proposes to modify the ContainerManager (and other core classes) to > support upgrade of a running container with a new {{ContainerLaunchContext}}, > as well as the ability to roll back the upgrade if the container is not able > to restart using the new launch context.
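One plausible reading of the code comment questioned in point 2 of the review above, sketched here with hypothetical names ({{ExitReason}}, {{RetryDecision}} are illustrative, not actual NodeManager classes): a container killed by the framework for re-initialization should bypass the regular retry policy, unlike a container that genuinely failed.
{code}
// Hypothetical sketch of the intended distinction only; the real
// NodeManager retry logic and class names differ.
enum ExitReason { FAILED, KILLED_BY_USER, KILLED_FOR_REINIT }

final class RetryDecision {
  static boolean shouldRetry(ExitReason reason, int remainingRetries) {
    if (reason == ExitReason.KILLED_FOR_REINIT) {
      // Death during an upgrade is handled by the re-init/rollback path,
      // not by the container retry mechanism.
      return false;
    }
    return reason == ExitReason.FAILED && remainingRetries > 0;
  }
}
{code}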
[jira] [Commented] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue
[ https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492609#comment-15492609 ] Wangda Tan commented on YARN-4945: -- [~sunilg], I took a quick look at the patch; the overall approach looks good. For the TODO items, I think the reservation-logic support can be moved to a separate ticket: for apps running inside the same queue, resources are more likely to be homogeneous. The other two TODOs are better addressed in the same patch. And one minor comment: - The definition and initialization of IntraQueuePreemptionPolicy are in IntraQueueCandidatesSelector now, but I think it might be better to move them to IntraQueuePreemptableResourceCalculator. And I think we might not need the userLimitBasedPolicy; it could be part of the existing IntraQueuePreemptionPolicy. I will include more detailed reviews for the final patch :). Thanks, > [Umbrella] Capacity Scheduler Preemption Within a queue > --- > > Key: YARN-4945 > URL: https://issues.apache.org/jira/browse/YARN-4945 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Wangda Tan > Attachments: Intra-Queue Preemption Use Cases.pdf, > IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, > YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, > YARN-2009.v1.patch > > > This is an umbrella ticket to track efforts on preemption within a queue, to > support features like YARN-2009, YARN-2113 and YARN-4781.
[jira] [Commented] (YARN-4329) Allow fetching exact reason as to why a submitted app is in ACCEPTED state in Fair Scheduler
[ https://issues.apache.org/jira/browse/YARN-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492585#comment-15492585 ] Naganarasimha G R commented on YARN-4329: - Thanks [~yufeigu] for working on this! As I am not much acquainted with the Fair Scheduler (and was only aware of the first two in the list), I did not go ahead with working on it. And yes, it's better than logging (as per YARN-5563), as it will be available from REST/CLI/web UI etc. The basic framework is already in place as part of YARN-3946; if you have any concerns or queries you can reach me. > Allow fetching exact reason as to why a submitted app is in ACCEPTED state in > Fair Scheduler > > > Key: YARN-4329 > URL: https://issues.apache.org/jira/browse/YARN-4329 > Project: Hadoop YARN > Issue Type: Sub-task > Components: fairscheduler, resourcemanager >Reporter: Naganarasimha G R >Assignee: Yufei Gu > > Similar to YARN-3946, it would be useful to capture the possible reason why > an application is in the ACCEPTED state in the FairScheduler
[jira] [Commented] (YARN-5631) Missing refreshClusterMaxPriority usage in rmadmin help message
[ https://issues.apache.org/jira/browse/YARN-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492551#comment-15492551 ] Hadoop QA commented on YARN-5631: -
(x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 25s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 6m 54s | branch-2.8 passed |
| +1 | compile | 0m 19s | branch-2.8 passed with JDK v1.8.0_101 |
| +1 | compile | 0m 20s | branch-2.8 passed with JDK v1.7.0_111 |
| +1 | checkstyle | 0m 15s | branch-2.8 passed |
| +1 | mvnsite | 0m 24s | branch-2.8 passed |
| +1 | mvneclipse | 0m 15s | branch-2.8 passed |
| +1 | findbugs | 0m 39s | branch-2.8 passed |
| +1 | javadoc | 0m 13s | branch-2.8 passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 17s | branch-2.8 passed with JDK v1.7.0_111 |
| +1 | mvninstall | 0m 19s | the patch passed |
| +1 | compile | 0m 14s | the patch passed with JDK v1.8.0_101 |
| +1 | javac | 0m 14s | the patch passed |
| +1 | compile | 0m 18s | the patch passed with JDK v1.7.0_111 |
| +1 | javac | 0m 18s | the patch passed |
| -1 | checkstyle | 0m 12s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The patch generated 1 new + 45 unchanged - 0 fixed = 46 total (was 45) |
| +1 | mvnsite | 0m 21s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 0m 45s | the patch passed |
| +1 | javadoc | 0m 11s | the patch passed with JDK v1.8.0_101 |
| +1 | javadoc | 0m 14s | the patch passed with JDK v1.7.0_111 |
| -1 | unit | 65m 52s | hadoop-yarn-client in the patch failed with JDK v1.8.0_101. |
| -1 | unit | 66m 20s | hadoop-yarn-client in the patch failed with JDK v1.7.0_111. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 146m 22s | |

|| Reason || Tests ||
| JDK v1.8.0_101 Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMProxy |
| | hadoop.yarn.client.TestGetGroups |
| JDK v1.8.0_101 Timed out junit tests | org.apache.hadoop.yarn.client.cli.TestYarnCLI |
| | org.apache.hadoop.yarn.client.api.impl.TestYarnClient |
| | org.apache.hadoop.yarn.client.api.impl.TestAMRMClient |
| | org.apache.hadoop.yarn.client.api.impl.TestNMClient |
| JDK v1.7.0_111 Failed junit tests | hadoop.yarn.client.api.impl.TestAMRMProxy |
| |