[jira] [Commented] (YARN-5145) [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR

2016-09-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492451#comment-15492451
 ] 

Sunil G commented on YARN-5145:
---

Hi [~kaisasak],
I didn't understand the approach here. Why are we removing {{configs.env}}?

This is the config I currently use to run the YARN web UI, and it is also 
mentioned in the doc files. I think what we are looking for here is to have a 
config in $HADOOP_CONF_DIR, and during RM start/restart the necessary conf can 
be placed/mirrored into the ember conf dir. That way ember keeps working the 
way it does now, and the user can configure it in the usual Hadoop style. 
Thoughts?
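
A minimal sketch of that mirroring idea, assuming a hypothetical helper invoked 
during RM startup and hypothetical file names (only {{configs.env}} comes from 
the current UI; everything else here is illustrative, not from any patch):

{code}
// Hypothetical sketch: mirror the new-UI config from $HADOOP_CONF_DIR into the
// ember webapp conf dir at RM start, so ember keeps reading configs.env as today.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class UiConfigMirror {
  public static void mirror(Path hadoopConfDir, Path emberConfDir) throws IOException {
    Path src = hadoopConfDir.resolve("yarn-ui-configs.env"); // user edits this, hadoop style (name assumed)
    Path dst = emberConfDir.resolve("configs.env");          // what the ember app reads today
    Files.createDirectories(emberConfDir);
    Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
  }
}
{code}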

> [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR
> -
>
> Key: YARN-5145
> URL: https://issues.apache.org/jira/browse/YARN-5145
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Kai Sasaki
> Attachments: YARN-5145-YARN-3368.01.patch
>
>
> The existing YARN UI configuration is under the Hadoop package's directory 
> $HADOOP_PREFIX/share/hadoop/yarn/webapps/; we should move it to 
> $HADOOP_CONF_DIR like other configurations.






[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero

2016-09-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15492418#comment-15492418
 ] 

Sunil G commented on YARN-5545:
---

Thank you very much [~jlowe] for pitching in and sharing thoughts. Makes sense 
to me overall.

bq.if we go down this route then I think we should have a separate top-level 
config that, when set, specifies the default max-apps per queue explicitly 
rather than having them try to derive it based on relative capacities. We can 
then debate whether that also overrides the system-wide setting or if we still 
respect the system-wide limit.

IIUC, the existing {{yarn.scheduler.capacity.maximum-applications}} can be used 
as the system-level max limit for apps, and the proposal is a new config like 
{{yarn.scheduler.capacity.global.queue-level-maximum-applications}}. This could 
be configured at the cluster level to set the max apps per queue (a queue won't 
override it). So if I set this config to 10k, then any queue could at most 
accept 10k apps. And this will also work along with the system-wide app limit. 
Hence if we configure the system-wide app limit as 50k, and assuming we have 10 
queues (with a 10k limit each), we will not end up with 100k apps in the 
cluster; rather we will hit the system-wide limit of 50k.

As more queues are added to the system, the admin can decrease the global 
per-queue max-app limit for finer tuning if needed. If we tend to use the 
global per-queue max-app limit as a relaxed boundary, then strict actions 
(rejecting an app) can be taken based on the system-wide limit. But if we are 
configuring this limit more judiciously, we can think of making the per-queue 
max-app limit a strict limit as well and rejecting apps for a queue.
I see only one problem with this now. If we do not make {{Q * X ~ Y}} (where Q 
is the number of queues, X is the global per-queue limit and Y is the 
system-wide max-app limit) a strict rule, then we have two possibilities: {{Q * 
X > Y}} and {{Q * X < Y}}. I think most admins would prefer the former, where 
the system-wide limit is the stricter one and the per-queue limit is relaxed. 
But if we use the latter, then we may reject apps even though the system-wide 
limit has not yet been reached. This may or may not be fine. I think with more 
discussion we can come to a common consensus here. 
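
To make the interplay concrete, here is a minimal sketch of the admission check 
described above (a hypothetical illustration only; the global per-queue key is 
the one proposed in this discussion, not an existing config):

{code}
// Hypothetical sketch of admission under both limits: a queue may hold at most
// X apps (global per-queue limit) and the cluster at most Y (system-wide limit).
public class MaxAppsCheck {
  private final int perQueueLimit; // e.g. yarn.scheduler.capacity.global.queue-level-maximum-applications = 10000
  private final int systemLimit;   // e.g. yarn.scheduler.capacity.maximum-applications = 50000

  public MaxAppsCheck(int perQueueLimit, int systemLimit) {
    this.perQueueLimit = perQueueLimit;
    this.systemLimit = systemLimit;
  }

  /** One more app is accepted only if neither limit has been reached. */
  public boolean canAccept(int appsInQueue, int appsInCluster) {
    return appsInQueue < perQueueLimit && appsInCluster < systemLimit;
  }
}
{code}

With X = 10k per queue and Y = 50k system-wide over 10 queues, at most 50k apps 
are accepted cluster-wide, even though the queue limits alone would allow 
10 * 10k = 100k.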

> App submit failure on queue with label when default queue partition capacity 
> is zero
> 
>
> Key: YARN-5545
> URL: https://issues.apache.org/jira/browse/YARN-5545
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, 
> YARN-5545.0003.patch, capacity-scheduler.xml
>
>
> Configure capacity scheduler 
> yarn.scheduler.capacity.root.default.capacity=0
> yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50
> yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50
> Submit application as below
> ./yarn jar 
> ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar
>  sleep -Dmapreduce.job.node-label-expression=labelx 
> -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1000 -rt 1
> {noformat}
> 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging 
> area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed 
> to submit application_1471670113386_0001 to YARN : 
> org.apache.hadoop.security.AccessControlException: Queue root.default already 
> has 0 applications, cannot accept submission of application: 
> application_1471670113386_0001
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> 

[jira] [Updated] (YARN-5631) Missing refreshClusterMaxPriority usage in rmadmin help message

2016-09-14 Thread Kai Sasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki updated YARN-5631:
-
Attachment: YARN-5631-branch-2.8.01.patch

> Missing refreshClusterMaxPriority usage in rmadmin help message
> ---
>
> Key: YARN-5631
> URL: https://issues.apache.org/jira/browse/YARN-5631
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha2
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>Priority: Minor
> Attachments: YARN-5631-branch-2.8.01.patch, YARN-5631.01.patch, 
> YARN-5631.02.patch
>
>
> {{rmadmin -help}} does not show {{-refreshClusterMaxPriority}} option in 
> usage line.
> {code}
> $ bin/yarn rmadmin -help
> rmadmin is the command to execute YARN administrative commands.
> The full syntax is:
> yarn rmadmin [-refreshQueues] [-refreshNodes [-g|graceful [timeout in 
> seconds] -client|server]] [-refreshNodesResources] 
> [-refreshSuperUserGroupsConfiguration] [-refreshUserToGroupsMappings] 
> [-refreshAdminAcls] [-refreshServiceAcl] [-getGroup [username]] 
> [-addToClusterNodeLabels 
> <"label1(exclusive=true),label2(exclusive=false),label3">] 
> [-removeFromClusterNodeLabels ] [-replaceLabelsOnNode 
> <"node1[:port]=label1,label2 node2[:port]=label1">] 
> [-directlyAccessNodeLabelStore] [-updateNodeResource [NodeID] [MemSize] 
> [vCores] ([OvercommitTimeout]) [-help [cmd]]
> {code}






[jira] [Commented] (YARN-4591) YARN Web UIs should provide a robots.txt

2016-09-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491928#comment-15491928
 ] 

Hadoop QA commented on YARN-4591:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
42s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
54s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 17s 
{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 16m 9s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12828563/YARN-4591.002.patch |
| JIRA Issue | YARN-4591 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 4b5bd7922dd6 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 2a8f55a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13108/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13108/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> YARN Web UIs should provide a robots.txt
> 
>
> Key: YARN-4591
> URL: https://issues.apache.org/jira/browse/YARN-4591
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Lars Francke
>Assignee: Sidharta Seethana
>Priority: Trivial
> Attachments: YARN-4591.001.patch, YARN-4591.002.patch
>
>
> To prevent well-behaved crawlers from indexing public YARN UIs.
> Similar to HDFS-330 / HDFS-9651.
> I took a quick look at the Webapp stuff in YARN and it looks 

[jira] [Commented] (YARN-4758) Enable discovery of AMs by containers

2016-09-14 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491898#comment-15491898
 ] 

Junping Du commented on YARN-4758:
--

Attached a design doc after discussing offline with Jian and Vinod. Comments 
are welcome from everyone watching this JIRA.
In the meanwhile, I am starting on a POC patch which targets working end to end 
with MAPREDUCE-6608 soon.

> Enable discovery of AMs by containers
> -
>
> Key: YARN-4758
> URL: https://issues.apache.org/jira/browse/YARN-4758
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Junping Du
> Attachments: YARN-4758. AM Discovery Service for YARN Container.pdf
>
>
> {color:red}
> This is already discussed on the umbrella JIRA YARN-1489.
> Copying some of my condensed summary from the design doc (section 3.2.10.3) 
> of YARN-4692.
> {color}
> Even after the existing work in Work-preserving AM restart (Section 3.1.2 / 
> YARN-1489), we still haven’t solved the problem of old running containers not 
> knowing where the new AM starts running after the previous AM crashes. This 
> is a specifically important problem to be solved for long running services 
> where we’d like to avoid killing service containers when AMs fail over. So 
> far, we left this as a task for the apps, but solving it in YARN is much 
> more desirable. (Task) This looks very much like service-registry (YARN-913), 
> but for app-containers to discover their own AMs.
> Combining this requirement (of any container being able to find their AM 
> across fail-overs) with those of services (to be able to find through DNS 
> where a service container is running - YARN-4757) will push our registry 
> scalability needs much higher than those of just service end-points. 
> This calls for a more distributed solution for registry readers - something 
> that is discussed in the comments section of YARN-1489 and MAPREDUCE-6608.
> See comment 
> https://issues.apache.org/jira/browse/YARN-1489?focusedCommentId=13862359=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13862359






[jira] [Updated] (YARN-4758) Enable discovery of AMs by containers

2016-09-14 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated YARN-4758:
-
Attachment: YARN-4758. AM Discovery Service for YARN Container.pdf

> Enable discovery of AMs by containers
> -
>
> Key: YARN-4758
> URL: https://issues.apache.org/jira/browse/YARN-4758
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Junping Du
> Attachments: YARN-4758. AM Discovery Service for YARN Container.pdf
>
>
> {color:red}
> This is already discussed on the umbrella JIRA YARN-1489.
> Copying some of my condensed summary from the design doc (section 3.2.10.3) 
> of YARN-4692.
> {color}
> Even after the existing work in Work-preserving AM restart (Section 3.1.2 / 
> YARN-1489), we still haven’t solved the problem of old running containers not 
> knowing where the new AM starts running after the previous AM crashes. This 
> is a specifically important problem to be solved for long running services 
> where we’d like to avoid killing service containers when AMs fail over. So 
> far, we left this as a task for the apps, but solving it in YARN is much 
> more desirable. (Task) This looks very much like service-registry (YARN-913), 
> but for app-containers to discover their own AMs.
> Combining this requirement (of any container being able to find their AM 
> across fail-overs) with those of services (to be able to find through DNS 
> where a service container is running - YARN-4757) will push our registry 
> scalability needs much higher than those of just service end-points. 
> This calls for a more distributed solution for registry readers - something 
> that is discussed in the comments section of YARN-1489 and MAPREDUCE-6608.
> See comment 
> https://issues.apache.org/jira/browse/YARN-1489?focusedCommentId=13862359=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13862359






[jira] [Updated] (YARN-4591) YARN Web UIs should provide a robots.txt

2016-09-14 Thread Sidharta Seethana (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sidharta Seethana updated YARN-4591:

Attachment: YARN-4591.002.patch

Uploaded a new patch that fixes the reported checkstyle issues. 

> YARN Web UIs should provide a robots.txt
> 
>
> Key: YARN-4591
> URL: https://issues.apache.org/jira/browse/YARN-4591
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Lars Francke
>Assignee: Sidharta Seethana
>Priority: Trivial
> Attachments: YARN-4591.001.patch, YARN-4591.002.patch
>
>
> To prevent well-behaved crawlers from indexing public YARN UIs.
> Similar to HDFS-330 / HDFS-9651.
> I took a quick look at the Webapp stuff in YARN and it looks complicated so I 
> can't provide a quick patch. If anyone can point me in the right direction I 
> might take a look.






[jira] [Commented] (YARN-4591) YARN Web UIs should provide a robots.txt

2016-09-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491854#comment-15491854
 ] 

Hadoop QA commented on YARN-4591:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
52s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
18s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
56s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 16s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common: The 
patch generated 2 new + 29 unchanged - 0 fixed = 31 total (was 29) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 27s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 1s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 16s 
{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 16m 20s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12828551/YARN-4591.001.patch |
| JIRA Issue | YARN-4591 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 0544216c7bf7 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 2a8f55a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/13107/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13107/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13107/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> YARN Web UIs should provide a robots.txt
> 
>
> Key: YARN-4591
> URL: https://issues.apache.org/jira/browse/YARN-4591
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Lars Francke
>Assignee: Sidharta Seethana
>  

[jira] [Commented] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp

2016-09-14 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491810#comment-15491810
 ] 

Daniel Templeton commented on YARN-3141:


Thanks for the patch, [~leftnoteasy].  I only did a superficial review.  I 
still need to look carefully at the locking and data access to make sure it's 
clean.

Comments:
* You axed the javadoc for {{SchedulerApplicationAttempt.isReserved()}}
* In {{SchedulerApplicationAttempt.showRequests()}}, the lock can be taken 
inside the test for whether debug is enabled (see the sketch after this list)
* In {{FSAppAttempt.unreserveInternal()}} and {{FSAppAttempt.allocate()}}, you 
could downgrade the lock before the logging at the end.
* In {{FSAppAttempt.resetAllowedLocalityLevel()}}, it would be better to do: 
{code}
NodeType old;

try {
  writeLock.lock();
  old = allowedLocalityLevel.put(schedulerKey, level);
} finally {
  writeLock.unlock();
}

LOG.info("Raising locality level from " + old + " to " + level + " at "
+ " priority " + schedulerKey.getPriority());
{code}  It doesn't look like the {{schedulerKey.getPriority()}} needs 
protection.
* In {{FSAppAttempt}} line 804 (I think) you deleted a space before a brace in 
the _else_
* It would be nice in the javadoc for all the methods that are no longer 
synchronized to note that they're MT safe.  That's a convention that exists 
nowhere else in YARN, but it should...
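
A minimal, self-contained sketch of the {{showRequests()}} suggestion above 
(hypothetical names, not the actual patch): check the debug flag first so the 
common non-debug path never touches the lock.

{code}
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch only; stands in for SchedulerApplicationAttempt.
class ShowRequestsSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private List<String> outstandingRequests;
  private boolean debugEnabled; // stands in for LOG.isDebugEnabled()

  void showRequests() {
    if (debugEnabled) {          // test first: skip the lock entirely when debug is off
      lock.readLock().lock();
      try {
        for (String req : outstandingRequests) {
          System.out.println("showRequests: " + req); // stands in for LOG.debug(...)
        }
      } finally {
        lock.readLock().unlock();
      }
    }
  }
}
{code}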


> Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
> --
>
> Key: YARN-3141
> URL: https://issues.apache.org/jira/browse/YARN-3141
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-3141.1.patch, YARN-3141.2.patch
>
>
> Enhance locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp, 
> as mentioned in YARN-3091, a possible solution is using read/write lock. 
> Other fine-grained locks for specific purposes / bugs should be addressed in 
> separated tickets.






[jira] [Commented] (YARN-4591) YARN Web UIs should provide a robots.txt

2016-09-14 Thread Sidharta Seethana (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491805#comment-15491805
 ] 

Sidharta Seethana commented on YARN-4591:
-

Submitted to Jenkins. [~wangda], could you please take a look? Thanks!

> YARN Web UIs should provide a robots.txt
> 
>
> Key: YARN-4591
> URL: https://issues.apache.org/jira/browse/YARN-4591
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Lars Francke
>Assignee: Sidharta Seethana
>Priority: Trivial
> Attachments: YARN-4591.001.patch
>
>
> To prevent well-behaved crawlers from indexing public YARN UIs.
> Similar to HDFS-330 / HDFS-9651.
> I took a quick look at the Webapp stuff in YARN and it looks complicated so I 
> can't provide a quick patch. If anyone can point me in the right direction I 
> might take a look.






[jira] [Updated] (YARN-4591) YARN Web UIs should provide a robots.txt

2016-09-14 Thread Sidharta Seethana (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sidharta Seethana updated YARN-4591:

Attachment: YARN-4591.001.patch

Uploaded a patch that provides a robots.txt for all YARN web UIs that disallows 
crawling. Added a unit test and tested this manually with the RM/NM web UIs. 
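
For reference, a crawl-everything-disallowed robots.txt is just the following 
two lines (the exact content served by the patch is an assumption here, not 
copied from the attachment):

{noformat}
User-agent: *
Disallow: /
{noformat}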

> YARN Web UIs should provide a robots.txt
> 
>
> Key: YARN-4591
> URL: https://issues.apache.org/jira/browse/YARN-4591
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Lars Francke
>Assignee: Sidharta Seethana
>Priority: Trivial
> Attachments: YARN-4591.001.patch
>
>
> To prevent well-behaved crawlers from indexing public YARN UIs.
> Similar to HDFS-330 / HDFS-9651.
> I took a quick look at the Webapp stuff in YARN and it looks complicated so I 
> can't provide a quick patch. If anyone can point me in the right direction I 
> might take a look.






[jira] [Comment Edited] (YARN-4329) Allow fetching exact reason as to why a submitted app is in ACCEPTED state in Fair Scheduler

2016-09-14 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491714#comment-15491714
 ] 

Yufei Gu edited comment on YARN-4329 at 9/14/16 11:14 PM:
--

Closing YARN-5563 to work on this one. There are several reasons why an app can 
be in the ACCEPTED state but not run in FairScheduler: 
(1) Exceed queue max apps, 
(2) Exceed user max apps, 
(3) Exceed queue maxResources, 
(4) Exceed maxAMShare.


was (Author: yufeigu):
Closing YARN-5563 to work on this one. There are several reasons why an app can 
be in the ACCEPTED state but not run in FairScheduler: 
(1) queue max apps, 
(2) user max apps, 
(3) queue maxResources, 
(4) maxAMShare.

> Allow fetching exact reason as to why a submitted app is in ACCEPTED state in 
> Fair Scheduler
> 
>
> Key: YARN-4329
> URL: https://issues.apache.org/jira/browse/YARN-4329
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Yufei Gu
>
> Similar to YARN-3946, it would be useful to capture the possible reason why 
> the application is in the ACCEPTED state in FairScheduler






[jira] [Commented] (YARN-5563) Add log messages for jobs in ACCEPTED state but not runnable.

2016-09-14 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491716#comment-15491716
 ] 

Yufei Gu commented on YARN-5563:


This duplicates YARN-4329; closing it to work on YARN-4329.

> Add log messages for jobs in ACCEPTED state but not runnable.
> -
>
> Key: YARN-5563
> URL: https://issues.apache.org/jira/browse/YARN-5563
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>  Labels: supportability
>
> Leaf queues maintain a list of runnable and non-runnable apps. FairScheduler 
> marks an app non-runnable for different reasons: exceeding the following 
> properties of the leaf queue:
> (1) queue max apps, 
> (2) user max apps, 
> (3) queue maxResources, 
> (4) maxAMShare. 
> It would be nice to log the reason an app isn't runnable. The first three are 
> easy to infer, but the last one (maxAMShare) is particularly hard. We are 
> going to log all of them and show the reason, if any, in the WebUI application 
> view.






[jira] [Commented] (YARN-4329) Allow fetching exact reason as to why a submitted app is in ACCEPTED state in Fair Scheduler

2016-09-14 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491714#comment-15491714
 ] 

Yufei Gu commented on YARN-4329:


Closing YARN-5563 to work on this one. There are several reasons why an app can 
be in the ACCEPTED state but not run in FairScheduler: 
(1) queue max apps, 
(2) user max apps, 
(3) queue maxResources, 
(4) maxAMShare.

> Allow fetching exact reason as to why a submitted app is in ACCEPTED state in 
> Fair Scheduler
> 
>
> Key: YARN-4329
> URL: https://issues.apache.org/jira/browse/YARN-4329
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Yufei Gu
>
> Similar to YARN-3946, it would be useful to capture the possible reason why 
> the application is in the ACCEPTED state in FairScheduler






[jira] [Resolved] (YARN-5563) Add log messages for jobs in ACCEPTED state but not runnable.

2016-09-14 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu resolved YARN-5563.

Resolution: Duplicate

> Add log messages for jobs in ACCEPTED state but not runnable.
> -
>
> Key: YARN-5563
> URL: https://issues.apache.org/jira/browse/YARN-5563
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Yufei Gu
>Assignee: Yufei Gu
>  Labels: supportability
>
> Leaf queues maintain a list of runnable and non-runnable apps. FairScheduler 
> marks an app non-runnable for different reasons: exceeding the following 
> properties of the leaf queue:
> (1) queue max apps, 
> (2) user max apps, 
> (3) queue maxResources, 
> (4) maxAMShare. 
> It would be nice to log the reason an app isn't runnable. The first three are 
> easy to infer, but the last one (maxAMShare) is particularly hard. We are 
> going to log all of them and show the reason, if any, in the WebUI application 
> view.






[jira] [Assigned] (YARN-4329) Allow fetching exact reason as to why a submitted app is in ACCEPTED state in Fair Scheduler

2016-09-14 Thread Yufei Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu reassigned YARN-4329:
--

Assignee: Yufei Gu

> Allow fetching exact reason as to why a submitted app is in ACCEPTED state in 
> Fair Scheduler
> 
>
> Key: YARN-4329
> URL: https://issues.apache.org/jira/browse/YARN-4329
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler, resourcemanager
>Reporter: Naganarasimha G R
>Assignee: Yufei Gu
>
> Similar to YARN-3946, it would be useful to capture the possible reason why 
> the application is in the ACCEPTED state in FairScheduler






[jira] [Commented] (YARN-5540) scheduler spends too much time looking at empty priorities

2016-09-14 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491697#comment-15491697
 ] 

Wangda Tan commented on YARN-5540:
--

+1 to latest patch, thanks [~jlowe].

> scheduler spends too much time looking at empty priorities
> --
>
> Key: YARN-5540
> URL: https://issues.apache.org/jira/browse/YARN-5540
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler, fairscheduler, resourcemanager
>Affects Versions: 2.7.2
>Reporter: Nathan Roberts
>Assignee: Jason Lowe
> Attachments: YARN-5540.001.patch, YARN-5540.002.patch, 
> YARN-5540.003.patch, YARN-5540.004.patch
>
>
> We're starting to see the capacity scheduler run out of scheduling horsepower 
> when running 500-1000 applications on clusters with 4K nodes or so.
> This seems to be amplified by TEZ applications. TEZ applications have many 
> more priorities (sometimes in the hundreds) than typical MR applications, and 
> therefore the loop in the scheduler which examines every priority within 
> every running application starts to be a hotspot. The priorities appear to 
> stay around forever, even when there is no remaining resource request at that 
> priority, causing us to spend a lot of time looking at nothing.
> jstack snippet:
> {noformat}
> "ResourceManager Event Processor" #28 prio=5 os_prio=0 tid=0x7fc2d453e800 
> nid=0x22f3 runnable [0x7fc2a8be2000]
>java.lang.Thread.State: RUNNABLE
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceRequest(SchedulerApplicationAttempt.java:210)
> - eliminated <0x0005e73e5dc0> (a 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:852)
> - locked <0x0005e73e5dc0> (a 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp)
> - locked <0x0003006fcf60> (a 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:527)
> - locked <0x0003001b22f8> (a 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:415)
> - locked <0x0003001b22f8> (a 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1224)
> - locked <0x000300041e40> (a 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler)
> {noformat}






[jira] [Commented] (YARN-3141) Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp

2016-09-14 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491683#comment-15491683
 ] 

Wangda Tan commented on YARN-3141:
--

Please refer to 
https://issues.apache.org/jira/browse/YARN-3139?focusedCommentId=15491678=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15491678
 for the performance test report.

> Improve locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp
> --
>
> Key: YARN-3141
> URL: https://issues.apache.org/jira/browse/YARN-3141
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-3141.1.patch, YARN-3141.2.patch
>
>
> Enhance locks in SchedulerApplicationAttempt/FSAppAttempt/FiCaSchedulerApp, 
> as mentioned in YARN-3091, a possible solution is using read/write lock. 
> Other fine-grained locks for specific purposes / bugs should be addressed in 
> separated tickets.






[jira] [Commented] (YARN-3139) Improve locks in AbstractYarnScheduler/CapacityScheduler/FairScheduler

2016-09-14 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491678#comment-15491678
 ] 

Wangda Tan commented on YARN-3139:
--

I just tried to use SLS to benchmark the performance after applying the patches 
YARN-3139/YARN-3140/YARN-3141. I created a mini framework to run the comparison 
tests: https://github.com/leftnoteasy/yarn_application_synthesizer.

Because of some unknown issue, I cannot run the benchmark with FairScheduler; 
filed YARN-5654. Capacity Scheduler runs fine.

Overview of test data: 
- 3000+ applications, ~1000 applications running in parallel
- 3000 NMs, each node configured 128G capacity
- Size of containers ranges from 8G to 32G
- Lifespan of containers ranges from 5 sec to 2 mins
- Total 245,301 containers launched
- Total time spent on the test is < 10 mins (can increase #apps / #containers 
to increase the total time); thus, around 500 containers are allocated per 
second.

For original CS, time:
{code}
real8m51.774s
user8m41.998s
sys 1m28.158s
{code}

For CS with R/W locking changes:
{code}
real8m48.426s
user8m31.398s
sys 1m29.861s
{code}

Summary:
- No regression in performance; no deadlocks observed.
- No significant performance improvement either, because existing scheduler 
allocation still runs in a single thread.

Hope this can help with code review.

cc: [~jianhe], [~kasha], [~templedf], [~vinodkv], [~jlowe].
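
For context, the kind of change being benchmarked is roughly the following: 
replace a synchronized method with an explicit read/write lock so that 
read-mostly callers no longer serialize behind writers. This is a hypothetical 
sketch under that assumption, not code from the patches:

{code}
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the synchronized -> read/write lock conversion.
class SchedulerStateSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private long allocatedMB;

  // before: synchronized long getAllocatedMB() { return allocatedMB; }
  long getAllocatedMB() {
    lock.readLock().lock();
    try {
      return allocatedMB;
    } finally {
      lock.readLock().unlock();
    }
  }

  // before: synchronized void allocate(long mb) { allocatedMB += mb; }
  void allocate(long mb) {
    lock.writeLock().lock();
    try {
      allocatedMB += mb;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}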

> Improve locks in AbstractYarnScheduler/CapacityScheduler/FairScheduler
> --
>
> Key: YARN-3139
> URL: https://issues.apache.org/jira/browse/YARN-3139
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager, scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-3139.0.patch
>
>
> Enhance locks in AbstractYarnScheduler/CapacityScheduler/FairScheduler, as 
> mentioned in YARN-3091, a possible solution is using read/write lock. Other 
> fine-grained locks for specific purposes / bugs should be addressed in 
> separated tickets.






[jira] [Updated] (YARN-5654) Not be able to run SLS with FairScheduler

2016-09-14 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-5654:
-
Issue Type: Sub-task  (was: Bug)
Parent: YARN-5065

> Not be able to run SLS with FairScheduler
> -
>
> Key: YARN-5654
> URL: https://issues.apache.org/jira/browse/YARN-5654
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>
> With the config:
> https://github.com/leftnoteasy/yarn_application_synthesizer/tree/master/configs/hadoop-conf-fs
> And data:
> https://github.com/leftnoteasy/yarn_application_synthesizer/tree/master/data/scheduler-load-test-data
> Capacity Scheduler runs fine, but Fair Scheduler cannot be run successfully. 
> It reports an NPE from RMAppAttemptImpl.






[jira] [Created] (YARN-5654) Not be able to run SLS with FairScheduler

2016-09-14 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-5654:


 Summary: Not be able to run SLS with FairScheduler
 Key: YARN-5654
 URL: https://issues.apache.org/jira/browse/YARN-5654
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Wangda Tan


With the config:
https://github.com/leftnoteasy/yarn_application_synthesizer/tree/master/configs/hadoop-conf-fs

And data:
https://github.com/leftnoteasy/yarn_application_synthesizer/tree/master/data/scheduler-load-test-data

Capacity Scheduler runs fine, but Fair Scheduler cannot be run successfully. It 
reports an NPE from RMAppAttemptImpl.






[jira] [Commented] (YARN-5163) Migrate TestClientToAMTokens and TestClientRMTokens tests from the old RPC engine

2016-09-14 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491651#comment-15491651
 ] 

Kai Zheng commented on YARN-5163:
-

Thanks [~zhouwei] for the update and thorough tests in a cluster. 

+1 on the latest patch; I will commit it in a few days unless there are 
concerns.

> Migrate TestClientToAMTokens and TestClientRMTokens tests from the old RPC 
> engine
> -
>
> Key: YARN-5163
> URL: https://issues.apache.org/jira/browse/YARN-5163
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Arun Suresh
>Assignee: Wei Zhou
> Attachments: YARN-5163-v1.patch, YARN-5163-v2.patch
>
>
> The above testcases fail due to a {{NullPointerException}} and a "Cannot bind 
> to port" error, both of which should be fixed.






[jira] [Updated] (YARN-5163) Migrate TestClientToAMTokens and TestClientRMTokens tests from the old RPC engine

2016-09-14 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated YARN-5163:

Summary: Migrate TestClientToAMTokens and TestClientRMTokens tests from the 
old RPC engine  (was: Fix TestClientToAMTokens and TestClientRMTokens)

> Migrate TestClientToAMTokens and TestClientRMTokens tests from the old RPC 
> engine
> -
>
> Key: YARN-5163
> URL: https://issues.apache.org/jira/browse/YARN-5163
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Arun Suresh
>Assignee: Wei Zhou
> Attachments: YARN-5163-v1.patch, YARN-5163-v2.patch
>
>
> The above testcases fail due to a {{NullPointerException}} and a "Cannot bind 
> to port" error, both of which should be fixed.






[jira] [Commented] (YARN-5621) Support LinuxContainerExecutor to create symlinks for continuously localized resources

2016-09-14 Thread Chris Douglas (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491452#comment-15491452
 ] 

Chris Douglas commented on YARN-5621:
-

bq. This may be a viable approach, we need to change the localizer heartbeat to 
send the symlink path.
The heartbeat already carries a payload with commands to the localizer. 
Including actions to symlink resources already fetched isn't that dire a change 
to either the ContainerLocalizer or the resource state machine, is it? The 
transition needs to send a LINK request to all localizers that were waiting in 
case the download failed.

bq. But if we want to create all symlinks in one go, this approach will not 
work.
This isn't going to be a transaction on the FS regardless, but can you explain 
this requirement? If symlink-on-download is disqualifying, then the container 
could still coordinate grouped symlinks by grouping LINK requests to a 
localizer. It rearranges the event flows awkwardly, but it's supportable...

> Support LinuxContainerExecutor to create symlinks for continuously localized 
> resources
> --
>
> Key: YARN-5621
> URL: https://issues.apache.org/jira/browse/YARN-5621
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-5621.1.patch, YARN-5621.2.patch, YARN-5621.3.patch, 
> YARN-5621.4.patch, YARN-5621.5.patch
>
>
> When new resources are localized, new symlink needs to be created for the 
> localized resource. This is the change for the LinuxContainerExecutor to 
> create the symlinks.






[jira] [Commented] (YARN-5079) [Umbrella] Native YARN framework layer for services and beyond

2016-09-14 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491388#comment-15491388
 ] 

Arun Suresh commented on YARN-5079:
---

Thanks for the reply, [~gsaha] and [~jianhe].
In that case, it would be nice to hear your thoughts on SLIDER-1167 
specifically. It talks about enhancements in terms of a pluggable module in 
both the Slider AM and the agent. Would I be right in assuming that changes 
made to the Slider AM would be ported to YARN initially, and that the Slider 
agent changes might live in Slider until those specific APIs are supported by 
the YARN NM (via YARN-5593, YARN-4726 and YARN-1503)? Also, are new releases of 
Slider going to contain only bug fixes, or are there any specific new features 
expected?

> [Umbrella] Native YARN framework layer for services and beyond
> --
>
> Key: YARN-5079
> URL: https://issues.apache.org/jira/browse/YARN-5079
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>
> (See overview doc at YARN-4692, modifying and copy-pasting some of the 
> relevant pieces and sub-section 3.3.1 to track the specific sub-item.)
> (This is a companion to YARN-4793 in our effort to simplify the entire story, 
> but focusing on APIs)
> So far, YARN by design has restricted itself to having a very low-level API 
> that can support any type of application. Frameworks like Apache Hadoop 
> MapReduce, Apache Tez, Apache Spark, Apache REEF, Apache Twill, Apache Helix 
> and others ended up exposing higher level APIs that end-users can directly 
> leverage to build their applications on top of YARN. On the services side, 
> Apache Slider has done something similar.
> With our current attention on making services first-class and simplified, 
> it's time to take a fresh look at how we can make Apache Hadoop YARN support 
> services well out of the box. Beyond the functionality that I outlined in the 
> previous sections in the doc on how NodeManagers can be enhanced to help 
> services, the biggest missing piece is the framework itself. There is a lot 
> of very important functionality that a services' framework can own together 
> with YARN in executing services end-to-end.
> In this JIRA I propose we look at having a native Apache Hadoop framework for 
> running services natively on YARN.






[jira] [Commented] (YARN-3692) Allow REST API to set a user generated message when killing an application

2016-09-14 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491261#comment-15491261
 ] 

Naganarasimha G R commented on YARN-3692:
-

Hi [~rohithsharma],
Overall the approach is fine, but a few more nits:
# YarnClient, line nos. 186-187: {{String diagnosis}} is used; better to sync 
the terminology everywhere to the same lingo.
# Similarly in YarnClientImpl, line no. 412.
# ClientRMService, line nos. 783 & 787: better to use append instead of "+" 
(see the sketch below).
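
A minimal sketch of the append suggestion in the last nit (the message text and 
variable names here are hypothetical, not the actual ClientRMService change):

{code}
// Hypothetical sketch: build the kill diagnostics with StringBuilder.append
// instead of repeated String "+" concatenation.
public class KillDiagnosticsSketch {
  static String buildDiagnostics(String user, String userMessage) {
    StringBuilder sb = new StringBuilder();
    sb.append("Application killed by user ").append(user);
    if (userMessage != null && !userMessage.isEmpty()) {
      sb.append(". Diagnostic message: ").append(userMessage);
    }
    return sb.toString();
  }
}
{code}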



> Allow REST API to set a user generated message when killing an application
> --
>
> Key: YARN-3692
> URL: https://issues.apache.org/jira/browse/YARN-3692
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Rajat Jain
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-3692.patch, 0002-YARN-3692.patch, 
> 0003-YARN-3692.patch, 0004-YARN-3692.patch
>
>
> Currently YARN's REST API supports killing an application without setting a 
> diagnostic message. It would be good to provide that support.
> *Use Case*
> Usually this helps in workflow management in a multi-tenant environment when 
> the workflow scheduler (or the hadoop admin) wants to kill a job - and let 
> the user know the reason why the job was killed. Killing the job by setting a 
> diagnostic message is a very good solution for that. Ideally, we can set the 
> diagnostic message on all such interface:
> yarn kill -applicationId ... -diagnosticMessage "some message added by 
> admin/workflow"
> REST API { 'state': 'KILLED', 'diagnosticMessage': 'some message added by 
> admin/workflow'}






[jira] [Updated] (YARN-4945) [Umbrella] Capacity Scheduler Preemption Within a queue

2016-09-14 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-4945:
--
Attachment: YARN-2009.v1.patch

Thanks for the comments. Updating the v1 patch. 

There are a couple of TODOs in this patch:
- Test class needs some more modifications
- Use reservation logic
- If all apps are at the same priority, the current logic still does 
preemption. This needs to be blocked, and I fixed it; however, a detailed UT 
case is still to be added, hence it is kept pending for this version of the 
patch.

[~leftnoteasy], please help check the latest code/approach. Also looping in 
[~eepayne].

> [Umbrella] Capacity Scheduler Preemption Within a queue
> ---
>
> Key: YARN-4945
> URL: https://issues.apache.org/jira/browse/YARN-4945
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
> Attachments: Intra-Queue Preemption Use Cases.pdf, 
> IntraQueuepreemption-CapacityScheduler (Design).pdf, YARN-2009-wip.2.patch, 
> YARN-2009-wip.patch, YARN-2009-wip.v3.patch, YARN-2009.v0.patch, 
> YARN-2009.v1.patch
>
>
> This is umbrella ticket to track efforts of preemption within a queue to 
> support features like:
> YARN-2009. YARN-2113. YARN-4781.






[jira] [Commented] (YARN-5642) Typos in 11 log messages

2016-09-14 Thread Mehran Hassani (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491069#comment-15491069
 ] 

Mehran Hassani commented on YARN-5642:
--

It's currently not in our toolset, since we extract log statements from source 
code and they may not form complete sentences. But we will try to detect 
grammar issues later.

> Typos in 11 log messages 
> -
>
> Key: YARN-5642
> URL: https://issues.apache.org/jira/browse/YARN-5642
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Mehran Hassani
>Priority: Trivial
>  Labels: newbie
>
> I am conducting research on log-related bugs. I tried to make a tool to fix 
> repetitive yet simple patterns of bugs that are related to logs. Typos in log 
> messages are one of the recurring bugs. Therefore, I made a tool to find typos 
> in log statements. During my experiments, I managed to find the following 
> typos in Hadoop YARN:
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java,
>  LOG.info("AsyncDispatcher is draining to stop  igonring any new events."), 
> igonring should be ignoring
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/YarnAuthorizationProvider.java,
>  LOG.info(authorizerClass.getName() + " is instiantiated."), 
> instiantiated should be instantiated
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java,
>  LOG.info("AsyncDispatcher is draining to stop  igonring any new events."), 
> igonring should be ignoring
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/YarnAuthorizationProvider.java,
>  LOG.info(authorizerClass.getName() + " is instiantiated."),  
> instiantiated should be instantiated
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/FileSystemApplicationHistoryStore.java,
>  LOG.info("Completed reading history information of all conatiners"+ " of 
> application attempt " + appAttemptId), 
> conatiners should be containers
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java,
>  LOG.info("Neither virutal-memory nor physical-memory monitoring is " 
> +"needed. Not running the monitor-thread"), 
> virutal should be virtual
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java,
>  LOG.info("Intialized plan {} based on reservable queue {}" plan.toString()  
> planQueueName), 
> Intialized should be Initialized
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java,
>  LOG.info("Initializing " + queueName + "\n" +"capacity = " + 
> queueCapacities.getCapacity() +" [= (float) configuredCapacity / 100 ]" + 
> "\n" +"asboluteCapacity = " + queueCapacities.getAbsoluteCapacity() +" [= 
> parentAbsoluteCapacity * capacity ]" + "\n" +"maxCapacity = " + 
> queueCapacities.getMaximumCapacity() +" [= configuredMaxCapacity ]" + "\n" 
> +"absoluteMaxCapacity = " + queueCapacities.getAbsoluteMaximumCapacity() +" 
> [= 1.0 maximumCapacity undefined  " +"(parentAbsoluteMaxCapacity * 
> maximumCapacity) / 100 otherwise ]" +"\n" +"userLimit = " + userLimit +" [= 
> configuredUserLimit ]" + "\n" +"userLimitFactor = " + userLimitFactor +" [= 
> configuredUserLimitFactor ]" + "\n" +"maxApplications = " + maxApplications 
> +" [= configuredMaximumSystemApplicationsPerQueue or" +" 
> (int)(configuredMaximumSystemApplications * absoluteCapacity)]" +"\n" 
> +"maxApplicationsPerUser = " + maxApplicationsPerUser +" [= 
> (int)(maxApplications * (userLimit / 100.0f) * " +"userLimitFactor) ]" + "\n" 
> +"usedCapacity = " + queueCapacities.getUsedCapacity() +" [= 
> usedResourcesMemory / " +"(clusterResourceMemory * absoluteCapacity)]" + "\n" 
> +"absoluteUsedCapacity = " + absoluteUsedCapacity +" [= usedResourcesMemory / 
> clusterResourceMemory]" + "\n" +"maxAMResourcePerQueuePercent = " + 
> maxAMResourcePerQueuePercent +" [= configuredMaximumAMResourcePercent ]" + 
> "\n" +"minimumAllocationFactor = " + minimumAllocationFactor +" [= 
> (float)(maximumAllocationMemory - minimumAllocationMemory) / " 
> +"maximumAllocationMemory ]" + "\n" +"maximumAllocation = " + 
> maximumAllocation +" [= configuredMaxAllocation ]" + "\n" +"numContainers = " 
> + 

[jira] [Commented] (YARN-5642) Typos in 11 log messages

2016-09-14 Thread Mehran Hassani (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491068#comment-15491068
 ] 

Mehran Hassani commented on YARN-5642:
--

It's currently not in our toolset, since we extract log statements from source code 
and they may not form complete sentences. But we will try to detect grammar issues later.

> Typos in 11 log messages 
> -
>
> Key: YARN-5642
> URL: https://issues.apache.org/jira/browse/YARN-5642
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Mehran Hassani
>Priority: Trivial
>  Labels: newbie
>
> I am conducting research on log-related bugs. I tried to make a tool to fix 
> repetitive yet simple patterns of bugs that are related to logs. Typos in log 
> messages are one of the recurring bugs. Therefore, I made a tool to find typos 
> in log statements. During my experiments, I managed to find the following 
> typos in Hadoop YARN:
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java,
>  LOG.info("AsyncDispatcher is draining to stop  igonring any new events."), 
> igonring should be ignoring
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/YarnAuthorizationProvider.java,
>  LOG.info(authorizerClass.getName() + " is instiantiated."), 
> instiantiated should be instantiated
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java,
>  LOG.info("AsyncDispatcher is draining to stop  igonring any new events."), 
> igonring should be ignoring
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/YarnAuthorizationProvider.java,
>  LOG.info(authorizerClass.getName() + " is instiantiated."),  
> instiantiated should be instantiated
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/FileSystemApplicationHistoryStore.java,
>  LOG.info("Completed reading history information of all conatiners"+ " of 
> application attempt " + appAttemptId), 
> conatiners should be containers
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java,
>  LOG.info("Neither virutal-memory nor physical-memory monitoring is " 
> +"needed. Not running the monitor-thread"), 
> virutal should be virtual
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java,
>  LOG.info("Intialized plan {} based on reservable queue {}" plan.toString()  
> planQueueName), 
> Intialized should be Initialized
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java,
>  LOG.info("Initializing " + queueName + "\n" +"capacity = " + 
> queueCapacities.getCapacity() +" [= (float) configuredCapacity / 100 ]" + 
> "\n" +"asboluteCapacity = " + queueCapacities.getAbsoluteCapacity() +" [= 
> parentAbsoluteCapacity * capacity ]" + "\n" +"maxCapacity = " + 
> queueCapacities.getMaximumCapacity() +" [= configuredMaxCapacity ]" + "\n" 
> +"absoluteMaxCapacity = " + queueCapacities.getAbsoluteMaximumCapacity() +" 
> [= 1.0 maximumCapacity undefined  " +"(parentAbsoluteMaxCapacity * 
> maximumCapacity) / 100 otherwise ]" +"\n" +"userLimit = " + userLimit +" [= 
> configuredUserLimit ]" + "\n" +"userLimitFactor = " + userLimitFactor +" [= 
> configuredUserLimitFactor ]" + "\n" +"maxApplications = " + maxApplications 
> +" [= configuredMaximumSystemApplicationsPerQueue or" +" 
> (int)(configuredMaximumSystemApplications * absoluteCapacity)]" +"\n" 
> +"maxApplicationsPerUser = " + maxApplicationsPerUser +" [= 
> (int)(maxApplications * (userLimit / 100.0f) * " +"userLimitFactor) ]" + "\n" 
> +"usedCapacity = " + queueCapacities.getUsedCapacity() +" [= 
> usedResourcesMemory / " +"(clusterResourceMemory * absoluteCapacity)]" + "\n" 
> +"absoluteUsedCapacity = " + absoluteUsedCapacity +" [= usedResourcesMemory / 
> clusterResourceMemory]" + "\n" +"maxAMResourcePerQueuePercent = " + 
> maxAMResourcePerQueuePercent +" [= configuredMaximumAMResourcePercent ]" + 
> "\n" +"minimumAllocationFactor = " + minimumAllocationFactor +" [= 
> (float)(maximumAllocationMemory - minimumAllocationMemory) / " 
> +"maximumAllocationMemory ]" + "\n" +"maximumAllocation = " + 
> maximumAllocation +" [= configuredMaxAllocation ]" + "\n" +"numContainers = " 
> + 

[jira] [Issue Comment Deleted] (YARN-5642) Typos in 11 log messages

2016-09-14 Thread Mehran Hassani (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mehran Hassani updated YARN-5642:
-
Comment: was deleted

(was: It's currently not in our toolset since we find logs from source code 
that may not have the complete sentence. But we will try to detect grammar 
issues later.)

> Typos in 11 log messages 
> -
>
> Key: YARN-5642
> URL: https://issues.apache.org/jira/browse/YARN-5642
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Mehran Hassani
>Priority: Trivial
>  Labels: newbie
>
> I am conducting research on log-related bugs. I tried to make a tool to fix 
> repetitive yet simple patterns of bugs that are related to logs. Typos in log 
> messages are one of the recurring bugs. Therefore, I made a tool to find typos 
> in log statements. During my experiments, I managed to find the following 
> typos in Hadoop YARN:
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java,
>  LOG.info("AsyncDispatcher is draining to stop  igonring any new events."), 
> igonring should be ignoring
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/YarnAuthorizationProvider.java,
>  LOG.info(authorizerClass.getName() + " is instiantiated."), 
> instiantiated should be instantiated
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java,
>  LOG.info("AsyncDispatcher is draining to stop  igonring any new events."), 
> igonring should be ignoring
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/YarnAuthorizationProvider.java,
>  LOG.info(authorizerClass.getName() + " is instiantiated."),  
> instiantiated should be instantiated
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/FileSystemApplicationHistoryStore.java,
>  LOG.info("Completed reading history information of all conatiners"+ " of 
> application attempt " + appAttemptId), 
> conatiners should be containers
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java,
>  LOG.info("Neither virutal-memory nor physical-memory monitoring is " 
> +"needed. Not running the monitor-thread"), 
> virutal should be virtual
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java,
>  LOG.info("Intialized plan {} based on reservable queue {}" plan.toString()  
> planQueueName), 
> Intialized should be Initialized
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java,
>  LOG.info("Initializing " + queueName + "\n" +"capacity = " + 
> queueCapacities.getCapacity() +" [= (float) configuredCapacity / 100 ]" + 
> "\n" +"asboluteCapacity = " + queueCapacities.getAbsoluteCapacity() +" [= 
> parentAbsoluteCapacity * capacity ]" + "\n" +"maxCapacity = " + 
> queueCapacities.getMaximumCapacity() +" [= configuredMaxCapacity ]" + "\n" 
> +"absoluteMaxCapacity = " + queueCapacities.getAbsoluteMaximumCapacity() +" 
> [= 1.0 maximumCapacity undefined  " +"(parentAbsoluteMaxCapacity * 
> maximumCapacity) / 100 otherwise ]" +"\n" +"userLimit = " + userLimit +" [= 
> configuredUserLimit ]" + "\n" +"userLimitFactor = " + userLimitFactor +" [= 
> configuredUserLimitFactor ]" + "\n" +"maxApplications = " + maxApplications 
> +" [= configuredMaximumSystemApplicationsPerQueue or" +" 
> (int)(configuredMaximumSystemApplications * absoluteCapacity)]" +"\n" 
> +"maxApplicationsPerUser = " + maxApplicationsPerUser +" [= 
> (int)(maxApplications * (userLimit / 100.0f) * " +"userLimitFactor) ]" + "\n" 
> +"usedCapacity = " + queueCapacities.getUsedCapacity() +" [= 
> usedResourcesMemory / " +"(clusterResourceMemory * absoluteCapacity)]" + "\n" 
> +"absoluteUsedCapacity = " + absoluteUsedCapacity +" [= usedResourcesMemory / 
> clusterResourceMemory]" + "\n" +"maxAMResourcePerQueuePercent = " + 
> maxAMResourcePerQueuePercent +" [= configuredMaximumAMResourcePercent ]" + 
> "\n" +"minimumAllocationFactor = " + minimumAllocationFactor +" [= 
> (float)(maximumAllocationMemory - minimumAllocationMemory) / " 
> +"maximumAllocationMemory ]" + "\n" +"maximumAllocation = " + 
> maximumAllocation +" [= configuredMaxAllocation ]" + "\n" +"numContainers = " 
> + numContainers 

[jira] [Commented] (YARN-5642) Typos in 11 log messages

2016-09-14 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15491055#comment-15491055
 ] 

Ravi Prakash commented on YARN-5642:


Interesting research Mehran! Thanks for your contribution. In addition to 
typos, does your research also discover grammatical errors?

> Typos in 11 log messages 
> -
>
> Key: YARN-5642
> URL: https://issues.apache.org/jira/browse/YARN-5642
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Mehran Hassani
>Priority: Trivial
>  Labels: newbie
>
> I am conducting research on log-related bugs. I tried to make a tool to fix 
> repetitive yet simple patterns of bugs that are related to logs. Typos in log 
> messages are one of the recurring bugs. Therefore, I made a tool to find typos 
> in log statements. During my experiments, I managed to find the following 
> typos in Hadoop YARN:
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java,
>  LOG.info("AsyncDispatcher is draining to stop  igonring any new events."), 
> igonring should be ignoring
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/YarnAuthorizationProvider.java,
>  LOG.info(authorizerClass.getName() + " is instiantiated."), 
> instiantiated should be instantiated
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/event/AsyncDispatcher.java,
>  LOG.info("AsyncDispatcher is draining to stop  igonring any new events."), 
> igonring should be ignoring
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/YarnAuthorizationProvider.java,
>  LOG.info(authorizerClass.getName() + " is instiantiated."),  
> instiantiated should be instantiated
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/FileSystemApplicationHistoryStore.java,
>  LOG.info("Completed reading history information of all conatiners"+ " of 
> application attempt " + appAttemptId), 
> conatiners should be containers
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java,
>  LOG.info("Neither virutal-memory nor physical-memory monitoring is " 
> +"needed. Not running the monitor-thread"), 
> virutal should be virtual
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/reservation/AbstractReservationSystem.java,
>  LOG.info("Intialized plan {} based on reservable queue {}" plan.toString()  
> planQueueName), 
> Intialized should be Initialized
> In file 
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java,
>  LOG.info("Initializing " + queueName + "\n" +"capacity = " + 
> queueCapacities.getCapacity() +" [= (float) configuredCapacity / 100 ]" + 
> "\n" +"asboluteCapacity = " + queueCapacities.getAbsoluteCapacity() +" [= 
> parentAbsoluteCapacity * capacity ]" + "\n" +"maxCapacity = " + 
> queueCapacities.getMaximumCapacity() +" [= configuredMaxCapacity ]" + "\n" 
> +"absoluteMaxCapacity = " + queueCapacities.getAbsoluteMaximumCapacity() +" 
> [= 1.0 maximumCapacity undefined  " +"(parentAbsoluteMaxCapacity * 
> maximumCapacity) / 100 otherwise ]" +"\n" +"userLimit = " + userLimit +" [= 
> configuredUserLimit ]" + "\n" +"userLimitFactor = " + userLimitFactor +" [= 
> configuredUserLimitFactor ]" + "\n" +"maxApplications = " + maxApplications 
> +" [= configuredMaximumSystemApplicationsPerQueue or" +" 
> (int)(configuredMaximumSystemApplications * absoluteCapacity)]" +"\n" 
> +"maxApplicationsPerUser = " + maxApplicationsPerUser +" [= 
> (int)(maxApplications * (userLimit / 100.0f) * " +"userLimitFactor) ]" + "\n" 
> +"usedCapacity = " + queueCapacities.getUsedCapacity() +" [= 
> usedResourcesMemory / " +"(clusterResourceMemory * absoluteCapacity)]" + "\n" 
> +"absoluteUsedCapacity = " + absoluteUsedCapacity +" [= usedResourcesMemory / 
> clusterResourceMemory]" + "\n" +"maxAMResourcePerQueuePercent = " + 
> maxAMResourcePerQueuePercent +" [= configuredMaximumAMResourcePercent ]" + 
> "\n" +"minimumAllocationFactor = " + minimumAllocationFactor +" [= 
> (float)(maximumAllocationMemory - minimumAllocationMemory) / " 
> +"maximumAllocationMemory ]" + "\n" +"maximumAllocation = " + 
> maximumAllocation +" [= configuredMaxAllocation ]" + "\n" +"numContainers = " 
> + numContainers +" [= 

[jira] [Commented] (YARN-5649) Add REST endpoints for updating application timeouts

2016-09-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490940#comment-15490940
 ] 

Sunil G commented on YARN-5649:
---

Thanks [~rohithsharma] for raising this.

We also need read-only getter APIs here. This can help in building a better UI 
w.r.t. timeouts.

For example, if we can have something like below:
{code}
ApplicationTimeout 
 ==>  configuredTimeout
 ==>  timeToExpire
{code}

This can give us a picture of how much more time is left before the app expires 
if it does not complete. Thoughts?
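
For illustration, a minimal sketch of such a read-only view, with hypothetical 
class and method names (the actual records and REST payload would be defined by 
the patch):
{code}
// Hypothetical read-only timeout view for a GET endpoint; names are illustrative only.
public class ApplicationTimeoutInfo {
  private final String timeoutType;             // e.g. LIFETIME
  private final long configuredTimeoutSeconds;  // value the client configured
  private final long remainingTimeSeconds;      // time left before the app expires

  public ApplicationTimeoutInfo(String timeoutType, long configuredTimeoutSeconds,
      long remainingTimeSeconds) {
    this.timeoutType = timeoutType;
    this.configuredTimeoutSeconds = configuredTimeoutSeconds;
    this.remainingTimeSeconds = remainingTimeSeconds;
  }

  public String getTimeoutType() { return timeoutType; }
  public long getConfiguredTimeoutSeconds() { return configuredTimeoutSeconds; }
  public long getRemainingTimeSeconds() { return remainingTimeSeconds; }
}
{code}
The UI could then poll the remaining time to show a countdown next to each running application.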




> Add REST endpoints for updating application timeouts
> 
>
> Key: YARN-5649
> URL: https://issues.apache.org/jira/browse/YARN-5649
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5611) Provide an API to update lifetime of an application.

2016-09-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490929#comment-15490929
 ] 

Sunil G commented on YARN-5611:
---

Yes..

Adding a few more thoughts:

- It makes sense to have this class for dynamic/runtime updates. So if we want 
to change the priority/timeout of an app, a common 
{{ApplicationAttributes}} class can cover all such common app-related 
attributes.
- However, at submission time it is better to have these APIs separately for 
priority and timeout. We use the existing {{setPriority}} API to achieve this, and 
hence it is better to keep these APIs separate in AppSubmissionContext.
- For the REST API, we have {{updateApplicationPriority}} to set the priority of an 
app. Thinking out loud, it will be easier for clients to have separate REST APIs 
for priority and timeout. So I think we can discuss and confirm the REST side 
changes.

Thoughts?
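
A rough sketch of the split being suggested, purely illustrative (the class and 
interface names below are hypothetical, not an existing YARN API):
{code}
// Runtime updates: one common holder so a single update call can carry whichever
// attributes actually changed; null means "leave unchanged".
class ApplicationAttributes {
  Integer priority;        // null => priority not being updated
  Long lifetimeSeconds;    // null => timeout not being updated
}

// Hypothetical client-facing surface, sketched for discussion only.
interface ApplicationUpdateClient {
  // Dynamic/runtime path: one call for all updatable attributes.
  void updateApplicationAttributes(String applicationId, ApplicationAttributes attrs);
}

// Submission time would keep separate, explicit setters (setPriority exists today;
// a timeout setter would sit alongside it on the submission context).
{code}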

> Provide an API to update lifetime of an application.
> 
>
> Key: YARN-5611
> URL: https://issues.apache.org/jira/browse/YARN-5611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5611.v0.patch
>
>
> With YARN-4205, the lifetime of an application is monitored if required. 
> Add a client API to update the lifetime of an application. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5635) Better handling when bad script is configured as Node's HealthScript

2016-09-14 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490825#comment-15490825
 ] 

Vinod Kumar Vavilapalli commented on YARN-5635:
---

Sure, you can respond instead by not unilaterally reverting patches in the 
future. And everyone will be happy.

> Better handling when bad script is configured as Node's HealthScript
> 
>
> Key: YARN-5635
> URL: https://issues.apache.org/jira/browse/YARN-5635
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Allen Wittenauer
>Assignee: Yufei Gu
>
> The earlier fix for YARN-5567 was reverted because it is not ideal to bring the 
> whole cluster down because of a bad script. At the same time, it is important to 
> report that the script configured as the node health script is erroneous, as 
> otherwise we might miss detecting bad health of a node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-5619) Provide way to limit MRJob's stdout/stderr size

2016-09-14 Thread Aleksandr Balitsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Balitsky resolved YARN-5619.
--
Resolution: Duplicate

> Provide way to limit MRJob's stdout/stderr size
> ---
>
> Key: YARN-5619
> URL: https://issues.apache.org/jira/browse/YARN-5619
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: log-aggregation, nodemanager
>Affects Versions: 2.7.0
>Reporter: Aleksandr Balitsky
>Priority: Minor
>
> We can run a job that produces a huge amount of stdout/stderr, causing undesired 
> consequences. There is already a JIRA that has been open for a while now:
> https://issues.apache.org/jira/browse/YARN-2231
> A possible solution is to redirect stdout's and stderr's output to log4j in 
> YarnChild.java's main method via:
> System.setErr( new PrintStream( new LoggingOutputStream( , 
> Level.ERROR ), true));
> System.setOut( new PrintStream( new LoggingOutputStream( , 
> Level.INFO ), true));
> In this case System.out and System.err will be redirected to a log4j logger 
> with an appropriate appender that directs output to the stderr or stdout files 
> with the needed size limitation. 
> Advantages of such a solution:
> - it allows us to restrict file sizes during job execution.
> Disadvantages:
> - it will work only for MR jobs.
> - logs are stored in memory and are flushed to disk only after the job 
> finishes (syslog works the same way), so we can lose logs if the container is 
> killed or fails.
> Is this an appropriate solution for this problem, or is there something 
> better?
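
For illustration, a minimal sketch of such a redirection, assuming a hand-rolled 
line-buffering LoggingOutputStream (the classes below are hypothetical, not the 
exact ones that would go into YarnChild):
{code}
import java.io.OutputStream;
import java.io.PrintStream;
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

// Buffers bytes until a newline, then emits the line to a log4j logger.
class LoggingOutputStream extends OutputStream {
  private final Logger logger;
  private final Level level;
  private final StringBuilder buffer = new StringBuilder();

  LoggingOutputStream(Logger logger, Level level) {
    this.logger = logger;
    this.level = level;
  }

  @Override
  public synchronized void write(int b) {
    if (b == '\n') {
      logger.log(level, buffer.toString());
      buffer.setLength(0);
    } else {
      buffer.append((char) b);
    }
  }
}

public class StdStreamRedirect {
  public static void main(String[] args) {
    Logger log = Logger.getLogger("container.stdio");
    // A size-limited appender (e.g. RollingFileAppender with MaxFileSize) attached
    // to this logger is what actually caps how much output reaches disk.
    System.setOut(new PrintStream(new LoggingOutputStream(log, Level.INFO), true));
    System.setErr(new PrintStream(new LoggingOutputStream(log, Level.ERROR), true));
    System.out.println("goes to the INFO-level appender");
    System.err.println("goes to the ERROR-level appender");
  }
}
{code}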



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero

2016-09-14 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490671#comment-15490671
 ] 

Jason Lowe commented on YARN-5545:
--

The problem with changing queues to use the max apps conf directly is that it 
becomes more difficult for admins to control the overall memory pressure on the 
RM from pending apps.  IIUC after that change each queue would be able to hold 
up to the system max-apps number of apps.  So each time an admin adds a queue 
it piles on another potential max-apps amount of apps the RM could be tracking 
in total.  Or if the admin increases the max-apps number by X it actually 
increases the total RM app storage by Q*X, where Q is the number of leaf queues.

That's quite different than what happens today and is a significant behavior 
change.  If we go down this route then I think we should have a separate 
top-level config that, when set, specifies the default max-apps per queue 
explicitly rather than having them try to derive it based on relative 
capacities.  We can then debate whether that also overrides the system-wide 
setting or if we still respect the system-wide limit (i.e.: queue may reject an 
app submission not because it hit the queue's max apps limit but because the RM 
hit the system-wide apps limit).  Going with a separate, new config means we 
can preserve backwards compatibility for those who have become accustomed to 
the existing behavior and no surprises when admins use their old configs on the 
new software.

I think max-am-resource-percent is a red herring with respect to the max apps 
discussion.  max-am-resource-percent only controls how many _active_ 
applications there are in a queue, and max apps is controlling the total number 
of apps in the queue.  In fact I wouldn't be surprised if the code doesn't 
check and an admin could configure the RM to allow more active apps than the 
total number of apps the queue is allowed to have at any time.
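
For concreteness, a sketch of how the configs discussed above could line up in 
capacity-scheduler.xml; the last property is only the hypothetical new top-level 
default and does not exist today:
{noformat}
# Existing system-wide cap on total (pending + running) applications
yarn.scheduler.capacity.maximum-applications=10000
# Existing per-queue override, shown here for root.default
yarn.scheduler.capacity.root.default.maximum-applications=500
# Hypothetical new top-level per-queue default -- property name is illustrative only
yarn.scheduler.capacity.default-maximum-applications-per-queue=1000
{noformat}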


> App submit failure on queue with label when default queue partition capacity 
> is zero
> 
>
> Key: YARN-5545
> URL: https://issues.apache.org/jira/browse/YARN-5545
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, 
> YARN-5545.0003.patch, capacity-scheduler.xml
>
>
> Configure capacity scheduler 
> yarn.scheduler.capacity.root.default.capacity=0
> yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50
> yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50
> Submit application as below
> ./yarn jar 
> ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar
>  sleep -Dmapreduce.job.node-label-expression=labelx 
> -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1000 -rt 1
> {noformat}
> 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging 
> area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed 
> to submit application_1471670113386_0001 to YARN : 
> org.apache.hadoop.security.AccessControlException: Queue root.default already 
> has 0 applications, cannot accept submission of application: 
> application_1471670113386_0001
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at 
> 

[jira] [Commented] (YARN-5611) Provide an API to update lifetime of an application.

2016-09-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490568#comment-15490568
 ] 

Sunil G commented on YARN-5611:
---

Yes [~rohithsharma].
I also thought along the same lines.. Then I have seen some systems where they fill 
in the unchanged values instead of null.

But if we decide this approach makes sense, I think it's ok.

> Provide an API to update lifetime of an application.
> 
>
> Key: YARN-5611
> URL: https://issues.apache.org/jira/browse/YARN-5611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5611.v0.patch
>
>
> With YARN-4205, the lifetime of an application is monitored if required. 
> Add a client API to update the lifetime of an application. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5540) scheduler spends too much time looking at empty priorities

2016-09-14 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490557#comment-15490557
 ] 

Jason Lowe commented on YARN-5540:
--

The two test failures appear to be unrelated.  Filed YARN-5652 for the 
TestRMAdminService failure and YARN-5653 for the 
TestNodeLabelContainerAllocation failure.

> scheduler spends too much time looking at empty priorities
> --
>
> Key: YARN-5540
> URL: https://issues.apache.org/jira/browse/YARN-5540
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: capacity scheduler, fairscheduler, resourcemanager
>Affects Versions: 2.7.2
>Reporter: Nathan Roberts
>Assignee: Jason Lowe
> Attachments: YARN-5540.001.patch, YARN-5540.002.patch, 
> YARN-5540.003.patch, YARN-5540.004.patch
>
>
> We're starting to see the capacity scheduler run out of scheduling horsepower 
> when running 500-1000 applications on clusters with 4K nodes or so.
> This seems to be amplified by TEZ applications. TEZ applications have many 
> more priorities (sometimes in the hundreds) than typical MR applications, and 
> therefore the loop in the scheduler that examines every priority within 
> every running application starts to be a hotspot. The priorities appear to 
> stay around forever, even when there is no remaining resource request at that 
> priority, causing us to spend a lot of time looking at nothing.
> jstack snippet:
> {noformat}
> "ResourceManager Event Processor" #28 prio=5 os_prio=0 tid=0x7fc2d453e800 
> nid=0x22f3 runnable [0x7fc2a8be2000]
>java.lang.Thread.State: RUNNABLE
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt.getResourceRequest(SchedulerApplicationAttempt.java:210)
> - eliminated <0x0005e73e5dc0> (a 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.assignContainers(LeafQueue.java:852)
> - locked <0x0005e73e5dc0> (a 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp)
> - locked <0x0003006fcf60> (a 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainersToChildQueues(ParentQueue.java:527)
> - locked <0x0003001b22f8> (a 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.assignContainers(ParentQueue.java:415)
> - locked <0x0003001b22f8> (a 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:1224)
> - locked <0x000300041e40> (a 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler)
> {noformat}
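
The general shape of a fix, as a generic sketch (not the actual YARN-5540 patch): 
track only priorities that still have outstanding requests, so the per-application 
loop never visits empty ones.
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Generic sketch: drop a priority from the map once nothing is pending at it.
class OutstandingRequests {
  private final Map<Integer, Integer> pendingByPriority = new ConcurrentSkipListMap<>();

  void addRequests(int priority, int numContainers) {
    pendingByPriority.merge(priority, numContainers, Integer::sum);
  }

  void onContainerAllocated(int priority) {
    // Decrement, and remove the priority entirely when its count reaches zero.
    pendingByPriority.computeIfPresent(priority, (p, n) -> n > 1 ? n - 1 : null);
  }

  // The scheduler iterates only over priorities with work left to do.
  Iterable<Integer> prioritiesWithPendingRequests() {
    return pendingByPriority.keySet();
  }
}
{code}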



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5653) testNonLabeledResourceRequestGetPreferrenceToNonLabeledNode fails intermittently

2016-09-14 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-5653:


 Summary: 
testNonLabeledResourceRequestGetPreferrenceToNonLabeledNode fails intermittently
 Key: YARN-5653
 URL: https://issues.apache.org/jira/browse/YARN-5653
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Reporter: Jason Lowe


Saw the following TestNodeLabelContainerAllocation failure in a recent 
precommit:
{noformat}
Running 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
Tests run: 19, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 113.791 sec 
<<< FAILURE! - in 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation
testNonLabeledResourceRequestGetPreferrenceToNonLabeledNode(org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation)
  Time elapsed: 0.266 sec  <<< FAILURE!
java.lang.AssertionError: expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation.checkLaunchedContainerNumOnNode(TestNodeLabelContainerAllocation.java:562)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation.testNonLabeledResourceRequestGetPreferrenceToNonLabeledNode(TestNodeLabelContainerAllocation.java:842)
{noformat}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5547) NMLeveldbStateStore should be more tolerant of unknown keys

2016-09-14 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490550#comment-15490550
 ] 

Naganarasimha G R commented on YARN-5547:
-

Thanks [~jlowe] for the conclusion.
I agree that having a table is better than adding a suffix to a key to identify 
its characteristics, and +1 for the *"unrecognized=kill implementation"* as the 
focus of this JIRA.

> NMLeveldbStateStore should be more tolerant of unknown keys
> ---
>
> Key: YARN-5547
> URL: https://issues.apache.org/jira/browse/YARN-5547
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Ajith S
> Attachments: YARN-5547.01.patch
>
>
> Whenever new keys are added to the NM state store it will break rolling 
> downgrades because the code will throw if it encounters an unrecognized key.  
> If instead it skipped unrecognized keys it could be simpler to continue 
> supporting rolling downgrades.  We need to define the semantics of 
> unrecognized keys when containers and apps are cleaned up, e.g.: we may want 
> to delete all keys underneath an app or container directory when it is being 
> removed from the state store to prevent leaking unrecognized keys.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5652) testRefreshNodesResourceWithResourceReturnInRegistration fails intermittently

2016-09-14 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490519#comment-15490519
 ] 

Jason Lowe commented on YARN-5652:
--

Looks closely related to YARN-4893 and YARN-5318.  YARN-4893 added a 
drainEvents to MockRM's registerNode, but in this case the test is calling 
MockNM's registerNode method which appears to have no corresponding drainEvents 
invocation.

> testRefreshNodesResourceWithResourceReturnInRegistration fails intermittently
> -
>
> Key: YARN-5652
> URL: https://issues.apache.org/jira/browse/YARN-5652
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Jason Lowe
>
> Saw the following in a recent precommit:
> {noformat}
> Running org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService
> Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 18.639 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService
> testRefreshNodesResourceWithResourceReturnInRegistration(org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService)
>   Time elapsed: 0.763 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected:<> but 
> was:<>
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService.testRefreshNodesResourceWithResourceReturnInRegistration(TestRMAdminService.java:286)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5652) testRefreshNodesResourceWithResourceReturnInRegistration fails intermittently

2016-09-14 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-5652:


 Summary: testRefreshNodesResourceWithResourceReturnInRegistration 
fails intermittently
 Key: YARN-5652
 URL: https://issues.apache.org/jira/browse/YARN-5652
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Reporter: Jason Lowe


Saw the following in a recent precommit:
{noformat}
Running org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService
Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 18.639 sec <<< 
FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService
testRefreshNodesResourceWithResourceReturnInRegistration(org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService)
  Time elapsed: 0.763 sec  <<< FAILURE!
org.junit.ComparisonFailure: expected:<> but 
was:<>
at org.junit.Assert.assertEquals(Assert.java:115)
at org.junit.Assert.assertEquals(Assert.java:144)
at 
org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService.testRefreshNodesResourceWithResourceReturnInRegistration(TestRMAdminService.java:286)
{noformat}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5651) Changes to NMStateStore to persist reinitialization and rollback state

2016-09-14 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-5651:
-

 Summary: Changes to NMStateStore to persist reinitialization and 
rollback state
 Key: YARN-5651
 URL: https://issues.apache.org/jira/browse/YARN-5651
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Arun Suresh
Assignee: Arun Suresh






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5547) NMLeveldbStateStore should be more tolerant of unknown keys

2016-09-14 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490475#comment-15490475
 ] 

Jason Lowe commented on YARN-5547:
--

To be clear, skipping containers during recovery is _never_ the right thing 
to do, so that's not really a valid option.

As I understand it, the current proposal is this:
* Add a new table to the state store that will contain a list of container keys 
that need compatibility processing when performing rolling downgrades
* Each key in that table will have a descriptor associated with it that will 
indicate how the recovery of the corresponding container needs to be handled.  
Options include:
** Killing the corresponding container
** Removing the key and recovering the container normally
* Any unrecognized container key that is not described in the table will cause 
the corresponding container to be killed during recovery.

We don't have to implement the entire thing in this JIRA.  We could do the 
unrecognized=kill implementation first then add the table of keys feature in a 
subsequent JIRA.
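
A minimal sketch of how such a compatibility table could be consulted during 
recovery; the enum, class, and key-handling names below are hypothetical, not 
existing NM state store code:
{code}
import java.util.Map;

// Hypothetical descriptor for how to treat an unrecognized container key
// found during recovery after a rolling downgrade.
enum UnknownKeyAction { KILL_CONTAINER, IGNORE_AND_RECOVER }

class CompatKeyTable {
  // key suffix -> action, loaded from the (hypothetical) state-store table
  private final Map<String, UnknownKeyAction> actions;

  CompatKeyTable(Map<String, UnknownKeyAction> actions) {
    this.actions = actions;
  }

  // Default for keys not listed in the table: kill the container, never skip it.
  UnknownKeyAction actionFor(String unrecognizedKeySuffix) {
    return actions.getOrDefault(unrecognizedKeySuffix, UnknownKeyAction.KILL_CONTAINER);
  }
}
{code}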

> NMLeveldbStateStore should be more tolerant of unknown keys
> ---
>
> Key: YARN-5547
> URL: https://issues.apache.org/jira/browse/YARN-5547
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Ajith S
> Attachments: YARN-5547.01.patch
>
>
> Whenever new keys are added to the NM state store it will break rolling 
> downgrades because the code will throw if it encounters an unrecognized key.  
> If instead it skipped unrecognized keys it could be simpler to continue 
> supporting rolling downgrades.  We need to define the semantics of 
> unrecognized keys when containers and apps are cleaned up, e.g.: we may want 
> to delete all keys underneath an app or container directory when it is being 
> removed from the state store to prevent leaking unrecognized keys.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5611) Provide an API to update lifetime of an application.

2016-09-14 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490435#comment-15490435
 ] 

Rohith Sharma K S commented on YARN-5611:
-

I think from the server end, this can be handled without any issue, like below:
if (priority != null) doPriorityUpdate
if (ApplicationTimeouts != null) doApplicationTimeoutsUpdate

I think we need to check from the client end as well, and if we plan to take this 
approach then these changes should go into branch-2.8 before it gets released.  

> Provide an API to update lifetime of an application.
> 
>
> Key: YARN-5611
> URL: https://issues.apache.org/jira/browse/YARN-5611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5611.v0.patch
>
>
> With YARN-4205, the lifetime of an application is monitored if required. 
> Add a client API to update the lifetime of an application. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5547) NMLeveldbStateStore should be more tolerant of unknown keys

2016-09-14 Thread Ajith S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490429#comment-15490429
 ] 

Ajith S commented on YARN-5547:
---

As per an offline discussion with [~Naganarasimha Garla] and [~varun_saxena], regarding 
{{If the old software could consult a table in the database that lists what 
keys are ignorable then it can fail for any unrecognized key that isn't in that 
list and safely ignore ones that are}}: we could add a suffix to the keys if they 
are ignorable, so that even a lower version will know whether the keys can be 
skipped safely.

> NMLeveldbStateStore should be more tolerant of unknown keys
> ---
>
> Key: YARN-5547
> URL: https://issues.apache.org/jira/browse/YARN-5547
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Ajith S
> Attachments: YARN-5547.01.patch
>
>
> Whenever new keys are added to the NM state store it will break rolling 
> downgrades because the code will throw if it encounters an unrecognized key.  
> If instead it skipped unrecognized keys it could be simpler to continue 
> supporting rolling downgrades.  We need to define the semantics of 
> unrecognized keys when containers and apps are cleaned up, e.g.: we may want 
> to delete all keys underneath an app or container directory when it is being 
> removed from the state store to prevent leaking unrecognized keys.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5145) [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR

2016-09-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490421#comment-15490421
 ] 

Hadoop QA commented on YARN-5145:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
17s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 0m 52s {color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9baccb9 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12828454/YARN-5145-YARN-3368.01.patch
 |
| JIRA Issue | YARN-5145 |
| Optional Tests |  asflicense  |
| uname | Linux 835b92b4c31c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | YARN-3368 / 9ef8291 |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13106/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR
> -
>
> Key: YARN-5145
> URL: https://issues.apache.org/jira/browse/YARN-5145
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Kai Sasaki
> Attachments: YARN-5145-YARN-3368.01.patch
>
>
> Existing YARN UI configuration is under Hadoop package's directory: 
> $HADOOP_PREFIX/share/hadoop/yarn/webapps/, we should move it to 
> $HADOOP_CONF_DIR like other configurations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5611) Provide an API to update lifetime of an application.

2016-09-14 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490403#comment-15490403
 ] 

Sunil G commented on YARN-5611:
---

Makes sense to me.. But some of the options may not have any relation to each 
other when doing an update at runtime.

For example, we may have changed only the app priority while the app is running, 
but we may end up checking the app timeout and other params to update dynamically. 
Thoughts?

> Provide an API to update lifetime of an application.
> 
>
> Key: YARN-5611
> URL: https://issues.apache.org/jira/browse/YARN-5611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5611.v0.patch
>
>
> With YARN-4205, the lifetime of an application is monitored if required. 
> Add a client API to update the lifetime of an application. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5637) Changes in NodeManager to support Container upgrade and rollback/commit

2016-09-14 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-5637:
--
Attachment: YARN-5637.003.patch

Updating the patch.. thanks again for the review, [~jianhe].

bq. Here, we could make reInitEvent.getResourceSet() be merged with existing 
resourceSet.localizedResource upfront, so that both oldResourceSet and 
newResourceSet contain full copy of resources, rather than delta.
This was actually intentional. Consider the case where the original process has 
many resources to Localize but the upgrade launch script just needs a binary 
change in addition to the existing resources. If the resourceSets were merged 
upfront, then in the _ReInitializeContainerTransition_, the 
_ContainerLocalizationRequestEvent_ that gets sent would include ALL the 
resources, instead of just the single resource. The Container will have to 
remain in the *REINITIALIZING* state till it receives _RESOURCE_LOCALIZED_ 
events for all the resources in the combined resultset before being able to 
launch.

bq. the container.reInitContext!= null check is not needed..
I think we do; otherwise it might cause an NPE when the _LaunchTransition_ happens as 
part of the initial container startup.

bq. I found the resourceSet is also not updated when rollback in 
RetryFailureTransition
Good catch... I also like your refactoring; I've incorporated it in the latest 
patch.

> Changes in NodeManager to support Container upgrade and rollback/commit
> ---
>
> Key: YARN-5637
> URL: https://issues.apache.org/jira/browse/YARN-5637
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5637.001.patch, YARN-5637.002.patch, 
> YARN-5637.003.patch
>
>
> YARN-5620 added support for re-initialization of Containers using a new 
> launch Context.
> This JIRA proposes to use the above feature to support upgrade and subsequent 
> rollback or commit of the upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5145) [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR

2016-09-14 Thread Kai Sasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki updated YARN-5145:
-
Attachment: YARN-5145-YARN-3368.01.patch

> [YARN-3368] Move new YARN UI configuration to HADOOP_CONF_DIR
> -
>
> Key: YARN-5145
> URL: https://issues.apache.org/jira/browse/YARN-5145
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
> Attachments: YARN-5145-YARN-3368.01.patch
>
>
> Existing YARN UI configuration is under Hadoop package's directory: 
> $HADOOP_PREFIX/share/hadoop/yarn/webapps/, we should move it to 
> $HADOOP_CONF_DIR like other configurations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5547) NMLeveldbStateStore should be more tolerant of unknown keys

2016-09-14 Thread Ajith S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490385#comment-15490385
 ] 

Ajith S commented on YARN-5547:
---

So we have two approaches discussed here:
1. Skip container recovery - this will leave unmonitored containers
2. Kill/fail the container

I am ok with the second approach, but as per [~jlowe]'s point that {{The NM has to 
unregister with a service as part of the container failure}}, I don't see any 
solution for that scenario. If we can handle that case separately, I can update 
the patch based on the second approach.

> NMLeveldbStateStore should be more tolerant of unknown keys
> ---
>
> Key: YARN-5547
> URL: https://issues.apache.org/jira/browse/YARN-5547
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 2.6.0
>Reporter: Jason Lowe
>Assignee: Ajith S
> Attachments: YARN-5547.01.patch
>
>
> Whenever new keys are added to the NM state store it will break rolling 
> downgrades because the code will throw if it encounters an unrecognized key.  
> If instead it skipped unrecognized keys it could be simpler to continue 
> supporting rolling downgrades.  We need to define the semantics of 
> unrecognized keys when containers and apps are cleaned up, e.g.: we may want 
> to delete all keys underneath an app or container directory when it is being 
> removed from the state store to prevent leaking unrecognized keys.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5561) [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and entities via REST

2016-09-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490259#comment-15490259
 ] 

Hadoop QA commented on YARN-5561:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 
11s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
31s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 12s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice:
 The patch generated 18 new + 19 unchanged - 0 fixed = 37 total (was 19) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
35s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 13s 
{color} | {color:red} hadoop-yarn-server-timelineservice in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 46s 
{color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
15s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 14s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12828437/YARN-5561.02.patch |
| JIRA Issue | YARN-5561 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux a1099601e42f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / ea0c2b8 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/13105/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice.txt
 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/13105/artifact/patchprocess/patch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13105/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13105/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org 

[jira] [Commented] (YARN-5561) [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and entities via REST

2016-09-14 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490212#comment-15490212
 ] 

Rohith Sharma K S commented on YARN-5561:
-

As of now, /apps can be dropped since apps can be retrieved via flows. Based 
on the use cases from the UI, let us decide later whether /apps is needed at the cluster level.
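For example, apps for a given flow run can already be fetched through the ATSv2 reader 
(path follows the documented reader REST layout; the cluster, user, flow and run ids 
below are placeholders):
{noformat}
GET /ws/v2/timeline/clusters/cluster1/users/user1/flows/flow1/runs/1/apps
{noformat}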

> [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and 
> entities via REST
> ---
>
> Key: YARN-5561
> URL: https://issues.apache.org/jira/browse/YARN-5561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5561.02.patch, YARN-5561.patch, YARN-5561.v0.patch
>
>
> The ATSv2 model lacks retrieval of {{list-of-all-apps}}, 
> {{list-of-all-app-attempts}} and {{list-of-all-containers-per-attempt}} via 
> REST APIs. It is also required to know about all the entities in an 
> application.
> These URLs are very much required for the Web UI.
> New REST URL would be 
> # GET {{/ws/v2/timeline/apps}}
> # GET {{/ws/v2/timeline/apps/\{app-id\}/appattempts}}.
> # GET 
> {{/ws/v2/timeline/apps/\{app-id\}/appattempts/\{attempt-id\}/containers}}
> # GET {{/ws/v2/timeline/apps/\{app id\}/entities}} should display list of 
> entities that can be queried.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5561) [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and entities via REST

2016-09-14 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-5561:

Attachment: YARN-5561.02.patch

Updated the patch, adding 2 more REST endpoints. 
One small difference is in the REST endpoint for a single container: instead of 
*/clusters/\{clusterid\}/apps/\{appid\}/appattempts/\{appattemptid\}/containers/\{container-id\}*,
 I have changed it to 
*/clusters/\{clusterid\}/apps/\{appid\}/containers/\{container-id\}*.
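To make the shortened endpoint concrete, a minimal client-side sketch (the reader host, 
port and all ids below are placeholders, not values from this JIRA):
{code}
// Illustrative only: fetch a single container entity through the shortened
// endpoint. The reader address and every id below are placeholders.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class TimelineContainerGet {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://timeline-reader-host:8188/ws/v2/timeline"
        + "/clusters/cluster1/apps/application_1471670113386_0001"
        + "/containers/container_1471670113386_0001_01_000002");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("GET");
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line); // JSON representation of the container entity
      }
    }
  }
}
{code}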

> [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and 
> entities via REST
> ---
>
> Key: YARN-5561
> URL: https://issues.apache.org/jira/browse/YARN-5561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5561.02.patch, YARN-5561.patch, YARN-5561.v0.patch
>
>
> The ATSv2 model lacks retrieval of {{list-of-all-apps}}, 
> {{list-of-all-app-attempts}} and {{list-of-all-containers-per-attempt}} via 
> REST APIs. It is also required to know about all the entities in an 
> application.
> These URLs are very much required for the Web UI.
> New REST URL would be 
> # GET {{/ws/v2/timeline/apps}}
> # GET {{/ws/v2/timeline/apps/\{app-id\}/appattempts}}.
> # GET 
> {{/ws/v2/timeline/apps/\{app-id\}/appattempts/\{attempt-id\}/containers}}
> # GET {{/ws/v2/timeline/apps/\{app id\}/entities}} should display list of 
> entities that can be queried.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext

2016-09-14 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490181#comment-15490181
 ] 

Arun Suresh commented on YARN-5620:
---

The previous Jenkins 
[run|https://issues.apache.org/jira/browse/YARN-5620?focusedCommentId=15488723=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15488723]
 looks bad because I had manually cancelled it.
The 
[run|https://issues.apache.org/jira/browse/YARN-5620?focusedCommentId=15488722=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15488722]
 before that ran fine with the same patch.

The test case failure is not related.
[~jianhe] / [~vvasudev], if you are fine with the latest patch, can we push 
this into trunk so we can start Jenkins against YARN-5637?


> Core changes in NodeManager to support re-initialization of Containers with 
> new launchContext
> -
>
> Key: YARN-5620
> URL: https://issues.apache.org/jira/browse/YARN-5620
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5620.001.patch, YARN-5620.002.patch, 
> YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, 
> YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, 
> YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, 
> YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, 
> YARN-5620.015.patch
>
>
> This JIRA proposes to modify the ContainerManager (and other core classes) to 
> support upgrade of a running container with a new {{ContainerLaunchContext}}, 
> as well as the ability to roll back the upgrade if the container is not able 
> to restart using the new launch context. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero

2016-09-14 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490125#comment-15490125
 ] 

Bibin A Chundatt commented on YARN-5545:


{quote}
we could keep only yarn.scheduler.capacity.maximum-applications at system 
level. We could avoid configuring maximum-applications per queue. 
{quote}
We should not remove maximum-applications per queue. In the current 
implementation, the system-level limit takes effect only when the per-queue 
maximum is not configured, so there is no need to change that behavior.
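For illustration, a hedged sketch of how the two existing knobs relate (the property 
names are the standard CapacityScheduler ones; the queue name and values below are made up):
{noformat}
# System-wide ceiling; applies to a queue only when it has no explicit override
yarn.scheduler.capacity.maximum-applications=10000

# Explicit per-queue override for root.queue1; root.default keeps the value
# derived from the system-wide limit and its configured capacity
yarn.scheduler.capacity.root.queue1.maximum-applications=2000
{noformat}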

> App submit failure on queue with label when default queue partition capacity 
> is zero
> 
>
> Key: YARN-5545
> URL: https://issues.apache.org/jira/browse/YARN-5545
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, 
> YARN-5545.0003.patch, capacity-scheduler.xml
>
>
> Configure capacity scheduler 
> yarn.scheduler.capacity.root.default.capacity=0
> yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50
> yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50
> Submit application as below
> ./yarn jar 
> ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar
>  sleep -Dmapreduce.job.node-label-expression=labelx 
> -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1000 -rt 1
> {noformat}
> 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging 
> area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed 
> to submit application_1471670113386_0001 to YARN : 
> org.apache.hadoop.security.AccessControlException: Queue root.default already 
> has 0 applications, cannot accept submission of application: 
> application_1471670113386_0001
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at 
> org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136)
>   at 
> org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:144)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit 
> application_1471670113386_0001 to YARN : 
> org.apache.hadoop.security.AccessControlException: Queue root.default already 
> has 0 applications, cannot accept submission of application: 
> application_1471670113386_0001
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:286)
>   at 
> org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:296)
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:301)
>   ... 25 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: 

[jira] [Commented] (YARN-5611) Provide an API to update lifetime of an application.

2016-09-14 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490037#comment-15490037
 ] 

Rohith Sharma K S commented on YARN-5611:
-

+1 for the suggestion. This makes sense to me.

> Provide an API to update lifetime of an application.
> 
>
> Key: YARN-5611
> URL: https://issues.apache.org/jira/browse/YARN-5611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5611.v0.patch
>
>
> YARN-4205 monitors the lifetime of an application if required. 
> Add a client API to update the lifetime of an application. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5611) Provide an API to update lifetime of an application.

2016-09-14 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489888#comment-15489888
 ] 

Jian He commented on YARN-5611:
---

Things like priority, timeout, and node-label are attributes of an application. 
I'm thinking about whether it makes sense to have a single API which incorporates these 
updates. That way, we don't need to add a new API method again and again.
cc [~sunilg]
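For discussion only, a rough sketch of what such a consolidated request could look like 
(all class and method names below are hypothetical, not existing YARN classes):
{code}
// Hypothetical sketch only -- not an existing YARN API. One request object
// carries optional attribute updates; a null field means "leave unchanged",
// so adding a new updatable attribute does not require a new protocol method.
public class UpdateApplicationAttributesRequest {
  private Integer priority;            // null => keep current priority
  private Long timeoutInSeconds;       // null => keep current lifetime
  private String nodeLabelExpression;  // null => keep current labels

  public Integer getPriority() { return priority; }
  public void setPriority(Integer priority) { this.priority = priority; }

  public Long getTimeoutInSeconds() { return timeoutInSeconds; }
  public void setTimeoutInSeconds(Long timeout) { this.timeoutInSeconds = timeout; }

  public String getNodeLabelExpression() { return nodeLabelExpression; }
  public void setNodeLabelExpression(String expr) { this.nodeLabelExpression = expr; }
}
{code}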

> Provide an API to update lifetime of an application.
> 
>
> Key: YARN-5611
> URL: https://issues.apache.org/jira/browse/YARN-5611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5611.v0.patch
>
>
> YARN-4205 monitors the lifetime of an application if required. 
> Add a client API to update the lifetime of an application. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5163) Fix TestClientToAMTokens and TestClientRMTokens

2016-09-14 Thread Wei Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489783#comment-15489783
 ] 

Wei Zhou commented on YARN-5163:


There is no need to add javadoc for the reported checkstyle issue.
The unit test failure was caused by a timeout; it works fine in my local 
environment. Thanks!

> Fix TestClientToAMTokens and TestClientRMTokens
> ---
>
> Key: YARN-5163
> URL: https://issues.apache.org/jira/browse/YARN-5163
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Arun Suresh
>Assignee: Wei Zhou
> Attachments: YARN-5163-v1.patch, YARN-5163-v2.patch
>
>
> The above test cases fail due to a {{NullPointerException}} and a "Cannot bind 
> to port" error, both of which should be fixed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4205) Add a service for monitoring application life time out

2016-09-14 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489769#comment-15489769
 ] 

Jian He commented on YARN-4205:
---

latest patch looks good to me.

> Add a service for monitoring application life time out
> --
>
> Key: YARN-4205
> URL: https://issues.apache.org/jira/browse/YARN-4205
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Reporter: nijel
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-4205.patch, 0002-YARN-4205.patch, 
> 0003-YARN-4205.patch, 0004-YARN-4205.patch, YARN-4205_01.patch, 
> YARN-4205_02.patch, YARN-4205_03.patch
>
>
> This JIRA intends to provide a lifetime monitor service. 
> The service will monitor the applications for which a lifetime is configured. 
> If an application runs beyond its lifetime, it will be killed. 
> The lifetime is measured from the submit time.
> The monitoring thread's interval is configurable.
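As a rough illustration of the idea described above (not the YARN-4205 implementation; 
all names below are made up), the monitor reduces to a periodic scan:
{code}
// Illustrative sketch only -- not the actual YARN-4205 code.
// A thread periodically scans registered applications and flags any whose
// configured lifetime, measured from submit time, has elapsed.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SimpleLifetimeMonitor extends Thread {
  // appId -> {submitTimeMs, lifetimeMs}
  private final Map<String, long[]> apps = new ConcurrentHashMap<>();
  private final long monitorIntervalMs; // configurable scan interval

  public SimpleLifetimeMonitor(long monitorIntervalMs) {
    this.monitorIntervalMs = monitorIntervalMs;
  }

  public void register(String appId, long submitTimeMs, long lifetimeMs) {
    apps.put(appId, new long[] {submitTimeMs, lifetimeMs});
  }

  @Override
  public void run() {
    while (!isInterrupted()) {
      long now = System.currentTimeMillis();
      for (Map.Entry<String, long[]> e : apps.entrySet()) {
        if (now - e.getValue()[0] > e.getValue()[1]) {
          // In the real service this is where the application would be killed.
          System.out.println("Lifetime expired for " + e.getKey());
          apps.remove(e.getKey());
        }
      }
      try {
        Thread.sleep(monitorIntervalMs);
      } catch (InterruptedException ie) {
        return;
      }
    }
  }
}
{code}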



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5637) Changes in NodeManager to support Container upgrade and rollback/commit

2016-09-14 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489728#comment-15489728
 ] 

Jian He commented on YARN-5637:
---

Thanks Arun, some more comments:
- Here, we could merge reInitEvent.getResourceSet() with the existing 
resourceSet.localizedResource upfront, so that both oldResourceSet and 
newResourceSet contain a full copy of the resources rather than a delta. Doing 
this, the logic of {{container.resourceSet = 
container.reInitContext.mergedResourceSet();}} will no longer be needed. We can 
simply set it with {{container.resourceSet = reInitContext.newResourceSet}}, 
similar to what’s being done for {{container.launchContext = 
reInitContext.newLaunchContext}}
{code}
return new ReInitializationContext(reInitEvent.getReInitLaunchContext(),
    reInitEvent.getResourceSet(), container.getLaunchContext(),
    container.resourceSet, reInitEvent.getRetryFailureContext(), 
    reInitEvent.isAutoCommit());

{code}
- nit:  the container.reInitContext!= null check is not needed.
{code}
if (container.reInitContext != null 
    && container.reInitContext.autoCommit) {
{code}

- I found that the resourceSet is also not updated on rollback in 
RetryFailureTransition. I also tried some refactoring, maybe something like 
below:
{code}
  ContainerRetryContext retryContext = container.containerRetryContext;
  int remainingAttempts = container.remainingRetryAttempts;
  if (container.reInitContext != null) {
    retryContext = container.reInitContext.retryOnFailureContext;
    remainingAttempts = container.reInitContext.retryAttemptsRemaining;
  }

  if (shouldRetry(container.exitCode, retryContext, remainingAttempts)) {
    // TODO state-store operation
    doRelaunch(container, remainingAttempts, retryContext.getRetryInterval());
  } else if (container.canRollback()) {
    // rollback
    container.reInitContext = new ReInitializationContext(
        container.reInitContext.oldLaunchContext,
        container.reInitContext.oldResourceSet, null, null,
        container.containerRetryContext, true);
    new KilledExternallyForReInitTransition().transition(container, event);
  } else {
    // fail
    new ExitedWithFailureTransition(true).transition(container, event);
    return ContainerState.EXITED_WITH_FAILURE;
  }
}

public static boolean shouldRetry(int errorCode,
    ContainerRetryContext retryContext, int remainingRetryAttempts) {
  if (retryContext == null) {
    return false;
  }
  ...
{code}

- testContainerUpgradeRollbackDueToFailure: comment does not match code
{code}
// Wait for new processStartfile to be created
while (!oldStartFile.exists() && timeoutSecs++ < 20) {
{code}

> Changes in NodeManager to support Container upgrade and rollback/commit
> ---
>
> Key: YARN-5637
> URL: https://issues.apache.org/jira/browse/YARN-5637
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5637.001.patch, YARN-5637.002.patch
>
>
> YARN-5620 added support for re-initialization of Containers using a new 
> launch Context.
> This JIRA proposes to use the above feature to support upgrade and subsequent 
> rollback or commit of the upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5650) Add CLI endpoints for updating application timeouts

2016-09-14 Thread Rohith Sharma K S (JIRA)
Rohith Sharma K S created YARN-5650:
---

 Summary: Add CLI endpoints for updating application timeouts
 Key: YARN-5650
 URL: https://issues.apache.org/jira/browse/YARN-5650
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Rohith Sharma K S
Assignee: Rohith Sharma K S






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5649) Add REST endpoints for updating application timeouts

2016-09-14 Thread Rohith Sharma K S (JIRA)
Rohith Sharma K S created YARN-5649:
---

 Summary: Add REST endpoints for updating application timeouts
 Key: YARN-5649
 URL: https://issues.apache.org/jira/browse/YARN-5649
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Rohith Sharma K S
Assignee: Rohith Sharma K S






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5611) Provide an API to update lifetime of an application.

2016-09-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489717#comment-15489717
 ] 

Hadoop QA commented on YARN-5611:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 7s {color} 
| {color:red} YARN-5611 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12828402/YARN-5611.v0.patch |
| JIRA Issue | YARN-5611 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13104/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Provide an API to update lifetime of an application.
> 
>
> Key: YARN-5611
> URL: https://issues.apache.org/jira/browse/YARN-5611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5611.v0.patch
>
>
> YARN-4205 monitors the lifetime of an application if required. 
> Add a client API to update the lifetime of an application. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5163) Fix TestClientToAMTokens and TestClientRMTokens

2016-09-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489695#comment-15489695
 ] 

Hadoop QA commented on YARN-5163:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 30s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
49s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 7s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
41s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 6s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
37s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 0s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s 
{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
26s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 9m 8s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 9m 8s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 9m 8s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 1m 51s 
{color} | {color:red} root: The patch generated 1 new + 27 unchanged - 5 fixed 
= 28 total (was 32) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 4s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
40s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
15s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 22s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 39s 
{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 41m 51s {color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
23s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 100m 37s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestNodeBlacklistingOnAMFailures |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12828392/YARN-5163-v2.patch |
| JIRA Issue | YARN-5163 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux 79bec4770fb4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / ea0c2b8 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/13103/artifact/patchprocess/diff-checkstyle-root.txt
 |
| unit | 

[jira] [Commented] (YARN-5611) Provide an API to update lifetime of an application.

2016-09-14 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489649#comment-15489649
 ] 

Rohith Sharma K S commented on YARN-5611:
-

Attached v0 patch for the API support. A couple of points need to be discussed; 
a small illustrative sketch of the first one follows below.
# Should an update of the timeout add to the current timeout or replace it? I 
basically prefer adding to the timeout already registered for the application.
# If an update-timeout request arrives for an application that is not 
registered, should we ignore the request or register the timeout from now?
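To make the first question concrete, a small illustrative example of the two 
interpretations (all values are made up; this is not patch code):
{code}
// Illustrative only: two possible semantics for updating the lifetime of an
// application whose timeout is already registered. All values are examples.
public class TimeoutUpdateSemantics {
  public static void main(String[] args) {
    long submitTimeMs = 0L;
    long currentTimeoutSec = 600L;    // lifetime registered at submission
    long requestedTimeoutSec = 300L;  // value carried in the update request

    // (a) "adds up": the lifetime becomes 900s measured from submit time
    long expiryIfAdded =
        submitTimeMs + (currentTimeoutSec + requestedTimeoutSec) * 1000L;

    // (b) "replace": the lifetime becomes 300s measured from submit time
    long expiryIfReplaced = submitTimeMs + requestedTimeoutSec * 1000L;

    System.out.println("add semantics, expiry at (ms):     " + expiryIfAdded);
    System.out.println("replace semantics, expiry at (ms): " + expiryIfReplaced);
  }
}
{code}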

> Provide an API to update lifetime of an application.
> 
>
> Key: YARN-5611
> URL: https://issues.apache.org/jira/browse/YARN-5611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5611.v0.patch
>
>
> YARN-4205 monitors the lifetime of an application if required. 
> Add a client API to update the lifetime of an application. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5611) Provide an API to update lifetime of an application.

2016-09-14 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-5611:

Attachment: YARN-5611.v0.patch

> Provide an API to update lifetime of an application.
> 
>
> Key: YARN-5611
> URL: https://issues.apache.org/jira/browse/YARN-5611
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5611.v0.patch
>
>
> YARN-4205 monitors the lifetime of an application if required. 
> Add a client API to update the lifetime of an application. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext

2016-09-14 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489593#comment-15489593
 ] 

Jian He commented on YARN-5620:
---

bq. KilledExternallyForReInitTransition we use this function to merge the 
existing ResourceSet and upgraded ResourceSet
I see, I overlooked that it's passing both parameters. Thanks.

> Core changes in NodeManager to support re-initialization of Containers with 
> new launchContext
> -
>
> Key: YARN-5620
> URL: https://issues.apache.org/jira/browse/YARN-5620
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5620.001.patch, YARN-5620.002.patch, 
> YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, 
> YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, 
> YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, 
> YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, 
> YARN-5620.015.patch
>
>
> This JIRA proposes to modify the ContainerManager (and other core classes) to 
> support upgrade of a running container with a new {{ContainerLaunchContext}}, 
> as well as the ability to roll back the upgrade if the container is not able 
> to restart using the new launch context. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext

2016-09-14 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489547#comment-15489547
 ] 

Arun Suresh edited comment on YARN-5620 at 9/14/16 6:13 AM:


[~jianhe], I intentionally wanted to keep the _merge_ function as a static 
utility function, and I would prefer not to add a new constructor... Also, if you 
notice, in the _KilledExternallyForReInitTransition_ we use this function to 
merge the existing ResourceSet with the upgraded ResourceSet and set 
container.resourceSet to the merged result.
I can change it if you are particular about it... but I would prefer to keep it as is.


was (Author: asuresh):
[~jianhe], I intentionally wanted to keep the _merge_ function as a static 
utility function. I would prefer not adding a new constructor.. if you are ok 
with it.

> Core changes in NodeManager to support re-initialization of Containers with 
> new launchContext
> -
>
> Key: YARN-5620
> URL: https://issues.apache.org/jira/browse/YARN-5620
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5620.001.patch, YARN-5620.002.patch, 
> YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, 
> YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, 
> YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, 
> YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, 
> YARN-5620.015.patch
>
>
> This JIRA proposes to modify the ContainerManager (and other core classes) to 
> support upgrade of a running container with a new {{ContainerLaunchContext}}, 
> as well as the ability to roll back the upgrade if the container is not able 
> to restart using the new launch context. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5620) Core changes in NodeManager to support re-initialization of Containers with new launchContext

2016-09-14 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15489547#comment-15489547
 ] 

Arun Suresh commented on YARN-5620:
---

[~jianhe], I intentionally wanted to keep the _merge_ function as a static 
utility function. I would prefer not adding a new constructor.. if you are ok 
with it.

> Core changes in NodeManager to support re-initialization of Containers with 
> new launchContext
> -
>
> Key: YARN-5620
> URL: https://issues.apache.org/jira/browse/YARN-5620
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5620.001.patch, YARN-5620.002.patch, 
> YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, 
> YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch, 
> YARN-5620.009.patch, YARN-5620.010.patch, YARN-5620.011.patch, 
> YARN-5620.012.patch, YARN-5620.013.patch, YARN-5620.014.patch, 
> YARN-5620.015.patch
>
>
> This JIRA proposes to modify the ContainerManager (and other core classes) to 
> support upgrade of a running container with a new {{ContainerLaunchContext}}, 
> as well as the ability to roll back the upgrade if the container is not able 
> to restart using the new launch context. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org