[jira] [Commented] (YARN-3558) Additional containers getting reserved from RM in case of Fair scheduler

2015-05-20 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551884#comment-14551884
 ] 

Sunil G commented on YARN-3558:
---

Hi [~bibinchundatt],
Could you please upload the RM logs?

 Additional containers getting reserved from RM in case of Fair scheduler
 

 Key: YARN-3558
 URL: https://issues.apache.org/jira/browse/YARN-3558
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler, resourcemanager
Affects Versions: 2.7.0
 Environment: OS :Suse 11 Sp3
 Setup : 2 RM 2 NM
 Scheduler : Fair scheduler
Reporter: Bibin A Chundatt

 Submit a PI job with 16 maps.
 Total containers expected: 16 maps + 1 reduce + 1 AM.
 Total containers reserved by the RM: 21.
 The following containers are not being used for execution:
 container_1430213948957_0001_01_20
 container_1430213948957_0001_01_19
 RM container reservations and states:
 {code}
  Processing container_1430213948957_0001_01_01 of type START
  Processing container_1430213948957_0001_01_01 of type ACQUIRED
  Processing container_1430213948957_0001_01_01 of type LAUNCHED
  Processing container_1430213948957_0001_01_02 of type START
  Processing container_1430213948957_0001_01_03 of type START
  Processing container_1430213948957_0001_01_02 of type ACQUIRED
  Processing container_1430213948957_0001_01_03 of type ACQUIRED
  Processing container_1430213948957_0001_01_04 of type START
  Processing container_1430213948957_0001_01_05 of type START
  Processing container_1430213948957_0001_01_04 of type ACQUIRED
  Processing container_1430213948957_0001_01_05 of type ACQUIRED
  Processing container_1430213948957_0001_01_02 of type LAUNCHED
  Processing container_1430213948957_0001_01_04 of type LAUNCHED
  Processing container_1430213948957_0001_01_06 of type RESERVED
  Processing container_1430213948957_0001_01_03 of type LAUNCHED
  Processing container_1430213948957_0001_01_05 of type LAUNCHED
  Processing container_1430213948957_0001_01_07 of type START
  Processing container_1430213948957_0001_01_07 of type ACQUIRED
  Processing container_1430213948957_0001_01_07 of type LAUNCHED
  Processing container_1430213948957_0001_01_08 of type RESERVED
  Processing container_1430213948957_0001_01_02 of type FINISHED
  Processing container_1430213948957_0001_01_06 of type START
  Processing container_1430213948957_0001_01_06 of type ACQUIRED
  Processing container_1430213948957_0001_01_06 of type LAUNCHED
  Processing container_1430213948957_0001_01_04 of type FINISHED
  Processing container_1430213948957_0001_01_09 of type START
  Processing container_1430213948957_0001_01_09 of type ACQUIRED
  Processing container_1430213948957_0001_01_09 of type LAUNCHED
  Processing container_1430213948957_0001_01_10 of type RESERVED
  Processing container_1430213948957_0001_01_03 of type FINISHED
  Processing container_1430213948957_0001_01_08 of type START
  Processing container_1430213948957_0001_01_08 of type ACQUIRED
  Processing container_1430213948957_0001_01_08 of type LAUNCHED
  Processing container_1430213948957_0001_01_05 of type FINISHED
  Processing container_1430213948957_0001_01_11 of type START
  Processing container_1430213948957_0001_01_11 of type ACQUIRED
  Processing container_1430213948957_0001_01_11 of type LAUNCHED
  Processing container_1430213948957_0001_01_07 of type FINISHED
  Processing container_1430213948957_0001_01_12 of type START
  Processing container_1430213948957_0001_01_12 of type ACQUIRED
  Processing container_1430213948957_0001_01_12 of type LAUNCHED
  Processing container_1430213948957_0001_01_13 of type RESERVED
  Processing container_1430213948957_0001_01_06 of type FINISHED
  Processing container_1430213948957_0001_01_10 of type START
  Processing container_1430213948957_0001_01_10 of type ACQUIRED
  Processing container_1430213948957_0001_01_10 of type LAUNCHED
  Processing container_1430213948957_0001_01_09 of type FINISHED
  Processing container_1430213948957_0001_01_14 of type START
  Processing container_1430213948957_0001_01_14 of type ACQUIRED
  Processing container_1430213948957_0001_01_14 of type LAUNCHED
  Processing container_1430213948957_0001_01_15 of type RESERVED
  Processing container_1430213948957_0001_01_08 of type FINISHED
  Processing container_1430213948957_0001_01_13 of type START
  Processing container_1430213948957_0001_01_16 of type RESERVED
  Processing container_1430213948957_0001_01_13 of type ACQUIRED
  Processing container_1430213948957_0001_01_13 of type LAUNCHED
  Processing container_1430213948957_0001_01_11 of 

[jira] [Updated] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-20 Thread Lavkesh Lahngir (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lavkesh Lahngir updated YARN-3591:
--
Attachment: YARN-3591.4.patch

 Resource Localisation on a bad disk causes subsequent containers failure 
 -

 Key: YARN-3591
 URL: https://issues.apache.org/jira/browse/YARN-3591
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Lavkesh Lahngir
Assignee: Lavkesh Lahngir
 Attachments: 0001-YARN-3591.1.patch, 0001-YARN-3591.patch, 
 YARN-3591.2.patch, YARN-3591.3.patch, YARN-3591.4.patch


 It happens when a resource is localised on a disk and, after localisation, that 
 disk goes bad. The NM keeps paths for localised resources in memory. At the time 
 of a resource request, isResourcePresent(rsrc) is called, which calls 
 file.exists() on the localised path.
 In some cases when the disk has gone bad, inodes are still cached and 
 file.exists() returns true, but at the time of reading, the file will not open.
 Note: file.exists() actually calls stat64 natively, which returns true because 
 it was able to find inode information from the OS.
 A proposal is to call file.list() on the parent path of the resource, which will 
 call open() natively. If the disk is good, it should return an array of paths 
 with length at least 1.
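A minimal sketch of the proposed check, for illustration only (the class and 
method shape below are hypothetical, not the actual NodeManager code): instead of 
trusting file.exists(), list the parent directory so the check goes through a 
native open() and fails when the underlying disk is bad.
{code}
import java.io.File;

public class LocalResourceCheck {
  /**
   * Hypothetical replacement for a plain file.exists() check. Listing the
   * parent directory forces a native open() of that directory, which fails
   * on a bad disk even when stale inode data makes exists() return true.
   */
  public static boolean isResourcePresent(File localizedPath) {
    File parent = localizedPath.getParentFile();
    if (parent == null) {
      return false;
    }
    String[] entries = parent.list(); // null if the directory cannot be read
    if (entries == null || entries.length < 1) {
      return false;
    }
    // Still confirm the specific resource is one of the listed entries.
    for (String name : entries) {
      if (name.equals(localizedPath.getName())) {
        return true;
      }
    }
    return false;
  }
}
{code}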



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node

2015-05-20 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551885#comment-14551885
 ] 

Anubhav Dhoot commented on YARN-3675:
-

This fixes the issue where scheduling can happen after the node has been 
removed. Because of this, when the application is removed, it will clean up its 
reserved and completed containers, and at that time it will try to call a method 
on the FSSchedulerNode, which is null. Here is the trace of the same instance as 
above, showing the scheduling happening just after the node is removed. Looking 
at continuousSchedulingAttempt, we get the reference to the node before we take 
the scheduler lock when calling attemptScheduling. A sketch of the kind of guard 
this implies follows the trace below.

{noformat}
hadoop-YARN-1-RESOURCEMANAGER-hostname.log.out:2015-05-11 00:27:42,793 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: 
Removed node nmhostname:8041 
hadoop-YARN-1-RESOURCEMANAGER-hostname.log.out:2015-05-11 00:27:42,793 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Assigned 
container container_e25_1431107530707_159950_01_21 of capacity 
memory:2048, vCores:1 on host nmhostname:8041, which has 1 containers, 
memory:2048, vCores:1 used an
hadoop-YARN-1-RESOURCEMANAGER-hostname.log.out:2015-05-11 00:27:42,796 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt: 
Making reservation: node=nmhostname app_id=application_1431107530707_159852
{noformat}
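A minimal sketch of the kind of guard this implies, assuming the scheduler keeps 
a map of live nodes (all names below are hypothetical stand-ins, not the actual 
FairScheduler code): re-resolve the node after taking the scheduler lock and skip 
the attempt if it has been removed in the meantime.
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ContinuousSchedulingSketch {
  /** Hypothetical stand-in for the scheduler's per-node bookkeeping. */
  static class SchedulerNode {
    final String host;
    SchedulerNode(String host) { this.host = host; }
  }

  private final Map<String, SchedulerNode> liveNodes = new ConcurrentHashMap<>();
  private final Object schedulerLock = new Object();

  void continuousSchedulingAttempt(String host) {
    synchronized (schedulerLock) {
      // Re-resolve the node under the lock instead of reusing a reference taken
      // earlier; it may have been removed between the snapshot and this point.
      SchedulerNode node = liveNodes.get(host);
      if (node == null) {
        return; // node was removed; skip scheduling on it
      }
      attemptScheduling(node);
    }
  }

  private void attemptScheduling(SchedulerNode node) {
    // placeholder for the real per-node scheduling work
  }
}
{code}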

 FairScheduler: RM quits when node removal races with continousscheduling on 
 the same node
 -

 Key: YARN-3675
 URL: https://issues.apache.org/jira/browse/YARN-3675
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3675.001.patch


 With continuous scheduling, scheduling can be done on a node that has just been 
 removed, causing errors like the one below.
 {noformat}
 12:28:53.782 AM FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
 Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
 java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
   at java.lang.Thread.run(Thread.java:745)
 12:28:53.783 AM INFO 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye..
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551895#comment-14551895
 ] 

Hadoop QA commented on YARN-3646:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 46s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 4 new or modified test files. |
| {color:green}+1{color} | javac |   7m 35s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 43s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   2m 44s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m 48s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | common tests |  23m 54s | Tests passed in 
hadoop-common. |
| {color:green}+1{color} | yarn tests |   6m 54s | Tests passed in 
hadoop-yarn-client. |
| {color:green}+1{color} | yarn tests |   1m 56s | Tests passed in 
hadoop-yarn-common. |
| | |  73m 55s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734062/YARN-3646.001.patch |
| Optional Tests | javac unit findbugs checkstyle javadoc |
| git revision | trunk / ce53c8e |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8017/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8017/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8017/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8017/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8017/console |


This message was automatically generated.

 Applications are getting stuck some times in case of retry policy forever
 -

 Key: YARN-3646
 URL: https://issues.apache.org/jira/browse/YARN-3646
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Reporter: Raju Bairishetti
 Attachments: YARN-3646.001.patch, YARN-3646.patch


 We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use the FOREVER 
 retry policy.
 The YARN client retries infinitely on exceptions from the RM because it is 
 using the FOREVER retry policy. The problem is that it retries for all kinds of 
 exceptions (such as ApplicationNotFoundException), even when the failure is not 
 a connection failure. Because of this, my application does not progress further.
 *The YARN client should not retry infinitely in case of non-connection 
 failures.*
 We have written a simple YARN client that tries to get an application report 
 for an invalid or old appId. The ResourceManager throws an 
 ApplicationNotFoundException because the appId is invalid or old. But because 
 of the FOREVER retry policy, the client keeps retrying to get the application 
 report, and the ResourceManager keeps throwing ApplicationNotFoundException 
 continuously.
 {code}
 private void testYarnClientRetryPolicy() throws Exception {
   YarnConfiguration conf = new YarnConfiguration();
   conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1);
   YarnClient yarnClient = YarnClient.createYarnClient();
   yarnClient.init(conf);
   yarnClient.start();
   ApplicationId appId = ApplicationId.newInstance(1430126768987L, 10645);
   ApplicationReport report = yarnClient.getApplicationReport(appId);
 }
 {code}
 *RM logs:*
 {noformat}
 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call 
 org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
 from 10.14.120.231:61621 Call#875162 Retry#0
 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
 with id 'application_1430126768987_10645' doesn't exist in RM.
   at 
 

[jira] [Updated] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node

2015-05-20 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3675:

Attachment: YARN-3675.001.patch

 FairScheduler: RM quits when node removal races with continousscheduling on 
 the same node
 -

 Key: YARN-3675
 URL: https://issues.apache.org/jira/browse/YARN-3675
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3675.001.patch


 With continuous scheduling, scheduling can be done on a node that has just been 
 removed, causing errors like the one below.
 {noformat}
 12:28:53.782 AM FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
 Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
 java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
   at java.lang.Thread.run(Thread.java:745)
 12:28:53.783 AM INFO 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye..
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551946#comment-14551946
 ] 

Hadoop QA commented on YARN-3675:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 42s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 36s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 33s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 47s | The applied patch generated  1 
new checkstyle issues (total was 74, now 75). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 14s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  50m  8s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  86m 30s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734074/YARN-3675.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / ce53c8e |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8018/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8018/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8018/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8018/console |


This message was automatically generated.

 FairScheduler: RM quits when node removal races with continousscheduling on 
 the same node
 -

 Key: YARN-3675
 URL: https://issues.apache.org/jira/browse/YARN-3675
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3675.001.patch


 With continuous scheduling, scheduling can be done on a node that has just been 
 removed, causing errors like the one below.
 {noformat}
 12:28:53.782 AM FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
 Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
 java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
   at java.lang.Thread.run(Thread.java:745)
 12:28:53.783 AM INFO 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye..
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3689) FifoComparator logic is wrong. In method compare in FifoPolicy.java file, the s1 and s2 should change position when compare priority

2015-05-20 Thread zhoulinlin (JIRA)
zhoulinlin created YARN-3689:


 Summary: FifoComparator logic is wrong. In method compare in 
FifoPolicy.java file, the s1 and s2 should change position when compare 
priority 
 Key: YARN-3689
 URL: https://issues.apache.org/jira/browse/YARN-3689
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler, scheduler
Affects Versions: 2.5.0
Reporter: zhoulinlin


In the compare method in FifoPolicy.java, s1 and s2 should swap positions when 
comparing priority.

I did a test: I configured the scheduler policy to fifo and submitted 2 jobs to 
the same queue.
The result is below:
2015-05-20 11:57:41,449 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
before sort --  
2015-05-20 11:57:41,449 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432094103221_0001 appPririty:4  appStartTime:1432094170038
2015-05-20 11:57:41,449 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432094103221_0002 appPririty:2  appStartTime:1432094173131
2015-05-20 11:57:41,449 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: after 
sort % 
2015-05-20 11:57:41,449 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432094103221_0001 appPririty:4  appStartTime:1432094170038 
 
2015-05-20 11:57:41,449 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432094103221_0002 appPririty:2  appStartTime:1432094173131 
 

But when s1 and s2 swap positions as below (a standalone illustration also 
follows the logs at the end of this description):

public int compare(Schedulable s1, Schedulable s2) {
  int res = s2.getPriority().compareTo(s1.getPriority());
  ...
}

The result:
2015-05-20 11:36:37,119 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
before sort -- 
2015-05-20 11:36:37,119 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432090734333_0009 appPririty:4  appStartTime:1432092992503
2015-05-20 11:36:37,119 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432090734333_0010 appPririty:2  appStartTime:1432092996437
2015-05-20 11:36:37,119 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: after 
sort % 
2015-05-20 11:36:37,119 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432090734333_0010 appPririty:2  appStartTime:1432092996437
2015-05-20 11:36:37,119 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432090734333_0009 appPririty:4  appStartTime:1432092992503 
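As a standalone illustration of the argument-order effect described above (a 
simplified model, not the actual FifoPolicy/Schedulable code; priorities are 
plain ints here): swapping the operands of the comparison reverses the resulting 
sort order, which is the difference between the two logs.
{code}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ComparatorOrderDemo {
  // Simplified stand-in for a Schedulable: just a name and an integer priority.
  static class App {
    final String name;
    final int priority;
    App(String name, int priority) { this.name = name; this.priority = priority; }
    public String toString() { return name + " (priority " + priority + ")"; }
  }

  public static void main(String[] args) {
    List<App> apps = new ArrayList<>();
    apps.add(new App("app_0001", 4));
    apps.add(new App("app_0002", 2));

    // Comparing s1 against s2 sorts ascending by priority value (2 before 4).
    apps.sort(Comparator.comparingInt((App a) -> a.priority));
    System.out.println("s1-first ordering: " + apps);

    // Swapping the operands, i.e. comparing s2 against s1, reverses the order
    // (4 before 2) -- the effect described when s1 and s2 change position.
    apps.sort((s1, s2) -> Integer.compare(s2.priority, s1.priority));
    System.out.println("swapped ordering:  " + apps);
  }
}
{code}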





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3543) ApplicationReport should be able to tell whether the Application is AM managed or not.

2015-05-20 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-3543:
-
Attachment: (was: 0003-YARN-3543.patch)

 ApplicationReport should be able to tell whether the Application is AM 
 managed or not. 
 ---

 Key: YARN-3543
 URL: https://issues.apache.org/jira/browse/YARN-3543
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.6.0
Reporter: Spandan Dutta
Assignee: Rohith
  Labels: BB2015-05-TBR
 Attachments: 0001-YARN-3543.patch, 0001-YARN-3543.patch, 
 0002-YARN-3543.patch, 0002-YARN-3543.patch, 0003-YARN-3543.patch, 
 0004-YARN-3543.patch, YARN-3543-AH.PNG, YARN-3543-RM.PNG


 Currently we can know whether the application submitted by the user is AM 
 managed from the ApplicationSubmissionContext, but this can only be done at the 
 time the user submits the job. We should have access to this info from the 
 ApplicationReport as well, so that we can check whether an app is AM managed at 
 any time.
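For context, a minimal sketch of where this flag lives today (an illustration 
using the standard submission-context accessors, not the attached patch); the 
point of the JIRA is that nothing equivalent can be read back from an 
ApplicationReport later on.
{code}
import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
import org.apache.hadoop.yarn.util.Records;

public class UnmanagedAmFlagSketch {
  public static void main(String[] args) {
    // Today the flag is only visible on the submission context, i.e. at submit time.
    ApplicationSubmissionContext ctx =
        Records.newRecord(ApplicationSubmissionContext.class);
    ctx.setUnmanagedAM(true);
    System.out.println("unmanaged AM? " + ctx.getUnmanagedAM());
    // The ask here is an equivalent flag on ApplicationReport so the same
    // information is available for an already-submitted application.
  }
}
{code}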



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3690) 'mvn site' fails on JDK8

2015-05-20 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reassigned YARN-3690:
--

Assignee: Brahma Reddy Battula

 'mvn site' fails on JDK8
 

 Key: YARN-3690
 URL: https://issues.apache.org/jira/browse/YARN-3690
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
 Environment: CentOS 7.0, Oracle JDK 8u45.
Reporter: Akira AJISAKA
Assignee: Brahma Reddy Battula

 'mvn site' failed with the following error:
 {noformat}
 [ERROR] 
 /home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18:
  error: package org.apache.hadoop.yarn.factories has already been annotated
 [ERROR] @InterfaceAudience.LimitedPrivate({ MapReduce, YARN })
 [ERROR] ^
 [ERROR] java.lang.AssertionError
 [ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126)
 [ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45)
 [ERROR] at 
 com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161)
 [ERROR] at 
 com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215)
 [ERROR] at 
 com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952)
 [ERROR] at 
 com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64)
 [ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876)
 [ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143)
 [ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129)
 [ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512)
 [ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471)
 [ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78)
 [ERROR] at 
 com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186)
 [ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346)
 [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219)
 [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205)
 [ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64)
 [ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54)
 [ERROR] javadoc: error - fatal error
 [ERROR] 
 [ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc 
 -J-Xmx1024m @options @packages
 [ERROR] 
 [ERROR] Refer to the generated Javadoc files in 
 '/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir.
 [ERROR] - [Help 1]
 [ERROR] 
 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
 switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR] 
 [ERROR] For more information about the errors and possible solutions, please 
 read the following articles:
 [ERROR] [Help 1] 
 http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
 {noformat}
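For reference, an illustrative reconstruction of the kind of package-info.java 
the error points at (a sketch, not a verbatim copy of the file; the mail archive 
has stripped the quotes around the annotation arguments shown above):
{code}
// package-info.java (illustrative reconstruction of the annotated package declaration)
@InterfaceAudience.LimitedPrivate({"MapReduce", "YARN"})
package org.apache.hadoop.yarn.factories;

import org.apache.hadoop.classification.InterfaceAudience;
{code}
One plausible reading of the assertion is that JDK 8's javadoc objects when the 
same package ends up annotated more than once on its input path, but the root 
cause is not stated in this report.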



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3690) 'mvn site' fails on JDK8

2015-05-20 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-3690:

Description: 
'mvn site' failed with the following error:
{noformat}
[ERROR] 
/home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18:
 error: package org.apache.hadoop.yarn.factories has already been annotated
[ERROR] @InterfaceAudience.LimitedPrivate({ MapReduce, YARN })
[ERROR] ^
[ERROR] java.lang.AssertionError
[ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126)
[ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45)
[ERROR] at 
com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161)
[ERROR] at 
com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215)
[ERROR] at 
com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952)
[ERROR] at com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64)
[ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876)
[ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143)
[ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129)
[ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512)
[ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471)
[ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78)
[ERROR] at 
com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186)
[ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346)
[ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219)
[ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205)
[ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64)
[ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54)
[ERROR] javadoc: error - fatal error
[ERROR] 
[ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc -J-Xmx1024m 
@options @packages
[ERROR] 
[ERROR] Refer to the generated Javadoc files in 
'/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir.
[ERROR] - [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
{noformat}

  was:
{noformat}
[ERROR] 
/home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18:
 error: package org.apache.hadoop.yarn.factories has already been annotated
[ERROR] @InterfaceAudience.LimitedPrivate({ MapReduce, YARN })
[ERROR] ^
[ERROR] java.lang.AssertionError
[ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126)
[ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45)
[ERROR] at 
com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161)
[ERROR] at 
com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215)
[ERROR] at 
com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952)
[ERROR] at com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64)
[ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876)
[ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143)
[ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129)
[ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512)
[ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471)
[ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78)
[ERROR] at 
com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186)
[ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346)
[ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219)
[ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205)
[ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64)
[ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54)
[ERROR] javadoc: error - fatal error
[ERROR] 
[ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc -J-Xmx1024m 
@options @packages
[ERROR] 
[ERROR] Refer to the generated Javadoc files in 
'/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir.
[ERROR] - [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
{noformat}


 'mvn site' fails on JDK8
 

[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552181#comment-14552181
 ] 

Hudson commented on YARN-3601:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #202 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/202/])
YARN-3601. Fix UT TestRMFailover.testRMWebAppRedirect. Contributed by Weiwei 
Yang (xgong: rev 5009ad4a7f712fc578b461ecec53f7f97eaaed0c)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java
* hadoop-yarn-project/CHANGES.txt


 Fix UT TestRMFailover.testRMWebAppRedirect
 --

 Key: YARN-3601
 URL: https://issues.apache.org/jira/browse/YARN-3601
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, webapp
 Environment: Red Hat Enterprise Linux Workstation release 6.5 
 (Santiago)
Reporter: Weiwei Yang
Assignee: Weiwei Yang
Priority: Critical
  Labels: test
 Fix For: 2.7.1

 Attachments: YARN-3601.001.patch


 This test case has not been working since the commit from YARN-2605. It fails 
 with an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552208#comment-14552208
 ] 

Hudson commented on YARN-3565:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #933 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/933/])
YARN-3565. NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel 
object instead of String. (Naganarasimha G R via wangda) (wangda: rev 
b37da52a1c4fb3da2bd21bfadc5ec61c5f953a59)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/TestYarnServerApiClasses.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdaterForLabels.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/NodeLabelsProvider.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto


 NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object 
 instead of String
 -

 Key: YARN-3565
 URL: https://issues.apache.org/jira/browse/YARN-3565
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
Priority: Blocker
 Fix For: 2.8.0

 Attachments: YARN-3565-20150502-1.patch, YARN-3565.20150515-1.patch, 
 YARN-3565.20150516-1.patch, YARN-3565.20150519-1.patch


 Now NM HB/Register uses Set<String>; it will be hard to add new fields if we 
 want to support specifying NodeLabel attributes such as exclusivity/constraints, 
 etc. We need to make sure rolling upgrade works.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552209#comment-14552209
 ] 

Hudson commented on YARN-2821:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #933 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/933/])
YARN-2821. Fixed a problem that DistributedShell AM may hang if restarted. 
Contributed by Varun Vasudev (jianhe: rev 
7438966586f1896ab3e8b067d47a4af28a894106)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDSAppMaster.java


 Distributed shell app master becomes unresponsive sometimes
 ---

 Key: YARN-2821
 URL: https://issues.apache.org/jira/browse/YARN-2821
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Affects Versions: 2.5.1
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Fix For: 2.8.0

 Attachments: YARN-2821.002.patch, YARN-2821.003.patch, 
 YARN-2821.004.patch, YARN-2821.005.patch, apache-yarn-2821.0.patch, 
 apache-yarn-2821.1.patch


 We've noticed that once in a while the distributed shell app master becomes 
 unresponsive and is eventually killed by the RM. A snippet of the logs:
 {noformat}
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
 appattempt_1415123350094_0017_01 received 0 previous attempts' running 
 containers on AM registration.
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez2:45454
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_02, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
 container launch container for 
 containerid=container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 START_CONTAINER for Container container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
 onprem-tez2:45454
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 QUERY_CONTAINER for Container container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
 onprem-tez2:45454
 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez3:45454
 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez4:45454
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=3
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_03, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_04, 
 containerNode=onprem-tez3:45454, containerNodeURI=onprem-tez3:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_05, 
 containerNode=onprem-tez4:45454, containerNodeURI=onprem-tez4:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO 

[jira] [Commented] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552214#comment-14552214
 ] 

Hudson commented on YARN-3677:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #933 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/933/])
YARN-3677. Fix findbugs warnings in yarn-server-resourcemanager. Contributed by 
Vinod Kumar Vavilapalli. (ozawa: rev 7401e5b5e8060b6b027d714b5ceb641fcfe5b598)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java


 Fix findbugs warnings in yarn-server-resourcemanager
 

 Key: YARN-3677
 URL: https://issues.apache.org/jira/browse/YARN-3677
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Akira AJISAKA
Assignee: Vinod Kumar Vavilapalli
Priority: Minor
  Labels: newbie
 Fix For: 2.7.1

 Attachments: YARN-3677-20150519.txt


 There is 1 findbugs warning in FileSystemRMStateStore.java.
 {noformat}
 Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of 
 time
 Unsynchronized access at FileSystemRMStateStore.java: [line 156]
 Field 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
 Synchronized 66% of the time
 Synchronized access at FileSystemRMStateStore.java: [line 148]
 Synchronized access at FileSystemRMStateStore.java: [line 859]
 {noformat}
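As a generic illustration of this findbugs pattern (not the actual 
FileSystemRMStateStore code, and not necessarily how the attached patch addresses 
it): a field written under a lock but read without one triggers the warning; 
routing every access through synchronized methods, or making a simple flag 
volatile, gives the consistent synchronization findbugs expects.
{code}
public class SyncFlagSketch {
  // Generic stand-in for a flag such as isHDFS that is set during service
  // initialization and read later, possibly from other threads.
  private boolean remoteStore;

  // Every write and read goes through a synchronized method, so the field is
  // accessed with consistent locking and the warning no longer applies.
  public synchronized void setRemoteStore(boolean value) {
    this.remoteStore = value;
  }

  public synchronized boolean isRemoteStore() {
    return remoteStore;
  }
}
{code}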



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552203#comment-14552203
 ] 

Hudson commented on YARN-3583:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #933 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/933/])
YARN-3583. Support of NodeLabel object instead of plain String in YarnClient 
side. (Sunil G via wangda) (wangda: rev 
563eb1ad2ae848a23bbbf32ebfaf107e8fa14e87)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetNodesToLabelsResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetLabelsToNodesResponse.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/ReplaceLabelsOnNodeRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetLabelsToNodesResponsePBImpl.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetNodesToLabelsResponse.java


 Support of NodeLabel object instead of plain String in YarnClient side.
 ---

 Key: YARN-3583
 URL: https://issues.apache.org/jira/browse/YARN-3583
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client
Affects Versions: 2.6.0
Reporter: Sunil G
Assignee: Sunil G
 Fix For: 2.8.0

 Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
 0003-YARN-3583.patch, 0004-YARN-3583.patch


 Similar to YARN-3521, use NodeLabel objects in the YarnClient-side APIs.
 The getLabelsToNodes/getNodeToLabels APIs can use NodeLabel objects instead of 
 plain label names.
 This will help to bring other label details, such as exclusivity, to the client 
 side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552204#comment-14552204
 ] 

Hudson commented on YARN-3601:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #933 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/933/])
YARN-3601. Fix UT TestRMFailover.testRMWebAppRedirect. Contributed by Weiwei 
Yang (xgong: rev 5009ad4a7f712fc578b461ecec53f7f97eaaed0c)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java


 Fix UT TestRMFailover.testRMWebAppRedirect
 --

 Key: YARN-3601
 URL: https://issues.apache.org/jira/browse/YARN-3601
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, webapp
 Environment: Red Hat Enterprise Linux Workstation release 6.5 
 (Santiago)
Reporter: Weiwei Yang
Assignee: Weiwei Yang
Priority: Critical
  Labels: test
 Fix For: 2.7.1

 Attachments: YARN-3601.001.patch


 This test case has not been working since the commit from YARN-2605. It fails 
 with an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552212#comment-14552212
 ] 

Hudson commented on YARN-3302:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #933 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/933/])
YARN-3302. TestDockerContainerExecutor should run automatically if it can 
detect docker in the usual place (Ravindra Kumar Naik via raviprak) (raviprak: 
rev c97f32e7b9d9e1d4c80682cc01741579166174d1)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java
* hadoop-yarn-project/CHANGES.txt


 TestDockerContainerExecutor should run automatically if it can detect docker 
 in the usual place
 ---

 Key: YARN-3302
 URL: https://issues.apache.org/jira/browse/YARN-3302
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.6.0
Reporter: Ravi Prakash
Assignee: Ravindra Kumar Naik
 Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch, 
 YARN-3302-trunk.003.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3646) Applications are getting stuck some times in case of retry policy forever

2015-05-20 Thread Raju Bairishetti (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raju Bairishetti updated YARN-3646:
---
Attachment: YARN-3646.002.patch

[~rohithsharma] Thanks for the review and comments. Attached a new patch

 Applications are getting stuck some times in case of retry policy forever
 -

 Key: YARN-3646
 URL: https://issues.apache.org/jira/browse/YARN-3646
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Reporter: Raju Bairishetti
 Attachments: YARN-3646.001.patch, YARN-3646.002.patch, YARN-3646.patch


 We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use the FOREVER 
 retry policy.
 The YARN client retries infinitely on exceptions from the RM because it is 
 using the FOREVER retry policy. The problem is that it retries for all kinds of 
 exceptions (such as ApplicationNotFoundException), even when the failure is not 
 a connection failure. Because of this, my application does not progress further.
 *The YARN client should not retry infinitely in case of non-connection 
 failures.*
 We have written a simple YARN client that tries to get an application report 
 for an invalid or old appId. The ResourceManager throws an 
 ApplicationNotFoundException because the appId is invalid or old. But because 
 of the FOREVER retry policy, the client keeps retrying to get the application 
 report, and the ResourceManager keeps throwing ApplicationNotFoundException 
 continuously. (A standalone sketch of such a policy follows the RM logs below.)
 {code}
 private void testYarnClientRetryPolicy() throws Exception {
   YarnConfiguration conf = new YarnConfiguration();
   conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1);
   YarnClient yarnClient = YarnClient.createYarnClient();
   yarnClient.init(conf);
   yarnClient.start();
   ApplicationId appId = ApplicationId.newInstance(1430126768987L, 10645);
   ApplicationReport report = yarnClient.getApplicationReport(appId);
 }
 {code}
 *RM logs:*
 {noformat}
 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call 
 org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
 from 10.14.120.231:61621 Call#875162 Retry#0
 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
 with id 'application_1430126768987_10645' doesn't exist in RM.
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284)
   at 
 org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
   at 
 org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
 
 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call 
 org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
 from 10.14.120.231:61621 Call#875163 Retry#0
 
 {noformat}
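A minimal standalone sketch of the behaviour being asked for (an illustration 
only; it does not use the real Hadoop RetryPolicy classes and is not the attached 
patch): keep retrying on connection-level failures, but fail fast on 
application-level errors such as an unknown application id.
{code}
import java.net.ConnectException;
import java.util.concurrent.Callable;

public class SelectiveRetrySketch {

  /** Stand-in for ApplicationNotFoundException, to keep the sketch self-contained. */
  static class ApplicationNotFoundException extends Exception {
    ApplicationNotFoundException(String msg) { super(msg); }
  }

  /**
   * Retries forever on connection failures (the case FOREVER is meant for),
   * but rethrows application-level errors immediately instead of looping.
   */
  static <T> T callWithRetry(Callable<T> call) throws Exception {
    while (true) {
      try {
        return call.call();
      } catch (ConnectException e) {
        // Connection-level failure: the RM may be down or failing over; retry.
        Thread.sleep(1000);
      } catch (ApplicationNotFoundException e) {
        // Application-level failure: retrying can never succeed, so fail fast.
        throw e;
      }
    }
  }
}
{code}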



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever

2015-05-20 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552225#comment-14552225
 ] 

Rohith commented on YARN-3646:
--

+1 lgtm (non-binding). Waiting for the Jenkins report.

 Applications are getting stuck some times in case of retry policy forever
 -

 Key: YARN-3646
 URL: https://issues.apache.org/jira/browse/YARN-3646
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Reporter: Raju Bairishetti
 Attachments: YARN-3646.001.patch, YARN-3646.002.patch, YARN-3646.patch


 We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use the FOREVER 
 retry policy.
 The YARN client retries infinitely on exceptions from the RM because it is 
 using the FOREVER retry policy. The problem is that it retries for all kinds of 
 exceptions (such as ApplicationNotFoundException), even when the failure is not 
 a connection failure. Because of this, my application does not progress further.
 *The YARN client should not retry infinitely in case of non-connection 
 failures.*
 We have written a simple YARN client that tries to get an application report 
 for an invalid or old appId. The ResourceManager throws an 
 ApplicationNotFoundException because the appId is invalid or old. But because 
 of the FOREVER retry policy, the client keeps retrying to get the application 
 report, and the ResourceManager keeps throwing ApplicationNotFoundException 
 continuously.
 {code}
 private void testYarnClientRetryPolicy() throws Exception {
   YarnConfiguration conf = new YarnConfiguration();
   conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1);
   YarnClient yarnClient = YarnClient.createYarnClient();
   yarnClient.init(conf);
   yarnClient.start();
   ApplicationId appId = ApplicationId.newInstance(1430126768987L, 10645);
   ApplicationReport report = yarnClient.getApplicationReport(appId);
 }
 {code}
 *RM logs:*
 {noformat}
 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call 
 org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
 from 10.14.120.231:61621 Call#875162 Retry#0
 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
 with id 'application_1430126768987_10645' doesn't exist in RM.
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284)
   at 
 org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
   at 
 org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
 
 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call 
 org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
 from 10.14.120.231:61621 Call#875163 Retry#0
 
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3543) ApplicationReport should be able to tell whether the Application is AM managed or not.

2015-05-20 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552165#comment-14552165
 ] 

Rohith commented on YARN-3543:
--

The build machine is not able to run all those tests in one shot. A similar 
issue was faced earlier in YARN-2784. I think we need to split this JIRA into a 
proto change, a WebUI change, an AH change, and more.

 ApplicationReport should be able to tell whether the Application is AM 
 managed or not. 
 ---

 Key: YARN-3543
 URL: https://issues.apache.org/jira/browse/YARN-3543
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.6.0
Reporter: Spandan Dutta
Assignee: Rohith
  Labels: BB2015-05-TBR
 Attachments: 0001-YARN-3543.patch, 0001-YARN-3543.patch, 
 0002-YARN-3543.patch, 0002-YARN-3543.patch, 0003-YARN-3543.patch, 
 0004-YARN-3543.patch, 0004-YARN-3543.patch, YARN-3543-AH.PNG, YARN-3543-RM.PNG


 Currently we can know whether the application submitted by the user is AM 
 managed from the applicationSubmissionContext, but only at the time the user 
 submits the job. We should have access to this info from the ApplicationReport 
 as well, so that we can check whether an app is AM managed at any time.
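
As a hedged sketch of what is being asked for (names assumed, not taken from 
any attached patch, and assuming a started YarnClient and a known appId are in 
scope): the unmanaged-AM flag is settable on ApplicationSubmissionContext at 
submission time, and the request here is to surface the same information on 
ApplicationReport, e.g. via a getter like the hypothetical isUnmanagedApp() 
below.
{code}
// Hypothetical sketch of the requested API. setUnmanagedAM() exists on
// ApplicationSubmissionContext today; the isUnmanagedApp() accessor on
// ApplicationReport is assumed here for illustration only.
ApplicationSubmissionContext ctx =
    Records.newRecord(ApplicationSubmissionContext.class);
ctx.setUnmanagedAM(true);                        // known only at submission time

ApplicationReport report = yarnClient.getApplicationReport(appId);
boolean amManaged = !report.isUnmanagedApp();    // hypothetical getter
{code}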



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3690) 'mvn site' fails on JDK8

2015-05-20 Thread Akira AJISAKA (JIRA)
Akira AJISAKA created YARN-3690:
---

 Summary: 'mvn site' fails on JDK8
 Key: YARN-3690
 URL: https://issues.apache.org/jira/browse/YARN-3690
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
 Environment: CentOS 7.0, Oracle JDK 8u45.
Reporter: Akira AJISAKA


{noformat}
[ERROR] 
/home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18:
 error: package org.apache.hadoop.yarn.factories has already been annotated
[ERROR] @InterfaceAudience.LimitedPrivate({"MapReduce", "YARN"})
[ERROR] ^
[ERROR] java.lang.AssertionError
[ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126)
[ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45)
[ERROR] at 
com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161)
[ERROR] at 
com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215)
[ERROR] at 
com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952)
[ERROR] at com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64)
[ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876)
[ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143)
[ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129)
[ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512)
[ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471)
[ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78)
[ERROR] at 
com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186)
[ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346)
[ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219)
[ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205)
[ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64)
[ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54)
[ERROR] javadoc: error - fatal error
[ERROR] 
[ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc -J-Xmx1024m 
@options @packages
[ERROR] 
[ERROR] Refer to the generated Javadoc files in 
'/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir.
[ERROR] - [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: YARN-3344-trunk.004.patch

Updated the patch with the formatting issue fixed.

 procfs stat file is not in the expected format warning
 --

 Key: YARN-3344
 URL: https://issues.apache.org/jira/browse/YARN-3344
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jon Bringhurst
Assignee: Ravindra Kumar Naik
 Attachments: YARN-3344-branch-trunk.001.patch, 
 YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch, 
 YARN-3344-trunk.004.patch


 Although this doesn't appear to be causing any functional issues, it is 
 spamming our log files quite a bit. :)
 It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
 /proc/pid/stat files.
 Here's the error I'm seeing:
 {noformat}
 source_host: asdf,
 method: constructProcessInfo,
 level: WARN,
 message: Unexpected: procfs stat file is not in the expected format 
 for process with pid 6953
 file: ProcfsBasedProcessTree.java,
 line_number: 514,
 class: org.apache.hadoop.yarn.util.ProcfsBasedProcessTree,
 {noformat}
 And here's the basic info on process with pid 6953:
 {noformat}
 [asdf ~]$ cat /proc/6953/stat
 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
 2 18446744073709551615 0 0 17 13 0 0 0 0 0
 [asdf ~]$ ps aux|grep 6953
 root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
 /export/apps/salt/minion-scripts/module-sync.py
 jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
 [asdf ~]$ 
 {noformat}
 This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.
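
The sample above suggests why the regex trips: the comm field, 
"(python2.6 /expo)", contains a space inside the parentheses. A minimal parsing 
sketch that sidesteps this (not necessarily the approach taken in the attached 
patches) is to split at the last ')':
{code}
// Minimal sketch, not necessarily what the attached patches do: the comm
// field of /proc/<pid>/stat is parenthesised and may itself contain spaces,
// so split on the last ')' instead of using a purely space-delimited regex.
String stat = "6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364";
int open = stat.indexOf('(');
int close = stat.lastIndexOf(')');
String pid = stat.substring(0, open).trim();             // "6953"
String comm = stat.substring(open + 1, close);           // "python2.6 /expo"
String[] rest = stat.substring(close + 1).trim().split("\\s+");
String state = rest[0];                                  // "S"
String ppid = rest[1];                                   // "1871"
{code}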



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3690) 'mvn site' fails on JDK8

2015-05-20 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552145#comment-14552145
 ] 

Akira AJISAKA commented on YARN-3690:
-

The problem is: 
* There are two package-info.java files for org.apache.hadoop.yarn.factories: 
one is in hadoop-yarn-common and the other is in hadoop-yarn-api.
* Both package-info.java files carry the package annotation (see the sketch 
below).
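
For reference, the conflict looks roughly like this (a sketch, not the exact 
file contents): two package-info.java files declare an annotation for the same 
package, and JDK8's javadoc asserts when it sees the package annotated twice.
{code}
// hadoop-yarn-api: .../org/apache/hadoop/yarn/factories/package-info.java
@InterfaceAudience.LimitedPrivate({"MapReduce", "YARN"})
package org.apache.hadoop.yarn.factories;
import org.apache.hadoop.classification.InterfaceAudience;

// hadoop-yarn-common: a second package-info.java for the *same* package,
// also annotated -- JDK8 javadoc fails because the package is annotated twice.
@InterfaceAudience.LimitedPrivate({"MapReduce", "YARN"})
package org.apache.hadoop.yarn.factories;
import org.apache.hadoop.classification.InterfaceAudience;
{code}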

 'mvn site' fails on JDK8
 

 Key: YARN-3690
 URL: https://issues.apache.org/jira/browse/YARN-3690
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
 Environment: CentOS 7.0, Oracle JDK 8u45.
Reporter: Akira AJISAKA
Assignee: Brahma Reddy Battula

 'mvn site' failed by the following error:
 {noformat}
 [ERROR] 
 /home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18:
  error: package org.apache.hadoop.yarn.factories has already been annotated
 [ERROR] @InterfaceAudience.LimitedPrivate({"MapReduce", "YARN"})
 [ERROR] ^
 [ERROR] java.lang.AssertionError
 [ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126)
 [ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45)
 [ERROR] at 
 com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161)
 [ERROR] at 
 com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215)
 [ERROR] at 
 com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952)
 [ERROR] at 
 com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64)
 [ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876)
 [ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143)
 [ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129)
 [ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512)
 [ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471)
 [ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78)
 [ERROR] at 
 com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186)
 [ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346)
 [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219)
 [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205)
 [ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64)
 [ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54)
 [ERROR] javadoc: error - fatal error
 [ERROR] 
 [ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc 
 -J-Xmx1024m @options @packages
 [ERROR] 
 [ERROR] Refer to the generated Javadoc files in 
 '/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir.
 [ERROR] - [Help 1]
 [ERROR] 
 [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
 switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR] 
 [ERROR] For more information about the errors and possible solutions, please 
 read the following articles:
 [ERROR] [Help 1] 
 http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552195#comment-14552195
 ] 

Hadoop QA commented on YARN-3344:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m  1s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 47s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 46s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 57s | The applied patch generated  2 
new checkstyle issues (total was 43, now 42). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 25s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   2m  4s | Tests passed in 
hadoop-yarn-common. |
| | |  39m 39s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734110/YARN-3344-trunk.004.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4aa730c |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8022/artifact/patchprocess/diffcheckstylehadoop-yarn-common.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8022/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8022/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8022/console |


This message was automatically generated.

 procfs stat file is not in the expected format warning
 --

 Key: YARN-3344
 URL: https://issues.apache.org/jira/browse/YARN-3344
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jon Bringhurst
Assignee: Ravindra Kumar Naik
 Attachments: YARN-3344-branch-trunk.001.patch, 
 YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch, 
 YARN-3344-trunk.004.patch


 Although this doesn't appear to be causing any functional issues, it is 
 spamming our log files quite a bit. :)
 It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
 /proc/pid/stat files.
 Here's the error I'm seeing:
 {noformat}
 source_host: asdf,
 method: constructProcessInfo,
 level: WARN,
 message: Unexpected: procfs stat file is not in the expected format 
 for process with pid 6953
 file: ProcfsBasedProcessTree.java,
 line_number: 514,
 class: org.apache.hadoop.yarn.util.ProcfsBasedProcessTree,
 {noformat}
 And here's the basic info on process with pid 6953:
 {noformat}
 [asdf ~]$ cat /proc/6953/stat
 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
 2 18446744073709551615 0 0 17 13 0 0 0 0 0
 [asdf ~]$ ps aux|grep 6953
 root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
 /export/apps/salt/minion-scripts/module-sync.py
 jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
 [asdf ~]$ 
 {noformat}
 This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3543) ApplicationReport should be able to tell whether the Application is AM managed or not.

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552125#comment-14552125
 ] 

Hadoop QA commented on YARN-3543:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 47s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 14 new or modified test files. |
| {color:green}+1{color} | javac |   7m 35s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 36s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 16s | The applied patch generated  1 
new checkstyle issues (total was 14, now 14). |
| {color:green}+1{color} | whitespace |   0m 11s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 39s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   7m  9s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | mapreduce tests | 116m 37s | Tests failed in 
hadoop-mapreduce-client-jobclient. |
| {color:green}+1{color} | yarn tests |   0m 26s | Tests passed in 
hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |   6m 37s | Tests failed in 
hadoop-yarn-client. |
| {color:green}+1{color} | yarn tests |   2m  3s | Tests passed in 
hadoop-yarn-common. |
| {color:red}-1{color} | yarn tests |   0m 19s | Tests failed in 
hadoop-yarn-server-applicationhistoryservice. |
| {color:green}+1{color} | yarn tests |   0m 28s | Tests passed in 
hadoop-yarn-server-common. |
| {color:red}-1{color} | yarn tests |   0m 22s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 171m 57s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.client.api.impl.TestAHSClient |
|   | hadoop.yarn.client.TestApplicationClientProtocolOnHA |
|   | hadoop.yarn.client.cli.TestYarnCLI |
|   | hadoop.yarn.client.api.impl.TestYarnClient |
| Timed out tests | org.apache.hadoop.mapreduce.TestMRJobClient |
|   | org.apache.hadoop.mapreduce.TestMapReduceLazyOutput |
| Failed build | hadoop-yarn-server-applicationhistoryservice |
|   | hadoop-yarn-server-resourcemanager |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734085/0004-YARN-3543.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / ce53c8e |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| hadoop-mapreduce-client-jobclient test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-applicationhistoryservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/console |


This message was automatically generated.

 ApplicationReport should be able to tell whether the Application is AM 
 managed or not. 
 ---

 Key: YARN-3543
 URL: https://issues.apache.org/jira/browse/YARN-3543
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.6.0
Reporter: 

[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552186#comment-14552186
 ] 

Hudson commented on YARN-2821:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #202 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/202/])
YARN-2821. Fixed a problem that DistributedShell AM may hang if restarted. 
Contributed by Varun Vasudev (jianhe: rev 
7438966586f1896ab3e8b067d47a4af28a894106)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDSAppMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* hadoop-yarn-project/CHANGES.txt


 Distributed shell app master becomes unresponsive sometimes
 ---

 Key: YARN-2821
 URL: https://issues.apache.org/jira/browse/YARN-2821
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Affects Versions: 2.5.1
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Fix For: 2.8.0

 Attachments: YARN-2821.002.patch, YARN-2821.003.patch, 
 YARN-2821.004.patch, YARN-2821.005.patch, apache-yarn-2821.0.patch, 
 apache-yarn-2821.1.patch


 We've noticed that once in a while the distributed shell app master becomes 
 unresponsive and is eventually killed by the RM. snippet of the logs -
 {noformat}
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
 appattempt_1415123350094_0017_01 received 0 previous attempts' running 
 containers on AM registration.
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez2:45454
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_02, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
 container launch container for 
 containerid=container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 START_CONTAINER for Container container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
 onprem-tez2:45454
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 QUERY_CONTAINER for Container container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
 onprem-tez2:45454
 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez3:45454
 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez4:45454
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=3
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_03, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_04, 
 containerNode=onprem-tez3:45454, containerNodeURI=onprem-tez3:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_05, 
 containerNode=onprem-tez4:45454, containerNodeURI=onprem-tez4:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 

[jira] [Commented] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552190#comment-14552190
 ] 

Hudson commented on YARN-3302:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #202 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/202/])
YARN-3302. TestDockerContainerExecutor should run automatically if it can 
detect docker in the usual place (Ravindra Kumar Naik via raviprak) (raviprak: 
rev c97f32e7b9d9e1d4c80682cc01741579166174d1)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java
* hadoop-yarn-project/CHANGES.txt


 TestDockerContainerExecutor should run automatically if it can detect docker 
 in the usual place
 ---

 Key: YARN-3302
 URL: https://issues.apache.org/jira/browse/YARN-3302
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.6.0
Reporter: Ravi Prakash
Assignee: Ravindra Kumar Naik
 Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch, 
 YARN-3302-trunk.003.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552192#comment-14552192
 ] 

Hudson commented on YARN-3677:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #202 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/202/])
YARN-3677. Fix findbugs warnings in yarn-server-resourcemanager. Contributed by 
Vinod Kumar Vavilapalli. (ozawa: rev 7401e5b5e8060b6b027d714b5ceb641fcfe5b598)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java


 Fix findbugs warnings in yarn-server-resourcemanager
 

 Key: YARN-3677
 URL: https://issues.apache.org/jira/browse/YARN-3677
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Akira AJISAKA
Assignee: Vinod Kumar Vavilapalli
Priority: Minor
  Labels: newbie
 Fix For: 2.7.1

 Attachments: YARN-3677-20150519.txt


 There is 1 findbugs warning in FileSystemRMStateStore.java.
 {noformat}
 Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of 
 time
 Unsynchronized access at FileSystemRMStateStore.java: [line 156]
 Field 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
 Synchronized 66% of the time
 Synchronized access at FileSystemRMStateStore.java: [line 148]
 Synchronized access at FileSystemRMStateStore.java: [line 859]
 {noformat}
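
In miniature, the pattern findbugs complains about looks like this (field and 
method names below are illustrative, not the actual FileSystemRMStateStore 
code): a field written under synchronization in one place but read without it 
elsewhere. Synchronizing every access, or making the field volatile, clears the 
warning.
{code}
// Illustrative only -- not the actual FileSystemRMStateStore code.
class StoreLike {
  private boolean isHDFS;                // "locked 66% of the time"

  synchronized void init(boolean onHdfs) {
    isHDFS = onHdfs;                     // synchronized write
  }

  boolean useHdfsSemantics() {
    return isHDFS;                       // unsynchronized read -> findbugs warning
  }

  // One possible fix: synchronize every access (or declare the field volatile).
  synchronized boolean useHdfsSemanticsFixed() {
    return isHDFS;
  }
}
{code}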



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552185#comment-14552185
 ] 

Hudson commented on YARN-3565:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #202 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/202/])
YARN-3565. NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel 
object instead of String. (Naganarasimha G R via wangda) (wangda: rev 
b37da52a1c4fb3da2bd21bfadc5ec61c5f953a59)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/NodeLabelsProvider.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/TestYarnServerApiClasses.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdaterForLabels.java


 NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object 
 instead of String
 -

 Key: YARN-3565
 URL: https://issues.apache.org/jira/browse/YARN-3565
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
Priority: Blocker
 Fix For: 2.8.0

 Attachments: YARN-3565-20150502-1.patch, YARN-3565.20150515-1.patch, 
 YARN-3565.20150516-1.patch, YARN-3565.20150519-1.patch


 Now the NM heartbeat/register requests use Set<String>; it will be hard to add 
 new fields if we want to support specifying NodeLabel attributes such as 
 exclusivity/constraints, etc. We need to make sure rolling upgrade works.
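
To illustrate why an object is easier to evolve than a bare string (a sketch 
only, with the usual java.util imports assumed; the heartbeat/register plumbing 
that this JIRA actually changes is not shown), compare carrying just a label 
name with carrying a NodeLabel record that can also hold exclusivity:
{code}
// Sketch: a Set<String> can only carry label names, while a Set<NodeLabel>
// can also carry attributes such as exclusivity. NodeLabel.newInstance is the
// public record factory; the NM heartbeat/register wiring is not shown here.
Set<String> oldStyle = new HashSet<>();
oldStyle.add("GPU");                                  // name only

Set<NodeLabel> newStyle = new HashSet<>();
newStyle.add(NodeLabel.newInstance("GPU", false));    // name + exclusivity
{code}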



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552180#comment-14552180
 ] 

Hudson commented on YARN-3583:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #202 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/202/])
YARN-3583. Support of NodeLabel object instead of plain String in YarnClient 
side. (Sunil G via wangda) (wangda: rev 
563eb1ad2ae848a23bbbf32ebfaf107e8fa14e87)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetNodesToLabelsResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/ReplaceLabelsOnNodeRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetLabelsToNodesResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetLabelsToNodesResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetNodesToLabelsResponse.java


 Support of NodeLabel object instead of plain String in YarnClient side.
 ---

 Key: YARN-3583
 URL: https://issues.apache.org/jira/browse/YARN-3583
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client
Affects Versions: 2.6.0
Reporter: Sunil G
Assignee: Sunil G
 Fix For: 2.8.0

 Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
 0003-YARN-3583.patch, 0004-YARN-3583.patch


 Similar to YARN-3521, use NodeLabel objects in YarnClient side apis.
 getLabelsToNodes/getNodeToLabels api's can use NodeLabel object instead of 
 using plain label name.
 This will help to bring other label details such as Exclusivity to client 
 side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552058#comment-14552058
 ] 

Hadoop QA commented on YARN-2336:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 39s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 31s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 34s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   2m 57s | Site still builds. |
| {color:red}-1{color} | checkstyle |   0m 45s | The applied patch generated  1 
new checkstyle issues (total was 8, now 8). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 14s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  49m 57s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  92m 12s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734018/YARN-2336.009.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / ce53c8e |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8020/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8020/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8020/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8020/console |


This message was automatically generated.

 Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
 --

 Key: YARN-2336
 URL: https://issues.apache.org/jira/browse/YARN-2336
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1, 2.6.0
Reporter: Kenji Kikushima
Assignee: Akira AJISAKA
  Labels: BB2015-05-RFC
 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, 
 YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, 
 YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch


 When we have sub-queues in the Fair Scheduler, the REST API returns JSON with 
 a missing '[' bracket for childQueues.
 This issue was found by [~ajisakaa] in YARN-1050.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3543) ApplicationReport should be able to tell whether the Application is AM managed or not.

2015-05-20 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-3543:
-
Attachment: 0004-YARN-3543.patch

Attached the same patch to kick off Jenkins.

 ApplicationReport should be able to tell whether the Application is AM 
 managed or not. 
 ---

 Key: YARN-3543
 URL: https://issues.apache.org/jira/browse/YARN-3543
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: api
Affects Versions: 2.6.0
Reporter: Spandan Dutta
Assignee: Rohith
  Labels: BB2015-05-TBR
 Attachments: 0001-YARN-3543.patch, 0001-YARN-3543.patch, 
 0002-YARN-3543.patch, 0002-YARN-3543.patch, 0003-YARN-3543.patch, 
 0004-YARN-3543.patch, 0004-YARN-3543.patch, YARN-3543-AH.PNG, YARN-3543-RM.PNG


 Currently we can know whether the application submitted by the user is AM 
 managed from the applicationSubmissionContext, but only at the time the user 
 submits the job. We should have access to this info from the ApplicationReport 
 as well, so that we can check whether an app is AM managed at any time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-05-20 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551962#comment-14551962
 ] 

Akira AJISAKA commented on YARN-2336:
-

The test failure looks unrelated to the patch. Kicked 
https://builds.apache.org/job/PreCommit-YARN-Build/8020/

 Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
 --

 Key: YARN-2336
 URL: https://issues.apache.org/jira/browse/YARN-2336
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Affects Versions: 2.4.1, 2.6.0
Reporter: Kenji Kikushima
Assignee: Akira AJISAKA
  Labels: BB2015-05-RFC
 Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, 
 YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, 
 YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch


 When we have sub-queues in the Fair Scheduler, the REST API returns JSON with 
 a missing '[' bracket for childQueues.
 This issue was found by [~ajisakaa] in YARN-1050.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551976#comment-14551976
 ] 

Hadoop QA commented on YARN-3591:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 42s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 20s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m  2s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m 28s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  42m 15s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734083/YARN-3591.4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / ce53c8e |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8019/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8019/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8019/console |


This message was automatically generated.

 Resource Localisation on a bad disk causes subsequent containers failure 
 -

 Key: YARN-3591
 URL: https://issues.apache.org/jira/browse/YARN-3591
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Lavkesh Lahngir
Assignee: Lavkesh Lahngir
 Attachments: 0001-YARN-3591.1.patch, 0001-YARN-3591.patch, 
 YARN-3591.2.patch, YARN-3591.3.patch, YARN-3591.4.patch


 It happens when a resource has been localised on a disk and, after 
 localisation, that disk goes bad. The NM keeps the paths of localised 
 resources in memory. At resource-request time, isResourcePresent(rsrc) is 
 called, which calls file.exists() on the localised path.
 In some cases, when the disk has gone bad, inodes are still cached and 
 file.exists() returns true, but at read time the file will not open.
 Note: file.exists() actually calls stat64 natively, which returns true because 
 it can still find the inode information from the OS.
 The proposal is to call file.list() on the parent path of the resource, which 
 calls open() natively. If the disk is good, it should return an array of paths 
 with length at least 1.
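
A minimal sketch of the proposed check, following the isResourcePresent-style 
method named in the description (the surrounding NM localization code is not 
shown):
{code}
// Sketch of the proposal in the description: file.exists() can return true
// from cached inode data on a failed disk, whereas listing the parent
// directory forces a native open() and fails when the disk is bad.
private boolean isResourcePresent(File localizedPath) {
  File parent = localizedPath.getParentFile();
  String[] children = (parent == null) ? null : parent.list();
  // A healthy disk returns at least the localized resource itself.
  return children != null && children.length >= 1 && localizedPath.exists();
}
{code}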



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever

2015-05-20 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552091#comment-14552091
 ] 

Rohith commented on YARN-3646:
--

Thanks for updating the patch. Some comments on the tests:
# I think we can remove the tests added in the hadoop-common project, since the 
yarn-client test verifies the required functionality. Basically, the 
hadoop-common test was mocking the RMProxy functionality, so it passed even 
without the RMProxy fix.
# The code never reaches {{Assert.fail();}}; better to remove it.
# Catch ApplicationNotFoundException instead of catching Throwable. I think you 
can add {{expected = ApplicationNotFoundException.class}} to the @Test 
annotation, like below:
{code}
@Test(timeout = 3, expected = ApplicationNotFoundException.class)
public void testClientWithRetryPolicyForEver() throws Exception {
  YarnConfiguration conf = new YarnConfiguration();
  conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1);

  ResourceManager rm = null;
  YarnClient yarnClient = null;
  try {
    // start rm
    rm = new ResourceManager();
    rm.init(conf);
    rm.start();

    yarnClient = YarnClient.createYarnClient();
    yarnClient.init(conf);
    yarnClient.start();

    // create invalid application id
    ApplicationId appId = ApplicationId.newInstance(1430126768987L, 10645);

    // RM should throw ApplicationNotFoundException
    yarnClient.getApplicationReport(appId);
  } finally {
    if (yarnClient != null) {
      yarnClient.stop();
    }
    if (rm != null) {
      rm.stop();
    }
  }
}
{code}
# Can you rename the test to reflect the actual functionality under test, for 
example {{testShouldNotRetryForeverForNonNetworkExceptions}}?

 Applications are getting stuck some times in case of retry policy forever
 -

 Key: YARN-3646
 URL: https://issues.apache.org/jira/browse/YARN-3646
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Reporter: Raju Bairishetti
 Attachments: YARN-3646.001.patch, YARN-3646.patch


 We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use the FOREVER 
 retry policy.
 The YARN client retries infinitely on exceptions from the RM because its retry 
 policy is FOREVER. The problem is that it retries for all kinds of exceptions 
 (such as ApplicationNotFoundException), even when the failure is not a 
 connection failure. Because of this, the application does not progress.
 *The YARN client should not retry infinitely for non-connection failures.*
 We have written a simple yarn-client that requests an application report for 
 an invalid or old appId. The ResourceManager throws an 
 ApplicationNotFoundException because the appId is invalid or old, but because 
 of the FOREVER retry policy the client keeps retrying the request and the 
 ResourceManager keeps throwing ApplicationNotFoundException.
 {code}
 private void testYarnClientRetryPolicy() throws Exception {
   YarnConfiguration conf = new YarnConfiguration();
   conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1);
   YarnClient yarnClient = YarnClient.createYarnClient();
   yarnClient.init(conf);
   yarnClient.start();
   ApplicationId appId = ApplicationId.newInstance(1430126768987L, 10645);
   ApplicationReport report = yarnClient.getApplicationReport(appId);
 }
 {code}
 *RM logs:*
 {noformat}
 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call 
 org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
 from 10.14.120.231:61621 Call#875162 Retry#0
 org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
 with id 'application_1430126768987_10645' doesn't exist in RM.
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284)
   at 
 org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
   at 
 org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
   at 
 org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 

[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: (was: YARN-3344-branch-trunk.001.patch)

 procfs stat file is not in the expected format warning
 --

 Key: YARN-3344
 URL: https://issues.apache.org/jira/browse/YARN-3344
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jon Bringhurst
Assignee: Ravindra Kumar Naik
 Attachments: YARN-3344-branch-trunk.003.patch, 
 YARN-3344-trunk.004.patch


 Although this doesn't appear to be causing any functional issues, it is 
 spamming our log files quite a bit. :)
 It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
 /proc/pid/stat files.
 Here's the error I'm seeing:
 {noformat}
 source_host: asdf,
 method: constructProcessInfo,
 level: WARN,
 message: Unexpected: procfs stat file is not in the expected format 
 for process with pid 6953
 file: ProcfsBasedProcessTree.java,
 line_number: 514,
 class: org.apache.hadoop.yarn.util.ProcfsBasedProcessTree,
 {noformat}
 And here's the basic info on process with pid 6953:
 {noformat}
 [asdf ~]$ cat /proc/6953/stat
 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
 2 18446744073709551615 0 0 17 13 0 0 0 0 0
 [asdf ~]$ ps aux|grep 6953
 root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
 /export/apps/salt/minion-scripts/module-sync.py
 jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
 [asdf ~]$ 
 {noformat}
 This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: YARN-3344-trunk.005.patch

Updated the patch with the checkstyle issue handled.

 procfs stat file is not in the expected format warning
 --

 Key: YARN-3344
 URL: https://issues.apache.org/jira/browse/YARN-3344
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jon Bringhurst
Assignee: Ravindra Kumar Naik
 Attachments: YARN-3344-trunk.005.patch


 Although this doesn't appear to be causing any functional issues, it is 
 spamming our log files quite a bit. :)
 It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
 /proc/pid/stat files.
 Here's the error I'm seeing:
 {noformat}
 source_host: asdf,
 method: constructProcessInfo,
 level: WARN,
 message: Unexpected: procfs stat file is not in the expected format 
 for process with pid 6953
 file: ProcfsBasedProcessTree.java,
 line_number: 514,
 class: org.apache.hadoop.yarn.util.ProcfsBasedProcessTree,
 {noformat}
 And here's the basic info on process with pid 6953:
 {noformat}
 [asdf ~]$ cat /proc/6953/stat
 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
 2 18446744073709551615 0 0 17 13 0 0 0 0 0
 [asdf ~]$ ps aux|grep 6953
 root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
 /export/apps/salt/minion-scripts/module-sync.py
 jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
 [asdf ~]$ 
 {noformat}
 This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: (was: YARN-3344-trunk.004.patch)

 procfs stat file is not in the expected format warning
 --

 Key: YARN-3344
 URL: https://issues.apache.org/jira/browse/YARN-3344
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jon Bringhurst
Assignee: Ravindra Kumar Naik

 Although this doesn't appear to be causing any functional issues, it is 
 spamming our log files quite a bit. :)
 It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
 /proc/pid/stat files.
 Here's the error I'm seeing:
 {noformat}
 source_host: asdf,
 method: constructProcessInfo,
 level: WARN,
 message: Unexpected: procfs stat file is not in the expected format 
 for process with pid 6953
 file: ProcfsBasedProcessTree.java,
 line_number: 514,
 class: org.apache.hadoop.yarn.util.ProcfsBasedProcessTree,
 {noformat}
 And here's the basic info on process with pid 6953:
 {noformat}
 [asdf ~]$ cat /proc/6953/stat
 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
 2 18446744073709551615 0 0 17 13 0 0 0 0 0
 [asdf ~]$ ps aux|grep 6953
 root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
 /export/apps/salt/minion-scripts/module-sync.py
 jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
 [asdf ~]$ 
 {noformat}
 This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: (was: YARN-3344-branch-trunk.003.patch)

 procfs stat file is not in the expected format warning
 --

 Key: YARN-3344
 URL: https://issues.apache.org/jira/browse/YARN-3344
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jon Bringhurst
Assignee: Ravindra Kumar Naik

 Although this doesn't appear to be causing any functional issues, it is 
 spamming our log files quite a bit. :)
 It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
 /proc/pid/stat files.
 Here's the error I'm seeing:
 {noformat}
 source_host: asdf,
 method: constructProcessInfo,
 level: WARN,
 message: Unexpected: procfs stat file is not in the expected format 
 for process with pid 6953
 file: ProcfsBasedProcessTree.java,
 line_number: 514,
 class: org.apache.hadoop.yarn.util.ProcfsBasedProcessTree,
 {noformat}
 And here's the basic info on process with pid 6953:
 {noformat}
 [asdf ~]$ cat /proc/6953/stat
 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
 2 18446744073709551615 0 0 17 13 0 0 0 0 0
 [asdf ~]$ ps aux|grep 6953
 root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
 /export/apps/salt/minion-scripts/module-sync.py
 jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
 [asdf ~]$ 
 {noformat}
 This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: (was: YARN-3344-branch-trunk.002.patch)

 procfs stat file is not in the expected format warning
 --

 Key: YARN-3344
 URL: https://issues.apache.org/jira/browse/YARN-3344
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Jon Bringhurst
Assignee: Ravindra Kumar Naik
 Attachments: YARN-3344-branch-trunk.003.patch, 
 YARN-3344-trunk.004.patch


 Although this doesn't appear to be causing any functional issues, it is 
 spamming our log files quite a bit. :)
 It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
 /proc/pid/stat files.
 Here's the error I'm seeing:
 {noformat}
 source_host: asdf,
 method: constructProcessInfo,
 level: WARN,
 message: Unexpected: procfs stat file is not in the expected format 
 for process with pid 6953
 file: ProcfsBasedProcessTree.java,
 line_number: 514,
 class: org.apache.hadoop.yarn.util.ProcfsBasedProcessTree,
 {noformat}
 And here's the basic info on process with pid 6953:
 {noformat}
 [asdf ~]$ cat /proc/6953/stat
 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
 2 18446744073709551615 0 0 17 13 0 0 0 0 0
 [asdf ~]$ ps aux|grep 6953
 root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
 /export/apps/salt/minion-scripts/module-sync.py
 jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
 [asdf ~]$ 
 {noformat}
 This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability

2015-05-20 Thread MENG DING (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552375#comment-14552375
 ] 

MENG DING commented on YARN-1902:
-

I have been experimenting with the idea of changing AppSchedulingInfo to 
maintain a total request table and a fulfilled allocation table, and then 
calculate the difference of the two tables as the real outstanding request 
table used for scheduling. All was fine until I realized that this cannot handle 
one use case where an AMRMClient, right before sending the allocation heartbeat, 
removes all container requests and adds new container requests at the same 
priority and location (possibly with a different resource capability).  
AppSchedulingInfo does not know about this, and may not treat the newly added 
container requests as outstanding requests.

I agree that currently I do not see a clean solution without affecting backward 
compatibility. 
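
To make the diffing idea above concrete, here is a rough sketch (hypothetical 
table types, with a String standing in for the (priority, resourceName, 
capability) key; this is not actual AppSchedulingInfo code):

{code}
import java.util.HashMap;
import java.util.Map;

public class OutstandingRequestSketch {
  // outstanding = totalRequested - fulfilled, per request key.
  static Map<String, Integer> outstanding(Map<String, Integer> totalRequested,
                                          Map<String, Integer> fulfilled) {
    Map<String, Integer> outstanding = new HashMap<>();
    for (Map.Entry<String, Integer> e : totalRequested.entrySet()) {
      int pending = e.getValue() - fulfilled.getOrDefault(e.getKey(), 0);
      if (pending > 0) {
        outstanding.put(e.getKey(), pending);
      }
    }
    // The failure case described above: if the client removes and re-adds
    // requests at the same priority/location, totalRequested may look unchanged
    // and the re-added requests would not show up as outstanding here.
    return outstanding;
  }
}
{code}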

 Allocation of too many containers when a second request is done with the same 
 resource capability
 -

 Key: YARN-1902
 URL: https://issues.apache.org/jira/browse/YARN-1902
 Project: Hadoop YARN
  Issue Type: Bug
  Components: client
Affects Versions: 2.2.0, 2.3.0, 2.4.0
Reporter: Sietse T. Au
Assignee: Sietse T. Au
  Labels: client
 Attachments: YARN-1902.patch, YARN-1902.v2.patch, YARN-1902.v3.patch


 Regarding AMRMClientImpl
 Scenario 1:
 Given a ContainerRequest x with Resource y, when addContainerRequest is 
 called z times with x, allocate is called, and at least one of the z allocated 
 containers is started, then if another addContainerRequest call is made and 
 subsequently an allocate call is sent to the RM, (z+1) containers will be 
 allocated, where 1 container is expected.
 Scenario 2:
 No containers are started between the allocate calls. 
 Analyzing debug logs of the AMRMClientImpl, I have found that (z+1) containers 
 are indeed requested in both scenarios, but that only in the second scenario 
 is the correct behavior observed.
 Looking at the implementation I have found that this (z+1) request is caused 
 by the structure of the remoteRequestsTable. The consequence of the 
 Map<Resource, ResourceRequestInfo> structure is that ResourceRequestInfo does 
 not hold any information about whether a request has been sent to the RM yet 
 or not.
 There are workarounds for this, such as releasing the excess containers 
 received.
 The solution implemented is to initialize a new ResourceRequest in 
 ResourceRequestInfo when a request has been successfully sent to the RM.
 The patch includes a test in which scenario one is tested.
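
 For illustration only, a sketch of Scenario 1 against the AMRMClient API (the 
 resource size, priority, and registration arguments are placeholders, and the 
 full AM lifecycle and error handling are omitted):
{code}
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class Yarn1902Scenario1 {
  public static void main(String[] args) throws Exception {
    AMRMClient<ContainerRequest> rm = AMRMClient.createAMRMClient();
    rm.init(new YarnConfiguration());
    rm.start();
    rm.registerApplicationMaster("", 0, "");

    Resource capability = Resource.newInstance(1024, 1);  // "Resource y"
    Priority priority = Priority.newInstance(0);
    int z = 3;
    for (int i = 0; i < z; i++) {  // addContainerRequest called z times with x
      rm.addContainerRequest(new ContainerRequest(capability, null, null, priority));
    }
    AllocateResponse first = rm.allocate(0f);  // containers get allocated and started

    // One more request with the same capability and priority, then another allocate:
    rm.addContainerRequest(new ContainerRequest(capability, null, null, priority));
    AllocateResponse second = rm.allocate(0f);
    // Per this issue, (z+1) containers can be allocated here where only 1 is
    // expected, because the re-sent ResourceRequest still carries the old total.
  }
}
{code}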



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552557#comment-14552557
 ] 

Hudson commented on YARN-3302:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #201 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/201/])
YARN-3302. TestDockerContainerExecutor should run automatically if it can 
detect docker in the usual place (Ravindra Kumar Naik via raviprak) (raviprak: 
rev c97f32e7b9d9e1d4c80682cc01741579166174d1)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java


 TestDockerContainerExecutor should run automatically if it can detect docker 
 in the usual place
 ---

 Key: YARN-3302
 URL: https://issues.apache.org/jira/browse/YARN-3302
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.6.0
Reporter: Ravi Prakash
Assignee: Ravindra Kumar Naik
 Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch, 
 YARN-3302-trunk.003.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552559#comment-14552559
 ] 

Hudson commented on YARN-3677:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #201 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/201/])
YARN-3677. Fix findbugs warnings in yarn-server-resourcemanager. Contributed by 
Vinod Kumar Vavilapalli. (ozawa: rev 7401e5b5e8060b6b027d714b5ceb641fcfe5b598)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* hadoop-yarn-project/CHANGES.txt


 Fix findbugs warnings in yarn-server-resourcemanager
 

 Key: YARN-3677
 URL: https://issues.apache.org/jira/browse/YARN-3677
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Akira AJISAKA
Assignee: Vinod Kumar Vavilapalli
Priority: Minor
  Labels: newbie
 Fix For: 2.7.1

 Attachments: YARN-3677-20150519.txt


 There is 1 findbugs warning in FileSystemRMStateStore.java.
 {noformat}
 Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of 
 time
 Unsynchronized access at FileSystemRMStateStore.java: [line 156]
 Field 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
 Synchronized 66% of the time
 Synchronized access at FileSystemRMStateStore.java: [line 148]
 Synchronized access at FileSystemRMStateStore.java: [line 859]
 {noformat}
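
 As a generic illustration of what this class of warning means (this is not 
 the actual YARN-3677 patch): findbugs flags a field that is accessed both 
 with and without the object lock; making every access synchronized, or 
 declaring a simple flag volatile, are the usual remedies.
{code}
public class SyncFlagSketch {
  // Hypothetical flag, similar in spirit to FileSystemRMStateStore.isHDFS.
  // Declaring it volatile makes the lock-free reads well-defined and avoids
  // the "inconsistent synchronization" report.
  private volatile boolean isHdfs;

  public synchronized void init(boolean onHdfs) {
    this.isHdfs = onHdfs;   // write happens under the lock during initialization
  }

  public boolean isHdfs() {
    return isHdfs;          // unsynchronized read of a volatile field
  }
}
{code}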



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552548#comment-14552548
 ] 

Hudson commented on YARN-3583:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #201 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/201/])
YARN-3583. Support of NodeLabel object instead of plain String in YarnClient 
side. (Sunil G via wangda) (wangda: rev 
563eb1ad2ae848a23bbbf32ebfaf107e8fa14e87)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetNodesToLabelsResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetLabelsToNodesResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetLabelsToNodesResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetNodesToLabelsResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/ReplaceLabelsOnNodeRequestPBImpl.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java


 Support of NodeLabel object instead of plain String in YarnClient side.
 ---

 Key: YARN-3583
 URL: https://issues.apache.org/jira/browse/YARN-3583
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client
Affects Versions: 2.6.0
Reporter: Sunil G
Assignee: Sunil G
 Fix For: 2.8.0

 Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
 0003-YARN-3583.patch, 0004-YARN-3583.patch


 Similar to YARN-3521, use NodeLabel objects in the YarnClient side APIs.
 The getLabelsToNodes/getNodeToLabels APIs can use NodeLabel objects instead of 
 plain label names.
 This will help to bring other label details such as exclusivity to the client 
 side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552553#comment-14552553
 ] 

Hudson commented on YARN-3565:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #201 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/201/])
YARN-3565. NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel 
object instead of String. (Naganarasimha G R via wangda) (wangda: rev 
b37da52a1c4fb3da2bd21bfadc5ec61c5f953a59)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdaterForLabels.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/NodeLabelsProvider.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/TestYarnServerApiClasses.java


 NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object 
 instead of String
 -

 Key: YARN-3565
 URL: https://issues.apache.org/jira/browse/YARN-3565
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
Priority: Blocker
 Fix For: 2.8.0

 Attachments: YARN-3565-20150502-1.patch, YARN-3565.20150515-1.patch, 
 YARN-3565.20150516-1.patch, YARN-3565.20150519-1.patch


 Now NM HB/Register uses Set<String>, it will be hard to add new fields if we 
 want to support specifying NodeLabel type such as exclusivity/constraints, 
 etc. We need to make sure rolling upgrade works.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552596#comment-14552596
 ] 

Hudson commented on YARN-3565:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2149 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2149/])
YARN-3565. NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel 
object instead of String. (Naganarasimha G R via wangda) (wangda: rev 
b37da52a1c4fb3da2bd21bfadc5ec61c5f953a59)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdaterForLabels.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/NodeLabelsProvider.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/TestYarnServerApiClasses.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java


 NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object 
 instead of String
 -

 Key: YARN-3565
 URL: https://issues.apache.org/jira/browse/YARN-3565
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Naganarasimha G R
Priority: Blocker
 Fix For: 2.8.0

 Attachments: YARN-3565-20150502-1.patch, YARN-3565.20150515-1.patch, 
 YARN-3565.20150516-1.patch, YARN-3565.20150519-1.patch


 Now NM HB/Register uses Set<String>, it will be hard to add new fields if we 
 want to support specifying NodeLabel type such as exclusivity/constraints, 
 etc. We need to make sure rolling upgrade works.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552597#comment-14552597
 ] 

Hudson commented on YARN-2821:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2149 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2149/])
YARN-2821. Fixed a problem that DistributedShell AM may hang if restarted. 
Contributed by Varun Vasudev (jianhe: rev 
7438966586f1896ab3e8b067d47a4af28a894106)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDSAppMaster.java


 Distributed shell app master becomes unresponsive sometimes
 ---

 Key: YARN-2821
 URL: https://issues.apache.org/jira/browse/YARN-2821
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Affects Versions: 2.5.1
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Fix For: 2.8.0

 Attachments: YARN-2821.002.patch, YARN-2821.003.patch, 
 YARN-2821.004.patch, YARN-2821.005.patch, apache-yarn-2821.0.patch, 
 apache-yarn-2821.1.patch


 We've noticed that once in a while the distributed shell app master becomes 
 unresponsive and is eventually killed by the RM. A snippet of the logs:
 {noformat}
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
 appattempt_1415123350094_0017_01 received 0 previous attempts' running 
 containers on AM registration.
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez2:45454
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_02, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
 container launch container for 
 containerid=container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 START_CONTAINER for Container container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
 onprem-tez2:45454
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 QUERY_CONTAINER for Container container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
 onprem-tez2:45454
 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez3:45454
 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez4:45454
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=3
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_03, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_04, 
 containerNode=onprem-tez3:45454, containerNodeURI=onprem-tez3:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_05, 
 containerNode=onprem-tez4:45454, containerNodeURI=onprem-tez4:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 

[jira] [Commented] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552601#comment-14552601
 ] 

Hudson commented on YARN-3302:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2149 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2149/])
YARN-3302. TestDockerContainerExecutor should run automatically if it can 
detect docker in the usual place (Ravindra Kumar Naik via raviprak) (raviprak: 
rev c97f32e7b9d9e1d4c80682cc01741579166174d1)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java
* hadoop-yarn-project/CHANGES.txt


 TestDockerContainerExecutor should run automatically if it can detect docker 
 in the usual place
 ---

 Key: YARN-3302
 URL: https://issues.apache.org/jira/browse/YARN-3302
 Project: Hadoop YARN
  Issue Type: Sub-task
Affects Versions: 2.6.0
Reporter: Ravi Prakash
Assignee: Ravindra Kumar Naik
 Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch, 
 YARN-3302-trunk.003.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552591#comment-14552591
 ] 

Hudson commented on YARN-3583:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2149 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2149/])
YARN-3583. Support of NodeLabel object instead of plain String in YarnClient 
side. (Sunil G via wangda) (wangda: rev 
563eb1ad2ae848a23bbbf32ebfaf107e8fa14e87)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetLabelsToNodesResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetNodesToLabelsResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetLabelsToNodesResponse.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/ReplaceLabelsOnNodeRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetNodesToLabelsResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto


 Support of NodeLabel object instead of plain String in YarnClient side.
 ---

 Key: YARN-3583
 URL: https://issues.apache.org/jira/browse/YARN-3583
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: client
Affects Versions: 2.6.0
Reporter: Sunil G
Assignee: Sunil G
 Fix For: 2.8.0

 Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
 0003-YARN-3583.patch, 0004-YARN-3583.patch


 Similar to YARN-3521, use NodeLabel objects in the YarnClient side APIs.
 The getLabelsToNodes/getNodeToLabels APIs can use NodeLabel objects instead of 
 plain label names.
 This will help to bring other label details such as exclusivity to the client 
 side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552603#comment-14552603
 ] 

Hudson commented on YARN-3677:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2149 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2149/])
YARN-3677. Fix findbugs warnings in yarn-server-resourcemanager. Contributed by 
Vinod Kumar Vavilapalli. (ozawa: rev 7401e5b5e8060b6b027d714b5ceb641fcfe5b598)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java


 Fix findbugs warnings in yarn-server-resourcemanager
 

 Key: YARN-3677
 URL: https://issues.apache.org/jira/browse/YARN-3677
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Reporter: Akira AJISAKA
Assignee: Vinod Kumar Vavilapalli
Priority: Minor
  Labels: newbie
 Fix For: 2.7.1

 Attachments: YARN-3677-20150519.txt


 There is 1 findbugs warning in FileSystemRMStateStore.java.
 {noformat}
 Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of 
 time
 Unsynchronized access at FileSystemRMStateStore.java: [line 156]
 Field 
 org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
 Synchronized 66% of the time
 Synchronized access at FileSystemRMStateStore.java: [line 148]
 Synchronized access at FileSystemRMStateStore.java: [line 859]
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3626) On Windows localized resources are not moved to the front of the classpath when they should be

2015-05-20 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552686#comment-14552686
 ] 

Craig Welch commented on YARN-3626:
---

Checkstyle looks insignificant.

[~cnauroth], [~vinodkv], I've changed the approach to use the environment 
instead of configuration as suggested; can one of you review, please?

 On Windows localized resources are not moved to the front of the classpath 
 when they should be
 --

 Key: YARN-3626
 URL: https://issues.apache.org/jira/browse/YARN-3626
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
 Environment: Windows
Reporter: Craig Welch
Assignee: Craig Welch
 Fix For: 2.7.1

 Attachments: YARN-3626.0.patch, YARN-3626.11.patch, 
 YARN-3626.14.patch, YARN-3626.4.patch, YARN-3626.6.patch, YARN-3626.9.patch


 In response to the mapreduce.job.user.classpath.first setting, the classpath 
 is ordered differently so that localized resources will appear before system 
 classpath resources when tasks execute.  On Windows this does not work 
 because the localized resources are not linked into their final location when 
 the classpath jar is created.  To compensate for that, localized jar resources 
 are added directly to the classpath generated for the jar rather than being 
 discovered from the localized directories.  Unfortunately, they are always 
 appended to the classpath, and so are never preferred over system resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552549#comment-14552549
 ] 

Hudson commented on YARN-3601:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #201 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/201/])
YARN-3601. Fix UT TestRMFailover.testRMWebAppRedirect. Contributed by Weiwei 
Yang (xgong: rev 5009ad4a7f712fc578b461ecec53f7f97eaaed0c)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java


 Fix UT TestRMFailover.testRMWebAppRedirect
 --

 Key: YARN-3601
 URL: https://issues.apache.org/jira/browse/YARN-3601
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, webapp
 Environment: Red Hat Enterprise Linux Workstation release 6.5 
 (Santiago)
Reporter: Weiwei Yang
Assignee: Weiwei Yang
Priority: Critical
  Labels: test
 Fix For: 2.7.1

 Attachments: YARN-3601.001.patch


 This test case has not been working since the commit from YARN-2605. It fails 
 with an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552592#comment-14552592
 ] 

Hudson commented on YARN-3601:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2149 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2149/])
YARN-3601. Fix UT TestRMFailover.testRMWebAppRedirect. Contributed by Weiwei 
Yang (xgong: rev 5009ad4a7f712fc578b461ecec53f7f97eaaed0c)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java
* hadoop-yarn-project/CHANGES.txt


 Fix UT TestRMFailover.testRMWebAppRedirect
 --

 Key: YARN-3601
 URL: https://issues.apache.org/jira/browse/YARN-3601
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager, webapp
 Environment: Red Hat Enterprise Linux Workstation release 6.5 
 (Santiago)
Reporter: Weiwei Yang
Assignee: Weiwei Yang
Priority: Critical
  Labels: test
 Fix For: 2.7.1

 Attachments: YARN-3601.001.patch


 This test case has not been working since the commit from YARN-2605. It fails 
 with an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-05-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552534#comment-14552534
 ] 

Varun Saxena commented on YARN-3051:


Well, I am still stuck on trying to get the attribute set via 
HttpServer2#setAttribute in WebServices class. Will update patch once that is 
done.

 [Storage abstraction] Create backing storage read interface for ATS readers
 ---

 Key: YARN-3051
 URL: https://issues.apache.org/jira/browse/YARN-3051
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Varun Saxena
 Attachments: YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, 
 YARN-3051_temp.patch


 Per design in YARN-2928, create backing storage read interface that can be 
 implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552554#comment-14552554
 ] 

Hudson commented on YARN-2821:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #201 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/201/])
YARN-2821. Fixed a problem that DistributedShell AM may hang if restarted. 
Contributed by Varun Vasudev (jianhe: rev 
7438966586f1896ab3e8b067d47a4af28a894106)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDSAppMaster.java
* hadoop-yarn-project/CHANGES.txt


 Distributed shell app master becomes unresponsive sometimes
 ---

 Key: YARN-2821
 URL: https://issues.apache.org/jira/browse/YARN-2821
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications/distributed-shell
Affects Versions: 2.5.1
Reporter: Varun Vasudev
Assignee: Varun Vasudev
 Fix For: 2.8.0

 Attachments: YARN-2821.002.patch, YARN-2821.003.patch, 
 YARN-2821.004.patch, YARN-2821.005.patch, apache-yarn-2821.0.patch, 
 apache-yarn-2821.1.patch


 We've noticed that once in a while the distributed shell app master becomes 
 unresponsive and is eventually killed by the RM. A snippet of the logs:
 {noformat}
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
 appattempt_1415123350094_0017_01 received 0 previous attempts' running 
 containers on AM registration.
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
 container ask: Capability[memory:10, vCores:1]Priority[0]
 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez2:45454
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_02, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
 container launch container for 
 containerid=container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 START_CONTAINER for Container container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
 onprem-tez2:45454
 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
 QUERY_CONTAINER for Container container_1415123350094_0017_01_02
 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
 onprem-tez2:45454
 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez3:45454
 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
 onprem-tez4:45454
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Got response from 
 RM for container ask, allocatedCnt=3
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_03, 
 containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_04, 
 containerNode=onprem-tez3:45454, containerNodeURI=onprem-tez3:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
 command on a new container., 
 containerId=container_1415123350094_0017_01_05, 
 containerNode=onprem-tez4:45454, containerNodeURI=onprem-tez4:50060, 
 containerResourceMemory1024, containerResourceVirtualCores1
 

[jira] [Commented] (YARN-3685) NodeManager unnecessarily knows about classpath-jars due to Windows limitations

2015-05-20 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552680#comment-14552680
 ] 

Chris Nauroth commented on YARN-3685:
-

[~vinodkv], thanks for the notification.  I was not aware of this design goal 
at the time of YARN-316.

Perhaps it's possible to move the classpath jar generation to the MR client or 
AM.  It's not immediately obvious to me which of those 2 choices is better.  
We'd need to change the manifest to use relative paths in the Class-Path 
attribute instead of absolute paths.  (The client and AM are not aware of the 
exact layout of the NodeManager's {{yarn.nodemanager.local-dirs}}, so the 
client can't predict the absolute paths at time of container launch.)

There is one piece of logic that I don't see how to handle though.  Some 
classpath entries are defined in terms of environment variables.  These 
environment variables are expanded at the NodeManager via the container launch 
scripts.  This was true of Linux even before YARN-316, so in that sense, YARN 
did already have some classpath logic indirectly.  Environment variables cannot 
be used inside a manifest's Class-Path, so for Windows, NodeManager expands the 
environment variables before populating Class-Path.  It would be incorrect to 
do the environment variable expansion at the MR client, because it might be 
running with different configuration than the NodeManager.  I suppose if the AM 
did the expansion, then that would work in most cases, but it creates an 
assumption that the AM container is running with configuration that matches all 
NodeManagers in the cluster.  I don't believe that assumption exists today.

If we do move classpath handling out of the NodeManager, then it would be a 
backwards-incompatible change, and so it could not be shipped in the 2.x 
release line.
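
As a rough illustration of the mechanism under discussion (not the actual 
NodeManager code), a classpath jar is just an empty jar whose manifest carries 
a Class-Path attribute; the entries have to be literal, already-expanded paths, 
which is why any environment variables must be resolved before the jar is 
written:

{code}
import java.io.FileOutputStream;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class ClasspathJarSketch {
  public static void main(String[] args) throws Exception {
    Manifest manifest = new Manifest();
    Attributes attrs = manifest.getMainAttributes();
    attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
    // Class-Path entries are space-separated paths resolved relative to the
    // jar's own location. No $VAR or %VAR% expansion happens here, so anything
    // like a HADOOP_COMMON_HOME reference has to be expanded before this point.
    attrs.put(Attributes.Name.CLASS_PATH, "lib/a.jar lib/b.jar conf/");
    try (JarOutputStream out =
             new JarOutputStream(new FileOutputStream("classpath.jar"), manifest)) {
      // No entries are needed; the manifest is the whole payload.
    }
  }
}
{code}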

 NodeManager unnecessarily knows about classpath-jars due to Windows 
 limitations
 ---

 Key: YARN-3685
 URL: https://issues.apache.org/jira/browse/YARN-3685
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Vinod Kumar Vavilapalli
Assignee: Vinod Kumar Vavilapalli

 Found this while looking at cleaning up ContainerExecutor via YARN-3648, 
 making it a sub-task.
 YARN *should not* know about classpaths. Our original design was modeled 
 around this. But when we added Windows support, due to classpath issues, we 
 ended up breaking this abstraction via YARN-316. We should clean this up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3686) CapacityScheduler should trim default_node_label_expression

2015-05-20 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-3686:
--
Attachment: 0002-YARN-3686.patch

Uploading another patch covering a negative scenario.

 CapacityScheduler should trim default_node_label_expression
 ---

 Key: YARN-3686
 URL: https://issues.apache.org/jira/browse/YARN-3686
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Sunil G
Priority: Critical
 Attachments: 0001-YARN-3686.patch, 0002-YARN-3686.patch


 We should trim default_node_label_expression for queue before using it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3467) Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI

2015-05-20 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552699#comment-14552699
 ] 

Anubhav Dhoot commented on YARN-3467:
-

Attaching the ApplicationAttempt page. It does show the number of running 
containers, but it does not show the actual overall allocated resources for the 
application attempt. 

 Expose allocatedMB, allocatedVCores, and runningContainers metrics on running 
 Applications in RM Web UI
 ---

 Key: YARN-3467
 URL: https://issues.apache.org/jira/browse/YARN-3467
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: webapp, yarn
Affects Versions: 2.5.0
Reporter: Anthony Rojas
Assignee: Anubhav Dhoot
Priority: Minor
 Attachments: ApplicationAttemptPage.png


 The YARN REST API can report on the following properties:
 *allocatedMB*: The sum of memory in MB allocated to the application's running 
 containers
 *allocatedVCores*: The sum of virtual cores allocated to the application's 
 running containers
 *runningContainers*: The number of containers currently running for the 
 application
 Currently, the RM Web UI does not report on these items (at least I couldn't 
 find any entries within the Web UI).
 It would be useful for YARN Application and Resource troubleshooting to have 
 these properties and their corresponding values exposed on the RM WebUI.
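
 As a rough illustration while the UI work is pending (the RM address below is 
 a placeholder), these fields can already be read from the RM REST API:
{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class RmAppMetricsSketch {
  public static void main(String[] args) throws Exception {
    // ws/v1/cluster/apps returns one entry per application, each carrying
    // allocatedMB, allocatedVCores and runningContainers.
    URL url = new URL(
        "http://resourcemanager.example.com:8088/ws/v1/cluster/apps?states=RUNNING");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestProperty("Accept", "application/json");
    try (BufferedReader in =
             new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line);  // inspect the three metrics in the JSON
      }
    }
  }
}
{code}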



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3681) yarn cmd says could not find main class 'queue' in windows

2015-05-20 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552700#comment-14552700
 ] 

Craig Welch commented on YARN-3681:
---

[~varun_saxena] the patch you had doesn't apply properly for me. I've uploaded 
a patch which does the same things, which does apply, and which I've had the 
opportunity to test.

@xgong, can you take a look at this one (.0.patch)?  Thanks.

 yarn cmd says could not find main class 'queue' in windows
 

 Key: YARN-3681
 URL: https://issues.apache.org/jira/browse/YARN-3681
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
 Environment: Windows Only
Reporter: Sumana Sathish
Assignee: Varun Saxena
Priority: Blocker
  Labels: windows, yarn-client
 Attachments: YARN-3681.0.patch, YARN-3681.01.patch, yarncmd.png


 Attached the screenshot of the command prompt in Windows running the yarn 
 queue command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3686) CapacityScheduler should trim default_node_label_expression

2015-05-20 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552711#comment-14552711
 ] 

Wangda Tan commented on YARN-3686:
--

[~sunilg], thanks for working on this, comments:
- I think you can try to add this to 
{{org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeNodeLabelExpressionInRequest(ResourceRequest,
 QueueInfo)}}, which needs to trim the node-label-expression as well
- Actually this is a regression: in 2.6, a queue's node label expression with 
spaces could be set up without any issue. It's better to add tests to make sure 
that 1. spaces in the resource request will be trimmed, and 2. spaces in the 
queue configuration (default-node-label-expression) will be trimmed (see the 
small sketch below).
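
For instance, a minimal sketch of the kind of trimming being asked for 
(hypothetical helper, not the actual SchedulerUtils change):

{code}
public final class NodeLabelTrimSketch {
  // " x " and "x" should behave identically, whether the expression comes from
  // a ResourceRequest or from the queue's default-node-label-expression.
  static String normalize(String labelExpression) {
    return labelExpression == null ? null : labelExpression.trim();
  }
}
{code}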

 CapacityScheduler should trim default_node_label_expression
 ---

 Key: YARN-3686
 URL: https://issues.apache.org/jira/browse/YARN-3686
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Sunil G
Priority: Critical
 Attachments: 0001-YARN-3686.patch, 0002-YARN-3686.patch


 We should trim default_node_label_expression for queue before using it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs

2015-05-20 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552669#comment-14552669
 ] 

Anubhav Dhoot commented on YARN-2005:
-

Assigning to myself as I am starting work on this. [~sunilg] let me know if 
you have made progress on this already.

 Blacklisting support for scheduling AMs
 ---

 Key: YARN-2005
 URL: https://issues.apache.org/jira/browse/YARN-2005
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 0.23.10, 2.4.0
Reporter: Jason Lowe

 It would be nice if the RM supported blacklisting a node for an AM launch 
 after the same node fails a configurable number of AM attempts.  This would 
 be similar to the blacklisting support for scheduling task attempts in the 
 MapReduce AM but for scheduling AM attempts on the RM side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-2005) Blacklisting support for scheduling AMs

2015-05-20 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot reassigned YARN-2005:
---

Assignee: Anubhav Dhoot

 Blacklisting support for scheduling AMs
 ---

 Key: YARN-2005
 URL: https://issues.apache.org/jira/browse/YARN-2005
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: resourcemanager
Affects Versions: 0.23.10, 2.4.0
Reporter: Jason Lowe
Assignee: Anubhav Dhoot

 It would be nice if the RM supported blacklisting a node for an AM launch 
 after the same node fails a configurable number of AM attempts.  This would 
 be similar to the blacklisting support for scheduling task attempts in the 
 MapReduce AM but for scheduling AM attempts on the RM side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3467) Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI

2015-05-20 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3467:

Attachment: ApplicationAttemptPage.png

 Expose allocatedMB, allocatedVCores, and runningContainers metrics on running 
 Applications in RM Web UI
 ---

 Key: YARN-3467
 URL: https://issues.apache.org/jira/browse/YARN-3467
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: webapp, yarn
Affects Versions: 2.5.0
Reporter: Anthony Rojas
Assignee: Anubhav Dhoot
Priority: Minor
 Attachments: ApplicationAttemptPage.png


 The YARN REST API can report on the following properties:
 *allocatedMB*: The sum of memory in MB allocated to the application's running 
 containers
 *allocatedVCores*: The sum of virtual cores allocated to the application's 
 running containers
 *runningContainers*: The number of containers currently running for the 
 application
 Currently, the RM Web UI does not report on these items (at least I couldn't 
 find any entries within the Web UI).
 It would be useful for YARN Application and Resource troubleshooting to have 
 these properties and their corresponding values exposed on the RM WebUI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3691) Limit number of reservations for an app

2015-05-20 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-3691:
-

 Summary: Limit number of reservations for an app
 Key: YARN-3691
 URL: https://issues.apache.org/jira/browse/YARN-3691
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Arun Suresh


Currently, it is possible to reserve resources for an app on all nodes. Limiting 
this to possibly just a number of nodes (or a ratio of the total cluster size) 
would improve utilization of the cluster and reduce the possibility of starving 
other apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3681) yarn cmd says could not find main class 'queue' in windows

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552710#comment-14552710
 ] 

Hadoop QA commented on YARN-3681:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734163/YARN-3681.0.patch |
| Optional Tests |  |
| git revision | trunk / 4aa730c |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8026/console |


This message was automatically generated.

 yarn cmd says could not find main class 'queue' in windows
 

 Key: YARN-3681
 URL: https://issues.apache.org/jira/browse/YARN-3681
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
 Environment: Windows Only
Reporter: Sumana Sathish
Assignee: Varun Saxena
Priority: Blocker
  Labels: windows, yarn-client
 Attachments: YARN-3681.0.patch, YARN-3681.01.patch, yarncmd.png


 Attached the screenshot of the command prompt in Windows running the yarn 
 queue command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3691) Limit number of reservations for an app

2015-05-20 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh reassigned YARN-3691:
-

Assignee: Arun Suresh

 Limit number of reservations for an app
 ---

 Key: YARN-3691
 URL: https://issues.apache.org/jira/browse/YARN-3691
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Arun Suresh
Assignee: Arun Suresh

 Currently, it is possible to reserve resources for an app on all nodes. 
 Limiting this to possibly just a number of nodes (or a ratio of the total 
 cluster size) would improve utilization of the cluster and reduce the 
 possibility of starving other apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3647) RMWebServices api's should use updated api from CommonNodeLabelsManager to get NodeLabel object

2015-05-20 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552619#comment-14552619
 ] 

Sunil G commented on YARN-3647:
---

Test case failure and findbugs error are not related to this patch.

 RMWebServices api's should use updated api from CommonNodeLabelsManager to 
 get NodeLabel object
 ---

 Key: YARN-3647
 URL: https://issues.apache.org/jira/browse/YARN-3647
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Sunil G
Assignee: Sunil G
 Attachments: 0001-YARN-3647.patch, 0002-YARN-3647.patch


 After YARN-3579, the RMWebServices APIs can use the updated version of the 
 APIs in CommonNodeLabelsManager, which gives the full NodeLabel object instead 
 of creating a NodeLabel object from a plain label name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node

2015-05-20 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3675:

Attachment: YARN-3675.002.patch

Fixed checkstyle issue 

 FairScheduler: RM quits when node removal races with continousscheduling on 
 the same node
 -

 Key: YARN-3675
 URL: https://issues.apache.org/jira/browse/YARN-3675
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3675.001.patch, YARN-3675.002.patch


 With continuous scheduling, scheduling can be done on a node that has just 
 been removed, causing errors like the one below.
 {noformat}
 12:28:53.782 AM FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
 Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
 java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
   at java.lang.Thread.run(Thread.java:745)
 12:28:53.783 AMINFO 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye..
 {noformat}
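
 A simplified sketch of the defensive check this kind of race calls for 
 (hypothetical stand-in types and names, not the actual YARN-3675 patch): the 
 node lookup can return null once the node has been removed, so the 
 completed-container path has to tolerate that instead of dereferencing it.
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NodeRemovalRaceSketch {
  // Stand-in for the real FSSchedulerNode.
  static class SchedulerNodeStub {
    void unreserve(String containerId) {
      System.out.println("unreserved " + containerId);
    }
  }

  private final Map<String, SchedulerNodeStub> nodes = new ConcurrentHashMap<>();

  // The continuous-scheduling thread and the event-handling thread interleave,
  // so the node looked up here may already have been removed.
  void completedContainer(String nodeId, String containerId) {
    SchedulerNodeStub node = nodes.get(nodeId);
    if (node == null) {
      System.out.println("Skipping " + containerId + ": node " + nodeId
          + " was already removed");
      return;
    }
    node.unreserve(containerId);
  }
}
{code}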



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3681) yarn cmd says could not find main class 'queue' in windows

2015-05-20 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3681:
--
Attachment: YARN-3681.0.patch

 yarn cmd says could not find main class 'queue' in windows
 

 Key: YARN-3681
 URL: https://issues.apache.org/jira/browse/YARN-3681
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
 Environment: Windows Only
Reporter: Sumana Sathish
Assignee: Varun Saxena
Priority: Blocker
  Labels: windows, yarn-client
 Attachments: YARN-3681.0.patch, YARN-3681.01.patch, yarncmd.png


 Attached the screenshot of the command prompt in windows running yarn queue 
 command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3691) FairScheduler: Limit number of reservations for a container

2015-05-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553001#comment-14553001
 ] 

Karthik Kambatla commented on YARN-3691:


The number of reservations should be per component and not per application? If 
an app is looking to get resources for 10 containers, it should be able to make 
reservations independently for each container. 

 FairScheduler: Limit number of reservations for a container
 ---

 Key: YARN-3691
 URL: https://issues.apache.org/jira/browse/YARN-3691
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Arun Suresh
Assignee: Arun Suresh

 Currently, it is possible to reserve resources for an app on all nodes. 
 Limiting this to just a number of nodes (or a ratio of the total cluster size) 
 would improve cluster utilization and reduce the possibility of starving other apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2918) Don't fail RM if queue's configured labels are not existed in cluster-node-labels

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553158#comment-14553158
 ] 

Hudson commented on YARN-2918:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7875 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7875/])
Move YARN-2918 from 2.8.0 to 2.7.1 (wangda: rev 
03f897fd1a3779251023bae358207069b89addbf)
* hadoop-yarn-project/CHANGES.txt


 Don't fail RM if queue's configured labels are not existed in 
 cluster-node-labels
 -

 Key: YARN-2918
 URL: https://issues.apache.org/jira/browse/YARN-2918
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Wangda Tan
 Fix For: 2.8.0, 2.7.1

 Attachments: YARN-2918.1.patch, YARN-2918.2.patch, YARN-2918.3.patch


 Currently, if an admin sets up labels on queues 
 ({{queue-path.accessible-node-labels = ...}}) and a label is not added to 
 the RM, the queue's initialization will fail and the RM will fail too:
 {noformat}
 2014-12-03 20:11:50,126 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
 ResourceManager
 ...
 Caused by: java.io.IOException: NodeLabelManager doesn't include label = x, 
 please check.
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkIfLabelInClusterNodeLabels(SchedulerUtils.java:287)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.init(AbstractCSQueue.java:109)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.init(LeafQueue.java:120)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:567)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:587)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:462)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:294)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 {noformat}
 This is not a good user experience; we should stop failing the RM so that the 
 admin can configure queues/labels in the following steps:
 - Configure queue (with label)
 - Start RM
 - Add labels to RM
 - Submit applications
 Now admin has to:
 - Configure queue (without label)
 - Start RM
 - Add labels to RM
 - Refresh queue's config (with label)
 - Submit applications



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-20 Thread Vrushali C (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vrushali C updated YARN-3411:
-
Attachment: YARN-3411-YARN-2928.007.patch


Uploading YARN-3411-YARN-2928.007.patch. I think I have addressed everyone's 
comments. I have been going up and down scrolling on this jira page since 
yesterday and I hope I have not missed out on any comment. 

[~gtCarrera9] I have not yet moved the test data into TestTimelineWriterImpl 
since it has a broadly similar information setup for the timeline entity, but 
with more cases. I can modify it later. I have tested the HBase writer with 
Sangjin's driver code as well. 

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
 YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
 YARN-3411-YARN-2928.007.patch, YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, 
 YARN-3411.poc.4.txt, YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, 
 YARN-3411.poc.7.txt, YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.
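
 As a rough illustration of what a native-HBase write path looks like: the table 
 name, row-key layout, and column family/qualifiers below are made up for the 
 example and are not the schema proposed in the attached PDF; it assumes the 
 HBase 1.x client API.
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hbase.HBaseConfiguration;
 import org.apache.hadoop.hbase.TableName;
 import org.apache.hadoop.hbase.client.Connection;
 import org.apache.hadoop.hbase.client.ConnectionFactory;
 import org.apache.hadoop.hbase.client.Put;
 import org.apache.hadoop.hbase.client.Table;
 import org.apache.hadoop.hbase.util.Bytes;

 public class TimelineEntityHBaseWriteSketch {
   public static void main(String[] args) throws Exception {
     Configuration conf = HBaseConfiguration.create();
     try (Connection conn = ConnectionFactory.createConnection(conf);
          Table table = conn.getTable(TableName.valueOf("timeline_entity"))) {
       // Hypothetical row key: cluster!user!flow!app!entityType!entityId
       byte[] rowKey =
           Bytes.toBytes("cluster1!user1!flow1!app_1!YARN_CONTAINER!container_1");
       Put put = new Put(rowKey);
       // "i" is a hypothetical "info" column family for this sketch.
       put.addColumn(Bytes.toBytes("i"), Bytes.toBytes("created_time"),
           Bytes.toBytes(System.currentTimeMillis()));
       put.addColumn(Bytes.toBytes("i"), Bytes.toBytes("entity_type"),
           Bytes.toBytes("YARN_CONTAINER"));
       table.put(put);   // single direct write, no Phoenix layer on the write path
     }
   }
 }
 {code}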



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3411) [Storage implementation] explore the native HBase write schema for storage

2015-05-20 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553186#comment-14553186
 ] 

Li Lu commented on YARN-3411:
-

Hi [~vrushalic], sure, don't worry about the test code clean up for now. I'll 
try it locally. 

 [Storage implementation] explore the native HBase write schema for storage
 --

 Key: YARN-3411
 URL: https://issues.apache.org/jira/browse/YARN-3411
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Sangjin Lee
Assignee: Vrushali C
Priority: Critical
 Attachments: ATSv2BackendHBaseSchemaproposal.pdf, 
 YARN-3411-YARN-2928.001.patch, YARN-3411-YARN-2928.002.patch, 
 YARN-3411-YARN-2928.003.patch, YARN-3411-YARN-2928.004.patch, 
 YARN-3411-YARN-2928.005.patch, YARN-3411-YARN-2928.006.patch, 
 YARN-3411-YARN-2928.007.patch, YARN-3411.poc.2.txt, YARN-3411.poc.3.txt, 
 YARN-3411.poc.4.txt, YARN-3411.poc.5.txt, YARN-3411.poc.6.txt, 
 YARN-3411.poc.7.txt, YARN-3411.poc.txt


 There is work that's in progress to implement the storage based on a Phoenix 
 schema (YARN-3134).
 In parallel, we would like to explore an implementation based on a native 
 HBase schema for the write path. Such a schema does not exclude using 
 Phoenix, especially for reads and offline queries.
 Once we have basic implementations of both options, we could evaluate them in 
 terms of performance, scalability, usability, etc. and make a call.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-314) Schedulers should allow resource requests of different sizes at the same priority and location

2015-05-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553015#comment-14553015
 ] 

Karthik Kambatla commented on YARN-314:
---

I am essentially proposing an efficient way to index the pending requests 
across multiple axes. Each of these indices is captured by a map. The only 
reason to colocate them is to avoid dispersing this indexing (mapping) logic 
across multiple classes. 

We should be able to quickly look up all requests for an app for reporting etc., 
and also look up all node-local requests across applications at schedule time, 
without having to iterate through all the applications. 

The maps could be: App -> Priority -> Locality -> ResourceRequest, and Locality 
(node/rack) -> Priority -> App -> ResourceRequest. The current {{AppSchedulingInfo}} 
could stay as is and use the former map to get the corresponding requests.
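
A minimal sketch of such a two-level index, assuming simplified placeholder type 
parameters (the real scheduler would use ApplicationAttemptId, Priority and 
ResourceRequest; the class and method names below are illustrative only, not the 
proposed patch):
{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: indexes the same pending requests along two axes so that
// per-app lookups (reporting) and per-node lookups (scheduling) are both cheap.
class PendingRequestIndex<App, Priority, Location, Request> {

  // App -> Priority -> Location -> requests (the "former" map, per-app view)
  private final Map<App, Map<Priority, Map<Location, List<Request>>>> byApp =
      new ConcurrentHashMap<>();

  // Location (node/rack) -> Priority -> App -> requests (per-location view)
  private final Map<Location, Map<Priority, Map<App, List<Request>>>> byLocation =
      new ConcurrentHashMap<>();

  // Register a request under both indices so they never diverge.
  void add(App app, Priority pri, Location loc, Request req) {
    byApp.computeIfAbsent(app, a -> new ConcurrentHashMap<>())
         .computeIfAbsent(pri, p -> new ConcurrentHashMap<>())
         .computeIfAbsent(loc, l -> new ArrayList<>())
         .add(req);
    byLocation.computeIfAbsent(loc, l -> new ConcurrentHashMap<>())
              .computeIfAbsent(pri, p -> new ConcurrentHashMap<>())
              .computeIfAbsent(app, a -> new ArrayList<>())
              .add(req);
  }

  // All requests of one app, e.g. for reporting.
  Map<Priority, Map<Location, List<Request>>> requestsForApp(App app) {
    return byApp.getOrDefault(app, Collections.emptyMap());
  }

  // All requests pending on one node/rack, e.g. at schedule time.
  Map<Priority, Map<App, List<Request>>> requestsForLocation(Location loc) {
    return byLocation.getOrDefault(loc, Collections.emptyMap());
  }
}
{code}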

 Schedulers should allow resource requests of different sizes at the same 
 priority and location
 --

 Key: YARN-314
 URL: https://issues.apache.org/jira/browse/YARN-314
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
 Attachments: yarn-314-prelim.patch


 Currently, resource requests for the same container and locality are expected 
 to all be the same size.
 While it doesn't look like it's needed for apps currently, and can be 
 circumvented by specifying different priorities if absolutely necessary, it 
 seems to me that the ability to request containers with different resource 
 requirements at the same priority level should be there for the future and 
 for completeness' sake.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2556) Tool to measure the performance of the timeline server

2015-05-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-2556:
---
Attachment: YARN-2556.10.patch

Add JobHistoryFileReplayMapper mapper

 Tool to measure the performance of the timeline server
 --

 Key: YARN-2556
 URL: https://issues.apache.org/jira/browse/YARN-2556
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Chang Li
  Labels: BB2015-05-TBR
 Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, 
 YARN-2556.1.patch, YARN-2556.10.patch, YARN-2556.2.patch, YARN-2556.3.patch, 
 YARN-2556.4.patch, YARN-2556.5.patch, YARN-2556.6.patch, YARN-2556.7.patch, 
 YARN-2556.8.patch, YARN-2556.9.patch, YARN-2556.patch, yarn2556.patch, 
 yarn2556.patch, yarn2556_wip.patch


 We need to be able to understand the capacity model for the timeline server 
 to give users the tools they need to deploy a timeline server with the 
 correct capacity.
 I propose we create a mapreduce job that can measure timeline server write 
 and read performance. Transactions per second, I/O for both read and write 
 would be a good start.
 This could be done as an example or test job that could be tied into gridmix.
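
 As a rough starting point, a standalone write probe (not the proposed mapreduce 
 tool, and not one of the attached patches) could look like the sketch below; the 
 entity type, entity count, and timing logic are illustrative assumptions.
 {code}
 import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
 import org.apache.hadoop.yarn.client.api.TimelineClient;
 import org.apache.hadoop.yarn.conf.YarnConfiguration;

 public class TimelineWriteProbe {
   public static void main(String[] args) throws Exception {
     int numEntities = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
     TimelineClient client = TimelineClient.createTimelineClient();
     client.init(new YarnConfiguration());
     client.start();
     long start = System.currentTimeMillis();
     for (int i = 0; i < numEntities; i++) {
       TimelineEntity entity = new TimelineEntity();
       entity.setEntityType("PERF_TEST");          // illustrative entity type
       entity.setEntityId("entity_" + i);
       entity.setStartTime(System.currentTimeMillis());
       client.putEntities(entity);                 // one synchronous write per entity
     }
     long elapsedMs = System.currentTimeMillis() - start;
     System.out.println(numEntities + " writes in " + elapsedMs + " ms ("
         + (numEntities * 1000.0 / elapsedMs) + " writes/sec)");
     client.stop();
   }
 }
 {code}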



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3388) Allocation in LeafQueue could get stuck because DRF calculator isn't well supported when computing user-limit

2015-05-20 Thread Nathan Roberts (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553165#comment-14553165
 ] 

Nathan Roberts commented on YARN-3388:
--

Thanks [~leftnoteasy] for the comments. I agree 2b is the way to go. I will 
upload a new patch soon.

 Allocation in LeafQueue could get stuck because DRF calculator isn't well 
 supported when computing user-limit
 -

 Key: YARN-3388
 URL: https://issues.apache.org/jira/browse/YARN-3388
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacityscheduler
Affects Versions: 2.6.0
Reporter: Nathan Roberts
Assignee: Nathan Roberts
 Attachments: YARN-3388-v0.patch, YARN-3388-v1.patch, 
 YARN-3388-v2.patch


 When there are multiple active users in a queue, it should be possible for 
 those users to make use of capacity up to max_capacity (or close). The 
 resources should be fairly distributed among the active users in the queue. 
 This works pretty well when there is a single resource being scheduled. 
 However, when there are multiple resources the situation gets more complex, 
 and the current algorithm tends to get stuck at the queue's configured capacity. 
 An example is illustrated in a subsequent comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3686) CapacityScheduler should trim default_node_label_expression

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553000#comment-14553000
 ] 

Hadoop QA commented on YARN-3686:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 29s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 31s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 38s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 24s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 38s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 16s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  50m 20s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  86m 14s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734160/0002-YARN-3686.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4aa730c |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8029/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8029/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8029/console |


This message was automatically generated.

 CapacityScheduler should trim default_node_label_expression
 ---

 Key: YARN-3686
 URL: https://issues.apache.org/jira/browse/YARN-3686
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: api, client, resourcemanager
Reporter: Wangda Tan
Assignee: Sunil G
Priority: Critical
 Attachments: 0001-YARN-3686.patch, 0002-YARN-3686.patch


 We should trim default_node_label_expression for queue before using it.
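
 In other words, something along these lines when reading the queue's expression; 
 the property name is an assumption based on the usual CapacityScheduler prefix, 
 and the class is only a sketch of the trim, not the attached patch:
 {code}
 import org.apache.hadoop.conf.Configuration;

 public class TrimDefaultNodeLabelExpression {
   // Hypothetical read path; the point is only trimming the value before use.
   static String defaultLabelExpression(Configuration conf, String queuePath) {
     String raw = conf.get(
         "yarn.scheduler.capacity." + queuePath + ".default-node-label-expression");
     return raw == null ? null : raw.trim();
   }
 }
 {code}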



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3681) yarn cmd says could not find main class 'queue' in windows

2015-05-20 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3681:
--
Attachment: YARN-3681.branch-2.0.patch

Here is one for branch-2

 yarn cmd says could not find main class 'queue' in windows
 

 Key: YARN-3681
 URL: https://issues.apache.org/jira/browse/YARN-3681
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
 Environment: Windows Only
Reporter: Sumana Sathish
Assignee: Varun Saxena
Priority: Blocker
  Labels: windows, yarn-client
 Attachments: YARN-3681.0.patch, YARN-3681.01.patch, 
 YARN-3681.1.patch, YARN-3681.branch-2.0.patch, yarncmd.png


 Attached the screenshot of the command prompt in windows running yarn queue 
 command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3691) FairScheduler: Limit number of reservations for a container

2015-05-20 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-3691:
---
Summary: FairScheduler: Limit number of reservations for a container  (was: 
Limit number of reservations for an app)

 FairScheduler: Limit number of reservations for a container
 ---

 Key: YARN-3691
 URL: https://issues.apache.org/jira/browse/YARN-3691
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Arun Suresh
Assignee: Arun Suresh

 Currently, it is possible to reserve resources for an app on all nodes. 
 Limiting this to just a number of nodes (or a ratio of the total cluster size) 
 would improve cluster utilization and reduce the possibility of starving other apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (YARN-3691) FairScheduler: Limit number of reservations for a container

2015-05-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553001#comment-14553001
 ] 

Karthik Kambatla edited comment on YARN-3691 at 5/20/15 8:09 PM:
-

The number of reservations should be per container and not per application? If 
an app is looking to get resources for 10 containers, it should be able to make 
reservations independently for each container. 


was (Author: kasha):
The number of reservations should be per component and not per application? If 
an app is looking to get resources for 10 containers, it should be able to make 
reservations independently for each container. 

 FairScheduler: Limit number of reservations for a container
 ---

 Key: YARN-3691
 URL: https://issues.apache.org/jira/browse/YARN-3691
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Arun Suresh
Assignee: Arun Suresh

 Currently, it is possible to reserve resources for an app on all nodes. 
 Limiting this to just a number of nodes (or a ratio of the total cluster size) 
 would improve cluster utilization and reduce the possibility of starving other apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3681) yarn cmd says could not find main class 'queue' in windows

2015-05-20 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553189#comment-14553189
 ] 

Xuan Gong commented on YARN-3681:
-

Committed into trunk/branch-2/branch-2.7. Thanks, Craig and Varun.

 yarn cmd says could not find main class 'queue' in windows
 

 Key: YARN-3681
 URL: https://issues.apache.org/jira/browse/YARN-3681
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
 Environment: Windows Only
Reporter: Sumana Sathish
Assignee: Varun Saxena
Priority: Blocker
  Labels: windows, yarn-client
 Fix For: 2.7.1

 Attachments: YARN-3681.0.patch, YARN-3681.01.patch, 
 YARN-3681.1.patch, YARN-3681.branch-2.0.patch, yarncmd.png


 Attached the screenshot of the command prompt in windows running yarn queue 
 command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2918) Don't fail RM if queue's configured labels are not existed in cluster-node-labels

2015-05-20 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2918:
-
Fix Version/s: 2.7.1

 Don't fail RM if queue's configured labels are not existed in 
 cluster-node-labels
 -

 Key: YARN-2918
 URL: https://issues.apache.org/jira/browse/YARN-2918
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Wangda Tan
 Fix For: 2.8.0, 2.7.1

 Attachments: YARN-2918.1.patch, YARN-2918.2.patch, YARN-2918.3.patch


 Currently, if an admin sets up labels on queues 
 ({{queue-path.accessible-node-labels = ...}}) and a label is not added to 
 the RM, the queue's initialization will fail and the RM will fail too:
 {noformat}
 2014-12-03 20:11:50,126 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
 ResourceManager
 ...
 Caused by: java.io.IOException: NodeLabelManager doesn't include label = x, 
 please check.
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkIfLabelInClusterNodeLabels(SchedulerUtils.java:287)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.init(AbstractCSQueue.java:109)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.init(LeafQueue.java:120)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:567)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:587)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:462)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:294)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 {noformat}
 This is not a good user experience; we should stop failing the RM so that the 
 admin can configure queues/labels in the following steps:
 - Configure queue (with label)
 - Start RM
 - Add labels to RM
 - Submit applications
 Now admin has to:
 - Configure queue (without label)
 - Start RM
 - Add labels to RM
 - Refresh queue's config (with label)
 - Submit applications



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2918) Don't fail RM if queue's configured labels are not existed in cluster-node-labels

2015-05-20 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553086#comment-14553086
 ] 

Wangda Tan commented on YARN-2918:
--

Back-ported this patch to 2.7.1, updating fix version.

 Don't fail RM if queue's configured labels are not existed in 
 cluster-node-labels
 -

 Key: YARN-2918
 URL: https://issues.apache.org/jira/browse/YARN-2918
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Rohith
Assignee: Wangda Tan
 Fix For: 2.8.0, 2.7.1

 Attachments: YARN-2918.1.patch, YARN-2918.2.patch, YARN-2918.3.patch


 Currently, if an admin sets up labels on queues 
 ({{queue-path.accessible-node-labels = ...}}) and a label is not added to 
 the RM, the queue's initialization will fail and the RM will fail too:
 {noformat}
 2014-12-03 20:11:50,126 FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
 ResourceManager
 ...
 Caused by: java.io.IOException: NodeLabelManager doesn't include label = x, 
 please check.
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkIfLabelInClusterNodeLabels(SchedulerUtils.java:287)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.init(AbstractCSQueue.java:109)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.init(LeafQueue.java:120)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:567)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:587)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:462)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:294)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
   at 
 org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
 {noformat}
 This is not a good user experience; we should stop failing the RM so that the 
 admin can configure queues/labels in the following steps:
 - Configure queue (with label)
 - Start RM
 - Add labels to RM
 - Submit applications
 Now admin has to:
 - Configure queue (without label)
 - Start RM
 - Add labels to RM
 - Refresh queue's config (with label)
 - Submit applications



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3681) yarn cmd says could not find main class 'queue' in windows

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553159#comment-14553159
 ] 

Hudson commented on YARN-3681:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7875 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7875/])
YARN-3681. yarn cmd says could not find main class 'queue' in windows. 
(xgong: rev 5774f6b1e577ee64bde8c7c1e39f404b9e651176)
* hadoop-yarn-project/hadoop-yarn/bin/yarn.cmd
* hadoop-yarn-project/CHANGES.txt


 yarn cmd says could not find main class 'queue' in windows
 

 Key: YARN-3681
 URL: https://issues.apache.org/jira/browse/YARN-3681
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
 Environment: Windows Only
Reporter: Sumana Sathish
Assignee: Varun Saxena
Priority: Blocker
  Labels: windows, yarn-client
 Attachments: YARN-3681.0.patch, YARN-3681.01.patch, 
 YARN-3681.1.patch, yarncmd.png


 Attached the screenshot of the command prompt in windows running yarn queue 
 command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3681) yarn cmd says could not find main class 'queue' in windows

2015-05-20 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553114#comment-14553114
 ] 

Xuan Gong commented on YARN-3681:
-

Using {{git apply -p0 --whitespace=fix}}, the patch could be applied.
The patch looks good to me.
+1, will commit.

 yarn cmd says could not find main class 'queue' in windows
 

 Key: YARN-3681
 URL: https://issues.apache.org/jira/browse/YARN-3681
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
 Environment: Windows Only
Reporter: Sumana Sathish
Assignee: Varun Saxena
Priority: Blocker
  Labels: windows, yarn-client
 Attachments: YARN-3681.0.patch, YARN-3681.01.patch, 
 YARN-3681.1.patch, yarncmd.png


 Attached the screenshot of the command prompt in windows running yarn queue 
 command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553151#comment-14553151
 ] 

Hadoop QA commented on YARN-3675:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 34s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 31s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 35s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 46s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 16s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | yarn tests |  50m  4s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | |  86m 17s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.server.resourcemanager.TestRMRestart |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734207/YARN-3675.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4aa730c |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8030/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8030/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8030/console |


This message was automatically generated.

 FairScheduler: RM quits when node removal races with continousscheduling on 
 the same node
 -

 Key: YARN-3675
 URL: https://issues.apache.org/jira/browse/YARN-3675
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3675.001.patch, YARN-3675.002.patch, 
 YARN-3675.003.patch


 With continuous scheduling, scheduling can be done on a node that has just been 
 removed, causing errors like the one below.
 {noformat}
 12:28:53.782 AM FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
 Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
 java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
   at java.lang.Thread.run(Thread.java:745)
 12:28:53.783 AM INFO 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye..
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3681) yarn cmd says could not find main class 'queue' in windows

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552736#comment-14552736
 ] 

Hadoop QA commented on YARN-3681:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734165/YARN-3681.1.patch |
| Optional Tests |  |
| git revision | trunk / 4aa730c |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8027/console |


This message was automatically generated.

 yarn cmd says could not find main class 'queue' in windows
 

 Key: YARN-3681
 URL: https://issues.apache.org/jira/browse/YARN-3681
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
 Environment: Windows Only
Reporter: Sumana Sathish
Assignee: Varun Saxena
Priority: Blocker
  Labels: windows, yarn-client
 Attachments: YARN-3681.0.patch, YARN-3681.01.patch, 
 YARN-3681.1.patch, yarncmd.png


 Attached the screenshot of the command prompt in windows running yarn queue 
 command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3692) Allow REST API to set a user generated message when killing an application

2015-05-20 Thread Rajat Jain (JIRA)
Rajat Jain created YARN-3692:


 Summary: Allow REST API to set a user generated message when 
killing an application
 Key: YARN-3692
 URL: https://issues.apache.org/jira/browse/YARN-3692
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Rajat Jain


Currently, YARN's REST API supports killing an application, but not setting a 
diagnostic message while doing so. It would be good to provide that support.
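
For reference, today's kill is roughly the PUT below against the RM's app-state 
endpoint; the extra {{diagnostics}} field is only a sketch of what the proposed 
support might accept, not an existing API, and the RM host/port and app id are 
example values.
{code}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class KillAppWithMessage {
  public static void main(String[] args) throws Exception {
    String rm = "http://rm-host:8088";                  // assumption: default RM web port
    String appId = "application_1234567890123_0001";    // example application id
    URL url = new URL(rm + "/ws/v1/cluster/apps/" + appId + "/state");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);
    // "diagnostics" is the hypothetical user-supplied message proposed here.
    String body = "{\"state\":\"KILLED\",\"diagnostics\":\"Killed by on-call: runaway job\"}";
    try (OutputStream out = conn.getOutputStream()) {
      out.write(body.getBytes(StandardCharsets.UTF_8));
    }
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}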



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node

2015-05-20 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3675:

Attachment: YARN-3675.003.patch

Removed spurious changes and changed visibility of attemptScheduling

 FairScheduler: RM quits when node removal races with continousscheduling on 
 the same node
 -

 Key: YARN-3675
 URL: https://issues.apache.org/jira/browse/YARN-3675
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3675.001.patch, YARN-3675.002.patch, 
 YARN-3675.003.patch


 With continuous scheduling, scheduling can be done on a node that has just been 
 removed, causing errors like the one below.
 {noformat}
 12:28:53.782 AM FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
 Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
 java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
   at java.lang.Thread.run(Thread.java:745)
 12:28:53.783 AM INFO 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye..
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2355) MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552958#comment-14552958
 ] 

Hadoop QA commented on YARN-2355:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 38s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 39s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 45s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 39s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 26s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |  50m  1s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  89m 14s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734179/YARN-2355.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4aa730c |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8028/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8028/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8028/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8028/console |


This message was automatically generated.

 MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container
 --

 Key: YARN-2355
 URL: https://issues.apache.org/jira/browse/YARN-2355
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Zhijie Shen
Assignee: Darrell Taylor
  Labels: newbie
 Attachments: YARN-2355.001.patch


 After YARN-2074, YARN-614 and YARN-611, the application cannot judge whether 
 it has another chance to retry based on MAX_APP_ATTEMPTS_ENV alone. We should be 
 able to notify the application of the up-to-date remaining retry quota.
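
 Today an AM can only read the static limit handed to it at launch, roughly as 
 below (the environment variable name comes from ApplicationConstants; the 
 surrounding class is just a sketch):
 {code}
 import org.apache.hadoop.yarn.api.ApplicationConstants;

 public class RemainingAttemptsCheck {
   public static void main(String[] args) {
     // Static limit placed in the container's environment at launch time.
     String maxAttempts = System.getenv(ApplicationConstants.MAX_APP_ATTEMPTS_ENV);
     System.out.println("max attempts (static): " + maxAttempts);
     // After YARN-2074/YARN-614/YARN-611 some failures don't count against this
     // limit, so the AM cannot derive its remaining retries from this value alone.
   }
 }
 {code}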



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3467) Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI

2015-05-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552959#comment-14552959
 ] 

Karthik Kambatla commented on YARN-3467:


We should add this information to the ApplicationAttempt page, and preferably 
also to the RM Web UI. I have heard asks for both the number of containers and 
the allocated resources on the RM applications page, so people can sort 
applications by them. 

 Expose allocatedMB, allocatedVCores, and runningContainers metrics on running 
 Applications in RM Web UI
 ---

 Key: YARN-3467
 URL: https://issues.apache.org/jira/browse/YARN-3467
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: webapp, yarn
Affects Versions: 2.5.0
Reporter: Anthony Rojas
Assignee: Anubhav Dhoot
Priority: Minor
 Attachments: ApplicationAttemptPage.png


 The YARN REST API can report on the following properties:
 *allocatedMB*: The sum of memory in MB allocated to the application's running 
 containers
 *allocatedVCores*: The sum of virtual cores allocated to the application's 
 running containers
 *runningContainers*: The number of containers currently running for the 
 application
 Currently, the RM Web UI does not report on these items (at least I couldn't 
 find any entries within the Web UI).
 It would be useful for YARN Application and Resource troubleshooting to have 
 these properties and their corresponding values exposed on the RM WebUI.
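
 These three fields are already exposed per application by the RM REST API, so 
 the change is mostly presentational; a quick way to see them is sketched below 
 (the RM address is an assumption, and the response is printed raw rather than 
 parsed).
 {code}
 import java.io.BufferedReader;
 import java.io.InputStreamReader;
 import java.net.HttpURLConnection;
 import java.net.URL;
 import java.nio.charset.StandardCharsets;

 public class DumpRunningAppAllocations {
   public static void main(String[] args) throws Exception {
     // Each app object in the response carries allocatedMB, allocatedVCores
     // and runningContainers for RUNNING applications.
     URL url = new URL("http://rm-host:8088/ws/v1/cluster/apps?states=RUNNING");
     HttpURLConnection conn = (HttpURLConnection) url.openConnection();
     conn.setRequestProperty("Accept", "application/json");
     try (BufferedReader in = new BufferedReader(
         new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
       String line;
       while ((line = in.readLine()) != null) {
         System.out.println(line);
       }
     }
   }
 }
 {code}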



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3681) yarn cmd says could not find main class 'queue' in windows

2015-05-20 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3681:
--
Attachment: YARN-3681.1.patch

Oh the irony, neither did my own.  Updated to one which does.

 yarn cmd says could not find main class 'queue' in windows
 

 Key: YARN-3681
 URL: https://issues.apache.org/jira/browse/YARN-3681
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
 Environment: Windows Only
Reporter: Sumana Sathish
Assignee: Varun Saxena
Priority: Blocker
  Labels: windows, yarn-client
 Attachments: YARN-3681.0.patch, YARN-3681.01.patch, 
 YARN-3681.1.patch, yarncmd.png


 Attached the screenshot of the command prompt in windows running yarn queue 
 command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552841#comment-14552841
 ] 

Hadoop QA commented on YARN-3675:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 44s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 41s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 38s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 47s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 16s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  50m 51s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  87m 29s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734156/YARN-3675.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4aa730c |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8025/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8025/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8025/console |


This message was automatically generated.

 FairScheduler: RM quits when node removal races with continousscheduling on 
 the same node
 -

 Key: YARN-3675
 URL: https://issues.apache.org/jira/browse/YARN-3675
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Anubhav Dhoot
Assignee: Anubhav Dhoot
 Attachments: YARN-3675.001.patch, YARN-3675.002.patch


 With continuous scheduling, scheduling can be done on a node that has just been 
 removed, causing errors like the one below.
 {noformat}
 12:28:53.782 AM FATAL 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
 Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
 java.lang.NullPointerException
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111)
   at 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
   at java.lang.Thread.run(Thread.java:745)
 12:28:53.783 AM INFO 
 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye..
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3681) yarn cmd says could not find main class 'queue' in windows

2015-05-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552738#comment-14552738
 ] 

Varun Saxena commented on YARN-3681:


[~cwelch], it has to do with line endings.
I have to run {{unix2dos}} to convert line endings for Jenkins to accept it. 
Windows batch file patches do not always apply, depending on the user's 
line-ending settings. I think my patch did not apply for you for that reason.

 yarn cmd says could not find main class 'queue' in windows
 

 Key: YARN-3681
 URL: https://issues.apache.org/jira/browse/YARN-3681
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Affects Versions: 2.7.0
 Environment: Windows Only
Reporter: Sumana Sathish
Assignee: Varun Saxena
Priority: Blocker
  Labels: windows, yarn-client
 Attachments: YARN-3681.0.patch, YARN-3681.01.patch, 
 YARN-3681.1.patch, yarncmd.png


 Attached the screenshot of the command prompt in windows running yarn queue 
 command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2355) MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container

2015-05-20 Thread Darrell Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Darrell Taylor updated YARN-2355:
-
Attachment: YARN-2355.001.patch

 MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container
 --

 Key: YARN-2355
 URL: https://issues.apache.org/jira/browse/YARN-2355
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Zhijie Shen
Assignee: Darrell Taylor
  Labels: newbie
 Attachments: YARN-2355.001.patch


 After YARN-2074, YARN-614 and YARN-611, the application cannot judge whether 
 it has another chance to retry based on MAX_APP_ATTEMPTS_ENV alone. We should be 
 able to notify the application of the up-to-date remaining retry quota.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-314) Schedulers should allow resource requests of different sizes at the same priority and location

2015-05-20 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552800#comment-14552800
 ] 

Wangda Tan commented on YARN-314:
-

[~kasha],
Actually, I'm not quite sure about this proposal. What's the benefit of putting 
all apps' requests together compared to holding one data structure per app? 
Is there any use case?

 Schedulers should allow resource requests of different sizes at the same 
 priority and location
 --

 Key: YARN-314
 URL: https://issues.apache.org/jira/browse/YARN-314
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: scheduler
Affects Versions: 2.0.2-alpha
Reporter: Sandy Ryza
 Attachments: yarn-314-prelim.patch


 Currently, resource requests for the same container and locality are expected 
 to all be the same size.
 While it doesn't look like it's needed for apps currently, and can be 
 circumvented by specifying different priorities if absolutely necessary, it 
 seems to me that the ability to request containers with different resource 
 requirements at the same priority level should be there for the future and 
 for completeness' sake.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2408) Resource Request REST API for YARN

2015-05-20 Thread Renan DelValle (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552849#comment-14552849
 ] 

Renan DelValle commented on YARN-2408:
--


[~leftnoteasy], thanks for taking a look at the patch, really appreciate it.

1) I agree, the original patch I had was very verbose, so I shrank the amount of 
data being transferred by clustering resource requests together. This seems to be 
the best alternative to keeping the original ResourceRequest structures.

2) I will take a look at that and implement it that way. (Thank you for 
pointing me in the right direction). On the resource-by-label inclusion, do you 
think it would be better to wait until it is patched into the trunk in order to 
make the process easier?


 Resource Request REST API for YARN
 --

 Key: YARN-2408
 URL: https://issues.apache.org/jira/browse/YARN-2408
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: webapp
Reporter: Renan DelValle
  Labels: features
 Attachments: YARN-2408-6.patch


 I’m proposing a new REST API for YARN which exposes a snapshot of the 
 Resource Requests that exist inside the Scheduler. My motivation behind 
 this new feature is to allow external software to monitor the amount of 
 resources being requested and gain more insight into cluster 
 usage than is already provided. The API can also be used by external software 
 to detect a starved application and alert the appropriate users and/or 
 sysadmin so that the problem may be remedied.
 Here is the proposed API (a JSON counterpart is also available):
 {code:xml}
 <resourceRequests>
   <MB>7680</MB>
   <VCores>7</VCores>
   <appMaster>
     <applicationId>application_1412191664217_0001</applicationId>
     <applicationAttemptId>appattempt_1412191664217_0001_01</applicationAttemptId>
     <queueName>default</queueName>
     <totalMB>6144</totalMB>
     <totalVCores>6</totalVCores>
     <numResourceRequests>3</numResourceRequests>
     <requests>
       <request>
         <MB>1024</MB>
         <VCores>1</VCores>
         <numContainers>6</numContainers>
         <relaxLocality>true</relaxLocality>
         <priority>20</priority>
         <resourceNames>
           <resourceName>localMachine</resourceName>
           <resourceName>/default-rack</resourceName>
           <resourceName>*</resourceName>
         </resourceNames>
       </request>
     </requests>
   </appMaster>
   <appMaster>
   ...
   </appMaster>
 </resourceRequests>
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-05-20 Thread Varun Saxena (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Saxena updated YARN-3051:
---
Attachment: YARN-3051-YARN-2928.03.patch

 [Storage abstraction] Create backing storage read interface for ATS readers
 ---

 Key: YARN-3051
 URL: https://issues.apache.org/jira/browse/YARN-3051
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Sangjin Lee
Assignee: Varun Saxena
 Attachments: YARN-3051-YARN-2928.03.patch, 
 YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, YARN-3051_temp.patch


 Per design in YARN-2928, create a backing storage read interface that can be 
 implemented by multiple backing storage implementations.
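
 A bare-bones shape for such an interface might look like the following; the 
 method names, parameters, and placeholder return types are illustrative 
 assumptions, not the API being designed in this JIRA.
 {code}
 import java.io.IOException;
 import java.util.Set;

 // Placeholder Object types stand in for the real timeline entity/filter classes.
 interface TimelineReaderSketch {

   // Fetch one entity by its identifying coordinates.
   Object getEntity(String clusterId, String appId, String entityType,
       String entityId) throws IOException;

   // Fetch a set of entities matching simple filters; each backing storage
   // (HBase, Phoenix, filesystem, ...) would implement this differently.
   Set<Object> getEntities(String clusterId, String appId, String entityType,
       long windowStartMs, long windowEndMs, int limit) throws IOException;
 }
 {code}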



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3609) Move load labels from storage from serviceInit to serviceStart to make it works with RM HA case.

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553377#comment-14553377
 ] 

Hadoop QA commented on YARN-3609:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734279/YARN-3609.3.branch-2.7.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 8966d42 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8035/console |


This message was automatically generated.

 Move load labels from storage from serviceInit to serviceStart to make it 
 works with RM HA case.
 

 Key: YARN-3609
 URL: https://issues.apache.org/jira/browse/YARN-3609
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
 Attachments: YARN-3609.1.preliminary.patch, YARN-3609.2.patch, 
 YARN-3609.3.branch-2.7.patch, YARN-3609.3.patch


 Currently, RMNodeLabelsManager loads labels during serviceInit, but 
 RMActiveService.start() is called when the RM HA transition happens.
 We haven't done this before because queue initialization happens in 
 serviceInit as well, and we need to make sure labels are added to the system 
 before the queues are initialized; after YARN-2918, we should be able to do this.
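
 Schematically, the move is from loading in serviceInit to loading in serviceStart 
 of the service; the AbstractService hooks below are the real Hadoop ones, while 
 the class and the recoverLabelsFromStore method are stand-ins for whatever the 
 label manager actually calls.
 {code}
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.service.AbstractService;

 class NodeLabelsManagerSketch extends AbstractService {
   NodeLabelsManagerSketch() {
     super("NodeLabelsManagerSketch");
   }

   @Override
   protected void serviceInit(Configuration conf) throws Exception {
     // Before: labels were recovered here, i.e. only once at RM init time.
     super.serviceInit(conf);
   }

   @Override
   protected void serviceStart() throws Exception {
     // After: recover here instead, which runs on every transition-to-active
     // in an HA setup, so a standby that becomes active also gets the labels.
     recoverLabelsFromStore();
     super.serviceStart();
   }

   private void recoverLabelsFromStore() {
     // stand-in for the actual store recovery logic
   }
 }
 {code}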



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2556) Tool to measure the performance of the timeline server

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553284#comment-14553284
 ] 

Hadoop QA commented on YARN-2556:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |   6m 53s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 6 new or modified test files. |
| {color:green}+1{color} | javac |   9m 47s | There were no new javac warning 
messages. |
| {color:green}+1{color} | release audit |   0m 31s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 19s | There were no new checkstyle 
issues. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 1  line(s) that 
end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install |   2m  2s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 39s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 51s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | mapreduce tests |  99m 47s | Tests failed in 
hadoop-mapreduce-client-jobclient. |
| | | 120m 54s | |
\\
\\
|| Reason || Tests ||
| Timed out tests | org.apache.hadoop.mapred.TestMerge |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734234/YARN-2556.10.patch |
| Optional Tests | javac unit findbugs checkstyle |
| git revision | trunk / 03f897f |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/8031/artifact/patchprocess/whitespace.txt
 |
| hadoop-mapreduce-client-jobclient test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8031/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8031/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8031/console |


This message was automatically generated.

 Tool to measure the performance of the timeline server
 --

 Key: YARN-2556
 URL: https://issues.apache.org/jira/browse/YARN-2556
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Chang Li
  Labels: BB2015-05-TBR
 Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, 
 YARN-2556.1.patch, YARN-2556.10.patch, YARN-2556.2.patch, YARN-2556.3.patch, 
 YARN-2556.4.patch, YARN-2556.5.patch, YARN-2556.6.patch, YARN-2556.7.patch, 
 YARN-2556.8.patch, YARN-2556.9.patch, YARN-2556.patch, yarn2556.patch, 
 yarn2556.patch, yarn2556_wip.patch


 We need to be able to understand the capacity model for the timeline server 
 to give users the tools they need to deploy a timeline server with the 
 correct capacity.
 I propose we create a mapreduce job that can measure timeline server write 
 and read performance. Transactions per second, I/O for both read and write 
 would be a good start.
 This could be done as an example or test job that could be tied into gridmix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

