[jira] [Updated] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-20 Thread Lavkesh Lahngir (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lavkesh Lahngir updated YARN-3591:
--
Attachment: YARN-3591.4.patch

> Resource Localisation on a bad disk causes subsequent containers failure 
> -
>
> Key: YARN-3591
> URL: https://issues.apache.org/jira/browse/YARN-3591
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Lavkesh Lahngir
>Assignee: Lavkesh Lahngir
> Attachments: 0001-YARN-3591.1.patch, 0001-YARN-3591.patch, 
> YARN-3591.2.patch, YARN-3591.3.patch, YARN-3591.4.patch
>
>
> It happens when a resource is localised on a disk and, after localisation, 
> that disk goes bad. The NM keeps the paths of localised resources in memory. 
> At the time of a resource request, isResourcePresent(rsrc) is called, which 
> calls file.exists() on the localised path.
> In some cases when the disk has gone bad, inodes are still cached and 
> file.exists() returns true, but at the time of reading the file will not open.
> Note: file.exists() actually calls stat64 natively, which returns true because 
> it was able to find the inode information from the OS.
> The proposal is to call file.list() on the parent path of the resource, which 
> calls open() natively. If the disk is good, it should return an array of 
> paths with length at least 1.
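>
> A minimal sketch of the proposed check (java.io.File; the helper name below 
> is made up for illustration and is not from the attached patches):
> {code}
> private static boolean parentListable(File localizedPath) {
>   File parent = localizedPath.getParentFile();
>   if (parent == null) {
>     return false;
>   }
>   // list() forces a native open() on the directory, which fails on a bad
>   // disk even when stale inode metadata makes exists() return true.
>   String[] entries = parent.list();
>   return entries != null && entries.length >= 1;
> }
> {code}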



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551946#comment-14551946
 ] 

Hadoop QA commented on YARN-3675:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 42s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 36s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 33s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 47s | The applied patch generated  1 
new checkstyle issues (total was 74, now 75). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 14s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  50m  8s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  86m 30s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734074/YARN-3675.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / ce53c8e |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8018/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8018/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8018/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8018/console |


This message was automatically generated.

> FairScheduler: RM quits when node removal races with continousscheduling on 
> the same node
> -
>
> Key: YARN-3675
> URL: https://issues.apache.org/jira/browse/YARN-3675
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-3675.001.patch
>
>
> With continuous scheduling, scheduling can be done on a node that has just 
> been removed, causing errors like the one below.
> {noformat}
> 12:28:53.782 AM FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
> Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
>   at java.lang.Thread.run(Thread.java:745)
> 12:28:53.783 AMINFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye..
> {noformat}
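>
> One possible defensive guard, sketched here purely for illustration (the 
> names below are assumptions, not taken from YARN-3675.001.patch):
> {code}
> // Re-check that the scheduler still tracks the node before unreserving;
> // continuous scheduling can race with a NODE_REMOVED event.
> FSSchedulerNode node = getFSSchedulerNode(container.getNodeId());
> if (node == null) {
>   LOG.info("Node " + container.getNodeId() + " was removed; skipping "
>       + "unreserve/complete for container " + container.getId());
>   return;
> }
> {code}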



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3689) FifoComparator logic is wrong. In method "compare" in "FifoPolicy.java" file, the "s1" and "s2" should change position when compare priority

2015-05-20 Thread zhoulinlin (JIRA)
zhoulinlin created YARN-3689:


 Summary: FifoComparator logic is wrong. In method "compare" in 
"FifoPolicy.java" file, the "s1" and "s2" should change position when compare 
priority 
 Key: YARN-3689
 URL: https://issues.apache.org/jira/browse/YARN-3689
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler, scheduler
Affects Versions: 2.5.0
Reporter: zhoulinlin


In the "compare" method of "FifoPolicy.java", "s1" and "s2" should swap 
positions when comparing priority.

I did a test: configured the scheduler policy "fifo" and submitted 2 jobs to 
the same queue.
The result is below:
2015-05-20 11:57:41,449 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
before sort --  
2015-05-20 11:57:41,449 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432094103221_0001 appPririty:4  appStartTime:1432094170038
2015-05-20 11:57:41,449 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432094103221_0002 appPririty:2  appStartTime:1432094173131
2015-05-20 11:57:41,449 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: after 
sort % 
2015-05-20 11:57:41,449 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432094103221_0001 appPririty:4  appStartTime:1432094170038 
 
2015-05-20 11:57:41,449 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432094103221_0002 appPririty:2  appStartTime:1432094173131 
 

But when the "s1" and "s2" positions are swapped as below:

public int compare(Schedulable s1, Schedulable s2) {
  int res = s2.getPriority().compareTo(s1.getPriority());
  ...
}

The result:
2015-05-20 11:36:37,119 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
before sort -- 
2015-05-20 11:36:37,119 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432090734333_0009 appPririty:4  appStartTime:1432092992503
2015-05-20 11:36:37,119 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432090734333_0010 appPririty:2  appStartTime:1432092996437
2015-05-20 11:36:37,119 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: after 
sort % 
2015-05-20 11:36:37,119 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432090734333_0010 appPririty:2  appStartTime:1432092996437
2015-05-20 11:36:37,119 DEBUG 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue: 
appName:application_1432090734333_0009 appPririty:4  appStartTime:1432092992503 
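
To see the effect of the operand swap in isolation, here is a standalone 
sketch using plain integers rather than the real Schedulable/Priority types:

{code}
import java.util.Arrays;
import java.util.Comparator;

public class CompareSwapDemo {
  public static void main(String[] args) {
    Integer[] appPriorities = {4, 2};

    // s1-then-s2 operand order sorts ascending: priority 2 comes first.
    Arrays.sort(appPriorities, Comparator.naturalOrder());
    System.out.println(Arrays.toString(appPriorities));   // [2, 4]

    // s2-then-s1 operand order sorts descending: priority 4 comes first.
    Arrays.sort(appPriorities, (s1, s2) -> s2.compareTo(s1));
    System.out.println(Arrays.toString(appPriorities));   // [4, 2]
  }
}
{code}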





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-05-20 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551962#comment-14551962
 ] 

Akira AJISAKA commented on YARN-2336:
-

The test failure looks unrelated to the patch. Kicked 
https://builds.apache.org/job/PreCommit-YARN-Build/8020/

> Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
> --
>
> Key: YARN-2336
> URL: https://issues.apache.org/jira/browse/YARN-2336
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.4.1, 2.6.0
>Reporter: Kenji Kikushima
>Assignee: Akira AJISAKA
>  Labels: BB2015-05-RFC
> Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, 
> YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, 
> YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch
>
>
> When we have sub-queues in the Fair Scheduler, the REST API returns JSON with 
> a missing '[' bracket for childQueues.
> This issue was found by [~ajisakaa] in YARN-1050.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3543) ApplicationReport should be able to tell whether the Application is AM managed or not.

2015-05-20 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-3543:
-
Attachment: (was: 0003-YARN-3543.patch)

> ApplicationReport should be able to tell whether the Application is AM 
> managed or not. 
> ---
>
> Key: YARN-3543
> URL: https://issues.apache.org/jira/browse/YARN-3543
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 2.6.0
>Reporter: Spandan Dutta
>Assignee: Rohith
>  Labels: BB2015-05-TBR
> Attachments: 0001-YARN-3543.patch, 0001-YARN-3543.patch, 
> 0002-YARN-3543.patch, 0002-YARN-3543.patch, 0003-YARN-3543.patch, 
> 0004-YARN-3543.patch, YARN-3543-AH.PNG, YARN-3543-RM.PNG
>
>
> Currently we can tell whether an application submitted by the user is AM 
> managed only from the applicationSubmissionContext, and only at the time the 
> user submits the job. We should have access to this information from the 
> ApplicationReport as well, so that we can check whether an app is AM managed 
> at any time.
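>
> A rough sketch of the difference (getUnmanagedAM() is the existing 
> submission-context accessor as far as I know; isUnmanagedApp() is just a 
> hypothetical name for the proposed report accessor):
> {code}
> // Today: known only at submission time, from the submission context.
> boolean unmanagedAtSubmit = submissionContext.getUnmanagedAM();
>
> // Proposed: query it later, for any application, via its report.
> ApplicationReport report = yarnClient.getApplicationReport(appId);
> boolean unmanaged = report.isUnmanagedApp();   // hypothetical accessor
> {code}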



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3543) ApplicationReport should be able to tell whether the Application is AM managed or not.

2015-05-20 Thread Rohith (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith updated YARN-3543:
-
Attachment: 0004-YARN-3543.patch

Attached the same patch to kick off Jenkins.

> ApplicationReport should be able to tell whether the Application is AM 
> managed or not. 
> ---
>
> Key: YARN-3543
> URL: https://issues.apache.org/jira/browse/YARN-3543
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 2.6.0
>Reporter: Spandan Dutta
>Assignee: Rohith
>  Labels: BB2015-05-TBR
> Attachments: 0001-YARN-3543.patch, 0001-YARN-3543.patch, 
> 0002-YARN-3543.patch, 0002-YARN-3543.patch, 0003-YARN-3543.patch, 
> 0004-YARN-3543.patch, 0004-YARN-3543.patch, YARN-3543-AH.PNG, YARN-3543-RM.PNG
>
>
> Currently we can tell whether an application submitted by the user is AM 
> managed only from the applicationSubmissionContext, and only at the time the 
> user submits the job. We should have access to this information from the 
> ApplicationReport as well, so that we can check whether an app is AM managed 
> at any time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3591) Resource Localisation on a bad disk causes subsequent containers failure

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551976#comment-14551976
 ] 

Hadoop QA commented on YARN-3591:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 42s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 20s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m  2s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m 28s | Tests passed in 
hadoop-yarn-server-nodemanager. |
| | |  42m 15s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734083/YARN-3591.4.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / ce53c8e |
| hadoop-yarn-server-nodemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8019/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8019/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8019/console |


This message was automatically generated.

> Resource Localisation on a bad disk causes subsequent containers failure 
> -
>
> Key: YARN-3591
> URL: https://issues.apache.org/jira/browse/YARN-3591
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Lavkesh Lahngir
>Assignee: Lavkesh Lahngir
> Attachments: 0001-YARN-3591.1.patch, 0001-YARN-3591.patch, 
> YARN-3591.2.patch, YARN-3591.3.patch, YARN-3591.4.patch
>
>
> It happens when a resource is localised on a disk and, after localisation, 
> that disk goes bad. The NM keeps the paths of localised resources in memory. 
> At the time of a resource request, isResourcePresent(rsrc) is called, which 
> calls file.exists() on the localised path.
> In some cases when the disk has gone bad, inodes are still cached and 
> file.exists() returns true, but at the time of reading the file will not open.
> Note: file.exists() actually calls stat64 natively, which returns true because 
> it was able to find the inode information from the OS.
> The proposal is to call file.list() on the parent path of the resource, which 
> calls open() natively. If the disk is good, it should return an array of 
> paths with length at least 1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2336) Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552058#comment-14552058
 ] 

Hadoop QA commented on YARN-2336:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  17m 39s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 31s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 34s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | site |   2m 57s | Site still builds. |
| {color:red}-1{color} | checkstyle |   0m 45s | The applied patch generated  1 
new checkstyle issues (total was 8, now 8). |
| {color:green}+1{color} | whitespace |   0m  1s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 14s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  49m 57s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  92m 12s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734018/YARN-2336.009.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle site |
| git revision | trunk / ce53c8e |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8020/artifact/patchprocess/diffcheckstylehadoop-yarn-server-resourcemanager.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8020/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8020/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8020/console |


This message was automatically generated.

> Fair scheduler REST api returns a missing '[' bracket JSON for deep queue tree
> --
>
> Key: YARN-2336
> URL: https://issues.apache.org/jira/browse/YARN-2336
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.4.1, 2.6.0
>Reporter: Kenji Kikushima
>Assignee: Akira AJISAKA
>  Labels: BB2015-05-RFC
> Attachments: YARN-2336-2.patch, YARN-2336-3.patch, YARN-2336-4.patch, 
> YARN-2336.005.patch, YARN-2336.007.patch, YARN-2336.008.patch, 
> YARN-2336.009.patch, YARN-2336.009.patch, YARN-2336.patch
>
>
> When we have sub-queues in the Fair Scheduler, the REST API returns JSON with 
> a missing '[' bracket for childQueues.
> This issue was found by [~ajisakaa] in YARN-1050.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever

2015-05-20 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552091#comment-14552091
 ] 

Rohith commented on YARN-3646:
--

Thanks for updating the patch. Some comments on the tests:
# I think we can remove the tests added in the hadoop-common project, since 
yarn-client verifies the required functionality. The hadoop-common test was 
basically mocking the RMProxy functionality, and it passed even without the 
RMProxy fix.
# The code never reaches {{Assert.fail("");}}; better to remove it.
# Catch ApplicationNotFoundException instead of catching Throwable. I think 
you can add {{expected = ApplicationNotFoundException.class}} to the @Test 
annotation, like below.
{code}
@Test(timeout = 3, expected = ApplicationNotFoundException.class)
public void testClientWithRetryPolicyForEver() throws Exception {
  YarnConfiguration conf = new YarnConfiguration();
  conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1);

  ResourceManager rm = null;
  YarnClient yarnClient = null;
  try {
    // start rm
    rm = new ResourceManager();
    rm.init(conf);
    rm.start();

    yarnClient = YarnClient.createYarnClient();
    yarnClient.init(conf);
    yarnClient.start();

    // create invalid application id
    ApplicationId appId = ApplicationId.newInstance(1430126768987L, 10645);

    // RM should throw ApplicationNotFoundException exception
    yarnClient.getApplicationReport(appId);
  } finally {
    if (yarnClient != null) {
      yarnClient.stop();
    }
    if (rm != null) {
      rm.stop();
    }
  }
}
{code}
# Could you rename the test to reflect the actual functionality under test, 
e.g. {{testShouldNotRetryForeverForNonNetworkExceptions}}?

> Applications are getting stuck some times in case of retry policy forever
> -
>
> Key: YARN-3646
> URL: https://issues.apache.org/jira/browse/YARN-3646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Reporter: Raju Bairishetti
> Attachments: YARN-3646.001.patch, YARN-3646.patch
>
>
> We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use the FOREVER 
> retry policy.
> The YARN client retries infinitely on exceptions from the RM because it is 
> using the FOREVER retry policy. The problem is that it retries for all kinds 
> of exceptions (like ApplicationNotFoundException), even when the failure is 
> not a connection failure. Because of this, my application does not progress 
> further.
> *The YARN client should not retry infinitely for non-connection failures.*
> We have written a simple yarn-client that tries to get an application report 
> for an invalid or old appId. The ResourceManager throws an 
> ApplicationNotFoundException because the appId is invalid or old. But because 
> of the FOREVER retry policy, the client keeps retrying to get the application 
> report, and the ResourceManager keeps throwing ApplicationNotFoundException.
> {code}
> private void testYarnClientRetryPolicy() throws Exception {
>   YarnConfiguration conf = new YarnConfiguration();
>   conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, -1);
>   YarnClient yarnClient = YarnClient.createYarnClient();
>   yarnClient.init(conf);
>   yarnClient.start();
>   ApplicationId appId = ApplicationId.newInstance(1430126768987L, 10645);
>   ApplicationReport report = yarnClient.getApplicationReport(appId);
> }
> {code}
> *RM logs:*
> {noformat}
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875162 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
> with id 'application_1430126768987_10645' doesn't exist in RM.
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.had

[jira] [Created] (YARN-3690) 'mvn site' fails on JDK8

2015-05-20 Thread Akira AJISAKA (JIRA)
Akira AJISAKA created YARN-3690:
---

 Summary: 'mvn site' fails on JDK8
 Key: YARN-3690
 URL: https://issues.apache.org/jira/browse/YARN-3690
 Project: Hadoop YARN
  Issue Type: Bug
  Components: documentation
 Environment: CentOS 7.0, Oracle JDK 8u45.
Reporter: Akira AJISAKA


{noformat}
[ERROR] 
/home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18:
 error: package org.apache.hadoop.yarn.factories has already been annotated
[ERROR] @InterfaceAudience.LimitedPrivate({ "MapReduce", "YARN" })
[ERROR] ^
[ERROR] java.lang.AssertionError
[ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126)
[ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45)
[ERROR] at 
com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161)
[ERROR] at 
com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215)
[ERROR] at 
com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952)
[ERROR] at com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64)
[ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876)
[ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143)
[ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129)
[ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512)
[ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471)
[ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78)
[ERROR] at 
com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186)
[ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346)
[ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219)
[ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205)
[ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64)
[ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54)
[ERROR] javadoc: error - fatal error
[ERROR] 
[ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc -J-Xmx1024m 
@options @packages
[ERROR] 
[ERROR] Refer to the generated Javadoc files in 
'/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3690) 'mvn site' fails on JDK8

2015-05-20 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated YARN-3690:

Description: 
'mvn site' failed by the following error:
{noformat}
[ERROR] 
/home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18:
 error: package org.apache.hadoop.yarn.factories has already been annotated
[ERROR] @InterfaceAudience.LimitedPrivate({ "MapReduce", "YARN" })
[ERROR] ^
[ERROR] java.lang.AssertionError
[ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126)
[ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45)
[ERROR] at 
com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161)
[ERROR] at 
com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215)
[ERROR] at 
com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952)
[ERROR] at com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64)
[ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876)
[ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143)
[ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129)
[ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512)
[ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471)
[ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78)
[ERROR] at 
com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186)
[ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346)
[ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219)
[ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205)
[ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64)
[ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54)
[ERROR] javadoc: error - fatal error
[ERROR] 
[ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc -J-Xmx1024m 
@options @packages
[ERROR] 
[ERROR] Refer to the generated Javadoc files in 
'/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
{noformat}

  was:
{noformat}
[ERROR] 
/home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18:
 error: package org.apache.hadoop.yarn.factories has already been annotated
[ERROR] @InterfaceAudience.LimitedPrivate({ "MapReduce", "YARN" })
[ERROR] ^
[ERROR] java.lang.AssertionError
[ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126)
[ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45)
[ERROR] at 
com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161)
[ERROR] at 
com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215)
[ERROR] at 
com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952)
[ERROR] at com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64)
[ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876)
[ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143)
[ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129)
[ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512)
[ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471)
[ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78)
[ERROR] at 
com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186)
[ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346)
[ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219)
[ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205)
[ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64)
[ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54)
[ERROR] javadoc: error - fatal error
[ERROR] 
[ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc -J-Xmx1024m 
@options @packages
[ERROR] 
[ERROR] Refer to the generated Javadoc files in 
'/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
{noformat}


> 'mvn site' fails on JDK8

[jira] [Assigned] (YARN-3690) 'mvn site' fails on JDK8

2015-05-20 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula reassigned YARN-3690:
--

Assignee: Brahma Reddy Battula

> 'mvn site' fails on JDK8
> 
>
> Key: YARN-3690
> URL: https://issues.apache.org/jira/browse/YARN-3690
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
> Environment: CentOS 7.0, Oracle JDK 8u45.
>Reporter: Akira AJISAKA
>Assignee: Brahma Reddy Battula
>
> 'mvn site' failed by the following error:
> {noformat}
> [ERROR] 
> /home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18:
>  error: package org.apache.hadoop.yarn.factories has already been annotated
> [ERROR] @InterfaceAudience.LimitedPrivate({ "MapReduce", "YARN" })
> [ERROR] ^
> [ERROR] java.lang.AssertionError
> [ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126)
> [ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45)
> [ERROR] at 
> com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161)
> [ERROR] at 
> com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215)
> [ERROR] at 
> com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952)
> [ERROR] at 
> com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64)
> [ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876)
> [ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143)
> [ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129)
> [ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512)
> [ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471)
> [ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78)
> [ERROR] at 
> com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186)
> [ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346)
> [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219)
> [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205)
> [ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64)
> [ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54)
> [ERROR] javadoc: error - fatal error
> [ERROR] 
> [ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc 
> -J-Xmx1024m @options @packages
> [ERROR] 
> [ERROR] Refer to the generated Javadoc files in 
> '/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir.
> [ERROR] -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3543) ApplicationReport should be able to tell whether the Application is AM managed or not.

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552125#comment-14552125
 ] 

Hadoop QA commented on YARN-3543:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 47s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 14 new or modified test files. |
| {color:green}+1{color} | javac |   7m 35s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 36s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   2m 16s | The applied patch generated  1 
new checkstyle issues (total was 14, now 14). |
| {color:green}+1{color} | whitespace |   0m 11s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 39s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 34s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   7m  9s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:red}-1{color} | mapreduce tests | 116m 37s | Tests failed in 
hadoop-mapreduce-client-jobclient. |
| {color:green}+1{color} | yarn tests |   0m 26s | Tests passed in 
hadoop-yarn-api. |
| {color:red}-1{color} | yarn tests |   6m 37s | Tests failed in 
hadoop-yarn-client. |
| {color:green}+1{color} | yarn tests |   2m  3s | Tests passed in 
hadoop-yarn-common. |
| {color:red}-1{color} | yarn tests |   0m 19s | Tests failed in 
hadoop-yarn-server-applicationhistoryservice. |
| {color:green}+1{color} | yarn tests |   0m 28s | Tests passed in 
hadoop-yarn-server-common. |
| {color:red}-1{color} | yarn tests |   0m 22s | Tests failed in 
hadoop-yarn-server-resourcemanager. |
| | | 171m 57s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.yarn.client.api.impl.TestAHSClient |
|   | hadoop.yarn.client.TestApplicationClientProtocolOnHA |
|   | hadoop.yarn.client.cli.TestYarnCLI |
|   | hadoop.yarn.client.api.impl.TestYarnClient |
| Timed out tests | org.apache.hadoop.mapreduce.TestMRJobClient |
|   | org.apache.hadoop.mapreduce.TestMapReduceLazyOutput |
| Failed build | hadoop-yarn-server-applicationhistoryservice |
|   | hadoop-yarn-server-resourcemanager |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734085/0004-YARN-3543.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / ce53c8e |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt
 |
| hadoop-mapreduce-client-jobclient test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/testrun_hadoop-mapreduce-client-jobclient.txt
 |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| hadoop-yarn-server-applicationhistoryservice test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/testrun_hadoop-yarn-server-applicationhistoryservice.txt
 |
| hadoop-yarn-server-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/testrun_hadoop-yarn-server-common.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8021/console |


This message was automatically generated.

> ApplicationReport should be able to tell whether the Application is AM 
> managed or not. 
> ---
>
> Key: YARN-3543
> URL: https://issues.apache.org/jira/browse/YARN-3543
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 2.6.0
>Repor

[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: YARN-3344-trunk.004.patch

updated patch with formatting issue fixed

> procfs stat file is not in the expected format warning
> --
>
> Key: YARN-3344
> URL: https://issues.apache.org/jira/browse/YARN-3344
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Jon Bringhurst
>Assignee: Ravindra Kumar Naik
> Attachments: YARN-3344-branch-trunk.001.patch, 
> YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch, 
> YARN-3344-trunk.004.patch
>
>
> Although this doesn't appear to be causing any functional issues, it is 
> spamming our log files quite a bit. :)
> It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
> /proc/<pid>/stat files.
> Here's the error I'm seeing:
> {noformat}
> "source_host": "asdf",
> "method": "constructProcessInfo",
> "level": "WARN",
> "message": "Unexpected: procfs stat file is not in the expected format 
> for process with pid 6953"
> "file": "ProcfsBasedProcessTree.java",
> "line_number": "514",
> "class": "org.apache.hadoop.yarn.util.ProcfsBasedProcessTree",
> {noformat}
> And here's the basic info on process with pid 6953:
> {noformat}
> [asdf ~]$ cat /proc/6953/stat
> 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
> 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
> 2 18446744073709551615 0 0 17 13 0 0 0 0 0
> [asdf ~]$ ps aux|grep 6953
> root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
> /export/apps/salt/minion-scripts/module-sync.py
> jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
> [asdf ~]$ 
> {noformat}
> This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.
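>
> A minimal sketch of a parse that tolerates spaces inside the parenthesised 
> comm field (illustrative only, not the exact ProcfsBasedProcessTree regex):
> {code}
> import java.util.regex.Matcher;
> import java.util.regex.Pattern;
>
> public class StatParseSketch {
>   // Greedily capture everything between the first '(' and the last ')' so a
>   // comm field containing spaces, e.g. "python2.6 /expo", still parses.
>   private static final Pattern STAT =
>       Pattern.compile("^(\\d+)\\s+\\((.*)\\)\\s+(\\S)\\s+(.*)$");
>
>   public static void main(String[] args) {
>     String line = "6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364";
>     Matcher m = STAT.matcher(line);
>     if (m.matches()) {
>       System.out.println("pid=" + m.group(1));    // 6953
>       System.out.println("comm=" + m.group(2));   // python2.6 /expo
>       System.out.println("state=" + m.group(3));  // S
>     }
>   }
> }
> {code}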



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3690) 'mvn site' fails on JDK8

2015-05-20 Thread Akira AJISAKA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552145#comment-14552145
 ] 

Akira AJISAKA commented on YARN-3690:
-

The problem is: 
* There are 2 package-info.java files for org.apache.hadoop.yarn.factories: one 
in hadoop-yarn-common and the other in hadoop-yarn-api.
* Both of them annotate the package.
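
For reference, each of the two files presumably looks roughly like this (a 
sketch based on the error message, not copied from the source tree):
{code}
// package-info.java: the same package-level annotation exists in both
// hadoop-yarn-api and hadoop-yarn-common, which is what trips the JDK8
// javadoc assertion above.
@InterfaceAudience.LimitedPrivate({ "MapReduce", "YARN" })
package org.apache.hadoop.yarn.factories;

import org.apache.hadoop.classification.InterfaceAudience;
{code}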

> 'mvn site' fails on JDK8
> 
>
> Key: YARN-3690
> URL: https://issues.apache.org/jira/browse/YARN-3690
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: documentation
> Environment: CentOS 7.0, Oracle JDK 8u45.
>Reporter: Akira AJISAKA
>Assignee: Brahma Reddy Battula
>
> 'mvn site' failed by the following error:
> {noformat}
> [ERROR] 
> /home/aajisaka/git/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/factories/package-info.java:18:
>  error: package org.apache.hadoop.yarn.factories has already been annotated
> [ERROR] @InterfaceAudience.LimitedPrivate({ "MapReduce", "YARN" })
> [ERROR] ^
> [ERROR] java.lang.AssertionError
> [ERROR] at com.sun.tools.javac.util.Assert.error(Assert.java:126)
> [ERROR] at com.sun.tools.javac.util.Assert.check(Assert.java:45)
> [ERROR] at 
> com.sun.tools.javac.code.SymbolMetadata.setDeclarationAttributesWithCompletion(SymbolMetadata.java:161)
> [ERROR] at 
> com.sun.tools.javac.code.Symbol.setDeclarationAttributesWithCompletion(Symbol.java:215)
> [ERROR] at 
> com.sun.tools.javac.comp.MemberEnter.actualEnterAnnotations(MemberEnter.java:952)
> [ERROR] at 
> com.sun.tools.javac.comp.MemberEnter.access$600(MemberEnter.java:64)
> [ERROR] at com.sun.tools.javac.comp.MemberEnter$5.run(MemberEnter.java:876)
> [ERROR] at com.sun.tools.javac.comp.Annotate.flush(Annotate.java:143)
> [ERROR] at com.sun.tools.javac.comp.Annotate.enterDone(Annotate.java:129)
> [ERROR] at com.sun.tools.javac.comp.Enter.complete(Enter.java:512)
> [ERROR] at com.sun.tools.javac.comp.Enter.main(Enter.java:471)
> [ERROR] at com.sun.tools.javadoc.JavadocEnter.main(JavadocEnter.java:78)
> [ERROR] at 
> com.sun.tools.javadoc.JavadocTool.getRootDocImpl(JavadocTool.java:186)
> [ERROR] at com.sun.tools.javadoc.Start.parseAndExecute(Start.java:346)
> [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:219)
> [ERROR] at com.sun.tools.javadoc.Start.begin(Start.java:205)
> [ERROR] at com.sun.tools.javadoc.Main.execute(Main.java:64)
> [ERROR] at com.sun.tools.javadoc.Main.main(Main.java:54)
> [ERROR] javadoc: error - fatal error
> [ERROR] 
> [ERROR] Command line was: /usr/java/jdk1.8.0_45/jre/../bin/javadoc 
> -J-Xmx1024m @options @packages
> [ERROR] 
> [ERROR] Refer to the generated Javadoc files in 
> '/home/aajisaka/git/hadoop/target/site/hadoop-project/api' dir.
> [ERROR] -> [Help 1]
> [ERROR] 
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR] 
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3543) ApplicationReport should be able to tell whether the Application is AM managed or not.

2015-05-20 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552165#comment-14552165
 ] 

Rohith commented on YARN-3543:
--

The build machine is not able to run all those tests in one shot. We faced a 
similar issue earlier in YARN-2784. I think we need to split the JIRA into the 
proto change, the WebUI change, the AH change, and more.

> ApplicationReport should be able to tell whether the Application is AM 
> managed or not. 
> ---
>
> Key: YARN-3543
> URL: https://issues.apache.org/jira/browse/YARN-3543
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: api
>Affects Versions: 2.6.0
>Reporter: Spandan Dutta
>Assignee: Rohith
>  Labels: BB2015-05-TBR
> Attachments: 0001-YARN-3543.patch, 0001-YARN-3543.patch, 
> 0002-YARN-3543.patch, 0002-YARN-3543.patch, 0003-YARN-3543.patch, 
> 0004-YARN-3543.patch, 0004-YARN-3543.patch, YARN-3543-AH.PNG, YARN-3543-RM.PNG
>
>
> Currently we can tell whether an application submitted by the user is AM 
> managed only from the applicationSubmissionContext, and only at the time the 
> user submits the job. We should have access to this information from the 
> ApplicationReport as well, so that we can check whether an app is AM managed 
> at any time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552181#comment-14552181
 ] 

Hudson commented on YARN-3601:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #202 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/202/])
YARN-3601. Fix UT TestRMFailover.testRMWebAppRedirect. Contributed by Weiwei 
Yang (xgong: rev 5009ad4a7f712fc578b461ecec53f7f97eaaed0c)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java
* hadoop-yarn-project/CHANGES.txt


> Fix UT TestRMFailover.testRMWebAppRedirect
> --
>
> Key: YARN-3601
> URL: https://issues.apache.org/jira/browse/YARN-3601
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, webapp
> Environment: Red Hat Enterprise Linux Workstation release 6.5 
> (Santiago)
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
>  Labels: test
> Fix For: 2.7.1
>
> Attachments: YARN-3601.001.patch
>
>
> This test case has not been working since the commit from YARN-2605. It 
> failed with an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552190#comment-14552190
 ] 

Hudson commented on YARN-3302:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #202 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/202/])
YARN-3302. TestDockerContainerExecutor should run automatically if it can 
detect docker in the usual place (Ravindra Kumar Naik via raviprak) (raviprak: 
rev c97f32e7b9d9e1d4c80682cc01741579166174d1)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java
* hadoop-yarn-project/CHANGES.txt


> TestDockerContainerExecutor should run automatically if it can detect docker 
> in the usual place
> ---
>
> Key: YARN-3302
> URL: https://issues.apache.org/jira/browse/YARN-3302
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.6.0
>Reporter: Ravi Prakash
>Assignee: Ravindra Kumar Naik
> Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch, 
> YARN-3302-trunk.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552192#comment-14552192
 ] 

Hudson commented on YARN-3677:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #202 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/202/])
YARN-3677. Fix findbugs warnings in yarn-server-resourcemanager. Contributed by 
Vinod Kumar Vavilapalli. (ozawa: rev 7401e5b5e8060b6b027d714b5ceb641fcfe5b598)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java


> Fix findbugs warnings in yarn-server-resourcemanager
> 
>
> Key: YARN-3677
> URL: https://issues.apache.org/jira/browse/YARN-3677
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Akira AJISAKA
>Assignee: Vinod Kumar Vavilapalli
>Priority: Minor
>  Labels: newbie
> Fix For: 2.7.1
>
> Attachments: YARN-3677-20150519.txt
>
>
> There is 1 findbugs warning in FileSystemRMStateStore.java.
> {noformat}
> Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of 
> time
> Unsynchronized access at FileSystemRMStateStore.java: [line 156]
> Field 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
> Synchronized 66% of the time
> Synchronized access at FileSystemRMStateStore.java: [line 148]
> Synchronized access at FileSystemRMStateStore.java: [line 859]
> {noformat}
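>
> A generic way to make this kind of access consistent, sketched for 
> illustration only (not necessarily what the attached patch does):
> {code}
> // Illustrative only: a flag written under a lock but read from other
> // threads can be made consistent by declaring it volatile.
> private volatile boolean isHDFS;
>
> synchronized void setIsHDFS(boolean value) {
>   isHDFS = value;
> }
>
> boolean checkIsHDFS() {
>   return isHDFS;   // unsynchronized read of a volatile stays consistent
> }
> {code}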



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552185#comment-14552185
 ] 

Hudson commented on YARN-3565:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #202 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/202/])
YARN-3565. NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel 
object instead of String. (Naganarasimha G R via wangda) (wangda: rev 
b37da52a1c4fb3da2bd21bfadc5ec61c5f953a59)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/NodeLabelsProvider.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/TestYarnServerApiClasses.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdaterForLabels.java


> NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object 
> instead of String
> -
>
> Key: YARN-3565
> URL: https://issues.apache.org/jira/browse/YARN-3565
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
>Priority: Blocker
> Fix For: 2.8.0
>
> Attachments: YARN-3565-20150502-1.patch, YARN-3565.20150515-1.patch, 
> YARN-3565.20150516-1.patch, YARN-3565.20150519-1.patch
>
>
> Now NM HB/Register uses Set<String>; it will be hard to add new fields if we 
> want to support specifying NodeLabel attributes such as 
> exclusivity/constraints, etc. We need to make sure rolling upgrade works.
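>
> Sketch of the shape change (types simplified; NodeLabel.newInstance(name, 
> exclusivity) is assumed here for illustration):
> {code}
> // Before: the heartbeat/register payload can only carry bare label names.
> Set<String> labels = Collections.singleton("GPU");
>
> // After: a structured NodeLabel leaves room for exclusivity/constraints.
> Set<NodeLabel> nodeLabels =
>     Collections.singleton(NodeLabel.newInstance("GPU", false));
> {code}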



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552180#comment-14552180
 ] 

Hudson commented on YARN-3583:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #202 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/202/])
YARN-3583. Support of NodeLabel object instead of plain String in YarnClient 
side. (Sunil G via wangda) (wangda: rev 
563eb1ad2ae848a23bbbf32ebfaf107e8fa14e87)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetNodesToLabelsResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/ReplaceLabelsOnNodeRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetLabelsToNodesResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetLabelsToNodesResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetNodesToLabelsResponse.java


> Support of NodeLabel object instead of plain String in YarnClient side.
> ---
>
> Key: YARN-3583
> URL: https://issues.apache.org/jira/browse/YARN-3583
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 2.6.0
>Reporter: Sunil G
>Assignee: Sunil G
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
> 0003-YARN-3583.patch, 0004-YARN-3583.patch
>
>
> Similar to YARN-3521, use NodeLabel objects in the YarnClient-side APIs.
> The getLabelsToNodes/getNodeToLabels APIs can use NodeLabel objects instead of 
> plain label names.
> This will help bring other label details, such as exclusivity, to the client 
> side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552186#comment-14552186
 ] 

Hudson commented on YARN-2821:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #202 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/202/])
YARN-2821. Fixed a problem that DistributedShell AM may hang if restarted. 
Contributed by Varun Vasudev (jianhe: rev 
7438966586f1896ab3e8b067d47a4af28a894106)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDSAppMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* hadoop-yarn-project/CHANGES.txt


> Distributed shell app master becomes unresponsive sometimes
> ---
>
> Key: YARN-2821
> URL: https://issues.apache.org/jira/browse/YARN-2821
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Affects Versions: 2.5.1
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.8.0
>
> Attachments: YARN-2821.002.patch, YARN-2821.003.patch, 
> YARN-2821.004.patch, YARN-2821.005.patch, apache-yarn-2821.0.patch, 
> apache-yarn-2821.1.patch
>
>
> We've noticed that once in a while the distributed shell app master becomes 
> unresponsive and is eventually killed by the RM. A snippet of the logs:
> {noformat}
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
> appattempt_1415123350094_0017_01 received 0 previous attempts' running 
> containers on AM registration.
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez2:45454
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_02, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
> container launch container for 
> containerid=container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> START_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> QUERY_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez3:45454
> 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez4:45454
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=3
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_03, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_04, 
> containerNode=onprem-tez3:45454, containerNodeURI=onprem-tez3:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_05, 
> containerNode=onprem-tez4:45454, containerNodeURI=onprem-tez4:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distrib

[jira] [Commented] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552195#comment-14552195
 ] 

Hadoop QA commented on YARN-3344:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m  1s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 47s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 46s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   0m 57s | The applied patch generated  2 
new checkstyle issues (total was 43, now 42). |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 25s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   2m  4s | Tests passed in 
hadoop-yarn-common. |
| | |  39m 39s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734110/YARN-3344-trunk.004.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4aa730c |
| checkstyle |  
https://builds.apache.org/job/PreCommit-YARN-Build/8022/artifact/patchprocess/diffcheckstylehadoop-yarn-common.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8022/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8022/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8022/console |


This message was automatically generated.

> procfs stat file is not in the expected format warning
> --
>
> Key: YARN-3344
> URL: https://issues.apache.org/jira/browse/YARN-3344
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Jon Bringhurst
>Assignee: Ravindra Kumar Naik
> Attachments: YARN-3344-branch-trunk.001.patch, 
> YARN-3344-branch-trunk.002.patch, YARN-3344-branch-trunk.003.patch, 
> YARN-3344-trunk.004.patch
>
>
> Although this doesn't appear to be causing any functional issues, it is 
> spamming our log files quite a bit. :)
> It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
> /proc/<pid>/stat files.
> Here's the error I'm seeing:
> {noformat}
> "source_host": "asdf",
> "method": "constructProcessInfo",
> "level": "WARN",
> "message": "Unexpected: procfs stat file is not in the expected format 
> for process with pid 6953"
> "file": "ProcfsBasedProcessTree.java",
> "line_number": "514",
> "class": "org.apache.hadoop.yarn.util.ProcfsBasedProcessTree",
> {noformat}
> And here's the basic info on process with pid 6953:
> {noformat}
> [asdf ~]$ cat /proc/6953/stat
> 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
> 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
> 2 18446744073709551615 0 0 17 13 0 0 0 0 0
> [asdf ~]$ ps aux|grep 6953
> root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
> /export/apps/salt/minion-scripts/module-sync.py
> jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
> [asdf ~]$ 
> {noformat}
> This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.
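The stat line above shows why a simple whitespace-based pattern can fail: the comm 
field, "(python2.6 /expo)", itself contains a space. A rough sketch of one way to 
tolerate that, anchoring on the closing parenthesis; this is illustrative plain 
Java, not the actual ProcfsBasedProcessTree code:

{code}
// Illustrative only; not the actual ProcfsBasedProcessTree implementation.
public class ProcStatParseSketch {
    public static void main(String[] args) {
        String stat = "6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 "
            + "1080 0 0 25 3 0 0 20 0 1 0 144918696 205295616 5856";

        // The comm field is wrapped in parentheses and may contain spaces,
        // so anchor on the last ')' instead of splitting the whole line.
        int open = stat.indexOf('(');
        int close = stat.lastIndexOf(')');
        String pid = stat.substring(0, open).trim();
        String comm = stat.substring(open + 1, close);
        String[] rest = stat.substring(close + 1).trim().split("\\s+");

        System.out.println("pid=" + pid + " comm=" + comm + " state=" + rest[0]);
    }
}
{code}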



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552203#comment-14552203
 ] 

Hudson commented on YARN-3583:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #933 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/933/])
YARN-3583. Support of NodeLabel object instead of plain String in YarnClient 
side. (Sunil G via wangda) (wangda: rev 
563eb1ad2ae848a23bbbf32ebfaf107e8fa14e87)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetNodesToLabelsResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetLabelsToNodesResponse.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/ReplaceLabelsOnNodeRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetLabelsToNodesResponsePBImpl.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetNodesToLabelsResponse.java


> Support of NodeLabel object instead of plain String in YarnClient side.
> ---
>
> Key: YARN-3583
> URL: https://issues.apache.org/jira/browse/YARN-3583
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 2.6.0
>Reporter: Sunil G
>Assignee: Sunil G
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
> 0003-YARN-3583.patch, 0004-YARN-3583.patch
>
>
> Similar to YARN-3521, use NodeLabel objects in the YarnClient-side APIs.
> The getLabelsToNodes/getNodeToLabels APIs can use NodeLabel objects instead of 
> plain label names.
> This will help bring other label details, such as exclusivity, to the client 
> side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552204#comment-14552204
 ] 

Hudson commented on YARN-3601:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #933 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/933/])
YARN-3601. Fix UT TestRMFailover.testRMWebAppRedirect. Contributed by Weiwei 
Yang (xgong: rev 5009ad4a7f712fc578b461ecec53f7f97eaaed0c)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java


> Fix UT TestRMFailover.testRMWebAppRedirect
> --
>
> Key: YARN-3601
> URL: https://issues.apache.org/jira/browse/YARN-3601
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, webapp
> Environment: Red Hat Enterprise Linux Workstation release 6.5 
> (Santiago)
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
>  Labels: test
> Fix For: 2.7.1
>
> Attachments: YARN-3601.001.patch
>
>
> This test case had not been working since the commit from YARN-2605. It failed 
> with an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552208#comment-14552208
 ] 

Hudson commented on YARN-3565:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #933 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/933/])
YARN-3565. NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel 
object instead of String. (Naganarasimha G R via wangda) (wangda: rev 
b37da52a1c4fb3da2bd21bfadc5ec61c5f953a59)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/TestYarnServerApiClasses.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdaterForLabels.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/NodeLabelsProvider.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto


> NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object 
> instead of String
> -
>
> Key: YARN-3565
> URL: https://issues.apache.org/jira/browse/YARN-3565
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
>Priority: Blocker
> Fix For: 2.8.0
>
> Attachments: YARN-3565-20150502-1.patch, YARN-3565.20150515-1.patch, 
> YARN-3565.20150516-1.patch, YARN-3565.20150519-1.patch
>
>
> Now NM HB/Register uses Set<String>; it will be hard to add new fields if we 
> want to support specifying NodeLabel attributes such as exclusivity/constraints, 
> etc. We need to make sure rolling upgrade works.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552209#comment-14552209
 ] 

Hudson commented on YARN-2821:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #933 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/933/])
YARN-2821. Fixed a problem that DistributedShell AM may hang if restarted. 
Contributed by Varun Vasudev (jianhe: rev 
7438966586f1896ab3e8b067d47a4af28a894106)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDSAppMaster.java


> Distributed shell app master becomes unresponsive sometimes
> ---
>
> Key: YARN-2821
> URL: https://issues.apache.org/jira/browse/YARN-2821
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Affects Versions: 2.5.1
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.8.0
>
> Attachments: YARN-2821.002.patch, YARN-2821.003.patch, 
> YARN-2821.004.patch, YARN-2821.005.patch, apache-yarn-2821.0.patch, 
> apache-yarn-2821.1.patch
>
>
> We've noticed that once in a while the distributed shell app master becomes 
> unresponsive and is eventually killed by the RM. A snippet of the logs:
> {noformat}
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
> appattempt_1415123350094_0017_01 received 0 previous attempts' running 
> containers on AM registration.
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez2:45454
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_02, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
> container launch container for 
> containerid=container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> START_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> QUERY_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez3:45454
> 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez4:45454
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=3
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_03, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_04, 
> containerNode=onprem-tez3:45454, containerNodeURI=onprem-tez3:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_05, 
> containerNode=onprem-tez4:45454, containerNodeURI=onprem-tez4:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.Ap

[jira] [Commented] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552214#comment-14552214
 ] 

Hudson commented on YARN-3677:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #933 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/933/])
YARN-3677. Fix findbugs warnings in yarn-server-resourcemanager. Contributed by 
Vinod Kumar Vavilapalli. (ozawa: rev 7401e5b5e8060b6b027d714b5ceb641fcfe5b598)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java


> Fix findbugs warnings in yarn-server-resourcemanager
> 
>
> Key: YARN-3677
> URL: https://issues.apache.org/jira/browse/YARN-3677
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Akira AJISAKA
>Assignee: Vinod Kumar Vavilapalli
>Priority: Minor
>  Labels: newbie
> Fix For: 2.7.1
>
> Attachments: YARN-3677-20150519.txt
>
>
> There is 1 findbugs warning in FileSystemRMStateStore.java.
> {noformat}
> Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of 
> time
> Unsynchronized access at FileSystemRMStateStore.java: [line 156]
> Field 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
> Synchronized 66% of the time
> Synchronized access at FileSystemRMStateStore.java: [line 148]
> Synchronized access at FileSystemRMStateStore.java: [line 859]
> {noformat}
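The warning says the isHDFS flag is read on at least one path without the lock that 
guards its writes. A generic sketch of the two usual remedies, shown on invented 
fields; this is not the actual FileSystemRMStateStore change:

{code}
// Illustrative only; shows two common ways to resolve an "inconsistent
// synchronization" findbugs warning on a flag that is written under a lock.
public class SyncFlagSketch {
    // Option 1: make the flag volatile so unsynchronized reads are safe.
    private volatile boolean isHDFS;

    // Option 2: synchronize every access, reads included.
    private boolean guardedFlag;

    public synchronized void setGuardedFlag(boolean value) {
        guardedFlag = value;
    }

    public synchronized boolean getGuardedFlag() {
        return guardedFlag;
    }

    public void setHDFS(boolean value) {
        isHDFS = value;          // volatile write
    }

    public boolean isHDFS() {
        return isHDFS;           // volatile read, no lock needed
    }
}
{code}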



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552212#comment-14552212
 ] 

Hudson commented on YARN-3302:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #933 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/933/])
YARN-3302. TestDockerContainerExecutor should run automatically if it can 
detect docker in the usual place (Ravindra Kumar Naik via raviprak) (raviprak: 
rev c97f32e7b9d9e1d4c80682cc01741579166174d1)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java
* hadoop-yarn-project/CHANGES.txt


> TestDockerContainerExecutor should run automatically if it can detect docker 
> in the usual place
> ---
>
> Key: YARN-3302
> URL: https://issues.apache.org/jira/browse/YARN-3302
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.6.0
>Reporter: Ravi Prakash
>Assignee: Ravindra Kumar Naik
> Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch, 
> YARN-3302-trunk.003.patch
>
>
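The title asks the test to gate itself on whether docker can be found in the usual 
place. A minimal JUnit 4 sketch of such a guard; the binary path is an assumption 
and this is not the committed TestDockerContainerExecutor code:

{code}
import java.io.File;
import org.junit.Assume;
import org.junit.Before;
import org.junit.Test;

// Illustrative only; the real check lives in TestDockerContainerExecutor.
public class DockerDetectionSketchTest {
    // Assumed "usual place"; adjust for the environment under test.
    private static final String DOCKER_BINARY = "/usr/bin/docker";

    @Before
    public void skipUnlessDockerPresent() {
        // Marks the test as skipped (not failed) when docker is absent.
        Assume.assumeTrue(new File(DOCKER_BINARY).canExecute());
    }

    @Test
    public void runsOnlyWhenDockerIsAvailable() {
        // ... exercise the container executor here ...
    }
}
{code}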




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3646) Applications are getting stuck some times in case of retry policy forever

2015-05-20 Thread Raju Bairishetti (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raju Bairishetti updated YARN-3646:
---
Attachment: YARN-3646.002.patch

[~rohithsharma] Thanks for the review and comments. Attached a new patch

> Applications are getting stuck some times in case of retry policy forever
> -
>
> Key: YARN-3646
> URL: https://issues.apache.org/jira/browse/YARN-3646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Reporter: Raju Bairishetti
> Attachments: YARN-3646.001.patch, YARN-3646.002.patch, YARN-3646.patch
>
>
> We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use the FOREVER 
> retry policy.
> The YARN client retries indefinitely on exceptions from the RM because the 
> retry policy is FOREVER. The problem is that it retries for all kinds of 
> exceptions (like ApplicationNotFoundException), even when the failure is not a 
> connection failure. Because of this, my application is not progressing further.
> *The YARN client should not retry indefinitely in case of non-connection 
> failures.*
> We have written a simple YARN client that tries to get an application report 
> for an invalid or older appId. The ResourceManager throws an 
> ApplicationNotFoundException because the appId is invalid or old, but because 
> of the FOREVER retry policy the client keeps retrying to get the application 
> report and the ResourceManager keeps throwing ApplicationNotFoundException.
> {code}
> private void testYarnClientRetryPolicy() throws  Exception{
> YarnConfiguration conf = new YarnConfiguration();
> conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, 
> -1);
> YarnClient yarnClient = YarnClient.createYarnClient();
> yarnClient.init(conf);
> yarnClient.start();
> ApplicationId appId = ApplicationId.newInstance(1430126768987L, 
> 10645);
> ApplicationReport report = yarnClient.getApplicationReport(appId);
> }
> {code}
> *RM logs:*
> {noformat}
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875162 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
> with id 'application_1430126768987_10645' doesn't exist in RM.
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> 
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875163 Retry#0
> 
> {noformat}
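A sketch of the behavior being asked for: keep retrying connection-level failures, 
but fail fast on application-level errors such as ApplicationNotFoundException. The 
retry loop and the exception class below are invented for illustration and are not 
the actual Hadoop RetryPolicy wiring:

{code}
import java.io.IOException;
import java.util.concurrent.Callable;

// Illustrative only; the real behavior is configured through Hadoop's
// retry policies, not through a hand-rolled loop like this.
public class SelectiveRetrySketch {

    static class ApplicationNotFoundException extends Exception {
        ApplicationNotFoundException(String msg) { super(msg); }
    }

    // Retry forever on connection problems, fail fast on everything else.
    static <T> T callWithRetry(Callable<T> call) throws Exception {
        while (true) {
            try {
                return call.call();
            } catch (IOException connectionFailure) {
                // Transient transport problem: back off and try again.
                Thread.sleep(1000);
            } catch (ApplicationNotFoundException fatal) {
                // Application-level error: retrying will never succeed.
                throw fatal;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        try {
            callWithRetry(() -> {
                throw new ApplicationNotFoundException(
                    "Application with id 'application_1430126768987_10645' doesn't exist");
            });
        } catch (ApplicationNotFoundException expected) {
            System.out.println("Failed fast: " + expected.getMessage());
        }
    }
}
{code}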



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever

2015-05-20 Thread Rohith (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552225#comment-14552225
 ] 

Rohith commented on YARN-3646:
--

+1 lgtm (non-binding). Waiting for the Jenkins report!

> Applications are getting stuck some times in case of retry policy forever
> -
>
> Key: YARN-3646
> URL: https://issues.apache.org/jira/browse/YARN-3646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Reporter: Raju Bairishetti
> Attachments: YARN-3646.001.patch, YARN-3646.002.patch, YARN-3646.patch
>
>
> We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use the FOREVER 
> retry policy.
> The YARN client retries indefinitely on exceptions from the RM because the 
> retry policy is FOREVER. The problem is that it retries for all kinds of 
> exceptions (like ApplicationNotFoundException), even when the failure is not a 
> connection failure. Because of this, my application is not progressing further.
> *The YARN client should not retry indefinitely in case of non-connection 
> failures.*
> We have written a simple YARN client that tries to get an application report 
> for an invalid or older appId. The ResourceManager throws an 
> ApplicationNotFoundException because the appId is invalid or old, but because 
> of the FOREVER retry policy the client keeps retrying to get the application 
> report and the ResourceManager keeps throwing ApplicationNotFoundException.
> {code}
> private void testYarnClientRetryPolicy() throws  Exception{
> YarnConfiguration conf = new YarnConfiguration();
> conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, 
> -1);
> YarnClient yarnClient = YarnClient.createYarnClient();
> yarnClient.init(conf);
> yarnClient.start();
> ApplicationId appId = ApplicationId.newInstance(1430126768987L, 
> 10645);
> ApplicationReport report = yarnClient.getApplicationReport(appId);
> }
> {code}
> *RM logs:*
> {noformat}
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875162 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
> with id 'application_1430126768987_10645' doesn't exist in RM.
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
>   at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
> 
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 47 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875163 Retry#0
> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: (was: YARN-3344-branch-trunk.002.patch)

> procfs stat file is not in the expected format warning
> --
>
> Key: YARN-3344
> URL: https://issues.apache.org/jira/browse/YARN-3344
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Jon Bringhurst
>Assignee: Ravindra Kumar Naik
> Attachments: YARN-3344-branch-trunk.003.patch, 
> YARN-3344-trunk.004.patch
>
>
> Although this doesn't appear to be causing any functional issues, it is 
> spamming our log files quite a bit. :)
> It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
> /proc/<pid>/stat files.
> Here's the error I'm seeing:
> {noformat}
> "source_host": "asdf",
> "method": "constructProcessInfo",
> "level": "WARN",
> "message": "Unexpected: procfs stat file is not in the expected format 
> for process with pid 6953"
> "file": "ProcfsBasedProcessTree.java",
> "line_number": "514",
> "class": "org.apache.hadoop.yarn.util.ProcfsBasedProcessTree",
> {noformat}
> And here's the basic info on process with pid 6953:
> {noformat}
> [asdf ~]$ cat /proc/6953/stat
> 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
> 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
> 2 18446744073709551615 0 0 17 13 0 0 0 0 0
> [asdf ~]$ ps aux|grep 6953
> root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
> /export/apps/salt/minion-scripts/module-sync.py
> jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
> [asdf ~]$ 
> {noformat}
> This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: (was: YARN-3344-branch-trunk.001.patch)

> procfs stat file is not in the expected format warning
> --
>
> Key: YARN-3344
> URL: https://issues.apache.org/jira/browse/YARN-3344
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Jon Bringhurst
>Assignee: Ravindra Kumar Naik
> Attachments: YARN-3344-branch-trunk.003.patch, 
> YARN-3344-trunk.004.patch
>
>
> Although this doesn't appear to be causing any functional issues, it is 
> spamming our log files quite a bit. :)
> It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
> /proc/<pid>/stat files.
> Here's the error I'm seeing:
> {noformat}
> "source_host": "asdf",
> "method": "constructProcessInfo",
> "level": "WARN",
> "message": "Unexpected: procfs stat file is not in the expected format 
> for process with pid 6953"
> "file": "ProcfsBasedProcessTree.java",
> "line_number": "514",
> "class": "org.apache.hadoop.yarn.util.ProcfsBasedProcessTree",
> {noformat}
> And here's the basic info on process with pid 6953:
> {noformat}
> [asdf ~]$ cat /proc/6953/stat
> 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
> 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
> 2 18446744073709551615 0 0 17 13 0 0 0 0 0
> [asdf ~]$ ps aux|grep 6953
> root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
> /export/apps/salt/minion-scripts/module-sync.py
> jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
> [asdf ~]$ 
> {noformat}
> This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: YARN-3344-trunk.005.patch

Updated the patch with the checkstyle issue handled.

> procfs stat file is not in the expected format warning
> --
>
> Key: YARN-3344
> URL: https://issues.apache.org/jira/browse/YARN-3344
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Jon Bringhurst
>Assignee: Ravindra Kumar Naik
> Attachments: YARN-3344-trunk.005.patch
>
>
> Although this doesn't appear to be causing any functional issues, it is 
> spamming our log files quite a bit. :)
> It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
> /proc/<pid>/stat files.
> Here's the error I'm seeing:
> {noformat}
> "source_host": "asdf",
> "method": "constructProcessInfo",
> "level": "WARN",
> "message": "Unexpected: procfs stat file is not in the expected format 
> for process with pid 6953"
> "file": "ProcfsBasedProcessTree.java",
> "line_number": "514",
> "class": "org.apache.hadoop.yarn.util.ProcfsBasedProcessTree",
> {noformat}
> And here's the basic info on process with pid 6953:
> {noformat}
> [asdf ~]$ cat /proc/6953/stat
> 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
> 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
> 2 18446744073709551615 0 0 17 13 0 0 0 0 0
> [asdf ~]$ ps aux|grep 6953
> root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
> /export/apps/salt/minion-scripts/module-sync.py
> jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
> [asdf ~]$ 
> {noformat}
> This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: (was: YARN-3344-trunk.004.patch)

> procfs stat file is not in the expected format warning
> --
>
> Key: YARN-3344
> URL: https://issues.apache.org/jira/browse/YARN-3344
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Jon Bringhurst
>Assignee: Ravindra Kumar Naik
>
> Although this doesn't appear to be causing any functional issues, it is 
> spamming our log files quite a bit. :)
> It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
> /proc/<pid>/stat files.
> Here's the error I'm seeing:
> {noformat}
> "source_host": "asdf",
> "method": "constructProcessInfo",
> "level": "WARN",
> "message": "Unexpected: procfs stat file is not in the expected format 
> for process with pid 6953"
> "file": "ProcfsBasedProcessTree.java",
> "line_number": "514",
> "class": "org.apache.hadoop.yarn.util.ProcfsBasedProcessTree",
> {noformat}
> And here's the basic info on process with pid 6953:
> {noformat}
> [asdf ~]$ cat /proc/6953/stat
> 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
> 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
> 2 18446744073709551615 0 0 17 13 0 0 0 0 0
> [asdf ~]$ ps aux|grep 6953
> root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
> /export/apps/salt/minion-scripts/module-sync.py
> jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
> [asdf ~]$ 
> {noformat}
> This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Ravindra Kumar Naik (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Kumar Naik updated YARN-3344:
--
Attachment: (was: YARN-3344-branch-trunk.003.patch)

> procfs stat file is not in the expected format warning
> --
>
> Key: YARN-3344
> URL: https://issues.apache.org/jira/browse/YARN-3344
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Jon Bringhurst
>Assignee: Ravindra Kumar Naik
>
> Although this doesn't appear to be causing any functional issues, it is 
> spamming our log files quite a bit. :)
> It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
> /proc/<pid>/stat files.
> Here's the error I'm seeing:
> {noformat}
> "source_host": "asdf",
> "method": "constructProcessInfo",
> "level": "WARN",
> "message": "Unexpected: procfs stat file is not in the expected format 
> for process with pid 6953"
> "file": "ProcfsBasedProcessTree.java",
> "line_number": "514",
> "class": "org.apache.hadoop.yarn.util.ProcfsBasedProcessTree",
> {noformat}
> And here's the basic info on process with pid 6953:
> {noformat}
> [asdf ~]$ cat /proc/6953/stat
> 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
> 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
> 2 18446744073709551615 0 0 17 13 0 0 0 0 0
> [asdf ~]$ ps aux|grep 6953
> root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
> /export/apps/salt/minion-scripts/module-sync.py
> jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
> [asdf ~]$ 
> {noformat}
> This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-1902) Allocation of too many containers when a second request is done with the same resource capability

2015-05-20 Thread MENG DING (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552375#comment-14552375
 ] 

MENG DING commented on YARN-1902:
-

I have been experimenting with the idea of changing AppSchedulingInfo to 
maintain a total request table and a fulfilled allocation table, and then 
calculate the difference of the two tables as the real outstanding request 
table used for scheduling. All was fine until I realized that this cannot handle 
one use case: an AMRMClient, right before sending the allocation heartbeat, 
removes all container requests and adds new container requests at the same 
priority and location (possibly with a different resource capability). 
AppSchedulingInfo does not know about this, and may not treat the newly added 
container requests as outstanding requests.
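A minimal sketch of that bookkeeping with a single flattened key; the class and 
method names are invented for illustration and are not the AppSchedulingInfo API:

{code}
import java.util.HashMap;
import java.util.Map;

// Invented names for illustration; not the AppSchedulingInfo API.
public class OutstandingRequestSketch {
    // key = priority + location + capability, flattened to a String here
    private final Map<String, Integer> totalRequested = new HashMap<>();
    private final Map<String, Integer> fulfilled = new HashMap<>();

    void addRequests(String key, int count) {
        totalRequested.merge(key, count, Integer::sum);
    }

    void removeRequests(String key, int count) {
        totalRequested.merge(key, -count, Integer::sum);
    }

    void recordAllocation(String key, int count) {
        fulfilled.merge(key, count, Integer::sum);
    }

    // Outstanding = total asked for minus what has already been allocated.
    int outstanding(String key) {
        return Math.max(0,
            totalRequested.getOrDefault(key, 0) - fulfilled.getOrDefault(key, 0));
    }

    public static void main(String[] args) {
        OutstandingRequestSketch s = new OutstandingRequestSketch();
        s.addRequests("pri0/*/1024mb", 3);       // ask for 3 containers
        s.recordAllocation("pri0/*/1024mb", 3);  // all 3 get allocated
        // Problematic case from the comment above: right before the next
        // heartbeat the client removes its requests and adds 2 fresh ones
        // at the same priority/location.
        s.removeRequests("pri0/*/1024mb", 3);
        s.addRequests("pri0/*/1024mb", 2);
        // total=2, fulfilled=3, so outstanding clamps to 0 and the 2 new
        // asks would never be scheduled.
        System.out.println(s.outstanding("pri0/*/1024mb"));   // prints 0
    }
}
{code}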

I agree that currently I do not see a clean solution without affecting backward 
compatibility. 

> Allocation of too many containers when a second request is done with the same 
> resource capability
> -
>
> Key: YARN-1902
> URL: https://issues.apache.org/jira/browse/YARN-1902
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 2.2.0, 2.3.0, 2.4.0
>Reporter: Sietse T. Au
>Assignee: Sietse T. Au
>  Labels: client
> Attachments: YARN-1902.patch, YARN-1902.v2.patch, YARN-1902.v3.patch
>
>
> Regarding AMRMClientImpl
> Scenario 1:
> Given a ContainerRequest x with Resource y: when addContainerRequest is 
> called z times with x, allocate is called, and at least one of the z allocated 
> containers is started, then if another addContainerRequest call is made and 
> subsequently an allocate call is sent to the RM, (z+1) containers will be 
> allocated, where 1 container is expected.
> Scenario 2:
> No containers are started between the allocate calls. 
> Analyzing debug logs of the AMRMClientImpl, I have found that indeed (z+1) 
> containers are requested in both scenarios, but that the correct behavior is 
> observed only in the second scenario.
> Looking at the implementation, I have found that this (z+1) request is caused 
> by the structure of the remoteRequestsTable. The consequence of the 
> Map<..., ResourceRequestInfo> structure is that ResourceRequestInfo does not hold any 
> information about whether a request has been sent to the RM yet or not.
> There are workarounds for this, such as releasing the excess containers 
> received.
> The solution implemented is to initialize a new ResourceRequest in 
> ResourceRequestInfo when a request has been successfully sent to the RM.
> The patch includes a test in which scenario one is tested.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552391#comment-14552391
 ] 

Hudson commented on YARN-3565:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2131 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2131/])
YARN-3565. NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel 
object instead of String. (Naganarasimha G R via wangda) (wangda: rev 
b37da52a1c4fb3da2bd21bfadc5ec61c5f953a59)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdaterForLabels.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/TestYarnServerApiClasses.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/NodeLabelsProvider.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java


> NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object 
> instead of String
> -
>
> Key: YARN-3565
> URL: https://issues.apache.org/jira/browse/YARN-3565
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
>Priority: Blocker
> Fix For: 2.8.0
>
> Attachments: YARN-3565-20150502-1.patch, YARN-3565.20150515-1.patch, 
> YARN-3565.20150516-1.patch, YARN-3565.20150519-1.patch
>
>
> Now NM HB/Register uses Set<String>; it will be hard to add new fields if we 
> want to support specifying NodeLabel attributes such as exclusivity/constraints, 
> etc. We need to make sure rolling upgrade works.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552387#comment-14552387
 ] 

Hudson commented on YARN-3601:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2131 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2131/])
YARN-3601. Fix UT TestRMFailover.testRMWebAppRedirect. Contributed by Weiwei 
Yang (xgong: rev 5009ad4a7f712fc578b461ecec53f7f97eaaed0c)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java
* hadoop-yarn-project/CHANGES.txt


> Fix UT TestRMFailover.testRMWebAppRedirect
> --
>
> Key: YARN-3601
> URL: https://issues.apache.org/jira/browse/YARN-3601
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, webapp
> Environment: Red Hat Enterprise Linux Workstation release 6.5 
> (Santiago)
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
>  Labels: test
> Fix For: 2.7.1
>
> Attachments: YARN-3601.001.patch
>
>
> This test case had not been working since the commit from YARN-2605. It failed 
> with an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552397#comment-14552397
 ] 

Hudson commented on YARN-3677:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2131 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2131/])
YARN-3677. Fix findbugs warnings in yarn-server-resourcemanager. Contributed by 
Vinod Kumar Vavilapalli. (ozawa: rev 7401e5b5e8060b6b027d714b5ceb641fcfe5b598)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* hadoop-yarn-project/CHANGES.txt


> Fix findbugs warnings in yarn-server-resourcemanager
> 
>
> Key: YARN-3677
> URL: https://issues.apache.org/jira/browse/YARN-3677
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Akira AJISAKA
>Assignee: Vinod Kumar Vavilapalli
>Priority: Minor
>  Labels: newbie
> Fix For: 2.7.1
>
> Attachments: YARN-3677-20150519.txt
>
>
> There is 1 findbugs warning in FileSystemRMStateStore.java.
> {noformat}
> Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of 
> time
> Unsynchronized access at FileSystemRMStateStore.java: [line 156]
> Field 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
> Synchronized 66% of the time
> Synchronized access at FileSystemRMStateStore.java: [line 148]
> Synchronized access at FileSystemRMStateStore.java: [line 859]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552395#comment-14552395
 ] 

Hudson commented on YARN-3302:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2131 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2131/])
YARN-3302. TestDockerContainerExecutor should run automatically if it can 
detect docker in the usual place (Ravindra Kumar Naik via raviprak) (raviprak: 
rev c97f32e7b9d9e1d4c80682cc01741579166174d1)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java
* hadoop-yarn-project/CHANGES.txt


> TestDockerContainerExecutor should run automatically if it can detect docker 
> in the usual place
> ---
>
> Key: YARN-3302
> URL: https://issues.apache.org/jira/browse/YARN-3302
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.6.0
>Reporter: Ravi Prakash
>Assignee: Ravindra Kumar Naik
> Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch, 
> YARN-3302-trunk.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552386#comment-14552386
 ] 

Hudson commented on YARN-3583:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2131 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2131/])
YARN-3583. Support of NodeLabel object instead of plain String in YarnClient 
side. (Sunil G via wangda) (wangda: rev 
563eb1ad2ae848a23bbbf32ebfaf107e8fa14e87)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/ReplaceLabelsOnNodeRequestPBImpl.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetLabelsToNodesResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetNodesToLabelsResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetLabelsToNodesResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetNodesToLabelsResponse.java


> Support of NodeLabel object instead of plain String in YarnClient side.
> ---
>
> Key: YARN-3583
> URL: https://issues.apache.org/jira/browse/YARN-3583
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 2.6.0
>Reporter: Sunil G
>Assignee: Sunil G
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
> 0003-YARN-3583.patch, 0004-YARN-3583.patch
>
>
> Similar to YARN-3521, use NodeLabel objects in the YarnClient-side APIs.
> The getLabelsToNodes/getNodeToLabels APIs can use NodeLabel objects instead of 
> plain label names.
> This will help bring other label details, such as exclusivity, to the client 
> side.
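
A rough sketch of the kind of signature change described above (the exact shapes are assumptions for illustration and are not copied from the attached patches):

{code}
import java.util.Map;
import java.util.Set;

import org.apache.hadoop.yarn.api.records.NodeId;
import org.apache.hadoop.yarn.api.records.NodeLabel;

// Hypothetical interface fragment: plain label names (String) are replaced by
// NodeLabel objects so that extra details such as exclusivity can travel with
// each label.
public interface LabelAwareClientSketch {
  // before: Map<NodeId, Set<String>> getNodeToLabels()
  Map<NodeId, Set<NodeLabel>> getNodeToLabels() throws Exception;

  // before: Map<String, Set<NodeId>> getLabelsToNodes()
  Map<NodeLabel, Set<NodeId>> getLabelsToNodes() throws Exception;
}
{code}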



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552392#comment-14552392
 ] 

Hudson commented on YARN-2821:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #2131 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/2131/])
YARN-2821. Fixed a problem that DistributedShell AM may hang if restarted. 
Contributed by Varun Vasudev (jianhe: rev 
7438966586f1896ab3e8b067d47a4af28a894106)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDSAppMaster.java


> Distributed shell app master becomes unresponsive sometimes
> ---
>
> Key: YARN-2821
> URL: https://issues.apache.org/jira/browse/YARN-2821
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Affects Versions: 2.5.1
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.8.0
>
> Attachments: YARN-2821.002.patch, YARN-2821.003.patch, 
> YARN-2821.004.patch, YARN-2821.005.patch, apache-yarn-2821.0.patch, 
> apache-yarn-2821.1.patch
>
>
> We've noticed that once in a while the distributed shell app master becomes 
> unresponsive and is eventually killed by the RM. A snippet of the logs:
> {noformat}
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
> appattempt_1415123350094_0017_01 received 0 previous attempts' running 
> containers on AM registration.
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez2:45454
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_02, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
> container launch container for 
> containerid=container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> START_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> QUERY_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez3:45454
> 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez4:45454
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=3
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_03, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_04, 
> containerNode=onprem-tez3:45454, containerNodeURI=onprem-tez3:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_05, 
> containerNode=onprem-tez4:45454, containerNodeURI=onprem-tez4:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.

[jira] [Commented] (YARN-3646) Applications are getting stuck some times in case of retry policy forever

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552404#comment-14552404
 ] 

Hadoop QA commented on YARN-3646:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 34s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 38s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m  6s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   6m 51s | Tests passed in 
hadoop-yarn-client. |
| {color:green}+1{color} | yarn tests |   1m 55s | Tests passed in 
hadoop-yarn-common. |
| | |  45m 47s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734115/YARN-3646.002.patch |
| Optional Tests | javac unit findbugs checkstyle javadoc |
| git revision | trunk / 4aa730c |
| hadoop-yarn-client test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8023/artifact/patchprocess/testrun_hadoop-yarn-client.txt
 |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8023/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8023/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8023/console |


This message was automatically generated.

> Applications are getting stuck some times in case of retry policy forever
> -
>
> Key: YARN-3646
> URL: https://issues.apache.org/jira/browse/YARN-3646
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Reporter: Raju Bairishetti
> Attachments: YARN-3646.001.patch, YARN-3646.002.patch, YARN-3646.patch
>
>
> We have set *yarn.resourcemanager.connect.wait-ms* to -1 to use the FOREVER 
> retry policy.
> The YARN client retries infinitely on exceptions from the RM because it is 
> using the FOREVER retry policy. The problem is that it retries for all kinds 
> of exceptions (like ApplicationNotFoundException), even when the failure is 
> not a connection failure. Because of this, the application does not progress.
> *The YARN client should not retry infinitely for non-connection failures.*
> We have written a simple YARN client that tries to get an application report 
> for an invalid or older appId. The ResourceManager throws an 
> ApplicationNotFoundException because the appId is invalid or older. But 
> because of the FOREVER retry policy, the client keeps retrying the 
> application-report call and the ResourceManager keeps throwing 
> ApplicationNotFoundException continuously.
> {code}
> private void testYarnClientRetryPolicy() throws  Exception{
> YarnConfiguration conf = new YarnConfiguration();
> conf.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, 
> -1);
> YarnClient yarnClient = YarnClient.createYarnClient();
> yarnClient.init(conf);
> yarnClient.start();
> ApplicationId appId = ApplicationId.newInstance(1430126768987L, 
> 10645);
> ApplicationReport report = yarnClient.getApplicationReport(appId);
> }
> {code}
> *RM logs:*
> {noformat}
> 15/05/14 16:33:24 INFO ipc.Server: IPC Server handler 21 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from 10.14.120.231:61621 Call#875162 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
> with id 'application_1430126768987_10645' doesn't exist in RM.
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:284)
>   at 
> org.apache.hadoop.yarn.api.impl.pb.servic
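
For illustration of the reporter's point (a caller-side sketch, not the attached patches, and it does not change the RPC-level retry behavior): ApplicationNotFoundException is not a connection failure, so a caller can treat it as permanent instead of depending on the retry policy to give up.

{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException;

// Hypothetical helper class, illustration only.
public class GetReportSketch {
  public static ApplicationReport getReportOrFail(ApplicationId appId) throws Exception {
    YarnClient yarnClient = YarnClient.createYarnClient();
    yarnClient.init(new YarnConfiguration());
    yarnClient.start();
    try {
      return yarnClient.getApplicationReport(appId);
    } catch (ApplicationNotFoundException e) {
      // Not a connection failure: surface it to the caller instead of retrying.
      throw new IllegalArgumentException("Unknown application: " + appId, e);
    } finally {
      yarnClient.stop();
    }
  }
}
{code}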

[jira] [Commented] (YARN-314) Schedulers should allow resource requests of different sizes at the same priority and location

2015-05-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552416#comment-14552416
 ] 

Karthik Kambatla commented on YARN-314:
---

Discussed this with [~asuresh] offline. We were wondering if AppSchedulingInfo 
should be supplemented (or replaced) by another singleton data structure that 
captures pending requests and maintains multiple maps to index these requests 
by both apps and nodes/racks. We should, of course, add convenience methods 
to add, remove, or query these requests.
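
A rough sketch of such a structure, with placeholder names and types (not code from any attached patch): pending requests indexed both by application and by node/rack, behind a single lock.

{code}
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a shared pending-request index.
public class PendingRequestIndex<R> {
  private final Map<String, Set<R>> byApp = new HashMap<String, Set<R>>();
  private final Map<String, Set<R>> byNode = new HashMap<String, Set<R>>();

  public synchronized void add(String appId, String nodeOrRack, R request) {
    bucket(byApp, appId).add(request);
    bucket(byNode, nodeOrRack).add(request);
  }

  public synchronized void remove(String appId, String nodeOrRack, R request) {
    bucket(byApp, appId).remove(request);
    bucket(byNode, nodeOrRack).remove(request);
  }

  public synchronized Collection<R> pendingForApp(String appId) {
    return new HashSet<R>(bucket(byApp, appId));      // defensive copy
  }

  public synchronized Collection<R> pendingForNode(String nodeOrRack) {
    return new HashSet<R>(bucket(byNode, nodeOrRack)); // defensive copy
  }

  private Set<R> bucket(Map<String, Set<R>> index, String key) {
    Set<R> s = index.get(key);
    if (s == null) {
      s = new HashSet<R>();
      index.put(key, s);
    }
    return s;
  }
}
{code}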

> Schedulers should allow resource requests of different sizes at the same 
> priority and location
> --
>
> Key: YARN-314
> URL: https://issues.apache.org/jira/browse/YARN-314
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
> Attachments: yarn-314-prelim.patch
>
>
> Currently, resource requests for the same container and locality are expected 
> to all be the same size.
> While it doesn't look like it's needed for apps currently, and can be 
> circumvented by specifying different priorities if absolutely necessary, it 
> seems to me that the ability to request containers with different resource 
> requirements at the same priority level should be there for the future and 
> for completeness' sake.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552442#comment-14552442
 ] 

Hudson commented on YARN-2821:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #191 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/191/])
YARN-2821. Fixed a problem that DistributedShell AM may hang if restarted. 
Contributed by Varun Vasudev (jianhe: rev 
7438966586f1896ab3e8b067d47a4af28a894106)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDSAppMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java


> Distributed shell app master becomes unresponsive sometimes
> ---
>
> Key: YARN-2821
> URL: https://issues.apache.org/jira/browse/YARN-2821
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Affects Versions: 2.5.1
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.8.0
>
> Attachments: YARN-2821.002.patch, YARN-2821.003.patch, 
> YARN-2821.004.patch, YARN-2821.005.patch, apache-yarn-2821.0.patch, 
> apache-yarn-2821.1.patch
>
>
> We've noticed that once in a while the distributed shell app master becomes 
> unresponsive and is eventually killed by the RM. A snippet of the logs:
> {noformat}
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
> appattempt_1415123350094_0017_01 received 0 previous attempts' running 
> containers on AM registration.
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez2:45454
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_02, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
> container launch container for 
> containerid=container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> START_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> QUERY_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez3:45454
> 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez4:45454
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=3
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_03, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_04, 
> containerNode=onprem-tez3:45454, containerNodeURI=onprem-tez3:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_05, 
> containerNode=onprem-tez4:45454, containerNodeURI=onprem-tez4:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distrib

[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552437#comment-14552437
 ] 

Hudson commented on YARN-3601:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #191 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/191/])
YARN-3601. Fix UT TestRMFailover.testRMWebAppRedirect. Contributed by Weiwei 
Yang (xgong: rev 5009ad4a7f712fc578b461ecec53f7f97eaaed0c)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java
* hadoop-yarn-project/CHANGES.txt


> Fix UT TestRMFailover.testRMWebAppRedirect
> --
>
> Key: YARN-3601
> URL: https://issues.apache.org/jira/browse/YARN-3601
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, webapp
> Environment: Red Hat Enterprise Linux Workstation release 6.5 
> (Santiago)
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
>  Labels: test
> Fix For: 2.7.1
>
> Attachments: YARN-3601.001.patch
>
>
> This test case has not been working since the commit from YARN-2605. It 
> failed with an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552445#comment-14552445
 ] 

Hudson commented on YARN-3302:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #191 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/191/])
YARN-3302. TestDockerContainerExecutor should run automatically if it can 
detect docker in the usual place (Ravindra Kumar Naik via raviprak) (raviprak: 
rev c97f32e7b9d9e1d4c80682cc01741579166174d1)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java
* hadoop-yarn-project/CHANGES.txt


> TestDockerContainerExecutor should run automatically if it can detect docker 
> in the usual place
> ---
>
> Key: YARN-3302
> URL: https://issues.apache.org/jira/browse/YARN-3302
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.6.0
>Reporter: Ravi Prakash
>Assignee: Ravindra Kumar Naik
> Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch, 
> YARN-3302-trunk.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552441#comment-14552441
 ] 

Hudson commented on YARN-3565:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #191 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/191/])
YARN-3565. NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel 
object instead of String. (Naganarasimha G R via wangda) (wangda: rev 
b37da52a1c4fb3da2bd21bfadc5ec61c5f953a59)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/NodeLabelsProvider.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdaterForLabels.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/TestYarnServerApiClasses.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatRequestPBImpl.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto


> NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object 
> instead of String
> -
>
> Key: YARN-3565
> URL: https://issues.apache.org/jira/browse/YARN-3565
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
>Priority: Blocker
> Fix For: 2.8.0
>
> Attachments: YARN-3565-20150502-1.patch, YARN-3565.20150515-1.patch, 
> YARN-3565.20150516-1.patch, YARN-3565.20150519-1.patch
>
>
> Now NM HB/Register uses Set<String>; it will be hard to add new fields if we 
> want to support specifying NodeLabel attributes such as 
> exclusivity/constraints, etc. We need to make sure rolling upgrade works.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3344) procfs stat file is not in the expected format warning

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552431#comment-14552431
 ] 

Hadoop QA commented on YARN-3344:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m  2s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 41s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |  10m  1s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 24s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 23s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   1m 57s | Tests passed in 
hadoop-yarn-common. |
| | |  39m  2s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734126/YARN-3344-trunk.005.patch
 |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4aa730c |
| hadoop-yarn-common test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8024/artifact/patchprocess/testrun_hadoop-yarn-common.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8024/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8024/console |


This message was automatically generated.

> procfs stat file is not in the expected format warning
> --
>
> Key: YARN-3344
> URL: https://issues.apache.org/jira/browse/YARN-3344
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Jon Bringhurst
>Assignee: Ravindra Kumar Naik
> Attachments: YARN-3344-trunk.005.patch
>
>
> Although this doesn't appear to be causing any functional issues, it is 
> spamming our log files quite a bit. :)
> It appears that the regex in ProcfsBasedProcessTree doesn't work for all 
> /proc/<pid>/stat files.
> Here's the error I'm seeing:
> {noformat}
> "source_host": "asdf",
> "method": "constructProcessInfo",
> "level": "WARN",
> "message": "Unexpected: procfs stat file is not in the expected format 
> for process with pid 6953"
> "file": "ProcfsBasedProcessTree.java",
> "line_number": "514",
> "class": "org.apache.hadoop.yarn.util.ProcfsBasedProcessTree",
> {noformat}
> And here's the basic info on process with pid 6953:
> {noformat}
> [asdf ~]$ cat /proc/6953/stat
> 6953 (python2.6 /expo) S 1871 1871 1871 0 -1 4202496 9364 1080 0 0 25 3 0 0 
> 20 0 1 0 144918696 205295616 5856 18446744073709551615 1 1 0 0 0 0 0 16781312 
> 2 18446744073709551615 0 0 17 13 0 0 0 0 0
> [asdf ~]$ ps aux|grep 6953
> root  6953  0.0  0.0 200484 23424 ?S21:44   0:00 python2.6 
> /export/apps/salt/minion-scripts/module-sync.py
> jbringhu 13481  0.0  0.0 105312   872 pts/0S+   22:13   0:00 grep -i 6953
> [asdf ~]$ 
> {noformat}
> This is using 2.6.32-431.11.2.el6.x86_64 in RHEL 6.5.
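
One plausible reading of the stat line above is that the command field, "(python2.6 /expo)", contains a space, which a whitespace-based regex will not match. A parsing sketch that tolerates this (an assumption about the failure mode, not the attached patch):

{code}
// Illustrative parse only -- not the YARN-3344 patch.
// Splitting on the *last* ')' keeps a command name containing spaces intact
// before tokenizing the remaining whitespace-separated fields.
public class ProcStatParseSketch {
  public static String[] parse(String statLine) {
    int open = statLine.indexOf('(');
    int close = statLine.lastIndexOf(')');
    if (open < 0 || close < open) {
      throw new IllegalArgumentException("unexpected stat format: " + statLine);
    }
    String pid = statLine.substring(0, open).trim();
    String comm = statLine.substring(open + 1, close);           // may contain spaces
    String[] rest = statLine.substring(close + 1).trim().split("\\s+");
    String[] fields = new String[rest.length + 2];
    fields[0] = pid;
    fields[1] = comm;
    System.arraycopy(rest, 0, fields, 2, rest.length);
    return fields;
  }
}
{code}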



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552447#comment-14552447
 ] 

Hudson commented on YARN-3677:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #191 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/191/])
YARN-3677. Fix findbugs warnings in yarn-server-resourcemanager. Contributed by 
Vinod Kumar Vavilapalli. (ozawa: rev 7401e5b5e8060b6b027d714b5ceb641fcfe5b598)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* hadoop-yarn-project/CHANGES.txt


> Fix findbugs warnings in yarn-server-resourcemanager
> 
>
> Key: YARN-3677
> URL: https://issues.apache.org/jira/browse/YARN-3677
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Akira AJISAKA
>Assignee: Vinod Kumar Vavilapalli
>Priority: Minor
>  Labels: newbie
> Fix For: 2.7.1
>
> Attachments: YARN-3677-20150519.txt
>
>
> There is 1 findbugs warning in FileSystemRMStateStore.java.
> {noformat}
> Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of 
> time
> Unsynchronized access at FileSystemRMStateStore.java: [line 156]
> Field 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
> Synchronized 66% of the time
> Synchronized access at FileSystemRMStateStore.java: [line 148]
> Synchronized access at FileSystemRMStateStore.java: [line 859]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552436#comment-14552436
 ] 

Hudson commented on YARN-3583:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #191 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/191/])
YARN-3583. Support of NodeLabel object instead of plain String in YarnClient 
side. (Sunil G via wangda) (wangda: rev 
563eb1ad2ae848a23bbbf32ebfaf107e8fa14e87)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetLabelsToNodesResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetNodesToLabelsResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/ReplaceLabelsOnNodeRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetLabelsToNodesResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetNodesToLabelsResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java


> Support of NodeLabel object instead of plain String in YarnClient side.
> ---
>
> Key: YARN-3583
> URL: https://issues.apache.org/jira/browse/YARN-3583
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 2.6.0
>Reporter: Sunil G
>Assignee: Sunil G
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
> 0003-YARN-3583.patch, 0004-YARN-3583.patch
>
>
> Similar to YARN-3521, use NodeLabel objects in the YarnClient-side APIs.
> The getLabelsToNodes/getNodeToLabels APIs can use NodeLabel objects instead of 
> plain label names.
> This will help bring other label details, such as exclusivity, to the client 
> side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3051) [Storage abstraction] Create backing storage read interface for ATS readers

2015-05-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552534#comment-14552534
 ] 

Varun Saxena commented on YARN-3051:


Well, I am still stuck on trying to get, in the WebServices class, the 
attribute that was set via HttpServer2#setAttribute. I will update the patch 
once that is done.
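
A minimal reading-side sketch, assuming the value passed to HttpServer2#setAttribute ends up in the web app's ServletContext; the attribute key and servlet name below are placeholders, not the actual YARN-3051 code:

{code}
import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical servlet; the attribute name is assumed to match whatever was
// passed to HttpServer2#setAttribute(name, value) when the server was set up.
public class ReaderAttributeSketchServlet extends HttpServlet {
  static final String READER_BACKEND_ATTR = "timeline.reader.backend";

  @Override
  protected void doGet(HttpServletRequest req, HttpServletResponse resp)
      throws ServletException, IOException {
    Object backend = getServletContext().getAttribute(READER_BACKEND_ATTR);
    resp.getWriter().println(
        backend == null ? "attribute not set" : backend.toString());
  }
}
{code}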

> [Storage abstraction] Create backing storage read interface for ATS readers
> ---
>
> Key: YARN-3051
> URL: https://issues.apache.org/jira/browse/YARN-3051
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: YARN-2928
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
> Attachments: YARN-3051.wip.02.YARN-2928.patch, YARN-3051.wip.patch, 
> YARN-3051_temp.patch
>
>
> Per design in YARN-2928, create backing storage read interface that can be 
> implemented by multiple backing storage implementations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552557#comment-14552557
 ] 

Hudson commented on YARN-3302:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #201 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/201/])
YARN-3302. TestDockerContainerExecutor should run automatically if it can 
detect docker in the usual place (Ravindra Kumar Naik via raviprak) (raviprak: 
rev c97f32e7b9d9e1d4c80682cc01741579166174d1)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java


> TestDockerContainerExecutor should run automatically if it can detect docker 
> in the usual place
> ---
>
> Key: YARN-3302
> URL: https://issues.apache.org/jira/browse/YARN-3302
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.6.0
>Reporter: Ravi Prakash
>Assignee: Ravindra Kumar Naik
> Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch, 
> YARN-3302-trunk.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552559#comment-14552559
 ] 

Hudson commented on YARN-3677:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #201 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/201/])
YARN-3677. Fix findbugs warnings in yarn-server-resourcemanager. Contributed by 
Vinod Kumar Vavilapalli. (ozawa: rev 7401e5b5e8060b6b027d714b5ceb641fcfe5b598)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* hadoop-yarn-project/CHANGES.txt


> Fix findbugs warnings in yarn-server-resourcemanager
> 
>
> Key: YARN-3677
> URL: https://issues.apache.org/jira/browse/YARN-3677
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Akira AJISAKA
>Assignee: Vinod Kumar Vavilapalli
>Priority: Minor
>  Labels: newbie
> Fix For: 2.7.1
>
> Attachments: YARN-3677-20150519.txt
>
>
> There is 1 findbugs warning in FileSystemRMStateStore.java.
> {noformat}
> Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of 
> time
> Unsynchronized access at FileSystemRMStateStore.java: [line 156]
> Field 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
> Synchronized 66% of the time
> Synchronized access at FileSystemRMStateStore.java: [line 148]
> Synchronized access at FileSystemRMStateStore.java: [line 859]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552548#comment-14552548
 ] 

Hudson commented on YARN-3583:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #201 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/201/])
YARN-3583. Support of NodeLabel object instead of plain String in YarnClient 
side. (Sunil G via wangda) (wangda: rev 
563eb1ad2ae848a23bbbf32ebfaf107e8fa14e87)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetNodesToLabelsResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetLabelsToNodesResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetLabelsToNodesResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetNodesToLabelsResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/ReplaceLabelsOnNodeRequestPBImpl.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java


> Support of NodeLabel object instead of plain String in YarnClient side.
> ---
>
> Key: YARN-3583
> URL: https://issues.apache.org/jira/browse/YARN-3583
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 2.6.0
>Reporter: Sunil G
>Assignee: Sunil G
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
> 0003-YARN-3583.patch, 0004-YARN-3583.patch
>
>
> Similar to YARN-3521, use NodeLabel objects in the YarnClient-side APIs.
> The getLabelsToNodes/getNodeToLabels APIs can use NodeLabel objects instead of 
> plain label names.
> This will help bring other label details, such as exclusivity, to the client 
> side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552553#comment-14552553
 ] 

Hudson commented on YARN-3565:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #201 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/201/])
YARN-3565. NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel 
object instead of String. (Naganarasimha G R via wangda) (wangda: rev 
b37da52a1c4fb3da2bd21bfadc5ec61c5f953a59)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdaterForLabels.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/NodeLabelsProvider.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/TestYarnServerApiClasses.java


> NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object 
> instead of String
> -
>
> Key: YARN-3565
> URL: https://issues.apache.org/jira/browse/YARN-3565
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
>Priority: Blocker
> Fix For: 2.8.0
>
> Attachments: YARN-3565-20150502-1.patch, YARN-3565.20150515-1.patch, 
> YARN-3565.20150516-1.patch, YARN-3565.20150519-1.patch
>
>
> Now NM HB/Register uses Set<String>; it will be hard to add new fields if we 
> want to support specifying NodeLabel attributes such as 
> exclusivity/constraints, etc. We need to make sure rolling upgrade works.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552554#comment-14552554
 ] 

Hudson commented on YARN-2821:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #201 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/201/])
YARN-2821. Fixed a problem that DistributedShell AM may hang if restarted. 
Contributed by Varun Vasudev (jianhe: rev 
7438966586f1896ab3e8b067d47a4af28a894106)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDSAppMaster.java
* hadoop-yarn-project/CHANGES.txt


> Distributed shell app master becomes unresponsive sometimes
> ---
>
> Key: YARN-2821
> URL: https://issues.apache.org/jira/browse/YARN-2821
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Affects Versions: 2.5.1
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.8.0
>
> Attachments: YARN-2821.002.patch, YARN-2821.003.patch, 
> YARN-2821.004.patch, YARN-2821.005.patch, apache-yarn-2821.0.patch, 
> apache-yarn-2821.1.patch
>
>
> We've noticed that once in a while the distributed shell app master becomes 
> unresponsive and is eventually killed by the RM. A snippet of the logs:
> {noformat}
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
> appattempt_1415123350094_0017_01 received 0 previous attempts' running 
> containers on AM registration.
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez2:45454
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_02, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
> container launch container for 
> containerid=container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> START_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> QUERY_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez3:45454
> 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez4:45454
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=3
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_03, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_04, 
> containerNode=onprem-tez3:45454, containerNodeURI=onprem-tez3:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_05, 
> containerNode=onprem-tez4:45454, containerNodeURI=onprem-tez4:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 IN

[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552549#comment-14552549
 ] 

Hudson commented on YARN-3601:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #201 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/201/])
YARN-3601. Fix UT TestRMFailover.testRMWebAppRedirect. Contributed by Weiwei 
Yang (xgong: rev 5009ad4a7f712fc578b461ecec53f7f97eaaed0c)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java


> Fix UT TestRMFailover.testRMWebAppRedirect
> --
>
> Key: YARN-3601
> URL: https://issues.apache.org/jira/browse/YARN-3601
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, webapp
> Environment: Red Hat Enterprise Linux Workstation release 6.5 
> (Santiago)
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
>  Labels: test
> Fix For: 2.7.1
>
> Attachments: YARN-3601.001.patch
>
>
> This test case has not been working since the commit from YARN-2605. It 
> failed with an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3601) Fix UT TestRMFailover.testRMWebAppRedirect

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552592#comment-14552592
 ] 

Hudson commented on YARN-3601:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2149 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2149/])
YARN-3601. Fix UT TestRMFailover.testRMWebAppRedirect. Contributed by Weiwei 
Yang (xgong: rev 5009ad4a7f712fc578b461ecec53f7f97eaaed0c)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/TestRMFailover.java
* hadoop-yarn-project/CHANGES.txt


> Fix UT TestRMFailover.testRMWebAppRedirect
> --
>
> Key: YARN-3601
> URL: https://issues.apache.org/jira/browse/YARN-3601
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager, webapp
> Environment: Red Hat Enterprise Linux Workstation release 6.5 
> (Santiago)
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Critical
>  Labels: test
> Fix For: 2.7.1
>
> Attachments: YARN-3601.001.patch
>
>
> This test case has not been working since the commit from YARN-2605. It 
> failed with an NPE.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3565) NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object instead of String

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552596#comment-14552596
 ] 

Hudson commented on YARN-3565:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2149 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2149/])
YARN-3565. NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel 
object instead of String. (Naganarasimha G R via wangda) (wangda: rev 
b37da52a1c4fb3da2bd21bfadc5ec61c5f953a59)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdaterForLabels.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/NodeHeartbeatRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/nodelabels/NodeLabelsProvider.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/nodelabels/NodeLabelTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/NodeStatusUpdaterImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/NodeHeartbeatRequest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/proto/yarn_server_common_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/RegisterNodeManagerRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/TestYarnServerApiClasses.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceTrackerService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/RegisterNodeManagerRequest.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java


> NodeHeartbeatRequest/RegisterNodeManagerRequest should use NodeLabel object 
> instead of String
> -
>
> Key: YARN-3565
> URL: https://issues.apache.org/jira/browse/YARN-3565
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
>Priority: Blocker
> Fix For: 2.8.0
>
> Attachments: YARN-3565-20150502-1.patch, YARN-3565.20150515-1.patch, 
> YARN-3565.20150516-1.patch, YARN-3565.20150519-1.patch
>
>
> Now NM HB/Register uses Set<String>; it will be hard to add new fields if we 
> want to support specifying NodeLabel attributes such as 
> exclusivity/constraints, etc. We need to make sure rolling upgrade works.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2821) Distributed shell app master becomes unresponsive sometimes

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552597#comment-14552597
 ] 

Hudson commented on YARN-2821:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2149 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2149/])
YARN-2821. Fixed a problem that DistributedShell AM may hang if restarted. 
Contributed by Varun Vasudev (jianhe: rev 
7438966586f1896ab3e8b067d47a4af28a894106)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDSAppMaster.java


> Distributed shell app master becomes unresponsive sometimes
> ---
>
> Key: YARN-2821
> URL: https://issues.apache.org/jira/browse/YARN-2821
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications/distributed-shell
>Affects Versions: 2.5.1
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: 2.8.0
>
> Attachments: YARN-2821.002.patch, YARN-2821.003.patch, 
> YARN-2821.004.patch, YARN-2821.005.patch, apache-yarn-2821.0.patch, 
> apache-yarn-2821.1.patch
>
>
> We've noticed that once in a while the distributed shell app master becomes 
> unresponsive and is eventually killed by the RM. A snippet of the logs:
> {noformat}
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: 
> appattempt_1415123350094_0017_01 received 0 previous attempts' running 
> containers on AM registration.
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:37 INFO distributedshell.ApplicationMaster: Requested 
> container ask: Capability[]Priority[0]
> 14/11/04 18:21:38 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez2:45454
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_02, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:38 INFO distributedshell.ApplicationMaster: Setting up 
> container launch container for 
> containerid=container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> START_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.NMClientAsyncImpl: Processing Event EventType: 
> QUERY_CONTAINER for Container container_1415123350094_0017_01_02
> 14/11/04 18:21:39 INFO impl.ContainerManagementProtocolProxy: Opening proxy : 
> onprem-tez2:45454
> 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez3:45454
> 14/11/04 18:21:39 INFO impl.AMRMClientImpl: Received new token for : 
> onprem-tez4:45454
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Got response from 
> RM for container ask, allocatedCnt=3
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_03, 
> containerNode=onprem-tez2:45454, containerNodeURI=onprem-tez2:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_04, 
> containerNode=onprem-tez3:45454, containerNodeURI=onprem-tez3:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distributedshell.ApplicationMaster: Launching shell 
> command on a new container., 
> containerId=container_1415123350094_0017_01_05, 
> containerNode=onprem-tez4:45454, containerNodeURI=onprem-tez4:50060, 
> containerResourceMemory1024, containerResourceVirtualCores1
> 14/11/04 18:21:39 INFO distrib

[jira] [Commented] (YARN-3302) TestDockerContainerExecutor should run automatically if it can detect docker in the usual place

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552601#comment-14552601
 ] 

Hudson commented on YARN-3302:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2149 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2149/])
YARN-3302. TestDockerContainerExecutor should run automatically if it can 
detect docker in the usual place (Ravindra Kumar Naik via raviprak) (raviprak: 
rev c97f32e7b9d9e1d4c80682cc01741579166174d1)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDockerContainerExecutor.java
* hadoop-yarn-project/CHANGES.txt


> TestDockerContainerExecutor should run automatically if it can detect docker 
> in the usual place
> ---
>
> Key: YARN-3302
> URL: https://issues.apache.org/jira/browse/YARN-3302
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.6.0
>Reporter: Ravi Prakash
>Assignee: Ravindra Kumar Naik
> Attachments: YARN-3302-trunk.001.patch, YARN-3302-trunk.002.patch, 
> YARN-3302-trunk.003.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3583) Support of NodeLabel object instead of plain String in YarnClient side.

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552591#comment-14552591
 ] 

Hudson commented on YARN-3583:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2149 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2149/])
YARN-3583. Support of NodeLabel object instead of plain String in YarnClient 
side. (Sunil G via wangda) (wangda: rev 
563eb1ad2ae848a23bbbf32ebfaf107e8fa14e87)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_service_protos.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetLabelsToNodesResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/impl/pb/GetNodesToLabelsResponsePBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetLabelsToNodesResponse.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestClientRMService.java
* 
hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/impl/pb/ReplaceLabelsOnNodeRequestPBImpl.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/GetNodesToLabelsResponse.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/server/yarn_server_resourcemanager_service_protos.proto


> Support of NodeLabel object instead of plain String in YarnClient side.
> ---
>
> Key: YARN-3583
> URL: https://issues.apache.org/jira/browse/YARN-3583
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: client
>Affects Versions: 2.6.0
>Reporter: Sunil G
>Assignee: Sunil G
> Fix For: 2.8.0
>
> Attachments: 0001-YARN-3583.patch, 0002-YARN-3583.patch, 
> 0003-YARN-3583.patch, 0004-YARN-3583.patch
>
>
> Similar to YARN-3521, use NodeLabel objects in YarnClient side apis.
> getLabelsToNodes/getNodeToLabels api's can use NodeLabel object instead of 
> using plain label name.
> This will help to bring other label details such as Exclusivity to client 
> side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3677) Fix findbugs warnings in yarn-server-resourcemanager

2015-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552603#comment-14552603
 ] 

Hudson commented on YARN-3677:
--

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2149 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2149/])
YARN-3677. Fix findbugs warnings in yarn-server-resourcemanager. Contributed by 
Vinod Kumar Vavilapalli. (ozawa: rev 7401e5b5e8060b6b027d714b5ceb641fcfe5b598)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMAppManager.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java


> Fix findbugs warnings in yarn-server-resourcemanager
> 
>
> Key: YARN-3677
> URL: https://issues.apache.org/jira/browse/YARN-3677
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: resourcemanager
>Reporter: Akira AJISAKA
>Assignee: Vinod Kumar Vavilapalli
>Priority: Minor
>  Labels: newbie
> Fix For: 2.7.1
>
> Attachments: YARN-3677-20150519.txt
>
>
> There is 1 findbugs warning in FileSystemRMStateStore.java.
> {noformat}
> Inconsistent synchronization of FileSystemRMStateStore.isHDFS; locked 66% of 
> time
> Unsynchronized access at FileSystemRMStateStore.java: [line 156]
> Field 
> org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore.isHDFS
> Synchronized 66% of the time
> Synchronized access at FileSystemRMStateStore.java: [line 148]
> Synchronized access at FileSystemRMStateStore.java: [line 859]
> {noformat}
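
To illustrate the pattern findbugs is complaining about, here is a minimal, hypothetical stand-in (it is not the actual FileSystemRMStateStore code): the warning goes away once every read and write of the flag happens under the same lock, or the field is made volatile.

{code:java}
// Hypothetical stand-in for the flagged pattern; not FileSystemRMStateStore itself.
public class StateStoreFlagExample {
  // findbugs reports "inconsistent synchronization" when some accesses are
  // synchronized and others are not.
  private boolean isHDFS;

  public synchronized void setIsHDFS(boolean value) { // synchronized write
    this.isHDFS = value;
  }

  public synchronized boolean isHDFS() {              // synchronized read clears the warning
    return isHDFS;
  }
}
{code}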



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3647) RMWebServices api's should use updated api from CommonNodeLabelsManager to get NodeLabel object

2015-05-20 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552619#comment-14552619
 ] 

Sunil G commented on YARN-3647:
---

Test case failure and findbugs error are not related to this patch.

> RMWebServices api's should use updated api from CommonNodeLabelsManager to 
> get NodeLabel object
> ---
>
> Key: YARN-3647
> URL: https://issues.apache.org/jira/browse/YARN-3647
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.6.0
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-3647.patch, 0002-YARN-3647.patch
>
>
> After YARN-3579, RMWebServices apis can use the updated version of apis in 
> CommonNodeLabelsManager which gives full NodeLabel object instead of creating 
> NodeLabel object from plain label name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-2005) Blacklisting support for scheduling AMs

2015-05-20 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot reassigned YARN-2005:
---

Assignee: Anubhav Dhoot

> Blacklisting support for scheduling AMs
> ---
>
> Key: YARN-2005
> URL: https://issues.apache.org/jira/browse/YARN-2005
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 0.23.10, 2.4.0
>Reporter: Jason Lowe
>Assignee: Anubhav Dhoot
>
> It would be nice if the RM supported blacklisting a node for an AM launch 
> after the same node fails a configurable number of AM attempts.  This would 
> be similar to the blacklisting support for scheduling task attempts in the 
> MapReduce AM but for scheduling AM attempts on the RM side.
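
A minimal sketch of the idea, assuming a hypothetical per-node failure counter and a made-up config knob (this is not RM code): once a node has failed the configured number of AM attempts, it is skipped for AM placement.

{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only: blacklist a node for AM launches after it fails
// a configurable number of AM attempts.
public class AmLaunchBlacklist {
  private final int maxAmFailuresPerNode;                    // assumed config knob
  private final Map<String, Integer> failuresByNode = new HashMap<>();
  private final Set<String> blacklistedNodes = new HashSet<>();

  public AmLaunchBlacklist(int maxAmFailuresPerNode) {
    this.maxAmFailuresPerNode = maxAmFailuresPerNode;
  }

  // Record an AM attempt failure on the given node.
  public void onAmAttemptFailed(String nodeId) {
    int count = failuresByNode.merge(nodeId, 1, Integer::sum);
    if (count >= maxAmFailuresPerNode) {
      blacklistedNodes.add(nodeId);                          // stop placing AMs here
    }
  }

  // The RM would consult this before choosing a node for an AM container.
  public boolean isBlacklistedForAm(String nodeId) {
    return blacklistedNodes.contains(nodeId);
  }
}
{code}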



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2005) Blacklisting support for scheduling AMs

2015-05-20 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552669#comment-14552669
 ] 

Anubhav Dhoot commented on YARN-2005:
-

Assigning this to myself as I am starting work on it. [~sunilg], let me know if 
you have already made progress on this.

> Blacklisting support for scheduling AMs
> ---
>
> Key: YARN-2005
> URL: https://issues.apache.org/jira/browse/YARN-2005
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 0.23.10, 2.4.0
>Reporter: Jason Lowe
>
> It would be nice if the RM supported blacklisting a node for an AM launch 
> after the same node fails a configurable number of AM attempts.  This would 
> be similar to the blacklisting support for scheduling task attempts in the 
> MapReduce AM but for scheduling AM attempts on the RM side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node

2015-05-20 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3675:

Attachment: YARN-3675.002.patch

Fixed checkstyle issue 

> FairScheduler: RM quits when node removal races with continousscheduling on 
> the same node
> -
>
> Key: YARN-3675
> URL: https://issues.apache.org/jira/browse/YARN-3675
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-3675.001.patch, YARN-3675.002.patch
>
>
> With continuous scheduling, scheduling can be done on a node that's just 
> removed, causing errors like the ones below.
> {noformat}
> 12:28:53.782 AM FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
> Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
>   at java.lang.Thread.run(Thread.java:745)
> 12:28:53.783 AMINFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye..
> {noformat}
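
The stack trace above comes from the continuous-scheduling thread acting on a node that has already been removed. As a hedged illustration of the defensive pattern (not the patch attached here; names are made up), the scheduling path can re-check that the node is still tracked before touching it:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative guard only; class and method names are hypothetical.
public class ContinuousSchedulingGuard {
  private final Map<String, Object> liveNodes = new ConcurrentHashMap<>(); // nodeId -> node state

  void attemptScheduling(String nodeId) {
    Object node = liveNodes.get(nodeId);
    if (node == null) {
      // The node was removed between snapshotting the node list and scheduling
      // on it; skip instead of dereferencing null and crashing the dispatcher.
      return;
    }
    // ... proceed to allocate/reserve on 'node' ...
  }

  void removeNode(String nodeId) {
    liveNodes.remove(nodeId);
  }
}
{code}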



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3686) CapacityScheduler should trim default_node_label_expression

2015-05-20 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-3686:
--
Attachment: 0002-YARN-3686.patch

Uploading another patch covering a negative scenario.

> CapacityScheduler should trim default_node_label_expression
> ---
>
> Key: YARN-3686
> URL: https://issues.apache.org/jira/browse/YARN-3686
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Sunil G
>Priority: Critical
> Attachments: 0001-YARN-3686.patch, 0002-YARN-3686.patch
>
>
> We should trim default_node_label_expression for queue before using it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3685) NodeManager unnecessarily knows about classpath-jars due to Windows limitations

2015-05-20 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552680#comment-14552680
 ] 

Chris Nauroth commented on YARN-3685:
-

[~vinodkv], thanks for the notification.  I was not aware of this design goal 
at the time of YARN-316.

Perhaps it's possible to move the classpath jar generation to the MR client or 
AM.  It's not immediately obvious to me which of those 2 choices is better.  
We'd need to change the manifest to use relative paths in the Class-Path 
attribute instead of absolute paths.  (The client and AM are not aware of the 
exact layout of the NodeManager's {{yarn.nodemanager.local-dirs}}, so the 
client can't predict the absolute paths at time of container launch.)

There is one piece of logic that I don't see how to handle though.  Some 
classpath entries are defined in terms of environment variables.  These 
environment variables are expanded at the NodeManager via the container launch 
scripts.  This was true of Linux even before YARN-316, so in that sense, YARN 
did already have some classpath logic indirectly.  Environment variables cannot 
be used inside a manifest's Class-Path, so for Windows, NodeManager expands the 
environment variables before populating Class-Path.  It would be incorrect to 
do the environment variable expansion at the MR client, because it might be 
running with different configuration than the NodeManager.  I suppose if the AM 
did the expansion, then that would work in most cases, but it creates an 
assumption that the AM container is running with configuration that matches all 
NodeManagers in the cluster.  I don't believe that assumption exists today.

If we do move classpath handling out of the NodeManager, then it would be a 
backwards-incompatible change, and so it could not be shipped in the 2.x 
release line.
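
For readers unfamiliar with the classpath-jar trick being discussed, the following is a small, self-contained sketch of how such a jar can be built with the standard java.util.jar API; the file names are illustrative and this is not the NodeManager's code. The open question above is who should produce this jar and whether its Class-Path entries can be relative.

{code:java}
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Sketch: a jar whose only purpose is to carry a manifest Class-Path attribute.
public class ClasspathJarSketch {
  public static void writeClasspathJar(String jarPath, String... relativeEntries)
      throws IOException {
    Manifest manifest = new Manifest();
    Attributes main = manifest.getMainAttributes();
    main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
    // Class-Path entries are space-separated and resolved relative to the jar's
    // location, which is what allows relative (rather than absolute) paths.
    main.put(Attributes.Name.CLASS_PATH, String.join(" ", relativeEntries));
    try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jarPath), manifest)) {
      // No entries needed; the manifest is the payload.
    }
  }

  public static void main(String[] args) throws IOException {
    writeClasspathJar("classpath.jar", "lib/job.jar", "lib/deps/", "conf/");
  }
}
{code}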

> NodeManager unnecessarily knows about classpath-jars due to Windows 
> limitations
> ---
>
> Key: YARN-3685
> URL: https://issues.apache.org/jira/browse/YARN-3685
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
>
> Found this while looking at cleaning up ContainerExecutor via YARN-3648, 
> making it a sub-task.
> YARN *should not* know about classpaths. Our original design was modeled around 
> this. But when we added Windows support, due to classpath issues, we ended 
> up breaking this abstraction via YARN-316. We should clean this up.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3626) On Windows localized resources are not moved to the front of the classpath when they should be

2015-05-20 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552686#comment-14552686
 ] 

Craig Welch commented on YARN-3626:
---

Checkstyle looks insignificant.

[~cnauroth], [~vinodkv], I've changed the approach to use the environment 
instead of configuration as suggested. Can one of you review, please?

> On Windows localized resources are not moved to the front of the classpath 
> when they should be
> --
>
> Key: YARN-3626
> URL: https://issues.apache.org/jira/browse/YARN-3626
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
> Environment: Windows
>Reporter: Craig Welch
>Assignee: Craig Welch
> Fix For: 2.7.1
>
> Attachments: YARN-3626.0.patch, YARN-3626.11.patch, 
> YARN-3626.14.patch, YARN-3626.4.patch, YARN-3626.6.patch, YARN-3626.9.patch
>
>
> In response to the mapreduce.job.user.classpath.first setting, the classpath 
> is ordered differently so that localized resources will appear before system 
> classpath resources when tasks execute.  On Windows this does not work 
> because the localized resources are not linked into their final location when 
> the classpath jar is created.  To compensate for that, localized jar resources 
> are added directly to the classpath generated for the jar rather than being 
> discovered from the localized directories.  Unfortunately, they are always 
> appended to the classpath, and so are never preferred over system resources.
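
A minimal sketch of the ordering behaviour the description asks for, with made-up method and parameter names (it is not the attached patch): when the user-classpath-first flag is set, the localized user entries must be placed ahead of the system entries instead of being appended.

{code:java}
import java.util.ArrayList;
import java.util.List;

// Illustrative only; not the YARN-3626 patch.
public class ClasspathOrdering {
  public static List<String> order(List<String> systemEntries,
                                   List<String> userEntries,
                                   boolean userClasspathFirst) {
    List<String> ordered = new ArrayList<>();
    if (userClasspathFirst) {
      ordered.addAll(userEntries);     // localized user resources win class lookups
      ordered.addAll(systemEntries);
    } else {
      ordered.addAll(systemEntries);   // default: system resources first
      ordered.addAll(userEntries);
    }
    return ordered;
  }
}
{code}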



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3467) Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI

2015-05-20 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3467:

Attachment: ApplicationAttemptPage.png

> Expose allocatedMB, allocatedVCores, and runningContainers metrics on running 
> Applications in RM Web UI
> ---
>
> Key: YARN-3467
> URL: https://issues.apache.org/jira/browse/YARN-3467
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: webapp, yarn
>Affects Versions: 2.5.0
>Reporter: Anthony Rojas
>Assignee: Anubhav Dhoot
>Priority: Minor
> Attachments: ApplicationAttemptPage.png
>
>
> The YARN REST API can report on the following properties:
> *allocatedMB*: The sum of memory in MB allocated to the application's running 
> containers
> *allocatedVCores*: The sum of virtual cores allocated to the application's 
> running containers
> *runningContainers*: The number of containers currently running for the 
> application
> Currently, the RM Web UI does not report on these items (at least I couldn't 
> find any entries within the Web UI).
> It would be useful for YARN Application and Resource troubleshooting to have 
> these properties and their corresponding values exposed on the RM WebUI.
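
For reference, the fields above are already available from the RM REST API; a rough sketch of reading them follows. The host and port are placeholders, and the snippet just prints the raw JSON rather than parsing it.

{code:java}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch: fetch /ws/v1/cluster/apps and print the JSON, which includes
// allocatedMB, allocatedVCores and runningContainers per application.
public class RmAppsRestClient {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://rm-host:8088/ws/v1/cluster/apps?states=RUNNING"); // placeholder host
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestProperty("Accept", "application/json");
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line);
      }
    } finally {
      conn.disconnect();
    }
  }
}
{code}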



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows

2015-05-20 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3681:
--
Attachment: YARN-3681.0.patch

> yarn cmd says "could not find main class 'queue'" in windows
> 
>
> Key: YARN-3681
> URL: https://issues.apache.org/jira/browse/YARN-3681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows Only
>Reporter: Sumana Sathish
>Assignee: Varun Saxena
>Priority: Blocker
>  Labels: windows, yarn-client
> Attachments: YARN-3681.0.patch, YARN-3681.01.patch, yarncmd.png
>
>
> Attached the screenshot of the command prompt in windows running yarn queue 
> command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows

2015-05-20 Thread Craig Welch (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552700#comment-14552700
 ] 

Craig Welch commented on YARN-3681:
---

[~varun_saxena] the patch you had doesn't apply properly for me. I've uploaded 
a patch which does the same things, does apply, and which I've had the 
opportunity to test.

@xgong, can you take a look at this one (.0.patch)?  Thanks.

> yarn cmd says "could not find main class 'queue'" in windows
> 
>
> Key: YARN-3681
> URL: https://issues.apache.org/jira/browse/YARN-3681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows Only
>Reporter: Sumana Sathish
>Assignee: Varun Saxena
>Priority: Blocker
>  Labels: windows, yarn-client
> Attachments: YARN-3681.0.patch, YARN-3681.01.patch, yarncmd.png
>
>
> Attached the screenshot of the command prompt in windows running yarn queue 
> command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3467) Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI

2015-05-20 Thread Anubhav Dhoot (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552699#comment-14552699
 ] 

Anubhav Dhoot commented on YARN-3467:
-

Attaching the ApplicationAttempt page. It does show the number of running 
containers, but it does not show the overall allocated resources for the 
application attempt. 

> Expose allocatedMB, allocatedVCores, and runningContainers metrics on running 
> Applications in RM Web UI
> ---
>
> Key: YARN-3467
> URL: https://issues.apache.org/jira/browse/YARN-3467
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: webapp, yarn
>Affects Versions: 2.5.0
>Reporter: Anthony Rojas
>Assignee: Anubhav Dhoot
>Priority: Minor
> Attachments: ApplicationAttemptPage.png
>
>
> The YARN REST API can report on the following properties:
> *allocatedMB*: The sum of memory in MB allocated to the application's running 
> containers
> *allocatedVCores*: The sum of virtual cores allocated to the application's 
> running containers
> *runningContainers*: The number of containers currently running for the 
> application
> Currently, the RM Web UI does not report on these items (at least I couldn't 
> find any entries within the Web UI).
> It would be useful for YARN Application and Resource troubleshooting to have 
> these properties and their corresponding values exposed on the RM WebUI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3686) CapacityScheduler should trim default_node_label_expression

2015-05-20 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552711#comment-14552711
 ] 

Wangda Tan commented on YARN-3686:
--

[~sunilg], thanks for working on this. Comments:
- I think you can try to add this to 
{{org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeNodeLabelExpressionInRequest(ResourceRequest,
 QueueInfo)}}, which needs to trim the node-label-expression as well.
- Actually this is a regression: in 2.6 a queue's node label expression with 
spaces could be set up without any issue. It's better to add tests to make sure 1. 
spaces in the resource request will be trimmed and 2. spaces in the queue 
configuration (default-node-label-expression) will be trimmed.
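
As a standalone sketch of the trimming being requested (not the SchedulerUtils code itself; names are made up), the normalization boils down to falling back to the queue default when the request carries no expression and trimming whichever value is used:

{code:java}
// Illustrative helper; method and class names are hypothetical.
public class NodeLabelExpressionUtil {
  public static String normalize(String requestExpression, String queueDefaultExpression) {
    String expr = requestExpression;
    if (expr == null) {
      expr = queueDefaultExpression;  // fall back to the queue's default-node-label-expression
    }
    return expr == null ? null : expr.trim();  // " gpu " becomes "gpu"
  }
}
{code}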

> CapacityScheduler should trim default_node_label_expression
> ---
>
> Key: YARN-3686
> URL: https://issues.apache.org/jira/browse/YARN-3686
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Sunil G
>Priority: Critical
> Attachments: 0001-YARN-3686.patch, 0002-YARN-3686.patch
>
>
> We should trim default_node_label_expression for queue before using it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552710#comment-14552710
 ] 

Hadoop QA commented on YARN-3681:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734163/YARN-3681.0.patch |
| Optional Tests |  |
| git revision | trunk / 4aa730c |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8026/console |


This message was automatically generated.

> yarn cmd says "could not find main class 'queue'" in windows
> 
>
> Key: YARN-3681
> URL: https://issues.apache.org/jira/browse/YARN-3681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows Only
>Reporter: Sumana Sathish
>Assignee: Varun Saxena
>Priority: Blocker
>  Labels: windows, yarn-client
> Attachments: YARN-3681.0.patch, YARN-3681.01.patch, yarncmd.png
>
>
> Attached the screenshot of the command prompt in windows running yarn queue 
> command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-3691) Limit number of reservations for an app

2015-05-20 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh reassigned YARN-3691:
-

Assignee: Arun Suresh

> Limit number of reservations for an app
> ---
>
> Key: YARN-3691
> URL: https://issues.apache.org/jira/browse/YARN-3691
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>
> Currently, it is possible to reserve resources for an app on all nodes. 
> Limiting this to possibly just a number of nodes (or a ratio of the total 
> cluster size) would improve utilization of the cluster and reduce the 
> possibility of starving other apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3691) Limit number of reservations for an app

2015-05-20 Thread Arun Suresh (JIRA)
Arun Suresh created YARN-3691:
-

 Summary: Limit number of reservations for an app
 Key: YARN-3691
 URL: https://issues.apache.org/jira/browse/YARN-3691
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler
Reporter: Arun Suresh


Currently, it is possible to reserve resources for an app on all nodes. Limiting 
this to possibly just a number of nodes (or a ratio of the total cluster size) 
would improve utilization of the cluster and reduce the possibility of 
starving other apps.
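
A hedged sketch of what such a limit could look like; both knobs are assumptions and do not exist in the FairScheduler today. A new reservation is allowed only while the app holds fewer reservations than min(absolute cap, ratio of the cluster size):

{code:java}
// Hypothetical limit check; not existing FairScheduler code or configuration.
public class ReservationLimit {
  private final int maxReservations;          // absolute cap per app
  private final double maxReservationRatio;   // fraction of cluster nodes per app

  public ReservationLimit(int maxReservations, double maxReservationRatio) {
    this.maxReservations = maxReservations;
    this.maxReservationRatio = maxReservationRatio;
  }

  public boolean mayReserve(int currentReservations, int clusterNodeCount) {
    int limit = Math.min(maxReservations,
        (int) Math.ceil(maxReservationRatio * clusterNodeCount));
    return currentReservations < limit;
  }
}
{code}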



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows

2015-05-20 Thread Craig Welch (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Craig Welch updated YARN-3681:
--
Attachment: YARN-3681.1.patch

Oh the irony, neither did my own.  Updated to one which does.

> yarn cmd says "could not find main class 'queue'" in windows
> 
>
> Key: YARN-3681
> URL: https://issues.apache.org/jira/browse/YARN-3681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows Only
>Reporter: Sumana Sathish
>Assignee: Varun Saxena
>Priority: Blocker
>  Labels: windows, yarn-client
> Attachments: YARN-3681.0.patch, YARN-3681.01.patch, 
> YARN-3681.1.patch, yarncmd.png
>
>
> Attached the screenshot of the command prompt in windows running yarn queue 
> command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552736#comment-14552736
 ] 

Hadoop QA commented on YARN-3681:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch |   0m  0s | The patch command could not apply 
the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734165/YARN-3681.1.patch |
| Optional Tests |  |
| git revision | trunk / 4aa730c |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8027/console |


This message was automatically generated.

> yarn cmd says "could not find main class 'queue'" in windows
> 
>
> Key: YARN-3681
> URL: https://issues.apache.org/jira/browse/YARN-3681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows Only
>Reporter: Sumana Sathish
>Assignee: Varun Saxena
>Priority: Blocker
>  Labels: windows, yarn-client
> Attachments: YARN-3681.0.patch, YARN-3681.01.patch, 
> YARN-3681.1.patch, yarncmd.png
>
>
> Attached the screenshot of the command prompt in windows running yarn queue 
> command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows

2015-05-20 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552738#comment-14552738
 ] 

Varun Saxena commented on YARN-3681:


[~cwelch], it has to do with line endings.
I have to run {{unix2dos}} to convert line endings for Jenkins to accept it. 
Windows batch file patches do not always apply, depending on the user's 
line-ending settings. I think my patch did not apply for you because of 
that reason.

> yarn cmd says "could not find main class 'queue'" in windows
> 
>
> Key: YARN-3681
> URL: https://issues.apache.org/jira/browse/YARN-3681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows Only
>Reporter: Sumana Sathish
>Assignee: Varun Saxena
>Priority: Blocker
>  Labels: windows, yarn-client
> Attachments: YARN-3681.0.patch, YARN-3681.01.patch, 
> YARN-3681.1.patch, yarncmd.png
>
>
> Attached the screenshot of the command prompt in windows running yarn queue 
> command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2355) MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container

2015-05-20 Thread Darrell Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Darrell Taylor updated YARN-2355:
-
Attachment: YARN-2355.001.patch

> MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container
> --
>
> Key: YARN-2355
> URL: https://issues.apache.org/jira/browse/YARN-2355
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Zhijie Shen
>Assignee: Darrell Taylor
>  Labels: newbie
> Attachments: YARN-2355.001.patch
>
>
> After YARN-2074, YARN-614 and YARN-611, the application cannot judge whether 
> it has the chance to try based on MAX_APP_ATTEMPTS_ENV alone. We should be 
> able to notify the application of the up-to-date remaining retry quota.
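
To make the gap concrete, here is a sketch of what an AM can learn from the environment today; it sees only the static ceiling, not how many attempts remain. The env var name is assumed to be "MAX_APP_ATTEMPTS" (ApplicationConstants.MAX_APP_ATTEMPTS_ENV).

{code:java}
// Sketch: what an AM can read today from its environment.
public class AttemptBudget {
  public static void main(String[] args) {
    String raw = System.getenv("MAX_APP_ATTEMPTS");     // assumed env var name
    int maxAttempts = (raw == null) ? 1 : Integer.parseInt(raw);
    // The AM knows the ceiling, but not the up-to-date remaining retry quota,
    // which is what this JIRA asks to expose.
    System.out.println("Configured max AM attempts: " + maxAttempts);
  }
}
{code}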



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-314) Schedulers should allow resource requests of different sizes at the same priority and location

2015-05-20 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552800#comment-14552800
 ] 

Wangda Tan commented on YARN-314:
-

[~kasha],
Actually I'm not quite sure about this proposal. What's the benefit of putting 
all apps' requests together compared to holding one data structure per app? Is 
there any use case?

> Schedulers should allow resource requests of different sizes at the same 
> priority and location
> --
>
> Key: YARN-314
> URL: https://issues.apache.org/jira/browse/YARN-314
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
> Attachments: yarn-314-prelim.patch
>
>
> Currently, resource requests for the same container and locality are expected 
> to all be the same size.
> While it doesn't look like it's needed for apps currently, and can be 
> circumvented by specifying different priorities if absolutely necessary, it 
> seems to me that the ability to request containers with different resource 
> requirements at the same priority level should be there for the future and 
> for completeness' sake.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-3692) Allow REST API to set a user generated message when killing an application

2015-05-20 Thread Rajat Jain (JIRA)
Rajat Jain created YARN-3692:


 Summary: Allow REST API to set a user generated message when 
killing an application
 Key: YARN-3692
 URL: https://issues.apache.org/jira/browse/YARN-3692
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Rajat Jain


Currently YARN's REST API supports killing an application without setting a 
diagnostic message. It would be good to provide that support.
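
For context, the existing kill goes through the application state resource; a rough sketch follows, with placeholder host, port and application id. The request body today carries only the target state, and the proposal is to let the caller attach a diagnostic message alongside it.

{code:java}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Sketch of the current kill call via the RM REST API; identifiers are placeholders.
public class KillAppViaRest {
  public static void main(String[] args) throws Exception {
    URL url = new URL(
        "http://rm-host:8088/ws/v1/cluster/apps/application_1432000000000_0001/state");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT");
    conn.setDoOutput(true);
    conn.setRequestProperty("Content-Type", "application/json");
    byte[] body = "{\"state\":\"KILLED\"}".getBytes(StandardCharsets.UTF_8);
    try (OutputStream out = conn.getOutputStream()) {
      out.write(body);    // no diagnostic message field exists here today
    }
    System.out.println("HTTP status: " + conn.getResponseCode());
  }
}
{code}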



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552841#comment-14552841
 ] 

Hadoop QA commented on YARN-3675:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 44s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 41s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 38s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 47s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 16s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  50m 51s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  87m 29s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734156/YARN-3675.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4aa730c |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8025/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8025/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8025/console |


This message was automatically generated.

> FairScheduler: RM quits when node removal races with continousscheduling on 
> the same node
> -
>
> Key: YARN-3675
> URL: https://issues.apache.org/jira/browse/YARN-3675
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-3675.001.patch, YARN-3675.002.patch
>
>
> With continuous scheduling, scheduling can be done on a node that's just 
> removed, causing errors like the ones below.
> {noformat}
> 12:28:53.782 AM FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
> Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
>   at java.lang.Thread.run(Thread.java:745)
> 12:28:53.783 AMINFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye..
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2408) Resource Request REST API for YARN

2015-05-20 Thread Renan DelValle (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552849#comment-14552849
 ] 

Renan DelValle commented on YARN-2408:
--


[~leftnoteasy], thanks for taking a look at the patch, really appreciate it.

1) I agree, the original patch I had was very verbose so I shrunk down the 
amount of data being transferred by clustering resource requests together. 
Seems to be the best alternative to keeping original ResourceRequest structures.

2) I will take a look at that and implement it that way. (Thank you for 
pointing me in the right direction). On the resource-by-label inclusion, do you 
think it would be better to wait until it is patched into the trunk in order to 
make the process easier?


> Resource Request REST API for YARN
> --
>
> Key: YARN-2408
> URL: https://issues.apache.org/jira/browse/YARN-2408
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: webapp
>Reporter: Renan DelValle
>  Labels: features
> Attachments: YARN-2408-6.patch
>
>
> I’m proposing a new REST API for YARN which exposes a snapshot of the 
> Resource Requests that exist inside of the Scheduler. My motivation behind 
> this new feature is to allow external software to monitor the amount of 
> resources being requested and to gain deeper insight into cluster 
> usage than is already provided. The API can also be used by external software 
> to detect a starved application and alert the appropriate users and/or sys 
> admin so that the problem may be remedied.
> Here is the proposed API (a JSON counterpart is also available):
> {code:xml}
> 
>   7680
>   7
>   
> application_1412191664217_0001
> 
> appattempt_1412191664217_0001_01
> default
> 6144
> 6
> 3
> 
>   
> 1024
> 1
> 6
> true
> 20
> 
>   localMachine
>   /default-rack
>   *
> 
>   
> 
>   
>   
>   ...
>   
> 
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3675) FairScheduler: RM quits when node removal races with continousscheduling on the same node

2015-05-20 Thread Anubhav Dhoot (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Dhoot updated YARN-3675:

Attachment: YARN-3675.003.patch

Removed spurious changes and changed visibility of attemptScheduling

> FairScheduler: RM quits when node removal races with continousscheduling on 
> the same node
> -
>
> Key: YARN-3675
> URL: https://issues.apache.org/jira/browse/YARN-3675
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Anubhav Dhoot
>Assignee: Anubhav Dhoot
> Attachments: YARN-3675.001.patch, YARN-3675.002.patch, 
> YARN-3675.003.patch
>
>
> With continuous scheduling, scheduling can be done on a node that's just 
> removed, causing errors like the ones below.
> {noformat}
> 12:28:53.782 AM FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
> Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.unreserve(FSAppAttempt.java:469)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:815)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:763)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1217)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:111)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684)
>   at java.lang.Thread.run(Thread.java:745)
> 12:28:53.783 AMINFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager Exiting, bbye..
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2355) MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552958#comment-14552958
 ] 

Hadoop QA commented on YARN-2355:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 38s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 39s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   1m 45s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   2m 39s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |   0m 26s | Tests passed in 
hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests |  50m  1s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  89m 14s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734179/YARN-2355.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4aa730c |
| hadoop-yarn-api test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8028/artifact/patchprocess/testrun_hadoop-yarn-api.txt
 |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8028/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8028/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8028/console |


This message was automatically generated.

> MAX_APP_ATTEMPTS_ENV may no longer be a useful env var for a container
> --
>
> Key: YARN-2355
> URL: https://issues.apache.org/jira/browse/YARN-2355
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Zhijie Shen
>Assignee: Darrell Taylor
>  Labels: newbie
> Attachments: YARN-2355.001.patch
>
>
> After YARN-2074, YARN-614 and YARN-611, the application cannot judge whether 
> it has the chance to try based on MAX_APP_ATTEMPTS_ENV alone. We should be 
> able to notify the application of the up-to-date remaining retry quota.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3467) Expose allocatedMB, allocatedVCores, and runningContainers metrics on running Applications in RM Web UI

2015-05-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14552959#comment-14552959
 ] 

Karthik Kambatla commented on YARN-3467:


We should add this information to the ApplicationAttempt page, and preferably 
also to the RM Web UI. I have heard asks for both the number of containers and the 
allocated resources on the RM applications page, so people can sort applications by them. 

> Expose allocatedMB, allocatedVCores, and runningContainers metrics on running 
> Applications in RM Web UI
> ---
>
> Key: YARN-3467
> URL: https://issues.apache.org/jira/browse/YARN-3467
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: webapp, yarn
>Affects Versions: 2.5.0
>Reporter: Anthony Rojas
>Assignee: Anubhav Dhoot
>Priority: Minor
> Attachments: ApplicationAttemptPage.png
>
>
> The YARN REST API can report on the following properties:
> *allocatedMB*: The sum of memory in MB allocated to the application's running 
> containers
> *allocatedVCores*: The sum of virtual cores allocated to the application's 
> running containers
> *runningContainers*: The number of containers currently running for the 
> application
> Currently, the RM Web UI does not report on these items (at least I couldn't 
> find any entries within the Web UI).
> It would be useful for YARN Application and Resource troubleshooting to have 
> these properties and their corresponding values exposed on the RM WebUI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3691) FairScheduler: Limit number of reservations for a container

2015-05-20 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-3691:
---
Summary: FairScheduler: Limit number of reservations for a container  (was: 
Limit number of reservations for an app)

> FairScheduler: Limit number of reservations for a container
> ---
>
> Key: YARN-3691
> URL: https://issues.apache.org/jira/browse/YARN-3691
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>
> Currently, it is possible to reserve resources for an app on all nodes. 
> Limiting this to possibly just a number of nodes (or a ratio of the total 
> cluster size) would improve utilization of the cluster and reduce the 
> possibility of starving other apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3691) FairScheduler: Limit number of reservations for a container

2015-05-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553001#comment-14553001
 ] 

Karthik Kambatla commented on YARN-3691:


The number of reservations should be per component and not per application? If 
an app is looking to get resources for 10 containers, it should be able to make 
reservations independently for each container. 

> FairScheduler: Limit number of reservations for a container
> ---
>
> Key: YARN-3691
> URL: https://issues.apache.org/jira/browse/YARN-3691
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>
> Currently, it is possible to reserve resources for an app on all nodes. 
> Limiting this to possibly just a number of nodes (or a ratio of the total 
> cluster size) would improve utilization of the cluster and reduce the 
> possibility of starving other apps.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3686) CapacityScheduler should trim default_node_label_expression

2015-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553000#comment-14553000
 ] 

Hadoop QA commented on YARN-3686:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 29s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:red}-1{color} | tests included |   0m  0s | The patch doesn't appear 
to include any new or modified tests.  Please justify why no new tests are 
needed for this patch. Also please list what manual steps were performed to 
verify this patch. |
| {color:green}+1{color} | javac |   7m 31s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 38s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 24s | There were no new checkstyle 
issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | install |   1m 38s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 16s | The patch does not introduce 
any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests |  50m 20s | Tests passed in 
hadoop-yarn-server-resourcemanager. |
| | |  86m 14s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12734160/0002-YARN-3686.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 4aa730c |
| hadoop-yarn-server-resourcemanager test log | 
https://builds.apache.org/job/PreCommit-YARN-Build/8029/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/8029/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/8029/console |


This message was automatically generated.

> CapacityScheduler should trim default_node_label_expression
> ---
>
> Key: YARN-3686
> URL: https://issues.apache.org/jira/browse/YARN-3686
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: api, client, resourcemanager
>Reporter: Wangda Tan
>Assignee: Sunil G
>Priority: Critical
> Attachments: 0001-YARN-3686.patch, 0002-YARN-3686.patch
>
>
> We should trim default_node_label_expression for queue before using it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-314) Schedulers should allow resource requests of different sizes at the same priority and location

2015-05-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553015#comment-14553015
 ] 

Karthik Kambatla commented on YARN-314:
---

I am essentially proposing an efficient way to index the pending requests 
across multiple axes. Each of these indices is captured by a map. The only 
reason to colocate them is to not disperse this indexing (mapping) logic across 
multiple classes. 

We should be able to quickly look up all requests for an app for reporting etc., 
and also look up all node-local requests across applications at schedule time 
without having to iterate through all the applications. 

The maps could be - >>, >>. Current {{AppSchedulingInfo}} 
could stay as is and use the former map to get the corresponding requests.
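
As a rough illustration of the two indices described above (a hypothetical sketch, not the YARN-314 patch; class and field names are made up):

{code:java}
// Hypothetical sketch of a two-axis index of pending requests.
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.NodeId;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

class PendingRequestIndex {
  // Axis 1: all pending requests of an application, for reporting and for
  // AppSchedulingInfo to look up its own requests.
  private final Map<ApplicationId, Map<Priority, Collection<ResourceRequest>>>
      byApp = new ConcurrentHashMap<>();

  // Axis 2: node-local pending requests across applications, so a node
  // heartbeat does not have to iterate over every application.
  private final Map<NodeId, Map<ApplicationId, Collection<ResourceRequest>>>
      byNode = new ConcurrentHashMap<>();

  Map<Priority, Collection<ResourceRequest>> requestsForApp(ApplicationId appId) {
    return byApp.get(appId);
  }

  Map<ApplicationId, Collection<ResourceRequest>> requestsForNode(NodeId nodeId) {
    return byNode.get(nodeId);
  }
}
{code}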

> Schedulers should allow resource requests of different sizes at the same 
> priority and location
> --
>
> Key: YARN-314
> URL: https://issues.apache.org/jira/browse/YARN-314
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: scheduler
>Affects Versions: 2.0.2-alpha
>Reporter: Sandy Ryza
> Attachments: yarn-314-prelim.patch
>
>
> Currently, resource requests for the same container and locality are expected 
> to all be the same size.
> While it doesn't look like it's needed for apps currently, and it can be 
> circumvented by specifying different priorities if absolutely necessary, it 
> seems to me that the ability to request containers with different resource 
> requirements at the same priority level should be there for the future and 
> for completeness' sake.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (YARN-3691) FairScheduler: Limit number of reservations for a container

2015-05-20 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553001#comment-14553001
 ] 

Karthik Kambatla edited comment on YARN-3691 at 5/20/15 8:09 PM:
-

The number of reservations should be per container and not per application? If 
an app is looking to get resources for 10 containers, it should be able to make 
reservations independently for each container. 


was (Author: kasha):
The number of reservations should be per component and not per application? If 
an app is looking to get resources for 10 containers, it should be able to make 
reservations independently for each container. 

> FairScheduler: Limit number of reservations for a container
> ---
>
> Key: YARN-3691
> URL: https://issues.apache.org/jira/browse/YARN-3691
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>
> Currently, it is possible to reserve resources for an app on all nodes. 
> Limiting this to possibly just a number of nodes (or a ratio of the total 
> cluster size) would improve utilization of the cluster and reduce the 
> possibility of starving other apps.
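
A hypothetical sketch of the kind of cap being discussed (not an actual patch; the config key and class names below are made up):

{code:java}
// Hypothetical sketch of a per-container reservation cap for YARN-3691.
class ReservationLimiter {
  // Made-up key: a cap on how many nodes may hold a reservation for the
  // same outstanding container request.
  static final String MAX_RESERVATIONS_KEY =
      "yarn.scheduler.fair.max-reservations-per-container";

  private final int maxReservationsPerContainer;

  ReservationLimiter(int maxReservationsPerContainer) {
    this.maxReservationsPerContainer = maxReservationsPerContainer;
  }

  /** True if the scheduler may place one more reservation for this request. */
  boolean mayReserve(int currentReservationsForRequest) {
    return currentReservationsForRequest < maxReservationsPerContainer;
  }
}
{code}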



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2556) Tool to measure the performance of the timeline server

2015-05-20 Thread Chang Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chang Li updated YARN-2556:
---
Attachment: YARN-2556.10.patch

Added the JobHistoryFileReplayMapper mapper.

> Tool to measure the performance of the timeline server
> --
>
> Key: YARN-2556
> URL: https://issues.apache.org/jira/browse/YARN-2556
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Jonathan Eagles
>Assignee: Chang Li
>  Labels: BB2015-05-TBR
> Attachments: YARN-2556-WIP.patch, YARN-2556-WIP.patch, 
> YARN-2556.1.patch, YARN-2556.10.patch, YARN-2556.2.patch, YARN-2556.3.patch, 
> YARN-2556.4.patch, YARN-2556.5.patch, YARN-2556.6.patch, YARN-2556.7.patch, 
> YARN-2556.8.patch, YARN-2556.9.patch, YARN-2556.patch, yarn2556.patch, 
> yarn2556.patch, yarn2556_wip.patch
>
>
> We need to be able to understand the capacity model for the timeline server 
> to give users the tools they need to deploy a timeline server with the 
> correct capacity.
> I propose we create a MapReduce job that can measure timeline server write 
> and read performance. Transactions per second and I/O for both read and write 
> would be a good start.
> This could be done as an example or test job that could be tied into gridmix.
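
For context, a minimal sketch of what the write side of such a measurement could look like against the public {{TimelineClient}} API (illustrative only, not code from the attached patches; the entity type is made up):

{code:java}
// Illustrative write-throughput loop using the public TimelineClient API.
import org.apache.hadoop.yarn.api.records.timeline.TimelineEntity;
import org.apache.hadoop.yarn.client.api.TimelineClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class TimelineWriteBench {
  public static void main(String[] args) throws Exception {
    int numEntities = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
    TimelineClient client = TimelineClient.createTimelineClient();
    client.init(new YarnConfiguration());
    client.start();
    try {
      long start = System.currentTimeMillis();
      for (int i = 0; i < numEntities; i++) {
        TimelineEntity entity = new TimelineEntity();
        entity.setEntityType("BENCH_ENTITY");   // made-up entity type
        entity.setEntityId("bench_" + i);
        entity.setStartTime(System.currentTimeMillis());
        client.putEntities(entity);             // one put per entity
      }
      long elapsedMs = System.currentTimeMillis() - start;
      System.out.println(numEntities + " puts in " + elapsedMs + " ms ("
          + (numEntities * 1000.0 / elapsedMs) + " entities/sec)");
    } finally {
      client.stop();
    }
  }
}
{code}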



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-2918) Don't fail RM if queue's configured labels are not existed in cluster-node-labels

2015-05-20 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-2918:
-
Fix Version/s: 2.7.1

> Don't fail RM if queue's configured labels are not existed in 
> cluster-node-labels
> -
>
> Key: YARN-2918
> URL: https://issues.apache.org/jira/browse/YARN-2918
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith
>Assignee: Wangda Tan
> Fix For: 2.8.0, 2.7.1
>
> Attachments: YARN-2918.1.patch, YARN-2918.2.patch, YARN-2918.3.patch
>
>
> Currently, if an admin sets up labels on queues 
> ({{.accessible-node-labels = ...}}) and the label is not added to the 
> RM, the queue's initialization will fail and the RM will fail too:
> {noformat}
> 2014-12-03 20:11:50,126 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
> ResourceManager
> ...
> Caused by: java.io.IOException: NodeLabelManager doesn't include label = x, 
> please check.
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkIfLabelInClusterNodeLabels(SchedulerUtils.java:287)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.(AbstractCSQueue.java:109)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.(LeafQueue.java:120)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:567)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:587)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:462)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:294)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> {noformat}
> This is not a good user experience; we should stop failing the RM (see the 
> sketch after this description) so that the admin can configure queue/labels 
> in the following steps:
> - Configure queue (with label)
> - Start RM
> - Add labels to RM
> - Submit applications
> Now the admin has to:
> - Configure queue (without label)
> - Start RM
> - Add labels to RM
> - Refresh queue's config (with label)
> - Submit applications
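
As referenced above, a hypothetical sketch of the relaxed validation (class and method names are made up; this is not the actual YARN-2918 patch):

{code:java}
// Hypothetical sketch: warn and continue instead of failing RM startup when
// a queue's configured label is not yet in the cluster node labels.
import java.util.Set;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

class QueueLabelValidation {
  private static final Log LOG = LogFactory.getLog(QueueLabelValidation.class);

  static void checkAccessibleLabels(String queuePath,
      Set<String> configuredLabels, Set<String> clusterNodeLabels) {
    for (String label : configuredLabels) {
      if (!"*".equals(label) && !clusterNodeLabels.contains(label)) {
        // Previously this condition threw an IOException, which aborted
        // CapacityScheduler initialization and RM startup.
        LOG.warn("Label " + label + " configured on queue " + queuePath
            + " is not in cluster node labels yet; continuing startup.");
      }
    }
  }
}
{code}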



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2918) Don't fail RM if queue's configured labels are not existed in cluster-node-labels

2015-05-20 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553086#comment-14553086
 ] 

Wangda Tan commented on YARN-2918:
--

Back-ported this patch to 2.7.1, updating fix version.

> Don't fail RM if queue's configured labels are not existed in 
> cluster-node-labels
> -
>
> Key: YARN-2918
> URL: https://issues.apache.org/jira/browse/YARN-2918
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Rohith
>Assignee: Wangda Tan
> Fix For: 2.8.0, 2.7.1
>
> Attachments: YARN-2918.1.patch, YARN-2918.2.patch, YARN-2918.3.patch
>
>
> Currently, if an admin sets up labels on queues 
> ({{.accessible-node-labels = ...}}) and the label is not added to the 
> RM, the queue's initialization will fail and the RM will fail too:
> {noformat}
> 2014-12-03 20:11:50,126 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting 
> ResourceManager
> ...
> Caused by: java.io.IOException: NodeLabelManager doesn't include label = x, 
> please check.
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.checkIfLabelInClusterNodeLabels(SchedulerUtils.java:287)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractCSQueue.(AbstractCSQueue.java:109)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.(LeafQueue.java:120)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:567)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:587)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:462)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:294)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:324)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> {noformat}
> This is not a good user experience; we should stop failing the RM so that 
> the admin can configure queue/labels in the following steps:
> - Configure queue (with label)
> - Start RM
> - Add labels to RM
> - Submit applications
> Now the admin has to:
> - Configure queue (without label)
> - Start RM
> - Add labels to RM
> - Refresh queue's config (with label)
> - Submit applications



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-3681) yarn cmd says "could not find main class 'queue'" in windows

2015-05-20 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553114#comment-14553114
 ] 

Xuan Gong commented on YARN-3681:
-

The patch can be applied with {{git apply -p0 --whitespace=fix}}.
The patch looks good to me.
+1, will commit.

> yarn cmd says "could not find main class 'queue'" in windows
> 
>
> Key: YARN-3681
> URL: https://issues.apache.org/jira/browse/YARN-3681
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Affects Versions: 2.7.0
> Environment: Windows Only
>Reporter: Sumana Sathish
>Assignee: Varun Saxena
>Priority: Blocker
>  Labels: windows, yarn-client
> Attachments: YARN-3681.0.patch, YARN-3681.01.patch, 
> YARN-3681.1.patch, yarncmd.png
>
>
> Attached a screenshot of the command prompt in Windows running the yarn queue 
> command.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

