[jira] [Commented] (YARN-2497) Changes for fair scheduler to support allocate resource respect labels

2017-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180279#comment-16180279
 ] 

Hadoop QA commented on YARN-2497:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 19 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
45s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
10s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
41s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  0s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 36 new + 978 unchanged - 25 fixed = 1014 total (was 1003) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 6 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
12s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
29s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch 
failed. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
47s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}513m 47s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
19s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}574m 21s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | 
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
|  |  Load of known null value in 

[jira] [Commented] (YARN-6626) Embed REST API service into RM

2017-09-25 Thread Eric Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180246#comment-16180246
 ] 

Eric Yang commented on YARN-6626:
-

[~jianhe] The Resource Manager Jetty server is set up to chain a series of 
WebFilters together, and every web filter extends WebServices; this is how the 
YARN web application is written to work. Resource.java was changed to return a 
cloned object instead of the internal data structure, to prevent that internal 
structure from being modified by code outside the YARN jar file; this was 
flagged by FindBugs. The @XmlType annotation is there for Jersey to validate 
the data structure schema; without it, Jersey writes exceptions to the 
Resource Manager server log file.

I will update the patch to default to false, and will update QuickStart.md 
tomorrow.
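
For reference, here is a minimal sketch of the defensive-copy pattern 
described above; the class and field names are hypothetical stand-ins, not 
the actual YARN-6626 patch:

{noformat}
import java.util.HashMap;
import java.util.Map;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;

// Hypothetical DTO illustrating the two points above: @XmlType gives Jersey
// an explicit schema name, and the getter returns a copy so code outside
// the YARN jar cannot mutate internal state (the FindBugs concern).
@XmlRootElement
@XmlType(name = "resource")
public class ResourceSketch {

  private final Map<String, Long> attributes = new HashMap<>();

  public Map<String, Long> getAttributes() {
    // Defensive copy instead of exposing the internal map.
    return new HashMap<>(attributes);
  }

  public void setAttribute(String name, long value) {
    attributes.put(name, value);
  }
}
{noformat}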

> Embed REST API service into RM
> --
>
> Key: YARN-6626
> URL: https://issues.apache.org/jira/browse/YARN-6626
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gour Saha
> Fix For: yarn-native-services
>
> Attachments: YARN-6626.yarn-native-services.001.patch, 
> YARN-6626.yarn-native-services.002.patch, 
> YARN-6626.yarn-native-services.003.patch, 
> YARN-6626.yarn-native-services.004.patch, 
> YARN-6626.yarn-native-services.005.patch
>
>
> As of now the deployment model of the Native Services REST API service is 
> standalone. There are several cross-cutting solutions that can be inherited 
> for free (kerberos, HA, ACLs, trusted proxy support, etc.) by the REST API 
> service if it is embedded into the RM process. In fact we can expose the REST 
> API via the same port as RM UI (8088 default). The URI path 
> /services/v1/applications will distinguish the REST API calls from other RM 
> APIs.
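
For illustration, a hedged sketch of what a JAX-RS resource rooted at that 
URI path could look like; the class and method names here are hypothetical, 
not the actual ApiServer from the patch:

{noformat}
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// Requests under /services/v1/applications are routed here, which is what
// keeps the embedded REST API distinguishable from the other RM endpoints
// served on the same port (8088 by default).
@Path("/services/v1/applications")
public class ServicesApiSketch {

  @GET
  @Produces(MediaType.APPLICATION_JSON)
  public String listServices() {
    return "[]"; // placeholder body for the sketch
  }
}
{noformat}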



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7240) Add more states and transitions to stabilize the NM Container state machine

2017-09-25 Thread kartheek muthyala (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180198#comment-16180198
 ] 

kartheek muthyala commented on YARN-7240:
-

Thank you, [~asuresh], for rebasing and for the checkstyle fixes in the patch. 

> Add more states and transitions to stabilize the NM Container state machine
> ---
>
> Key: YARN-7240
> URL: https://issues.apache.org/jira/browse/YARN-7240
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: kartheek muthyala
> Fix For: 2.9.0, 3.1.0
>
> Attachments: YARN-7240.001.patch, YARN-7240.002.patch, 
> YARN-7240.002.patch
>
>
> There seem to be a few intermediate states that can be added to improve the 
> stability of the NM container state machine.
> For example:
> * The REINITIALIZING state should probably be split into REINITIALIZING and 
> REINITIALIZING_AWAITING_KILL.
> * Container updates are currently handled in the ContainerScheduler, but it 
> would probably be better to plumb them through the container state machine 
> as a new state, say UPDATING, with a new container event.
> The plan is to add some extra tests too, to try and test every transition.
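
As a hedged illustration of what such transitions could look like using 
Hadoop's StateMachineFactory (the states, events, and operand type below are 
simplified stand-ins, not the real ContainerImpl code):

{noformat}
import org.apache.hadoop.yarn.state.StateMachine;
import org.apache.hadoop.yarn.state.StateMachineFactory;

public class ContainerStateSketch {
  // Hypothetical subset of states/events for illustration only.
  enum CState { RUNNING, UPDATING, REINITIALIZING, REINITIALIZING_AWAITING_KILL }
  enum CEvent { UPDATE, UPDATED, REINIT, KILL_ISSUED }

  private static final StateMachineFactory<ContainerStateSketch, CState, CEvent, Object>
      FACTORY =
      new StateMachineFactory<ContainerStateSketch, CState, CEvent, Object>(CState.RUNNING)
          // UPDATING as an explicit state, rather than handling container
          // updates only inside the ContainerScheduler.
          .addTransition(CState.RUNNING, CState.UPDATING, CEvent.UPDATE)
          .addTransition(CState.UPDATING, CState.RUNNING, CEvent.UPDATED)
          // REINITIALIZING split in two: the second state waits for the old
          // process to be killed before the re-launch proceeds.
          .addTransition(CState.RUNNING, CState.REINITIALIZING, CEvent.REINIT)
          .addTransition(CState.REINITIALIZING,
              CState.REINITIALIZING_AWAITING_KILL, CEvent.KILL_ISSUED)
          .installTopology();

  private final StateMachine<CState, CEvent, Object> stateMachine = FACTORY.make(this);
}
{noformat}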



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7247) Assign multiple will lead to hot point problems of physical resource consumption

2017-09-25 Thread balloons (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180148#comment-16180148
 ] 

balloons commented on YARN-7247:


Thanks [~yufeigu]; the *YARN-1042* issue completely solves my problem.

> Assign multiple will lead to hot point problems of physical resource 
> consumption
> 
>
> Key: YARN-7247
> URL: https://issues.apache.org/jira/browse/YARN-7247
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: balloons
>Assignee: Daniel Templeton
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7241) Merge YARN-5734 to trunk/branch-2

2017-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180131#comment-16180131
 ] 

Hadoop QA commented on YARN-7241:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 10 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  6m 
24s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m  
5s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 13s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 11 new + 636 unchanged - 1 fixed = 647 total (was 637) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  6m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
24s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m 
11s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 75m 42s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
41s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
46s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 46m 10s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 31m 29s{color} 
| {color:red} hadoop-yarn-client in the patch failed. 

[jira] [Commented] (YARN-7252) Removing queue then failing over results in exception

2017-09-25 Thread Jonathan Hung (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180072#comment-16180072
 ] 

Jonathan Hung commented on YARN-7252:
-

One approach I see is adding a reinitialize wrapper in CapacityScheduler with a 
flag that tells it whether or not to check the queue hierarchy. The assumption 
is that on failover you don't need to check the queue hierarchy, since the 
previous RM already did.

Another, less intrusive approach would be to replay the logs not yet read by 
the standby-now-active RM. This seems wasteful, though, and would require a 
lot more logic.
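
A minimal sketch of the first approach, assuming hypothetical names (the real 
CapacityScheduler#reinitialize takes a Configuration and an RMContext):

{noformat}
import java.io.IOException;

// Sketch of a flag-based reinitialize wrapper: the admin refresh path keeps
// validating the queue hierarchy, while the failover path skips it because
// the previously active RM already performed that check.
public class SchedulerReinitSketch {

  interface QueueManager {
    void validateQueueHierarchy() throws IOException; // throws if a removed queue is still RUNNING
    void reinitializeQueues();
  }

  private final QueueManager queueManager;

  SchedulerReinitSketch(QueueManager queueManager) {
    this.queueManager = queueManager;
  }

  // Existing entry point: a normal admin-driven refresh validates.
  public void reinitialize() throws IOException {
    reinitialize(true);
  }

  // New wrapper: transitionToActive would call reinitialize(false).
  public void reinitialize(boolean validateHierarchy) throws IOException {
    if (validateHierarchy) {
      queueManager.validateQueueHierarchy();
    }
    queueManager.reinitializeQueues();
  }
}
{noformat}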

> Removing queue then failing over results in exception
> -
>
> Key: YARN-7252
> URL: https://issues.apache.org/jira/browse/YARN-7252
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>
> Scenario: rm1 and rm2, starting configuration with root.default, root.a. rm1 
> is active. First, put root.a into STOPPED state, then remove it. Then put rm1 
> in standby and rm2 in active. Here's the exception: {noformat}Operation 
> failed: Error on refreshAll during transition to Active
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:315)
>   at 
> org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
>   at 
> org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: RefreshAll operation 
> failed
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:747)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:307)
>   ... 10 more
> Caused by: java.io.IOException: Failed to re-init queues : root.a is deleted 
> from the new capacity scheduler configuration, but the queue is not yet in 
> stopped state. Current State : RUNNING
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:436)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:405)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:736)
>   ... 11 more
> Caused by: java.io.IOException: root.a is deleted from the new capacity 
> scheduler configuration, but the queue is not yet in stopped state. Current 
> State : RUNNING
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.validateQueueHierarchy(CapacitySchedulerQueueManager.java:312)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.reinitializeQueues(CapacitySchedulerQueueManager.java:174)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitializeQueues(CapacityScheduler.java:648)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:432)
>   ... 13 more{noformat}
> It seems rm2 does not know root.a was STOPPED, so when it can't find root.a 
> and sees it has been deleted, it throws this exception.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7253) Shared Cache Manager daemon command listed as admin subcmd in yarn script

2017-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180062#comment-16180062
 ] 

Hadoop QA commented on YARN-7253:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
27s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m 
13s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}  1m 45s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-7253 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12888981/YARN-7253-trunk-001.patch
 |
| Optional Tests |  asflicense  shellcheck  shelldocs  |
| uname | Linux cfa80916c806 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / a2b31e3 |
| shellcheck | v0.4.6 |
| modules | C: hadoop-yarn-project/hadoop-yarn U: 
hadoop-yarn-project/hadoop-yarn |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17635/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Shared Cache Manager daemon command listed as admin subcmd in yarn script
> -
>
> Key: YARN-7253
> URL: https://issues.apache.org/jira/browse/YARN-7253
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1, 3.0.0-alpha4
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
> Attachments: YARN-7253-trunk-001.patch
>
>
> Currently the command to start the shared cache manager daemon is listed as 
> an admin command in the yarn script usage:
> {noformat}
>   SUBCOMMAND is one of:
> Admin Commands:
> daemonlogget/set the log level for each daemon
> node prints node report(s)
> rmadmin  admin tools
> scmadmin SharedCacheManager admin tools
> sharedcachemanager   run the SharedCacheManager daemon
> {noformat}
> It should be a daemon command.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7191) Improve yarn-service documentation

2017-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180060#comment-16180060
 ] 

Hadoop QA commented on YARN-7191:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} yarn-native-services Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
14s{color} | {color:green} yarn-native-services passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
16s{color} | {color:green} yarn-native-services passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 21 line(s) that end in whitespace. Use 
git apply --whitespace=fix <>. Refer 
https://git-scm.com/docs/git-apply {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 20m  9s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-7191 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12888980/YARN-7191.yarn-native-services.02.patch
 |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux 74532c17a063 3.13.0-116-generic #163-Ubuntu SMP Fri Mar 31 
14:13:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | yarn-native-services / 3f7a50d |
| whitespace | 
https://builds.apache.org/job/PreCommit-YARN-Build/17634/artifact/patchprocess/whitespace-eol.txt
 |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17634/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Improve yarn-service documentation
> --
>
> Key: YARN-7191
> URL: https://issues.apache.org/jira/browse/YARN-7191
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-7191.yarn-native-services.01.patch, 
> YARN-7191.yarn-native-services.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7253) Shared Cache Manager daemon command listed as admin subcmd in yarn script

2017-09-25 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated YARN-7253:
---
Affects Version/s: 3.0.0-alpha4

> Shared Cache Manager daemon command listed as admin subcmd in yarn script
> -
>
> Key: YARN-7253
> URL: https://issues.apache.org/jira/browse/YARN-7253
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1, 3.0.0-alpha4
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
> Attachments: YARN-7253-trunk-001.patch
>
>
> Currently the command to start the shared cache manager daemon is listed as 
> an admin command in the yarn script usage:
> {noformat}
>   SUBCOMMAND is one of:
> Admin Commands:
> daemonlogget/set the log level for each daemon
> node prints node report(s)
> rmadmin  admin tools
> scmadmin SharedCacheManager admin tools
> sharedcachemanager   run the SharedCacheManager daemon
> {noformat}
> It should be a daemon command.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7253) Shared Cache Manager daemon command listed as admin subcmd in yarn script

2017-09-25 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated YARN-7253:
---
Affects Version/s: 3.0.0-beta1

> Shared Cache Manager daemon command listed as admin subcmd in yarn script
> -
>
> Key: YARN-7253
> URL: https://issues.apache.org/jira/browse/YARN-7253
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-beta1
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
> Attachments: YARN-7253-trunk-001.patch
>
>
> Currently the command to start the shared cache manager daemon is listed as 
> an admin command in the yarn script usage:
> {noformat}
>   SUBCOMMAND is one of:
> Admin Commands:
> daemonlogget/set the log level for each daemon
> node prints node report(s)
> rmadmin  admin tools
> scmadmin SharedCacheManager admin tools
> sharedcachemanager   run the SharedCacheManager daemon
> {noformat}
> It should be a daemon command.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-7253) Shared Cache Manager daemon command listed as admin subcmd in yarn script

2017-09-25 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo reassigned YARN-7253:
--

Assignee: Chris Trezzo

> Shared Cache Manager daemon command listed as admin subcmd in yarn script
> -
>
> Key: YARN-7253
> URL: https://issues.apache.org/jira/browse/YARN-7253
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Trivial
> Attachments: YARN-7253-trunk-001.patch
>
>
> Currently the command to start the shared cache manager daemon is listed as 
> an admin command in the yarn script usage:
> {noformat}
>   SUBCOMMAND is one of:
> Admin Commands:
> daemonlogget/set the log level for each daemon
> node prints node report(s)
> rmadmin  admin tools
> scmadmin SharedCacheManager admin tools
> sharedcachemanager   run the SharedCacheManager daemon
> {noformat}
> It should be a daemon command.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7253) Shared Cache Manager daemon command listed as admin subcmd in yarn script

2017-09-25 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated YARN-7253:
---
Attachment: YARN-7253-trunk-001.patch

Trunk v1 patch attached.

> Shared Cache Manager daemon command listed as admin subcmd in yarn script
> -
>
> Key: YARN-7253
> URL: https://issues.apache.org/jira/browse/YARN-7253
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Priority: Trivial
> Attachments: YARN-7253-trunk-001.patch
>
>
> Currently the command to start the shared cache manager daemon is listed as 
> an admin command in the yarn script usage:
> {noformat}
>   SUBCOMMAND is one of:
> Admin Commands:
> daemonlogget/set the log level for each daemon
> node prints node report(s)
> rmadmin  admin tools
> scmadmin SharedCacheManager admin tools
> sharedcachemanager   run the SharedCacheManager daemon
> {noformat}
> It should be a daemon command.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7253) Shared Cache Manager daemon command listed as admin subcmd in yarn script

2017-09-25 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated YARN-7253:
---
Description: 
Currently the command to start the shared cache manager daemon is listed as an 
admin command in the yarn script usage:
{noformat}
  SUBCOMMAND is one of:


Admin Commands:

daemonlogget/set the log level for each daemon
node prints node report(s)
rmadmin  admin tools
scmadmin SharedCacheManager admin tools
sharedcachemanager   run the SharedCacheManager daemon
{noformat}

It should be a daemon command.

  was:
Currently the command to start the shared cache manager daemon is listed as an 
admin command in the yarn script:
{noformat}
  SUBCOMMAND is one of:


Admin Commands:

daemonlogget/set the log level for each daemon
node prints node report(s)
rmadmin  admin tools
scmadmin SharedCacheManager admin tools
sharedcachemanager   run the SharedCacheManager daemon
{noformat}

It should be a daemon command.


> Shared Cache Manager daemon command listed as admin subcmd in yarn script
> -
>
> Key: YARN-7253
> URL: https://issues.apache.org/jira/browse/YARN-7253
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Priority: Trivial
>
> Currently the command to start the shared cache manager daemon is listed as 
> an admin command in the yarn script usage:
> {noformat}
>   SUBCOMMAND is one of:
> Admin Commands:
> daemonlogget/set the log level for each daemon
> node prints node report(s)
> rmadmin  admin tools
> scmadmin SharedCacheManager admin tools
> sharedcachemanager   run the SharedCacheManager daemon
> {noformat}
> It should be a daemon command.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7253) Shared Cache Manager daemon command listed as admin subcmd in yarn script

2017-09-25 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated YARN-7253:
---
Priority: Trivial  (was: Minor)

> Shared Cache Manager daemon command listed as admin subcmd in yarn script
> -
>
> Key: YARN-7253
> URL: https://issues.apache.org/jira/browse/YARN-7253
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Chris Trezzo
>Priority: Trivial
>
> Currently the command to start the shared cache manager daemon is listed as 
> an admin command in the yarn script:
> {noformat}
>   SUBCOMMAND is one of:
> Admin Commands:
> daemonlogget/set the log level for each daemon
> node prints node report(s)
> rmadmin  admin tools
> scmadmin SharedCacheManager admin tools
> sharedcachemanager   run the SharedCacheManager daemon
> {noformat}
> It should be a daemon command.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7153) Remove duplicated code in AMRMClientAsyncImpl.java

2017-09-25 Thread Sen Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180044#comment-16180044
 ] 

Sen Zhao commented on YARN-7153:


Thanks for your review, [~ajisakaa]!

> Remove duplicated code in AMRMClientAsyncImpl.java
> --
>
> Key: YARN-7153
> URL: https://issues.apache.org/jira/browse/YARN-7153
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0-alpha4
>Reporter: Sen Zhao
>Assignee: Sen Zhao
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-beta1, 3.1.0
>
> Attachments: YARN-7153.001.patch
>
>
> I noticed that some code in AMRMClientAsyncImpl#setHeartbeatInterval 
> duplicates the code in AMRMClientAsync#setHeartbeatInterval. The two methods 
> are handled the same way. Submitting a patch to fix it!
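
Below is a minimal, hypothetical illustration of this kind of duplication 
(simplified signatures, not the actual AMRMClient classes); the fix is simply 
to delete the redundant override so the inherited method is used:

{noformat}
abstract class ClientBase {
  protected volatile int heartbeatIntervalMs;

  public void setHeartbeatInterval(int interval) {
    heartbeatIntervalMs = interval;
  }
}

class ClientImpl extends ClientBase {
  // Identical to the superclass body: a redundant override that can be removed.
  @Override
  public void setHeartbeatInterval(int interval) {
    heartbeatIntervalMs = interval;
  }
}
{noformat}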



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7253) Shared Cache Manager daemon command listed as admin subcmd in yarn script

2017-09-25 Thread Chris Trezzo (JIRA)
Chris Trezzo created YARN-7253:
--

 Summary: Shared Cache Manager daemon command listed as admin 
subcmd in yarn script
 Key: YARN-7253
 URL: https://issues.apache.org/jira/browse/YARN-7253
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Chris Trezzo
Priority: Minor


Currently the command to start the shared cache manager daemon is listed as an 
admin command in the yarn script:
{noformat}
  SUBCOMMAND is one of:


Admin Commands:

daemonlogget/set the log level for each daemon
node prints node report(s)
rmadmin  admin tools
scmadmin SharedCacheManager admin tools
sharedcachemanager   run the SharedCacheManager daemon
{noformat}

It should be a daemon command.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7191) Improve yarn-service documentation

2017-09-25 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180030#comment-16180030
 ] 

Jian He commented on YARN-7191:
---

Updated the doc again

> Improve yarn-service documentation
> --
>
> Key: YARN-7191
> URL: https://issues.apache.org/jira/browse/YARN-7191
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-7191.yarn-native-services.01.patch, 
> YARN-7191.yarn-native-services.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7191) Improve yarn-service documentation

2017-09-25 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-7191:
--
Attachment: YARN-7191.yarn-native-services.02.patch

> Improve yarn-service documentation
> --
>
> Key: YARN-7191
> URL: https://issues.apache.org/jira/browse/YARN-7191
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-7191.yarn-native-services.01.patch, 
> YARN-7191.yarn-native-services.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-7251) Misc changes to YARN-5734

2017-09-25 Thread Jonathan Hung (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung resolved YARN-7251.
-
Resolution: Fixed

> Misc changes to YARN-5734
> -
>
> Key: YARN-7251
> URL: https://issues.apache.org/jira/browse/YARN-7251
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-7251-YARN-5734.001.patch, 
> YARN-7251-YARN-5734.002.patch
>
>
> Documentation/style changes to YARN-5734 before merge.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7251) Misc changes to YARN-5734

2017-09-25 Thread Jonathan Hung (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-7251:

Fix Version/s: YARN-5734

> Misc changes to YARN-5734
> -
>
> Key: YARN-7251
> URL: https://issues.apache.org/jira/browse/YARN-7251
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Fix For: YARN-5734
>
> Attachments: YARN-7251-YARN-5734.001.patch, 
> YARN-7251-YARN-5734.002.patch
>
>
> Documentation/style changes to YARN-5734 before merge.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7238) Documentation for API based scheduler configuration management

2017-09-25 Thread Jonathan Hung (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-7238:

Fix Version/s: YARN-5734

> Documentation for API based scheduler configuration management
> --
>
> Key: YARN-7238
> URL: https://issues.apache.org/jira/browse/YARN-7238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Fix For: YARN-5734
>
> Attachments: YARN-7238-YARN-5734.001.patch, 
> YARN-7238-YARN-5734.002.patch
>
>
> Documentation for configurations to set / how to use scheduler configuration 
> mutation API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7238) Documentation for API based scheduler configuration management

2017-09-25 Thread Jonathan Hung (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180016#comment-16180016
 ] 

Jonathan Hung commented on YARN-7238:
-

Committing this based on the discussion in YARN-7241 and YARN-7251.

> Documentation for API based scheduler configuration management
> --
>
> Key: YARN-7238
> URL: https://issues.apache.org/jira/browse/YARN-7238
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-7238-YARN-5734.001.patch, 
> YARN-7238-YARN-5734.002.patch
>
>
> Documentation for configurations to set / how to use scheduler configuration 
> mutation API.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6626) Embed REST API service into RM

2017-09-25 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180010#comment-16180010
 ] 

Jian He commented on YARN-6626:
---

Thanks [~eyang] for the patch. Some questions and comments on the patch:

- Why does ApiServer need to extend WebServices?
- Why the change in Resource.java? If it's not related to this patch, we 
should not include it.
- Why is the hadoop-yarn-server-common dependency added in the api-server 
pom.xml?
- By default, YARN_API_SERVICES_ENABLE should be false, because the feature is 
still experimental (see the sketch after this list).
- "yarn.webapp.api-services.enable": could "api-services" be "api-service"?
- Can you also update QuickStart.md to mention the option of starting the 
api-service in embedded mode?
- For the "@XmlType(name)" annotation, what is the implication? Does it change 
the field name in the API?
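
A minimal sketch of gating the feature off by default, assuming the property 
name from the discussion above and a hypothetical helper class (this is not 
the actual patch):

{noformat}
import org.apache.hadoop.conf.Configuration;

public final class ApiServiceGateSketch {
  // Property name taken from the discussion above; constant name is made up.
  static final String API_SERVICES_ENABLE = "yarn.webapp.api-services.enable";

  // Default false while the feature is still experimental.
  static boolean isEnabled(Configuration conf) {
    return conf.getBoolean(API_SERVICES_ENABLE, false);
  }

  private ApiServiceGateSketch() {
  }
}
{noformat}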

> Embed REST API service into RM
> --
>
> Key: YARN-6626
> URL: https://issues.apache.org/jira/browse/YARN-6626
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gour Saha
> Fix For: yarn-native-services
>
> Attachments: YARN-6626.yarn-native-services.001.patch, 
> YARN-6626.yarn-native-services.002.patch, 
> YARN-6626.yarn-native-services.003.patch, 
> YARN-6626.yarn-native-services.004.patch, 
> YARN-6626.yarn-native-services.005.patch
>
>
> As of now the deployment model of the Native Services REST API service is 
> standalone. There are several cross-cutting solutions that can be inherited 
> for free (kerberos, HA, ACLs, trusted proxy support, etc.) by the REST API 
> service if it is embedded into the RM process. In fact we can expose the REST 
> API via the same port as RM UI (8088 default). The URI path 
> /services/v1/applications will distinguish the REST API calls from other RM 
> APIs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7249) Fix CapacityScheduler NPE issue when a container preempted while the node is being removed

2017-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180007#comment-16180007
 ] 

Hadoop QA commented on YARN-7249:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.8 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
51s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
18s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} branch-2.8 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 91m 56s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}109m 10s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| Timed out junit tests | 
org.apache.hadoop.yarn.server.resourcemanager.TestRMHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:c2d96dd |
| JIRA Issue | YARN-7249 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12888909/YARN-7249.branch-2.8.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2db572d674f0 3.13.0-129-generic #178-Ubuntu SMP Fri Aug 11 
12:48:20 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | branch-2.8 / ea7e655 |
| Default Java | 1.7.0_151 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/17632/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/17632/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17632/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |



[jira] [Created] (YARN-7252) Removing queue then failing over results in exception

2017-09-25 Thread Jonathan Hung (JIRA)
Jonathan Hung created YARN-7252:
---

 Summary: Removing queue then failing over results in exception
 Key: YARN-7252
 URL: https://issues.apache.org/jira/browse/YARN-7252
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Jonathan Hung
Assignee: Jonathan Hung


Scenario: rm1 and rm2, starting configuration with root.default, root.a. rm1 is 
active. First, put root.a into STOPPED state, then remove it. Then put rm1 in 
standby and rm2 in active. Here's the exception: {noformat}Operation failed: 
Error on refreshAll during transition to Active
at 
org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:315)
at 
org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.transitionToActive(HAServiceProtocolServerSideTranslatorPB.java:107)
at 
org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:4460)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1962)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)
Caused by: org.apache.hadoop.ha.ServiceFailedException: RefreshAll operation 
failed
at 
org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:747)
at 
org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:307)
... 10 more
Caused by: java.io.IOException: Failed to re-init queues : root.a is deleted 
from the new capacity scheduler configuration, but the queue is not yet in 
stopped state. Current State : RUNNING
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:436)
at 
org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:405)
at 
org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshAll(AdminService.java:736)
... 11 more
Caused by: java.io.IOException: root.a is deleted from the new capacity 
scheduler configuration, but the queue is not yet in stopped state. Current 
State : RUNNING
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.validateQueueHierarchy(CapacitySchedulerQueueManager.java:312)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.reinitializeQueues(CapacitySchedulerQueueManager.java:174)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitializeQueues(CapacityScheduler.java:648)
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:432)
... 13 more{noformat}
It seems rm2 does not know root.a was STOPPED, so when it can't find root.a 
and sees it has been deleted, it throws this exception.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7241) Merge YARN-5734 to trunk/branch-2

2017-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16179992#comment-16179992
 ] 

Hadoop QA commented on YARN-7241:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
28s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 10 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
43s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m  
4s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
25s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 15s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 11 new + 636 unchanged - 1 fixed = 647 total (was 637) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  5m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} shellcheck {color} | {color:green}  0m 
24s{color} | {color:green} There were no new shellcheck issues. {color} |
| {color:green}+1{color} | {color:green} shelldocs {color} | {color:green}  0m 
13s{color} | {color:green} There were no new shelldocs issues. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  4m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 82m  6s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
43s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
44s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 50m 48s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 31m 59s{color} 
| {color:red} hadoop-yarn-client in the patch failed. {color} |

[jira] [Commented] (YARN-7248) NM returns new SCHEDULED container status to older clients

2017-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179981#comment-16179981
 ] 

Hadoop QA commented on YARN-7248:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
1s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  5m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  7s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 1 new + 301 unchanged - 1 fixed = 302 total (was 302) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
7s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
40s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
47s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m  
7s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 45m 18s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}124m 50s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation |
|   | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy 
|
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-7248 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12888948/YARN-7248.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux b21dee8f9330 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (YARN-6570) No logs were found for running application, running container

2017-09-25 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179971#comment-16179971
 ] 

Arun Suresh commented on YARN-6570:
---

[~wangda], it looks like prior to YARN-6570, the NodeManager never used to 
report api.records.ContainerState.NEW, even though the enum existed. So after 
YARN-6570, yes, we need to handle NEW as well in RMNodeImpl.


> No logs were found for running application, running container
> -
>
> Key: YARN-6570
> URL: https://issues.apache.org/jira/browse/YARN-6570
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-beta1, 3.1.0
>
> Attachments: YARN-6570-branch-2.8.001.patch, 
> YARN-6570-branch-2.8.002.patch, YARN-6570.poc.patch, YARN-6570-v2.patch, 
> YARN-6570-v3.patch
>
>
> 1. Obtain running containers from the following CLI for a running application:
>  yarn  container -list appattempt
> 2. Could not fetch logs:
> {code}
> Can not find any log file matching the pattern: ALL for the container
> {code}






[jira] [Commented] (YARN-7207) Cache the local host name when getting application list in RM

2017-09-25 Thread Robert Kanter (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179957#comment-16179957
 ] 

Robert Kanter commented on YARN-7207:
-

Looks good overall, a few comments:
# In {{TestRMAppAttemptTransitions}}, it's fine to replace the duplicate code 
with a call to {{getAppProxyUrl}}, but the original code here had an 
{{Assert.fail()}} when there was a {{URISyntaxException}}.  Now, that's ignored 
and it returns "N/A".  We should have the test code do something like this to 
maintain that check:
{code:java}
String url = rmContext.getAppProxyUrl(conf,
    appAttempt.getAppAttemptId().getApplicationId());
Assert.assertNotEquals("N/A", url);
{code}
# Instead of calling {{getProxyHostAndPort}} every time we call 
{{getAppProxyUrl}}, perhaps it would be better to simply populate the value of 
{{proxyHostAndPort}} in the constructor for {{RMContextImpl}}? 
{{getAppProxyUrl}} could then use the variable directly, and we don't have to 
worry about any race conditions.
#- This would also make it very easy to add a warning message about the host 
name being slow, because we could just do it when populating 
{{proxyHostAndPort}}, which should happen during startup when the 
{{RMContextImpl}} is being created. (A rough sketch of this idea is below.)
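
A minimal, self-contained sketch of the idea in comment 2 (the class shape and 
helper names are illustrative, not the actual {{RMContextImpl}} API):
{code:java}
import java.util.function.Supplier;
import java.util.logging.Logger;

// Sketch only: resolve the proxy host:port once at construction time so that
// building per-application report URLs never triggers a host-name lookup.
class CachedProxyContext {
  private static final Logger LOG =
      Logger.getLogger(CachedProxyContext.class.getName());
  private final String proxyHostAndPort;

  CachedProxyContext(Supplier<String> hostAndPortResolver) {
    long start = System.nanoTime();
    // e.g. () -> getProxyHostAndPort(conf) in the real constructor
    this.proxyHostAndPort = hostAndPortResolver.get();
    long elapsedMs = (System.nanoTime() - start) / 1_000_000;
    if (elapsedMs > 1000) {
      // Natural place for the "host name resolution is slow" warning.
      LOG.warning("Resolving proxy host took " + elapsedMs + " ms");
    }
  }

  String getAppProxyUrl(String applicationId) {
    // No race: proxyHostAndPort is final and assigned once at startup.
    return "http://" + proxyHostAndPort + "/proxy/" + applicationId;
  }
}
{code}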

> Cache the local host name when getting application list in RM
> -
>
> Key: YARN-7207
> URL: https://issues.apache.org/jira/browse/YARN-7207
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: RM
>Affects Versions: 3.1.0
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Attachments: YARN-7207.001.patch
>
>
> {{getLocalHostName()}} is invoked for generating the report for each 
> application, which means it is called 1000 times for each 
> {{getApplications()}} if there are 1000 apps in RM. Some user got a 
> performance issue when {{getLocalHostName()}} is slow under some network envs.






[jira] [Commented] (YARN-7240) Add more states and transitions to stabilize the NM Container state machine

2017-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179953#comment-16179953
 ] 

Hudson commented on YARN-7240:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12973 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/12973/])
YARN-7240. Add more states and transitions to stabilize the NM Container (arun 
suresh: rev df800f6cf3ea663daf4081ebe784808b08d9366d)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerEventType.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/UpdateContainerTokenEvent.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/BaseContainerManagerTest.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/UpdateContainerSchedulerEvent.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerState.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/TestContainerSchedulerQueuing.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java


> Add more states and transitions to stabilize the NM Container state machine
> ---
>
> Key: YARN-7240
> URL: https://issues.apache.org/jira/browse/YARN-7240
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: kartheek muthyala
> Attachments: YARN-7240.001.patch, YARN-7240.002.patch, 
> YARN-7240.002.patch
>
>
> There seem to be a few intermediate states that can be added to improve the 
> stability of the NM container state machine.
> For example:
> * The REINITIALIZING state should probably be split into REINITIALIZING and 
> REINITIALIZING_AWAITING_KILL. 
> * Container updates are currently handled in the ContainerScheduler, but it 
> would probably be better to have them plumbed through the container state 
> machine as a new state, say UPDATING, and a new container event.
> The plan is also to add some extra tests to try to cover every transition.
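
For context, a hedged, self-contained sketch of the flow the proposal above 
describes (state and event names are illustrative; see the edited files in the 
commit for the real wiring):
{code:java}
// Sketch only: plumb container updates through the NM container state
// machine as a dedicated hop, instead of handling them in ContainerScheduler.
class ContainerUpdateSketch {
  enum CState { RUNNING, UPDATING }
  enum CEvent { UPDATE_CONTAINER_TOKEN, CONTAINER_UPDATED }

  static CState transition(CState s, CEvent e) {
    if (s == CState.RUNNING && e == CEvent.UPDATE_CONTAINER_TOKEN) {
      return CState.UPDATING;   // a token update arrives while running
    }
    if (s == CState.UPDATING && e == CEvent.CONTAINER_UPDATED) {
      return CState.RUNNING;    // the update is applied; resume running
    }
    return s;                   // all other events: unchanged (sketch only)
  }
}
{code}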






[jira] [Commented] (YARN-6570) No logs were found for running application, running container

2017-09-25 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179951#comment-16179951
 ] 

Wangda Tan commented on YARN-6570:
--

[~jlowe] / [~xgong], 

[~ssath...@hortonworks.com] just reported that a similar issue happens on 3.0 
bits as well, so I just checked RMNodeImpl:
{code}
// Process running containers
if (remoteContainer.getState() == ContainerState.RUNNING ||
    remoteContainer.getState() == ContainerState.SCHEDULED) {
  ++numRemoteRunningContainers;
  if (!launchedContainers.contains(containerId)) {
    // Just launched container. RM knows about it the first time.
    launchedContainers.add(containerId);
    newlyLaunchedContainers.add(remoteContainer);
    // Unregister from containerAllocationExpirer.
    containerAllocationExpirer
        .unregister(new AllocationExpirationInfo(containerId));
  }
} else {
  // A finished container
  launchedContainers.remove(containerId);
  if (completedContainers.add(containerId)) {
    newlyCompletedContainers.add(remoteContainer);
  }
  // Unregister from containerAllocationExpirer.
  containerAllocationExpirer
      .unregister(new AllocationExpirationInfo(containerId));
}
{code}

It looks like the code doesn't properly handle the NEW state in the 
branch-2/trunk patch. Do you think adding the NEW check in trunk/branch-2 
would be enough? (A minimal sketch of such a check follows.)
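
For illustration only, one possible shape of that check is below; whether NEW 
should count toward the running-container bookkeeping, or simply be skipped, 
is exactly the open question here:
{code:java}
// Hedged sketch, following the excerpt above: also treat a remote container
// reported in NEW state as not-yet-finished, so it does not fall into the
// "finished container" branch and get dropped from launchedContainers.
ContainerState state = remoteContainer.getState();
if (state == ContainerState.RUNNING
    || state == ContainerState.SCHEDULED
    || state == ContainerState.NEW) {
  // ... same just-launched handling as in the excerpt above ...
} else {
  // ... finished-container handling ...
}
{code}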

> No logs were found for running application, running container
> -
>
> Key: YARN-6570
> URL: https://issues.apache.org/jira/browse/YARN-6570
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-beta1, 3.1.0
>
> Attachments: YARN-6570-branch-2.8.001.patch, 
> YARN-6570-branch-2.8.002.patch, YARN-6570.poc.patch, YARN-6570-v2.patch, 
> YARN-6570-v3.patch
>
>
> 1. Obtain running containers from the following CLI for a running application:
>  yarn  container -list appattempt
> 2. Could not fetch logs:
> {code}
> Can not find any log file matching the pattern: ALL for the container
> {code}






[jira] [Commented] (YARN-7226) Whitelisted variables do not support delayed variable expansion

2017-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179941#comment-16179941
 ] 

Hadoop QA commented on YARN-7226:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 20s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 1 new + 129 unchanged - 3 fixed = 130 total (was 132) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m  
7s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 37m 16s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-7226 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12888961/YARN-7226.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 4cc913f74b36 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / cde804b |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/17631/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/17631/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17631/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Whitelisted variables do not support delayed variable expansion
> ---
>
> Key: YARN-7226
> URL: 

[jira] [Commented] (YARN-7251) Misc changes to YARN-5734

2017-09-25 Thread Jonathan Hung (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179939#comment-16179939
 ] 

Jonathan Hung commented on YARN-7251:
-

Thanks [~leftnoteasy], will commit this and YARN-7238.

> Misc changes to YARN-5734
> -
>
> Key: YARN-7251
> URL: https://issues.apache.org/jira/browse/YARN-7251
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-7251-YARN-5734.001.patch, 
> YARN-7251-YARN-5734.002.patch
>
>
> Documentation/style changes to YARN-5734 before merge.






[jira] [Commented] (YARN-7251) Misc changes to YARN-5734

2017-09-25 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179915#comment-16179915
 ] 

Wangda Tan commented on YARN-7251:
--

Patch LGTM, thanks [~jhung]!

> Misc changes to YARN-5734
> -
>
> Key: YARN-7251
> URL: https://issues.apache.org/jira/browse/YARN-7251
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-7251-YARN-5734.001.patch, 
> YARN-7251-YARN-5734.002.patch
>
>
> Documentation/style changes to YARN-5734 before merge.






[jira] [Commented] (YARN-7249) Fix CapacityScheduler NPE issue when a container preempted while the node is being removed

2017-09-25 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179907#comment-16179907
 ] 

Wangda Tan commented on YARN-7249:
--

[~eepayne], 

bq. doesn't make sense if node is null, but if queue.completedContainer isn't 
called, won't that leave references to the container still inside internal 
structures? And, for example, won't reserved increment counters be left 
un-decremented?

I think it should be fine: containers are properly released when 
CapacityScheduler#removeNode is called. And if parallel threads access the 
scheduler and queue#completedContainer gets invoked with a non-null but 
already removed node, it becomes a no-op. Please let me know if you think 
differently.
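
For reference, a minimal sketch of the null-node guard described in the issue 
below (names follow CapacityScheduler, but the committed patch may differ):
{code:java}
// Hedged sketch: bail out of completed-container handling when the node has
// already been removed from the scheduler, instead of passing null downstream.
FiCaSchedulerNode node = getNode(rmContainer.getAllocatedNode());
if (node == null) {
  LOG.info("Container " + rmContainer.getContainerId()
      + " completed on an already-removed node; skipping queue update.");
  return;
}
// ... proceed with queue.completedContainer(...) as before ...
{code}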

> Fix CapacityScheduler NPE issue when a container preempted while the node is 
> being removed
> --
>
> Key: YARN-7249
> URL: https://issues.apache.org/jira/browse/YARN-7249
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-7249.branch-2.8.001.patch
>
>
> This issue could happen when 3 conditions satisfied:
> 1) A node is removing from scheduler.
> 2) A container running on the node is being preempted. 
> 3) A rare race condition causes scheduler pass a null node to leaf queue.
> Fix of the problem is to add a null node check inside CapacityScheduler.
> Stack trace:
> {code}
> 2017-08-31 02:51:24,748 FATAL resourcemanager.ResourceManager 
> (ResourceManager.java:run(714)) - Error in handling event type 
> KILL_RESERVED_CONTAINER to the scheduler 
> java.lang.NullPointerException 
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1308)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainerInternal(CapacityScheduler.java:1469)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:497)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.killReservedContainer(CapacityScheduler.java:1505)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1341)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:127)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:705)
>  
> {code}
> This is an issue only existed in 2.8.x






[jira] [Commented] (YARN-7240) Add more states and transitions to stabilize the NM Container state machine

2017-09-25 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179904#comment-16179904
 ] 

Arun Suresh commented on YARN-7240:
---

+1. The latest patch looks good. (I will take care of the unused imports while 
committing)



> Add more states and transitions to stabilize the NM Container state machine
> ---
>
> Key: YARN-7240
> URL: https://issues.apache.org/jira/browse/YARN-7240
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: kartheek muthyala
> Attachments: YARN-7240.001.patch, YARN-7240.002.patch, 
> YARN-7240.002.patch
>
>
> There seem to be a few intermediate states that can be added to improve the 
> stability of the NM container state machine.
> For example:
> * The REINITIALIZING state should probably be split into REINITIALIZING and 
> REINITIALIZING_AWAITING_KILL. 
> * Container updates are currently handled in the ContainerScheduler, but it 
> would probably be better to have them plumbed through the container state 
> machine as a new state, say UPDATING, and a new container event.
> The plan is also to add some extra tests to try to cover every transition.






[jira] [Updated] (YARN-7240) Add more states and transitions to stabilize the NM Container state machine

2017-09-25 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-7240:
--
Target Version/s: 2.9.0, 3.0.0-beta1

> Add more states and transitions to stabilize the NM Container state machine
> ---
>
> Key: YARN-7240
> URL: https://issues.apache.org/jira/browse/YARN-7240
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: kartheek muthyala
> Attachments: YARN-7240.001.patch, YARN-7240.002.patch, 
> YARN-7240.002.patch
>
>
> There seem to be a few intermediate states that can be added to improve the 
> stability of the NM container state machine.
> For example:
> * The REINITIALIZING state should probably be split into REINITIALIZING and 
> REINITIALIZING_AWAITING_KILL. 
> * Container updates are currently handled in the ContainerScheduler, but it 
> would probably be better to have them plumbed through the container state 
> machine as a new state, say UPDATING, and a new container event.
> The plan is also to add some extra tests to try to cover every transition.






[jira] [Commented] (YARN-7241) Merge YARN-5734 to trunk/branch-2

2017-09-25 Thread Jonathan Hung (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179901#comment-16179901
 ] 

Jonathan Hung commented on YARN-7241:
-

Updating with the 003 patch to put quotes around the SchedConfCLI examples, 
since the semicolons might be interpreted as new commands.

> Merge YARN-5734 to trunk/branch-2
> -
>
> Key: YARN-7241
> URL: https://issues.apache.org/jira/browse/YARN-7241
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-7241.001.patch, YARN-7241.002.patch, 
> YARN-7241.003.patch
>
>
> Ticket for jenkins pre-commit for full diff.






[jira] [Updated] (YARN-7241) Merge YARN-5734 to trunk/branch-2

2017-09-25 Thread Jonathan Hung (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-7241:

Attachment: YARN-7241.003.patch

> Merge YARN-5734 to trunk/branch-2
> -
>
> Key: YARN-7241
> URL: https://issues.apache.org/jira/browse/YARN-7241
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-7241.001.patch, YARN-7241.002.patch, 
> YARN-7241.003.patch
>
>
> Ticket for jenkins pre-commit for full diff.






[jira] [Commented] (YARN-7251) Misc changes to YARN-5734

2017-09-25 Thread Jonathan Hung (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179899#comment-16179899
 ] 

Jonathan Hung commented on YARN-7251:
-

Attaching 002 to put quotes around SchedConfCLI examples, since the semicolon 
might be interpreted as a new command.

> Misc changes to YARN-5734
> -
>
> Key: YARN-7251
> URL: https://issues.apache.org/jira/browse/YARN-7251
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-7251-YARN-5734.001.patch, 
> YARN-7251-YARN-5734.002.patch
>
>
> Documentation/style changes to YARN-5734 before merge.






[jira] [Commented] (YARN-2037) Add restart support for Unmanaged AMs

2017-09-25 Thread Subru Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179898#comment-16179898
 ] 

Subru Krishnan commented on YARN-2037:
--

Thanks [~botong] for the patch. +1 from my side.

I'll wait for a day to see if anyone has any comments/feedback and commit 
accordingly.

> Add restart support for Unmanaged AMs
> -
>
> Key: YARN-2037
> URL: https://issues.apache.org/jira/browse/YARN-2037
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Karthik Kambatla
>Assignee: Botong Huang
> Attachments: YARN-2037.v1.patch
>
>
> It would be nice to allow Unmanaged AMs also to restart in a work-preserving 
> way. 






[jira] [Updated] (YARN-7251) Misc changes to YARN-5734

2017-09-25 Thread Jonathan Hung (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-7251:

Attachment: YARN-7251-YARN-5734.002.patch

> Misc changes to YARN-5734
> -
>
> Key: YARN-7251
> URL: https://issues.apache.org/jira/browse/YARN-7251
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-7251-YARN-5734.001.patch, 
> YARN-7251-YARN-5734.002.patch
>
>
> Documentation/style changes to YARN-5734 before merge.






[jira] [Comment Edited] (YARN-2037) Add restart support for Unmanaged AMs

2017-09-25 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179887#comment-16179887
 ] 

Botong Huang edited comment on YARN-2037 at 9/25/17 10:39 PM:
--

The unit test failure is unrelated and is being tracked under YARN-7044. The 
patch is tested e2e in our federated cluster, where work-preserving UAM is 
used to span across sub-clusters. 


was (Author: botong):
Unit test failure is irrelevant and being tracked under YARN-7044. The patch is 
tested e2e in our cluster. 

> Add restart support for Unmanaged AMs
> -
>
> Key: YARN-2037
> URL: https://issues.apache.org/jira/browse/YARN-2037
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Karthik Kambatla
>Assignee: Botong Huang
> Attachments: YARN-2037.v1.patch
>
>
> It would be nice to allow Unmanaged AMs also to restart in a work-preserving 
> way. 






[jira] [Comment Edited] (YARN-2037) Add restart support for Unmanaged AMs

2017-09-25 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179887#comment-16179887
 ] 

Botong Huang edited comment on YARN-2037 at 9/25/17 10:38 PM:
--

Unit test failure is irrelevant and being tracked under YARN-7044. The patch is 
tested e2e in our cluster. 


was (Author: botong):
Unit test failure is irrelevant and being tracked under YARN-7044. 

> Add restart support for Unmanaged AMs
> -
>
> Key: YARN-2037
> URL: https://issues.apache.org/jira/browse/YARN-2037
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Karthik Kambatla
>Assignee: Botong Huang
> Attachments: YARN-2037.v1.patch
>
>
> It would be nice to allow Unmanaged AMs also to restart in a work-preserving 
> way. 






[jira] [Commented] (YARN-2037) Add restart support for Unmanaged AMs

2017-09-25 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179887#comment-16179887
 ] 

Botong Huang commented on YARN-2037:


Unit test failure is irrelevant and being tracked under YARN-7044. 

> Add restart support for Unmanaged AMs
> -
>
> Key: YARN-2037
> URL: https://issues.apache.org/jira/browse/YARN-2037
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Karthik Kambatla
>Assignee: Botong Huang
> Attachments: YARN-2037.v1.patch
>
>
> It would be nice to allow Unmanaged AMs also to restart in a work-preserving 
> way. 






[jira] [Updated] (YARN-2037) Add restart support for Unmanaged AMs

2017-09-25 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan updated YARN-2037:
-
Target Version/s: 2.9.0  (was: 2.7.5)

> Add restart support for Unmanaged AMs
> -
>
> Key: YARN-2037
> URL: https://issues.apache.org/jira/browse/YARN-2037
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.4.0
>Reporter: Karthik Kambatla
>Assignee: Botong Huang
> Attachments: YARN-2037.v1.patch
>
>
> It would be nice to allow Unmanaged AMs also to restart in a work-preserving 
> way. 






[jira] [Updated] (YARN-7226) Whitelisted variables do not support delayed variable expansion

2017-09-25 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-7226:
-
Attachment: YARN-7226.003.patch

Attaching a patch that implements the ignore-whitelist-vars-for-Docker approach.

> Whitelisted variables do not support delayed variable expansion
> ---
>
> Key: YARN-7226
> URL: https://issues.apache.org/jira/browse/YARN-7226
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0, 2.8.1, 3.0.0-alpha4
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: YARN-7226.001.patch, YARN-7226.002.patch, 
> YARN-7226.003.patch
>
>
> The nodemanager supports a configurable list of environment variables, via 
> yarn.nodemanager.env-whitelist, that will be propagated to the container's 
> environment unless those variables were specified in the container launch 
> context.  Unfortunately the handling of these whitelisted variables prevents 
> using delayed variable expansion.  For example, if a user shipped their own 
> version of hadoop with their job via the distributed cache and specified:
> {noformat}
> HADOOP_COMMON_HOME={{PWD}}/my-private-hadoop/
> {noformat}
>  as part of their job, the variable will be set as the *literal* string:
> {noformat}
> $PWD/my-private-hadoop/
> {noformat}
> rather than having $PWD expand to the container's current directory as it 
> does for any other, non-whitelisted variable being set to the same value.
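
To illustrate the behavior the description asks for, a toy sketch (assumed 
semantics, not the committed patch) that emits whitelisted variables with the 
shell's default-value expansion so references like $PWD expand inside the 
container:
{code:java}
// Hedged sketch: write the whitelisted variable into the launch script using
// ${VAR:-default} so the container's shell performs the (delayed) expansion.
class WhitelistEnvSketch {
  // If the launch context set the variable (possibly to "$PWD/..."), the
  // shell keeps and expands that value; otherwise it falls back to the NM's.
  static String whitelistExportLine(String name, String nmDefault) {
    return "export " + name + "=${" + name + ":-\"" + nmDefault + "\"}";
  }
}
{code}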






[jira] [Commented] (YARN-7240) Add more states and transitions to stabilize the NM Container state machine

2017-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179851#comment-16179851
 ] 

Hadoop QA commented on YARN-7240:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 18s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 44 new + 288 unchanged - 8 fixed = 332 total (was 296) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 
46s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
13s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 34m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-7240 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12888954/YARN-7240.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 1970297a95d8 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 
13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 0889e5a |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/17629/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/17629/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17629/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add more states and transitions to stabilize the NM Container state machine
> ---
>
> Key: YARN-7240
> URL: 

[jira] [Commented] (YARN-1014) Configure OOM Killer to kill OPPORTUNISTIC containers first

2017-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179841#comment-16179841
 ] 

Hadoop QA commented on YARN-1014:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 
27s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
15s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 38m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-1014 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12876535/YARN-1014.02.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 256d46d797df 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 0889e5a |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/17628/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17628/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Configure OOM Killer to kill OPPORTUNISTIC containers first
> ---
>
> Key: YARN-1014
> URL: https://issues.apache.org/jira/browse/YARN-1014
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Arun C Murthy
>Assignee: Haibo Chen
> Attachments: YARN-1014.00.patch, YARN-1014.01.patch, 
> YARN-1014.02.patch
>
>
> YARN-2882 introduces the notion of OPPORTUNISTIC containers. These containers 
> should 

[jira] [Commented] (YARN-7250) Update Shared cache client api to use URLs

2017-09-25 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179819#comment-16179819
 ] 

Chris Trezzo commented on YARN-7250:


The TestNMClient failure is happening on trunk as well, and is unrelated to 
this patch. The same goes for the TestAMRMClient timeout.

The patch should be good to go. It is technically an incompatible change, but 
this API is marked unstable and the feature is still in an alpha state, so 
there should be no issue. My intention is, pending review, to check this into 
trunk, branch-3.0 and branch-2.

> Update Shared cache client api to use URLs
> --
>
> Key: YARN-7250
> URL: https://issues.apache.org/jira/browse/YARN-7250
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: YARN-7250-trunk-001.patch
>
>
> We should make the SharedCacheClient use api more consistent with other YARN 
> api methods. We can do this by doing two things:
> # Update the SharedCacheClient#use api so that it returns a URL instead of a 
> Path. Currently yarn developers have to convert the path to a URL when 
> creating a LocalResources. It would be much smoother if they could just use a 
> URL passed to them by the shared cache client.
> # Remove the portion of the client that deals with fragments, as this is not 
> consistent with the rest of YARN. This functionality is bleeding in from the 
> MapReduce layer, which uses fragments to keep track of destination file 
> names. YARN's API does not use fragments. Instead the ContainerLaunchContext 
> expects a Map<String, LocalResource> localResources, where the strings are 
> the destination file names. We should let the YARN application handle 
> destination file names however it wants instead of pushing this into the 
> shared cache API. Additionally, fragments are a clunky way to handle this.
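
A hypothetical caller-side sketch of the proposed flow (the {{use}} return 
type shown is the one proposed above, not the current API, and all names are 
illustrative):
{code:java}
// Hedged sketch: with use() returning a URL, the caller can build the
// LocalResource directly and pick its own destination file name.
URL url = sharedCacheClient.use(appId, checksum);   // proposed return type
LocalResource resource = LocalResource.newInstance(url,
    LocalResourceType.FILE, LocalResourceVisibility.PUBLIC,
    resourceSize, resourceTimestamp);
localResources.put("job-helper.jar", resource);     // app-chosen name
{code}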






[jira] [Commented] (YARN-7240) Add more states and transitions to stabilize the NM Container state machine

2017-09-25 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179806#comment-16179806
 ] 

Arun Suresh commented on YARN-7240:
---

bq. Requeuing the containers to the front of the queuedGuaranteed or 
queuedOpportunistic queues would require changing the data structures, or we 
would have to use some temporary auxiliary queues to move this element.
Yeah, this should be fine for now. Let's tackle this in a separate JIRA if 
really required.

> Add more states and transitions to stabilize the NM Container state machine
> ---
>
> Key: YARN-7240
> URL: https://issues.apache.org/jira/browse/YARN-7240
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: kartheek muthyala
> Attachments: YARN-7240.001.patch, YARN-7240.002.patch, 
> YARN-7240.002.patch
>
>
> There seem to be a few intermediate states that can be added to improve the 
> stability of the NM container state machine.
> For example:
> * The REINITIALIZING state should probably be split into REINITIALIZING and 
> REINITIALIZING_AWAITING_KILL. 
> * Container updates are currently handled in the ContainerScheduler, but it 
> would probably be better to have them plumbed through the container state 
> machine as a new state, say UPDATING, and a new container event.
> The plan is also to add some extra tests to try to cover every transition.






[jira] [Updated] (YARN-7240) Add more states and transitions to stabilize the NM Container state machine

2017-09-25 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-7240:
--
Attachment: YARN-7240.002.patch

[~kartheek], looks like the patch has not been rebased.
Re-attaching your last patch.

> Add more states and transitions to stabilize the NM Container state machine
> ---
>
> Key: YARN-7240
> URL: https://issues.apache.org/jira/browse/YARN-7240
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: kartheek muthyala
> Attachments: YARN-7240.001.patch, YARN-7240.002.patch, 
> YARN-7240.002.patch
>
>
> There seem to be a few intermediate states that can be added to improve the 
> stability of the NM container state machine.
> For example:
> * The REINITIALIZING state should probably be split into REINITIALIZING and 
> REINITIALIZING_AWAITING_KILL. 
> * Container updates are currently handled in the ContainerScheduler, but it 
> would probably be better to have them plumbed through the container state 
> machine as a new state, say UPDATING, and a new container event.
> The plan is also to add some extra tests to try to cover every transition.






[jira] [Resolved] (YARN-2915) Enable YARN RM scale out via federation using multiple RM's

2017-09-25 Thread Subru Krishnan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subru Krishnan resolved YARN-2915.
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-beta1
   2.9.0

> Enable YARN RM scale out via federation using multiple RM's
> ---
>
> Key: YARN-2915
> URL: https://issues.apache.org/jira/browse/YARN-2915
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, resourcemanager
>Reporter: Sriram Rao
>Assignee: Subru Krishnan
>  Labels: federation
> Fix For: 2.9.0, 3.0.0-beta1
>
> Attachments: Federation-BoF.pdf, 
> FEDERATION_CAPACITY_ALLOCATION_JIRA.pdf, federation-prototype.patch, 
> Yarn_federation_design_v1.pdf, YARN-Federation-Hadoop-Summit_final.pptx
>
>
> This is an umbrella JIRA that proposes to scale out YARN to support large 
> clusters comprising of tens of thousands of nodes.   That is, rather than 
> limiting a YARN managed cluster to about 4k in size, the proposal is to 
> enable the YARN managed cluster to be elastically scalable.  






[jira] [Commented] (YARN-1014) Configure OOM Killer to kill OPPORTUNISTIC containers first

2017-09-25 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179775#comment-16179775
 ] 

Miklos Szegedi commented on YARN-1014:
--

Thank you for the patch [~haibochen]. I have a few comments:
{code}
try (PrintWriter pw = new PrintWriter(oomAdjFile.getAbsolutePath(),
{code}
It would probably be useful to use the auto-closeable feature of PrintWriter.
Technically oom_adj is ASCII, I think.
Per the Linux man page: since Linux 2.6.36, use of the oom_adj file is 
deprecated in favor of /proc/\[pid\]/oom_score_adj.
I would put all this code into LinuxContainerExecutor.java instead of 
ContainerExecutor.java.
{code}
 * Set OOM Killer Priority for Opportunistic containers.
 */
private void setContainerOomKillerPriority(ContainerId containerId,
    String containerPid) {
{code}
The parameter specifications are missing from the javadoc.
Also, I would call this function setPriorityMaximum, setOpportunisticPriority 
or something similar, since we are setting it to a specific value, not to a 
parameter.
Also, I would create a static final variable to explain the value of 15. The 
value could be retrieved from the container executor.
I do not see the unit test actually testing whether the priority was set or 
not. (A rough sketch combining these suggestions follows.)
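
A rough, self-contained sketch combining those suggestions (the constant's 
value and the proc-file handling are illustrative):
{code:java}
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;

// Hedged sketch: write oom_score_adj (oom_adj is deprecated since Linux
// 2.6.36), use try-with-resources for the writer, and name the magic value.
class OomPrioritySketch {
  private static final int OPPORTUNISTIC_OOM_SCORE_ADJ = 15; // illustrative

  static void setOpportunisticOomPriority(String containerPid)
      throws IOException {
    File oomScoreAdj = new File("/proc/" + containerPid + "/oom_score_adj");
    try (PrintWriter pw = new PrintWriter(oomScoreAdj, "US-ASCII")) {
      pw.write(Integer.toString(OPPORTUNISTIC_OOM_SCORE_ADJ));
    }
  }
}
{code}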


> Configure OOM Killer to kill OPPORTUNISTIC containers first
> ---
>
> Key: YARN-1014
> URL: https://issues.apache.org/jira/browse/YARN-1014
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 3.0.0-alpha1
>Reporter: Arun C Murthy
>Assignee: Haibo Chen
> Attachments: YARN-1014.00.patch, YARN-1014.01.patch, 
> YARN-1014.02.patch
>
>
> YARN-2882 introduces the notion of OPPORTUNISTIC containers. These containers 
> should be killed first should the system run out of memory. 
> -
> Previous description:
> Once RM allocates 'speculative containers' we need to get LCE to schedule 
> them at lower priorities via cgroups.






[jira] [Assigned] (YARN-7251) Misc changes to YARN-5734

2017-09-25 Thread Jonathan Hung (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung reassigned YARN-7251:
---

Assignee: Jonathan Hung

> Misc changes to YARN-5734
> -
>
> Key: YARN-7251
> URL: https://issues.apache.org/jira/browse/YARN-7251
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-7251-YARN-5734.001.patch
>
>
> Documentation/style changes to YARN-5734 before merge.






[jira] [Updated] (YARN-7248) NM returns new SCHEDULED container status to older clients

2017-09-25 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh updated YARN-7248:
--
Attachment: YARN-7248.001.patch

Attaching initial patch based on discussions above.
* Removed the SCHEDULED and PAUSED states from api.records.ContainerState.
* ContainerStatus already seems to have a map of container attributes used to 
send the IP, host, etc. I added an extra string key called "ExtraStateInfo" 
which contains the stringified internal container state as well. [~jlowe], I 
guess this should make it more extensible than having a new enum?
* Updated all the test cases.


> NM returns new SCHEDULED container status to older clients
> --
>
> Key: YARN-7248
> URL: https://issues.apache.org/jira/browse/YARN-7248
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Jason Lowe
>Assignee: Arun Suresh
>Priority: Blocker
> Attachments: YARN-7248.001.patch
>
>
> YARN-4597 added a new SCHEDULED container state and that state is returned to 
> clients when the container is localizing, etc.  However the client may be 
> running on an older software version that does not have the new SCHEDULED 
> state which could lead the client to crash on the unexpected container state 
> value or make incorrect assumptions like any state != NEW and != RUNNING must 
> be COMPLETED which was true in the older version.






[jira] [Assigned] (YARN-7248) NM returns new SCHEDULED container status to older clients

2017-09-25 Thread Arun Suresh (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun Suresh reassigned YARN-7248:
-

Assignee: Arun Suresh

> NM returns new SCHEDULED container status to older clients
> --
>
> Key: YARN-7248
> URL: https://issues.apache.org/jira/browse/YARN-7248
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Jason Lowe
>Assignee: Arun Suresh
>Priority: Blocker
>
> YARN-4597 added a new SCHEDULED container state and that state is returned to 
> clients when the container is localizing, etc.  However the client may be 
> running on an older software version that does not have the new SCHEDULED 
> state which could lead the client to crash on the unexpected container state 
> value or make incorrect assumptions like any state != NEW and != RUNNING must 
> be COMPLETED which was true in the older version.






[jira] [Commented] (YARN-7249) Fix CapacityScheduler NPE issue when a container preempted while the node is being removed

2017-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179720#comment-16179720
 ] 

Hadoop QA commented on YARN-7249:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
55s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2.8 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
37s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} branch-2.8 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} branch-2.8 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}154m 28s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}184m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.monitor.TestSchedulingMonitor |
|   | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|   | hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector |
|   | hadoop.yarn.server.resourcemanager.TestRMDispatcher |
|   | hadoop.yarn.server.resourcemanager.ahs.TestRMApplicationHistoryWriter |
|   | hadoop.yarn.server.resourcemanager.scheduler.fair.TestSchedulingPolicy |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler |
|   | hadoop.yarn.server.resourcemanager.scheduler.TestAbstractYarnScheduler |
|   | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerSurgicalPreemption
 |
|   | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
|   | hadoop.yarn.server.resourcemanager.TestApplicationMasterLauncher |
| Timed out junit tests | 
org.apache.hadoop.yarn.server.resourcemanager.TestRMHA |
|   | org.apache.hadoop.yarn.server.resourcemanager.recovery.TestFSRMStateStore 
|
|   | org.apache.hadoop.yarn.server.resourcemanager.TestRMAdminService |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:c2d96dd |
| JIRA Issue | YARN-7249 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12888909/YARN-7249.branch-2.8.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux bc7a1463639c 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 
12:18:55 UTC 2017 x86_64 x86_64 

[jira] [Commented] (YARN-7226) Whitelisted variables do not support delayed variable expansion

2017-09-25 Thread Sidharta Seethana (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179698#comment-16179698
 ] 

Sidharta Seethana commented on YARN-7226:
-

{quote}
But how does this work today for Docker containers? The launch script is using 
the var:-value syntax, which means if the image specified the value then it 
will not take the value the user desires. In other words, it looks like for 
Docker containers the semantics of the whitelist is the list of variables that 
cannot be overridden by the container.
I hope we can all agree that if the container explicitly sets an environment 
variable then that variable should be set to the value the user specified. I 
think the only issue then is what to do about variables that are getting 
implicitly set via the NM whitelist that end up overriding the Docker image 
variables unintentionally.
{quote}

[~jlowe], you are right on both counts - short of moving more of the common 
functionality into the executors/container runtimes, there wasn't a better way 
of handling this at the time. The goal was to ensure that non-docker 
containers' behavior is not affected (however, based on this JIRA, it looks 
like there was a scenario that wasn't considered). 

The approach of moving the implementation into the executors/runtimes makes 
sense to me as well - though keeping the semantics in sync (to the extent 
possible) is something we'll need to be careful about going forward. Maybe, at 
some point, the docker implementation could use a different approach 
altogether (e.g., specifying these env vars as part of the docker run 
command). 
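To make the two behaviors concrete, a minimal sketch of the export lines being 
discussed (illustrative only - the variable value and fallback path are made 
up, and the real logic lives in the NM's launch-script generation):
{code}
// Sketch: how the generated launch script treats the two kinds of variables.
String name = "HADOOP_COMMON_HOME";

// User-specified variable: written verbatim, so $PWD is expanded by the
// container's shell at launch time (delayed expansion).
String userSpecified = "export " + name + "=\"$PWD/my-private-hadoop/\"\n";

// Whitelisted variable with the var:-value syntax: a value already present in
// the environment (e.g. set by the Docker image) wins; otherwise the NM's
// value (here a made-up path) is used as the fallback.
String whitelisted =
    "export " + name + "=${" + name + ":-\"/usr/lib/hadoop\"}\n";
{code}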





> Whitelisted variables do not support delayed variable expansion
> ---
>
> Key: YARN-7226
> URL: https://issues.apache.org/jira/browse/YARN-7226
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0, 2.8.1, 3.0.0-alpha4
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: YARN-7226.001.patch, YARN-7226.002.patch
>
>
> The nodemanager supports a configurable list of environment variables, via 
> yarn.nodemanager.env-whitelist, that will be propagated to the container's 
> environment unless those variables were specified in the container launch 
> context.  Unfortunately the handling of these whitelisted variables prevents 
> using delayed variable expansion.  For example, if a user shipped their own 
> version of hadoop with their job via the distributed cache and specified:
> {noformat}
> HADOOP_COMMON_HOME={{PWD}}/my-private-hadoop/
> {noformat}
>  as part of their job, the variable will be set as the *literal* string:
> {noformat}
> $PWD/my-private-hadoop/
> {noformat}
> rather than having $PWD expand to the container's current directory as it 
> does for any other, non-whitelisted variable being set to the same value.






[jira] [Commented] (YARN-7226) Whitelisted variables do not support delayed variable expansion

2017-09-25 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179676#comment-16179676
 ] 

Eric Badger commented on YARN-7226:
---

[~jlowe], this approach makes sense to me. I don't see any case where the 
docker container would want to use the NM's env vars over its own specified env 
vars, since the layout of the docker container is completely separate from that 
of the NM. So basically in the docker case, the docker container will either 
get what the user explicitly sets or whatever is in the docker image (in that 
order).

And if we want to have variables that won't be overridden by anything (i.e. we 
take whatever the docker container sets no matter what), then we should do that 
in a different place than the whitelist and should file a followup JIRA. 

> Whitelisted variables do not support delayed variable expansion
> ---
>
> Key: YARN-7226
> URL: https://issues.apache.org/jira/browse/YARN-7226
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0, 2.8.1, 3.0.0-alpha4
>Reporter: Jason Lowe
>Assignee: Jason Lowe
> Attachments: YARN-7226.001.patch, YARN-7226.002.patch
>
>
> The nodemanager supports a configurable list of environment variables, via 
> yarn.nodemanager.env-whitelist, that will be propagated to the container's 
> environment unless those variables were specified in the container launch 
> context.  Unfortunately the handling of these whitelisted variables prevents 
> using delayed variable expansion.  For example, if a user shipped their own 
> version of hadoop with their job via the distributed cache and specified:
> {noformat}
> HADOOP_COMMON_HOME={{PWD}}/my-private-hadoop/
> {noformat}
>  as part of their job, the variable will be set as the *literal* string:
> {noformat}
> $PWD/my-private-hadoop/
> {noformat}
> rather than having $PWD expand to the container's current directory as it 
> does for any other, non-whitelisted variable being set to the same value.






[jira] [Commented] (YARN-7241) Merge YARN-5734 to trunk/branch-2

2017-09-25 Thread Jonathan Hung (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179675#comment-16179675
 ] 

Jonathan Hung commented on YARN-7241:
-

Hi [~leftnoteasy], attached a patch to YARN-7251 addressing these comments 
other than the ones I mentioned earlier.

Also attaching a 002 full diff here based on these changes.



> Merge YARN-5734 to trunk/branch-2
> -
>
> Key: YARN-7241
> URL: https://issues.apache.org/jira/browse/YARN-7241
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-7241.001.patch, YARN-7241.002.patch
>
>
> Ticket for jenkins pre-commit for full diff.






[jira] [Updated] (YARN-7241) Merge YARN-5734 to trunk/branch-2

2017-09-25 Thread Jonathan Hung (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-7241:

Attachment: YARN-7241.002.patch

> Merge YARN-5734 to trunk/branch-2
> -
>
> Key: YARN-7241
> URL: https://issues.apache.org/jira/browse/YARN-7241
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-7241.001.patch, YARN-7241.002.patch
>
>
> Ticket for jenkins pre-commit for full diff.






[jira] [Commented] (YARN-2497) Changes for fair scheduler to support allocate resource respect labels

2017-09-25 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179674#comment-16179674
 ] 

Daniel Templeton commented on YARN-2497:


[~leftnoteasy], it looks to me like {{NODE_LABEL_EXPRESSION_NOT_SET}} has to be 
{{""}}, because it gets submitted as the label expression for the app... Unless 
we want to add a bunch of checks for {{NODE_LABEL_EXPRESSION_NOT_SET}} in the 
scheduler.  I think the better approach would be to do the formatting when we 
print the label expression.  If you don't want duplicate code, we can add a 
helper method.
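Something like this minimal sketch is what I mean by a helper (the method name 
and the printed placeholder are just for illustration):
{code}
// Sketch of a shared formatting helper; it is only used when *printing* the
// label expression, so the app still submits "" (NODE_LABEL_EXPRESSION_NOT_SET).
public static String formatNodeLabelExpression(String labelExpression) {
  if (labelExpression == null || labelExpression.isEmpty()) {
    return "<DEFAULT_PARTITION>"; // illustrative placeholder text
  }
  return labelExpression;
}
{code}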

> Changes for fair scheduler to support allocate resource respect labels
> --
>
> Key: YARN-2497
> URL: https://issues.apache.org/jira/browse/YARN-2497
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Wangda Tan
>Assignee: Daniel Templeton
> Attachments: YARN-2497.001.patch, YARN-2497.002.patch, 
> YARN-2497.003.patch, YARN-2497.004.patch, YARN-2499.WIP01.patch
>
>







[jira] [Commented] (YARN-7251) Misc changes to YARN-5734

2017-09-25 Thread Jonathan Hung (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179670#comment-16179670
 ] 

Jonathan Hung commented on YARN-7251:
-

Attaching 001 patch based on comments in 
https://issues.apache.org/jira/browse/YARN-7241?focusedCommentId=16179430=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16179430.
 Also added schedulerconf in yarn/yarn.cmd printUsage.

Not triggering jenkins since this patch depends on documentation in YARN-7238, 
which is not committed yet.

> Misc changes to YARN-5734
> -
>
> Key: YARN-7251
> URL: https://issues.apache.org/jira/browse/YARN-7251
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
> Attachments: YARN-7251-YARN-5734.001.patch
>
>
> Documentation/style changes to YARN-5734 before merge.






[jira] [Updated] (YARN-2497) Changes for fair scheduler to support allocate resource respect labels

2017-09-25 Thread Daniel Templeton (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Templeton updated YARN-2497:
---
Attachment: YARN-2497.004.patch

I found the bug that was causing the CS test failures.  I also added some more 
functionality and tests.  Let's see how this one goes.

> Changes for fair scheduler to support allocate resource respect labels
> --
>
> Key: YARN-2497
> URL: https://issues.apache.org/jira/browse/YARN-2497
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Wangda Tan
>Assignee: Daniel Templeton
> Attachments: YARN-2497.001.patch, YARN-2497.002.patch, 
> YARN-2497.003.patch, YARN-2497.004.patch, YARN-2499.WIP01.patch
>
>







[jira] [Updated] (YARN-7251) Misc changes to YARN-5734

2017-09-25 Thread Jonathan Hung (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hung updated YARN-7251:

Attachment: YARN-7251-YARN-5734.001.patch

> Misc changes to YARN-5734
> -
>
> Key: YARN-7251
> URL: https://issues.apache.org/jira/browse/YARN-7251
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
> Attachments: YARN-7251-YARN-5734.001.patch
>
>
> Documentation/style changes to YARN-5734 before merge.






[jira] [Commented] (YARN-7250) Update Shared cache client api to use URLs

2017-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179656#comment-16179656
 ] 

Hadoop QA commented on YARN-7250:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 31m 20s{color} 
| {color:red} hadoop-yarn-client in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 51m 57s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.api.impl.TestNMClient |
| Timed out junit tests | org.apache.hadoop.yarn.client.api.impl.TestAMRMClient 
|
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-7250 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12888926/YARN-7250-trunk-001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 0fb59ac33e3c 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 
18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / e928ee5 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/17625/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/17625/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/17625/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |





> Update Shared cache client api to use URLs
> --
>
> Key: YARN-7250
> URL: https://issues.apache.org/jira/browse/YARN-7250
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chris Trezzo
>Assignee: Chris 

[jira] [Commented] (YARN-7249) Fix CapacityScheduler NPE issue when a container preempted while the node is being removed

2017-09-25 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179653#comment-16179653
 ] 

Eric Payne commented on YARN-7249:
--

[~leftnoteasy], I recognize that calling {{queue.completedContainer}} in 
{{CapacityScheduler#completedContainerInternal}} doesn't make sense if {{node}} 
is null, but if {{queue.completedContainer}} isn't called, won't that leave 
references to the container still inside internal structures? And, for example, 
won't counters that were incremented for the reservation be left 
un-decremented?
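For reference, the guard the description implies would look roughly like this 
(a sketch with assumed names and surrounding code; my question above is exactly 
about the accounting such an early return skips):
{code}
// Sketch of the null-node check described in the issue; not the actual patch.
FiCaSchedulerNode node = getNode(rmContainer.getContainer().getNodeId());
if (node == null) {
  LOG.warn("Skipping completedContainer for " + rmContainer.getContainerId()
      + ": node was already removed from the scheduler");
  return; // skips queue.completedContainer(), hence the accounting question
}
queue.completedContainer(getClusterResource(), application, node, rmContainer,
    containerStatus, event, null, true);
{code}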

> Fix CapacityScheduler NPE issue when a container preempted while the node is 
> being removed
> --
>
> Key: YARN-7249
> URL: https://issues.apache.org/jira/browse/YARN-7249
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-7249.branch-2.8.001.patch
>
>
> This issue could happen when 3 conditions satisfied:
> 1) A node is removing from scheduler.
> 2) A container running on the node is being preempted. 
> 3) A rare race condition causes scheduler pass a null node to leaf queue.
> Fix of the problem is to add a null node check inside CapacityScheduler.
> Stack trace:
> {code}
> 2017-08-31 02:51:24,748 FATAL resourcemanager.ResourceManager 
> (ResourceManager.java:run(714)) - Error in handling event type 
> KILL_RESERVED_CONTAINER to the scheduler 
> java.lang.NullPointerException 
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1308)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainerInternal(CapacityScheduler.java:1469)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:497)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.killReservedContainer(CapacityScheduler.java:1505)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1341)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:127)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:705)
>  
> {code}
> This is an issue only existed in 2.8.x






[jira] [Updated] (YARN-7250) Update Shared cache client api to use URLs

2017-09-25 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated YARN-7250:
---
Description: 
We should make the SharedCacheClient use api more consistent with other YARN 
api methods. We can do this by doing two things:
# Update the SharedCacheClient#use api so that it returns a URL instead of a 
Path. Currently yarn developers have to convert the path to a URL when creating 
a LocalResource. It would be much smoother if they could just use a URL passed 
to them by the shared cache client.
# Remove the portion of the client that deals with fragments as this is not 
consistent with the rest of YARN. This functionality is bleeding in from the 
MapReduce layer, which uses fragments to keep track of destination file names. 
YARN's api does not use fragments. Instead, the ContainerLaunchContext expects 
a Map<String, LocalResource> localResources, where the strings are the 
destination file names. We should let the YARN application handle destination 
file names however it wants instead of pushing this into the shared cache api. 
Additionally, fragments are a clunky way to handle this.

  was:
We should make the SharedCacheClient use api more consistent with other YARN 
api methods. We can do this by doing two things:
# Update the SharedCacheClient#use api so that it returns a URL instead of a 
Path. Currently yarn developers have to convert the path to a URL when creating 
a LocalResource. It would be much smoother if they could just use a URL passed 
to them by the shared cache client.
# Remove the portion of the client that deals with fragments as this is not 
consistent with the rest of YARN. This functionality is bleeding in from the 
MapReduce layer, which uses fragments to keep track of destination file names. 
YARN's api does not use fragments. Instead, the ContainerLaunchContext expects 
a Map<String, LocalResource> localResources, where the strings are the 
destination file names. We should let the YARN application handle destination 
file names however it wants instead of pushing this into the shared cache api. 
Additionally, fragments is a clunky way to handle this.


> Update Shared cache client api to use URLs
> --
>
> Key: YARN-7250
> URL: https://issues.apache.org/jira/browse/YARN-7250
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: YARN-7250-trunk-001.patch
>
>
> We should make the SharedCacheClient use api more consistent with other YARN 
> api methods. We can do this by doing two things:
> # Update the SharedCacheClient#use api so that it returns a URL instead of a 
> Path. Currently yarn developers have to convert the path to a URL when 
> creating a LocalResource. It would be much smoother if they could just use a 
> URL passed to them by the shared cache client.
> # Remove the portion of the client that deals with fragments as this is not 
> consistent with the rest of YARN. This functionality is bleeding in from the 
> MapReduce layer, which uses fragments to keep track of destination file 
> names. YARN's api does not use fragments. Instead, the ContainerLaunchContext 
> expects a Map<String, LocalResource> localResources, where the strings are 
> the destination file names. We should let the YARN application handle 
> destination file names however it wants instead of pushing this into the 
> shared cache api. Additionally, fragments are a clunky way to handle this.






[jira] [Commented] (YARN-7249) Fix CapacityScheduler NPE issue when a container preempted while the node is being removed

2017-09-25 Thread Eric Payne (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179553#comment-16179553
 ] 

Eric Payne commented on YARN-7249:
--

[~leftnoteasy]: Sure. Looking now.

> Fix CapacityScheduler NPE issue when a container preempted while the node is 
> being removed
> --
>
> Key: YARN-7249
> URL: https://issues.apache.org/jira/browse/YARN-7249
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-7249.branch-2.8.001.patch
>
>
> This issue could happen when 3 conditions satisfied:
> 1) A node is removing from scheduler.
> 2) A container running on the node is being preempted. 
> 3) A rare race condition causes scheduler pass a null node to leaf queue.
> Fix of the problem is to add a null node check inside CapacityScheduler.
> Stack trace:
> {code}
> 2017-08-31 02:51:24,748 FATAL resourcemanager.ResourceManager 
> (ResourceManager.java:run(714)) - Error in handling event type 
> KILL_RESERVED_CONTAINER to the scheduler 
> java.lang.NullPointerException 
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1308)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainerInternal(CapacityScheduler.java:1469)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:497)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.killReservedContainer(CapacityScheduler.java:1505)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1341)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:127)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:705)
>  
> {code}
> This is an issue only existed in 2.8.x






[jira] [Commented] (YARN-7240) Add more states and transitions to stabilize the NM Container state machine

2017-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179550#comment-16179550
 ] 

Hadoop QA commented on YARN-7240:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m  
3s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
17s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 19s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 22s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 43 new + 288 unchanged - 8 fixed = 331 total (was 296) 
{color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
19s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
16s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
19s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager
 generated 4 new + 103 unchanged - 0 fixed = 107 total (was 103) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 19s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 22m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-7240 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12888922/YARN-7240.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux e2d3ccb54916 3.13.0-123-generic #172-Ubuntu SMP Mon Jun 26 
18:04:35 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / e928ee5 |
| Default Java | 1.8.0_144 |
| findbugs | v3.1.0-RC1 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-YARN-Build/17624/artifact/patchprocess/patch-mvninstall-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-YARN-Build/17624/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| javac | 
https://builds.apache.org/job/PreCommit-YARN-Build/17624/artifact/patchprocess/patch-compile-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
| checkstyle | 

[jira] [Commented] (YARN-6623) Add support to turn off launching privileged containers in the container-executor

2017-09-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179552#comment-16179552
 ] 

Hadoop QA commented on YARN-6623:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 13 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
43s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
23s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
5s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  5m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 23 unchanged - 4 fixed = 23 total (was 27) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 59s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 
35s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
18s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}142m 59s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:71bbb86 |
| JIRA Issue | YARN-6623 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12888901/YARN-6623.013.patch |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  javadoc  
mvninstall  

[jira] [Created] (YARN-7251) Misc changes to YARN-5734

2017-09-25 Thread Jonathan Hung (JIRA)
Jonathan Hung created YARN-7251:
---

 Summary: Misc changes to YARN-5734
 Key: YARN-7251
 URL: https://issues.apache.org/jira/browse/YARN-7251
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Jonathan Hung


Documentation/style changes to YARN-5734 before merge.






[jira] [Updated] (YARN-7250) Update Shared cache client api to use URLs

2017-09-25 Thread Chris Trezzo (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Trezzo updated YARN-7250:
---
Attachment: YARN-7250-trunk-001.patch

Attached v1 patch.
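For reviewers, a short sketch of the intended caller experience with the 
URL-returning api (the signature and the values here are assumptions based on 
the description, not the patch itself):
{code}
// Sketch: with use() returning a URL, callers build a LocalResource directly
// and pick the destination file name themselves (no fragments involved).
URL url = sharedCacheClient.use(appId, resourceChecksum); // proposed signature
LocalResource resource = LocalResource.newInstance(
    url,
    LocalResourceType.FILE,
    LocalResourceVisibility.PUBLIC,
    size,        // resource size in bytes (assumed known by the caller)
    timestamp);  // resource modification time
localResources.put("my-lib.jar", resource); // app-chosen destination name
{code}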

> Update Shared cache client api to use URLs
> --
>
> Key: YARN-7250
> URL: https://issues.apache.org/jira/browse/YARN-7250
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Chris Trezzo
>Assignee: Chris Trezzo
>Priority: Minor
> Attachments: YARN-7250-trunk-001.patch
>
>
> We should make the SharedCacheClient use api more consistent with other YARN 
> api methods. We can do this by doing two things:
> # Update the SharedCacheClient#use api so that it returns a URL instead of a 
> Path. Currently yarn developers have to convert the path to a URL when 
> creating a LocalResource. It would be much smoother if they could just use a 
> URL passed to them by the shared cache client.
> # Remove the portion of the client that deals with fragments as this is not 
> consistent with the rest of YARN. This functionality is bleeding in from the 
> MapReduce layer, which uses fragments to keep track of destination file 
> names. YARN's api does not use fragments. Instead, the ContainerLaunchContext 
> expects a Map<String, LocalResource> localResources, where the strings are 
> the destination file names. We should let the YARN application handle 
> destination file names however it wants instead of pushing this into the 
> shared cache api. Additionally, fragments is a clunky way to handle this.






[jira] [Created] (YARN-7250) Update Shared cache client api to use URLs

2017-09-25 Thread Chris Trezzo (JIRA)
Chris Trezzo created YARN-7250:
--

 Summary: Update Shared cache client api to use URLs
 Key: YARN-7250
 URL: https://issues.apache.org/jira/browse/YARN-7250
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Chris Trezzo
Assignee: Chris Trezzo
Priority: Minor


We should make the SharedCacheClient use api more consistent with other YARN 
api methods. We can do this by doing two things:
# Update the SharedCacheClient#use api so that it returns a URL instead of a 
Path. Currently yarn developers have to convert the path to a URL when creating 
a LocalResource. It would be much smoother if they could just use a URL passed 
to them by the shared cache client.
# Remove the portion of the client that deals with fragments as this is not 
consistent with the rest of YARN. This functionality is bleeding in from the 
MapReduce layer, which uses fragments to keep track of destination file names. 
YARN's api does not use fragments. Instead, the ContainerLaunchContext expects 
a Map<String, LocalResource> localResources, where the strings are the 
destination file names. We should let the YARN application handle destination 
file names however it wants instead of pushing this into the shared cache api. 
Additionally, fragments is a clunky way to handle this.






[jira] [Commented] (YARN-7241) Merge YARN-5734 to trunk/branch-2

2017-09-25 Thread Jonathan Hung (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179520#comment-16179520
 ] 

Jonathan Hung commented on YARN-7241:
-

Thanks for the feedback [~leftnoteasy] - most of these changes make sense to 
me, but I had a few concerns:
bq. for CLI: schedconf => scheduler-conf
How about schedulerconf, for consistency with the other yarn cmds?
bq. It looks like we may not need to add anything to 
CapacitySchedulerConfiguration. CS_CONF_PROVIDER can be replaced by 
SCHEDULER_CONFIGURATION_STORE_CLASS, correct?
Not sure about this; currently CS_CONF_PROVIDER is used for enabling the 
feature (file vs API), and SCHEDULER_CONFIGURATION_STORE_CLASS for the store 
implementation. I suppose we could do something like: if 
SCHEDULER_CONFIGURATION_STORE_CLASS is null, then use the file-based conf 
provider, else use the store-based conf provider (see the sketch below). What 
do you think?
bq. "Memory" should not be a default option value, we should put "zk" as 
default. (Need to change CapacityScheduler.md)
Currently the default implementation of RMStateStore is also memory. Also, if 
they enable this feature and it defaults to zk, startup will fail since they 
also need to configure other ZooKeeper-related configs. Not sure if this is 
the behavior we want. To me it makes sense to default to a naive (memory) 
implementation, and if they have specific requirements (e.g. RM HA) then they 
can configure it to use zk. What do you think?
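To make the store-class fallback above concrete, a rough sketch (the provider 
class names are my reading of the YARN-5734 branch; treat as illustrative):
{code}
// Sketch: "no store class configured" => keep the existing file-based flow.
String storeClass =
    conf.get(YarnConfiguration.SCHEDULER_CONFIGURATION_STORE_CLASS);
if (storeClass == null) {
  // Feature off: existing capacity-scheduler.xml based provider.
  csConfProvider = new FileBasedCSConfigurationProvider(rmContext);
} else {
  // Feature on: mutable provider backed by the configured store (memory, zk).
  csConfProvider = new MutableCSConfigurationProvider(rmContext);
}
{code}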

> Merge YARN-5734 to trunk/branch-2
> -
>
> Key: YARN-7241
> URL: https://issues.apache.org/jira/browse/YARN-7241
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-7241.001.patch
>
>
> Ticket for jenkins pre-commit for full diff.






[jira] [Updated] (YARN-7240) Add more states and transitions to stabilize the NM Container state machine

2017-09-25 Thread kartheek muthyala (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kartheek muthyala updated YARN-7240:

Attachment: YARN-7240.002.patch

[~asuresh], Thank you for the quick feedback.

The updated patch has the following changes:
1. New test cases in TestContainerSchedulerQueuing for the paused and promoted 
containers, which check the container states and their event flow through the 
listener class.
2. Requeuing the containers to the front of the queuedGuaranteed or 
queuedOpportunistic queues would require changing the data structures, or we 
would have to use some temporary auxiliary queues to move the element; I am not 
sure that would be optimal (see the sketch below). Let me know what you think.
3. Removed the TODO comment.
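For point 2, a rough sketch of what switching the data structure would buy us 
(field and variable names are illustrative; the actual ContainerScheduler 
structures may differ):
{code}
// Sketch: a Deque supports re-queuing at the front without auxiliary queues.
// imports: java.util.Deque, java.util.LinkedList
Deque<Container> queuedGuaranteedContainers = new LinkedList<>();

// Normal queuing goes to the back...
queuedGuaranteedContainers.addLast(newContainer);
// ...while a paused or preempted container can be put back at the front.
queuedGuaranteedContainers.addFirst(resumedContainer);
{code}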

> Add more states and transitions to stabilize the NM Container state machine
> ---
>
> Key: YARN-7240
> URL: https://issues.apache.org/jira/browse/YARN-7240
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Arun Suresh
>Assignee: kartheek muthyala
> Attachments: YARN-7240.001.patch, YARN-7240.002.patch
>
>
> There seem to be a few intermediate states that can be added to improve the 
> stability of the NM container state machine.
> For. eg:
> * The REINITIALIZING should probably be split into REINITIALIZING and 
> REINITIALIZING_AWAITING_KILL. 
> * Container updates are currently handled in the ContainerScheduler, but it 
> would probably be better to have it plumbed through the container state 
> machine as a new state, say UPDATING and a new container event.
> The plan is to add some extra tests too to try and test every transition.






[jira] [Commented] (YARN-7248) NM returns new SCHEDULED container status to older clients

2017-09-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179469#comment-16179469
 ] 

Jason Lowe commented on YARN-7248:
--

Sounds good.  We'll be all set until someone comes along in Hadoop 3.4 and 
wants to add another substate. ;-)

> NM returns new SCHEDULED container status to older clients
> --
>
> Key: YARN-7248
> URL: https://issues.apache.org/jira/browse/YARN-7248
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Jason Lowe
>Priority: Blocker
>
> YARN-4597 added a new SCHEDULED container state and that state is returned to 
> clients when the container is localizing, etc.  However the client may be 
> running on an older software version that does not have the new SCHEDULED 
> state which could lead the client to crash on the unexpected container state 
> value or make incorrect assumptions like any state != NEW and != RUNNING must 
> be COMPLETED which was true in the older version.






[jira] [Commented] (YARN-7248) NM returns new SCHEDULED container status to older clients

2017-09-25 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179458#comment-16179458
 ] 

Arun Suresh commented on YARN-7248:
---

Ah yes, we had included LOCALIZING in the reported SCHEDULED state as well.

bq. Another way to tackle this is to move the subset of states related to 
RUNNING into another enum rather than extending the existing one
I like this - let me give it a shot. Essentially, I will:
# remove the api.records.ContainerState.SCHEDULED state completely
# add a new api record, say {{api.records.ContainerSubState}}, which can have 
LOCALIZING, LOCALIZING_FAILED, SCHEDULED, PAUSED, RESUMING, etc., and which 
will be returned along with the container status; clients can choose to ignore 
it.
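Roughly, something like the following sketch for the new record (exact values 
and annotations to be finalized):
{code}
// Sketch of the proposed sub-state record; values follow the list above.
@Public
@Unstable
public enum ContainerSubState {
  LOCALIZING,
  LOCALIZING_FAILED,
  SCHEDULED,
  PAUSED,
  RESUMING
}
{code}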

Sounds good?


> NM returns new SCHEDULED container status to older clients
> --
>
> Key: YARN-7248
> URL: https://issues.apache.org/jira/browse/YARN-7248
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Jason Lowe
>Priority: Blocker
>
> YARN-4597 added a new SCHEDULED container state and that state is returned to 
> clients when the container is localizing, etc.  However the client may be 
> running on an older software version that does not have the new SCHEDULED 
> state which could lead the client to crash on the unexpected container state 
> value or make incorrect assumptions like any state != NEW and != RUNNING must 
> be COMPLETED which was true in the older version.






[jira] [Commented] (YARN-6570) No logs were found for running application, running container

2017-09-25 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179434#comment-16179434
 ] 

Arun Suresh commented on YARN-6570:
---

Got it, thanks.

> No logs were found for running application, running container
> -
>
> Key: YARN-6570
> URL: https://issues.apache.org/jira/browse/YARN-6570
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-beta1, 3.1.0
>
> Attachments: YARN-6570-branch-2.8.001.patch, 
> YARN-6570-branch-2.8.002.patch, YARN-6570.poc.patch, YARN-6570-v2.patch, 
> YARN-6570-v3.patch
>
>
> 1. Obtain running containers from the following CLI for the running application:
>  yarn  container -list appattempt
> 2. Could not fetch logs: 
> {code}
> Can not find any log file matching the pattern: ALL for the container
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7241) Merge YARN-5734 to trunk/branch-2

2017-09-25 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179430#comment-16179430
 ] 

Wangda Tan commented on YARN-7241:
--

1) For the CLI: {{schedconf}} => {{scheduler-conf}}
- Need to change yarn/yarn.cmd, as well as {{SchedConfCLI#printUsage}}.

2) SchedConfCLI
- Annotation: {{Evolving}} => {{Unstable}}
- In {{printUsage}}, please note that this is an alpha feature, that all options 
could change in the future, and that it should not be used in production.
- Do you think it is better to use ":" to separate queues from their parameters, 
such as:
{code}
-add root.a:capacity=10,user-limit=50;root.b...
-update root.a:capacity=10 ...
{code}
- To be consistent, it's better to use ";" to separate queues in {{-remove}} as 
well.
- Also, please give some examples in printUsage. The current text is too 
abstract for people to understand:
{code}
System.out.println("yarn schedconf [-add queueAddPath1,confKey1=confVal1,"
    + "confKey2=confVal2;queueAddPath2,confKey3=confVal3] "
    + "[-remove queueRemovePath1,queueRemovePath2] "
    + "[-update queueUpdatePath1,confKey1=confVal1] "
    + "[-global globalConfKey1=globalConfVal1,"
    + "globalConfKey2=globalConfVal2]");
{code}
Give a concrete example like {{-add root.a:capacity=100}} (see the sketch after 
this list).
- You may need to update YarnCommands.md as well.

3) Configurations:
- It looks like we may not need to add anything to 
CapacitySchedulerConfiguration. {{CS_CONF_PROVIDER}} can be replaced by 
{{SCHEDULER_CONFIGURATION_STORE_CLASS}}, correct?
- Also, please mark all newly added {{public static final String}} fields in 
YarnConfiguration as {{@Private/@Unstable}}.
- Inside yarn-default.xml:
Please add a note to {{yarn.scheduler.configuration.store.class}} that it only 
takes effect when the scheduler supports mutable configuration, and note that 
currently only CapacityScheduler does.
- "Memory" should not be the default option value; we should make "zk" the 
default. (Need to change CapacityScheduler.md.)

4) Rest endpoint: 
- {{/sched-conf}} => {{/scheduler-conf}}
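
To make the printUsage suggestion in 2) concrete, here is one possible shape for 
the usage text, assuming the ":" separator proposed above is adopted (the 
command name and option syntax are suggestions only, not the final CLI):
{code}
System.out.println("usage: yarn scheduler-conf [-add <queuePath:key=val[,key=val...][;queuePath:...]>]");
System.out.println("                           [-update <queuePath:key=val[,key=val...][;queuePath:...]>]");
System.out.println("                           [-remove <queuePath[;queuePath...]>]");
System.out.println("                           [-global <key=val[,key=val...]>]");
System.out.println("examples:");
System.out.println("  yarn scheduler-conf -add root.a:capacity=10,user-limit=50;root.b:capacity=20");
System.out.println("  yarn scheduler-conf -update root.a:capacity=100");
System.out.println("  yarn scheduler-conf -remove root.a;root.b");
{code}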


> Merge YARN-5734 to trunk/branch-2
> -
>
> Key: YARN-7241
> URL: https://issues.apache.org/jira/browse/YARN-7241
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
> Attachments: YARN-7241.001.patch
>
>
> Ticket for jenkins pre-commit for full diff.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6570) No logs were found for running application, running container

2017-09-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179408#comment-16179408
 ] 

Jason Lowe commented on YARN-6570:
--

Yes, which is why I didn't revert this from branch-2 or trunk, just branch-2.8. 
 YARN-4597 is not in branch-2.8, so this JIRA's branch-2.8 patch added a new 
SCHEDULED state without updating all the places that handle the ContainerState 
enum.  Even if it had, we would still have the backwards compatibility issues 
being tracked by YARN-7248.

> No logs were found for running application, running container
> -
>
> Key: YARN-6570
> URL: https://issues.apache.org/jira/browse/YARN-6570
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-beta1, 3.1.0
>
> Attachments: YARN-6570-branch-2.8.001.patch, 
> YARN-6570-branch-2.8.002.patch, YARN-6570.poc.patch, YARN-6570-v2.patch, 
> YARN-6570-v3.patch
>
>
> 1. Obtain running containers from the following CLI for the running application:
>  yarn  container -list appattempt
> 2. Could not fetch logs: 
> {code}
> Can not find any log file matching the pattern: ALL for the container
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7248) NM returns new SCHEDULED container status to older clients

2017-09-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179401#comment-16179401
 ] 

Jason Lowe commented on YARN-7248:
--

bq. Is it ok if for the fix, we check and return RUNNING to the client if the 
container is GUARANTEED ?

Probably not.  YARN-6570 appears to have built upon this new container state 
and would likely lose its bugfix.

> NM returns new SCHEDULED container status to older clients
> --
>
> Key: YARN-7248
> URL: https://issues.apache.org/jira/browse/YARN-7248
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Jason Lowe
>Priority: Blocker
>
> YARN-4597 added a new SCHEDULED container state, and that state is returned to 
> clients when the container is localizing, etc. However, the client may be 
> running on an older software version that does not have the new SCHEDULED 
> state, which could lead the client to crash on the unexpected container state 
> value or to make incorrect assumptions, such as "any state != NEW and != 
> RUNNING must be COMPLETED", which was true in the older version.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7248) NM returns new SCHEDULED container status to older clients

2017-09-25 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179392#comment-16179392
 ] 

Jason Lowe commented on YARN-7248:
--

Maybe I'm missing something, but it looks like containers can be in the 
SCHEDULED state for quite a long time depending upon how long it takes for them 
to localize:
{noformat}
  public org.apache.hadoop.yarn.api.records.ContainerState getCurrentState() {
switch (stateMachine.getCurrentState()) {
case NEW:
  return org.apache.hadoop.yarn.api.records.ContainerState.NEW;
case LOCALIZING:
case LOCALIZATION_FAILED:
case SCHEDULED:
case PAUSED:
case RESUMING:
  return org.apache.hadoop.yarn.api.records.ContainerState.SCHEDULED;
{noformat}

Another way to tackle this is to move the subset of states related to RUNNING 
into another enum rather than extending the existing one. Of course, when 
someone later wants to modify that to add a new substate, we'll run into this 
type of problem again.  In general, any time we modify an enum that appears in 
a protobuf, we could have issues like this.
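
To illustrate the failure mode for old clients, here is a simplified sketch 
(not actual YARN client code) of the fragile assumption described in this 
issue; note also that proto2 treats an unrecognized enum value as an unknown 
field, so an old client's getter can silently fall back to a default value:
{code}
// Illustrative sketch of a pre-SCHEDULED client; not actual YARN code.
public class OldClientSketch {
  // The client's compiled-in view of the enum, before SCHEDULED existed.
  enum ContainerState { NEW, RUNNING, COMPLETE }

  static boolean looksFinished(ContainerState s) {
    // Valid when only three states existed: anything not NEW and not RUNNING
    // had to be COMPLETE. Once the server can report a value this client has
    // never heard of, a still-localizing container can be treated as finished.
    return s != ContainerState.NEW && s != ContainerState.RUNNING;
  }
}
{code}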


> NM returns new SCHEDULED container status to older clients
> --
>
> Key: YARN-7248
> URL: https://issues.apache.org/jira/browse/YARN-7248
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Jason Lowe
>Priority: Blocker
>
> YARN-4597 added a new SCHEDULED container state, and that state is returned to 
> clients when the container is localizing, etc. However, the client may be 
> running on an older software version that does not have the new SCHEDULED 
> state, which could lead the client to crash on the unexpected container state 
> value or to make incorrect assumptions, such as "any state != NEW and != 
> RUNNING must be COMPLETED", which was true in the older version.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7105) Refactor of Queue Visualization

2017-09-25 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179357#comment-16179357
 ] 

Wangda Tan commented on YARN-7105:
--

Thanks [~dingda6] for this prototype. Also, here is the design doc for this UI: 
https://docs.google.com/document/d/1QaQQEwp-_ZMUo8pNjPjjdRMSTWKNLe-36JICGUpvWFo/edit?usp=sharing

Please feel free to leave your comments/suggestions on the doc.

> Refactor of Queue Visualization
> ---
>
> Key: YARN-7105
> URL: https://issues.apache.org/jira/browse/YARN-7105
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Da Ding
>Assignee: Da Ding
> Attachments: Screen Shot 2017-09-24 at 8.19.36 PM.png, Screen Shot 
> 2017-09-24 at 8.50.29 PM.png
>
>
> Based on the discussion with [~wangda], the current implementation of Queue is 
> not easy to use if the number of nodes is large, and the visualization is not 
> very straightforward in terms of node hierarchy and details. Therefore, we 
> want to refactor the queue visualization to be more user-friendly for 
> different use cases.
> [^Screen Shot 2017-09-24 at 8.19.36 PM.png]
>  !Screen Shot 2017-09-24 at 8.19.36 PM.png|thumbnail! 
> The image above is the mockup for the new queue visualization; basically, it 
> should have the following features:
> 1. Each node would be visualized by a horizontal bar, and hierarchy would be 
> distinguished by indentation of each node. (e.g. if one node is a child of 
> another parent node, there would be indentation before the child node.)
> 2. Each node would show usage info on its bar, e.g. displaying usage like a 
> progress bar along with the percentage.
> 3. Click +/- to expand/collapse the tree to show/hide nodes.
> 4. Clicking on a bar would show detail charts at the bottom.
> 5. A filter on the top-right would filter the nodes based on options. 
> [^Screen Shot 2017-09-24 at 8.50.29 PM.png]
>  !Screen Shot 2017-09-24 at 8.50.29 PM.png|thumbnail! 
> I have currently finished features 1 and 2; see the screenshots. 
> Since this ticket is relatively big and would take a while to complete, I 
> will keep the ticket updated. Don't hesitate to comment with your ideas or 
> suggestions. Thanks!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7105) Refactor of Queue Visualization

2017-09-25 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-7105:
-
Description: 
Based on the discussion with [~wangda], the current implementation of Queue is 
not easy to use if the number of nodes is large, and the visualization is not 
very straightforward in terms of node hierarchy and details. Therefore, we want 
to refactor the queue visualization to be more user-friendly for different use 
cases.

[^Screen Shot 2017-09-24 at 8.19.36 PM.png]
 !Screen Shot 2017-09-24 at 8.19.36 PM.png|thumbnail! 

The image above is the mockup for the new queue visualization; basically, it 
should have the following features:
1. Each node would be visualized by a horizontal bar, and hierarchy would be 
distinguished by indentation of each node. (e.g. if one node is a child of 
another parent node, there would be indentation before the child node.)
2. Each node would show usage info on its bar, e.g. displaying usage like a 
progress bar along with the percentage.
3. Click +/- to expand/collapse the tree to show/hide nodes.
4. Clicking on a bar would show detail charts at the bottom.
5. A filter on the top-right would filter the nodes based on options. 

[^Screen Shot 2017-09-24 at 8.50.29 PM.png]
 !Screen Shot 2017-09-24 at 8.50.29 PM.png|thumbnail! 

I have currently finished features 1 and 2; see the screenshots. 

Since this ticket is relatively big and would take a while to complete, I will 
keep the ticket updated. Don't hesitate to comment with your ideas or 
suggestions. Thanks!

{color:#205081}Design doc: 
https://docs.google.com/document/d/1QaQQEwp-_ZMUo8pNjPjjdRMSTWKNLe-36JICGUpvWFo/edit?usp=sharing
Please feel free to leave your comments on the design.{color}


  was:
Based on the discussion with [~wangda], the current implementation of Queue is 
not easy to use if the number of nodes is large, and the visualization is not 
very straightforward in terms of node hierarchy and details. Therefore, we want 
to refactor the queue visualization to be more user-friendly for different use 
cases.

[^Screen Shot 2017-09-24 at 8.19.36 PM.png]
 !Screen Shot 2017-09-24 at 8.19.36 PM.png|thumbnail! 

The image above is the mockup for the new queue visualization; basically, it 
should have the following features:
1. Each node would be visualized by a horizontal bar, and hierarchy would be 
distinguished by indentation of each node. (e.g. if one node is a child of 
another parent node, there would be indentation before the child node.)
2. Each node would show usage info on its bar, e.g. displaying usage like a 
progress bar along with the percentage.
3. Click +/- to expand/collapse the tree to show/hide nodes.
4. Clicking on a bar would show detail charts at the bottom.
5. A filter on the top-right would filter the nodes based on options. 

[^Screen Shot 2017-09-24 at 8.50.29 PM.png]
 !Screen Shot 2017-09-24 at 8.50.29 PM.png|thumbnail! 

I have currently finished features 1 and 2; see the screenshots. 

Since this ticket is relatively big and would take a while to complete, I will 
keep the ticket updated. Don't hesitate to comment with your ideas or 
suggestions. Thanks!



> Refactor of Queue Visualization
> ---
>
> Key: YARN-7105
> URL: https://issues.apache.org/jira/browse/YARN-7105
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Da Ding
>Assignee: Da Ding
> Attachments: Screen Shot 2017-09-24 at 8.19.36 PM.png, Screen Shot 
> 2017-09-24 at 8.50.29 PM.png
>
>
> Based on the discussion with [~wangda], the current implementation of Queue is 
> not easy to use if the number of nodes is large, and the visualization is not 
> very straightforward in terms of node hierarchy and details. Therefore, we 
> want to refactor the queue visualization to be more user-friendly for 
> different use cases.
> [^Screen Shot 2017-09-24 at 8.19.36 PM.png]
>  !Screen Shot 2017-09-24 at 8.19.36 PM.png|thumbnail! 
> The image above is the mockup for the new queue visualization; basically, it 
> should have the following features:
> 1. Each node would be visualized by a horizontal bar, and hierarchy would be 
> distinguished by indentation of each node. (e.g. if one node is a child of 
> another parent node, there would be indentation before the child node.)
> 2. Each node would show usage info on its bar, e.g. displaying usage like a 
> progress bar along with the percentage.
> 3. Click +/- to expand/collapse the tree to show/hide nodes.
> 4. Clicking on a bar would show detail charts at the bottom.
> 5. A filter on the top-right would filter the nodes based on options. 
> [^Screen Shot 2017-09-24 at 8.50.29 PM.png]
>  !Screen Shot 2017-09-24 at 8.50.29 PM.png|thumbnail! 
> I have currently finished features 1 and 2; see the screenshots. 
> Since this ticket is relatively big and would take a while to complete, I 
> will keep the ticket updated. Don't hesitate to comment with your ideas or 
> suggestions. Thanks!

[jira] [Commented] (YARN-7153) Remove duplicated code in AMRMClientAsyncImpl.java

2017-09-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179356#comment-16179356
 ] 

Hudson commented on YARN-7153:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #12967 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/12967/])
YARN-7153. Remove duplicated code in AMRMClientAsyncImpl.java. (aajisaka: rev 
e928ee583c5a1367e24eab34057f8d8496891452)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/async/impl/AMRMClientAsyncImpl.java


> Remove duplicated code in AMRMClientAsyncImpl.java
> --
>
> Key: YARN-7153
> URL: https://issues.apache.org/jira/browse/YARN-7153
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0-alpha4
>Reporter: Sen Zhao
>Assignee: Sen Zhao
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-beta1, 3.1.0
>
> Attachments: YARN-7153.001.patch
>
>
> I noticed that some code in AMRMClientAsyncImpl#setHeartbeatInterval 
> duplicates code in AMRMClientAsync#setHeartbeatInterval. The two methods are 
> implemented the same way. Submitting a patch to fix it!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7249) Fix CapacityScheduler NPE issue when a container preempted while the node is being removed

2017-09-25 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179348#comment-16179348
 ] 

Wangda Tan commented on YARN-7249:
--

[~eepayne], could you help with the review? It is a straightforward patch.

> Fix CapacityScheduler NPE issue when a container preempted while the node is 
> being removed
> --
>
> Key: YARN-7249
> URL: https://issues.apache.org/jira/browse/YARN-7249
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-7249.branch-2.8.001.patch
>
>
> This issue could happen when 3 conditions are satisfied:
> 1) A node is being removed from the scheduler.
> 2) A container running on the node is being preempted. 
> 3) A rare race condition causes the scheduler to pass a null node to the leaf queue.
> The fix is to add a null node check inside CapacityScheduler.
> Stack trace:
> {code}
> 2017-08-31 02:51:24,748 FATAL resourcemanager.ResourceManager 
> (ResourceManager.java:run(714)) - Error in handling event type 
> KILL_RESERVED_CONTAINER to the scheduler 
> java.lang.NullPointerException 
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1308)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainerInternal(CapacityScheduler.java:1469)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:497)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.killReservedContainer(CapacityScheduler.java:1505)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1341)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:127)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:705)
>  
> {code}
> This is an issue that only exists in 2.8.x.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7249) Fix CapacityScheduler NPE issue when a container preempted while the node is being removed

2017-09-25 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-7249:
-
Attachment: YARN-7249.branch-2.8.001.patch

> Fix CapacityScheduler NPE issue when a container preempted while the node is 
> being removed
> --
>
> Key: YARN-7249
> URL: https://issues.apache.org/jira/browse/YARN-7249
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
> Attachments: YARN-7249.branch-2.8.001.patch
>
>
> This issue could happen when 3 conditions are satisfied:
> 1) A node is being removed from the scheduler.
> 2) A container running on the node is being preempted. 
> 3) A rare race condition causes the scheduler to pass a null node to the leaf queue.
> The fix is to add a null node check inside CapacityScheduler.
> Stack trace:
> {code}
> 2017-08-31 02:51:24,748 FATAL resourcemanager.ResourceManager 
> (ResourceManager.java:run(714)) - Error in handling event type 
> KILL_RESERVED_CONTAINER to the scheduler 
> java.lang.NullPointerException 
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1308)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainerInternal(CapacityScheduler.java:1469)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:497)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.killReservedContainer(CapacityScheduler.java:1505)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1341)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:127)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:705)
>  
> {code}
> This is an issue that only exists in 2.8.x.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Reopened] (YARN-7249) Fix CapacityScheduler NPE issue when a container preempted while the node is being removed

2017-09-25 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan reopened YARN-7249:
--

Double-checked; it is still a problem: 

{code}
  @Override
  public void completedContainer(Resource clusterResource, 
  FiCaSchedulerApp application, FiCaSchedulerNode node, RMContainer 
rmContainer, 
  ContainerStatus containerStatus, RMContainerEventType event, CSQueue 
childQueue,
  boolean sortQueues) {
// Update SchedulerHealth for released / preempted container
updateSchedulerHealthForCompletedContainer(rmContainer, containerStatus);

if (application != null) {
  // unreserve container increase request if it previously reserved.
  if (rmContainer.hasIncreaseReservation()) {
unreserveIncreasedContainer(clusterResource, application, node,
rmContainer);
  }
  
  // Remove container increase request if it exists
  application.removeIncreaseRequest(node.getNodeID(),
  rmContainer.getAllocatedPriority(), rmContainer.getContainerId());
{code} 

Since the {{node}} passed in could be null while {{application}} cannot be 
null, the line:
{code}
  application.removeIncreaseRequest(node.getNodeID(),
  rmContainer.getAllocatedPriority(), rmContainer.getContainerId());
{code} 

will be invoked and an NPE can be thrown.
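
For concreteness, a minimal sketch of the null guard under discussion 
(illustrative only, written against the code quoted above; the real fix, if one 
is needed, may be structured differently):
{code}
// node can legitimately be null here when it was removed from the scheduler
// while this container was being preempted, so guard the dereference.
if (node != null) {
  application.removeIncreaseRequest(node.getNodeID(),
      rmContainer.getAllocatedPriority(), rmContainer.getContainerId());
}
{code}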

> Fix CapacityScheduler NPE issue when a container preempted while the node is 
> being removed
> --
>
> Key: YARN-7249
> URL: https://issues.apache.org/jira/browse/YARN-7249
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
>
> This issue could happen when 3 conditions are satisfied:
> 1) A node is being removed from the scheduler.
> 2) A container running on the node is being preempted. 
> 3) A rare race condition causes the scheduler to pass a null node to the leaf queue.
> The fix is to add a null node check inside CapacityScheduler.
> Stack trace:
> {code}
> 2017-08-31 02:51:24,748 FATAL resourcemanager.ResourceManager 
> (ResourceManager.java:run(714)) - Error in handling event type 
> KILL_RESERVED_CONTAINER to the scheduler 
> java.lang.NullPointerException 
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1308)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainerInternal(CapacityScheduler.java:1469)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:497)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.killReservedContainer(CapacityScheduler.java:1505)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1341)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:127)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:705)
>  
> {code}
> This is an issue that only exists in 2.8.x.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6570) No logs were found for running application, running container

2017-09-25 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179343#comment-16179343
 ] 

Arun Suresh commented on YARN-6570:
---

Sorry, I am just a bit confused.
In both branch-2 and trunk, we have:
{code}
 // Process running containers
  if (remoteContainer.getState() == ContainerState.RUNNING ||
  remoteContainer.getState() == ContainerState.SCHEDULED) {
[..]
  } else {
// A finished container
{code}
Which is what we want, right?

> No logs were found for running application, running container
> -
>
> Key: YARN-6570
> URL: https://issues.apache.org/jira/browse/YARN-6570
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: yarn
>Reporter: Sumana Sathish
>Assignee: Junping Du
>Priority: Critical
> Fix For: 2.9.0, 3.0.0-beta1, 3.1.0
>
> Attachments: YARN-6570-branch-2.8.001.patch, 
> YARN-6570-branch-2.8.002.patch, YARN-6570.poc.patch, YARN-6570-v2.patch, 
> YARN-6570-v3.patch
>
>
> 1. Obtain running containers from the following CLI for the running application:
>  yarn  container -list appattempt
> 2. Could not fetch logs: 
> {code}
> Can not find any log file matching the pattern: ALL for the container
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7247) Assign multiple will lead to hot point problems of physical resource consumption

2017-09-25 Thread Yufei Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179340#comment-16179340
 ] 

Yufei Gu commented on YARN-7247:


[~balloons], this is not a problem with multiple assignment. You are basically 
asking for anti-affinity. Please check YARN-1042 (add ability to specify 
affinity/anti-affinity in container requests).

> Assign multiple will lead to hot point problems of physical resource 
> consumption
> 
>
> Key: YARN-7247
> URL: https://issues.apache.org/jira/browse/YARN-7247
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: balloons
>Assignee: Daniel Templeton
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7248) NM returns new SCHEDULED container status to older clients

2017-09-25 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179337#comment-16179337
 ] 

Arun Suresh commented on YARN-7248:
---

Thanks for raising this, [~jlowe]. So, given that old clients only ask for (and 
are issued by the RM) GUARANTEED containers, and given that GUARANTEED 
containers do not stay in the SCHEDULED state for long (actually, they are 
there just momentarily), we thought it should be fine. But yes, there is a 
small window where a client can ask for the status and SCHEDULED can be 
returned. Is it OK if, for the fix, we check and return RUNNING to the client 
if the container is GUARANTEED?

> NM returns new SCHEDULED container status to older clients
> --
>
> Key: YARN-7248
> URL: https://issues.apache.org/jira/browse/YARN-7248
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.9.0, 3.0.0-alpha2
>Reporter: Jason Lowe
>Priority: Blocker
>
> YARN-4597 added a new SCHEDULED container state, and that state is returned to 
> clients when the container is localizing, etc. However, the client may be 
> running on an older software version that does not have the new SCHEDULED 
> state, which could lead the client to crash on the unexpected container state 
> value or to make incorrect assumptions, such as "any state != NEW and != 
> RUNNING must be COMPLETED", which was true in the older version.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-7249) Fix CapacityScheduler NPE issue when a container preempted while the node is being removed

2017-09-25 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan resolved YARN-7249.
--
Resolution: Invalid

Sorry for the noise; it is not an issue for 2.8 either. Closing as invalid.

> Fix CapacityScheduler NPE issue when a container preempted while the node is 
> being removed
> --
>
> Key: YARN-7249
> URL: https://issues.apache.org/jira/browse/YARN-7249
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Wangda Tan
>Assignee: Wangda Tan
>Priority: Blocker
>
> This issue could happen when 3 conditions are satisfied:
> 1) A node is being removed from the scheduler.
> 2) A container running on the node is being preempted. 
> 3) A rare race condition causes the scheduler to pass a null node to the leaf queue.
> The fix is to add a null node check inside CapacityScheduler.
> Stack trace:
> {code}
> 2017-08-31 02:51:24,748 FATAL resourcemanager.ResourceManager 
> (ResourceManager.java:run(714)) - Error in handling event type 
> KILL_RESERVED_CONTAINER to the scheduler 
> java.lang.NullPointerException 
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1308)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainerInternal(CapacityScheduler.java:1469)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:497)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.killReservedContainer(CapacityScheduler.java:1505)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1341)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:127)
>  
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:705)
>  
> {code}
> This is an issue that only exists in 2.8.x.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-7153) Remove duplicated code in AMRMClientAsyncImpl.java

2017-09-25 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated YARN-7153:

Priority: Minor  (was: Major)

> Remove duplicated code in AMRMClientAsyncImpl.java
> --
>
> Key: YARN-7153
> URL: https://issues.apache.org/jira/browse/YARN-7153
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0-alpha4
>Reporter: Sen Zhao
>Assignee: Sen Zhao
>Priority: Minor
> Attachments: YARN-7153.001.patch
>
>
> I noticed that some code in AMRMClientAsyncImpl#setHeartbeatInterval 
> duplicates code in AMRMClientAsync#setHeartbeatInterval. The two methods are 
> implemented the same way. Submitting a patch to fix it!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7153) Remove duplicated code in AMRMClientAsyncImpl.java

2017-09-25 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179326#comment-16179326
 ] 

Akira Ajisaka commented on YARN-7153:
-

+1, checking this in.

> Remove duplicated code in AMRMClientAsyncImpl.java
> --
>
> Key: YARN-7153
> URL: https://issues.apache.org/jira/browse/YARN-7153
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Affects Versions: 3.0.0-alpha4
>Reporter: Sen Zhao
>Assignee: Sen Zhao
> Attachments: YARN-7153.001.patch
>
>
> I noticed that some code in AMRMClientAsyncImpl#setHeartbeatInterval 
> duplicates code in AMRMClientAsync#setHeartbeatInterval. The two methods are 
> implemented the same way. Submitting a patch to fix it!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7249) Fix CapacityScheduler NPE issue when a container preempted while the node is being removed

2017-09-25 Thread Wangda Tan (JIRA)
Wangda Tan created YARN-7249:


 Summary: Fix CapacityScheduler NPE issue when a container 
preempted while the node is being removed
 Key: YARN-7249
 URL: https://issues.apache.org/jira/browse/YARN-7249
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 2.8.1
Reporter: Wangda Tan
Assignee: Wangda Tan
Priority: Blocker


This issue could happen when 3 conditions are satisfied:

1) A node is being removed from the scheduler.
2) A container running on the node is being preempted. 
3) A rare race condition causes the scheduler to pass a null node to the leaf queue.

The fix is to add a null node check inside CapacityScheduler.

Stack trace:
{code}
2017-08-31 02:51:24,748 FATAL resourcemanager.ResourceManager 
(ResourceManager.java:run(714)) - Error in handling event type 
KILL_RESERVED_CONTAINER to the scheduler 
java.lang.NullPointerException 
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue.completedContainer(LeafQueue.java:1308)
 
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.completedContainerInternal(CapacityScheduler.java:1469)
 
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.completedContainer(AbstractYarnScheduler.java:497)
 
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.killReservedContainer(CapacityScheduler.java:1505)
 
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:1341)
 
at 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.handle(CapacityScheduler.java:127)
 
at 
org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:705)
 
{code}

This is an issue that only exists in 2.8.x.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-6623) Add support to turn off launching privileged containers in the container-executor

2017-09-25 Thread Eric Badger (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16179268#comment-16179268
 ] 

Eric Badger commented on YARN-6623:
---

Looks good to me! +1 (non-binding) pending jenkins

> Add support to turn off launching privileged containers in the 
> container-executor
> -
>
> Key: YARN-6623
> URL: https://issues.apache.org/jira/browse/YARN-6623
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
>Priority: Blocker
> Attachments: YARN-6623.001.patch, YARN-6623.002.patch, 
> YARN-6623.003.patch, YARN-6623.004.patch, YARN-6623.005.patch, 
> YARN-6623.006.patch, YARN-6623.007.patch, YARN-6623.008.patch, 
> YARN-6623.009.patch, YARN-6623.010.patch, YARN-6623.011.patch, 
> YARN-6623.012.patch, YARN-6623.013.patch
>
>
> Currently, launching privileged containers is controlled by the NM. We should 
> add a flag to the container-executor.cfg allowing admins to disable launching 
> privileged containers at the container-executor level.
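
For readers following along, a minimal illustration of the kind of 
container-executor.cfg entry being discussed (the section and key names here 
are assumptions for illustration; the patch itself defines the real names):
{code}
[docker]
  # When set to false/0, container-executor itself refuses to launch
  # privileged containers, regardless of what the NM would allow.
  docker.privileged-containers.enabled=false
{code}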



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6623) Add support to turn off launching privileged containers in the container-executor

2017-09-25 Thread Varun Vasudev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev updated YARN-6623:

Attachment: YARN-6623.013.patch

Uploaded the wrong patch file by mistake. YARN-6623.013.patch is the correct patch.

> Add support to turn off launching privileged containers in the 
> container-executor
> -
>
> Key: YARN-6623
> URL: https://issues.apache.org/jira/browse/YARN-6623
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
>Priority: Blocker
> Attachments: YARN-6623.001.patch, YARN-6623.002.patch, 
> YARN-6623.003.patch, YARN-6623.004.patch, YARN-6623.005.patch, 
> YARN-6623.006.patch, YARN-6623.007.patch, YARN-6623.008.patch, 
> YARN-6623.009.patch, YARN-6623.010.patch, YARN-6623.011.patch, 
> YARN-6623.012.patch, YARN-6623.013.patch
>
>
> Currently, launching privileged containers is controlled by the NM. We should 
> add a flag to the container-executor.cfg allowing admins to disable launching 
> privileged containers at the container-executor level.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6623) Add support to turn off launching privileged containers in the container-executor

2017-09-25 Thread Varun Vasudev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev updated YARN-6623:

Attachment: (was: YARN-6623.001.patch)

> Add support to turn off launching privileged containers in the 
> container-executor
> -
>
> Key: YARN-6623
> URL: https://issues.apache.org/jira/browse/YARN-6623
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
>Priority: Blocker
> Attachments: YARN-6623.001.patch, YARN-6623.002.patch, 
> YARN-6623.003.patch, YARN-6623.004.patch, YARN-6623.005.patch, 
> YARN-6623.006.patch, YARN-6623.007.patch, YARN-6623.008.patch, 
> YARN-6623.009.patch, YARN-6623.010.patch, YARN-6623.011.patch, 
> YARN-6623.012.patch
>
>
> Currently, launching privileged containers is controlled by the NM. We should 
> add a flag to the container-executor.cfg allowing admins to disable launching 
> privileged containers at the container-executor level.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


