[jira] [Commented] (YARN-5375) invoke MockRM#drainEvents implicitly in MockRM methods to reduce test failures

2016-11-15 Thread sandflee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669662#comment-15669662
 ] 

sandflee commented on YARN-5375:


Updated YARN-5375.12.new.patch to trigger Jenkins.

> invoke MockRM#drainEvents implicitly in MockRM methods to reduce test failures
> --
>
> Key: YARN-5375
> URL: https://issues.apache.org/jira/browse/YARN-5375
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: sandflee
>Assignee: sandflee
>  Labels: oct16-medium
> Attachments: YARN-5375.01.patch, YARN-5375.03.patch, 
> YARN-5375.04.patch, YARN-5375.05.patch, YARN-5375.06.patch, 
> YARN-5375.07-drain-statestore.patch, YARN-5375.07-sync-statestore.patch, 
> YARN-5375.08.patch, YARN-5375.09.patch, YARN-5375.10.patch, 
> YARN-5375.11.patch, YARN-5375.12.new.patch, YARN-5375.12.patch
>
>
> Seen many test failures where RMApp/RMAppAttempt reaches some state but some 
> events have not yet been processed in the RM event queue or scheduler event 
> queue, causing the test to fail. It seems we could implicitly invoke 
> drainEvents (which should also drain scheduler events) in some MockRM methods 
> such as waitForState.
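For illustration only, here is a minimal sketch of the idea (not the attached patch): a wait helper that drains pending dispatcher events before each state check, so the assertion never races with events still sitting in the RM or scheduler queues. The StateWaiter/EventDrainer names are invented for this example.

{code}
import java.util.function.Supplier;

// Illustrative only: drain RM/scheduler event queues before every state check.
final class StateWaiter {
  interface EventDrainer {
    void drainEvents() throws InterruptedException;  // blocks until dispatchers are idle
  }

  static <S> void waitForState(Supplier<S> currentState, S expected,
      EventDrainer rm, long timeoutMs) throws InterruptedException {
    long deadline = System.currentTimeMillis() + timeoutMs;
    while (System.currentTimeMillis() < deadline) {
      rm.drainEvents();                     // flush pending RM and scheduler events
      if (expected.equals(currentState.get())) {
        return;                             // state reached with empty queues
      }
      Thread.sleep(50);                     // poll until the deadline
    }
    throw new AssertionError("Timed out waiting for state " + expected);
  }
}
{code}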






[jira] [Updated] (YARN-5888) Add test cases in new YARN UI

2016-11-15 Thread Akhil PB (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akhil PB updated YARN-5888:
---
Description: 
Add missing test cases in new YARN UI
Fix test case errors in new YARN UI 

  was:
Add missing test cases in new YARN UI
Fix test case errors


> Add test cases in new YARN UI
> -
>
> Key: YARN-5888
> URL: https://issues.apache.org/jira/browse/YARN-5888
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn-ui-v2
>Reporter: Akhil PB
>Assignee: Akhil PB
>
> Add missing test cases in new YARN UI
> Fix test case errors in new YARN UI 






[jira] [Created] (YARN-5888) Add test cases in new YARN UI

2016-11-15 Thread Akhil PB (JIRA)
Akhil PB created YARN-5888:
--

 Summary: Add test cases in new YARN UI
 Key: YARN-5888
 URL: https://issues.apache.org/jira/browse/YARN-5888
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: yarn-ui-v2
Reporter: Akhil PB
Assignee: Akhil PB


Add missing test cases in new YARN UI
Fix test case errors






[jira] [Commented] (YARN-5877) Allow all nm-whitelist-env to get overridden during launch

2016-11-15 Thread Varun Saxena (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669571#comment-15669571
 ] 

Varun Saxena commented on YARN-5877:


[~bibinchundatt],
I was leaning towards the same. It is better to have a new configuration, 
because the intention of the NM whitelist configuration is different.

> Allow all nm-whitelist-env to get overridden during launch
> --
>
> Key: YARN-5877
> URL: https://issues.apache.org/jira/browse/YARN-5877
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: Dockerfile, bootstrap.sh, yarn-site.xml
>
>
> As per {{yarn.nodemanager.env-whitelist}}, containers should be able to 
> override the configured values rather than use the NodeManager's defaults.
> {code}
>   <property>
>     <description>Environment variables that containers may override rather 
>     than use NodeManager's default.</description>
>     <name>yarn.nodemanager.env-whitelist</name>
>     <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME</value>
>   </property>
> {code}
> But only the following environment variables can currently be overridden by containers:
> {code}
> whitelist.add(ApplicationConstants.Environment.HADOOP_YARN_HOME.name());
> whitelist.add(ApplicationConstants.Environment.HADOOP_COMMON_HOME.name());
> whitelist.add(ApplicationConstants.Environment.HADOOP_HDFS_HOME.name());
> whitelist.add(ApplicationConstants.Environment.HADOOP_CONF_DIR.name());
> whitelist.add(ApplicationConstants.Environment.JAVA_HOME.name());
> {code}
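A hedged sketch of the direction implied by the description above: derive the overridable set from the {{yarn.nodemanager.env-whitelist}} configuration instead of the hardcoded list, assuming the existing NM_ENV_WHITELIST constants in YarnConfiguration. This is illustrative, not the committed change.

{code}
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

final class WhitelistFromConf {
  // Every variable listed in yarn.nodemanager.env-whitelist becomes overridable.
  static Set<String> overridableEnv(Configuration conf) {
    Set<String> whitelist = new HashSet<>();
    String configured = conf.get(YarnConfiguration.NM_ENV_WHITELIST,
        YarnConfiguration.DEFAULT_NM_ENV_WHITELIST);
    for (String var : configured.split(",")) {
      whitelist.add(var.trim());
    }
    return whitelist;
  }
}
{code}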






[jira] [Commented] (YARN-5375) invoke MockRM#drainEvents implicitly in MockRM methods to reduce test failures

2016-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669348#comment-15669348
 ] 

Hadoop QA commented on YARN-5375:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
21s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 10 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
56s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
 4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
12s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 59s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 2 new + 423 unchanged - 3 fixed = 425 total (was 426) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
31s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 41m 49s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 92m 18s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5375 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839103/YARN-5375.12.new.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 399f12ebab62 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 61c0bed |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/13936/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/13936/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
 |
|  Test 

[jira] [Updated] (YARN-5877) Allow all nm-whitelist-env to get overridden during launch

2016-11-15 Thread Bibin A Chundatt (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bibin A Chundatt updated YARN-5877:
---
Attachment: yarn-site.xml
bootstrap.sh
Dockerfile

Attaching the Dockerfile and bootstrap file I tried out to create the Docker 
image on Ubuntu, using a local tar file and JDK, based on 
https://github.com/sequenceiq/hadoop-docker


> Allow all nm-whitelist-env to get overridden during launch
> --
>
> Key: YARN-5877
> URL: https://issues.apache.org/jira/browse/YARN-5877
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: Dockerfile, bootstrap.sh, yarn-site.xml
>
>
> As per {{yarn.nodemanager.env-whitelist}}, containers should be able to 
> override the configured values rather than use the NodeManager's defaults.
> {code}
>   <property>
>     <description>Environment variables that containers may override rather 
>     than use NodeManager's default.</description>
>     <name>yarn.nodemanager.env-whitelist</name>
>     <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME</value>
>   </property>
> {code}
> But only the following environment variables can currently be overridden by containers:
> {code}
> whitelist.add(ApplicationConstants.Environment.HADOOP_YARN_HOME.name());
> whitelist.add(ApplicationConstants.Environment.HADOOP_COMMON_HOME.name());
> whitelist.add(ApplicationConstants.Environment.HADOOP_HDFS_HOME.name());
> whitelist.add(ApplicationConstants.Environment.HADOOP_CONF_DIR.name());
> whitelist.add(ApplicationConstants.Environment.JAVA_HOME.name());
> {code}






[jira] [Commented] (YARN-5877) Allow all nm-whitelist-env to get overridden during launch

2016-11-15 Thread Bibin A Chundatt (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669254#comment-15669254
 ] 

Bibin A Chundatt commented on YARN-5877:


[~sunilg]
{quote}
 Ideally a mismatch may occur only from docker point of view where env and 
launchcontext may differ.
{quote}
I have a query regarding the same: in case there is a mismatch, it is supposed 
to use the env only, right? How about adding a new config such as 
{{container-launch-whitelist}} instead of reusing the same {{NM whitelist}}, to 
provide flexibility?

> Allow all nm-whitelist-env to get overridden during launch
> --
>
> Key: YARN-5877
> URL: https://issues.apache.org/jira/browse/YARN-5877
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>
> As per {{yarn.nodemanager.env-whitelist}}, containers should be able to 
> override the configured values rather than use the NodeManager's defaults.
> {code}
>   <property>
>     <description>Environment variables that containers may override rather 
>     than use NodeManager's default.</description>
>     <name>yarn.nodemanager.env-whitelist</name>
>     <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME</value>
>   </property>
> {code}
> But only the following environment variables can currently be overridden by containers:
> {code}
> whitelist.add(ApplicationConstants.Environment.HADOOP_YARN_HOME.name());
> whitelist.add(ApplicationConstants.Environment.HADOOP_COMMON_HOME.name());
> whitelist.add(ApplicationConstants.Environment.HADOOP_HDFS_HOME.name());
> whitelist.add(ApplicationConstants.Environment.HADOOP_CONF_DIR.name());
> whitelist.add(ApplicationConstants.Environment.JAVA_HOME.name());
> {code}






[jira] [Updated] (YARN-5375) invoke MockRM#drainEvents implicitly in MockRM methods to reduce test failures

2016-11-15 Thread sandflee (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sandflee updated YARN-5375:
---
Attachment: YARN-5375.12.new.patch

> invoke MockRM#drainEvents implicitly in MockRM methods to reduce test failures
> --
>
> Key: YARN-5375
> URL: https://issues.apache.org/jira/browse/YARN-5375
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: sandflee
>Assignee: sandflee
>  Labels: oct16-medium
> Attachments: YARN-5375.01.patch, YARN-5375.03.patch, 
> YARN-5375.04.patch, YARN-5375.05.patch, YARN-5375.06.patch, 
> YARN-5375.07-drain-statestore.patch, YARN-5375.07-sync-statestore.patch, 
> YARN-5375.08.patch, YARN-5375.09.patch, YARN-5375.10.patch, 
> YARN-5375.11.patch, YARN-5375.12.new.patch, YARN-5375.12.patch
>
>
> Seen many test failures where RMApp/RMAppAttempt reaches some state but some 
> events have not yet been processed in the RM event queue or scheduler event 
> queue, causing the test to fail. It seems we could implicitly invoke 
> drainEvents (which should also drain scheduler events) in some MockRM methods 
> such as waitForState.






[jira] [Commented] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669156#comment-15669156
 ] 

Hadoop QA commented on YARN-5600:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
37s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
0s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 557 unchanged - 21 fixed = 557 total (was 578) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
31s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
28s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 
48s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 60m 27s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5600 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839093/YARN-5600.014.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 20667dc03ebb 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 61c0bed |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13934/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 

[jira] [Commented] (YARN-5670) Add support for Docker image clean up

2016-11-15 Thread luhuichun (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669111#comment-15669111
 ] 

luhuichun commented on YARN-5670:
-

[~sidharta-s]  
Hi Sidharta, we have a couple of questions about this JIRA:
1. When should these Docker images be cleaned up? The YARN deletion service 
handles clean-up for the distributed cache, but as a framework I think YARN 
should not have to manage an external service (Docker); that should be Docker's 
responsibility.
2. Which images should be cleaned up?


> Add support for Docker image clean up
> -
>
> Key: YARN-5670
> URL: https://issues.apache.org/jira/browse/YARN-5670
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Zhankun Tang
>Assignee: luhuichun
>
> Regarding Docker image localization, we also need a way to clean up old/stale 
> Docker images to save storage space. We may extend the deletion service to 
> utilize "docker rm" to do this.
> This is related to YARN-3854 and may depend on its implementation. Please 
> refer to YARN-3854 for Docker image localization details.
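A minimal sketch of the clean-up step described above, assuming the deletion-service extension shells out to the Docker CLI (the real change would more likely go through container-executor). Note that removing an image uses {{docker rmi}}, while {{docker rm}} removes containers.

{code}
import java.io.IOException;

final class DockerImageCleanup {
  // Illustrative only: remove one stale image via the Docker CLI.
  static void removeImage(String image) throws IOException, InterruptedException {
    Process p = new ProcessBuilder("docker", "rmi", image)
        .inheritIO()
        .start();
    if (p.waitFor() != 0) {
      throw new IOException("Failed to remove Docker image " + image);
    }
  }
}
{code}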






[jira] [Commented] (YARN-5375) invoke MockRM#drainEvents implicitly in MockRM methods to reduce test failures

2016-11-15 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669107#comment-15669107
 ] 

Rohith Sharma K S commented on YARN-5375:
-

[~sandflee], HadoopQA has failed for the latest patch *YARN-5375.12.patch*. 
Could you attach a new patch?

> invoke MockRM#drainEvents implicitly in MockRM methods to reduce test failures
> --
>
> Key: YARN-5375
> URL: https://issues.apache.org/jira/browse/YARN-5375
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Reporter: sandflee
>Assignee: sandflee
>  Labels: oct16-medium
> Attachments: YARN-5375.01.patch, YARN-5375.03.patch, 
> YARN-5375.04.patch, YARN-5375.05.patch, YARN-5375.06.patch, 
> YARN-5375.07-drain-statestore.patch, YARN-5375.07-sync-statestore.patch, 
> YARN-5375.08.patch, YARN-5375.09.patch, YARN-5375.10.patch, 
> YARN-5375.11.patch, YARN-5375.12.patch
>
>
> Seen many test failures where RMApp/RMAppAttempt reaches some state but some 
> events have not yet been processed in the RM event queue or scheduler event 
> queue, causing the test to fail. It seems we could implicitly invoke 
> drainEvents (which should also drain scheduler events) in some MockRM methods 
> such as waitForState.






[jira] [Commented] (YARN-5669) Add support for Docker pull

2016-11-15 Thread Sidharta Seethana (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669088#comment-15669088
 ] 

Sidharta Seethana commented on YARN-5669:
-

Thanks for the clarification. YARN-3854 mentions that YARN-5669 and YARN-5670 
are the subtasks for implementing docker image localization. I (mistakenly) 
thought that YARN-3854 was the parent JIRA. 

> Add support for Docker pull
> ---
>
> Key: YARN-5669
> URL: https://issues.apache.org/jira/browse/YARN-5669
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Zhankun Tang
>Assignee: luhuichun
> Attachments: YARN-5669.001.patch
>
>
> We need to add docker pull to support Docker image localization. Refer to 
> YARN-3854 for the details. 
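For context, a purely illustrative pull step for the localization flow described above; the actual patch would route this through YARN's Docker runtime/command classes rather than a direct process invocation.

{code}
import java.io.IOException;
import java.util.concurrent.TimeUnit;

final class DockerPullSketch {
  // Illustrative only: pull an image with a bounded wait before container launch.
  static void pull(String image, long timeoutMinutes)
      throws IOException, InterruptedException {
    Process p = new ProcessBuilder("docker", "pull", image)
        .inheritIO()
        .start();
    if (!p.waitFor(timeoutMinutes, TimeUnit.MINUTES)) {
      p.destroyForcibly();
      throw new IOException("Timed out pulling image " + image);
    }
    if (p.exitValue() != 0) {
      throw new IOException("docker pull failed for " + image);
    }
  }
}
{code}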






[jira] [Commented] (YARN-5885) Cleanup YARN-4752 for merge

2016-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669084#comment-15669084
 ] 

Hadoop QA commented on YARN-5885:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} YARN-5885 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-5885 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839095/yarn-5885.2.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13935/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Cleanup YARN-4752 for merge
> ---
>
> Key: YARN-5885
> URL: https://issues.apache.org/jira/browse/YARN-5885
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-5885.1.patch, yarn-5885.2.patch
>
>
> JIRA to track changes necessary for branch merge. These include:
> # Remove names from TODOs (e.g. KK) and add JIRA numbers for follow-up work.
> # Fix tests that have been commented out in earlier patches on the branch.
> # Double check method and field visibility of newly added code.






[jira] [Commented] (YARN-5669) Add support for Docker pull

2016-11-15 Thread luhuichun (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669064#comment-15669064
 ] 

luhuichun commented on YARN-5669:
-

[~sidharta-s], we divided the localization support into three JIRAs: YARN-3854, 
YARN-5669 and YARN-5670.

> Add support for Docker pull
> ---
>
> Key: YARN-5669
> URL: https://issues.apache.org/jira/browse/YARN-5669
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Zhankun Tang
>Assignee: luhuichun
> Attachments: YARN-5669.001.patch
>
>
> We need to add docker pull to support Docker image localization. Refer to 
> YARN-3854 for the details. 






[jira] [Commented] (YARN-5762) Summarize ApplicationNotFoundException in the RM log

2016-11-15 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669054#comment-15669054
 ] 

Ravi Prakash commented on YARN-5762:


Hi Jian He!
I did notice the {{ApplicationBaseProtocol.getApplications}} method. It would 
return a response of size O(number of applications in the cluster). I don't 
know whether, for big clusters, that would be more expensive than O(number of 
applications on a node) RPC calls for one application each.
Should we just extend the API?
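Not from the attached patch, just a sketch of the "summarize" idea being weighed here: count repeated ApplicationNotFoundExceptions and emit one periodic summary line instead of a stack trace per call. The class name and interval are invented for illustration.

{code}
import java.util.concurrent.atomic.AtomicLong;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class AnfeSummarizer {
  private static final Logger LOG = LoggerFactory.getLogger(AnfeSummarizer.class);
  private static final long INTERVAL_MS = 60_000L;   // illustrative summary window
  private final AtomicLong count = new AtomicLong();
  private volatile long lastLogged = System.currentTimeMillis();

  // Called instead of logging a full stack trace for every unknown application.
  void onNotFound(String appId) {
    count.incrementAndGet();
    long now = System.currentTimeMillis();
    if (now - lastLogged >= INTERVAL_MS) {
      LOG.info("{} getApplicationReport calls for unknown applications in the "
          + "last {} ms (latest: {})", count.getAndSet(0), now - lastLogged, appId);
      lastLogged = now;
    }
  }
}
{code}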


> Summarize ApplicationNotFoundException in the RM log
> 
>
> Key: YARN-5762
> URL: https://issues.apache.org/jira/browse/YARN-5762
> Project: Hadoop YARN
>  Issue Type: Task
>Affects Versions: 2.7.2
>Reporter: Ravi Prakash
>Assignee: Ravi Prakash
>Priority: Minor
> Attachments: YARN-5762.01.patch
>
>
> We found a lot of {{ApplicationNotFoundException}} in the RM logs. These were 
> most likely caused by the {{AggregatedLogDeletionService}} [which 
> checks|https://github.com/apache/hadoop/blob/262827cf75bf9c48cd95335eb04fd8ff1d64c538/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/logaggregation/AggregatedLogDeletionService.java#L156]
>  that the application is not running anymore. e.g.
> {code}2016-10-17 15:25:26,542 INFO org.apache.hadoop.ipc.Server: IPC Server 
> handler 20 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from :12205 Call#35401 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
> with id 'application_1473396553140_1451' doesn't exist in RM.
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> 2016-10-17 15:25:26,633 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 47 on 8032, call 
> org.apache.hadoop.yarn.api.ApplicationClientProtocolPB.getApplicationReport 
> from :12205 Call#35404 Retry#0
> org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application 
> with id 'application_1473396553140_1452' doesn't exist in RM.
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:327)
> at 
> org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:175)
> at 
> org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:417)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> {code}






[jira] [Commented] (YARN-5864) Capacity Scheduler preemption for fragmented cluster

2016-11-15 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15669047#comment-15669047
 ] 

Carlo Curino commented on YARN-5864:


[~wangda] I think we are on the same page on the problem side, and I agree that 
the scheduling invariants (that were once hard constraints) will eventually 
look more like soft constraints, which we aim to meet/maximize but are ok to 
compromise on in some cases.

Understanding how to trade one for the other, or how to make decisions that 
maximize the number/amount of met constraints, is the hard problem. To this 
purpose I would argue that (2) is structurally better positioned to capture all 
the tradeoffs in a compact and easy-to-understand way than any combination of 
heuristics. That said, how to design (2) in a scalable/fast way is an open 
problem (an interesting direction recently appeared in OSDI 2016, 
http://www.firmament.io/; while it is not enough by itself, it has some good 
ideas we could leverage). So I am proposing it more as a north star than as a 
short-term proposal for how to tackle this JIRA (or the scheduler issues in 
general). On the other hand, (1) is an ongoing activity we can start right 
away, and we should do it regardless of whether we eventually manage to do 
something like (2) or not.

Regarding abuses/scope of the feature: I am certain that the initial scenarios 
you are designing for have all the right properties to be 
safe/reasonable/trusted, but once the feature is out there, people will start 
using it in the most baroque ways, and some of the issues I alluded to might 
come up. Having very crisply defined semantics, configuration-validation 
mechanics (that prevent the worst configuration mistakes), and very tight unit 
tests are probably our best line of defense.



> Capacity Scheduler preemption for fragmented cluster 
> -
>
> Key: YARN-5864
> URL: https://issues.apache.org/jira/browse/YARN-5864
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-5864.poc-0.patch
>
>
> YARN-4390 added preemption for reserved containers. However, we found one case 
> where a large container cannot be allocated even though all queues are under 
> their limits.
> For example, we have:
> {code}
> Two queues, a and b, capacity 50:50 
> Two nodes: n1 and n2, each of them have 50 resource 
> Now queue-a uses 10 on n1 and 10 on n2
> queue-b asks for one single container with resource=45. 
> {code} 
> The container could be reserved on either host, but each node has only 40 
> resources free, so it can never be allocated, and no preemption will happen 
> because all queues are under their limits.






[jira] [Updated] (YARN-5885) Cleanup YARN-4752 for merge

2016-11-15 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-5885:
---
Attachment: yarn-5885.2.patch

> Cleanup YARN-4752 for merge
> ---
>
> Key: YARN-5885
> URL: https://issues.apache.org/jira/browse/YARN-5885
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-5885.1.patch, yarn-5885.2.patch
>
>
> JIRA to track changes necessary for branch merge. These include:
> # Remove names from TODOs (e.g. KK) and add JIRA numbers for follow-up work.
> # Fix tests that have been commented out in earlier patches on the branch.
> # Double check method and field visibility of newly added code.






[jira] [Updated] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-15 Thread Miklos Szegedi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-5600:
-
Attachment: YARN-5600.014.patch

Fixing checkstyle and unit test

> Add a parameter to ContainerLaunchContext to emulate 
> yarn.nodemanager.delete.debug-delay-sec on a per-application basis
> ---
>
> Key: YARN-5600
> URL: https://issues.apache.org/jira/browse/YARN-5600
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
>  Labels: oct16-medium
> Attachments: YARN-5600.000.patch, YARN-5600.001.patch, 
> YARN-5600.002.patch, YARN-5600.003.patch, YARN-5600.004.patch, 
> YARN-5600.005.patch, YARN-5600.006.patch, YARN-5600.007.patch, 
> YARN-5600.008.patch, YARN-5600.009.patch, YARN-5600.010.patch, 
> YARN-5600.011.patch, YARN-5600.012.patch, YARN-5600.013.patch, 
> YARN-5600.014.patch
>
>
> To make debugging application launch failures simpler, I'd like to add a 
> parameter to the CLC to allow an application owner to request delayed 
> deletion of the application's launch artifacts.
> This JIRA solves largely the same problem as YARN-5599, but for cases where 
> ATS is not in use, e.g. branch-2.
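A hypothetical illustration of the requested knob (the key name below is invented, not the parameter defined by the patch): an application owner tags the ContainerLaunchContext it submits to ask that launch artifacts be kept around for debugging.

{code}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;

final class DebugDelayExample {
  // Illustrative only: request delayed deletion via an env entry on the CLC.
  static ContainerLaunchContext withDebugDelay(ContainerLaunchContext clc, int seconds) {
    Map<String, String> env = new HashMap<>(clc.getEnvironment());
    env.put("YARN_CONTAINER_DEBUG_DELAY_SEC", String.valueOf(seconds));  // hypothetical key
    clc.setEnvironment(env);
    return clc;
  }
}
{code}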






[jira] [Commented] (YARN-5669) Add support for Docker pull

2016-11-15 Thread Zhankun Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668968#comment-15668968
 ] 

Zhankun Tang commented on YARN-5669:


[~sidharta-s], YARN-3854, YARN-5669 and YARN-5670 are the JIRAs needed for 
Docker image localization.

> Add support for Docker pull
> ---
>
> Key: YARN-5669
> URL: https://issues.apache.org/jira/browse/YARN-5669
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Zhankun Tang
>Assignee: luhuichun
> Attachments: YARN-5669.001.patch
>
>
> We need to add docker pull to support Docker image localization. Refer to 
> YARN-3854 for the details. 






[jira] [Commented] (YARN-5881) Enable configuration of queue capacity in terms of absolute resources

2016-11-15 Thread Carlo Curino (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668964#comment-15668964
 ] 

Carlo Curino commented on YARN-5881:


[~seanpo03] thanks for raising this JIRA. This is indeed a very important 
topic. Besides cloud settings, where users clearly care about getting the fixed 
amount of resources they paid for and not a relative amount, this is also 
important in some on-prem settings, where certain production queues have fixed 
jobs running on a schedule that must run (mission critical) and require a fixed 
amount of resources. In the reservation subsystem, as you know, we deal with 
this as reservations (dynamic leaf queues), but the same concept should be 
applied more generally to the queue structure. This will also be important to 
support services with gang semantics.

Besides the general engineering involved, I see a non-trivial issue related to 
what to do when capacity fluctuates up/down. I assume you will have buffers to 
accommodate modest fluctuations, but what happens if we lose enough capacity to 
drop below the total of the absolutely configured queues? You could prioritize 
certain queues over others, uniformly shrink all queues, etc. 
A few questions to answer:
 # do we allow for a mix of absolutely and relatively configured queues? 
 # how are capacity fluctuations managed?
 # how are "over-capacity" resources distributed? (I can imagine instantaneously 
casting both capacities into the relative domain and performing the standard 
calculations; a small sketch of this conversion follows below)
 # same as above for preemption actions.
 # can we do this cleanly in CapacityScheduler? (as I mention in other JIRAs, 
the interaction between many of the tunables is becoming very unclear)

Overall I think this is very important, and even solving part of the problem 
under some simplifying assumptions might be ok. 
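A small sketch of the conversion mentioned in question 3, with invented names: an absolutely configured capacity is cast to an instantaneous fraction of the current cluster resources before the usual relative-capacity math runs.

{code}
final class AbsoluteToRelative {
  // Illustrative only: e.g. 40960 MB on an 81920 MB cluster behaves as capacity 0.5.
  static float effectiveCapacity(long absoluteMemMb, long clusterMemMb) {
    if (clusterMemMb <= 0) {
      return 0f;
    }
    return Math.min(1f, (float) absoluteMemMb / clusterMemMb);
  }
}
{code}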




> Enable configuration of queue capacity in terms of absolute resources
> -
>
> Key: YARN-5881
> URL: https://issues.apache.org/jira/browse/YARN-5881
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Sean Po
>Assignee: Sean Po
>
> Currently, Yarn RM supports the configuration of queue capacity in terms of a 
> proportion to cluster capacity. In the context of Yarn being used as a public 
> cloud service, it makes more sense if queues can be configured absolutely. 
> This will allow administrators to set usage limits more concretely and 
> simplify customer expectations for cluster allocation.






[jira] [Commented] (YARN-5292) Support for PAUSED container state

2016-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668940#comment-15668940
 ] 

Hadoop QA commented on YARN-5292:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} 
| {color:red} YARN-5292 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-5292 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12838841/YARN-5292.002.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13933/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Support for PAUSED container state
> --
>
> Key: YARN-5292
> URL: https://issues.apache.org/jira/browse/YARN-5292
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Hitesh Sharma
>Assignee: Hitesh Sharma
> Attachments: YARN-5292.001.patch, YARN-5292.002.patch, yarn-5292.pdf
>
>
> YARN-2877 introduced OPPORTUNISTIC containers, and YARN-5216 proposes to add 
> capability to customize how OPPORTUNISTIC containers get preempted.
> In this JIRA we propose introducing a PAUSED container state.
> When a running container gets preempted, it enters the PAUSED state, where it 
> remains until resources get freed up on the node; the preempted container can 
> then resume to the running state.
>  
> One scenario where this capability is useful is work preservation. How 
> preemption is done, and whether the container supports it, is implementation 
> specific.
> For instance, if the container is a virtual machine, then preempt would pause 
> the VM and resume would restore it back to the running state.
> If the container doesn't support preemption, then preempt would default to 
> killing the container. 
>  
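A minimal sketch of the lifecycle described above (not the actual NM state machine): preemption pauses the container when the runtime supports it, falls back to killing it otherwise, and a paused container resumes once resources free up.

{code}
final class PausePreemptionSketch {
  enum CState { RUNNING, PAUSED, KILLED }

  static CState preempt(CState current, boolean supportsPause) {
    if (current != CState.RUNNING) {
      return current;                        // only running containers are preempted
    }
    return supportsPause ? CState.PAUSED : CState.KILLED;
  }

  static CState resume(CState current, boolean resourcesFreed) {
    // a paused container goes back to RUNNING once node resources free up
    return (current == CState.PAUSED && resourcesFreed) ? CState.RUNNING : current;
  }
}
{code}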






[jira] [Commented] (YARN-5669) Add support for Docker pull

2016-11-15 Thread Sidharta Seethana (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668932#comment-15668932
 ] 

Sidharta Seethana commented on YARN-5669:
-

Hi [~luhuichun], Isn't the scope of this patch to add localization support via 
docker pull? If this is not the case, this jira and YARN-5670 do not fully 
cover the changes needed to support docker image localization.  

thanks.

> Add support for Docker pull
> ---
>
> Key: YARN-5669
> URL: https://issues.apache.org/jira/browse/YARN-5669
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: yarn
>Reporter: Zhankun Tang
>Assignee: luhuichun
> Attachments: YARN-5669.001.patch
>
>
> We need to add docker pull to support Docker image localization. Refer to 
> YARN-3854 for the details. 






[jira] [Commented] (YARN-5885) Cleanup YARN-4752 for merge

2016-11-15 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668920#comment-15668920
 ] 

Daniel Templeton commented on YARN-5885:


Patch looks good.  On the first pass, I only see trivia:

* The javadoc for {{FSAppAttempt.getStarvation()}} should end with a period
* Javadoc for {{FSLeafQueue.isStarvedForMinShare()}}, {{isStarvedForFairShare()}}, 
and {{FSPreemptionThread.identifyContainersToPreempt()}} {{@return}} should not 
end with a period


> Cleanup YARN-4752 for merge
> ---
>
> Key: YARN-5885
> URL: https://issues.apache.org/jira/browse/YARN-5885
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-5885.1.patch
>
>
> JIRA to track changes necessary for branch merge. These include:
> # Remove names from TODOs (e.g. KK) and add JIRA numbers for follow-up work.
> # Fix tests that have been commented out in earlier patches on the branch.
> # Double check method and field visibility of newly added code.






[jira] [Commented] (YARN-1593) support out-of-proc AuxiliaryServices

2016-11-15 Thread Konstantinos Karanasos (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668916#comment-15668916
 ] 

Konstantinos Karanasos commented on YARN-1593:
--

Thanks for starting this! As [~asuresh] and [~hrsharma] pointed out, this is 
very related to the container pooling we have been thinking of, so it's great 
to see there is more work in this direction.

Here are some first thoughts:
- There seems to be a common need to have containers that do not belong to an 
AM. I like your analysis of the pros and cons of the three approaches. Ideally, 
and if possible, it would be good to agree on an approach that is not hybrid, 
i.e., to not have some containers going through option (1) and some others 
through option (3), but rather have a unified approach. In container pooling we 
have thought of having a component in the RM that manages how many "system" 
containers will be running on each node, but we are willing to adopt another 
approach if it is more suitable.
- Looking both at your document and the comments above, it seems that no 
approach can properly tackle the dependencies problem. Probably we should solve 
this in the scheduler: just like there will be support for (anti-)affinity 
constraints, we can add support for dependencies in the scheduler, e.g., to not 
schedule a container on a node before a shuffle container is running on that 
node.
- Although I like your proposal of using a new ExecutionType for the system 
containers, I am not sure it is always desirable to couple system containers 
with the highest-priority ExecutionType. For instance, there can be system 
containers that are not as important and can be preempted to make space if 
needed. Also, apart from the execution priority, I am not sure the 
ExecutionType should determine whether a container should be automatically 
relaunched. If we end up having a component managing those containers, maybe it 
is its role to determine whether they get restarted upon failure (irrespective 
of their ExecutionType).

> support out-of-proc AuxiliaryServices
> -
>
> Key: YARN-1593
> URL: https://issues.apache.org/jira/browse/YARN-1593
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, rolling upgrade
>Reporter: Ming Ma
>Assignee: Varun Vasudev
> Attachments: SystemContainersandSystemServices.pdf
>
>
> AuxiliaryServices such as ShuffleHandler currently run in the same process as 
> the NM. There are some benefits to hosting them in dedicated processes.
> 1. NM rolling restart. If we want to upgrade YARN, an NM restart will force a 
> ShuffleHandler restart. If ShuffleHandler runs as a separate process, it can 
> continue to run during the NM restart, and the NM can reconnect to the running 
> ShuffleHandler afterwards.
> 2. Resource management. It is possible that other types of AuxiliaryServices 
> will be implemented. AuxiliaryServices are considered YARN-application 
> specific and could consume lots of resources. Running AuxiliaryServices in 
> separate processes allows easier resource management; the NM could potentially 
> stop a specific AuxiliaryServices process from running if it consumes 
> resources way above its allocation.
> Here are some high level ideas:
> 1. NM provides a hosting process for each AuxiliaryService. Existing 
> AuxiliaryService API doesn't change.
> 2. The hosting process provides an RPC server for the AuxiliaryService proxy 
> object inside the NM to connect to (see the sketch after this description).
> 3. When we rolling restart NM, the existing AuxiliaryService processes will 
> continue to run. NM could reconnect to the running AuxiliaryService processes 
> upon restart.
> 4. Policy and resource management of AuxiliaryServices. So far we don't have 
> an immediate need for this. An AuxiliaryService could run inside a container, 
> its resource utilization could be taken into account by the RM, and the RM 
> could determine when a specific type of application overutilizes cluster 
> resources.
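Illustrative only, not an existing YARN interface: the "proxy object inside NM" idea from point 2 above, expressed as a minimal protocol the hosting process could expose over RPC.

{code}
interface AuxServiceProtocol {
  void initializeApplication(String applicationId);  // mirrors AuxiliaryService callbacks
  void stopApplication(String applicationId);
  byte[] getMetaData(String applicationId);          // e.g. shuffle port info returned to the AM
}
{code}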






[jira] [Commented] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668813#comment-15668813
 ] 

Hadoop QA commented on YARN-5600:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
36s{color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  5m 
13s{color} | {color:red} hadoop-yarn in trunk failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
35s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
11s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
34s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 57s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 1 new + 558 unchanged - 21 fixed = 559 total (was 579) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 41s{color} 
| {color:red} hadoop-yarn-api in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
37s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
46s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 67m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5600 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839076/YARN-5600.013.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux c43609932365 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / f121d0b |
| Default Java | 1.8.0_101 |
| compile | 

[jira] [Commented] (YARN-5836) Malicious AM can kill containers of other apps running in any node its containers are running

2016-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668811#comment-15668811
 ] 

Hadoop QA commented on YARN-5836:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 0 new + 196 unchanged - 6 fixed = 196 total (was 202) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
40s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 22s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5836 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839079/YARN-5836.v2.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 9c57046ec320 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 
20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / f121d0b |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13932/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13932/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Malicious AM can kill containers of other apps running in any node its 
> containers are running
> -
>
> Key: YARN-5836
> URL: https://issues.apache.org/jira/browse/YARN-5836
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: 

[jira] [Commented] (YARN-5634) Simplify initialization/use of RouterPolicy via a RouterPolicyFacade

2016-11-15 Thread Subru Krishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668732#comment-15668732
 ] 

Subru Krishnan commented on YARN-5634:
--

Thanks [~curino] for addressing my feedback. The latest patch LGTM. I have one 
concern though - we should synchronize 
{{RouterPolicyFacade::singlePolicyReinit}}.
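
To illustrate the concern, here is a minimal sketch of a synchronized reinit path. 
The class and method names follow the ones mentioned above; the fields and the 
policy instantiation are placeholders, not the actual YARN-2915 code.

{noformat}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for the real federation policy types.
interface RouterPolicy { }
class SubClusterPolicyConfiguration { /* per-queue serialized policy config */ }

public class RouterPolicyFacade {
  // Cache of per-queue policies, rebuilt whenever the configuration changes.
  private final Map<String, RouterPolicy> policyMap = new ConcurrentHashMap<>();

  // Synchronizing here ensures that two threads observing a configuration
  // change at the same time cannot interleave the "drop old policy / install
  // new policy" steps for the same queue.
  private synchronized void singlePolicyReinit(String queue,
      SubClusterPolicyConfiguration conf) {
    RouterPolicy fresh = instantiatePolicy(conf);
    policyMap.put(queue, fresh);
  }

  private RouterPolicy instantiatePolicy(SubClusterPolicyConfiguration conf) {
    // Placeholder: the real code would create the policy class named in the
    // configuration and initialize it with a federation policy context.
    return new RouterPolicy() { };
  }
}
{noformat}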

> Simplify initialization/use of RouterPolicy via a RouterPolicyFacade 
> -
>
> Key: YARN-5634
> URL: https://issues.apache.org/jira/browse/YARN-5634
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Affects Versions: YARN-2915
>Reporter: Carlo Curino
>Assignee: Carlo Curino
>  Labels: oct16-medium
> Attachments: YARN-5634-YARN-2915.01.patch, 
> YARN-5634-YARN-2915.02.patch, YARN-5634-YARN-2915.03.patch, 
> YARN-5634-YARN-2915.04.patch, YARN-5634-YARN-2915.05.patch, 
> YARN-5634-YARN-2915.06.patch
>
>
> The current set of policies require some machinery to (re)initialize based on 
> changes in the SubClusterPolicyConfiguration. This JIRA tracks the effort to 
> hide much of that behind a simple RouterPolicyFacade, making lifecycle and 
> usage of the policies easier to consumers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5287) LinuxContainerExecutor fails to set proper permission

2016-11-15 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668690#comment-15668690
 ] 

Andrew Wang commented on YARN-5287:
---

Thanks for the ping Brahma, seems like we're all on the same page here. As long 
as the revert JIRA summary mentions the JIRA being reverted, it should be clear 
to end-users.

> LinuxContainerExecutor fails to set proper permission
> -
>
> Key: YARN-5287
> URL: https://issues.apache.org/jira/browse/YARN-5287
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.2
>Reporter: Ying Zhang
>Assignee: Naganarasimha G R
>Priority: Minor
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: YARN-5287-tmp.patch, YARN-5287.003.patch, 
> YARN-5287.004.patch, YARN-5287.005.patch, YARN-5287.branch-2.001.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> LinuxContainerExecutor fails to set the proper permissions on the local 
> directories(i.e., /hadoop/yarn/local/usercache/... by default) if the cluster 
> has been configured with a restrictive umask, e.g.: umask 077. Job failed due 
> to the following reason:
> Path /hadoop/yarn/local/usercache/ambari-qa/appcache/application_ has 
> permission 700 but needs permission 750



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5836) Malicious AM can kill containers of other apps running in any node its containers are running

2016-11-15 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-5836:
---
Attachment: YARN-5836.v2.patch

> Malicious AM can kill containers of other apps running in any node its 
> containers are running
> -
>
> Key: YARN-5836
> URL: https://issues.apache.org/jira/browse/YARN-5836
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-5836.v1.patch, YARN-5836.v2.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> When the AM calls the NM via {{ContainerManagementProtocol}}, the NMToken is 
> supplied for authentication. The RPC server will verify the password of the 
> NMToken (originally generated by the RM) so that we know the content of the 
> NMTokenIdentifier is genuine. 
> Next, for {{stopContainers()}} and {{getContainerStatus()}}, the method 
> {{authorizeGetAndStopContainerRequest()}} is used to verify that the requested 
> containers do belong to the AM by comparing them against the AppId in the 
> NMTokenIdentifier. However, right now when the appId doesn't match, 
> {{authorizeGetAndStopContainerRequest()}} only logs a warning message and 
> continues to kill the container anyway. Overall, a malicious AM can kill 
> containers of other apps running on any node where its containers are running. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5136) Error in handling event type APP_ATTEMPT_REMOVED to the scheduler

2016-11-15 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668678#comment-15668678
 ] 

Wilfred Spiegelenburg commented on YARN-5136:
-

I was thrown off track a bit by all the changes that were made to the locking 
in the scheduler in YARN-3139.

After analysis it shows that the issue is not resolved yet and we have two 
situations that can cause the above-mentioned problem:
# if a call for a {{removeApplicationAttempt}} and a {{moveApplication}} for 
the same attempt are processed in that order in short succession, the 
application attempt will still contain a queue reference but is already removed 
from the list of applications for the queue
# if two calls to {{removeApplicationAttempt}} come in in short succession, the 
application will still contain a queue reference but is already removed from 
the list of applications for the queue

In both cases the 2nd call must come in before the {{removeApplication}} call 
is made.
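
A minimal sketch of the kind of defensive handling that would avoid the fatal 
IllegalStateException (the class and method names mirror the stack trace in the 
description; the bodies are simplified placeholders, not the actual 
FairScheduler code):

{noformat}
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the FSAppAttempt / FSLeafQueue types in the trace.
class FSAppAttempt { }

class FSLeafQueue {
  private final List<FSAppAttempt> runnableApps = new ArrayList<>();

  void addApp(FSAppAttempt app) {
    runnableApps.add(app);
  }

  // Returning a boolean instead of throwing lets the caller treat a
  // "remove after already removed" race as a no-op rather than a fatal error.
  boolean removeApp(FSAppAttempt app) {
    return runnableApps.remove(app);
  }
}

class FairSchedulerSketch {
  void removeApplicationAttempt(FSLeafQueue queue, FSAppAttempt attempt) {
    if (!queue.removeApp(attempt)) {
      // The attempt was already detached by an earlier remove/move event;
      // log and continue instead of exiting the ResourceManager.
      System.err.println("Attempt already removed from its queue, ignoring.");
    }
  }
}
{noformat}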

> Error in handling event type APP_ATTEMPT_REMOVED to the scheduler
> -
>
> Key: YARN-5136
> URL: https://issues.apache.org/jira/browse/YARN-5136
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 2.7.1
>Reporter: tangshangwen
>Assignee: Wilfred Spiegelenburg
>
> move app cause rm exit
> {noformat}
> 2016-05-24 23:20:47,202 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
> handling event type APP_ATTEMPT_REMOVED to the scheduler
> java.lang.IllegalStateException: Given app to remove 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt@ea94c3b
>  does not exist in queue [root.bdp_xx.bdp_mart_xx_formal, 
> demand=, running= vCores:13422>, share=, w= weight=1.0>]
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.removeApp(FSLeafQueue.java:119)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.removeApplicationAttempt(FairScheduler.java:779)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1231)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:114)
> at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:680)
> at java.lang.Thread.run(Thread.java:745)
> 2016-05-24 23:20:47,202 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_e04_1464073905025_15410_01_001759 Container Transitioned from 
> ACQUIRED to RELEASED
> 2016-05-24 23:20:47,202 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1593) support out-of-proc AuxiliaryServices

2016-11-15 Thread Hitesh Sharma (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668649#comment-15668649
 ] 

Hitesh Sharma commented on YARN-1593:
-

Thanks [~asuresh] for pointing to [YARN-5501]. Agree with you folks that there 
is some overlap and we will be happy to converge and discuss the best way to 
leverage the efforts here.

[~vvasudev], with regards to pooled containers, the behavior is to allow the NM 
to serve container requests even if the pre-initialized container is not ready. 
For container pooling this behavior makes sense, as we eventually want to 
advertise pre-initialized containers as a resource and have the AM ask for them. 

Regarding the 2nd point, the current implementation starts a fixed number of 
pre-initialized containers on each node (what to start, resources to localize, 
and other details are currently passed via config files). Eventually we intend 
the RM to pick the nodes where the pre-initialized containers should be 
started. This is something we are starting to work on.



> support out-of-proc AuxiliaryServices
> -
>
> Key: YARN-1593
> URL: https://issues.apache.org/jira/browse/YARN-1593
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, rolling upgrade
>Reporter: Ming Ma
>Assignee: Varun Vasudev
> Attachments: SystemContainersandSystemServices.pdf
>
>
> AuxiliaryServices such as ShuffleHandler currently run in the same process as 
> NM. There are some benefits to host them in dedicated processes.
> 1. NM rolling restart. If we want to upgrade YARN , NM restart will force the 
> ShuffleHandler restart. If ShuffleHandler runs as a separate process, 
> ShuffleHandler can continue to run during NM restart. NM can reconnect the 
> the running ShuffleHandler after restart.
> 2. Resource management. It is possible another type of AuxiliaryServices will 
> be implemented. AuxiliaryServices are considered YARN application specific 
> and could consume lots of resources. Running AuxiliaryServices in separate 
> processes allow easier resource management. NM could potentially stop a 
> specific AuxiliaryServices process from running if it consumes resource way 
> above its allocation.
> Here are some high level ideas:
> 1. NM provides a hosting process for each AuxiliaryService. Existing 
> AuxiliaryService API doesn't change.
> 2. The hosting process provides RPC server for AuxiliaryService proxy object 
> inside NM to connect to.
> 3. When we rolling restart NM, the existing AuxiliaryService processes will 
> continue to run. NM could reconnect to the running AuxiliaryService processes 
> upon restart.
> 4. Policy and resource management of AuxiliaryServices. So far we don't have 
> immediate need for this. AuxiliaryService could run inside a container and 
> its resource utilization could be taken into account by RM and RM could 
> consider a specific type of applications overutilize cluster resource.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5836) Malicious AM can kill containers of other apps running in any node its containers are running

2016-11-15 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-5836:
---
Attachment: (was: YARN-5836.v2.patch)

> Malicious AM can kill containers of other apps running in any node its 
> containers are running
> -
>
> Key: YARN-5836
> URL: https://issues.apache.org/jira/browse/YARN-5836
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-5836.v1.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> When the AM calls the NM via {{ContainerManagementProtocol}}, the NMToken is 
> supplied for authentication. The RPC server will verify the password of the 
> NMToken (originally generated by the RM) so that we know the content of the 
> NMTokenIdentifier is genuine. 
> Next, for {{stopContainers()}} and {{getContainerStatus()}}, the method 
> {{authorizeGetAndStopContainerRequest()}} is used to verify that the requested 
> containers do belong to the AM by comparing them against the AppId in the 
> NMTokenIdentifier. However, right now when the appId doesn't match, 
> {{authorizeGetAndStopContainerRequest()}} only logs a warning message and 
> continues to kill the container anyway. Overall, a malicious AM can kill 
> containers of other apps running on any node where its containers are running. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-15 Thread Miklos Szegedi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-5600:
-
Attachment: YARN-5600.013.patch

Rebasing patch.

> Add a parameter to ContainerLaunchContext to emulate 
> yarn.nodemanager.delete.debug-delay-sec on a per-application basis
> ---
>
> Key: YARN-5600
> URL: https://issues.apache.org/jira/browse/YARN-5600
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
>  Labels: oct16-medium
> Attachments: YARN-5600.000.patch, YARN-5600.001.patch, 
> YARN-5600.002.patch, YARN-5600.003.patch, YARN-5600.004.patch, 
> YARN-5600.005.patch, YARN-5600.006.patch, YARN-5600.007.patch, 
> YARN-5600.008.patch, YARN-5600.009.patch, YARN-5600.010.patch, 
> YARN-5600.011.patch, YARN-5600.012.patch, YARN-5600.013.patch
>
>
> To make debugging application launch failures simpler, I'd like to add a 
> parameter to the CLC to allow an application owner to request delayed 
> deletion of the application's launch artifacts.
> This JIRA solves largely the same problem as YARN-5599, but for cases where 
> ATS is not in use, e.g. branch-2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5280) Allow YARN containers to run with Java Security Manager

2016-11-15 Thread Greg Phillips (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668629#comment-15668629
 ] 

Greg Phillips commented on YARN-5280:
-

[~vvasudev] - thanks for the guidance.  I have definitely run out of space in 
the hadoop tmp dir in the past, and I completely agree that storing the 
java.policy in the container private directory is a better solution.  I have 
made that modification, and I am currently testing it.  For debugging purposes 
users can inspect the generated java.policy file from within their application 
using System.getSecurityManager(), or by providing client arguments for 
security manager debugging.  I will include notes on this in the javadoc, and 
in future feature documentation.

The difficulty arises when moving the functionality from prepareContainer to 
launchContainer.  In particular, I need to modify the actual java run command 
instead of the container launch command.  The only way I have found to modify 
the run command within launch_container.sh is through 
{{LinuxContainerExecutor#writeLaunchEnv}}.  A method which links the 
LinuxContainerExecutor with the ContainerRuntime prior to the environment being 
written seems necessary for this feature.  I am very interested in your 
thoughts on this matter.
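
For the record, here is a rough sketch of the kind of rewrite of the java run 
command this feature needs. The flags are the standard JVM security-manager 
options; the policy path and class names are placeholders for illustration, not 
the actual patch code.

{noformat}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: rewrites a container's java command so the JVM starts
// with the Security Manager enabled and pointed at a generated policy file.
public class SandboxCommandRewriter {

  static List<String> addSecurityManagerArgs(List<String> javaCommand,
      String policyFilePath) {
    List<String> rewritten = new ArrayList<>();
    for (String token : javaCommand) {
      rewritten.add(token);
      if (token.endsWith("java")) {
        // Standard JVM flags: enable the Security Manager and point it at
        // the per-container policy file.
        rewritten.add("-Djava.security.manager");
        rewritten.add("-Djava.security.policy=" + policyFilePath);
      }
    }
    return rewritten;
  }

  public static void main(String[] args) {
    List<String> cmd = Arrays.asList("java", "-Xmx1g", "org.example.AppMaster");
    // The policy path is a placeholder for the container private directory.
    System.out.println(
        addSecurityManagerArgs(cmd, "/path/to/container_private/java.policy"));
  }
}
{noformat}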

> Allow YARN containers to run with Java Security Manager
> ---
>
> Key: YARN-5280
> URL: https://issues.apache.org/jira/browse/YARN-5280
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager, yarn
>Affects Versions: 2.6.4
>Reporter: Greg Phillips
>Assignee: Greg Phillips
>Priority: Minor
>  Labels: oct16-medium
> Attachments: YARN-5280.001.patch, YARN-5280.002.patch, 
> YARN-5280.003.patch, YARN-5280.004.patch, YARN-5280.patch, 
> YARNContainerSandbox.pdf
>
>
> YARN applications have the ability to perform privileged actions which have 
> the potential to add instability into the cluster. The Java Security Manager 
> can be used to prevent users from running privileged actions while still 
> allowing their core data processing use cases. 
> Introduce a YARN flag which will allow a Hadoop administrator to enable the 
> Java Security Manager for user code, while still providing complete 
> permissions to core Hadoop libraries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5836) Malicious AM can kill containers of other apps running in any node its containers are running

2016-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668632#comment-15668632
 ] 

Hadoop QA commented on YARN-5836:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} YARN-5836 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-5836 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839074/YARN-5836.v2.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13930/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Malicious AM can kill containers of other apps running in any node its 
> containers are running
> -
>
> Key: YARN-5836
> URL: https://issues.apache.org/jira/browse/YARN-5836
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-5836.v1.patch, YARN-5836.v2.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> When the AM calls the NM via {{ContainerManagementProtocol}}, the NMToken is 
> supplied for authentication. The RPC server will verify the password of the 
> NMToken (originally generated by the RM) so that we know the content of the 
> NMTokenIdentifier is genuine. 
> Next, for {{stopContainers()}} and {{getContainerStatus()}}, the method 
> {{authorizeGetAndStopContainerRequest()}} is used to verify that the requested 
> containers do belong to the AM by comparing them against the AppId in the 
> NMTokenIdentifier. However, right now when the appId doesn't match, 
> {{authorizeGetAndStopContainerRequest()}} only logs a warning message and 
> continues to kill the container anyway. Overall, a malicious AM can kill 
> containers of other apps running on any node where its containers are running. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5836) Malicious AM can kill containers of other apps running in any node its containers are running

2016-11-15 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-5836:
---
Attachment: YARN-5836.v2.patch

> Malicious AM can kill containers of other apps running in any node its 
> containers are running
> -
>
> Key: YARN-5836
> URL: https://issues.apache.org/jira/browse/YARN-5836
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-5836.v1.patch, YARN-5836.v2.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> When the AM calls the NM via {{ContainerManagementProtocol}}, the NMToken is 
> supplied for authentication. The RPC server will verify the password of the 
> NMToken (originally generated by the RM) so that we know the content of the 
> NMTokenIdentifier is genuine. 
> Next, for {{stopContainers()}} and {{getContainerStatus()}}, the method 
> {{authorizeGetAndStopContainerRequest()}} is used to verify that the requested 
> containers do belong to the AM by comparing them against the AppId in the 
> NMTokenIdentifier. However, right now when the appId doesn't match, 
> {{authorizeGetAndStopContainerRequest()}} only logs a warning message and 
> continues to kill the container anyway. Overall, a malicious AM can kill 
> containers of other apps running on any node where its containers are running. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5814) Add druid as storage backend in YARN Timeline Service

2016-11-15 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668610#comment-15668610
 ] 

Sangjin Lee commented on YARN-5814:
---

Thanks [~BINGXUE QIU] for adding more details. It sounds like a good starting 
point. A couple of points.

It would be great to see some more details on the reader side of things. In the 
case of HBase/Phoenix, more sophisticated analytical queries are going to be 
based on the Phoenix (SQL) schema. In the case of Druid, it would be 
interesting to see some examples served straight out of the same schema.

Also, with readers we do a fair amount of predicate (filter) pushdown. We have 
the set of timeline service filters which then get translated to HBase filters 
and are pushed down to the queries. Have you looked at what kind of predicate 
pushdown would be feasible in case of Druid?

I'd also like to bring your attention to the entity id prefix we just 
introduced (YARN-5715) to solve the problem of more natural sort/selection 
order. The lexicographical sort order based on the entity ids is going to fall 
short of expectations, hence the introduction of the entity id prefix. 
You might want to take a look at that, and see how that would translate to the 
Druid schema.

I'll comment more if I have more things to think about.

>  Add druid as storage backend in YARN Timeline Service
> --
>
> Key: YARN-5814
> URL: https://issues.apache.org/jira/browse/YARN-5814
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: ATSv2
>Affects Versions: 3.0.0-alpha2
>Reporter: Bingxue Qiu
> Attachments: Add-Druid-in-YARN-Timeline-Service.pdf
>
>
> h3. Introduction
> I propose to add druid as storage backend in YARN Timeline Service.
> We run more than 6000 applications and generate 450 million metrics daily in 
> Alibaba Clusters with thousands of nodes. We need to collect and store 
> meta/events/metrics data, online analyze the utilization reports of various 
> dimensions and display the trends of allocation/usage resources for cluster 
> by joining and aggregating data. It helps us to manage and optimize the 
> cluster by tracking resource utilization.
> To achieve our goal we have changed to use druid as the storage instead of 
> HBase and have achieved sub-second OLAP performance in our production 
> environment for few months. 
> h3. Analysis
> Currently YARN Timeline Service only supports aggregating metrics at a) flow 
> level by FlowRunCoprocessor and b) application level metrics aggregating by 
> AppLevelTimelineCollector, offline (time-based periodic) aggregation for 
> flows/users/queues for reporting and analysis is planned but not yet 
> implemented. YARN Timeline Service chooses Apache HBase as the primary 
> storage backend. As we all know that HBase doesn't fit for OLAP.
>  For arbitrary exploration of data,such as online analyze the utilization 
> reports of various dimensions(Queue,Flow,Users,Application,CPU,Memory) by 
> joining and aggregating data, Druid's custom column format enables ad-hoc 
> queries without pre-computation. The format also enables fast scans on 
> columns, which is important for good aggregation performance.
> To achieve our goal that support to online analyze the utilization reports of 
> various dimensions, display the variation trends of allocation/usage 
> resources for cluster, and arbitrary exploration of data, we propose to add 
> druid storage and implement DruidWriter /DruidReader in YARN Timeline Service.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5875) TestTokenClientRMService#testTokenRenewalWrongUser fails

2016-11-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668575#comment-15668575
 ] 

Hudson commented on YARN-5875:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10842 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10842/])
YARN-5875. TestTokenClientRMService#testTokenRenewalWrongUser fails. (xiao: rev 
f121d0b036fe031dd24f2f549ae5729304bfa59c)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestTokenClientRMService.java


> TestTokenClientRMService#testTokenRenewalWrongUser fails
> 
>
> Key: YARN-5875
> URL: https://issues.apache.org/jira/browse/YARN-5875
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Varun Saxena
>Assignee: Gergely Novák
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: YARN-5875.001.patch
>
>
> {noformat}
> Running org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.983 sec <<< 
> FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService
> testTokenRenewalWrongUser(org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService)
>   Time elapsed: 0.015 sec  <<< FAILURE!
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService$3.run(TestTokenClientRMService.java:125)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService$3.run(TestTokenClientRMService.java:118)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService.testTokenRenewalWrongUser(TestTokenClientRMService.java:118)
> Results :
> Failed tests: 
>   TestTokenClientRMService.testTokenRenewalWrongUser:118 null
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668543#comment-15668543
 ] 

Hadoop QA commented on YARN-5600:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} YARN-5600 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-5600 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839056/YARN-5600.012.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13929/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add a parameter to ContainerLaunchContext to emulate 
> yarn.nodemanager.delete.debug-delay-sec on a per-application basis
> ---
>
> Key: YARN-5600
> URL: https://issues.apache.org/jira/browse/YARN-5600
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
>  Labels: oct16-medium
> Attachments: YARN-5600.000.patch, YARN-5600.001.patch, 
> YARN-5600.002.patch, YARN-5600.003.patch, YARN-5600.004.patch, 
> YARN-5600.005.patch, YARN-5600.006.patch, YARN-5600.007.patch, 
> YARN-5600.008.patch, YARN-5600.009.patch, YARN-5600.010.patch, 
> YARN-5600.011.patch, YARN-5600.012.patch
>
>
> To make debugging application launch failures simpler, I'd like to add a 
> parameter to the CLC to allow an application owner to request delayed 
> deletion of the application's launch artifacts.
> This JIRA solves largely the same problem as YARN-5599, but for cases where 
> ATS is not in use, e.g. branch-2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-15 Thread Miklos Szegedi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-5600:
-
Attachment: YARN-5600.012.patch

Patch addressing comments of [~vvasudev]

> Add a parameter to ContainerLaunchContext to emulate 
> yarn.nodemanager.delete.debug-delay-sec on a per-application basis
> ---
>
> Key: YARN-5600
> URL: https://issues.apache.org/jira/browse/YARN-5600
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
>  Labels: oct16-medium
> Attachments: YARN-5600.000.patch, YARN-5600.001.patch, 
> YARN-5600.002.patch, YARN-5600.003.patch, YARN-5600.004.patch, 
> YARN-5600.005.patch, YARN-5600.006.patch, YARN-5600.007.patch, 
> YARN-5600.008.patch, YARN-5600.009.patch, YARN-5600.010.patch, 
> YARN-5600.011.patch, YARN-5600.012.patch
>
>
> To make debugging application launch failures simpler, I'd like to add a 
> parameter to the CLC to allow an application owner to request delayed 
> deletion of the application's launch artifacts.
> This JIRA solves largely the same problem as YARN-5599, but for cases where 
> ATS is not in use, e.g. branch-2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5713) Update jackson from 1.9.13 to 2.x in hadoop-yarn

2016-11-15 Thread Akira Ajisaka (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668520#comment-15668520
 ] 

Akira Ajisaka commented on YARN-5713:
-

Sure!

> Update jackson from 1.9.13 to 2.x in hadoop-yarn
> 
>
> Key: YARN-5713
> URL: https://issues.apache.org/jira/browse/YARN-5713
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: build, timelineserver
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>  Labels: oct16-medium
> Attachments: HADOOP-13677.01.patch, HADOOP-13677.02.patch, 
> YARN-5713.03.patch, YARN-5713.04.patch
>
>
> Sub-task of HADOOP-13332.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5875) TestTokenClientRMService#testTokenRenewalWrongUser fails

2016-11-15 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated YARN-5875:

Summary: TestTokenClientRMService#testTokenRenewalWrongUser fails  (was: 
TestTokenClientRMService#testTokenRenewalWrongUser fails on trunk)

> TestTokenClientRMService#testTokenRenewalWrongUser fails
> 
>
> Key: YARN-5875
> URL: https://issues.apache.org/jira/browse/YARN-5875
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Varun Saxena
>Assignee: Gergely Novák
> Attachments: YARN-5875.001.patch
>
>
> {noformat}
> Running org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.983 sec <<< 
> FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService
> testTokenRenewalWrongUser(org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService)
>   Time elapsed: 0.015 sec  <<< FAILURE!
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService$3.run(TestTokenClientRMService.java:125)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService$3.run(TestTokenClientRMService.java:118)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService.testTokenRenewalWrongUser(TestTokenClientRMService.java:118)
> Results :
> Failed tests: 
>   TestTokenClientRMService.testTokenRenewalWrongUser:118 null
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1593) support out-of-proc AuxiliaryServices

2016-11-15 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668499#comment-15668499
 ] 

Sangjin Lee commented on YARN-1593:
---

Thanks for starting the proposal!

I took a quick look at it, and here are some of my initial thoughts (maybe more 
later).

One use case that is not mentioned is the timeline service v.2 collector 
(writer). We can think of it in two possible approaches: (1) another system 
container/service that needs to be launched on every node before NM can serve 
containers, or (2) system container that can be started on demand when an app 
is started (one container per app). I think (1) fits nicely with the system 
container you're envisioning. (2) is much more dynamic than any of the 
approaches discussed in the doc. FYI.

I think discovery is going to be one major piece that needs to be addressed 
from the beginning. Even in the most basic use cases (e.g. MR shuffle handler, 
timeline reader, etc.), the discoverability of the containers and their 
endpoints is hugely important. It would be great if it is addressed in the 
first design.

I also agree that localization is going to be a problem, and I think it's going 
to be an issue no matter which option you take. If the system container needs 
to run as long as the node is up, it's hard to avoid the issue of localization 
unless you pre-deliver the bits as part of setting up the nodes.

In terms of the approaches, I lean slightly towards (3). It feels awkward to 
treat it as "just another app" as they have different semantics from any other 
app. If we're elevating the notion of the system containers to first class, we 
might as well be explicit while still trying to reuse a lot of the pieces for 
implementation. That's my 2 cents.

One question: what do we do with the resource utilization of these system 
containers? Should they be reported just like any container (I'm thinking of 
{{ContainersMonitorImpl}}, {{NMTimelinePublisher}} and so on)? Or should they 
be considered outside the monitoring scope, like a YARN daemon today? Have you 
thought about that?

> support out-of-proc AuxiliaryServices
> -
>
> Key: YARN-1593
> URL: https://issues.apache.org/jira/browse/YARN-1593
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, rolling upgrade
>Reporter: Ming Ma
>Assignee: Varun Vasudev
> Attachments: SystemContainersandSystemServices.pdf
>
>
> AuxiliaryServices such as ShuffleHandler currently run in the same process as 
> NM. There are some benefits to host them in dedicated processes.
> 1. NM rolling restart. If we want to upgrade YARN , NM restart will force the 
> ShuffleHandler restart. If ShuffleHandler runs as a separate process, 
> ShuffleHandler can continue to run during NM restart. NM can reconnect the 
> the running ShuffleHandler after restart.
> 2. Resource management. It is possible another type of AuxiliaryServices will 
> be implemented. AuxiliaryServices are considered YARN application specific 
> and could consume lots of resources. Running AuxiliaryServices in separate 
> processes allow easier resource management. NM could potentially stop a 
> specific AuxiliaryServices process from running if it consumes resource way 
> above its allocation.
> Here are some high level ideas:
> 1. NM provides a hosting process for each AuxiliaryService. Existing 
> AuxiliaryService API doesn't change.
> 2. The hosting process provides RPC server for AuxiliaryService proxy object 
> inside NM to connect to.
> 3. When we rolling restart NM, the existing AuxiliaryService processes will 
> continue to run. NM could reconnect to the running AuxiliaryService processes 
> upon restart.
> 4. Policy and resource management of AuxiliaryServices. So far we don't have 
> immediate need for this. AuxiliaryService could run inside a container and 
> its resource utilization could be taken into account by RM and RM could 
> consider a specific type of applications overutilize cluster resource.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5875) TestTokenClientRMService#testTokenRenewalWrongUser fails on trunk

2016-11-15 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668500#comment-15668500
 ] 

Xiao Chen commented on YARN-5875:
-

+1 committing this. Thanks [~varun_saxena] for reporting the issue and 
[~GergelyNovak] for the fix!

> TestTokenClientRMService#testTokenRenewalWrongUser fails on trunk
> -
>
> Key: YARN-5875
> URL: https://issues.apache.org/jira/browse/YARN-5875
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Varun Saxena
>Assignee: Gergely Novák
> Attachments: YARN-5875.001.patch
>
>
> {noformat}
> Running org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.983 sec <<< 
> FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService
> testTokenRenewalWrongUser(org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService)
>   Time elapsed: 0.015 sec  <<< FAILURE!
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService$3.run(TestTokenClientRMService.java:125)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService$3.run(TestTokenClientRMService.java:118)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService.testTokenRenewalWrongUser(TestTokenClientRMService.java:118)
> Results :
> Failed tests: 
>   TestTokenClientRMService.testTokenRenewalWrongUser:118 null
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5600) Add a parameter to ContainerLaunchContext to emulate yarn.nodemanager.delete.debug-delay-sec on a per-application basis

2016-11-15 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668474#comment-15668474
 ] 

Miklos Szegedi commented on YARN-5600:
--

Thank you, [~vvasudev] for the comments! I addressed all the issues that you 
raised. I will send out the new patch shortly.
"Given that it was undocumented, I think it’s fine to break it in trunk. Maybe 
in branch-2 we leave as it was but undocumented as earlier?"
We can do that. Since we are talking about a debug feature we may even be able 
to backport the whole change as it is. What do you think?

> Add a parameter to ContainerLaunchContext to emulate 
> yarn.nodemanager.delete.debug-delay-sec on a per-application basis
> ---
>
> Key: YARN-5600
> URL: https://issues.apache.org/jira/browse/YARN-5600
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.0.0-alpha1
>Reporter: Daniel Templeton
>Assignee: Miklos Szegedi
>  Labels: oct16-medium
> Attachments: YARN-5600.000.patch, YARN-5600.001.patch, 
> YARN-5600.002.patch, YARN-5600.003.patch, YARN-5600.004.patch, 
> YARN-5600.005.patch, YARN-5600.006.patch, YARN-5600.007.patch, 
> YARN-5600.008.patch, YARN-5600.009.patch, YARN-5600.010.patch, 
> YARN-5600.011.patch
>
>
> To make debugging application launch failures simpler, I'd like to add a 
> parameter to the CLC to allow an application owner to request delayed 
> deletion of the application's launch artifacts.
> This JIRA solves largely the same problem as YARN-5599, but for cases where 
> ATS is not in use, e.g. branch-2.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3053) [Security] Review and implement security in ATS v.2

2016-11-15 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668416#comment-15668416
 ] 

Li Lu commented on YARN-3053:
-

bq. Can we capture that aspect as a future work as part of implementing the 
timeline collector as a full user container?
Sure. For now let's make the current (aux service) based model work with 
security. We may do a slight extension to allow collectors in a separate 
process to also work if it's low-hanging fruit. 

> [Security] Review and implement security in ATS v.2
> ---
>
> Key: YARN-3053
> URL: https://issues.apache.org/jira/browse/YARN-3053
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
>  Labels: YARN-5355
> Attachments: ATSv2Authentication(draft).pdf
>
>
> Per design in YARN-2928, we want to evaluate and review the system for 
> security, and ensure proper security in the system.
> This includes proper authentication, token management, access control, and 
> any other relevant security aspects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5736) YARN container executor config does not handle white space

2016-11-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668391#comment-15668391
 ] 

Hudson commented on YARN-5736:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10841 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10841/])
YARN-5736 Addendum. Fixes segfault due to unterminated string. (templedf: rev 
264ddb13ff7455282fb640b6ff6c565adddea44e)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/configuration.c


> YARN container executor config does not handle white space
> --
>
> Key: YARN-5736
> URL: https://issues.apache.org/jira/browse/YARN-5736
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Trivial
>  Labels: oct16-medium
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-5736.001.patch, YARN-5736.002.patch, 
> YARN-5736.addendum.000.patch, YARN_5736.000.patch
>
>
> The container executor configuration reader does not handle white space or 
> malformed key-value pairs in the config file correctly or gracefully.
> As an example, take the following key-value line, which is part of the 
> configuration (note that << is used as a marker to show the extra trailing 
> space):
> yarn.nodemanager.linux-container-executor.group=yarn <<
> This is a valid line, but when you run the check over the file:
> [root@test]#./container-executor --checksetup
> Can't get group information for yarn - Success.
> [root@test]#
> It fails to find the yarn group because it really tries to find the "yarn " 
> group, which fails. There is no trimming anywhere while processing the lines. 
> If a space were added before or after the = sign, a failure would also occur.
> A minor nit is that a failure is still logged as a Success.
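
The fix itself lives in the native container-executor (configuration.c, edited 
by the commit above); purely for illustration, the trimming behavior it needs 
looks like the following Java sketch, where a trailing space after the group 
name no longer leaks into the value:

{noformat}
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Illustration only: the real parser is C code in configuration.c. This just
// shows key/value trimming so "group=yarn " resolves to the group "yarn".
public class KeyValueTrim {

  static Map.Entry<String, String> parseLine(String line) {
    int eq = line.indexOf('=');
    if (eq < 0) {
      return null; // malformed line without a key=value separator
    }
    // Trim whitespace around both the key and the value so a trailing space
    // does not become part of the configured group name.
    String key = line.substring(0, eq).trim();
    String value = line.substring(eq + 1).trim();
    return new SimpleEntry<>(key, value);
  }

  public static void main(String[] args) {
    System.out.println(parseLine(
        "yarn.nodemanager.linux-container-executor.group=yarn "));
  }
}
{noformat}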



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5836) Malicious AM can kill containers of other apps running in any node its containers are running

2016-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668353#comment-15668353
 ] 

Hadoop QA commented on YARN-5836:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager:
 The patch generated 0 new + 191 unchanged - 6 fixed = 191 total (was 197) 
{color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 12m 32s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 28m 43s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.nodemanager.TestContainerManagerWithLCE |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5836 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839034/YARN-5836.v1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 5eb7488cebe7 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 
21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 5af572b |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-YARN-Build/13928/artifact/patchprocess/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13928/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13928/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Malicious AM can kill containers of other apps running in any node its 
> containers are 

[jira] [Commented] (YARN-5736) YARN container executor config does not handle white space

2016-11-15 Thread Daniel Templeton (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668308#comment-15668308
 ] 

Daniel Templeton commented on YARN-5736:


+1  Checking in.

> YARN container executor config does not handle white space
> --
>
> Key: YARN-5736
> URL: https://issues.apache.org/jira/browse/YARN-5736
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Trivial
>  Labels: oct16-medium
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-5736.001.patch, YARN-5736.002.patch, 
> YARN-5736.addendum.000.patch, YARN_5736.000.patch
>
>
> The container executor configuration reader does not handle white spaces or 
> malformed key value pairs in the config file correctly or gracefully.
> As an example, take the following key value line, which is part of the 
> configuration (note that << is used as a marker to show the extra trailing space):
> yarn.nodemanager.linux-container-executor.group=yarn <<
> This is a valid line, but when you run the check over the file:
> [root@test]#./container-executor --checksetup
> Can't get group information for yarn - Success.
> [root@test]#
> it fails to find the yarn group, because it actually tries to find the "yarn " 
> group, which fails. There is no trimming anywhere while processing the lines. If 
> a space were added before or after the = sign, a failure would also occur.
> A minor nit is that a failure is still logged as a Success.
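The actual fix is in the C container-executor, but the missing trimming logic is easy to picture; a rough, hypothetical sketch of the idea (shown in Java only for brevity, not the real patch) could look like this:

{code:java}
// Illustrative only: trim whitespace around the key and the value of a
// key=value config line, so "group=yarn " and "group = yarn" both parse
// to ("group", "yarn"). The real container-executor implements this in C.
final class ConfigLineParser {

  /** Returns {key, value}, or null for blank or malformed lines. */
  static String[] parseLine(String line) {
    if (line == null) {
      return null;
    }
    int eq = line.indexOf('=');
    if (eq < 0) {
      return null;                  // malformed: no '=' at all
    }
    String key = line.substring(0, eq).trim();
    String value = line.substring(eq + 1).trim();
    if (key.isEmpty()) {
      return null;                  // malformed: empty key
    }
    return new String[] {key, value};
  }

  public static void main(String[] args) {
    // The trailing space after "yarn" no longer changes the parsed value.
    String[] kv =
        parseLine("yarn.nodemanager.linux-container-executor.group=yarn ");
    System.out.println(kv[0] + " -> '" + kv[1] + "'");
  }
}
{code}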



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5836) Malicious AM can kill containers of other apps running in any node its containers are running

2016-11-15 Thread Botong Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668274#comment-15668274
 ] 

Botong Huang commented on YARN-5836:


[~jlowe] You are right, it turns out that the RPC server is indeed verifying the 
token password when token authentication is used. I have updated the description 
and title, and uploaded the v1 patch. Thanks!

> Malicious AM can kill containers of other apps running in any node its 
> containers are running
> -
>
> Key: YARN-5836
> URL: https://issues.apache.org/jira/browse/YARN-5836
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-5836.v1.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> When the AM calls the NM via {{ContainerManagementProtocol}}, the NMToken is 
> supplied for authentication. The RPC server will verify the password of the 
> NMToken (originally generated by the RM) so that we know the content of the 
> NMTokenIdentifier is genuine. 
> Next, for {{stopContainers()}} and {{getContainerStatus()}}, the method 
> {{authorizeGetAndStopContainerRequest()}} is used to verify that the requested 
> containers do belong to the AM by comparing them against the AppId in the 
> NMTokenIdentifier. However, right now when the appId doesn't match, 
> {{authorizeGetAndStopContainerRequest()}} only logs a warning message and 
> continues to kill the container... Overall, a malicious AM can kill containers 
> of other apps running in any node its containers are running. 
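For illustration only, here is a minimal Java sketch of the check the description above calls for: refuse the request when the application id carried by the (already verified) NMToken does not match the application that owns the requested container. The class and names below are hypothetical; the real logic lives in {{ContainerManagerImpl#authorizeGetAndStopContainerRequest()}}.

{code:java}
import java.util.Objects;

// Hypothetical sketch of the authorization rule described above; not the
// actual NodeManager code.
final class StopContainerAuthorizer {

  /**
   * @param tokenAppId     application id from the verified NMTokenIdentifier
   * @param containerAppId application id that owns the container in the request
   */
  static void checkSameApplication(String tokenAppId, String containerAppId) {
    if (!Objects.equals(tokenAppId, containerAppId)) {
      // Only logging a warning here and continuing is the bug described
      // above; throwing makes the NM reject the request instead.
      throw new SecurityException("NMToken of " + tokenAppId
          + " may not operate on a container of " + containerAppId);
    }
  }

  public static void main(String[] args) {
    checkSameApplication("application_1479200000000_0001",
        "application_1479200000000_0001");               // allowed
    try {
      checkSameApplication("application_1479200000000_0001",
          "application_1479200000000_0002");             // rejected
    } catch (SecurityException expected) {
      System.out.println(expected.getMessage());
    }
  }
}
{code}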



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5736) YARN container executor config does not handle white space

2016-11-15 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668259#comment-15668259
 ] 

Shane Kumpf commented on YARN-5736:
---

Thanks [~miklos.szeg...@cloudera.com] for the quick turnaround! The addendum 
patch looks good to me and I can confirm that it fixes the problem.

> YARN container executor config does not handle white space
> --
>
> Key: YARN-5736
> URL: https://issues.apache.org/jira/browse/YARN-5736
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Trivial
>  Labels: oct16-medium
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-5736.001.patch, YARN-5736.002.patch, 
> YARN-5736.addendum.000.patch, YARN_5736.000.patch
>
>
> The container executor configuration reader does not handle white spaces or 
> malformed key value pairs in the config file correctly or gracefully.
> As an example, take the following key value line, which is part of the 
> configuration (note that << is used as a marker to show the extra trailing space):
> yarn.nodemanager.linux-container-executor.group=yarn <<
> This is a valid line, but when you run the check over the file:
> [root@test]#./container-executor --checksetup
> Can't get group information for yarn - Success.
> [root@test]#
> it fails to find the yarn group, because it actually tries to find the "yarn " 
> group, which fails. There is no trimming anywhere while processing the lines. If 
> a space were added before or after the = sign, a failure would also occur.
> A minor nit is that a failure is still logged as a Success.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5836) Malicious AM can kill containers of other apps running in any node its containers are running

2016-11-15 Thread Jason Lowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated YARN-5836:
-
Summary: Malicious AM can kill containers of other apps running in any node 
its containers are running  (was: ContainerManagerImpl not throwing exception 
when AppId in NMTokenIdentifier does not match containerId to kill. Malicious 
AM can kill containers of other apps running in any node its containers are 
running)

Simplifying the summary to describe the symptom rather than detail the fix.

Thanks for the patch!  Looks good to me pending a Jenkins result.



> Malicious AM can kill containers of other apps running in any node its 
> containers are running
> -
>
> Key: YARN-5836
> URL: https://issues.apache.org/jira/browse/YARN-5836
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-5836.v1.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> When the AM calls the NM via {{ContainerManagementProtocol}}, the NMToken is 
> supplied for authentication. The RPC server will verify the password of the 
> NMToken (originally generated by the RM) so that we know the content of the 
> NMTokenIdentifier is genuine. 
> Next, for {{stopContainers()}} and {{getContainerStatus()}}, the method 
> {{authorizeGetAndStopContainerRequest()}} is used to verify that the requested 
> containers do belong to the AM by comparing them against the AppId in the 
> NMTokenIdentifier. However, right now when the appId doesn't match, 
> {{authorizeGetAndStopContainerRequest()}} only logs a warning message and 
> continues to kill the container... Overall, a malicious AM can kill containers 
> of other apps running in any node its containers are running. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5736) YARN container executor config does not handle white space

2016-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668186#comment-15668186
 ] 

Hadoop QA commented on YARN-5736:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 
54s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 24s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5736 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839027/YARN-5736.addendum.000.patch
 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux ca5a3564f904 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 5af572b |
| Default Java | 1.8.0_111 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13927/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13927/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> YARN container executor config does not handle white space
> --
>
> Key: YARN-5736
> URL: https://issues.apache.org/jira/browse/YARN-5736
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Trivial
>  Labels: oct16-medium
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-5736.001.patch, YARN-5736.002.patch, 
> YARN-5736.addendum.000.patch, YARN_5736.000.patch
>
>
> The container executor configuration reader does not handle white spaces or 
> malformed key value pairs in the config file correctly or gracefully.
> As an example, take the following key value line, which is part of the 
> configuration (note that << is used as a marker to show the extra trailing space):
> yarn.nodemanager.linux-container-executor.group=yarn <<
> This is a valid line, but when you run the check over the file:
> [root@test]#./container-executor --checksetup
> Can't get 

[jira] [Commented] (YARN-5870) Expose getApplications API in YarnClient with GetApplicationsRequest parameter

2016-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668172#comment-15668172
 ] 

Hadoop QA commented on YARN-5870:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 13s{color} | {color:orange} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client: The patch generated 1 new + 
53 unchanged - 0 fixed = 54 total (was 53) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
12s{color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client 
generated 2 new + 158 unchanged - 0 fixed = 160 total (was 158) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 16m 
16s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 29m 42s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5870 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839025/YARN-5870.1.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 8631988cd3cd 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 5af572b |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-YARN-Build/13926/artifact/patchprocess/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
| javadoc | 
https://builds.apache.org/job/PreCommit-YARN-Build/13926/artifact/patchprocess/diff-javadoc-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-client.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13926/testReport/ |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13926/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   

[jira] [Updated] (YARN-5836) ContainerManagerImpl not throwing exception when AppId in NMTokenIdentifier does not match containerId to kill. Malicious AM can kill containers of other apps running in a

2016-11-15 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-5836:
---
Attachment: YARN-5836.v1.patch

> ContainerManagerImpl not throwing exception when AppId in NMTokenIdentifier 
> does not match containerId to kill. Malicious AM can kill containers of other 
> apps running in any node its containers are running
> -
>
> Key: YARN-5836
> URL: https://issues.apache.org/jira/browse/YARN-5836
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
> Attachments: YARN-5836.v1.patch
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> When the AM calls the NM via {{ContainerManagementProtocol}}, the NMToken is 
> supplied for authentication. The RPC server will verify the password of the 
> NMToken (originally generated by the RM) so that we know the content of the 
> NMTokenIdentifier is genuine. 
> Next, for {{stopContainers()}} and {{getContainerStatus()}}, the method 
> {{authorizeGetAndStopContainerRequest()}} is used to verify that the requested 
> containers do belong to the AM by comparing them against the AppId in the 
> NMTokenIdentifier. However, right now when the appId doesn't match, 
> {{authorizeGetAndStopContainerRequest()}} only logs a warning message and 
> continues to kill the container... Overall, a malicious AM can kill containers 
> of other apps running in any node its containers are running. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5836) ContainerManagerImpl not throwing exception when AppId in NMTokenIdentifier does not match containerId to kill. Malicious AM can kill containers of other apps running in a

2016-11-15 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-5836:
---
Summary: ContainerManagerImpl not throwing exception when AppId in 
NMTokenIdentifier does not match containerId to kill. Malicious AM can kill 
containers of other apps running in any node its containers are running  (was: 
NMToken passwd not checked in ContainerManagerImpl, malicious AM can fake the 
Token and kill containers of other apps at will)

> ContainerManagerImpl not throwing exception when AppId in NMTokenIdentifier 
> does not match containerId to kill. Malicious AM can kill containers of other 
> apps running in any node its containers are running
> -
>
> Key: YARN-5836
> URL: https://issues.apache.org/jira/browse/YARN-5836
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> When the AM calls the NM via {{ContainerManagementProtocol}}, the NMToken is 
> supplied for authentication. The RPC server will verify the password of the 
> NMToken (originally generated by the RM) so that we know the content of the 
> NMTokenIdentifier is genuine. 
> Next, for {{stopContainers()}} and {{getContainerStatus()}}, the method 
> {{authorizeGetAndStopContainerRequest()}} is used to verify that the requested 
> containers do belong to the AM by comparing them against the AppId in the 
> NMTokenIdentifier. However, right now when the appId doesn't match, 
> {{authorizeGetAndStopContainerRequest()}} only logs a warning message and 
> continues to kill the container... Overall, a malicious AM can kill containers 
> of other apps running in any node its containers are running. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5836) NMToken passwd not checked in ContainerManagerImpl, malicious AM can fake the Token and kill containers of other apps at will

2016-11-15 Thread Botong Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Botong Huang updated YARN-5836:
---
Description: 
When the AM calls the NM via {{ContainerManagementProtocol}}, the NMToken is 
supplied for authentication. The RPC server will verify the password of the 
NMToken (originally generated by the RM) so that we know the content of the 
NMTokenIdentifier is genuine. 

Next, for {{stopContainers()}} and {{getContainerStatus()}}, the method 
{{authorizeGetAndStopContainerRequest()}} is used to verify that the requested 
containers do belong to the AM by comparing them against the AppId in the 
NMTokenIdentifier. However, right now when the appId doesn't match, 
{{authorizeGetAndStopContainerRequest()}} only logs a warning message and 
continues to kill the container... Overall, a malicious AM can kill containers 
of other apps running in any node its containers are running. 

  was:
When AM calls NM via stopContainers() in ContainerManagementProtocol, the 
NMToken (generated by RM) is passed along via the user ugi. However currently 
ContainerManagerImpl is not validating this token correctly, specifically in 
authorizeGetAndStopContainerRequest() in ContainerManagerImpl. Basically it 
blindly trusts the content in the NMTokenIdentifier without verifying the 
password (RM generated signature) in the NMToken, so that malicious AM can just 
fake the content in the NMTokenIdentifier and pass it to NMs. Moreover, 
currently even for plain text checking, when the appId doesn’t match, all it 
does is log it as a warning and continues to kill the container…

For startContainers the NMToken is not checked correctly in authorizeUser() as 
well, however the ContainerToken is verified properly by regenerating and 
comparing the password in verifyAndGetContainerTokenIdentifier(), so that 
malicious AM cannot launch containers at will. 


> NMToken passwd not checked in ContainerManagerImpl, malicious AM can fake the 
> Token and kill containers of other apps at will
> -
>
> Key: YARN-5836
> URL: https://issues.apache.org/jira/browse/YARN-5836
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Reporter: Botong Huang
>Assignee: Botong Huang
>Priority: Minor
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> When the AM calls the NM via {{ContainerManagementProtocol}}, the NMToken is 
> supplied for authentication. The RPC server will verify the password of the 
> NMToken (originally generated by the RM) so that we know the content of the 
> NMTokenIdentifier is genuine. 
> Next, for {{stopContainers()}} and {{getContainerStatus()}}, the method 
> {{authorizeGetAndStopContainerRequest()}} is used to verify that the requested 
> containers do belong to the AM by comparing them against the AppId in the 
> NMTokenIdentifier. However, right now when the appId doesn't match, 
> {{authorizeGetAndStopContainerRequest()}} only logs a warning message and 
> continues to kill the container... Overall, a malicious AM can kill containers 
> of other apps running in any node its containers are running. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4355) NPE while processing localizer heartbeat

2016-11-15 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668084#comment-15668084
 ] 

Naganarasimha G R commented on YARN-4355:
-

It's not exactly the contributor's responsibility, actually; while committing I 
missed it for 2.7, and in the other, higher versions just specifying the Fix 
Version takes care of the release notes. :)

> NPE while processing localizer heartbeat
> 
>
> Key: YARN-4355
> URL: https://issues.apache.org/jira/browse/YARN-4355
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.2
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Fix For: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2
>
> Attachments: YARN-4355-branch-2.7.001.patch, 
> YARN-4355-branch-2.7.002.patch, YARN-4355.01.patch, YARN-4355.02.patch, 
> YARN-4355.03.patch, YARN-4355.04.patch, YARN-4355.05.patch
>
>
> While analyzing YARN-4354 I noticed a nodemanager was getting NPEs while 
> processing a private localizer heartbeat.  I think there's a race where we 
> can clean up resources for an application and therefore remove the app local 
> resource tracker just as we are trying to handle the localizer heartbeat.
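Not the actual patch, but a minimal sketch of the defensive pattern the race described above suggests: if the application's resource tracker has already been removed by cleanup, tell the localizer to exit instead of dereferencing a null tracker. All names below are hypothetical.

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of guarding against the cleanup/heartbeat race; the
// real handling is in the NodeManager's resource localization service.
final class LocalizerHeartbeatSketch {

  enum LocalizerAction { LIVE, DIE }

  private final Map<String, Object> trackersByApp = new ConcurrentHashMap<>();

  LocalizerAction processHeartbeat(String appId) {
    Object tracker = trackersByApp.get(appId);
    if (tracker == null) {
      // The app may have been cleaned up between starting the localizer and
      // receiving this heartbeat; ask the localizer to die rather than NPE.
      return LocalizerAction.DIE;
    }
    // ... normal heartbeat handling against the tracker would go here ...
    return LocalizerAction.LIVE;
  }

  public static void main(String[] args) {
    LocalizerHeartbeatSketch sketch = new LocalizerHeartbeatSketch();
    // No tracker registered, so the heartbeat is answered with DIE, not an NPE.
    System.out.println(sketch.processHeartbeat("application_1479200000000_0001"));
  }
}
{code}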



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5736) YARN container executor config does not handle white space

2016-11-15 Thread Miklos Szegedi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668081#comment-15668081
 ] 

Miklos Szegedi commented on YARN-5736:
--

Thank you, [~shaneku...@gmail.com], for reporting this! I attached a patch that 
should fix the issue based on your description. There was a bug in the code: it 
did not explicitly null-terminate the strings at the end, which strncpy does not 
do by itself.

> YARN container executor config does not handle white space
> --
>
> Key: YARN-5736
> URL: https://issues.apache.org/jira/browse/YARN-5736
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Trivial
>  Labels: oct16-medium
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-5736.001.patch, YARN-5736.002.patch, 
> YARN-5736.addendum.000.patch, YARN_5736.000.patch
>
>
> The container executor configuration reader does not handle white spaces or 
> malformed key value pairs in the config file correctly or gracefully.
> As an example, take the following key value line, which is part of the 
> configuration (note that << is used as a marker to show the extra trailing space):
> yarn.nodemanager.linux-container-executor.group=yarn <<
> This is a valid line, but when you run the check over the file:
> [root@test]#./container-executor --checksetup
> Can't get group information for yarn - Success.
> [root@test]#
> it fails to find the yarn group, because it actually tries to find the "yarn " 
> group, which fails. There is no trimming anywhere while processing the lines. If 
> a space were added before or after the = sign, a failure would also occur.
> A minor nit is that a failure is still logged as a Success.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-4355) NPE while processing localizer heartbeat

2016-11-15 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R reassigned YARN-4355:
---

Assignee: Naganarasimha G R  (was: Varun Saxena)

> NPE while processing localizer heartbeat
> 
>
> Key: YARN-4355
> URL: https://issues.apache.org/jira/browse/YARN-4355
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.2
>Reporter: Jason Lowe
>Assignee: Naganarasimha G R
> Fix For: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2
>
> Attachments: YARN-4355-branch-2.7.001.patch, 
> YARN-4355-branch-2.7.002.patch, YARN-4355.01.patch, YARN-4355.02.patch, 
> YARN-4355.03.patch, YARN-4355.04.patch, YARN-4355.05.patch
>
>
> While analyzing YARN-4354 I noticed a nodemanager was getting NPEs while 
> processing a private localizer heartbeat.  I think there's a race where we 
> can clean up resources for an application and therefore remove the app local 
> resource tracker just as we are trying to handle the localizer heartbeat.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4355) NPE while processing localizer heartbeat

2016-11-15 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-4355:

Assignee: Varun Saxena  (was: Naganarasimha G R)

> NPE while processing localizer heartbeat
> 
>
> Key: YARN-4355
> URL: https://issues.apache.org/jira/browse/YARN-4355
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.2
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Fix For: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2
>
> Attachments: YARN-4355-branch-2.7.001.patch, 
> YARN-4355-branch-2.7.002.patch, YARN-4355.01.patch, YARN-4355.02.patch, 
> YARN-4355.03.patch, YARN-4355.04.patch, YARN-4355.05.patch
>
>
> While analyzing YARN-4354 I noticed a nodemanager was getting NPEs while 
> processing a private localizer heartbeat.  I think there's a race where we 
> can clean up resources for an application and therefore remove the app local 
> resource tracker just as we are trying to handle the localizer heartbeat.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3053) [Security] Review and implement security in ATS v.2

2016-11-15 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668066#comment-15668066
 ] 

Sangjin Lee commented on YARN-3053:
---

{quote}
A real challenge is to provide a general approach to authenticate the timeline 
collectors. Right now we run the collectors within the NM, so Kerberos login is 
not an issue. However, we also plan to run collectors in separate processes, or 
even in containers. For collectors running in a separate process from the NM, 
it's fine to run the collector manager process as YARN and perform a Kerberos 
login. However, if we'd like to run the collectors in separate containers, the 
containers may well run under the user's name (to better track their resource 
usage). In that case, the collector itself needs some sort of authentication? 
Thoughts here?
{quote}

That is a good question. Among the 3 modes of running the timeline collector 
(NM aux service, a daemon or a "system" container, and a special "user" 
container), the first two are probably not very problematic.

Our thinking on the third mode isn't complete, though. Can we capture that 
aspect as future work, as part of implementing the timeline collector as a full 
user container?

> [Security] Review and implement security in ATS v.2
> ---
>
> Key: YARN-3053
> URL: https://issues.apache.org/jira/browse/YARN-3053
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Reporter: Sangjin Lee
>Assignee: Varun Saxena
>  Labels: YARN-5355
> Attachments: ATSv2Authentication(draft).pdf
>
>
> Per design in YARN-2928, we want to evaluate and review the system for 
> security, and ensure proper security in the system.
> This includes proper authentication, token management, access control, and 
> any other relevant security aspects.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5736) YARN container executor config does not handle white space

2016-11-15 Thread Miklos Szegedi (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miklos Szegedi updated YARN-5736:
-
Attachment: YARN-5736.addendum.000.patch

Adding addendum patch

> YARN container executor config does not handle white space
> --
>
> Key: YARN-5736
> URL: https://issues.apache.org/jira/browse/YARN-5736
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Trivial
>  Labels: oct16-medium
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-5736.001.patch, YARN-5736.002.patch, 
> YARN-5736.addendum.000.patch, YARN_5736.000.patch
>
>
> The container executor configuration reader does not handle white spaces or 
> malformed key value pairs in the config file correctly or gracefully.
> As an example, take the following key value line, which is part of the 
> configuration (note that << is used as a marker to show the extra trailing space):
> yarn.nodemanager.linux-container-executor.group=yarn <<
> This is a valid line, but when you run the check over the file:
> [root@test]#./container-executor --checksetup
> Can't get group information for yarn - Success.
> [root@test]#
> it fails to find the yarn group, because it actually tries to find the "yarn " 
> group, which fails. There is no trimming anywhere while processing the lines. If 
> a space were added before or after the = sign, a failure would also occur.
> A minor nit is that a failure is still logged as a Success.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4752) [Umbrella] FairScheduler: Improve preemption

2016-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668053#comment-15668053
 ] 

Hadoop QA commented on YARN-4752:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
40s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 45s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 15 new + 180 unchanged - 140 fixed = 195 total (was 320) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
26s{color} | {color:green} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 0 new + 926 unchanged - 9 fixed = 926 total (was 935) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
23s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 38m  2s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 79m  4s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestTokenClientRMService |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-4752 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839013/yarn-4752.3.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux ec9116424412 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 3219b7b |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 

[jira] [Commented] (YARN-5885) Cleanup YARN-4752 for merge

2016-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668035#comment-15668035
 ] 

Hadoop QA commented on YARN-5885:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 45s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 15 new + 178 unchanged - 140 fixed = 193 total (was 318) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
32s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
27s{color} | {color:green} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager
 generated 0 new + 926 unchanged - 9 fixed = 926 total (was 935) {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
27s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 41m 59s{color} 
| {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 84m 51s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.yarn.server.resourcemanager.TestTokenClientRMService |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5885 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839013/yarn-4752.3.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 59a3d04465a9 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 3219b7b |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 

[jira] [Updated] (YARN-5870) Expose getApplications API in YarnClient with GetApplicationsRequest parameter

2016-11-15 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-5870:
--
Attachment: YARN-5870.1.patch

> Expose getApplications API in YarnClient with GetApplicationsRequest parameter
> --
>
> Key: YARN-5870
> URL: https://issues.apache.org/jira/browse/YARN-5870
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Reporter: Gour Saha
>Assignee: Jian He
> Attachments: YARN-5870.1.patch, YARN-5870.patch
>
>
> It would be best to expose a getApplications API in YarnClient with a 
> GetApplicationsRequest parameter. That opens up all the filters and limits to 
> the client. 
> This will avoid the need to expose yet another getApplications API going 
> forward for every new parameter/filter, as was done in YARN-4491.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5870) Expose getApplications API in YarnClient with GetApplicationsRequest parameter

2016-11-15 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15668020#comment-15668020
 ] 

Jian He commented on YARN-5870:
---

Good catch, Rohith ! 
I updated the patch to fix that. 

> Expose getApplications API in YarnClient with GetApplicationsRequest parameter
> --
>
> Key: YARN-5870
> URL: https://issues.apache.org/jira/browse/YARN-5870
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: client
>Reporter: Gour Saha
>Assignee: Jian He
> Attachments: YARN-5870.1.patch, YARN-5870.patch
>
>
> It would be best to expose a getApplications API in YarnClient with a 
> GetApplicationsRequest parameter. That opens up all the filters and limits to 
> the client. 
> This will avoid the need to expose yet another getApplications API going 
> forward for every new parameter/filter, as was done in YARN-4491.
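Assuming the overload lands with the signature discussed here ({{getApplications(GetApplicationsRequest)}}), client code could combine arbitrary filters in one request roughly as below; the overload itself is this JIRA's proposal, so the call is illustrative.

{code:java}
import java.util.EnumSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.protocolrecords.GetApplicationsRequest;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;
import org.apache.hadoop.yarn.client.api.YarnClient;

public class ListAppsExample {
  public static void main(String[] args) throws Exception {
    YarnClient client = YarnClient.createYarnClient();
    client.init(new Configuration());
    client.start();
    try {
      // Build the full filter set on the request object instead of needing a
      // dedicated getApplications(...) overload per filter combination.
      GetApplicationsRequest request = GetApplicationsRequest.newInstance();
      request.setApplicationStates(EnumSet.of(YarnApplicationState.RUNNING));
      request.setLimit(20);
      // Proposed overload from this JIRA; illustrative until the patch is in.
      for (ApplicationReport report : client.getApplications(request)) {
        System.out.println(report.getApplicationId() + " " + report.getName());
      }
    } finally {
      client.stop();
    }
  }
}
{code}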



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5887) Policies for choosing which opportunistic containers to kill

2016-11-15 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-5887:


 Summary: Policies for choosing which opportunistic containers to 
kill
 Key: YARN-5887
 URL: https://issues.apache.org/jira/browse/YARN-5887
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Konstantinos Karanasos


When a guaranteed container arrives at an NM but there are no resources to 
start its execution, opportunistic containers will be killed to make space for 
the guaranteed container.

At the moment, we kill opportunistic containers in reverse order of arrival 
(first the most recently started ones). This is not always the right decision. 
For example, we might want to minimize the number of containers killed: to 
start a 6GB container, we could kill one 6GB opportunistic or three 2GB ones. 
Another example would be to refrain from killing containers of jobs that are 
very close to completion (we have to pass job completion information to the NM 
in that case).
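As a rough, hypothetical sketch of the "minimize the number of containers killed" policy mentioned above (none of the types below are the NM's real data model): free the required memory by preferring the largest opportunistic containers first.

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical policy sketch: kill as few opportunistic containers as possible
// by taking the largest ones first until enough memory has been freed.
final class FewestContainersKillPolicy {

  static final class OppContainer {
    final String id;
    final long memMb;
    OppContainer(String id, long memMb) { this.id = id; this.memMb = memMb; }
  }

  static List<OppContainer> selectToKill(List<OppContainer> running, long neededMb) {
    List<OppContainer> candidates = new ArrayList<>(running);
    candidates.sort(Comparator.comparingLong((OppContainer c) -> c.memMb).reversed());
    List<OppContainer> toKill = new ArrayList<>();
    long freed = 0;
    for (OppContainer c : candidates) {
      if (freed >= neededMb) {
        break;
      }
      toKill.add(c);
      freed += c.memMb;
    }
    return toKill;
  }

  public static void main(String[] args) {
    List<OppContainer> running = Arrays.asList(
        new OppContainer("c1", 2048), new OppContainer("c2", 2048),
        new OppContainer("c3", 2048), new OppContainer("c4", 6144));
    // A 6GB guaranteed container arrives: only c4 is killed, not three 2GB ones.
    for (OppContainer c : selectToKill(running, 6144)) {
      System.out.println(c.id);
    }
  }
}
{code}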



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5713) Update jackson from 1.9.13 to 2.x in hadoop-yarn

2016-11-15 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667936#comment-15667936
 ] 

Steve Loughran commented on YARN-5713:
--

I'm @ apachecon right now; not looking @ code. Can you ping me again next week?

> Update jackson from 1.9.13 to 2.x in hadoop-yarn
> 
>
> Key: YARN-5713
> URL: https://issues.apache.org/jira/browse/YARN-5713
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: build, timelineserver
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>  Labels: oct16-medium
> Attachments: HADOOP-13677.01.patch, HADOOP-13677.02.patch, 
> YARN-5713.03.patch, YARN-5713.04.patch
>
>
> Sub-task of HADOOP-13332.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5886) Dynamically prioritize execution of opportunistic containers (NM queue reordering)

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos reassigned YARN-5886:


Assignee: Konstantinos Karanasos

> Dynamically prioritize execution of opportunistic containers (NM queue 
> reordering)
> --
>
> Key: YARN-5886
> URL: https://issues.apache.org/jira/browse/YARN-5886
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>
> Currently the {{ContainerScheduler}} in the NM picks the next queued 
> opportunistic container to be executed in a FIFO manner. That is, we first 
> execute containers that arrived first at the NM.
> This JIRA proposes to add pluggable queue reordering strategies at the NM 
> that will dynamically determine which opportunistic container will be 
> executed next.
> For example, we can choose to prioritize containers that belong to jobs which 
> are closer to completion, or containers that are short-running (if such 
> information is available).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5886) Dynamically prioritize execution of opportunistic containers (NM queue reordering)

2016-11-15 Thread Konstantinos Karanasos (JIRA)
Konstantinos Karanasos created YARN-5886:


 Summary: Dynamically prioritize execution of opportunistic 
containers (NM queue reordering)
 Key: YARN-5886
 URL: https://issues.apache.org/jira/browse/YARN-5886
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Konstantinos Karanasos


Currently the {{ContainerScheduler}} in the NM picks the next queued 
opportunistic container to be executed in a FIFO manner. That is, we first 
execute containers that arrived first at the NM.

This JIRA proposes to add pluggable queue reordering strategies at the NM that 
will dynamically determine which opportunistic container will be executed next.
For example, we can choose to prioritize containers that belong to jobs which 
are closer to completion, or containers that are short-running (if such 
information is available).
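Purely as a sketch of the pluggable-reordering idea (nothing below is the {{ContainerScheduler}}'s real API), one strategy could simply be a Comparator over the queued containers, for example favoring jobs closest to completion:

{code:java}
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Queue;

// Hypothetical sketch: back the queue of opportunistic containers with a
// pluggable Comparator so "next" is chosen by policy rather than FIFO.
final class ReorderingQueueSketch {

  static final class QueuedContainer {
    final String id;
    final double jobProgress;   // assumed to be 0.0 - 1.0, supplied externally
    QueuedContainer(String id, double jobProgress) {
      this.id = id;
      this.jobProgress = jobProgress;
    }
  }

  public static void main(String[] args) {
    // Strategy: run containers of jobs closest to completion first.
    Comparator<QueuedContainer> closestToCompletion =
        Comparator.comparingDouble((QueuedContainer c) -> c.jobProgress).reversed();

    Queue<QueuedContainer> queue = new PriorityQueue<>(closestToCompletion);
    queue.add(new QueuedContainer("c1", 0.10));
    queue.add(new QueuedContainer("c2", 0.95));
    queue.add(new QueuedContainer("c3", 0.50));

    System.out.println(queue.poll().id);  // prints c2: its job is almost done
  }
}
{code}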



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4972) Cleanup ContainerScheduler tests to remove long sleep times

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-4972:
-
Parent Issue: YARN-5541  (was: YARN-4742)

> Cleanup ContainerScheduler tests to remove long sleep times
> ---
>
> Key: YARN-4972
> URL: https://issues.apache.org/jira/browse/YARN-4972
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4972) Cleanup ContainerScheduler tests to remove long sleep times

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-4972:
-
Summary: Cleanup ContainerScheduler tests to remove long sleep times  (was: 
Cleanup QueuingContainerManager tests to remove long sleep times)

> Cleanup ContainerScheduler tests to remove long sleep times
> ---
>
> Key: YARN-4972
> URL: https://issues.apache.org/jira/browse/YARN-4972
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4828) Create a pull request template for github

2016-11-15 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667897#comment-15667897
 ] 

Sangjin Lee commented on YARN-4828:
---

If my understanding is correct, we should place the license header in all types 
of source code without exception.

> Create a pull request template for github
> -
>
> Key: YARN-4828
> URL: https://issues.apache.org/jira/browse/YARN-4828
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 3.0.0-alpha1
> Environment: github
>Reporter: Steve Loughran
>Assignee: Gergely Novák
>Priority: Minor
>  Labels: oct16-easy
> Attachments: YARN-4828.001.patch, YARN-4828.002.patch
>
>
> We're starting to see PRs appear without any JIRA, explanation etc. These are 
> going to be ignored without them.
> It's possible to [create a PR text 
> template](https://help.github.com/articles/creating-a-pull-request-template-for-your-repository/)
>  under {{.github/PULL_REQUEST_TEMPLATE}}
> We can do such a template, which provides template summary points, such as:
> * which JIRA
> * if against an object store, how did you test it?
> * if its a shell script, how did you test it?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2889) Limit in the number of opportunistic container requests per AM

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2889:
-
Parent Issue: YARN-5542  (was: YARN-4742)

> Limit in the number of opportunistic container requests per AM
> --
>
> Key: YARN-2889
> URL: https://issues.apache.org/jira/browse/YARN-2889
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Arun Suresh
>
> We introduce a way to limit the number of queueable requests that each AM can 
> submit to the LocalRM.
> This way we can restrict the number of queueable containers handed out by the 
> system, as well as throttle down misbehaving AMs (asking for too many 
> queueable containers).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2889) Limit in the number of opportunistic container requests per AM

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2889:
-
Summary: Limit in the number of opportunistic container requests per AM  
(was: Limit in the number of queueable container requests per AM)

> Limit in the number of opportunistic container requests per AM
> --
>
> Key: YARN-2889
> URL: https://issues.apache.org/jira/browse/YARN-2889
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Arun Suresh
>
> We introduce a way to limit the number of queueable requests that each AM can 
> submit to the LocalRM.
> This way we can restrict the number of queueable containers handed out by the 
> system, as well as throttle down misbehaving AMs (asking for too many 
> queueable containers).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5216) Expose configurable preemption policy for OPPORTUNISTIC containers running on the NM

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5216:
-
Issue Type: Sub-task  (was: Bug)
Parent: YARN-5541

> Expose configurable preemption policy for OPPORTUNISTIC containers running on 
> the NM
> 
>
> Key: YARN-5216
> URL: https://issues.apache.org/jira/browse/YARN-5216
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: distributed-scheduling
>Reporter: Arun Suresh
>Assignee: Hitesh Sharma
>  Labels: oct16-hard
> Attachments: YARN5216.001.patch, yarn5216.002.patch
>
>
> Currently, the default action taken by the QueuingContainerManager, 
> introduced in YARN-2883, when a GUARANTEED Container is scheduled on an NM 
> with OPPORTUNISTIC containers using up resources, is to KILL the running 
> OPPORTUNISTIC containers.
> This JIRA proposes to expose a configurable hook to allow the NM to take a 
> different action.
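
As a rough illustration of what such a hook might look like (the interface, enum, and 
class names below are invented for this sketch and are not the actual NM API), the NM 
could load a policy implementation chosen through configuration and consult it instead 
of unconditionally killing:

{code}
/**
 * Illustrative pluggable policy for handling running OPPORTUNISTIC containers
 * when a GUARANTEED container is scheduled on the node. All names are invented;
 * the concrete class could be selected via a (hypothetical) NM configuration key.
 */
interface OpportunisticPreemptionPolicySketch {
  enum Action { KILL, PAUSE, IGNORE }

  /** Decide what to do with a running opportunistic container. */
  Action onGuaranteedContainerArrival(String opportunisticContainerId);
}

/** Default behaviour described above: kill the running opportunistic container. */
class KillOpportunisticPolicySketch implements OpportunisticPreemptionPolicySketch {
  @Override
  public Action onGuaranteedContainerArrival(String opportunisticContainerId) {
    return Action.KILL;
  }
}
{code}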



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5415) Add support for NodeLocal and RackLocal OPPORTUNISTIC requests

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5415:
-
Parent Issue: YARN-5542  (was: YARN-4742)

> Add support for NodeLocal and RackLocal OPPORTUNISTIC requests
> --
>
> Key: YARN-5415
> URL: https://issues.apache.org/jira/browse/YARN-5415
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Konstantinos Karanasos
>
> Currently, the Distributed Scheduling framework only supports ResourceRequests 
> with *ANY* resource name and additionally requires that the resource requests 
> have relaxLocality turned on.
> This jira seeks to add support for Node and Rack local allocations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-2886) Estimating waiting time in NM container queues

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-2886:
-
Parent Issue: YARN-5542  (was: YARN-4742)

> Estimating waiting time in NM container queues
> --
>
> Key: YARN-2886
> URL: https://issues.apache.org/jira/browse/YARN-2886
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>
> This JIRA is about estimating the waiting time of each NM queue.
> Having these estimates is crucial for the distributed scheduling of container 
> requests, as it allows the LocalRM to decide in which NMs to queue the 
> queueable container requests.
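
One very naive way to produce such an estimate (purely illustrative; the names below 
are invented and the estimator proposed in this JIRA may use a different model) is to 
scale the current queue length by an observed mean container runtime:

{code}
/**
 * Naive waiting-time estimate for an NM container queue: queued containers times
 * the observed mean container runtime, divided by the node's execution slots.
 * All names here are invented for the illustration.
 */
public class QueueWaitEstimateSketch {
  public static double estimateWaitMillis(int queuedContainers,
                                          double meanContainerRuntimeMillis,
                                          int executionSlots) {
    if (executionSlots <= 0) {
      return Double.POSITIVE_INFINITY;   // node cannot run anything right now
    }
    return (queuedContainers * meanContainerRuntimeMillis) / executionSlots;
  }

  public static void main(String[] args) {
    // e.g. 6 queued containers, 20s mean runtime, 4 slots -> roughly 30s of wait
    System.out.println(estimateWaitMillis(6, 20_000, 4));
  }
}
{code}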



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5414) Integrate NodeQueueLoadMonitor with ClusterNodeTracker

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5414:
-
Parent Issue: YARN-5542  (was: YARN-4742)

> Integrate NodeQueueLoadMonitor with ClusterNodeTracker
> --
>
> Key: YARN-5414
> URL: https://issues.apache.org/jira/browse/YARN-5414
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: container-queuing, distributed-scheduling, scheduler
>Reporter: Arun Suresh
>Assignee: Arun Suresh
>
> The {{ClusterNodeTracker}} tracks the states of clusterNodes and provides 
> convenience methods like sort and filter.
> The {{NodeQueueLoadMonitor}} should use the {{ClusterNodeTracker}} instead of 
> maintaining its own data-structure of node information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5688) Make allocation of opportunistic containers asynchronous

2016-11-15 Thread Konstantinos Karanasos (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantinos Karanasos updated YARN-5688:
-
Issue Type: Sub-task  (was: Improvement)
Parent: YARN-5542

> Make allocation of opportunistic containers asynchronous
> 
>
> Key: YARN-5688
> URL: https://issues.apache.org/jira/browse/YARN-5688
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Reporter: Konstantinos Karanasos
>Assignee: Konstantinos Karanasos
>
> In the current implementation of the 
> {{OpportunisticContainerAllocatorAMService}}, we synchronously perform the 
> allocation of opportunistic containers. This results in "blocking" the 
> service at the RM when scheduling the opportunistic containers.
> The {{OpportunisticContainerAllocator}} should instead asynchronously run as 
> a separate thread.
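
As a rough illustration of the proposal (not the actual 
{{OpportunisticContainerAllocatorAMService}} code; the class, field, and method names 
below are invented for the sketch), the allocation work could be handed off to a 
dedicated thread so the RPC handler returns immediately and the results are picked up 
on a later heartbeat:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class AsyncOpportunisticAllocatorSketch {
  // Single worker thread that performs opportunistic allocation off the RPC path.
  private final ExecutorService allocatorThread =
      Executors.newSingleThreadExecutor();
  // Completed allocations are queued here and drained on a later AM heartbeat.
  private final BlockingQueue<String> pendingAllocations =
      new LinkedBlockingQueue<>();

  /** Called from the allocate() RPC handler; returns without blocking. */
  public void scheduleAsync(List<String> opportunisticAsks) {
    allocatorThread.submit(() -> {
      for (String ask : opportunisticAsks) {
        // Placeholder for the real node-selection and container-creation logic.
        pendingAllocations.offer("allocated:" + ask);
      }
    });
  }

  /** Drained by a later heartbeat response. */
  public List<String> drainAllocations() {
    List<String> out = new ArrayList<>();
    pendingAllocations.drainTo(out);
    return out;
  }
}
{code}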



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5885) Cleanup YARN-4752 for merge

2016-11-15 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667812#comment-15667812
 ] 

Karthik Kambatla commented on YARN-5885:


Also, posted the [cumulative 
patch|https://issues.apache.org/jira/secure/attachment/12839013/yarn-4752.3.patch]
 off trunk on YARN-4752. Let us try and accommodate any other items necessary 
for merge in this JIRA.

> Cleanup YARN-4752 for merge
> ---
>
> Key: YARN-5885
> URL: https://issues.apache.org/jira/browse/YARN-5885
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-5885.1.patch
>
>
> JIRA to track changes necessary for branch merge. These include:
> # Remove names from TODOs (e.g. KK) and add JIRA numbers for follow-up work.
> # Fix tests that have been commented out in earlier patches on the branch.
> # Double check method and field visibility of newly added code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4752) [Umbrella] FairScheduler: Improve preemption

2016-11-15 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-4752:
---
Attachment: yarn-4752.3.patch

> [Umbrella] FairScheduler: Improve preemption
> 
>
> Key: YARN-4752
> URL: https://issues.apache.org/jira/browse/YARN-4752
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.8.0
>Reporter: Karthik Kambatla
> Attachments: YARN-4752.FairSchedulerPreemptionOverhaul.pdf, 
> yarn-4752-1.patch, yarn-4752.2.patch, yarn-4752.3.patch
>
>
> A number of issues have been reported with respect to preemption in 
> FairScheduler along the lines of:
> # FairScheduler preempts resources from nodes even if the resultant free 
> resources cannot fit the incoming request.
> # Preemption doesn't preempt from sibling queues
> # Preemption doesn't preempt from sibling apps under the same queue that is 
> over its fairshare
> # ...
> Filing this umbrella JIRA to group all the issues together and think of a 
> comprehensive solution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Assigned] (YARN-5769) Integrate update app lifetime using feature implemented in YARN-5611

2016-11-15 Thread Jian He (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He reassigned YARN-5769:
-

Assignee: Jian He

> Integrate update app lifetime using feature implemented in YARN-5611
> 
>
> Key: YARN-5769
> URL: https://issues.apache.org/jira/browse/YARN-5769
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Gour Saha
>Assignee: Jian He
> Fix For: yarn-native-services
>
>
> The REST API PUT call provides capability to update the lifetime of a running 
> application. Once YARN-5611 is available we need to integrate it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5884) TestTokenClientRMService#testTokenRenewalWrongUser Fails after HADOOP-13720

2016-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667794#comment-15667794
 ] 

Hadoop QA commented on YARN-5884:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 41m 
27s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch 
passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
16s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 57m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:a9ad5d6 |
| JIRA Issue | YARN-5884 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12839001/YARN-5884.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 5a005519f225 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 
17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 3219b7b |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13923/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13923/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> TestTokenClientRMService#testTokenRenewalWrongUser Fails after HADOOP-13720
> ---
>
> Key: YARN-5884
> URL: https://issues.apache.org/jira/browse/YARN-5884
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Brahma Reddy Battula
> Attachments: YARN-5884.patch
>
>
> {noformat}
> java.lang.AssertionError: null
>   at 

[jira] [Updated] (YARN-5885) Cleanup YARN-4752 for merge

2016-11-15 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-5885:
---
Attachment: yarn-5885.1.patch

> Cleanup YARN-4752 for merge
> ---
>
> Key: YARN-5885
> URL: https://issues.apache.org/jira/browse/YARN-5885
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Karthik Kambatla
> Attachments: yarn-5885.1.patch
>
>
> JIRA to track changes necessary for branch merge. These include:
> # Remove names from TODOs (e.g. KK) and add JIRA numbers for follow-up work.
> # Fix tests that have been commented out in earlier patches on the branch.
> # Double check method and field visibility of newly added code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5885) Cleanup YARN-4752 for merge

2016-11-15 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-5885:
--

 Summary: Cleanup YARN-4752 for merge
 Key: YARN-5885
 URL: https://issues.apache.org/jira/browse/YARN-5885
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: fairscheduler
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla


JIRA to track changes necessary for branch merge. These include:
# Remove names from TODOs (e.g. KK) and add JIRA numbers for follow-up work.
# Fix tests that have been commented out in earlier patches on the branch.
# Double check method and field visibility of newly added code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Reopened] (YARN-5736) YARN container executor config does not handle white space

2016-11-15 Thread Shane Kumpf (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shane Kumpf reopened YARN-5736:
---

> YARN container executor config does not handle white space
> --
>
> Key: YARN-5736
> URL: https://issues.apache.org/jira/browse/YARN-5736
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Trivial
>  Labels: oct16-medium
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-5736.001.patch, YARN-5736.002.patch, 
> YARN_5736.000.patch
>
>
> The container executor configuration reader does not handle white space or 
> malformed key-value pairs in the config file correctly or gracefully.
> As an example, the following key-value line, which is part of the configuration 
> (note that << is used as a marker to show the extra trailing space):
> yarn.nodemanager.linux-container-executor.group=yarn <<
> is a valid line, but when you run the check over the file:
> [root@test]#./container-executor --checksetup
> Can't get group information for yarn - Success.
> [root@test]#
> it fails to find the yarn group because it really tries to find the "yarn " 
> group, which fails. There is no trimming anywhere while processing the lines. 
> If a space were added before or after the = sign, a failure would also occur.
> A minor nit is that a failure is still logged as a Success.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5736) YARN container executor config does not handle white space

2016-11-15 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667770#comment-15667770
 ] 

Shane Kumpf commented on YARN-5736:
---

Parsing of container-executor.cfg appears to have been impacted by this patch. 
If container-executor.cfg contains entries with empty values, parsing doesn't 
work properly and all applications fail.

Here is the c-e output with additional debugging.
{code}
Failing this attempt.Diagnostics: Application application_1479220852002_0001 
initialization failed (exitCode=255) with output: read_config :Conf file name 
is : /usr/local/src/hadoop_install/hadoop/etc/hadoop/container-executor.cfg
read_config : Adding conf key : yarn.nodemanager.linux-container-executor.group
read_config : Adding conf value : yarn
read_config : Adding conf key : banned.users
read_config : Adding conf key : min.user.id
read_config : Adding conf value : 50
read_config : Adding conf key : allowed.system.users
read_config : Adding conf key : docker.binary
read_config : Adding conf value : /usr/bin/docker
read_config : Adding conf key : feature.docker.enabled
read_config : Adding conf value : 1
read_config : Adding conf key : feature.tc.enabled
read_config : Adding conf value : 1
Supplied key: yarn.nodemanager.linux-container-executor.group
Compare key: yarn.nodemanager.linux-container-executor.group
main : command provided 0
main : run as user is nobody
main : requested yarn user is root
get_value Supplied key: min.user.id
get_value Compare key: yarn.nodemanager.linux-container-executor.group
get_value Compare key: min.user.ids
get_value Compare key: docker.binarym.users
get_value Compare key: feature.docker.enabled
get_value Compare key: feature.tc.enabled
Requested user nobody is not whitelisted and has id 99,which is below the 
minimum allowed 1000
{code}

What was observed is that the min.user.id value is not properly parsed, because 
the key comparison in {{configuration.c#get_value}} will never succeed. This is 
because the keys have extra characters appended (min.user.id vs min.user.ids 
and docker.binary vs docker.binarym.users in the output above). Keys with an 
"empty" value (represented by # in container-executor.cfg) are being 
incorrectly combined with the key that follows in the config.

The relevant min.user.id parts below: 
{code}
-snip-
get_value Supplied key: min.user.id
-snip-
get_value Compare key: min.user.ids
{code}

Note that banned.users and min.user.id are combined and docker.binary and 
allowed.system.users are combined.

Here is an example container-executor.cfg that shows the issue:
{code}
yarn.nodemanager.linux-container-executor.group=yarn
banned.users=#
min.user.id=50
allowed.system.users=#
docker.binary=/usr/bin/docker
feature.docker.enabled=1
feature.tc.enabled=1
{code}

I haven't quite tracked down the root cause, but reverting this patch does 
appear to resolve the issue. I'm going to reopen this issue for further 
investigation.
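
For reference, here is a minimal, illustrative Java sketch (not the actual 
{{configuration.c}} logic; the class and method names are invented) of the kind of 
trimming and empty-value handling the reader needs so that a key such as 
banned.users=# keeps its own empty value instead of bleeding into the key that 
follows it:

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

public class ExecutorConfigParserSketch {
  /**
   * Parse key=value lines, trimming whitespace around keys and values and
   * treating a "#" (or missing) value as empty rather than merging entries.
   */
  public static Map<String, String> parse(String file) throws IOException {
    Map<String, String> conf = new LinkedHashMap<>();
    for (String line : Files.readAllLines(Paths.get(file))) {
      String trimmed = line.trim();
      if (trimmed.isEmpty() || trimmed.startsWith("#")) {
        continue;                      // skip blank lines and comments
      }
      int eq = trimmed.indexOf('=');
      if (eq < 0) {
        continue;                      // ignore malformed lines without '='
      }
      String key = trimmed.substring(0, eq).trim();
      String value = trimmed.substring(eq + 1).trim();
      if (value.equals("#")) {
        value = "";                    // explicit "empty" marker
      }
      conf.put(key, value);            // each key keeps its own (possibly empty) value
    }
    return conf;
  }
}
{code}

With handling along these lines, a lookup of min.user.id would match its own entry 
even when banned.users has no value.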

> YARN container executor config does not handle white space
> --
>
> Key: YARN-5736
> URL: https://issues.apache.org/jira/browse/YARN-5736
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Miklos Szegedi
>Assignee: Miklos Szegedi
>Priority: Trivial
>  Labels: oct16-medium
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-5736.001.patch, YARN-5736.002.patch, 
> YARN_5736.000.patch
>
>
> The container executor configuration reader does not handle white space or 
> malformed key-value pairs in the config file correctly or gracefully.
> As an example, the following key-value line, which is part of the configuration 
> (note that << is used as a marker to show the extra trailing space):
> yarn.nodemanager.linux-container-executor.group=yarn <<
> is a valid line, but when you run the check over the file:
> [root@test]#./container-executor --checksetup
> Can't get group information for yarn - Success.
> [root@test]#
> it fails to find the yarn group because it really tries to find the "yarn " 
> group, which fails. There is no trimming anywhere while processing the lines. 
> If a space were added before or after the = sign, a failure would also occur.
> A minor nit is that a failure is still logged as a Success.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4355) NPE while processing localizer heartbeat

2016-11-15 Thread Jonathan Hung (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667768#comment-15667768
 ] 

Jonathan Hung commented on YARN-4355:
-

Thanks everyone! Also I missed the CHANGES.txt addition, thanks [~brahmareddy] 
for catching that.

> NPE while processing localizer heartbeat
> 
>
> Key: YARN-4355
> URL: https://issues.apache.org/jira/browse/YARN-4355
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.7.2
>Reporter: Jason Lowe
>Assignee: Varun Saxena
> Fix For: 2.8.0, 2.9.0, 2.7.4, 3.0.0-alpha2
>
> Attachments: YARN-4355-branch-2.7.001.patch, 
> YARN-4355-branch-2.7.002.patch, YARN-4355.01.patch, YARN-4355.02.patch, 
> YARN-4355.03.patch, YARN-4355.04.patch, YARN-4355.05.patch
>
>
> While analyzing YARN-4354 I noticed a nodemanager was getting NPEs while 
> processing a private localizer heartbeat.  I think there's a race where we 
> can cleanup resources for an application and therefore remove the app local 
> resource tracker just as we are trying to handle the localizer heartbeat.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-5884) TestTokenClientRMService#testTokenRenewalWrongUser Fails after HADOOP-13720

2016-11-15 Thread Brahma Reddy Battula (JIRA)
Brahma Reddy Battula created YARN-5884:
--

 Summary: TestTokenClientRMService#testTokenRenewalWrongUser Fails 
after HADOOP-13720
 Key: YARN-5884
 URL: https://issues.apache.org/jira/browse/YARN-5884
 Project: Hadoop YARN
  Issue Type: Bug
  Components: test
Reporter: Brahma Reddy Battula


{noformat}
java.lang.AssertionError: null
at org.junit.Assert.fail(Assert.java:86)
at org.junit.Assert.assertTrue(Assert.java:41)
at org.junit.Assert.assertTrue(Assert.java:52)
at 
org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService$3.run(TestTokenClientRMService.java:125)
at 
org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService$3.run(TestTokenClientRMService.java:118)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
at 
org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService.testTokenRenewalWrongUser(TestTokenClientRMService.java:118)
 Stand
{noformat}

 *Reference* 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/226/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5884) TestTokenClientRMService#testTokenRenewalWrongUser Fails after HADOOP-13720

2016-11-15 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667654#comment-15667654
 ] 

Brahma Reddy Battula commented on YARN-5884:


Yes, thanks [~sunilg].

> TestTokenClientRMService#testTokenRenewalWrongUser Fails after HADOOP-13720
> ---
>
> Key: YARN-5884
> URL: https://issues.apache.org/jira/browse/YARN-5884
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Brahma Reddy Battula
> Attachments: YARN-5884.patch
>
>
> {noformat}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService$3.run(TestTokenClientRMService.java:125)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService$3.run(TestTokenClientRMService.java:118)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService.testTokenRenewalWrongUser(TestTokenClientRMService.java:118)
>  Stand
> {noformat}
>  *Reference* 
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/226/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-5884) TestTokenClientRMService#testTokenRenewalWrongUser Fails after HADOOP-13720

2016-11-15 Thread Sunil G (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667643#comment-15667643
 ] 

Sunil G commented on YARN-5884:
---

Possible duplicate of YARN-5875 ?

> TestTokenClientRMService#testTokenRenewalWrongUser Fails after HADOOP-13720
> ---
>
> Key: YARN-5884
> URL: https://issues.apache.org/jira/browse/YARN-5884
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Brahma Reddy Battula
> Attachments: YARN-5884.patch
>
>
> {noformat}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService$3.run(TestTokenClientRMService.java:125)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService$3.run(TestTokenClientRMService.java:118)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService.testTokenRenewalWrongUser(TestTokenClientRMService.java:118)
>  Stand
> {noformat}
>  *Reference* 
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/226/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-5884) TestTokenClientRMService#testTokenRenewalWrongUser Fails after HADOOP-13720

2016-11-15 Thread Brahma Reddy Battula (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brahma Reddy Battula updated YARN-5884:
---
Attachment: YARN-5884.patch

Uploaded the patch. Kindly review.

> TestTokenClientRMService#testTokenRenewalWrongUser Fails after HADOOP-13720
> ---
>
> Key: YARN-5884
> URL: https://issues.apache.org/jira/browse/YARN-5884
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: test
>Reporter: Brahma Reddy Battula
> Attachments: YARN-5884.patch
>
>
> {noformat}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService$3.run(TestTokenClientRMService.java:125)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService$3.run(TestTokenClientRMService.java:118)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.TestTokenClientRMService.testTokenRenewalWrongUser(TestTokenClientRMService.java:118)
>  Stand
> {noformat}
>  *Reference* 
> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/226/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1593) support out-of-proc AuxiliaryServices

2016-11-15 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667638#comment-15667638
 ] 

Varun Vasudev commented on YARN-1593:
-

[~asuresh] - 
{quote}
Thanks for driving this Varun Vasudev

At first glance, this looks similar in spirit to YARN-5501, and maybe even 
supersedes it. It would be advantageous to model pooled containers as a system 
container.

Further to the point raised by Hitesh Shah about formalizing how we affinitize 
an application's container to a Node on a which a dependent system container is 
run, we were also investigating a scenario where an application might also need 
a countable number of system containers on a Node. An initial thought was to 
probably expose the container as a Generalized resource (YARN-3926). For eg, 
assume spark Executors can be started as Pre-started containers on select 
nodes. Assume a node A has 2 pre-started spark executors, and Node B has 4. A 
spark app might have 3 ContainerRequests that requires <4 VCores, 2 GB, 2 
spark-executors>, in which case the ResourceManager will ensure that 1 such 
container is allocated on Node A and 2 on Node B.

Thoughts ?
{quote}
I think there's quite a bit of overlap. A couple of questions about pooled 
containers:
1) If they fail to come up, should the NM continue to accept container requests, 
or should it stop accepting them?
2) Are they meant to run on a subset of nodes or on all nodes? Is this 
controlled by an admin?

Like I mentioned to Hitesh above - the affinity stuff is something we think is 
the long term solution, but we also realize that a solution which is 
essentially "launch this container on every node" will help bridge the gap for 
now. Hence, the inclusion of both in the design doc.

> support out-of-proc AuxiliaryServices
> -
>
> Key: YARN-1593
> URL: https://issues.apache.org/jira/browse/YARN-1593
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, rolling upgrade
>Reporter: Ming Ma
>Assignee: Varun Vasudev
> Attachments: SystemContainersandSystemServices.pdf
>
>
> AuxiliaryServices such as ShuffleHandler currently run in the same process as 
> the NM. There are some benefits to hosting them in dedicated processes.
> 1. NM rolling restart. If we want to upgrade YARN, an NM restart will force a 
> ShuffleHandler restart. If ShuffleHandler runs as a separate process, 
> ShuffleHandler can continue to run during an NM restart. The NM can reconnect 
> to the running ShuffleHandler after the restart.
> 2. Resource management. It is possible another type of AuxiliaryService will 
> be implemented. AuxiliaryServices are considered YARN application specific 
> and could consume lots of resources. Running AuxiliaryServices in separate 
> processes allows easier resource management. The NM could potentially stop a 
> specific AuxiliaryService process from running if it consumes resources way 
> above its allocation.
> Here are some high-level ideas:
> 1. The NM provides a hosting process for each AuxiliaryService. The existing 
> AuxiliaryService API doesn't change.
> 2. The hosting process provides an RPC server for the AuxiliaryService proxy 
> object inside the NM to connect to.
> 3. When we rolling-restart the NM, the existing AuxiliaryService processes will 
> continue to run. The NM can reconnect to the running AuxiliaryService processes 
> upon restart.
> 4. Policy and resource management of AuxiliaryServices. So far we don't have an 
> immediate need for this. An AuxiliaryService could run inside a container, its 
> resource utilization could be taken into account by the RM, and the RM could 
> consider that a specific type of application overutilizes cluster resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1593) support out-of-proc AuxiliaryServices

2016-11-15 Thread Varun Vasudev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667624#comment-15667624
 ] 

Varun Vasudev commented on YARN-1593:
-

[~hitesh] - 
{quote}
My concern is around the feedback loop in terms of failure handling by the apps 
when the system container dies at any of the following points:

system container dies before an allocated container is launched on that node
it dies while a container is running
it dies after a container has completed

Would applications that define affinity to these system services now be getting 
updates (notifications) when system service containers go down or come back up?
{quote}
All of these are questions that we have to solve for the general services 
scenarios and I suspect that they might take some time to get right. Our 
solution till we have a well rounded story for these questions is to use the 
second method I mentioned above where we launch the Tez shuffle service on 
every node. That way Tez doesn't need to change any behaviour for now. Once we 
have the services scheduling and notification pieces sorted out we can start 
moving to the affinity model. 

{quote}
In addition to the feedback loop, is there any behavior change as a result of 
this? i.e. if the system container is not alive, will the app container still 
get launched given that its dependent service is down ( for shuffle, this might 
be ok if the system container eventually comes up but there might be other 
services that provide more synchronous functionality such as a caching layer? 
{quote}
This depends on whether it's a system service or a system container (the 
difference is that the first one has an AM running whereas the second is more 
like auxiliary services running as a container). In case of system containers - 
the NM will stop accepting container requests until the system container is 
back up. In case of the system service, the NM will continue to accept 
container requests.

> support out-of-proc AuxiliaryServices
> -
>
> Key: YARN-1593
> URL: https://issues.apache.org/jira/browse/YARN-1593
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager, rolling upgrade
>Reporter: Ming Ma
>Assignee: Varun Vasudev
> Attachments: SystemContainersandSystemServices.pdf
>
>
> AuxiliaryServices such as ShuffleHandler currently run in the same process as 
> the NM. There are some benefits to hosting them in dedicated processes.
> 1. NM rolling restart. If we want to upgrade YARN, an NM restart will force a 
> ShuffleHandler restart. If ShuffleHandler runs as a separate process, 
> ShuffleHandler can continue to run during an NM restart. The NM can reconnect 
> to the running ShuffleHandler after the restart.
> 2. Resource management. It is possible another type of AuxiliaryService will 
> be implemented. AuxiliaryServices are considered YARN application specific 
> and could consume lots of resources. Running AuxiliaryServices in separate 
> processes allows easier resource management. The NM could potentially stop a 
> specific AuxiliaryService process from running if it consumes resources way 
> above its allocation.
> Here are some high-level ideas:
> 1. The NM provides a hosting process for each AuxiliaryService. The existing 
> AuxiliaryService API doesn't change.
> 2. The hosting process provides an RPC server for the AuxiliaryService proxy 
> object inside the NM to connect to.
> 3. When we rolling-restart the NM, the existing AuxiliaryService processes will 
> continue to run. The NM can reconnect to the running AuxiliaryService processes 
> upon restart.
> 4. Policy and resource management of AuxiliaryServices. So far we don't have an 
> immediate need for this. An AuxiliaryService could run inside a container, its 
> resource utilization could be taken into account by the RM, and the RM could 
> consider that a specific type of application overutilizes cluster resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4597) Introduce ContainerScheduler and a SCHEDULED state to NodeManager container lifecycle

2016-11-15 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667562#comment-15667562
 ] 

Brahma Reddy Battula commented on YARN-4597:


It worked after a clean compile. Sorry for the false alarm.

> Introduce ContainerScheduler and a SCHEDULED state to NodeManager container 
> lifecycle
> -
>
> Key: YARN-4597
> URL: https://issues.apache.org/jira/browse/YARN-4597
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Chris Douglas
>Assignee: Arun Suresh
>  Labels: oct16-hard
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-4597.001.patch, YARN-4597.002.patch, 
> YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, 
> YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, 
> YARN-4597.009.patch, YARN-4597.010.patch, YARN-4597.011.patch, 
> YARN-4597.012.patch, YARN-4597.013.patch
>
>
> Currently, the NM immediately launches containers after resource 
> localization. Several features could be more cleanly implemented if the NM 
> included a separate stage for reserving resources.
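
As a rough picture of where such a state could sit in the lifecycle (a simplified 
sketch based only on this JIRA's summary; the real NM ContainerState enum has more 
states and different transitions):

{code}
/** Simplified NM container lifecycle sketch; illustrative only. */
public enum ContainerLifecycleSketch {
  NEW,          // container start request received
  LOCALIZING,   // resources are being downloaded
  SCHEDULED,    // localized, waiting for the ContainerScheduler to launch it
  RUNNING,      // process has been launched
  EXITED_WITH_SUCCESS,
  EXITED_WITH_FAILURE
}
{code}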



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4597) Introduce ContainerScheduler and a SCHEDULED state to NodeManager container lifecycle

2016-11-15 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667557#comment-15667557
 ] 

Arun Suresh commented on YARN-4597:
---

Looks like the build passed too: 
https://builds.apache.org/job/Hadoop-trunk-Commit/10838/

> Introduce ContainerScheduler and a SCHEDULED state to NodeManager container 
> lifecycle
> -
>
> Key: YARN-4597
> URL: https://issues.apache.org/jira/browse/YARN-4597
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Chris Douglas
>Assignee: Arun Suresh
>  Labels: oct16-hard
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-4597.001.patch, YARN-4597.002.patch, 
> YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, 
> YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, 
> YARN-4597.009.patch, YARN-4597.010.patch, YARN-4597.011.patch, 
> YARN-4597.012.patch, YARN-4597.013.patch
>
>
> Currently, the NM immediately launches containers after resource 
> localization. Several features could be more cleanly implemented if the NM 
> included a separate stage for reserving resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-4597) Introduce ContainerScheduler and a SCHEDULED state to NodeManager container lifecycle

2016-11-15 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667554#comment-15667554
 ] 

Hudson commented on YARN-4597:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10838 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10838/])
YARN-4597. Introduce ContainerScheduler and a SCHEDULED state to (arun suresh: 
rev 3219b7b4ac7d12aee343f6ab2980b3357fc618b6)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeStatusUpdater.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerShutdown.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/ContainerLogsUtils.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests/src/test/java/org/apache/hadoop/yarn/server/MiniYARNCluster.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/Context.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerScheduler.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/package-info.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/TestContainerSchedulerQueuing.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* (delete) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/queuing/TestQueuingContainerManager.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestDistributedScheduling.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestNodeManagerResync.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/scheduler/ContainerSchedulerEvent.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/SchedulerNode.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestOpportunisticContainerAllocation.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/ContainerExitStatus.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/RecoveredContainerLaunch.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/BaseAMRMProxyTest.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerState.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AbstractYarnScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/TestContainer.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/yarn_protos.proto
* (edit) 

[jira] [Commented] (YARN-4597) Introduce ContainerScheduler and a SCHEDULED state to NodeManager container lifecycle

2016-11-15 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15667539#comment-15667539
 ] 

Arun Suresh commented on YARN-4597:
---

[~brahmareddy], can you try a mvn clean compile?
It compiles fine for me, and it looks like Jenkins is also OK with it.


> Introduce ContainerScheduler and a SCHEDULED state to NodeManager container 
> lifecycle
> -
>
> Key: YARN-4597
> URL: https://issues.apache.org/jira/browse/YARN-4597
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: nodemanager
>Reporter: Chris Douglas
>Assignee: Arun Suresh
>  Labels: oct16-hard
> Fix For: 3.0.0-alpha2
>
> Attachments: YARN-4597.001.patch, YARN-4597.002.patch, 
> YARN-4597.003.patch, YARN-4597.004.patch, YARN-4597.005.patch, 
> YARN-4597.006.patch, YARN-4597.007.patch, YARN-4597.008.patch, 
> YARN-4597.009.patch, YARN-4597.010.patch, YARN-4597.011.patch, 
> YARN-4597.012.patch, YARN-4597.013.patch
>
>
> Currently, the NM immediately launches containers after resource 
> localization. Several features could be more cleanly implemented if the NM 
> included a separate stage for reserving resources.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org


