[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-26 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665887#comment-16665887
 ] 

Hudson commented on YARN-8569:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15328 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/15328/])
YARN-8569. Create an interface to provide cluster information to (billie: rev 
d07e873b7db6cb317eccb4768607c1afb505c99b)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/string-utils.h
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/privileged/PrivilegedOperation.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/client/ServiceClient.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/ServiceMaster.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/utils/ServiceApiUtil.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/docker/DockerRunCommand.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestLinuxContainerExecutor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/LinuxContainerExecutor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/linux/runtime/DockerLinuxContainerRuntime.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/ServiceScheduler.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/main.c
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/TestContainersMonitorResourceChange.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/test/java/org/apache/hadoop/yarn/service/TestServiceAM.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.h
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/utils/string-utils.c
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/webapp/NMWebServices.java
* (add) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core/src/main/java/org/apache/hadoop/yarn/service/utils/HttpUtil.java


> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Fix For: 3.3.0
>
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-26 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665803#comment-16665803
 ] 

Hadoop QA commented on YARN-8569:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
56s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 27s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
20s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
52s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  8m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
26s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 147 unchanged - 1 fixed = 147 total (was 148) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  6s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
56s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m 
36s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
37s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
22s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-25 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664254#comment-16664254
 ] 

Eric Yang commented on YARN-8569:
-

The failed test cases are not related to this patch.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch, 
> YARN-8569.012.patch, YARN-8569.013.patch, YARN-8569.014.patch, 
> YARN-8569.015.patch, YARN-8569.016.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664143#comment-16664143
 ] 

Hadoop QA commented on YARN-8569:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
57s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
2s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  8m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
30s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 146 unchanged - 1 fixed = 146 total (was 147) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 35s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
47s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 11s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 13m 22s{color} 
| {color:red} hadoop-yarn-services-core in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
41s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}128m 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-24 Thread Billie Rinaldi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662884#comment-16662884
 ] 

Billie Rinaldi commented on YARN-8569:
--

Thanks [~eyang]! I am +1 for patch 16, pending precommit build.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch, 
> YARN-8569.012.patch, YARN-8569.013.patch, YARN-8569.014.patch, 
> YARN-8569.015.patch, YARN-8569.016.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-24 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662826#comment-16662826
 ] 

Eric Yang commented on YARN-8569:
-

[~billie.rinaldi] Thank you for the review.  Patch 16 includes your suggestion.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch, 
> YARN-8569.012.patch, YARN-8569.013.patch, YARN-8569.014.patch, 
> YARN-8569.015.patch, YARN-8569.016.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-24 Thread Billie Rinaldi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662804#comment-16662804
 ] 

Billie Rinaldi commented on YARN-8569:
--

Thanks, [~eyang]. I tested out patch 15 in an insecure environment and it 
worked. One thing I noticed is that the AM log says "YARN sysfs synchronized" 
when no components have sysfs configured, so the sysfs has not actually been 
synchronized. It would be nice to skip this output when sysfs is not configured.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch, 
> YARN-8569.012.patch, YARN-8569.013.patch, YARN-8569.014.patch, 
> YARN-8569.015.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659797#comment-16659797
 ] 

Hadoop QA commented on YARN-8569:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m  3s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  8m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
27s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 147 unchanged - 1 fixed = 147 total (was 148) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  6s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
15s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
4s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m  
9s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 
45s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
26s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
44s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-22 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659618#comment-16659618
 ] 

Eric Yang commented on YARN-8569:
-

Patch 015 rebased on YARN-8922.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch, 
> YARN-8569.012.patch, YARN-8569.013.patch, YARN-8569.014.patch, 
> YARN-8569.015.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657452#comment-16657452
 ] 

Eric Yang commented on YARN-8569:
-

[~rkanter] Thank you for filing the problem as separate JIRA.  I will remove 
the hotfix from this JIRA, and link both JIRA together.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch, 
> YARN-8569.012.patch, YARN-8569.013.patch, YARN-8569.014.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-19 Thread Robert Kanter (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657385#comment-16657385
 ] 

Robert Kanter commented on YARN-8569:
-

[~eyang], I think I've come up with the proper fix for the issue.  I've created 
YARN-8922 to take care of it; please take a look.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch, 
> YARN-8569.012.patch, YARN-8569.013.patch, YARN-8569.014.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657311#comment-16657311
 ] 

Hadoop QA commented on YARN-8569:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 10m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 41s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
3s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  9m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
49s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 146 unchanged - 1 fixed = 146 total (was 147) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 32s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
46s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 18m 28s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
46s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
20s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-19 Thread Robert Kanter (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657208#comment-16657208
 ] 

Robert Kanter commented on YARN-8569:
-

[~eyang], I tried it out and with the YARN-8569 014 patch, 
test-container-executor fails when run as root:
{noformat}
[root@rkanter-dev hadoop-yarn-server-nodemanager]# 
target/native/target/usr/local/bin/test-container-executor systest
Attempting to clean up from any previous runs
chmod: cannot access ‘/tmp/test-container-executor’: No such file or directory

Our executable is 
/root/hadoop-upstream/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native/target/usr/local/bin/test-container-executor

Starting tests

test_is_empty()

Testing is_empty function
Directory is not empty /
Could not open directory /tmp/test-container-executor/noexist - No such file or 
directory
Could not open directory /tmp/test-container-executor/emptydir - No such file 
or directory
FAIL: /tmp/test-container-executor/emptydir should be empty
{noformat}
It looks like it can't create the {{/tmp/test-container-executor/emptydir}} 
directory.

Here's the output when run on trunk:
{noformat}
[root@rkanter-dev hadoop-yarn-server-nodemanager]# 
target/native/target/usr/local/bin/test-container-executor systest
Attempting to clean up from any previous runs
chmod: cannot access ‘/tmp/test-container-executor’: No such file or directory

Our executable is 
/root/hadoop-upstream/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/native/target/usr/local/bin/test-container-executor

Starting tests

test_is_empty()

Testing is_empty function
Directory is not empty /
Could not open directory /tmp/test-container-executor/noexist - No such file or 
directory

Testing recursive_unlink_children()

Testing resolve_config_path()

Testing resolve_config_path

Testing get_user_directory()

Testing check_nm_local_dir()

Testing get_app_directory()

Testing get_container_work_directory()

Testing get_container_launcher_file()

Testing get_container_credentials_file()

Testing get_container_keystore_file()

Testing get_container_truststore_file()

Testing get_app_log_dir()

Testing check_configuration_permissions
File /tmp/test-container-executor must not be world or group writable, but is 
777

Testing delete_container()

Testing delete_app()

Testing delete race

Testing is_feature_enabled()
Illegal value '1klajdflkajdsflk' for 'feature.name3.enabled' in configuration. 
Using default value: 0.
Illegal value 'asdkjfasdkljfklsdjf0' for 'feature.name4.enabled' in 
configuration. Using default value: 0.
Illegal value '-1' for 'feature.name5.enabled' in configuration. Using default 
value: 1.
Illegal value '2' for 'feature.name6.enabled' in configuration. Using default 
value: 0.

Testing test_check_user
Requested user lp is not whitelisted and has id 4,which is below the minimum 
allowed 500
Running as root is not allowed

Testing clean_docker_cgroups
clean_docker_cgroups: Invalid mount table
clean_docker_cgroups: Invalid yarn_hierarchy
clean_docker_cgroups: Invalid container_id: null
clean_docker_cgroups: Invalid container_id: not_a_container_123

Running test test_signal_container_group in child process

Testing group signal_container
Child container launched as 16189
Killing process group 16189 with 9

Testing init app

Testing launch container without HTTPS
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file /tmp/test-container-executor/pid.txt.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...

Testing launch container with HTTPS
Getting exit code file...
Creating script paths...
Writing pid file...
Writing to tmp file /tmp/test-container-executor/pid.txt.tmp
Writing to cgroup task files...
Creating local dirs...
Launching container...

Testing delete_user
baseDir 
"/tmp/test-container-executor/local-1/usercache/systest/appcache/app_3/test.cfg"
 is a file and cannot contain subdir "file1".
0
Trying banned default user()

Testing test_check_user
Requested user bin is banned
Running as root is not allowed

Testing test_check_user
User sys not found
Running as root is not allowed

Testing trim function

Finished tests
{noformat}

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-19 Thread Robert Kanter (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657096#comment-16657096
 ] 

Robert Kanter commented on YARN-8569:
-

Sorry about that.  I had to fix a number of problems with 
test-container-executor to get it to run under root, which runs also runs a 
number of additional tests (I think the pre-commit build must only run it as a 
normal user).  I had changed the dir permissions as part of that, but perhaps 
it wasn't necessary.  I'll double check that it still works as root, using your 
patch here.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch, 
> YARN-8569.012.patch, YARN-8569.013.patch, YARN-8569.014.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657077#comment-16657077
 ] 

Eric Yang commented on YARN-8569:
-

Patch 14 includes the test-container-executor hot fix to pass pre-commit build 
and rebase to current trunk.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch, 
> YARN-8569.012.patch, YARN-8569.013.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657073#comment-16657073
 ] 

Eric Yang commented on YARN-8569:
-

Unit test failure is caused by YARN-8448, where /tmp/test-container-executor is 
changed to 0777, but cetest expects it to be 0755.  [~rkanter] By changing the 
/tmp/test-container-executor to 0755, it seems to pass fine.  Hence, I include 
the change in this JIRA instead of reopening YARN-8448.  Let me know if this 
works for you.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch, 
> YARN-8569.012.patch, YARN-8569.013.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16657027#comment-16657027
 ] 

Eric Yang commented on YARN-8569:
-

Node manager unit test failure is caused by YARN-8449 where the tmp directory 
is not clean up after test, and not related to changes in this patch.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch, 
> YARN-8569.012.patch, YARN-8569.013.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656163#comment-16656163
 ] 

Hadoop QA commented on YARN-8569:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
26s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  
4s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  9s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
58s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
14s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 147 unchanged - 1 fixed = 147 total (was 148) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 34s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
48s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 30s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
47s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
24s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-18 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656050#comment-16656050
 ] 

Eric Yang commented on YARN-8569:
-

Patch 13 addressed issues to run this patch in non-secure cluster.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch, 
> YARN-8569.012.patch, YARN-8569.013.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-17 Thread Billie Rinaldi (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16654432#comment-16654432
 ] 

Billie Rinaldi commented on YARN-8569:
--

Adding a file name parameter later wouldn't need to be an incompatible API 
change; we could have the file name default to app.json when not specified. I'm 
not sure it makes sense to support more than one sysfs file per application, so 
I don't have a problem with it having a single file name for the initial 
implementation. Since sysfs-aware apps are going to have to read this file, it 
makes sense for it to be in a consistent location. Another future consideration 
might be whether apps would want to have access to the same information in 
other file formats besides JSON.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch, 
> YARN-8569.012.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647222#comment-16647222
 ] 

Hadoop QA commented on YARN-8569:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
23s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 146 unchanged - 1 fixed = 146 total (was 147) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 16s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
46s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m  
5s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m  
6s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
40s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-11 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16647030#comment-16647030
 ] 

Eric Yang commented on YARN-8569:
-

[~leftnoteasy] . Patch 12 includes set_user fix, and added sysfs path name to 
nmPrivate/[appId]/sysfs.  In the current implementation, the sync will make n 
rest api call to node managers where n is less than total number of bare metal 
host that hosting containers.  This reduced required network traffic to keep 
information in sync.  With the introduced sysfs prefix in nmPrivate directory, 
it will pave ways to add more than just app.json to sysfs directory and prevent 
path traversal attack.  

When information is populated into other files, then there is higher chance of 
race condition, where state is changed in some files but not all files.  
Multiple files population mechanism will require more thoughts to keep the 
information transactional.  The first version does not add add container id, or 
arbitrary filename support to reduce the transaction commits and ensure the 
information propagation is idempotent from container point of view.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch, 
> YARN-8569.012.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-10 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16645324#comment-16645324
 ] 

Eric Yang commented on YARN-8569:
-

{quote}If you will do #1, IIUC, the #2 has to be done, otherwise as a normal 
user we cannot read from nmPrivate dir. {quote}

Line 2632-2638 of container-executor does exactly as open file as node manager 
user, and it already drop privileges while writing the file.  I am not disagree 
with what you said, but it is already part of the patch.

{quote}We already limit to read file under .../nmPrivate/app../sys/fs/ 
correct? How it is possible to read token file from that directory?{quote}

App.json is writing to nmPrivate/[appid]/app.json, there is no sub-directory in 
nmPrivate/[appid]/sysfs.  Sysfs directory only exists in distributed cache 
location.  Good security practice is to make generated paths have hard coded 
depth and fix filename to prevent path exploits.  If we allow API to specify 
arbitrary filename, user can easy specify "sysfs/../container_tokens" from rest 
api to defeat security measurement that were put in place.  Readlink can 
flatten the path, but there is no real indicator that source can be copied.  
For now, I can do source path check to make sure it is jailed in 
nmPrivate/[appid]/sysfs path.  However, the first version of API will not 
accept arbitrary filename to minimize security concerns.  It would be easy to 
extend the REST API method with support of arbitrary filename when more 
thoughts are given.  I leave that responsibility to the one that wants to bare 
the problem.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-09 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644427#comment-16644427
 ] 

Wangda Tan commented on YARN-8569:
--

[~eyang], 

 
{quote}In sync_yarn_sysfs, check_nm_local_dir make sure the nm_local_dir is 
owned by yarn user. 
{quote}
If you will do #1, IIUC, the #2 has to be done, otherwise as a normal user we 
cannot read from nmPrivate dir. 

 
{quote}I am concerned that tokens are localized in the same nmPrivate directory.
{quote}
We already limit to read file under .../nmPrivate/app../sys/fs/ correct? 
How it is possible to read token file from that directory? 
{quote}Point 4 is a bit concerning to me. I am error on side of caution.. ..
{quote}
Given the file hierarchy is part of the API, we should make it correct in the 
first cut. Any changes in the future will be an incompatible change. So I 
prefer to get the API correct in the first patch. 

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-09 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644277#comment-16644277
 ] 

Eric Yang commented on YARN-8569:
-

{quote}
1) We use use set_user before the method to drop privileges.
{quote}

Good catch, check_user will be applied in the next patch.

{quote}
2) Root privilege should be only used by open_file_as_nm.
{quote}

In sync_yarn_sysfs, check_nm_local_dir make sure the nm_local_dir is owned by 
yarn user.  This ensure the copy file operation is safe guarded against 
possible path exploits.  This is equivalent to next point to ensure nm local 
dirs are indeed owned by yarn.

{quote}
3) We should check all in/out file paths under local dirs defined by 
container_executor.cfg.
{quote}

Initialize container, launch container, launch docker container, they are all 
using nm-local-dirs path passed from node manager to container executor.  There 
is no definition of yarn.nodemanager.local-dirs in container-executor.cfg file. 
 For containing the scope of this JIRA, I opened YARN-8863 to track the 
proposed global pattern change.

{quote}
4) Regarding to API, I think we should allow filename specified.
{quote}

I am concerned that tokens are localized in the same nmPrivate directory.  If 
filename can be specified, it may increase vulnerable surface for attack to use 
the REST API to copy hdfs token into sysfs directory.  I don't unsure this 
appear to be innocent action is safe.

{quote}5) For the mounted dir for container, for app-level information, I would 
prefer to put under ...sys/fs/app/ instead of ...sys/fs/ since we 
want to support per-container information in the future.{quote}

Point 4 is a bit concerning to me.  I am error on side of caution.  Therefore, 
I do not plan to make arbitrary filename change in this JIRA.  Please open a 
separate one, if you feel strongly about supporting custom filename in the rest 
API.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-09 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644223#comment-16644223
 ] 

Wangda Tan commented on YARN-8569:
--

Discussed with [~eyang]/[~suma.shivaprasad] offline, 

Several comments: 

1) We use use set_user before the method to drop privileges.

2) Root privilege should be only used by open_file_as_nm.

3) We should check all in/out file paths under local dirs defined by 
container_executor.cfg.

4) Regarding to API, I think we should allow filename specified.

5) For the mounted dir for container, for app-level information, I would prefer 
to put under ...sys/fs/app/ instead of ...sys/fs/ since we want to 
support per-container information in the future.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-09 Thread Suma Shivaprasad (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644071#comment-16644071
 ] 

Suma Shivaprasad commented on YARN-8569:


[~eyang] Thanks for updating the patch. +1

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch, YARN-8569.011.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-09 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16643972#comment-16643972
 ] 

Hadoop QA commented on YARN-8569:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 42s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
47s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
27s{color} | {color:green} hadoop-yarn-project/hadoop-yarn: The patch generated 
0 new + 148 unchanged - 1 fixed = 148 total (was 149) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 24s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
49s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
45s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
39s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
42s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
57s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-08 Thread Suma Shivaprasad (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642483#comment-16642483
 ] 

Suma Shivaprasad commented on YARN-8569:


[~eyang] Patch 10 Looks good. A few minor checkstyle issues would need to be 
resolved. 

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-08 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642465#comment-16642465
 ] 

Hadoop QA commented on YARN-8569:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  8m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
29s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 25s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 14 new + 146 unchanged - 1 fixed = 160 total (was 147) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
49s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
58s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 13m 
17s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
22s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-08 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16642226#comment-16642226
 ] 

Eric Yang commented on YARN-8569:
-

Patch10 renamed YARN_CONTAINER_RUNTIME_YARN_SYSFS to 
YARN_CONTAINER_RUNTIME_YARN_SYSFS_ENABLE.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch, YARN-8569.010.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-05 Thread Suma Shivaprasad (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640484#comment-16640484
 ] 

Suma Shivaprasad commented on YARN-8569:


[~eyang] YARN_CONTAINER_RUNTIME_YARN_SYSFS_ENABLE sounds good. Thanks !

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-05 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640461#comment-16640461
 ] 

Eric Yang commented on YARN-8569:
-

[~suma.shivaprasad] Good catch that handleYarnSysFSUpdate method was out dated 
code, it will be removed in next patch.

YARN_CONTAINER_RUNTIME_YARNSYSFS following the existing nomenclature and 
generate readable flag in launcher script that looks like:
{code}
export YARN_CONTAINER_RUNTIME_YARN_SYSFS=true
{code}

This looks better than "export YARN_CONTAINER_RUNTIME_EXPORT_YARN_SYSFS=true" 
in my opinion.  The existing nomenclature looks like this: 
YARN_CONTAINER_RUNTIME_[TYPE|DOCKER_*|YARN_SYSFS].  How about 
YARN_CONTAINER_RUNTIME_YARN_SYSFS_ENABLE=true?



> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-05 Thread Suma Shivaprasad (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640432#comment-16640432
 ] 

Suma Shivaprasad commented on YARN-8569:


Thanks [~eyang] for updating the patch.

Looks like DockerLinuxContainerRuntime. handleYarnSysFSUpdate is not used. If 
so, can we remove it?
YARN_CONTAINER_RUNTIME_YARN_SYSFS - Could we rename this to 
YARN_CONTAINER_RUNTIME_EXPORT_YARN_SYSFS for better understanding of its use 
and update documentation that it should be set to true/false. 

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-05 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640392#comment-16640392
 ] 

Hadoop QA commented on YARN-8569:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 11m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
16m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  7m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  7m 
51s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 24s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 11 new + 149 unchanged - 1 fixed = 160 total (was 150) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 17s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
59s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 20m 
15s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 10s{color} 
| {color:red} hadoop-yarn-services-core in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
31s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
50s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-05 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640262#comment-16640262
 ] 

Eric Yang commented on YARN-8569:
-

[~suma.shivaprasad] Patch 009 includes the following fix:

- Synchronize updateYarnSysFS.
- Fixed file descriptor leaks.
- Renamed service.json to app.json
- Removed PrivilegedOperationException from ContainerExecutor to prevent 
implementation bleeding to abstract class.


> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch, 
> YARN-8569.009.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-05 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640186#comment-16640186
 ] 

Eric Yang commented on YARN-8569:
-

[~suma.shivaprasad] The list of hosts required the update is normalized in AM 
logic first, then one call to the node instead of calling on every container.  
Multiple writers to the same file shouldn't happen, but I will sync the file 
write access just in case.  Thanks

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-05 Thread Suma Shivaprasad (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640094#comment-16640094
 ] 

Suma Shivaprasad commented on YARN-8569:


[~eyang] Missed this earlier. There could be potential race conditions between 
containers running on the same host while executing updateYarnSysFs since they 
are writing to the same file. Can we synchronize the file writes/deletes?

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-04 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639178#comment-16639178
 ] 

Eric Yang commented on YARN-8569:
-

[~suma.shivaprasad] Thanks for the review, appinfo.json is probably better 
choice than AMInfo.json because the API can be used by code other than 
Application master even though it might be rare.  Info in appinfo.json provides 
no additional information to the file name.  How about app.json in short for 
easy nomenclature of naming this file?  Good catch on the file close, will 
update in the next patch.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-04 Thread Suma Shivaprasad (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639205#comment-16639205
 ] 

Suma Shivaprasad commented on YARN-8569:


[~eyang] app.json sounds good. Thanks

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-10-04 Thread Suma Shivaprasad (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16639012#comment-16639012
 ] 

Suma Shivaprasad commented on YARN-8569:


Thanks [~eyang] .  I agree with [~leftnoteasy] that the local file name should 
be renamed to something like  "appInfo.json" or "AMInfo.json" instead of 
service.json since it could be used by other custom AMs other than Yarn Service 
like Spark AM. 

Patch generally LGTM. A few minor comments

ServiceClient.addFilesToCompression and 
DockerLInuxContainerRuntime.handleYarnSysFSUpdate are opening *Stream  classes 
which could be closed in finally like in addYarnSysFs to prevent resource leaks.



> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-10 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609755#comment-16609755
 ] 

Hadoop QA commented on YARN-8569:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
33s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 12s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
43s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  8m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
52s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 17s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 4 new + 147 unchanged - 1 fixed = 151 total (was 148) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 10s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
49s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
58s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 
25s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
16s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-10 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609645#comment-16609645
 ] 

Eric Yang commented on YARN-8569:
-

Patch 008 fixed more check style issues, and change sync sysfs api to be based 
on application id instead of combination of application id and container id.  
This reduces the number of network requests and repetitive syncing of same 
cluster spec information.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch, YARN-8569.007.patch, YARN-8569.008.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-08 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608313#comment-16608313
 ] 

Hadoop QA commented on YARN-8569:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 31s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  8m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m  
1s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 10s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 8 new + 147 unchanged - 1 fixed = 155 total (was 148) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 46s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
43s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 19m  
1s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 
57s{color} | {color:green} hadoop-yarn-services-core in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
24s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-07 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607802#comment-16607802
 ] 

Hadoop QA commented on YARN-8569:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
30s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
56s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  9m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  9m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  9m 
50s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 30s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 43 new + 149 unchanged - 1 fixed = 192 total (was 150) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 21s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
57s{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 generated 4 new + 0 unchanged - 0 fixed = 4 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
45s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 
31s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 17m  0s{color} 
| {color:red} hadoop-yarn-services-core in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
33s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-07 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16607714#comment-16607714
 ] 

Eric Yang commented on YARN-8569:
-

[~leftnoteasy] Patch 6 addressed the following:

- Use localizer to distribute initial copy of service.json in a tarball.
- Mount expanded sysfs.tar to container.
- Added logic to replace sysfs.tar local copy with the latest status after 
service reaches stable state.
- Added test cases in Java and C level
- On-off feature switch in container-executor.cfg to disable the feature.

I didn't make sysfs REST API a generic mechanism for tarball replacer for 
distributed cache to prevent people from abusing this API for unintended 
purpose.  If you still like that generic feature, please open a separate JIRA 
for that.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch, 
> YARN-8569.006.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-05 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604686#comment-16604686
 ] 

Eric Yang commented on YARN-8569:
-

YARN localizer only support tarball, archive, and individual file.  It does not 
support directory with files in it.  This causes docker to mount the path to a 
specific file instead of a directory with files in it.  We get conflicts of 
double mounting the same path when merge patch 1 and patch 5 together with 
attempt to double mount the same subdirectory:

{code}
{
"Type": "bind",
"Source": 
"/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/appcache/application_1536164978190_0001/container_1536164978190_0001_01_05/sysfs",
"Destination": "/hadoop/yarn/sysfs",
"Mode": "ro",
"RW": false,
"Propagation": "rprivate"
},
{
"Type": "bind",
"Source": 
"/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/appcache/application_1536164978190_0001/filecache/10/service.json",
"Destination": "/hadoop/yarn/sysfs/service.json",
"Mode": "ro",
"RW": false,
"Propagation": "rprivate"
},
{code}

Docker would error out at this point.  Therefore, I am going to try to generate 
an archive or tarball with service.json in it, and let localizer decompress the 
directory, and mount the directory.  The follow up update logic will locate the 
localizer directory and replace information in the localizer directory.  In 
case, if anyone wonder that why go through the compress service.json file to 
tarball step in the next patch.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-04 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603707#comment-16603707
 ] 

Wangda Tan commented on YARN-8569:
--

Thanks [~eyang],  

And forgot to mention: if we're going to add new functionalities to c-e, it 
could be helpful to add a flag to c-e.cfg, and make it off by default. Just 
like what we did for YARN-6623 and YARN-6033 to avoid exposing security holes.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-04 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603686#comment-16603686
 ] 

Eric Yang commented on YARN-8569:
-

[~leftnoteasy] {quote}How we can make it available prior to container launch? 
By localizing service.json for each launched container? {quote}

Yes, then the new file will replace the old file by removing the symlink to 
replace the file in symlink.  I need to experiment with this.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-04 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603680#comment-16603680
 ] 

Wangda Tan commented on YARN-8569:
--

[~eyang], 

How we can make it available prior to container launch? By localizing 
service.json for each launched container? 
{quote}We could also make a subdirectory in nmPrviate to ensure that prefix 
path is not wide open for exploit
{quote}
This sounds also good :).

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-03 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602513#comment-16602513
 ] 

Eric Yang commented on YARN-8569:
-

{quote}As mentioned in the doc, the info appears with possible delay after 
container launch. So containers need to add additional logics to get what is 
the resources allocated to the container. However, such "sys info" for 
individual containers should be created when container got launched. (Do you 
think does it make sense which "/sys/fs/cgroups"  created minutes after process 
launched by OS?).{quote}

I plan to do another round to merge patch 1 and patch 5 to present resource 
information prior to container launch.  However, IP addresses will not be 
available until docker reports them.  Delay for obtaining IP is unavoidable.

{quote}And in implementation, AM should have ability to write files to an 
arbitrary file instead of hard coded to "service.json". {quote}

This restriction is to prevent hiked YARN user from leaking more senstive 
information like hdfs delegation token.  We could also make a subdirectory in 
nmPrviate to ensure that prefix path is not wide open for exploit.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-03 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602350#comment-16602350
 ] 

Wangda Tan commented on YARN-8569:
--

And in implementation, AM should have ability to write files to an arbitrary 
file instead of hard coded to "service.json". 

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-09-03 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602345#comment-16602345
 ] 

Wangda Tan commented on YARN-8569:
--

[~eyang],

I still think it is a bad idea to support "sys info" by using the API.

As mentioned in the doc, the info appears with possible delay after container 
launch. So containers need to add additional logics to get what is the 
resources allocated to the container. However, such "sys info" for individual 
containers should be created when container got launched. (Do you think does it 
make sense which "/sys/fs/cgroups or /"  created minutes after process launched 
by OS?).

Second, given this is completely created/managed by AM, we should never call it 
"/hadoop/yarn/sysfs" or "cluster info", etc., as well as APIs, given it is not 
controlled by YARN. It should be renamed to "am-info" or "app-info".

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-31 Thread genericqa (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599318#comment-16599318
 ] 

genericqa commented on YARN-8569:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 19s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
46s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
20s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  1m 
11s{color} | {color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  1m 11s{color} | 
{color:red} hadoop-yarn in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 11s{color} 
| {color:red} hadoop-yarn in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  8s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch 
generated 28 new + 130 unchanged - 1 fixed = 158 total (was 131) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  3m 
50s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
23s{color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed. 
{color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
22s{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager
 generated 1 new + 9 unchanged - 0 fixed = 10 total (was 9) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
38s{color} | {color:green} hadoop-yarn-api in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 23s{color} 
| {color:red} hadoop-yarn-server-nodemanager in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 16m 53s{color} 
| {color:red} hadoop-yarn-services-core in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
14s{color} | {color:green} hadoop-yarn-site in the patch passed. {color} |
| 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-31 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16599266#comment-16599266
 ] 

Eric Yang commented on YARN-8569:
-

Patch 005 fixed white space and findbugs issues.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569 YARN sysfs interface to provide cluster 
> information to application.pdf, YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch, YARN-8569.005.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-30 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598002#comment-16598002
 ] 

Eric Yang commented on YARN-8569:
-

Patch 004 minor code clean up and added unit test for LinuxContainerExecutor 
changes.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch, YARN-8569.004.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-30 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16597940#comment-16597940
 ] 

Eric Yang commented on YARN-8569:
-

Patch 003 added the following improvements:
- unit tests for container-executor
- unit tests for ServiceScheduler in YARN Service AM.
- Site documentation to explain YARN sysfs concept.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569.001.patch, YARN-8569.002.patch, 
> YARN-8569.003.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-29 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596578#comment-16596578
 ] 

Eric Yang commented on YARN-8569:
-

{quote}There were tons of debates regarding to yarn user should be treated as 
root or not before. We saw some issues of c-e causes yarn user can manipulate 
other user's directories, or directly escalate to root user. All of these 
issues become CVE.{quote}

If I recall correctly, I reported and fixed container-executor security issues 
like YARN-7590 and YARN-8207.  I think I have written proper security check to 
make sure the caller via network has the right Kerberos tgt that matches end 
user's container directory and also validated the source of the data is coming 
from node manager private directory.  There is a permission validation to copy 
spec file information from node manager private directory to end user container 
directory.  This design is similar to transporting delegation tokens to 
container working directory.  I think there are good enough security 
validations to ensure no security hole has been added to this work.  Let me 
know if you find security holes.

{quote}
>From YARN's design purpose, ideally all NM/RM logics should be as general as 
>possible, all service-related stuffs should be handled by service framework 
>like API server or ServiceMaster. I really don't like the idea of adding 
>service-specific API to NM API.{quote}

The new API is not YARN service framework specific.  ContainerExecutor provide 
basic API for starting, stopping, and clean up containers, but it is missing 
more sophisticated API like synchronize configuration among containers.  The 
new API syncYarnSysFS is proposed to allow ContainerExecutor developer to write 
their own implementation of populating text information to /hadoop/yarn/sysfs.  
Custom AM can be written to use the new API and populate other text 
information.  The newly added API in node manager is generic and avoid the 
double serialization cost exists in container manager protobuf and rpc code.  
There is no extra serialization cost of the content during transport, thus the 
new API is more efficient and light weight.  There is nothing specific to YARN 
service, although YARN service is the first consumer for this API.

{quote}
1) ServiceMaster ro mount a local directory (under the container's local dir) 
when launch docker container (example like: ./service-info -> /service/sys/fs/) 

2) ServiceMaster request to re-localize new service spec json file to the 
./service-info folder.
{quote}

What is "ro mount" in the first sentence?  Is it remote mount or read-only 
mount?  ServiceMaster is not guarantee to run on the same node as other 
container.  Hence, there is no practical way to mount service master's 
directory to container's local directory cross nodes.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569.001.patch, YARN-8569.002.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command 

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-29 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596014#comment-16596014
 ] 

Wangda Tan commented on YARN-8569:
--

[~eyang], 
{quote}Unless malicious user already hacked into yarn user account and populate 
data as yarn user, there is no easy parameter hacking to container-executor to 
trigger exploits.
{quote}
 

There were tons of debates regarding to yarn user should be treated as root or 
not before. We saw some issues of c-e causes yarn user can manipulate other 
user's directories, or directly escalate to root user. All of these issues 
become CVE.

 
{quote}This is the reason that this solution is invented to lower the bar of 
writing clustering software for Hadoop.
{quote}
 

It gonna be help if you can share some real-world examples. 

>From YARN's design purpose, ideally all NM/RM logics should be as general as 
>possible, all service-related stuffs should be handled by service framework 
>like API server or ServiceMaster. I really don't like the idea of adding 
>service-specific API to NM API.

If you do think update service spec json file is important, another approach 
could be:

1) ServiceMaster ro mount a local directory (under the container's local dir) 
when launch docker container (example like: ./service-info -> /service/sys/fs/) 

2) ServiceMaster request to re-localize new service spec json file to the 
./service-info folder.

 

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569.001.patch, YARN-8569.002.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-29 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595986#comment-16595986
 ] 

Eric Yang commented on YARN-8569:
-

{quote}
1) Frequently syncing service spec could cause cluster performance issue, a one 
thousand component instances service can cause several thousands of DFS calls 
for any service state change. Even if the feature is on by demand, a 
non-malicious user can still cause big perf impact to a cluster.
{quote}

There is no dfs calls in patch 2.  The spec information is transferred from 
AM's memory to node manager local directory.  It is light weight HTTP REST 
calls.  Ten years ago, I would agree with you that one thousand HTTP request to 
copy spec information to all node managers could choke some older generation of 
network equipment from 2005.  However, network has advanced a lot in the past 
10 years.  Most 10GB Ethernet can achieve [one million 
IOPS|https://phys.org/news/2010-01-intel-million-iops-gigabit-ethernet.html].  
Doing 1000 REST calls with 4kbytes payload can be completed in 1-5 millisecond 
and only require 4MB of bandwidth.  Base on my local testing, the 
synchronization happens in faction of milliseconds.  The described performance 
impact hard to achieve because the docker container launch and reach stable 
state took some amount of time.  Spec information distribution only happens 
when all information are reported to AM, and AM has reached stable state.  The 
spec information distribution is minuscule that machine stay idle through out 
the operation.

{quote}
2) Changes of this patch are very involved, all the way from native service to 
C code. It seems that local_dir is passed by user's command, is it possible a 
malicious user can use c-e to update service spec of other users which can 
potentially cause security issues? I haven't checked much details yet, just 
want to point out possibilities.
{quote}

Malicious user can not use container-executor to update service spec because 
container-executor checks the permission of the source directory is owned by 
node manager before it proceed with update.  This means the source of truth is 
all coming from yarn user.  Unless malicious user already hacked into yarn user 
account and populate data as yarn user, there is no easy parameter hacking to 
container-executor to trigger exploits.  The technique of local_dir passing is 
same as container launch, and check the local_dir is owned by yarn user.  This 
allows us to use local_dir from yarn-site.xml instead of hard code in 
container-executor.cfg for local directories.  Hence, this new patch to 
container-executor provides 0% more chance to cause exploits.

{quote}
Instead of doing this, to me it gonna be very useful to write 
container-specific information to spec or even ENV, such as resource, etc. It 
requires much less changes and 70-80% of issues can be addressed.
{quote}

Container information is already updated in spec file on HDFS by YARN service 
application master that is outside scope of this JIRA.  The limitation of this 
information on HDFS is still inaccessible to container that has no knowledge 
about Hadoop API.  This is the reason that this information is exposed as a 
file mounted to container without require container to carry any custom code to 
process cluster information.
 
ENV approach has been identified as a non-starter because there is no way to 
update ENV once docker container started running.  Hence, using flex command, 
or the container restarts on another node.  ENV variable will have out-dated 
information because ENV variable are immutable to the cluster changes.  This is 
the reason that this solution is invented to lower the bar of writing 
clustering software for Hadoop.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569.001.patch, YARN-8569.002.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  

[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-28 Thread Wangda Tan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595949#comment-16595949
 ] 

Wangda Tan commented on YARN-8569:
--

[~eyang],

As we discussed offline, the use case is not clear to me. For example, TF 
doesn't assume tasks set will be changed during job run, and TF_CONFIG (which 
includes all endpoints of each instances) can be calculated upfront before 
launching service. There could be some services can be benefit from the spec, 
however, we cannot find a strong real world example to support this feature. 

In spite of above comment, to me it is always good to have new features added. 
However there're two big concerns not addressed:

1) Frequently syncing service spec could cause cluster performance issue, a one 
thousand component instances service can cause several thousands of DFS calls 
for any service state change. Even if the feature is on by demand, a 
non-malicious user can still cause big perf impact to a cluster.

2) Changes of this patch are very involved, all the way from native service to 
C code. It seems that local_dir is passed by user's command, is it possible a 
malicious user can use c-e to update service spec of other users which can 
potentially cause security issues? I haven't checked much details yet, just 
want to point out possibilities.

Instead of doing this, to me it gonna be very useful to write 
container-specific information to spec or even ENV, such as resource, etc. It 
requires much less changes and 70-80% of issues can be addressed. 

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569.001.patch, YARN-8569.002.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-28 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595777#comment-16595777
 ] 

Eric Yang commented on YARN-8569:
-

Patch 2 is a fully working implementation.  When YARN service become stable, 
YARN service application master will populate the container information to 
/hadoop/yarn/sysfs/service.json.  I will add test case and documentation in 
patch 3.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569.001.patch, YARN-8569.002.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-22 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589430#comment-16589430
 ] 

Eric Yang commented on YARN-8569:
-

>From today's docker meetup discussion, distributed cache is not ideal 
>interface for replicate frequently changing cluster information.  If the file 
>checksum changes due to cluster information update, the file may not get 
>replicated to distributed cache.  This information is more similar to token 
>generation and population instead of jar file distribution.  I will change the 
>population mechanism to align with token population instead of distributed 
>cache.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569.001.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-20 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586769#comment-16586769
 ] 

Eric Yang commented on YARN-8569:
-

First patch for demo what the interface looks like.  Here is my testing json 
for launching the app:
{code}
{
  "name": "sleeper-service",
  "version": "1.0",
  "components" :
  [
{
  "name": "ping",
  "number_of_containers": 2,
  "artifact": {
"id": "hadoop/centos:latest",
"type": "DOCKER"
  },
  "launch_command": "sleep,1",
  "resource": {
"cpus": 1,
"memory": "256"
  },
  "restart_policy": "NEVER",
  "configuration": {
"env": {
  "YARN_CONTAINER_RUNTIME_DOCKER_DELAYED_REMOVAL":"true",
  "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE":"true",
  "YARN_CONTAINER_RUNTIME_YARN_SYSFS":"true"
},
"properties": {
  "docker.network": "host"
}
  }
}
  ]
}
{code}

The patch will localize a copy of service.json in HDFS, and distribute it as a 
localized resource to nodes that are going to start the container.  The other 
copy of [appname].json is not used because the content of the file changes too 
frequently that triggers IOException while localizing.

Let me know if this is the direction that we want to continue.  If all looks 
good, I will provide live update to this file when application completes 
transition between STARTED, FLEXING or STABLE, etc.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Assignee: Eric Yang
>Priority: Major
>  Labels: Docker
> Attachments: YARN-8569.001.patch
>
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-10 Thread Shane Kumpf (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16576681#comment-16576681
 ] 

Shane Kumpf commented on YARN-8569:
---

Thanks for filing this [~eyang]. I have a use case that could benefit from this 
as well.

When running in containers, one challenging piece is determining how much CPU 
and memory was allocated to the container. Traditional os tooling shows the 
totals from the host. This is especially problematic for tools like Ambari, 
which use os tooling to dynamically set configuration. Exposing the resource 
request details via this mechanism could be used to solve this problem.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Priority: Major
>  Labels: Docker
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-08-09 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575445#comment-16575445
 ] 

Eric Yang commented on YARN-8569:
-

Sysfs is a pseudo file system provided by Linux Kernel to expose system related 
information to user space.  YARN can mimic the same ideology to export cluster 
information to container.  The proposal is to expose cluster information to:

{code}
/hadoop/yarn/fs/cluster.json
{code}

This basically have the runtime information about the deployed application, and 
getting updated when state changes happen.  The file is replicated from hdfs to 
host system in appcache for the application.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Priority: Major
>  Labels: Docker
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-07-25 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556348#comment-16556348
 ] 

Eric Yang commented on YARN-8569:
-

[~oliverhuh...@gmail.com] Your approach works fine for application that carry 
hadoop client or zookeeper client.  This proposed interface is to lower the bar 
of entry to obtain cluster information for non-Hadoop native applications.  
This is the main reason to offer a file based interface for nodes.

The high level view of the design looks like this:
# Application Master received YARN service JSON from yarn cli.
# Application Master write the hostname information to YARN service JSON 
resides in /user/${USER}/.yarn/services/[service]/[service].json
# The file is added to distributed cache and localized during container launch.
# The file is bind-mount into docker container for consumption at a predefined 
location.
# Flex operation will trigger update of [service].json and repopulate 
distributed cache when nodes involved in the cluster has changed.

User application can poll file changes from docker container to be notified of 
cluster information changes.

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Priority: Major
>  Labels: Docker
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application

2018-07-25 Thread Keqiu Hu (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556244#comment-16556244
 ] 

Keqiu Hu commented on YARN-8569:


I can see the value of sharing some information between cluster nodes with 
this, just want to throw how we tackle the problem. We did this by storing the 
information in application master, each worker node would have a TaskExecutor 
to heartbeat with AM to get latest cluster information.

How do you ensure the file is atomic, for example, multiple nodes can modify 
the mounted file the same time?

> Create an interface to provide cluster information to application
> -
>
> Key: YARN-8569
> URL: https://issues.apache.org/jira/browse/YARN-8569
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Eric Yang
>Priority: Major
>  Labels: Docker
>
> Some program requires container hostnames to be known for application to run. 
>  For example, distributed tensorflow requires launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
>  --ps_hosts=ps0.example.com:,ps1.example.com: \
>  --worker_hosts=worker0.example.com:,worker1.example.com: \
>  --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell, or YARN 
> services launch_command.  In addition, the dynamic parameters do not work 
> with YARN flex command.  This is the classic pain point for application 
> developer attempt to automate system environment settings as parameter to end 
> user application.
> It would be great if YARN Docker integration can provide a simple option to 
> expose hostnames of the yarn service via a mounted file.  The file content 
> gets updated when flex command is performed.  This allows application 
> developer to consume system environment settings via a standard interface.  
> It is like /proc/devices for Linux, but for Hadoop.  This may involve 
> updating a file in distributed cache, and allow mounting of the file via 
> container-executor.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org