[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16801020#comment-16801020
 ] 

Hudson commented on YARN-9391:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16277 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16277/])
YARN-9391.  Fixed node manager environment leaks into Docker containers. 
(eyang: rev 3c45762a0bfb403e069a03e30d35dd11432ee8b0)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java


> Disable PATH variable to be passed to Docker container
> --
>
> Key: YARN-9391
> URL: https://issues.apache.org/jira/browse/YARN-9391
> Project: Hadoop YARN
> Issue Type: Sub-task
> Affects Versions: 3.2.0, 3.1.1, 3.1.2
> Reporter: Eric Yang
> Assignee: Jim Brennan
> Priority: Major
> Fix For: 3.3.0, 3.2.1, 3.1.3
>
> Attachments: YARN-9391.001.patch
>
>
> This was observed while using the Apache NiFi Docker image.  The image assumes 
> that the PATH variable contains /bin in order to reference system utilities.  
> With the default configuration, the host YARN environment's PATH variable 
> leaks into the container by accident, and it does not contain /bin.  In 
> general, it seems the node manager should block the PATH variable from leaking 
> into the container.  I am not sure whether there is a valid use case, from the 
> Docker point of view, in which the host PATH variable must leak into the 
> container.  From the Hadoop point of view, if the container is merely a chroot 
> and is a mirror image of the host worker dir, it is good to keep the host PATH 
> variable the same.
> Maybe we want to be more specific and block the PATH variable from leaking 
> into the Docker container only if it is using ENTRYPOINT?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-25 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16801008#comment-16801008
 ] 

Eric Badger commented on YARN-9391:
---

+1 lgtm




[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-25 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800961#comment-16800961
 ] 

Eric Yang commented on YARN-9391:
-

+1 verified with mapreduce, non-entrypoint mode, and entrypoint mode.




[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-25 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800957#comment-16800957
 ] 

Jim Brennan commented on YARN-9391:
---

[~ebadger], [~eyang] patch 001 is ready for review.

 




[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-25 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800930#comment-16800930
 ] 

Hadoop QA commented on YARN-9391:
-

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 19m 44s | trunk passed |
| +1 | compile | 1m 9s | trunk passed |
| +1 | checkstyle | 0m 30s | trunk passed |
| +1 | mvnsite | 0m 41s | trunk passed |
| +1 | shadedclient | 12m 13s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 11s | trunk passed |
| +1 | javadoc | 0m 26s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 38s | the patch passed |
| +1 | compile | 1m 5s | the patch passed |
| +1 | javac | 1m 5s | the patch passed |
| +1 | checkstyle | 0m 25s | the patch passed |
| +1 | mvnsite | 0m 34s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 32s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 8s | the patch passed |
| +1 | javadoc | 0m 23s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 20m 38s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 31s | The patch does not generate ASF License warnings. |
| | | 73m 59s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9391 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12963640/YARN-9391.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux d23a5771029f 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e5d72f5 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/23799/testReport/ |
| Max. process+thread count | 317 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/23799/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-20 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797182#comment-16797182
 ] 

Jim Brennan commented on YARN-9391:
---

OK.  I will put up a patch to fix this issue.

 




[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796620#comment-16796620
 ] 

Eric Yang commented on YARN-9391:
-

[~Jim_Brennan] Yes, this is what adds the NM variables to the entrypoint 
Docker container.




[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-19 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796597#comment-16796597
 ] 

Jim Brennan commented on YARN-9391:
---

[~eyang] if the concern is only for Nodemanager white-list variables leaking 
through, it may be due to this code in ContainerExecutor.writeLaunchEnv():
{noformat}
  // Add the whitelist vars to the environment.  Do this after writing
  // environment variables so they are not written twice.
  for (String var : whitelistVars) {
    if (!environment.containsKey(var)) {
      String val = getNMEnvVar(var);
      if (val != null) {
        environment.put(var, val);
      }
    }
  }
}
{noformat}
This is adding the white-listed variables to the environment map which gets 
passed to launchContainer. In the native and non-entry-point cases, I don't 
think this is necessary, but I am not 100% sure about that - we use the launch 
script in those cases. In the entry-point case, this code is what may be adding 
the white-list variables to the environment map, which you then pass raw to the 
container.  Note that it won't add variables that were already defined by the 
user.
 Do you think this might explain what you are seeing?
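
The merge behaviour described above can be sketched in isolation. This is a minimal, hypothetical stand-in and not the Hadoop source: the class name, the method name and the plain map used in place of getNMEnvVar() are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the whitelist merge step; not the Hadoop implementation. */
final class WhitelistMergeSketch {

  /**
   * Copies each whitelisted variable from the NM environment into the
   * container environment, unless the user already defined that variable.
   */
  static Map<String, String> mergeWhitelist(Map<String, String> userEnv,
      List<String> whitelistVars, Map<String, String> nmEnv) {
    Map<String, String> merged = new LinkedHashMap<>(userEnv);
    for (String var : whitelistVars) {
      if (!merged.containsKey(var)) {
        // nmEnv.get plays the role of getNMEnvVar(var) in the excerpt above.
        String val = nmEnv.get(var);
        if (val != null) {
          merged.put(var, val);
        }
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> userEnv = new LinkedHashMap<>();
    userEnv.put("LANG", "en_US.UTF-8");
    Map<String, String> nmEnv = Map.of("PATH", "/usr/local/hadoop/bin", "LANG", "C");
    // The user-defined LANG is kept, the NM PATH is added, and TZ is skipped
    // because the NM environment in this example does not define it.
    System.out.println(mergeWhitelist(userEnv, List.of("PATH", "LANG", "TZ"), nmEnv));
  }
}
```

If the merged map is then handed to the container unmodified, as described for the entry-point case, the NM's PATH ends up in the container even though the user never asked for it.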




[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796569#comment-16796569
 ] 

Eric Yang commented on YARN-9391:
-

In non-entrypoint mode, a Docker container works more like a chroot 
environment.  If someone wants to run a mapreduce task in a Docker container, 
this would be the mode to support that use case.  It is likely that host-level 
Java and executables are available in the container at the same location as on 
the host.  In this case, it makes sense to pass host-level environment 
variables into the Docker container.

If we are in agreement on the operating style of non-entrypoint Docker, then 
we only need to make code changes to filter variables for entrypoint-based 
Docker containers, as [~Jim_Brennan] stated.  [~Jim_Brennan] would you like to 
take ownership of this one?




[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-19 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796560#comment-16796560
 ] 

Jim Brennan commented on YARN-9391:
---

[~ebadger] I assume you are referring to the PATH variable in particular, not 
to all environment variables.

I think in this case the PATH variable is being included in the whitelist for 
the NM (for the non-docker case that [~eyang] mentioned above).   I agree that 
it is unlikely that we really need the NM's PATH variable inside of a docker 
container.   I'm a little reluctant to have the NM just skip that whitelist 
entry for docker/oci runtimes.   Currently the whitelist environment variable 
processing/script writing is done in ContainerLaunch/ContainerExecutor.   What 
is there works for the non-entry-point case (as long as the PATH is defined in 
the image, which I think we are assuming here).

For the entry point case, the environment is added to the run command in 
DockerLinuxContainerRuntime.launchContainer().
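
As a rough illustration of why this matters in the entry-point case: a variable passed explicitly when a container is started overrides the value baked into the image. The sketch below builds a `docker run` argument list with one `-e NAME=value` pair per entry. It is a conceptual model only; the class and method names are hypothetical, and it does not reproduce how DockerLinuxContainerRuntime actually assembles its command.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: every entry of the environment map becomes a -e flag. */
final class DockerRunEnvSketch {

  static List<String> runArgs(String image, Map<String, String> env) {
    List<String> args = new ArrayList<>(List.of("docker", "run"));
    // Sort by name so the generated command line is deterministic.
    for (Map.Entry<String, String> e : new TreeMap<>(env).entrySet()) {
      args.add("-e");
      args.add(e.getKey() + "=" + e.getValue());
    }
    args.add(image);
    return args;
  }

  public static void main(String[] args) {
    // "example/image" is a placeholder image name.
    Map<String, String> env = Map.of("PATH", "/usr/local/hadoop/bin", "TZ", "UTC");
    System.out.println(String.join(" ", runArgs("example/image", env)));
  }
}
```

Any NM whitelist variable that reaches the environment map is therefore applied on top of whatever the image defines.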




[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-19 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796533#comment-16796533
 ] 

Eric Badger commented on YARN-9391:
---

bq. I think this issue is specific to the entry-point case where whitelist 
variables override those specified in the image.

In what circumstances does it make sense for the docker container to use the 
environment variables that are specified on the host? The docker image could be 
wildly different than the layout on the host. I don't think that we can use the 
assumption that the docker image and the host are going to have similar 
layouts. So that makes me want to not use the environment variables of the NM 
unless the job explicitly asks for them (or possibly not even then. Then they 
would just specify them in the environment for their job). 




[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-19 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796521#comment-16796521
 ] 

Jim Brennan commented on YARN-9391:
---

{quote}
The whitelist needs to behave differently for docker containers and non-docker 
containers.
{quote}

[~ebadger] I'm not sure this is what we want.   It already does behave 
differently in that for non-Entry-Point docker, the docker image can override 
whitelist variables.

I think this issue is specific to the entry-point case where whitelist 
variables override those specified in the image.




[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-19 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796467#comment-16796467
 ] 

Eric Badger commented on YARN-9391:
---

bq. Filtering the PATH variable out of the environment whitelist has some 
undesired side effects for mapreduce-style workloads outside of Docker 
containers.  For example, a streaming task that depends on python will no 
longer work.

Yep, that makes sense. The whitelist needs to behave differently for docker 
containers and non-docker containers.




[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-19 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796238#comment-16796238
 ] 

Eric Yang commented on YARN-9391:
-

[~ebadger] Filtering the PATH variable out of the environment whitelist has 
some undesired side effects for mapreduce-style workloads outside of Docker 
containers.  For example, a streaming task that depends on python will no 
longer work.

If we partition the problem by container type, the desired outcome looks more 
like this:

| | Linux Container | Docker without EntryPoint | Docker with EntryPoint |
| Allowed variables | All white listed variables | All white listed variables + Docker specific variables + YARN User defined variables | Subset of white listed variables + YARN Docker specific variables + User defined variables |
| Shell expansion of variables | Yes | Yes | No |

It looks like the subset of variables to pass in entrypoint mode is LANG and 
TZ only.  The rest will have undesired side effects.
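
The "Allowed variables" row for whitelisted variables could be modelled roughly as follows. This is a sketch of the proposal in the table, not the patch that was eventually committed; the enum, the class name and the method name are hypothetical, and the Docker-specific and user-defined variables from the table are not modelled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of per-container-type whitelist filtering. */
final class EnvFilterSketch {

  enum ContainerKind { LINUX, DOCKER, DOCKER_ENTRYPOINT }

  /** The subset named in the comment above for entrypoint mode. */
  static final Set<String> ENTRYPOINT_SAFE = Set.of("LANG", "TZ");

  /** Returns the whitelisted NM variables that may be passed for the given kind. */
  static Map<String, String> allowedWhitelist(ContainerKind kind,
      Map<String, String> whitelisted) {
    Map<String, String> result = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : whitelisted.entrySet()) {
      boolean allowed = kind != ContainerKind.DOCKER_ENTRYPOINT
          || ENTRYPOINT_SAFE.contains(e.getKey());
      if (allowed) {
        result.put(e.getKey(), e.getValue());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> whitelisted = new LinkedHashMap<>();
    whitelisted.put("PATH", "/usr/local/hadoop/bin");
    whitelisted.put("LANG", "en_US.UTF-8");
    whitelisted.put("TZ", "UTC");
    for (ContainerKind kind : ContainerKind.values()) {
      System.out.println(kind + " -> " + allowedWhitelist(kind, whitelisted));
    }
  }
}
```

With this shape, the Linux and non-entrypoint Docker cases keep the full whitelist, so a streaming task that relies on the host PATH is unaffected.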




[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-19 Thread Jim Brennan (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796227#comment-16796227
 ] 

Jim Brennan commented on YARN-9391:
---

[~ebadger] you are correct.  It doesn't look like we explicitly add PATH to the 
container environment unless it is specified in {{yarn.nodemanager.admin-env}} 
or, as you say, if it is specified in {{yarn.nodemanager.env-whitelist}}.
[~eyang] do you know where the PATH variable is coming from in this case?






[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container

2019-03-19 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796216#comment-16796216
 ] 

Eric Badger commented on YARN-9391:
---

We can already do this today by changing the whitelist to not include PATH, 
right [~Jim_Brennan]? 

I agree with [~eyang] that the PATH variable (or anything really) outside of 
the container shouldn't really be relevant inside of the container. Ideally, 
the image should define PATH so that it will override what the NM has. But, in 
the case that it isn't set, I'm not sure falling back to the NM PATH is the 
correct thing to do. At best it's masking failures, and at worst it's 
leaking environment variable info about the host. 

And just a note, if PATH is set in the image, it will be selected over what is 
set in the whitelist. The only way this isn't true is if the variable was 
explicitly set by the user. [~Jim_Brennan] can correct me if I'm wrong on this. 
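
The precedence described in this comment (user-defined value first, then the image's value, then the NM whitelist value) can be written down as a small lookup. This is only a model of the ordering as stated here, with hypothetical names; it says nothing about the mechanism YARN uses to achieve it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of the variable precedence described above. */
final class EnvPrecedenceSketch {

  /** Returns the first definition found: user, then image, then NM whitelist. */
  static Optional<String> resolve(String name, Map<String, String> userEnv,
      Map<String, String> imageEnv, Map<String, String> nmWhitelistEnv) {
    List<Map<String, String>> sources = Arrays.asList(userEnv, imageEnv, nmWhitelistEnv);
    for (Map<String, String> source : sources) {
      String value = source.get(name);
      if (value != null) {
        return Optional.of(value);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    Map<String, String> image = Map.of("PATH", "/usr/bin:/bin");
    Map<String, String> nm = Map.of("PATH", "/opt/hadoop/bin");
    // No user override: the image's PATH is selected over the NM's.
    System.out.println(resolve("PATH", Map.of(), image, nm));
    // Image defines nothing: the lookup falls back to the NM's PATH,
    // which is the fallback this comment questions.
    System.out.println(resolve("PATH", Map.of(), Map.of(), nm));
  }
}
```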
