[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16801020#comment-16801020 ]

Hudson commented on YARN-9391:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16277 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16277/])
YARN-9391. Fixed node manager environment leaks into Docker containers. (eyang: rev 3c45762a0bfb403e069a03e30d35dd11432ee8b0)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/ContainerExecutor.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java

> Disable PATH variable to be passed to Docker container
> ------------------------------------------------------
>
>                 Key: YARN-9391
>                 URL: https://issues.apache.org/jira/browse/YARN-9391
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 3.2.0, 3.1.1, 3.1.2
>            Reporter: Eric Yang
>            Assignee: Jim Brennan
>            Priority: Major
>             Fix For: 3.3.0, 3.2.1, 3.1.3
>
>         Attachments: YARN-9391.001.patch
>
>
> This was observed when using the Apache NiFi Docker image. The image assumes
> that the PATH variable contains /bin so that system utilities can be found.
> By default, however, the host YARN environment's PATH variable is leaked into
> the container by accident, and in the default configuration it does not
> contain /bin. In general, it seems the node manager should block the PATH
> variable from leaking into the container. I am not sure whether there is a
> valid use case, from the Docker point of view, in which the host PATH
> variable must leak into the container. From the Hadoop point of view, if the
> container is merely a chroot and is a mirror image of the host worker
> directory, it is good to keep the host PATH variable the same.
> Maybe we want to be more specific and block the PATH variable from leaking
> into the Docker container only if it is using ENTRYPOINT?
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16801008#comment-16801008 ]

Eric Badger commented on YARN-9391:
-----------------------------------

+1 lgtm
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800961#comment-16800961 ]

Eric Yang commented on YARN-9391:
---------------------------------

+1 verified with mapreduce, non-entrypoint mode, and entrypoint mode.
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800957#comment-16800957 ]

Jim Brennan commented on YARN-9391:
-----------------------------------

[~ebadger], [~eyang] patch 001 is ready for review.
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800930#comment-16800930 ]

Hadoop QA commented on YARN-9391:
---------------------------------

| +1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 19m 44s | trunk passed |
| +1 | compile | 1m 9s | trunk passed |
| +1 | checkstyle | 0m 30s | trunk passed |
| +1 | mvnsite | 0m 41s | trunk passed |
| +1 | shadedclient | 12m 13s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 11s | trunk passed |
| +1 | javadoc | 0m 26s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 38s | the patch passed |
| +1 | compile | 1m 5s | the patch passed |
| +1 | javac | 1m 5s | the patch passed |
| +1 | checkstyle | 0m 25s | the patch passed |
| +1 | mvnsite | 0m 34s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 32s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 8s | the patch passed |
| +1 | javadoc | 0m 23s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 20m 38s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 31s | The patch does not generate ASF License warnings. |
| | | 73m 59s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | YARN-9391 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12963640/YARN-9391.001.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux d23a5771029f 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e5d72f5 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/23799/testReport/ |
| Max. process+thread count | 317 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/23799/console |
| Powered by | Apache Yetus 0.8.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16797182#comment-16797182 ]

Jim Brennan commented on YARN-9391:
-----------------------------------

OK. I will put up a patch to fix this issue.
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796620#comment-16796620 ]

Eric Yang commented on YARN-9391:
---------------------------------

[~Jim_Brennan] Yes, this is what causes the NM variables to be added to the entrypoint Docker container.
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796597#comment-16796597 ]

Jim Brennan commented on YARN-9391:
-----------------------------------

[~eyang] if the concern is only for Nodemanager white-list variables leaking through, it may be due to this code in ContainerExecutor.writeLaunchEnv():
{noformat}
  // Add the whitelist vars to the environment. Do this after writing
  // environment variables so they are not written twice.
  for(String var : whitelistVars) {
    if (!environment.containsKey(var)) {
      String val = getNMEnvVar(var);
      if (val != null) {
        environment.put(var, val);
      }
    }
  }
}
{noformat}
This is adding the white-listed variables to the environment map which gets passed to launchContainer.

In the native and non-entry-point cases, I don't think this is necessary, but I am not 100% sure about that - we use the launch script in those cases.

In the entry-point case, this code is what may be adding the white-list variables to the environment map, which you then pass raw to the container. Note that it won't add variables that were already defined by the user.

Do you think this might explain what you are seeing?
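The merge behaviour described in this comment can be illustrated with a minimal, stand-alone sketch. It is not the Hadoop class: `WhitelistMergeSketch`, `mergeWhitelist`, and the `nmEnv` map (standing in for `getNMEnvVar`) are hypothetical names, and the sample values are invented for the demonstration. It only shows that whitelisted NM variables are added when the user has not already defined them.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone model of the whitelist merge; not Hadoop code.
class WhitelistMergeSketch {

  /**
   * Returns a copy of containerEnv with whitelisted NM variables added.
   * Variables already present in containerEnv (user-defined) are kept as-is.
   */
  static Map<String, String> mergeWhitelist(Map<String, String> containerEnv,
      List<String> whitelistVars, Map<String, String> nmEnv) {
    Map<String, String> merged = new LinkedHashMap<>(containerEnv);
    for (String var : whitelistVars) {
      if (!merged.containsKey(var)) {
        String val = nmEnv.get(var); // stand-in for getNMEnvVar(var)
        if (val != null) {
          merged.put(var, val);
        }
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    // Invented sample values for the node manager's own environment.
    Map<String, String> nmEnv = new LinkedHashMap<>();
    nmEnv.put("PATH", "/usr/local/hadoop/bin");
    nmEnv.put("LANG", "en_US.UTF-8");

    // The user defined LANG explicitly, so the NM value must not replace it.
    Map<String, String> userEnv = new LinkedHashMap<>();
    userEnv.put("LANG", "C");

    Map<String, String> merged =
        mergeWhitelist(userEnv, Arrays.asList("PATH", "LANG", "TZ"), nmEnv);
    System.out.println(merged); // prints {LANG=C, PATH=/usr/local/hadoop/bin}
  }
}
```

In this model the host PATH ends up in the map even though the user never asked for it, which matches the leak suspected for the entry-point case.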
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796569#comment-16796569 ]

Eric Yang commented on YARN-9391:
---------------------------------

In non-entrypoint mode, a Docker container works more like a chroot environment. If someone wants to run a MapReduce task in a Docker container, this is the mode that supports that use case. It is likely that Java and other executables are available in the container at the same locations as on the host. In this case, it makes sense to pass host-level environment variables to the Docker container.

If we agree on this operating style for non-entrypoint Docker, then we only need to change the code to filter variables for entrypoint-based Docker containers, as [~Jim_Brennan] stated.

[~Jim_Brennan], would you like to take ownership of this one?
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796560#comment-16796560 ]

Jim Brennan commented on YARN-9391:
-----------------------------------

[~ebadger] I assume you are referring to the PATH variable in particular, not to all environment variables. I think in this case the PATH variable is being included in the whitelist for the NM (for the non-docker case that [~eyang] mentioned above). I agree that it is unlikely that we really need the NM's PATH variable inside of a docker container.

I'm a little reluctant to have the NM just skip that whitelist entry for docker/oci runtimes. Currently the whitelist environment variable processing/script writing is done in ContainerLaunch/ContainerExecutor. What is there works for the non-entry-point case (as long as the PATH is defined in the image, which I think we are assuming here). For the entry point case, the environment is added to the run command in DockerLinuxContainerRuntime.launchContainer().
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796533#comment-16796533 ]

Eric Badger commented on YARN-9391:
-----------------------------------

bq. I think this issue is specific to the entry-point case where whitelist variables override those specified in the image.

In what circumstances does it make sense for the docker container to use the environment variables that are specified on the host? The docker image could be wildly different than the layout on the host. I don't think that we can use the assumption that the docker image and the host are going to have similar layouts. So that makes me want to not use the environment variables of the NM unless the job explicitly asks for them (or possibly not even then. Then they would just specify them in the environment for their job).
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796521#comment-16796521 ]

Jim Brennan commented on YARN-9391:
-----------------------------------

{quote}
The whitelist needs to behave differently for docker containers and non-docker containers.
{quote}
[~ebadger] I'm not sure this is what we want. It already does behave differently in that for non-Entry-Point docker, the docker image can override whitelist variables. I think this issue is specific to the entry-point case where whitelist variables override those specified in the image.
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796467#comment-16796467 ]

Eric Badger commented on YARN-9391:
-----------------------------------

bq. When filtering PATH variable from environment white list, it has some undesired side effects for mapreduce style workload outside of docker container. For example, streaming task that depends on python will not work anymore.

Yep, that makes sense. The whitelist needs to behave differently for docker containers and non-docker containers.
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796238#comment-16796238 ]

Eric Yang commented on YARN-9391:
---------------------------------

[~ebadger] Filtering the PATH variable out of the environment whitelist has some undesired side effects for MapReduce-style workloads outside of Docker containers. For example, a streaming task that depends on Python will no longer work. If we partition the problem by container type, the desired outcome looks more like this:

| | Linux Container | Docker without EntryPoint | Docker with EntryPoint |
| Allowed variables | All white listed variables | All white listed variables + Docker specific variables + YARN User defined variables | Subset of white listed variables + YARN Docker specific variables + User defined variables |
| Shell expansion of variables | Yes | Yes | No |

It looks like the subset of variables to pass in entrypoint mode is LANG and TZ only. The rest will have undesired side effects.
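The "Allowed variables" row of the table above can be read as a per-container-type filter on the whitelist. The following is a minimal sketch under that reading, not the committed patch: the class, the enum, and `effectiveWhitelist` are hypothetical names, and the LANG/TZ subset is simply the one proposed in this comment.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the proposal in the table; not Hadoop code.
class EntryPointWhitelistSketch {

  enum ContainerType { LINUX, DOCKER, DOCKER_ENTRYPOINT }

  // Assumption from the comment: only LANG and TZ are passed in entrypoint mode.
  private static final Set<String> ENTRYPOINT_SAFE =
      new LinkedHashSet<>(Arrays.asList("LANG", "TZ"));

  /** Returns the whitelisted variable names that apply to the container type. */
  static Set<String> effectiveWhitelist(ContainerType type, List<String> whitelist) {
    Set<String> result = new LinkedHashSet<>(whitelist);
    if (type == ContainerType.DOCKER_ENTRYPOINT) {
      // Drop everything except the subset considered free of side effects.
      result.retainAll(ENTRYPOINT_SAFE);
    }
    return result;
  }

  public static void main(String[] args) {
    // Invented sample whitelist for the demonstration.
    List<String> whitelist = Arrays.asList("JAVA_HOME", "PATH", "LANG", "TZ");
    for (ContainerType type : ContainerType.values()) {
      System.out.println(type + " -> " + effectiveWhitelist(type, whitelist));
    }
  }
}
```

Under these assumptions, PATH and JAVA_HOME survive for Linux containers and non-entrypoint Docker but are dropped for entrypoint Docker, so a streaming task that relies on the host PATH outside Docker is unaffected.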
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796227#comment-16796227 ]

Jim Brennan commented on YARN-9391:
-----------------------------------

[~ebadger] you are correct. It doesn't look like we explicitly add PATH to the container environment unless it is specified in {{yarn.nodemanager.admin-env}} or, as you say, if it is specified in {{yarn.nodemanager.env-whitelist}}.

[~eyang] do you know where the PATH variable is coming from in this case?
[jira] [Commented] (YARN-9391) Disable PATH variable to be passed to Docker container
[ https://issues.apache.org/jira/browse/YARN-9391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16796216#comment-16796216 ]

Eric Badger commented on YARN-9391:
-----------------------------------

We can already do this today by changing the whitelist to not include PATH, right [~Jim_Brennan]? I agree with [~eyang] that the PATH variable (or anything really) outside of the container shouldn't really be relevant inside of the container. Ideally, the image should define PATH so that it will override what the NM has. But, in the case that it isn't set, I'm not sure falling back to the NM PATH is the correct thing to do. At best it's masking failures, and at worst it's leaking environment variable info about the host.

And just a note, if PATH is set in the image, it will be selected over what is set in the whitelist. The only way this isn't true is if the variable was explicitly set by the user. [~Jim_Brennan] can correct me if I'm wrong on this.
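The precedence Eric Badger describes (a user-defined value first, then the image's value, then the NM whitelist value) can be written down as a small sketch. This models only the behaviour stated in his comment, which other comments in this thread narrow to the non-entrypoint case; `PathPrecedenceSketch` and `resolve` are hypothetical names and do not correspond to any Hadoop API.

```java
import java.util.Optional;

// Hypothetical model of the precedence described in the comment; not Hadoop code.
class PathPrecedenceSketch {

  /**
   * Picks the value a container would see for one variable.
   * A null argument means "not set at that level".
   */
  static Optional<String> resolve(String userValue, String imageValue,
      String nmWhitelistValue) {
    if (userValue != null) {
      return Optional.of(userValue);   // explicitly set by the user: always wins
    }
    if (imageValue != null) {
      return Optional.of(imageValue);  // defined in the image: beats the whitelist
    }
    // Fallback to the NM value, the case the comment calls questionable.
    return Optional.ofNullable(nmWhitelistValue);
  }

  public static void main(String[] args) {
    // Invented sample paths for the demonstration.
    System.out.println(resolve(null, "/usr/bin:/bin", "/opt/hadoop/bin"));
    System.out.println(resolve(null, null, "/opt/hadoop/bin"));
  }
}
```

The second call shows the fallback that the comment questions: with no PATH in the image, the container silently inherits the host's value.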