[jira] [Comment Edited] (YARN-5534) Allow whitelisted volume mounts
[ https://issues.apache.org/jira/browse/YARN-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16255681#comment-16255681 ]

Shane Kumpf edited comment on YARN-5534 at 11/16/17 5:45 PM:

{quote}
We can check the origin of the docker image, if it comes from private registry, image name that starts with hostname of private registry, then we allow white list volumes.
{quote}

IMO the hosted Docker private repositories should be allowed. Checking that the image isn't from docker.io would be problematic for that case. The Docker Hub private repository solution gives users a private space to store images without needing to manage the private registry infrastructure themselves. Using the Docker Hub private repositories also gives the user vulnerability scanning "for free", so it's appealing to new users where pull bandwidth isn't of major concern. IMO, this is a pretty safe alternative to running your own private registry.

As [~ebadger] mentioned, there are other items that need to change to support these types of images beyond the whitelist: don't override the CMD/ENTRYPOINT; don't bind mount the container log dir, usercache, appcache; don't override the --user option; etc. I would prefer that we work through those details holistically on a separate JIRA and see if it's even necessary to support that use case.

> Allow whitelisted volume mounts
>
> Key: YARN-5534
> URL: https://issues.apache.org/jira/browse/YARN-5534
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Reporter: luhuichun
> Assignee: Shane Kumpf
> Attachments: YARN-5534.001.patch, YARN-5534.002.patch, YARN-5534.003.patch
>
> 1. Introduction
> Mounting files or directories from the host is one way of passing
> configuration and other information into a docker container.
> We could allow the user to set a list of mounts in the environment of
> ContainerLaunchContext (e.g. /dir1:/targetdir1,/dir2:/targetdir2).
> These would be mounted read-only to the specified target locations. This has
> been resolved in YARN-4595.
> 2. Problem Definition
> But mounting arbitrary volumes into a Docker container can be a security risk.
> 3. Possible solutions
> One approach to provide safe mounts is to allow the cluster administrator to
> configure a set of parent directories as white list mounting directories.
> Add a property named yarn.nodemanager.volume-mounts.white-list. When the
> container executor checks mounts, only the allowed directories or their
> sub-directories can be mounted.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
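A minimal Java sketch of the registry-origin check discussed in the comment above, under stated assumptions. The class and method names (ImageOriginCheck, registryOf, isFromTrustedRegistry) are hypothetical and not part of YARN. The host detection follows Docker's usual convention that the first component of an image name is treated as a registry host only if it contains a '.' or ':' or is "localhost"; anything else resolves to docker.io. It illustrates why a hostname-prefix check would reject Docker Hub private repositories: their image names carry no registry hostname.

```java
import java.util.Set;

/** Hypothetical helper for illustration only; not a YARN class. */
final class ImageOriginCheck {
    // Registry Docker falls back to when the image name has no host part.
    private static final String DEFAULT_REGISTRY = "docker.io";

    private ImageOriginCheck() {}

    /** Returns the registry host that an image reference resolves to. */
    static String registryOf(String image) {
        int slash = image.indexOf('/');
        if (slash < 0) {
            // No path separator at all, e.g. "centos:7".
            return DEFAULT_REGISTRY;
        }
        String first = image.substring(0, slash);
        // Only a first component that looks like a host names a registry.
        boolean looksLikeHost = first.contains(".")
            || first.contains(":")
            || first.equals("localhost");
        return looksLikeHost ? first : DEFAULT_REGISTRY;
    }

    /** True if the image's registry is in the (assumed) trusted set. */
    static boolean isFromTrustedRegistry(String image, Set<String> trusted) {
        return trusted.contains(registryOf(image));
    }
}
```

With a trusted set containing only a self-hosted registry, an image such as myorg/private-app is rejected even though it may live in a private Docker Hub repository, which is the case the comment argues should be allowed.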
[jira] [Comment Edited] (YARN-5534) Allow whitelisted volume mounts
[ https://issues.apache.org/jira/browse/YARN-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168728#comment-16168728 ]

Miklos Szegedi edited comment on YARN-5534 at 9/18/17 7:25 PM:

[~eyang] I would approach this from the user's point of view. container-executor and container-executor.cfg should govern the rules for how the yarn user can get root access, or any access it does not otherwise have. If the yarn user cannot access a directory, then mounting it should be whitelisted in container-executor.cfg.
[jira] [Comment Edited] (YARN-5534) Allow whitelisted volume mounts
[ https://issues.apache.org/jira/browse/YARN-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168584#comment-16168584 ]

Eric Yang edited comment on YARN-5534 at 9/15/17 9:58 PM:

yarn-site.xml and core-site.xml are trusted configuration from the Hadoop server point of view. Hadoop Kerberos and proxy user configuration are defined in the *.xml files. Without trusting those configurations, Hadoop security falls apart. There are keywords such as final to lock configuration in place, so an overlay of Hadoop configuration cannot alter the configuration values. A volume white list in core-site.xml or yarn-site.xml is secure.

There should be very little configuration in the container-executor.cfg file, governing uid and banned users. Keeping the rest of the logic in core-site.xml is preferred, so that it is preprocessed as the yarn user before being handed off to root for execution. YARN can act as a shielding user in pre-processing to make exploits more difficult.
[jira] [Comment Edited] (YARN-5534) Allow whitelisted volume mounts
[ https://issues.apache.org/jira/browse/YARN-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113114#comment-16113114 ]

Miklos Szegedi edited comment on YARN-5534 at 8/3/17 5:18 PM:

Thank you, [~shaneku...@gmail.com] and [~vinodkv] for the details. As Shane said, Java knows the configuration letting launch the container and seeing it fail in C. If the system is sending so many invalid privileged requests that it affects system performance because of this, there is already something wrong with that system.

However, one more thing. While having a general config to enable/disable privileged is nice, I think eventually admins will need to specify the users that should be allowed to elevate to privileged. This can probably be applied to the whitelist as well. Sorry for raising too many design questions late in the development.
[jira] [Comment Edited] (YARN-5534) Allow whitelisted volume mounts
[ https://issues.apache.org/jira/browse/YARN-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15835291#comment-15835291 ]

Daniel Templeton edited comment on YARN-5534 at 1/26/17 6:56 PM:

Thanks for updating the patch, [~luhuichun]. The new patch matches more with what I had in mind. There are still a couple of issues:

* In {{YarnConfiguration}}, you should add a Javadoc comment for the new property.
* In {{DLCR.isArbitraryMount()}}, instead of the _for_ loop, you should use a _foreach_.
* {{DLCR.isArbitraryMount()}} always returns {{false}}.
{code}
File child = new File(mount);
for (int i = 0; i < whiteList.length; i++){
  File parent = new File(mount);
  if (isSubDirectory(parent, child)){
    return false;
  }
}
{code}
It's always true that {{parent.equals(child)}}, because both are built from {{mount}} rather than from the white list entry, so {{isSubdirectory()}} will always return true.
* You should parse the white list property once and store it instead of doing it every time the method is called.
* Instead of using {{String.split()}}, you could use {{Pattern.split()}} to allow for whitespace, making it a little more user friendly.
* Look at http://stackoverflow.com/questions/26530445/compare-directories-to-check-if-one-is-sub-directory-of-another for ideas of how to do the subdirectory check more efficiently. I like the {{Set}} idea.
* In {{DLCR.isSubdirectory()}}, the naming around _child_, _parent_, and _parentFile_ is pretty confusing.
* In {{DLCR.isSubdirectory()}}, printing the stack trace is a bad idea. Do something more useful. At a minimum, log the exception instead of printing the stack trace to stderr.
* You should have a space before the curly brace throughout, e.g.
{code}
if (parent.equals(parentFile)){
{code}

> Allow whitelisted volume mounts
>
> Key: YARN-5534
> URL: https://issues.apache.org/jira/browse/YARN-5534
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Reporter: luhuichun
> Assignee: luhuichun
> Attachments: YARN-5534.001.patch, YARN-5534.002.patch
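A minimal Java sketch of how several of the review points above could fit together, under stated assumptions. The class name MountWhitelist and its constructor are hypothetical and are not taken from the patch. The white list is parsed once with a whitespace-tolerant Pattern, the loop is a foreach, and each mount is compared against the white list entry rather than against itself. The subdirectory test here uses java.nio.file.Path.startsWith on normalized paths, which is a swapped-in technique and not the isSubDirectory() helper or the Set approach mentioned in the review.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical illustration; not the code from the YARN-5534 patch. */
final class MountWhitelist {
    // A comma with optional surrounding whitespace separates entries.
    private static final Pattern SEPARATOR = Pattern.compile("\\s*,\\s*");

    // Parsed once at construction instead of on every check.
    private final List<Path> allowedParents;

    MountWhitelist(String whiteListProperty) {
        List<Path> parents = new ArrayList<>();
        for (String entry : SEPARATOR.split(whiteListProperty.trim())) {
            if (!entry.isEmpty()) {
                parents.add(Paths.get(entry).toAbsolutePath().normalize());
            }
        }
        this.allowedParents = Collections.unmodifiableList(parents);
    }

    /** Returns true if the mount is NOT under any white-listed parent. */
    boolean isArbitraryMount(String mount) {
        // normalize() collapses "." and ".." lexically; symlinks are not resolved.
        Path child = Paths.get(mount).toAbsolutePath().normalize();
        for (Path parent : allowedParents) {
            // Path.startsWith compares whole name elements, not string prefixes.
            if (child.startsWith(parent)) {
                return false;
            }
        }
        return true;
    }
}
```

Because the comparison is lexical, a symlink inside a white-listed directory could still point elsewhere; a real check would need to decide whether to resolve links before comparing.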
[jira] [Comment Edited] (YARN-5534) Allow whitelisted volume mounts
[ https://issues.apache.org/jira/browse/YARN-5534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15681421#comment-15681421 ]

Daniel Templeton edited comment on YARN-5534 at 11/20/16 4:39 PM:

Thanks for posting the patch, [~luhuichun]. Sorry for taking so long to get around to reviewing it. I apparently also misread the issue description the first time. Given that the current volume mounts only allow mounting directories from the set of localized files, I'm not sure additional white listing is all that useful. And given that YARN-5298 already mounts all the localized directories, I'm not sure this JIRA will actually change anything.

What I originally thought I read, and what I think *would* be useful, is allowing _arbitrary_ volume mounts from a whitelist, not just mounting localized resources. For example, if I'm going to use a Docker image to execute MR jobs, I have to install Hadoop in that image. When I upgrade my cluster, I then have to upgrade or recreate all my Docker images. If the Hadoop directories were mountable, I could let YARN mount them from the host machine and not have to worry about it.

> Allow whitelisted volume mounts
>
> Key: YARN-5534
> URL: https://issues.apache.org/jira/browse/YARN-5534
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Reporter: luhuichun
> Assignee: luhuichun
> Attachments: YARN-5534.001.patch
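A minimal Java sketch of how the mount list format from the issue description (/dir1:/targetdir1,/dir2:/targetdir2) might be turned into read-only docker volume arguments, for instance to mount host Hadoop directories as in the example above. The class and method names (MountSpecParser, toDockerArgs) are hypothetical, and this is not how YARN builds its docker command; only the -v source:target:ro form of the docker run option is taken as given. White list validation is deliberately left out here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not a YARN class. */
final class MountSpecParser {
    private MountSpecParser() {}

    /**
     * Converts "src1:dst1,src2:dst2" into docker arguments
     * ["-v", "src1:dst1:ro", "-v", "src2:dst2:ro"].
     */
    static List<String> toDockerArgs(String mountSpec) {
        List<String> args = new ArrayList<>();
        for (String pair : mountSpec.split(",")) {
            String[] parts = pair.trim().split(":");
            // Exactly one source and one target, both non-empty.
            if (parts.length != 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
                throw new IllegalArgumentException("Malformed mount: " + pair);
            }
            args.add("-v");
            // The ":ro" suffix makes the mount read-only inside the container.
            args.add(parts[0] + ":" + parts[1] + ":ro");
        }
        return args;
    }
}
```

Rejecting entries with a third component means a caller cannot smuggle in its own mode suffix such as :rw through the mount list.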