[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16425434#comment-16425434 ]

Shane Kumpf commented on YARN-2466:
-----------------------------------

All of the sub-tasks and linked issues are closed as resolved or won't fix. DockerContainerExecutor has been deprecated in branch-2 and removed in trunk. I'm going to resolve this umbrella.

> Umbrella issue for Yarn launched Docker Containers
> --------------------------------------------------
>
>                 Key: YARN-2466
>                 URL: https://issues.apache.org/jira/browse/YARN-2466
>             Project: Hadoop YARN
>          Issue Type: New Feature
>    Affects Versions: 2.4.1
>            Reporter: Abin Shahab
>            Assignee: Abin Shahab
>            Priority: Major
>
> Docker (https://www.docker.io/) is, increasingly, a very popular container
> technology.
> In the context of YARN, support for Docker will provide a very elegant
> solution to allow applications to package their software into a Docker
> container (entire Linux file system incl. custom versions of perl, python
> etc.) and use it as a blueprint to launch all their YARN containers with
> the requisite software environment. This provides both consistency (all YARN
> containers will have the same software environment) and isolation (no
> interference with whatever is installed on the physical machine).
> In addition to the software isolation mentioned above, Docker containers will
> provide resource, network, and user-namespace isolation.
> Docker provides resource isolation through cgroups, similar to
> LinuxContainerExecutor. This prevents one job from taking other jobs'
> resources (memory and CPU) on the same Hadoop cluster.
> User-namespace isolation will ensure that root in the container is mapped
> to an unprivileged user on the host. This is currently being added to Docker.
> Network isolation will ensure that one user's network traffic is completely
> isolated from another user's network traffic.
> Last but not least, the interaction of Docker and Kerberos will have to
> be worked out. These Docker containers must work in a secure Hadoop
> environment.
> Additional details are here:
> https://wiki.apache.org/hadoop/dineshs/IsolatingYarnAppsInDockerContainers

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15924097#comment-15924097 ]

Feng Yuan commented on YARN-2466:
---------------------------------

Can I assume this feature is usable in the 3.0.x versions of YARN, or is something still blocking it? I hope for a reply!
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15681190#comment-15681190 ]

Shane Kumpf commented on YARN-2466:
-----------------------------------

Should this and the sub-tasks be closed? Or are there plans to continue to pursue this for earlier Hadoop versions that included the initial implementation?
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287931#comment-14287931 ]

Chen He commented on YARN-2466:
-------------------------------

This is a good point. As for isolation, I don't think we need to inform the RM about this, since it is just one type of ContainerExecutor; the RM should treat it as a general container like the others (default, lxc, etc.). But as [~eronwright] mentioned, we should find a way to keep containers from being killed because of a timeout.
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287941#comment-14287941 ]

Chen He commented on YARN-2466:
-------------------------------

Hi [~ashahab], if you don't mind, I will create a sub-task JIRA to track this issue.
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288608#comment-14288608 ]

Leitao Guo commented on YARN-2466:
----------------------------------

Currently, if I want to use DCE in my cluster, all applications have to run in DCE, which is not practical in our cluster. Can yarn.nodemanager.container-executor.class be made configurable per application, so that some applications can use DCE while others still use LCE?
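To make the request above concrete, here is a minimal sketch of what per-application selection could look like. Everything in it is an assumption for illustration: the per-application key `app.container-executor.class`, the function name, and the allow-list are hypothetical and are not YARN APIs. The only setting taken from the thread is yarn.nodemanager.container-executor.class, which applies to the whole NodeManager.

```python
# Hypothetical sketch: none of these names are YARN APIs.
CLUSTER_DEFAULT = "LinuxContainerExecutor"  # stands in for the NodeManager-wide setting
ALLOWED_EXECUTORS = {"LinuxContainerExecutor", "DockerContainerExecutor"}
APP_KEY = "app.container-executor.class"  # hypothetical per-application override key


def select_executor(app_conf, cluster_default=CLUSTER_DEFAULT):
    """Return the executor name for one application.

    Falls back to the cluster-wide default when the application does not
    ask for a specific executor, and rejects names the admin has not allowed.
    """
    requested = app_conf.get(APP_KEY, cluster_default)
    if requested not in ALLOWED_EXECUTORS:
        raise ValueError(f"executor not permitted: {requested}")
    return requested


# One application opts in to DCE, another keeps the default LCE.
docker_app = {APP_KEY: "DockerContainerExecutor"}
plain_app = {}
chosen = [select_executor(docker_app), select_executor(plain_app)]
```

The allow-list is there so that the admin, not the application, decides which executors may be requested at all.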
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288615#comment-14288615 ]

Beckham007 commented on YARN-2466:
----------------------------------

We use YARN-2718 for this.
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286380#comment-14286380 ]

Abin Shahab commented on YARN-2466:
-----------------------------------

Yes, [~airbots], please elaborate on this 'interactive' switch, and on why it cannot be accomplished using docker exec, as [~eronwright] suggests.
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286398#comment-14286398 ]

Chen He commented on YARN-2466:
-------------------------------

Sorry about the confusion. If a container is running, we don't want everyone to be able to access the container (via either A or B) because of security concerns, right? Then what if we add a parameter to YARN (a switch that admins control) that lets a user see the whole run of the Docker container, including but not limited to A) and B)?
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286431#comment-14286431 ]

Chen He commented on YARN-2466:
-------------------------------

{quote}(A) would be available only to the root account on the host, which I think is sufficient.{quote}
It may work if it is just a POC. What if we put it into production? Not all users are root or in the root group, right? Do we need some work here? Maybe not at the Docker level. Docker changes pretty fast, and we need to make sure that changes in Docker do not force corresponding changes on the Hadoop side.
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286405#comment-14286405 ]

Chen He commented on YARN-2466:
-------------------------------

That way, we satisfy both the security concern and the debugging need. That is what I meant.
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286402#comment-14286402 ]

Chen He commented on YARN-2466:
-------------------------------

Another idea for DCE. If we want to use DCE in production, another problem is Docker image updates. If a job needs an updated version of a Docker image to run its containers, we have to update the image before we start the job. It may take time to pull from the registry and apply the changes. It would be good if DCE could check the registry and download the image the job needs (specified through conf or the CLI) before starting the container.
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286422#comment-14286422 ]

Eron Wright commented on YARN-2466:
-----------------------------------

That is an interesting line of thinking. We should be aware of the impact on task metrics (i.e. whether the image pull time is included in reported task time), and whether YARN may mistakenly conclude that the task is too slow and kill it.
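The concern about pull time can be illustrated with a small sketch that times the image pull and the task as separate phases, so that a slowness check only looks at the task phase. This is an illustration under assumptions, not how DCE or YARN works: the function names, the returned dictionary, and the timeout check are hypothetical. The only real external command used is `docker pull IMAGE`.

```python
import subprocess
import time


def pull_then_run(image, run_task, pull=None, clock=time.monotonic):
    """Pull `image`, then run the task, timing the two phases separately."""
    if pull is None:
        # Default to the real `docker pull` CLI; callers can inject a fake.
        def pull(img):
            subprocess.run(["docker", "pull", img], check=True)
    pull_start = clock()
    pull(image)
    task_start = clock()
    result = run_task()
    task_end = clock()
    return {
        "result": result,
        "pull_seconds": task_start - pull_start,
        "task_seconds": task_end - task_start,
    }


def exceeds_timeout(timing, task_timeout_seconds):
    """Judge slowness on task time only, so a slow pull does not count."""
    return timing["task_seconds"] > task_timeout_seconds


# Demonstration with a fake clock and a no-op pull: 30 s pull, 5 s task.
ticks = iter([0.0, 30.0, 35.0])
timing = pull_then_run(
    "example/image:latest",          # hypothetical image name
    run_task=lambda: "done",
    pull=lambda img: None,
    clock=lambda: next(ticks),
)
```

With a 10-second limit, the 35 seconds of wall time would look too slow if the pull were counted, while the task phase alone stays within the limit.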
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14286337#comment-14286337 ]

Abin Shahab commented on YARN-2466:
-----------------------------------

[~eronwright] We are working to make it easy to use this ContainerExecutor.
[~chenchun] Thanks for the patch. We have tried that approach, but changing this many interfaces is a bit intrusive and would not go in before Hadoop/Yarn-3.0.0 (I think). Therefore we are thinking of an interim approach that lets users run DockerContainerExecutor interchangeably.
Abin
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286299#comment-14286299 ]

Chen He commented on YARN-2466:
---
Hi [~eronwright], thank you for the idea. The parameter is a security switch for debugging mode or other priority access. The name can be changed. But the most important thing is that we do not want all users to have the ability to access a running container, right?
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286344#comment-14286344 ]

Eron Wright commented on YARN-2466:
---
Please elaborate on the effects that the hypothetical 'interactive' switch would have. I see two scenarios that you might be alluding to:
A) attach to an interactive bash shell within the container, from the host.
B) observe the console output from the container.
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286415#comment-14286415 ]

Eron Wright commented on YARN-2466:
---
I am still unsure what the proposed effects would be at the Docker level. If we do nothing, (A) would be available only to the root account on the host, which I think is sufficient. (B) should be handled the same way task output is handled today (which, I assume, means that only the job owner may view the output).
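The access rules described in the comment above can be written down as two small checks. This is only a sketch of the stated policy: the function names are hypothetical, and the rule that only the job owner may view output is the commenter's own assumption, not confirmed YARN behavior.

```python
def can_attach_shell(user: str) -> bool:
    """Scenario (A): only the host's root account may attach to a container."""
    return user == "root"


def can_view_output(user: str, job_owner: str) -> bool:
    """Scenario (B): only the job owner may view console output (assumed rule)."""
    return user == job_owner


if __name__ == "__main__":
    # "alice" and "bob" are placeholder user names.
    print(can_attach_shell("alice"))          # attach is denied to regular users
    print(can_view_output("alice", "alice"))  # the owner may read the output
    print(can_view_output("bob", "alice"))    # other users may not
```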
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286439#comment-14286439 ]

Chen He commented on YARN-2466:
---
Or maybe make only the minimum changes on the Hadoop side, if necessary.
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285434#comment-14285434 ]

Chun Chen commented on YARN-2466:
---
I think this feature is currently only an alpha version, and the author [~ashahab] does not seem to be active on it right now. I uploaded a patch that creates a CompositeContainerExecutor to allow running different types of containers with different container executors at the same time. See https://issues.apache.org/jira/browse/YARN-2718.
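The idea of a composite executor that routes each container to one of several executors can be illustrated with a small dispatcher. This is a sketch of the general pattern only, not the YARN-2718 patch: every class name, the `executor_type` request field, and the string results are hypothetical.

```python
class DefaultExecutor:
    """Stand-in for a standard (non-Docker) container executor."""

    def launch(self, request):
        return f"default:{request['command']}"


class DockerExecutor:
    """Stand-in for a Docker-based container executor."""

    def launch(self, request):
        return f"docker[{request['image']}]:{request['command']}"


class CompositeExecutor:
    """Delegates each launch request to the executor named in the request."""

    def __init__(self, executors, default_type):
        if default_type not in executors:
            raise ValueError(f"unknown default executor type: {default_type}")
        self._executors = dict(executors)
        self._default_type = default_type

    def launch(self, request):
        # Requests that do not name an executor fall back to the default one.
        kind = request.get("executor_type", self._default_type)
        if kind not in self._executors:
            raise ValueError(f"no executor registered for type: {kind}")
        return self._executors[kind].launch(request)


if __name__ == "__main__":
    composite = CompositeExecutor(
        {"default": DefaultExecutor(), "docker": DockerExecutor()},
        default_type="default",
    )
    print(composite.launch({"command": "sleep 1"}))
    print(composite.launch(
        {"executor_type": "docker", "image": "centos", "command": "sleep 1"}
    ))
```

Keeping a default executor means applications written for the standard executor need no changes, which is the compatibility concern raised elsewhere in this thread.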
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285359#comment-14285359 ]

Eron Wright commented on YARN-2466:
---
I've been wondering about the portability of applications that take a dependency on this feature. Obviously the cluster must be configured to use the DockerContainerExecutor. Let's make it easy for the cluster administrator to agree to that. We should strive to ensure that enabling the new executor has few negative tradeoffs or compatibility issues with apps designed for the standard executor.
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285340#comment-14285340 ]

Eron Wright commented on YARN-2466:
---
I propose that the new 'docker exec' command be the basis for debugging, not a switch as proposed above. The 'exec' command allows you to spawn a bash shell (or any command) in an existing container, with the -it flags as desired.
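The 'docker exec' proposal in the comment above amounts to running a command such as `docker exec -i -t <container> bash` on the host. A minimal sketch that builds that command line follows; the helper name and the container name are hypothetical, and only the `docker exec` subcommand with its `-i` and `-t` flags comes from the Docker CLI.

```python
import shlex


def docker_exec_command(container, command=("bash",), interactive=True, tty=True):
    """Build the argv for running a command inside an existing container."""
    argv = ["docker", "exec"]
    if interactive:
        argv.append("-i")  # keep STDIN open
    if tty:
        argv.append("-t")  # allocate a pseudo-TTY
    argv.append(container)
    argv.extend(command)
    return argv


if __name__ == "__main__":
    # "container_0001" is a placeholder, not a real container name.
    print(shlex.join(docker_exec_command("container_0001")))
```

Because the command targets a container that is already running, nothing about how the container was launched has to change, which is the argument made against a launch-time switch.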
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284637#comment-14284637 ]

Max commented on YARN-2466:
---
It would be helpful, but it should be switchable. It should be possible to activate it for debugging. Turning it off will increase security for production systems.
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284598#comment-14284598 ]

Chen He commented on YARN-2466:
---
Hi [~ashahab], do we need to add a configuration parameter that enables the container to run in interactive mode, such as yarn.docker.interactive? Users could then attach to the running container, just for debugging purposes.
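The configuration switch suggested in the comment above could be read as an off-by-default boolean. The sketch below assumes the property would live in a Hadoop-style `<configuration>` XML file; `yarn.docker.interactive` is only the name proposed in this thread, not an existing YARN property, and the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Property name proposed in this thread; not an existing YARN setting.
INTERACTIVE_KEY = "yarn.docker.interactive"


def interactive_enabled(site_xml: str) -> bool:
    """Return True only if the proposed switch is explicitly set to 'true'."""
    root = ET.fromstring(site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name", "").strip() == INTERACTIVE_KEY:
            return prop.findtext("value", "").strip().lower() == "true"
    return False  # absent means disabled, the safer choice for production


if __name__ == "__main__":
    sample = """
    <configuration>
      <property>
        <name>yarn.docker.interactive</name>
        <value>true</value>
      </property>
    </configuration>
    """
    print(interactive_enabled(sample))
```

Defaulting to disabled matches the concern voiced in the thread that the switch should be off on production systems.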
[jira] [Commented] (YARN-2466) Umbrella issue for Yarn launched Docker Containers
[ https://issues.apache.org/jira/browse/YARN-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284898#comment-14284898 ]

Chen He commented on YARN-2466:
---
Yes, that is what I mean. Thank you for clarifying it, [~mikhmv].