[jira] [Commented] (YARN-3289) Docker images should be downloaded during localization
[ https://issues.apache.org/jira/browse/YARN-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425420#comment-16425420 ]

Shane Kumpf commented on YARN-3289:
-----------------------------------

DockerContainerExecutor has been deprecated in branch-2 and removed in trunk. Closing this as a duplicate of YARN-3854, which was opened for the follow-on Docker runtime effort.

> Docker images should be downloaded during localization
> ------------------------------------------------------
>
>                 Key: YARN-3289
>                 URL: https://issues.apache.org/jira/browse/YARN-3289
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Ravi Prakash
>            Priority: Major
>
> We currently call docker run on images while launching containers. If the
> image size is sufficiently big, the task will time out. We should download the
> image we want to run during localization (if possible) to prevent this.
[jira] [Commented] (YARN-3289) Docker images should be downloaded during localization
[ https://issues.apache.org/jira/browse/YARN-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14445885#comment-14445885 ]

Chen He commented on YARN-3289:
-------------------------------

It looks like the registry bottleneck can be resolved by chaining registries [https://docs.docker.com/reference/api/hub_registry_spec/#chaining-registries].
[jira] [Commented] (YARN-3289) Docker images should be downloaded during localization
[ https://issues.apache.org/jira/browse/YARN-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347619#comment-14347619 ]

Chen He commented on YARN-3289:
-------------------------------

Or maybe we could add a module on the NM that automatically pulls deltas from the registry. Users could configure the frequency and schedule.
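The periodic pre-pull idea in the comment above could be sketched roughly as follows. This is only an illustration, not NodeManager code: `prefetch_images`, `run_prefetch_loop`, the `rounds` and `interval_seconds` parameters and the image names are all hypothetical. The only real interface assumed is the `docker pull <image>` CLI command.

```python
import subprocess
import time


def prefetch_images(images, run=subprocess.run):
    """Pull each image once and return the images whose pull failed.

    `run` is injectable so the sketch can be exercised without Docker.
    """
    failed = []
    for image in images:
        # `docker pull` fetches only the layers that are not cached locally.
        result = run(["docker", "pull", image])
        if result.returncode != 0:
            failed.append(image)
    return failed


def run_prefetch_loop(images, interval_seconds, rounds,
                      run=subprocess.run, sleep=time.sleep):
    """Repeat the pull `rounds` times, waiting `interval_seconds` in between."""
    for i in range(rounds):
        prefetch_images(images, run=run)
        if i < rounds - 1:
            sleep(interval_seconds)
```

A real module would presumably read the image list and schedule from NM configuration. Here they are plain arguments to keep the sketch self-contained.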
[jira] [Commented] (YARN-3289) Docker images should be downloaded during localization
[ https://issues.apache.org/jira/browse/YARN-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347583#comment-14347583 ]

Chen He commented on YARN-3289:
-------------------------------

Thank you for the quick feedback, [~jlowe].

{quote}
What we're missing here is progress reporting during localization so AMs can properly monitor progress of container launch requests before their code starts running, and that's useful for non-docker localization scenarios as well.
{quote}

I agree. That will be great. The idea that I proposed is based on the condition that we do not change the localization part.

{quote}
One node may take tens of minutes to localize a docker image, but another node might only take a few seconds. Docker images are often derived from other images, and docker only downloads the deltas. So it will be difficult for YARN that is not aware of the docker contents of a node or image deltas to predict how long any node will take to localize a given docker image.
{quote}

That is true. Docker image localization is a little bit different from the localization process for other applications (from HDFS to the local FS). The NMs all pull from the docker registry. The network bandwidth from the docker registry to each NM could be a bottleneck no matter whether the docker image delta is large or small (we may need higher bandwidth, let's say 30G InfiniBand, but for a larger Hadoop cluster with more than 10 thousand tasks running, it may still be a problem). This is another reason that we need to consider docker image locality.
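A back-of-envelope calculation makes the registry bottleneck concrete. Everything numeric here is an assumption made for illustration: "30G" is read as a 30 Gbit/s link shared by all pulls, each of 10,000 concurrent pulls is taken to transfer 2 GB (the image size mentioned elsewhere in this thread), 1 GB is counted as 8 Gbit, and protocol overhead is ignored. `registry_drain_seconds` is a hypothetical helper.

```python
def registry_drain_seconds(pulls, gigabytes_per_pull, link_gbit_per_s):
    """Lower bound on the time to serve all pulls through one shared link."""
    total_gbit = pulls * gigabytes_per_pull * 8  # 1 GB = 8 Gbit
    return total_gbit / link_gbit_per_s


seconds = registry_drain_seconds(pulls=10_000, gigabytes_per_pull=2,
                                 link_gbit_per_s=30)
print(f"{seconds / 60:.0f} minutes")  # prints "89 minutes"
```

Under these assumptions the registry link alone needs roughly an hour and a half to serve every pull, which is the kind of delay image locality would avoid.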
[jira] [Commented] (YARN-3289) Docker images should be downloaded during localization
[ https://issues.apache.org/jira/browse/YARN-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347524#comment-14347524 ]

Jason Lowe commented on YARN-3289:
----------------------------------

Regarding a separate prepping task, localization already is a separate preparation task for non-public resources. See ContainerLocalizer. I don't think docker image download and localization as it is done today are fundamentally different at a high level -- in both cases we are prepping the node to be able to run the container. No need to complicate the process with a specialized extra step just for docker. What we're missing here is progress reporting during localization so AMs can properly monitor progress of container launch requests before their code starts running, and that's useful for non-docker localization scenarios as well.

Adjusting locality based on the cost of localization is an interesting idea, and applies to the non-docker case as well. However, the docker case can be a bit tricky. One node may take tens of minutes to localize a docker image, but another node might only take a few seconds. Docker images are often derived from other images, and docker only downloads the deltas. So it will be difficult for YARN, which is not aware of the docker contents of a node or of image deltas, to predict how long any node will take to localize a given docker image.
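The point about deltas can be illustrated with a small sketch. It assumes, purely for illustration, that an image is a set of layers keyed by digest and that a node only downloads the layers it does not already hold. `bytes_to_download`, the layer names and the sizes are hypothetical and do not come from YARN or Docker.

```python
def bytes_to_download(image_layers, cached_digests):
    """Sum the sizes of the image's layers that the node does not already hold.

    image_layers: dict mapping layer digest -> size in bytes.
    cached_digests: set of layer digests already present on the node.
    """
    return sum(size for digest, size in image_layers.items()
               if digest not in cached_digests)


# Hypothetical image: a large base layer plus two smaller derived layers.
image = {"base": 2_000_000_000, "deps": 150_000_000, "app": 5_000_000}

cold_node = set()             # nothing cached: every layer is downloaded
warm_node = {"base", "deps"}  # already holds the layers the image derives from

print(bytes_to_download(image, cold_node))  # 2155000000
print(bytes_to_download(image, warm_node))  # 5000000
```

The same image costs the two nodes very different amounts, and the difference depends entirely on per-node layer state, which is exactly the information YARN does not have.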
[jira] [Commented] (YARN-3289) Docker images should be downloaded during localization
[ https://issues.apache.org/jira/browse/YARN-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347491#comment-14347491 ]

Chen He commented on YARN-3289:
-------------------------------

Thanks, [~jlowe], for the comments. IMHO, if we are using DCE to run applications, we can move the docker image localization into a preparation task. For example, if we have 10 tasks in a job, we create 1 extra "task" for each real task. That is, we start an extra dummy task that can heartbeat and do the image download. Once it is done, the real task can start to run.

The benefit is that we can control the placement of those dummy tasks and achieve "data locality" for docker image localization. For example, suppose node1 has already downloaded the docker image and the AM starts to run on it. If possible, the RM scheduler should put the other dummy and real tasks on this node, since node1 already has the image. Compared with job input data (a block, maybe?), docker image "locality" may be more important (an image that takes more than 10 minutes to download will be more than 2GB).
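The placement preference described above might look something like this minimal sketch. It is not the RM scheduler's API: `pick_node`, the node names and the `images_on_node` map are hypothetical, and the sketch assumes the scheduler somehow knows which images each node holds.

```python
def pick_node(candidates, image, images_on_node):
    """Prefer a candidate node that already holds `image`.

    Falls back to the first candidate when no node has the image.
    """
    for node in candidates:
        if image in images_on_node.get(node, set()):
            return node
    return candidates[0]


images_on_node = {"node1": {"myapp:1.0"}, "node2": set()}

print(pick_node(["node2", "node1"], "myapp:1.0", images_on_node))  # node1
print(pick_node(["node2", "node1"], "other:2.0", images_on_node))  # node2
```

A real scheduler would weigh this preference against capacity and input-data locality. The sketch only shows the image-locality check itself.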
[jira] [Commented] (YARN-3289) Docker images should be downloaded during localization
[ https://issues.apache.org/jira/browse/YARN-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345578#comment-14345578 ]

Jason Lowe commented on YARN-3289:
----------------------------------

There is no application-level (e.g. MapReduce) task heartbeat during localization because the application code isn't running yet. Downloading a large docker image during localization will still time out, since the task can't heartbeat back to the AM to say it's making progress.
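The timeout behaviour described above can be modelled in a few lines. This is a simplified model rather than AM code: `attempt_survives` and the `reports_progress` flag are hypothetical, and the 10-minute timeout in the example is an assumed value.

```python
def attempt_survives(localization_seconds, timeout_seconds,
                     reports_progress=False):
    """Return True if the AM would not time the attempt out before its code starts.

    Without progress reports the AM hears nothing for the whole of
    localization, so the silent gap equals the localization time.
    """
    silent_gap = 0 if reports_progress else localization_seconds
    return silent_gap <= timeout_seconds


# A 15-minute image download against an assumed 10-minute task timeout.
print(attempt_survives(15 * 60, 10 * 60))                         # False
print(attempt_survives(15 * 60, 10 * 60, reports_progress=True))  # True
```

Moving the download from launch to localization does not change the first result, because the silent gap is the same either way.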