[jira] [Commented] (YARN-3289) Docker images should be downloaded during localization

2018-04-04 Thread Shane Kumpf (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425420#comment-16425420
 ] 

Shane Kumpf commented on YARN-3289:
---

DockerContainerExecutor has been deprecated in branch-2 and removed in trunk. 
Closing this as a duplicate of YARN-3854, which was opened for the follow-on 
Docker runtime effort.

> Docker images should be downloaded during localization
> --
>
> Key: YARN-3289
> URL: https://issues.apache.org/jira/browse/YARN-3289
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Ravi Prakash
>Priority: Major
>
> We currently call docker run on images while launching containers. If the 
> image size is sufficiently big, the task will time out. We should download the 
> image we want to run during localization (if possible) to prevent this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-3289) Docker images should be downloaded during localization

2015-04-05 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14445885#comment-14445885
 ] 

Chen He commented on YARN-3289:
---

Looks like the registry bottleneck can be resolved by chaining registries: 
[https://docs.docker.com/reference/api/hub_registry_spec/#chaining-registries]



[jira] [Commented] (YARN-3289) Docker images should be downloaded during localization

2015-03-04 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347619#comment-14347619
 ] 

Chen He commented on YARN-3289:
---

Or maybe we could add a module on the NM that automatically pulls deltas from 
the registry. Users could configure the frequency and schedule.
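
A minimal sketch of what such a periodic pre-pull module might look like. This is not existing NM code: the class name `PrePullSchedule`, its methods, and the `docker_pull` helper are all hypothetical, and only the `docker pull` CLI command itself is real. The pull action is passed in as a callable so the scheduling logic can be exercised without a Docker daemon.

```python
import subprocess
from dataclasses import dataclass, field


@dataclass
class PrePullSchedule:
    """Tracks when each image was last pulled on this node (hypothetical helper)."""
    interval_seconds: float                      # user-configured pull frequency
    last_pulled: dict = field(default_factory=dict)

    def due(self, images, now):
        """Return the images whose last pull is at least one interval old."""
        return [
            image for image in images
            if now - self.last_pulled.get(image, float("-inf")) >= self.interval_seconds
        ]

    def run_once(self, images, now, pull):
        """Pull every due image with the supplied callable and record the time."""
        pulled = []
        for image in self.due(images, now):
            pull(image)
            self.last_pulled[image] = now
            pulled.append(image)
        return pulled


def docker_pull(image):
    """Shell out to the Docker CLI; layers already present locally are not re-downloaded."""
    subprocess.run(["docker", "pull", image], check=True)
```

A caller would invoke `run_once(images, time.time(), docker_pull)` from whatever timer the node already has; images that were never pulled are treated as due immediately.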



[jira] [Commented] (YARN-3289) Docker images should be downloaded during localization

2015-03-04 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347583#comment-14347583
 ] 

Chen He commented on YARN-3289:
---

Thank you for the quick feedback, [~jlowe].

{quote} What we're missing here is progress reporting during localization so 
AMs can properly monitor progress of container launch requests before their 
code starts running, and that's useful for non-docker localization scenarios as 
well.{quote}

I agree. That will be great. The idea that I proposed is based on the condition 
that we do not change the localization part.

{quote} One node may take tens of minutes to localize a docker image, but 
another node might only take a few seconds. Docker images are often derived 
from other images, and docker only downloads the deltas. So it will be 
difficult for YARN that is not aware of the docker contents of a node or image 
deltas to predict how long any node will take to localize a given docker image.{quote}

That is true. Docker image localization is a little different from the other 
application localization process (from HDFS to the local FS): all nodes pull 
from the docker registry. The network bandwidth from the docker registry to 
each NM could be a bottleneck regardless of whether the docker image deltas 
are large or small (we may need higher bandwidth, say 30G InfiniBand, but for 
a larger Hadoop cluster with more than 10 thousand tasks running it may still 
be a problem). This is another reason that we need to consider docker image 
locality. 




[jira] [Commented] (YARN-3289) Docker images should be downloaded during localization

2015-03-04 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347524#comment-14347524
 ] 

Jason Lowe commented on YARN-3289:
--

Regarding a separate prepping task, localization already is a separate 
preparation task for non-public resources.  See ContainerLocalizer.  I don't 
think docker image download and localization as is done today is fundamentally 
different at a high level -- in both cases we are prepping the node to be able 
to run the container.  No need to complicate the process with a specialized 
extra step just for docker.  What we're missing here is progress reporting 
during localization so AMs can properly monitor progress of container launch 
requests before their code starts running, and that's useful for non-docker 
localization scenarios as well.

Adjusting locality based on the cost of localization is an interesting idea, 
and applies to the non-docker case as well.  However the docker case can be a 
bit tricky.  One node may take tens of minutes to localize a docker image, but 
another node might only take a few seconds.  Docker images are often derived 
from other images, and docker only downloads the deltas.  So it will be 
difficult for YARN that is not aware of the docker contents of a node or image 
deltas to predict how long any node will take to localize a given docker image.
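
To illustrate why the same image can cost very different amounts on two nodes, here is a small sketch. It assumes the scheduler could somehow learn which layers each node already holds, which YARN does not do today. The function name, layer IDs, and sizes are all made up for illustration.

```python
def bytes_to_download(image_layers, node_layers):
    """Sum the sizes of the image's layers that the node does not already hold.

    image_layers: dict mapping layer ID -> size in bytes (illustrative values).
    node_layers:  set of layer IDs already present on the node.
    """
    return sum(size for layer, size in image_layers.items() if layer not in node_layers)


# Illustrative numbers only: a 2 GB base layer plus two small layers on top.
image = {"base": 2_000_000_000, "libs": 50_000_000, "app": 5_000_000}

cold_node = set()                 # has nothing cached
warm_node = {"base", "libs"}      # already ran a sibling image

cold_cost = bytes_to_download(image, cold_node)   # every layer must be fetched
warm_cost = bytes_to_download(image, warm_node)   # only the "app" layer is missing
```

Without the per-node layer sets, the two costs above cannot be told apart in advance, which is the difficulty described here.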



[jira] [Commented] (YARN-3289) Docker images should be downloaded during localization

2015-03-04 Thread Chen He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347491#comment-14347491
 ] 

Chen He commented on YARN-3289:
---

Thanks, [~jlowe], for the comments. IMHO, we can move the docker image 
localization into a preparation task. 
Suppose we are using DCE to run applications and have 10 tasks in a job: we 
create 1 extra "task" for each real task. 
I mean, start an extra dummy task that can heartbeat and do the image 
downloading work. Once it is done, the real task can start to run. 

The benefit is that we can control the placement of those dummy tasks and 
achieve "data locality" for docker image localization. 
For example:
   node1 has already downloaded the docker image and the AM starts to run on 
it. If possible, the RM scheduler should put the other dummy and real tasks on 
this node, since node1 already has the image. Compared with job input data 
(a block, maybe?), docker image "locality" (an image that takes more than 10 
min to download will be more than 2GB) may be more important. 
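
A rough sketch of the placement preference described above, assuming the scheduler knows which images each node has cached and how many free slots it has. Both inputs, and the function name `place_tasks`, are hypothetical; this is not how the RM scheduler is implemented.

```python
def place_tasks(num_tasks, image, node_images, slots_per_node):
    """Assign tasks to nodes, filling nodes that already hold the image first.

    node_images:    dict mapping node name -> set of images cached there.
    slots_per_node: dict mapping node name -> free container slots.
    Returns a list of node names, one per placed task.
    """
    # Nodes with the image sort first (False < True); ties break by name.
    ordered = sorted(node_images, key=lambda n: (image not in node_images[n], n))
    placement = []
    for node in ordered:
        free = slots_per_node.get(node, 0)
        take = min(free, num_tasks - len(placement))
        placement.extend([node] * take)
        if len(placement) == num_tasks:
            break
    return placement
```

Tasks spill over to nodes without the image only once the nodes that have it are full, so the download cost is paid on as few nodes as possible.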



[jira] [Commented] (YARN-3289) Docker images should be downloaded during localization

2015-03-03 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345578#comment-14345578
 ] 

Jason Lowe commented on YARN-3289:
--

There is no application-level (e.g.: MapReduce) task heartbeat during 
localization because the application code isn't running yet.  Downloading a 
large docker image during localization will still time out, since the task can't 
heartbeat back to the AM to say it's making progress.
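
The timeout behaviour can be sketched as follows. This is an illustration only, not the AM's actual liveness code: the class `LivenessMonitor` is hypothetical, the 600 s timeout and 900 s download are made-up values, and the localization-progress report in the second case is an assumed mechanism that does not exist today.

```python
class LivenessMonitor:
    """Expires a task when no progress signal arrives within the timeout."""

    def __init__(self, timeout_seconds, started_at):
        self.timeout_seconds = timeout_seconds
        self.last_signal = started_at

    def report(self, now):
        """Record a progress signal (a task heartbeat, or a hypothetical
        localization-progress report from the node)."""
        self.last_signal = now

    def expired(self, now):
        return now - self.last_signal > self.timeout_seconds


# Illustrative values: 600 s timeout, 900 s image download.
silent = LivenessMonitor(timeout_seconds=600, started_at=0)
timed_out = silent.expired(now=900)           # nothing was reported for 900 s

reporting = LivenessMonitor(timeout_seconds=600, started_at=0)
for t in range(60, 901, 60):                  # a report every 60 s while downloading
    reporting.report(now=t)
still_alive = not reporting.expired(now=900)
```

The first monitor expires regardless of which phase performs the download, because nothing signals progress until the application code runs.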
