> On Sept. 1, 2014, 9:57 p.m., Tom Arnfeld wrote:
> > Couple of points, please correct me if I've misunderstood anything :)
> >
> > Can you not just do a `docker run .. {image} ..` and let docker take care
> > of pulling the image if needed? By default, docker will pull the image if
> > one with the same registry/repo/tag combo doesn't exist.
> >
> > The assumption here is that an image (comprised of {registry + repository +
> > tag}) is never going to change. For example, the default tag used by docker
> > is `latest`, which suggests to me that you can push new versions of your
> > image to a registry, and update the `latest` tag to point to the new image.
> > After this change in mesos, I would need to log in to every mesos slave
> > that had ever downloaded this image, and run a `docker pull`.
> >
> > The alternative is of course to use new tags for every new image (e.g. git
> > hashes). Though this means I need to update every framework that has been
> > configured with docker image names and change them to the new tag. I can
> > see the appeal of this approach when thinking solely about service
> > schedulers, because it could be problematic to control a rolling release if
> > any new task will automatically run the new image (as it takes the latest
> > image from the registry).
> >
> > I've actually raised this issue several times with various people in the
> > docker community and never managed to get a concrete answer other than just
> > run `docker pull` every time (which is what we've been doing outside of
> > mesos). I think the difference between these use cases needs to be given
> > some serious thought, as it's caused us pain in various ways, hence why we
> > ended up running `docker pull` before every task to avoid the problem.
> >
> > A working example would be the redis repository
> > (https://registry.hub.docker.com/v1/repositories/redis/tags), you'll see
> > that the `latest` tag is pointing at version 2.8.
> > This tag is updated every time a new image is published, and if I were to
> > use the `latest` tag (or not specify a tag, since it's the default) I would
> > need to either explicitly change my deployment of redis to use a strict
> > version, or manually `docker pull` on all slaves and restart all the tasks
> > using this container image.
> >
> > It's also important to take into consideration long running frameworks like
> > Hadoop on Mesos. If this change were to be merged, then to avoid logging
> > into every slave and running `docker pull` we would need to restart the
> > JobTracker and change the image to a newer (never previously used) tag, as
> > opposed to new TaskTrackers automatically being launched inside the new
> > image.
> >
> > I guess a fair amount of this depends on what you're expecting to get from
> > using Docker. Software deployment or just dependency management and
> > isolation?
> >
> > I'm not against running `docker inspect && docker pull` on every slave in
> > the cluster, but I'd like the requirement to do that to be chosen. Perhaps
> > you guys have already had this discussion... I'm very interested to see
> > what others have been doing to solve this problem.
>
> Timothy Chen wrote:
>     Hi Tom, there are definitely lots of trade-off questions and honestly I
>     don't think there are obvious choices.
>     We could allow docker pull on each run, which we originally did, but that
>     hits several problems, like relying on the registry server to be up at
>     all times, which proves not to be the case. It is also limited by the
>     scalability of the registry server, and it no longer allows anyone to
>     run local images.
>
>     However, without a pull you don't necessarily get the very latest tag if
>     you simply specify no tag.
>
>     Currently Docker run's semantics, as you mentioned, don't auto pull if
>     the image already exists locally, and I'm simply matching that for now.
>     If users really want to guarantee what image they're running, I think
>     specifying the exact tag for the image is the best way to go, and not
>     relying on latest, as that's not reliable since even Docker run doesn't
>     do it.
>
>     It sure can be optional, but so far from all the use cases I've heard,
>     no one has required a docker pull on each run and most people are
>     surprised that we pull each time. I'm trying not to expose too many knobs
>     that are not necessary.
>
>     And answering your docker run {image} point, we intentionally separate
>     the docker image pulling and running into two phases as we like to know
>     what exact phase the docker process is in, and it's also easier to reason
>     about when integrated into Mesos as we need to handle a container being
>     destroyed at any point in time.
>
> Tom Arnfeld wrote:
>     All very fair points, thanks for the clarification.

Thanks for the comments! I think it's valuable to keep discussing these, and
options are still wide open as well, nothing is set in stone :)

- Timothy


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25237/#review52004
-----------------------------------------------------------


On Sept. 1, 2014, 7:16 p.m., Timothy Chen wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/25237/
> -----------------------------------------------------------
>
> (Updated Sept. 1, 2014, 7:16 p.m.)
>
>
> Review request for mesos, Benjamin Hindman and Jie Yu.
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> Avoid Docker pull on each run.
>
> Currently each Docker run will run a docker pull which calls the docker
> registry each time.
> To avoid this, this patch adds a docker inspect <image> and skips calling
> pull if the image already exists.
>
>
> Diffs
> -----
>
>   src/slave/containerizer/docker.cpp 0febbac5df4126f6c8d9a06dd0ba1668d041b34a
>
> Diff: https://reviews.apache.org/r/25237/diff/
>
>
> Testing
> -------
>
> make check
>
>
> Thanks,
>
> Timothy Chen
>
>
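
For readers following the thread, the inspect-then-pull behaviour described in the review request could be sketched roughly as below. This is only an illustration in Python, not the code in src/slave/containerizer/docker.cpp. The function name `ensure_image`, the injectable `run` callable, and the `force_pull` flag are all hypothetical; `force_pull` is included only to show how the opt-in behaviour Tom asks for might look. The sketch assumes that `docker inspect <image>` exits non-zero when the image is not present locally.

```python
import subprocess
from typing import Callable, Sequence


def _run(cmd: Sequence[str]) -> int:
    """Run a command quietly and return its exit code."""
    return subprocess.run(
        list(cmd), stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    ).returncode


def ensure_image(
    image: str,
    run: Callable[[Sequence[str]], int] = _run,  # hypothetical hook, eases testing
    force_pull: bool = False,                    # hypothetical opt-in knob
) -> bool:
    """Make sure `image` is available locally. Return True if a pull ran."""
    # Phase 1: check the local image store; exit code 0 means it is present.
    if not force_pull and run(["docker", "inspect", image]) == 0:
        return False
    # Phase 2: contact the registry only when needed (or when forced).
    if run(["docker", "pull", image]) != 0:
        raise RuntimeError("docker pull failed for " + image)
    return True
```

With the defaults, a mutable tag such as `latest` is never refreshed once it exists locally, which is the trade-off discussed above. Passing `force_pull=True` restores pull-on-every-run at the cost of depending on the registry being reachable.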