[GitHub] [incubator-mxnet] leezu edited a comment on issue #18093: CI: unify centos7_base.sh and centos7_core.sh

GitBox Sat, 18 Apr 2020 20:51:27 -0700

leezu edited a comment on issue #18093: CI: unify centos7_base.sh and
centos7_core.sh
URL: https://github.com/apache/incubator-mxnet/pull/18093#issuecomment-615955998

@marcoabreu The current style is broken. We need to get rid of it and
replace it with a better style. Let me elaborate.

This practice of copying shell files into the container leads to constant
docker cache invalidation and makes for a horrible development experience. We
need to reconsider this strategy and find a better solution that actually
works. Just try generating a docker image locally. It will never rely on the
mxnetci docker cache. You can check a number of references about this problem

-
https://stackoverflow.com/questions/48551953/why-does-my-docker-cache-get-invalidated-by-this-copy-command
- Request for reason of cache invalidation: moby/moby#9294
- COPY invalidates cache: moby/moby#21913

That's problem Nr 1 with copying shell scripts. There may be some unreliable
workarounds by using `.dockerignore` files, but let's discuss a bit more about
what (I suppose) we actually attempt to achieve with these shell scripts and
why I believe it's not the best solution:

Each of our docker containers has a certain _purpose_, such as providing a
build setup, providing a setup with cuda drivers, providing a minimal setup for
testing of release artifacts etc. For each _purpose_ we like to ensure / test
it on a variety of platforms (CentOS7 and some newer Ubuntu distro for build;
various older and newer Linux distros for testing release artifacts). This
leads to a combinatorial number of Dockerfiles. The shell scripts are used to
avoid duplicating the same install steps in each Dockerfile. The problem here
is, that there is a disconnect between the content of each shell script and the
places of use that leads to a unmaintainable mess of shell scripts in our repo.
Sometimes multiple different purposes require similar shell scripts, and the
union of the multiple scripts is used. Unfortunately, once a single script is
used for multiple purposes, it becomes unnecessarily hard to reason about the
effect of changes to that script. Once one of the purposes requires a new
dependency, suddenly we install it in all other dockerfiles as well.
Just look at the contents of
https://github.com/apache/incubator-mxnet/tree/37dbbd4/ci/docker/install
It's truly a mess.

I suggest to firstly simplify the Dockerfiles. As you see in this PR, at
least some of the shell scripts install way to many dependencies for the
Dockerfiles in which they are included. Other shell scripts can be completely
removed by switching from Ubuntu 16.04 base image to 20.04 base image and using
system dependencies instead of compiling stuff from source (we can use a very
recent version of Ubuntu, because we already ensure compatibility with old
distros based on CentOS7). This allows us to greatly simplify each Dockerfile.

Secondly, instead of having a separate Dockerfile for every particular
setting (`Dockerfile.publish.centos7_gpu_cu102`,
`Dockerfile.publish.centos7_gpu_cu101`, ...), just have a single Dockerfile for
a class of settings (`Dockerfile.publish.centos7` in this example) and
[dynamically specify the base
image](https://www.jeffgeerling.com/blog/2017/use-arg-dockerfile-dynamic-image-specification).
Together, this will greatly simplify our setup and greatly improve the
developer experience by fixing the cache. This PR doesn't implement it yet
obviously, but is a first step.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] leezu edited a comment on issue #18093: CI: unify centos7_base.sh and centos7_core.sh

Reply via email to