This is an automated email from the ASF dual-hosted git repository.
kszucs pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/master by this push:
new 39fcc48 ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes
39fcc48 is described below
commit 39fcc48a6e3d98dab03fed3904b45303f01bfa28
Author: Krisztián Szűcs <[email protected]>
AuthorDate: Wed May 6 12:40:44 2020 +0200
ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes
GitHub Actions has increased the repository-level cache size to 5GB, which
can be enough for the C++ based builds. With a higher compression level the
Ubuntu cache consumed less than 300MB of disk.
It is also worth mentioning that the GitHub Actions caches are evicted once
a day even if the 5GB limit has been reached, so we basically don't need to
worry about the cache size at all.
I set up each job to use a platform specific cache key ~prefixed with
docker and postfixed with the actual git reference. This means that
`docker-ubuntu-18.04-refs/pull/<pr>/merge` key is used in case of a pull
request and `docker-ubuntu-18.04-master` on master. Also set a restore key to
the master key as a fallback to have a cache hit on new pull requests.~
Caching has built-in support for [branch
hierarchy](https://help.github.com/en/actions/configuring-and-managing-workflows/caching-dependencies-to-speed-up-workflows#example-of-search-priority).
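The exact-key-then-prefix lookup that `key` and `restore-keys` perform can be sketched as follows. This is a hypothetical illustration of the matching behavior, not GitHub's implementation; the key names are examples in the style of the keys added by this commit.

```python
# Sketch of how an actions/cache lookup resolves: try the exact key
# first, then fall back to restore-keys by prefix, preferring the most
# recently created matching entry (assumed to be first in `available`).

def resolve_cache(key, restore_keys, available):
    """available: existing cache keys, ordered newest-first."""
    if key in available:              # exact hit on the full key
        return key
    for prefix in restore_keys:       # fall back to prefix matches
        for candidate in available:
            if candidate.startswith(prefix):
                return candidate
    return None

hit = resolve_cache(
    key="conda-python-3.6-abc123",          # e.g. suffix from hashFiles('cpp/**')
    restore_keys=["conda-python-3.6-"],
    available=["conda-python-3.6-def456", "ubuntu-18.04-xyz"],
)
# falls back to the newest entry sharing the "conda-python-3.6-" prefix
```

This is why new pull requests still get a warm cache: the restore key matches the latest master entry even though the `hashFiles` suffix differs.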
Closes #7081 from kszucs/gha-ccache
Authored-by: Krisztián Szűcs <[email protected]>
Signed-off-by: Krisztián Szűcs <[email protected]>
---
.github/workflows/cpp.yml | 6 ++
.github/workflows/cpp_cron.yml | 6 ++
.github/workflows/dev.yml | 6 ++
.github/workflows/integration.yml | 6 ++
.github/workflows/java.yml | 6 ++
.github/workflows/java_jni.yml | 6 ++
.github/workflows/python.yml | 106 +++++++++++++---------------
.github/workflows/python_cron.yml | 15 ++++
.github/workflows/r.yml | 6 ++
.github/workflows/ruby.yml | 6 ++
.gitignore | 3 +
dev/archery/archery/cli.py | 27 +++++---
dev/archery/archery/docker.py | 40 ++++++-----
dev/archery/archery/tests/test_docker.py | 115 +++++++++++++++++++++++++------
dev/tasks/tasks.yml | 20 +++---
docker-compose.yml | 47 +++++--------
docs/source/developers/docker.rst | 23 +++++--
17 files changed, 291 insertions(+), 153 deletions(-)
diff --git a/.github/workflows/cpp.yml b/.github/workflows/cpp.yml
index a04222a..c933285 100644
--- a/.github/workflows/cpp.yml
+++ b/.github/workflows/cpp.yml
@@ -70,6 +70,12 @@ jobs:
run: ci/scripts/util_checkout.sh
- name: Free Up Disk Space
run: ci/scripts/util_cleanup.sh
+ - name: Cache Docker Volumes
+ uses: actions/cache@v1
+ with:
+ path: .docker
+ key: ${{ matrix.image }}-${{ hashFiles('cpp/**') }}
+ restore-keys: ${{ matrix.image }}-
- name: Setup Python
uses: actions/setup-python@v1
with:
diff --git a/.github/workflows/cpp_cron.yml b/.github/workflows/cpp_cron.yml
index a93484b..e227adf 100644
--- a/.github/workflows/cpp_cron.yml
+++ b/.github/workflows/cpp_cron.yml
@@ -86,6 +86,12 @@ jobs:
run: ci/scripts/util_checkout.sh
- name: Free Up Disk Space
run: ci/scripts/util_cleanup.sh
+ - name: Cache Docker Volumes
+ uses: actions/cache@v1
+ with:
+ path: .docker
+ key: ${{ matrix.name }}-${{ hashFiles('cpp/**') }}
+ restore-keys: ${{ matrix.name }}-
- name: Setup Python
uses: actions/setup-python@v1
with:
diff --git a/.github/workflows/dev.yml b/.github/workflows/dev.yml
index 6bdeb72..59e8202 100644
--- a/.github/workflows/dev.yml
+++ b/.github/workflows/dev.yml
@@ -74,6 +74,12 @@ jobs:
- name: Free Up Disk Space
shell: bash
run: ci/scripts/util_cleanup.sh
+ - name: Cache Docker Volumes
+ uses: actions/cache@v1
+ with:
+ path: .docker
+ key: ubuntu-18.04-${{ hashFiles('cpp/**') }}
+ restore-keys: ubuntu-18.04-
- name: Setup Python
uses: actions/setup-python@v1
with:
diff --git a/.github/workflows/integration.yml b/.github/workflows/integration.yml
index 4fea799..905ff01 100644
--- a/.github/workflows/integration.yml
+++ b/.github/workflows/integration.yml
@@ -62,6 +62,12 @@ jobs:
run: ci/scripts/util_checkout.sh
- name: Free Up Disk Space
run: ci/scripts/util_cleanup.sh
+ - name: Cache Docker Volumes
+ uses: actions/cache@v1
+ with:
+ path: .docker
+ key: conda-${{ hashFiles('cpp/**') }}
+ restore-keys: conda-
- name: Setup Python
uses: actions/setup-python@v1
with:
diff --git a/.github/workflows/java.yml b/.github/workflows/java.yml
index 7f8af19..a679622 100644
--- a/.github/workflows/java.yml
+++ b/.github/workflows/java.yml
@@ -66,6 +66,12 @@ jobs:
- name: Free Up Disk Space
shell: bash
run: ci/scripts/util_cleanup.sh
+ - name: Cache Docker Volumes
+ uses: actions/cache@v1
+ with:
+ path: .docker
+ key: maven-${{ hashFiles('java/**') }}
+ restore-keys: maven-
- name: Setup Python
uses: actions/setup-python@v1
with:
diff --git a/.github/workflows/java_jni.yml b/.github/workflows/java_jni.yml
index ae257c5..ef40642 100644
--- a/.github/workflows/java_jni.yml
+++ b/.github/workflows/java_jni.yml
@@ -64,6 +64,12 @@ jobs:
run: ci/scripts/util_checkout.sh
- name: Free Up Disk Space
run: ci/scripts/util_cleanup.sh
+ - name: Cache Docker Volumes
+ uses: actions/cache@v1
+ with:
+ path: .docker
+ key: maven-${{ hashFiles('java/**') }}
+ restore-keys: maven-
- name: Setup Python
uses: actions/setup-python@v1
with:
diff --git a/.github/workflows/python.yml b/.github/workflows/python.yml
index c7e5ea8..9e83203 100644
--- a/.github/workflows/python.yml
+++ b/.github/workflows/python.yml
@@ -39,38 +39,47 @@ env:
jobs:
- docker:
- name: ${{ matrix.title }}
- runs-on: ubuntu-latest
- if: ${{ !contains(github.event.pull_request.title, 'WIP') }}
- strategy:
- fail-fast: false
- matrix:
- name:
- - ubuntu-16.04-python-3
- - conda-python-3.8-nopandas
- - conda-python-3.6-pandas-0.23
- - conda-python-3.6-pandas-latest
- include:
- - name: ubuntu-16.04-python-3
- image: ubuntu-python
- # this image always builds with python 3.5
- title: AMD64 Ubuntu 16.04 Python 3.5
- ubuntu: 16.04
- - name: conda-python-3.8-nopandas
- image: conda-python
- title: AMD64 Conda Python 3.8 Without Pandas
- python: 3.8
- - name: conda-python-3.6-pandas-0.23
- image: conda-python-pandas
- title: AMD64 Conda Python 3.6 Pandas 0.23
- python: 3.6
- pandas: 0.23
- - name: conda-python-3.6-pandas-latest
- image: conda-python-pandas
- title: AMD64 Conda Python 3.6 Pandas latest
- python: 3.6
- pandas: latest
+ docker:
+ name: ${{ matrix.title }}
+ runs-on: ubuntu-latest
+ if: ${{ !contains(github.event.pull_request.title, 'WIP') }}
+ strategy:
+ fail-fast: false
+ matrix:
+ name:
+ - ubuntu-16.04-python-3
+ - conda-python-3.8-nopandas
+ - conda-python-3.6-pandas-0.23
+ - conda-python-3.6-pandas-latest
+ - centos-python-3.6-manylinux1
+ include:
+ - name: ubuntu-16.04-python-3
+ cache: ubuntu-16.04-python-3
+ image: ubuntu-python
+ # this image always builds with python 3.5
+ title: AMD64 Ubuntu 16.04 Python 3.5
+ ubuntu: 16.04
+ - name: conda-python-3.8-nopandas
+ cache: conda-python-3.8
+ image: conda-python
+ title: AMD64 Conda Python 3.8 Without Pandas
+ python: 3.8
+ - name: conda-python-3.6-pandas-0.23
+ cache: conda-python-3.6
+ image: conda-python-pandas
+ title: AMD64 Conda Python 3.6 Pandas 0.23
+ python: 3.6
+ pandas: 0.23
+ - name: conda-python-3.6-pandas-latest
+ cache: conda-python-3.6
+ image: conda-python-pandas
+ title: AMD64 Conda Python 3.6 Pandas latest
+ python: 3.6
+ pandas: latest
+ - name: centos-python-3.6-manylinux1
+ cache: manylinux1
+ image: --force-pull --no-build centos-python-manylinux1
+ title: AMD64 CentOS 5.11 Python 3.6 manylinux1
env:
PYTHON: ${{ matrix.python || 3.7 }}
UBUNTU: ${{ matrix.ubuntu || 18.04 }}
@@ -84,6 +93,12 @@ jobs:
run: ci/scripts/util_checkout.sh
- name: Free Up Disk Space
run: ci/scripts/util_cleanup.sh
+ - name: Cache Docker Volumes
+ uses: actions/cache@v1
+ with:
+ path: .docker
+ key: ${{ matrix.cache }}-${{ hashFiles('cpp/**') }}
+ restore-keys: ${{ matrix.cache }}-
- name: Setup Python
uses: actions/setup-python@v1
with:
@@ -100,33 +115,6 @@ jobs:
continue-on-error: true
run: archery docker push ${{ matrix.image }}
- manylinux1:
- name: AMD64 CentOS 5.11 Python 3.6 manylinux1
- runs-on: ubuntu-latest
- if: ${{ !contains(github.event.pull_request.title, 'WIP') }}
- steps:
- - name: Checkout Arrow
- uses: actions/checkout@v2
- with:
- fetch-depth: 0
- - name: Fetch Submodules and Tags
- shell: bash
- run: ci/scripts/util_checkout.sh
- - name: Free Up Disk Space
- shell: bash
- run: ci/scripts/util_cleanup.sh
- - name: Setup Python
- uses: actions/setup-python@v1
- with:
- python-version: 3.8
- - name: Setup Archery
- run: pip install -e dev/archery[docker]
- - name: Execute Docker Build
- run: |
- sudo sysctl -w kernel.core_pattern="core.%e.%p"
- ulimit -c unlimited
- archery docker run --no-build ${{ matrix.image }}
-
macos:
name: AMD64 MacOS 10.15 Python 3.7
runs-on: macos-latest
diff --git a/.github/workflows/python_cron.yml b/.github/workflows/python_cron.yml
index 8334cce..2b9709c 100644
--- a/.github/workflows/python_cron.yml
+++ b/.github/workflows/python_cron.yml
@@ -55,38 +55,47 @@ jobs:
- conda-python-3.7-hdfs-2.9.2
include:
- name: debian-10-python-3
+ cache: debian-10-python-3
image: debian-python
title: AMD64 Debian 10 Python 3
debian: 10
- name: fedora-30-python-3
+ cache: fedora-30-python-3
image: fedora-python
title: AMD64 Fedora 30 Python 3
fedora: 30
- name: ubuntu-18.04-python-3
+ cache: ubuntu-18.04-python-3
image: ubuntu-python
title: AMD64 Ubuntu 18.04 Python 3
ubuntu: 18.04
- name: conda-python-3.7-dask-latest
+ cache: conda-python-3.7
image: conda-python-dask
title: AMD64 Conda Python 3.7 Dask latest
dask: latest
- name: conda-python-3.7-turbodbc-latest
+ cache: conda-python-3.7
image: conda-python-turbodbc
title: AMD64 Conda Python 3.7 Turbodbc latest
turbodbc: latest
- name: conda-python-3.7-kartothek-latest
+ cache: conda-python-3.7
image: conda-python-kartothek
title: AMD64 Conda Python 3.7 Kartothek latest
kartothek: latest
- name: conda-python-3.7-pandas-0.24
+ cache: conda-python-3.7
image: conda-python-pandas
title: AMD64 Conda Python 3.7 Pandas 0.24
pandas: 0.24
- name: conda-python-3.7-pandas-master
+ cache: conda-python-3.7
image: --no-cache-leaf conda-python-pandas
title: AMD64 Conda Python 3.7 Pandas master
pandas: master
- name: conda-python-3.7-hdfs-2.9.2
+ cache: conda-python-3.7
image: conda-python-hdfs
title: AMD64 Conda Python 3.7 HDFS 2.9.2
hdfs: 2.9.2
@@ -110,6 +119,12 @@ jobs:
run: ci/scripts/util_checkout.sh
- name: Free Up Disk Space
run: ci/scripts/util_cleanup.sh
+ - name: Cache Docker Volumes
+ uses: actions/cache@v1
+ with:
+ path: .docker
+ key: ${{ matrix.cache }}-${{ hashFiles('cpp/**') }}
+ restore-keys: ${{ matrix.cache }}-
- name: Setup Python
uses: actions/setup-python@v1
with:
diff --git a/.github/workflows/r.yml b/.github/workflows/r.yml
index 7b8d354..26bc704 100644
--- a/.github/workflows/r.yml
+++ b/.github/workflows/r.yml
@@ -85,6 +85,12 @@ jobs:
run: ci/scripts/util_checkout.sh
- name: Free Up Disk Space
run: ci/scripts/util_cleanup.sh
+ - name: Cache Docker Volumes
+ uses: actions/cache@v1
+ with:
+ path: .docker
+ key: ${{ matrix.name }}-${{ hashFiles('cpp/**') }}
+ restore-keys: ${{ matrix.name }}-
- name: Setup Python
uses: actions/setup-python@v1
with:
diff --git a/.github/workflows/ruby.yml b/.github/workflows/ruby.yml
index 06bbb14..facb9e8 100644
--- a/.github/workflows/ruby.yml
+++ b/.github/workflows/ruby.yml
@@ -65,6 +65,12 @@ jobs:
- name: Free Up Disk Space
shell: bash
run: ci/scripts/util_cleanup.sh
+ - name: Cache Docker Volumes
+ uses: actions/cache@v1
+ with:
+ path: .docker
+ key: ubuntu-${{ matrix.ubuntu }}-ruby-${{ hashFiles('cpp/**') }}
+ restore-keys: ubuntu-${{ matrix.ubuntu }}-ruby-
- name: Setup Python
uses: actions/setup-python@v1
with:
diff --git a/.gitignore b/.gitignore
index 8aaad13..6f12336 100644
--- a/.gitignore
+++ b/.gitignore
@@ -77,3 +77,6 @@ site/
# macOS
cpp/Brewfile.lock.json
.DS_Store
+
+# docker volumes used for caching
+.docker
diff --git a/dev/archery/archery/cli.py b/dev/archery/archery/cli.py
index 4021c0a..19590b4 100644
--- a/dev/archery/archery/cli.py
+++ b/dev/archery/archery/cli.py
@@ -682,12 +682,14 @@ def docker_compose(obj, src):
@click.argument('command', required=False, default=None)
@click.option('--env', '-e', multiple=True,
help="Set environment variable within the container")
[email protected]('--build/--no-build', default=True,
[email protected]('--force-pull/--no-pull', default=True,
+ help="Whether to force pull the image and its ancestor images")
[email protected]('--force-build/--no-build', default=True,
help="Whether to force build the image and its ancestor images")
[email protected]('--cache/--no-cache', default=True,
[email protected]('--use-cache/--no-cache', default=True,
help="Whether to use cache when building the image and its "
"ancestor images")
[email protected]('--cache-leaf/--no-cache-leaf', default=True,
[email protected]('--use-leaf-cache/--no-leaf-cache', default=True,
help="Whether to use cache when building only the (leaf) image "
"passed as the argument. To disable caching for both the "
"image and its ancestors use --no-cache option.")
@@ -695,8 +697,8 @@ def docker_compose(obj, src):
help="Display the docker-compose commands instead of executing "
"them.")
@click.pass_obj
-def docker_compose_run(obj, image, command, env, build, cache, cache_leaf,
- dry_run):
+def docker_compose_run(obj, image, command, env, force_pull, force_build,
+ use_cache, use_leaf_cache, dry_run):
"""Execute docker-compose builds.
To see the available builds run `archery docker list`.
@@ -713,10 +715,10 @@ def docker_compose_run(obj, image, command, env, build, cache, cache_leaf,
PYTHON=3.8 archery docker run conda-python
# disable the cache only for the leaf image
- PANDAS=master archery docker run --no-cache-leaf conda-python-pandas
+ PANDAS=master archery docker run --no-leaf-cache conda-python-pandas
# entirely skip building the image
- archery docker run --no-build conda-python
+ archery docker run --no-pull --no-build conda-python
# pass runtime parameters via docker environment variables
archery docker run -e CMAKE_BUILD_TYPE=release ubuntu-cpp
@@ -739,9 +741,14 @@ def docker_compose_run(obj, image, command, env, build, cache, cache_leaf,
compose._execute = MethodType(_print_command, compose)
try:
- if build:
- compose.build(image, cache=cache, cache_leaf=cache_leaf)
- compose.run(image, command=command)
+ compose.run(
+ image,
+ command=command,
+ force_pull=force_pull,
+ force_build=force_build,
+ use_cache=use_cache,
+ use_leaf_cache=use_leaf_cache
+ )
except UndefinedImage as e:
raise click.ClickException(
"There is no service/image defined in docker-compose.yml with "
diff --git a/dev/archery/archery/docker.py b/dev/archery/archery/docker.py
index ca2237b..0c0a49a 100644
--- a/dev/archery/archery/docker.py
+++ b/dev/archery/archery/docker.py
@@ -138,31 +138,39 @@ class DockerCompose(Command):
)
)
- def build(self, image, cache=True, cache_leaf=True, params=None):
+ def pull(self, image, pull_leaf=True):
self._validate_image(image)
- if cache:
- # pull
- for ancestor in self.nodes[image]:
- self._execute('pull', '--ignore-pull-failures', ancestor)
- if cache_leaf:
- self._execute('pull', '--ignore-pull-failures', image)
- # build
- for ancestor in self.nodes[image]:
+ for ancestor in self.nodes[image]:
+ self._execute('pull', '--ignore-pull-failures', ancestor)
+
+ if pull_leaf:
+ self._execute('pull', '--ignore-pull-failures', image)
+
+ def build(self, image, use_cache=True, use_leaf_cache=True):
+ self._validate_image(image)
+
+ for ancestor in self.nodes[image]:
+ if use_cache:
self._execute('build', ancestor)
- if cache_leaf:
- self._execute('build', image)
else:
- self._execute('build', '--no-cache', image)
- else:
- # build
- for ancestor in self.nodes[image]:
self._execute('build', '--no-cache', ancestor)
+
+ if use_cache and use_leaf_cache:
+ self._execute('build', image)
+ else:
self._execute('build', '--no-cache', image)
- def run(self, image, command=None, env=None, params=None):
+ def run(self, image, command=None, env=None, force_pull=False,
+ force_build=False, use_cache=True, use_leaf_cache=True):
self._validate_image(image)
+ if force_pull:
+ self.pull(image, pull_leaf=use_leaf_cache)
+ if force_build:
+ self.build(image, use_cache=use_cache,
+ use_leaf_cache=use_leaf_cache)
+
args = []
if env is not None:
for k, v in env.items():
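The refactoring above splits the old implicit `build` step into explicit, opt-in `pull` and `build` phases of `run()`. A minimal stand-in sketch of that control flow (simplified names, not the real archery `DockerCompose` class, which resolves ancestors from docker-compose.yml):

```python
# Sketch of the new run() control flow: pull and build are opt-in
# phases; use_leaf_cache controls both pulling and cached-building of
# the leaf image.

class ComposeSketch:
    def __init__(self, ancestors):
        self.ancestors = ancestors   # images the leaf depends on, in order
        self.calls = []              # recorded docker-compose invocations

    def _execute(self, *args):
        self.calls.append(" ".join(args))

    def pull(self, image, pull_leaf=True):
        for ancestor in self.ancestors:
            self._execute("pull", "--ignore-pull-failures", ancestor)
        if pull_leaf:
            self._execute("pull", "--ignore-pull-failures", image)

    def build(self, image, use_cache=True, use_leaf_cache=True):
        for ancestor in self.ancestors:
            if use_cache:
                self._execute("build", ancestor)
            else:
                self._execute("build", "--no-cache", ancestor)
        if use_cache and use_leaf_cache:
            self._execute("build", image)
        else:
            self._execute("build", "--no-cache", image)

    def run(self, image, force_pull=False, force_build=False,
            use_cache=True, use_leaf_cache=True):
        if force_pull:
            self.pull(image, pull_leaf=use_leaf_cache)
        if force_build:
            self.build(image, use_cache=use_cache,
                       use_leaf_cache=use_leaf_cache)
        self._execute("run", "--rm", image)

compose = ComposeSketch(ancestors=["conda-cpp", "conda-python"])
compose.run("conda-python-pandas", force_pull=True, force_build=True,
            use_leaf_cache=False)
# pulls only the ancestors, rebuilds them with cache, rebuilds the
# leaf without cache, then runs the container
```

With both flags off (the new CI default via `--no-pull --no-build`), `run()` goes straight to `docker-compose run --rm`, which is what makes the restored `.docker` volumes useful without re-pulling images.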
diff --git a/dev/archery/archery/tests/test_docker.py b/dev/archery/archery/tests/test_docker.py
index ce75d84..f9c9cd9 100644
--- a/dev/archery/archery/tests/test_docker.py
+++ b/dev/archery/archery/tests/test_docker.py
@@ -233,14 +233,50 @@ def test_forwarding_env_variables(arrow_compose_path):
with assert_compose_calls(compose, expected_calls, env=expected_env):
assert os.environ['MY_CUSTOM_VAR_A'] == 'a'
assert os.environ['MY_CUSTOM_VAR_B'] == 'b'
+ compose.pull('conda-cpp')
compose.build('conda-cpp')
-def test_compose_build(arrow_compose_path):
+def test_compose_pull(arrow_compose_path):
compose = DockerCompose(arrow_compose_path)
expected_calls = [
"pull --ignore-pull-failures conda-cpp",
+ ]
+ with assert_compose_calls(compose, expected_calls):
+ compose.pull('conda-cpp')
+
+ expected_calls = [
+ "pull --ignore-pull-failures conda-cpp",
+ "pull --ignore-pull-failures conda-python",
+ "pull --ignore-pull-failures conda-python-pandas"
+ ]
+ with assert_compose_calls(compose, expected_calls):
+ compose.pull('conda-python-pandas')
+
+ expected_calls = [
+ "pull --ignore-pull-failures conda-cpp",
+ "pull --ignore-pull-failures conda-python",
+ ]
+ with assert_compose_calls(compose, expected_calls):
+ compose.pull('conda-python-pandas', pull_leaf=False)
+
+
+def test_compose_pull_params(arrow_compose_path):
+ expected_calls = [
+ "pull --ignore-pull-failures conda-cpp",
+ "pull --ignore-pull-failures conda-python",
+ ]
+ compose = DockerCompose(arrow_compose_path, params=dict(UBUNTU='18.04'))
+ expected_env = PartialEnv(PYTHON='3.6', PANDAS='latest')
+ with assert_compose_calls(compose, expected_calls, env=expected_env):
+ compose.pull('conda-python-pandas', pull_leaf=False)
+
+
+def test_compose_build(arrow_compose_path):
+ compose = DockerCompose(arrow_compose_path)
+
+ expected_calls = [
"build conda-cpp",
]
with assert_compose_calls(compose, expected_calls):
@@ -250,12 +286,9 @@ def test_compose_build(arrow_compose_path):
"build --no-cache conda-cpp"
]
with assert_compose_calls(compose, expected_calls):
- compose.build('conda-cpp', cache=False)
+ compose.build('conda-cpp', use_cache=False)
expected_calls = [
- "pull --ignore-pull-failures conda-cpp",
- "pull --ignore-pull-failures conda-python",
- "pull --ignore-pull-failures conda-python-pandas",
"build conda-cpp",
"build conda-python",
"build conda-python-pandas"
@@ -269,22 +302,20 @@ def test_compose_build(arrow_compose_path):
"build --no-cache conda-python-pandas",
]
with assert_compose_calls(compose, expected_calls):
- compose.build('conda-python-pandas', cache=False)
+ compose.build('conda-python-pandas', use_cache=False)
expected_calls = [
- "pull --ignore-pull-failures conda-cpp",
- "pull --ignore-pull-failures conda-python",
"build conda-cpp",
"build conda-python",
"build --no-cache conda-python-pandas",
]
with assert_compose_calls(compose, expected_calls):
- compose.build('conda-python-pandas', cache=True, cache_leaf=False)
+ compose.build('conda-python-pandas', use_cache=True,
+ use_leaf_cache=False)
def test_compose_build_params(arrow_compose_path):
expected_calls = [
- "pull --ignore-pull-failures ubuntu-cpp",
"build ubuntu-cpp",
]
@@ -306,18 +337,7 @@ def test_compose_build_params(arrow_compose_path):
compose = DockerCompose(arrow_compose_path, params=dict(UBUNTU='18.04'))
expected_env = PartialEnv(PYTHON='3.6', PANDAS='latest')
with assert_compose_calls(compose, expected_calls, env=expected_env):
- compose.build('conda-python-pandas', cache=False)
-
- compose = DockerCompose(arrow_compose_path, params=dict(PANDAS='0.25.3'))
- expected_env = PartialEnv(PYTHON='3.6', PANDAS='0.25.3')
- with assert_compose_calls(compose, expected_calls, env=expected_env):
- compose.build('conda-python-pandas', cache=False)
-
- compose = DockerCompose(arrow_compose_path,
- params=dict(PYTHON='3.8', PANDAS='master'))
- expected_env = PartialEnv(PYTHON='3.8', PANDAS='master')
- with assert_compose_calls(compose, expected_calls, env=expected_env):
- compose.build('conda-python-pandas', cache=False)
+ compose.build('conda-python-pandas', use_cache=False)
def test_compose_run(arrow_compose_path):
@@ -365,6 +385,57 @@ def test_compose_run(arrow_compose_path):
compose.run('conda-python', env=env)
+def test_compose_run_force_pull_and_build(arrow_compose_path):
+ compose = DockerCompose(arrow_compose_path)
+
+ expected_calls = [
+ "pull --ignore-pull-failures conda-cpp",
+ "run --rm conda-cpp"
+ ]
+ with assert_compose_calls(compose, expected_calls):
+ compose.run('conda-cpp', force_pull=True)
+
+ expected_calls = [
+ "build conda-cpp",
+ "run --rm conda-cpp"
+ ]
+ with assert_compose_calls(compose, expected_calls):
+ compose.run('conda-cpp', force_build=True)
+
+ expected_calls = [
+ "pull --ignore-pull-failures conda-cpp",
+ "build conda-cpp",
+ "run --rm conda-cpp"
+ ]
+ with assert_compose_calls(compose, expected_calls):
+ compose.run('conda-cpp', force_pull=True, force_build=True)
+
+ expected_calls = [
+ "pull --ignore-pull-failures conda-cpp",
+ "pull --ignore-pull-failures conda-python",
+ "pull --ignore-pull-failures conda-python-pandas",
+ "build conda-cpp",
+ "build conda-python",
+ "build conda-python-pandas",
+ "run --rm conda-python-pandas bash"
+ ]
+ with assert_compose_calls(compose, expected_calls):
+ compose.run('conda-python-pandas', command='bash', force_build=True,
+ force_pull=True)
+
+ expected_calls = [
+ "pull --ignore-pull-failures conda-cpp",
+ "pull --ignore-pull-failures conda-python",
+ "build conda-cpp",
+ "build conda-python",
+ "build --no-cache conda-python-pandas",
+ "run --rm conda-python-pandas bash"
+ ]
+ with assert_compose_calls(compose, expected_calls):
+ compose.run('conda-python-pandas', command='bash', force_build=True,
+ force_pull=True, use_leaf_cache=False)
+
+
def test_compose_push(arrow_compose_path):
compose = DockerCompose(arrow_compose_path, params=dict(PYTHON='3.8'))
expected_env = PartialEnv(PYTHON="3.8")
diff --git a/dev/tasks/tasks.yml b/dev/tasks/tasks.yml
index 51f830d..478e4f4 100644
--- a/dev/tasks/tasks.yml
+++ b/dev/tasks/tasks.yml
@@ -1706,7 +1706,7 @@ tasks:
PYTHON: 3.7
PANDAS: latest
# use the latest pandas release, so prevent reusing any cached layers
- run: --no-cache-leaf conda-python-pandas
+ run: --no-leaf-cache conda-python-pandas
test-conda-python-3.8-pandas-latest:
ci: github
@@ -1716,7 +1716,7 @@ tasks:
PYTHON: 3.8
PANDAS: latest
# use the latest pandas release, so prevent reusing any cached layers
- run: --no-cache-leaf conda-python-pandas
+ run: --no-leaf-cache conda-python-pandas
test-conda-python-3.7-pandas-master:
ci: github
@@ -1726,7 +1726,7 @@ tasks:
PYTHON: 3.7
PANDAS: master
# use the master branch of pandas, so prevent reusing any cached layers
- run: --no-cache-leaf conda-python-pandas
+ run: --no-leaf-cache conda-python-pandas
test-conda-python-3.6-pandas-0.23:
ci: github
@@ -1745,7 +1745,7 @@ tasks:
PYTHON: 3.7
DASK: latest
# use the latest dask release, so prevent reusing any cached layers
- run: --no-cache-leaf conda-python-dask
+ run: --no-leaf-cache conda-python-dask
test-conda-python-3.8-dask-master:
ci: github
@@ -1755,7 +1755,7 @@ tasks:
PYTHON: 3.8
DASK: master
# use the master branch of dask, so prevent reusing any cached layers
- run: --no-cache-leaf conda-python-dask
+ run: --no-leaf-cache conda-python-dask
test-conda-python-3.8-jpype:
ci: github
@@ -1773,7 +1773,7 @@ tasks:
PYTHON: 3.7
TURBODBC: latest
# use the latest turbodbc release, so prevent reusing any cached layers
- run: --no-cache-leaf conda-python-turbodbc
+ run: --no-leaf-cache conda-python-turbodbc
test-conda-python-3.7-turbodbc-master:
ci: github
@@ -1783,7 +1783,7 @@ tasks:
PYTHON: 3.7
TURBODBC: master
# use the master branch of dask, so prevent reusing any cached layers
- run: --no-cache-leaf conda-python-turbodbc
+ run: --no-leaf-cache conda-python-turbodbc
test-conda-python-3.7-kartothek-latest:
ci: github
@@ -1792,7 +1792,7 @@ tasks:
env:
PYTHON: 3.7
KARTOTHEK: latest
- run: --no-cache-leaf conda-python-kartothek
+ run: --no-leaf-cache conda-python-kartothek
test-conda-python-3.7-kartothek-master:
ci: github
@@ -1802,7 +1802,7 @@ tasks:
PYTHON: 3.7
KARTOTHEK: master
# use the master branch of kartothek, so prevent reusing any layers
- run: --no-cache-leaf conda-python-kartothek
+ run: --no-leaf-cache conda-python-kartothek
test-conda-python-3.7-hdfs-2.9.2:
ci: github
@@ -1821,7 +1821,7 @@ tasks:
PYTHON: 3.7
SPARK: master
# use the master branch of spark, so prevent reusing any layers
- run: --no-cache-leaf conda-python-spark
+ run: --no-leaf-cache conda-python-spark
# Remove the "skipped-" prefix in ARROW-8475
skipped-test-conda-cpp-hiveserver2:
diff --git a/docker-compose.yml b/docker-compose.yml
index 1f4bcfc..cd1a896 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -55,29 +55,12 @@
version: '3.5'
-volumes:
- # Named volumes must be predefined, so in order to use other architecture,
- # like ARM64v8 with debian 10, then the appropriate parametrized volume name
- # must be added the list below: arm64v8-debian-10-cache
- amd64-conda-cache:
- amd64-cuda-9.1-cache:
- amd64-cuda-10.0-cache:
- amd64-cuda-10.1-cache:
- amd64-debian-9-cache:
- amd64-debian-10-cache:
- amd64-fedora-30-cache:
- amd64-ubuntu-14.04-cache:
- amd64-ubuntu-16.04-cache:
- amd64-ubuntu-18.04-cache:
- amd64-ubuntu-20.04-cache:
- maven-cache:
-
x-ccache: &ccache
CCACHE_COMPILERCHECK: content
CCACHE_COMPRESS: 1
- CCACHE_COMPRESSLEVEL: 5
+ CCACHE_COMPRESSLEVEL: 6
CCACHE_MAXSIZE: 500M
- CCACHE_DIR: /build/ccache
+ CCACHE_DIR: /ccache
x-hierarchy:
# This section is used by the archery tool to enable building nested images,
@@ -180,7 +163,7 @@ services:
ARROW_BUILD_BENCHMARKS: "ON"
volumes: &conda-volumes
- .:/arrow:delegated
- - ${ARCH}-conda-cache:/build:delegated
+ - .docker/${ARCH}-conda-ccache:/ccache:delegated
command: &cpp-conda-command
["/arrow/ci/scripts/cpp_build.sh /arrow /build &&
/arrow/ci/scripts/cpp_test.sh /arrow /build"]
@@ -237,7 +220,7 @@ services:
ARROW_ENABLE_TIMING_TESTS: # inherit
volumes: &cuda-volumes
- .:/arrow:delegated
- - ${ARCH}-cuda-${CUDA}-cache:/build:delegated
+ - .docker/${ARCH}-cuda-${CUDA}-ccache:/ccache:delegated
command: &cpp-command >
/bin/bash -c "
/arrow/ci/scripts/cpp_build.sh /arrow /build &&
@@ -266,7 +249,7 @@ services:
ARROW_ENABLE_TIMING_TESTS: # inherit
volumes: &debian-volumes
- .:/arrow:delegated
- - ${ARCH}-debian-${DEBIAN}-cache:/build:delegated
+ - .docker/${ARCH}-debian-${DEBIAN}-ccache:/ccache:delegated
command: *cpp-command
ubuntu-cpp:
@@ -293,7 +276,7 @@ services:
ARROW_ENABLE_TIMING_TESTS: # inherit
volumes: &ubuntu-volumes
- .:/arrow:delegated
- - ${ARCH}-ubuntu-${UBUNTU}-cache:/build:delegated
+ - .docker/${ARCH}-ubuntu-${UBUNTU}-ccache:/ccache:delegated
command: *cpp-command
ubuntu-cpp-sanitizer:
@@ -353,7 +336,7 @@ services:
ARROW_ENABLE_TIMING_TESTS: # inherit
volumes: &fedora-volumes
- .:/arrow:delegated
- - ${ARCH}-fedora-${FEDORA}-cache:/build:delegated
+ - .docker/${ARCH}-fedora-${FEDORA}-ccache:/ccache:delegated
command: *cpp-command
############################### C GLib ######################################
@@ -776,12 +759,14 @@ services:
llvm: ${LLVM}
shm_size: *shm-size
environment:
+ <<: *ccache
PYARROW_PARALLEL: 3
PYTHON_VERSION: ${PYTHON_VERSION:-3.6}
UNICODE_WIDTH: ${UNICODE_WIDTH:-16}
volumes:
- .:/arrow:delegated
- ./python/manylinux1:/io:delegated
+ - .docker/centos-python-manylinux1-ccache:/ccache:delegated
command: &manylinux-command /io/build_arrow.sh
centos-python-manylinux2010:
@@ -795,12 +780,14 @@ services:
llvm: ${LLVM}
shm_size: *shm-size
environment:
+ <<: *ccache
PYARROW_PARALLEL: 3
PYTHON_VERSION: ${PYTHON_VERSION:-3.6}
UNICODE_WIDTH: ${UNICODE_WIDTH:-16}
volumes:
- .:/arrow:delegated
- ./python/manylinux201x:/io:delegated
+ - .docker/centos-python-manylinux2010-ccache:/ccache:delegated
command: *manylinux-command
centos-python-manylinux2014:
@@ -814,12 +801,14 @@ services:
llvm: ${LLVM}
shm_size: *shm-size
environment:
+ <<: *ccache
PYARROW_PARALLEL: 3
PYTHON_VERSION: ${PYTHON_VERSION:-3.6}
UNICODE_WIDTH: ${UNICODE_WIDTH:-16}
volumes:
- .:/arrow:delegated
- ./python/manylinux201x:/io:delegated
+ - .docker/centos-python-manylinux2014-ccache:/ccache:delegated
command: *manylinux-command
################################## R ########################################
@@ -1036,7 +1025,7 @@ services:
shm_size: *shm-size
volumes: &java-volumes
- .:/arrow:delegated
- - maven-cache:/root/.m2:delegated
+ - .docker/maven-cache:/root/.m2:delegated
command: &java-command >
/bin/bash -c "
/arrow/ci/scripts/java_build.sh /arrow /build &&
@@ -1062,8 +1051,8 @@ services:
<<: *ccache
volumes:
- .:/arrow:delegated
- - maven-cache:/root/.m2:delegated
- - ${ARCH}-debian-9-cache:/build:delegated
+ - .docker/maven-cache:/root/.m2:delegated
+ - .docker/${ARCH}-debian-9-ccache:/ccache:delegated
command:
/bin/bash -c "
/arrow/ci/scripts/cpp_build.sh /arrow /build &&
@@ -1250,8 +1239,8 @@ services:
shm_size: *shm-size
volumes: &conda-maven-volumes
- .:/arrow:delegated
- - maven-cache:/root/.m2:delegated
- - ${ARCH}-conda-cache:/build:delegated
+ - .docker/maven-cache:/root/.m2:delegated
+ - .docker/${ARCH}-conda-ccache:/ccache:delegated
command:
["/arrow/ci/scripts/cpp_build.sh /arrow /build &&
/arrow/ci/scripts/python_build.sh /arrow /build &&
diff --git a/docs/source/developers/docker.rst b/docs/source/developers/docker.rst
index e3f125d..7bb4553 100644
--- a/docs/source/developers/docker.rst
+++ b/docs/source/developers/docker.rst
@@ -47,7 +47,6 @@ with the ``--help`` flag:
archery docker --help
archery docker run --help
-
Examples
~~~~~~~~
@@ -68,8 +67,8 @@ Archery calls the following docker-compose commands:
.. code:: bash
docker-compose pull --ignore-pull-failures conda-cpp
- docker-compose build conda-cpp
docker-compose pull --ignore-pull-failures conda-python
+ docker-compose build conda-cpp
docker-compose build conda-python
docker-compose run --rm conda-python
@@ -102,7 +101,7 @@ where the leaf image is ``conda-python-pandas``.
.. code:: bash
- PANDAS=master archery docker run --no-cache-leaf conda-python-pandas
+ PANDAS=master archery docker run --no-leaf-cache conda-python-pandas
Which translates to:
@@ -110,8 +109,8 @@ Which translates to:
export PANDAS=master
docker-compose pull --ignore-pull-failures conda-cpp
- docker-compose build conda-cpp
docker-compose pull --ignore-pull-failures conda-python
+ docker-compose build conda-cpp
docker-compose build conda-python
docker-compose build --no-cache conda-python-pandas
docker-compose run --rm conda-python-pandas
@@ -144,9 +143,9 @@ can be useful to skip the build phases:
archery docker run conda-python
# since the image is properly built with the first command, there is no
- # need to rebuild it, so manually disable the build phase to spare the
- # build time
- archery docker run --no-build conda-python
+ # need to rebuild it, so manually disable the pull and build phases to
+ # spare some time
+ archery docker run --no-pull --no-build conda-python
**Pass environment variables to the container:**
@@ -173,6 +172,16 @@ The following example starts an interactive ``bash`` session in the container
archery docker run ubuntu-cpp bash
+Docker Volume Caches
+~~~~~~~~~~~~~~~~~~~~
+
+Most of the compose containers have specific directories mounted from the host
+to reuse ``ccache`` and ``maven`` artifacts. These docker volumes are placed
+in the ``.docker`` directory.
+
+To clean up the cache, simply delete one or more directories (or the
+whole ``.docker`` directory).
+
Development
-----------
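The cache cleanup described in the docs change above (deleting directories under `.docker`) can be sketched as a small helper. This is a hypothetical utility for illustration, not part of the repository:

```python
# Hypothetical helper to clear the docker volume caches kept under
# .docker (e.g. maven-cache, amd64-conda-ccache).
import pathlib
import shutil


def clean_docker_caches(root=".docker", only=None):
    """Delete cached volume directories under `root`.

    only: optional collection of directory names to delete; when None,
    every cache directory under `root` is removed. Returns the names
    of the directories that were deleted.
    """
    root = pathlib.Path(root)
    if not root.is_dir():
        return []
    removed = []
    for entry in sorted(root.iterdir()):
        if entry.is_dir() and (only is None or entry.name in only):
            shutil.rmtree(entry)
            removed.append(entry.name)
    return removed
```

For example, `clean_docker_caches(only={"maven-cache"})` would drop only the Maven artifacts while keeping the ccache volumes intact.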