This is an automated email from the ASF dual-hosted git repository.
jeffreyvo pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-rs.git
The following commit(s) were added to refs/heads/main by this push:
new 8ed2b5246d fix: integration / Archery test With other arrows container
ran out of space (#9043)
8ed2b5246d is described below
commit 8ed2b5246d5de68909695f5953aa2811f3f8ea0d
Author: Lanqing Yang <[email protected]>
AuthorDate: Sat Dec 27 09:53:54 2025 +0900
fix: integration / Archery test With other arrows container ran out of
space (#9043)
# Which issue does this PR close?
- Closes #9024.
# Rationale for this change
The CI container starts with 63 GB of its 72 GB disk already used. The
remaining 9.5 GB is barely enough for a cross build in 7 languages, which
leads to CI getting stuck.
A debug step run right after the container initializes shows:
=== CONTAINER DISK USAGE ===
Filesystem Size Used Avail Use% Mounted on
overlay 72G 63G 9.5G 87% /
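The check above can be turned into a guard that a step could act on. This is a minimal sketch, not part of the PR: `use_percent` and `disk_ok` are hypothetical helper names, and the limit passed to `disk_ok` is an illustrative parameter. Both read `df -P`-style output on stdin so they can be tried against canned text.

```shell
# Print the Use% column of the first filesystem row as a bare integer.
# Hypothetical helper; expects `df -P`-style output on stdin.
use_percent() {
  awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Succeed only when usage is below the given limit (in percent).
# Hypothetical helper; the limit is chosen by the caller.
disk_ok() {
  used="$(use_percent)"
  [ "$used" -lt "$1" ]
}
```

A step could then run something like `df -P / | disk_ok 90 || echo "low on disk"`, where the 90 is only an example threshold.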
# What changes are included in this PR?
- Add disk usage monitoring to the build process
- Add a cleanup step that removes unnecessary preinstalled software (frees
about 6 GB)
=== Cleaning up host disk space ===
Disk space before cleanup:
Filesystem Size Used Avail Use% Mounted on
overlay 72G 63G 9.5G 87% /
Disk space after cleanup:
Filesystem Size Used Avail Use% Mounted on
overlay 72G 57G 16G 79% /
- Use shallow clones (only the most recent commit, not the full history)
for the GitHub repos
With these optimizations, 6.1 GB remains free after the build:
=== After Build ===
Filesystem Size Used Avail Use% Mounted on
overlay 72G 66G 6.1G 92% /
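The reported figures can be cross-checked with some rough arithmetic. This is a sketch only: the numbers are copied from the `df -h` output above, which rounds them, and the projection assumes the checkouts and build would consume the same amount of space without the cleanup step.

```shell
# Figures from the df -h output above, in whole GB (df -h rounds them).
size=72
used_before_cleanup=63
used_after_cleanup=57
used_after_build=66

# Space reclaimed by the cleanup step.
freed_by_cleanup=$((used_before_cleanup - used_after_cleanup))
# Space consumed by the checkouts and the build after cleanup.
consumed_by_build=$((used_after_build - used_after_cleanup))
# Projected usage had the cleanup been skipped (assumption: same consumption).
projected_without_cleanup=$((used_before_cleanup + consumed_by_build))

echo "freed by cleanup:          ${freed_by_cleanup}G"
echo "consumed by build:         ${consumed_by_build}G"
echo "projected without cleanup: ${projected_without_cleanup}G of ${size}G"
```

Under these assumptions the projected usage without cleanup is roughly the full disk, which is consistent with the stuck CI runs described above.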
# Are these changes tested?
Tested by GitHub CI.
# Are there any user-facing changes?
No.
---------
Signed-off-by: lyang24 <[email protected]>
---
.github/workflows/integration.yml | 66 +++++++++++++++++++++++++++++++++++----
1 file changed, 60 insertions(+), 6 deletions(-)
diff --git a/.github/workflows/integration.yml b/.github/workflows/integration.yml
index 32c5e78d4f..cc74650812 100644
--- a/.github/workflows/integration.yml
+++ b/.github/workflows/integration.yml
@@ -78,58 +78,112 @@ jobs:
run:
shell: bash
steps:
+ - name: Monitor disk usage - Initial
+ run: |
+ echo "=== Initial Disk Usage ==="
+ df -h /
+ echo ""
+
+ - name: Remove unnecessary preinstalled software
+ run: |
+ echo "=== Cleaning up host disk space ==="
+ echo "Disk space before cleanup:"
+ df -h /
+
+ # Clean apt cache
+ apt-get clean || true
+
+ # Remove GitHub Actions tool cache
+ rm -rf /__t/* || true
+
+ # Remove large packages from host filesystem (mounted at /host/)
+ rm -rf /host/usr/share/dotnet || true
+ rm -rf /host/usr/local/lib/android || true
+ rm -rf /host/usr/local/.ghcup || true
+ rm -rf /host/opt/hostedtoolcache/CodeQL || true
+
+ echo ""
+ echo "Disk space after cleanup:"
+ df -h /
+ echo ""
+
# This is necessary so that actions/checkout can find git
- name: Export conda path
run: echo "/opt/conda/envs/arrow/bin" >> $GITHUB_PATH
# This is necessary so that Rust can find cargo
- name: Export cargo path
run: echo "/root/.cargo/bin" >> $GITHUB_PATH
- - name: Check rustup
- run: which rustup
- - name: Check cmake
- run: which cmake
+
+ # Checkout repos (using shallow clones with fetch-depth: 1)
- name: Checkout Arrow
uses: actions/checkout@v6
with:
repository: apache/arrow
submodules: true
- fetch-depth: 0
+ fetch-depth: 1
- name: Checkout Arrow Rust
uses: actions/checkout@v6
with:
path: rust
submodules: true
- fetch-depth: 0
+ fetch-depth: 1
- name: Checkout Arrow .NET
uses: actions/checkout@v6
with:
repository: apache/arrow-dotnet
path: dotnet
+ fetch-depth: 1
- name: Checkout Arrow Go
uses: actions/checkout@v6
with:
repository: apache/arrow-go
path: go
+ fetch-depth: 1
- name: Checkout Arrow Java
uses: actions/checkout@v6
with:
repository: apache/arrow-java
path: java
+ fetch-depth: 1
- name: Checkout Arrow JavaScript
uses: actions/checkout@v6
with:
repository: apache/arrow-js
path: js
+ fetch-depth: 1
- name: Checkout Arrow nanoarrow
uses: actions/checkout@v6
with:
repository: apache/arrow-nanoarrow
path: nanoarrow
+ fetch-depth: 1
+
+ - name: Monitor disk usage - After checkouts
+ run: |
+ echo "=== After Checkouts ==="
+ df -h /
+ echo ""
+
- name: Build
run: conda run --no-capture-output ci/scripts/integration_arrow_build.sh $PWD /build
+
+ - name: Monitor disk usage - After build
+ if: always()
+ run: |
+ echo "=== After Build ==="
+ df -h /
+ echo ""
+
- name: Run
run: conda run --no-capture-output ci/scripts/integration_arrow.sh $PWD /build
+ - name: Monitor disk usage - After tests
+ if: always()
+ run: |
+ echo "=== After Tests ==="
+ df -h /
+ echo ""
+
# test FFI against the C-Data interface exposed by pyarrow
pyarrow-integration-test:
name: Pyarrow C Data Interface