rusackas opened a new pull request, #41068:
URL: https://github.com/apache/superset/pull/41068

   ### SUMMARY
   
   The `Build & publish docker images` workflow fails intermittently on master. 
I classified the 18 most recent failures by their *fatal* error: ~83% are 
infra/runner flakiness, and the largest non-network cluster is **`no space left 
on device`**.
   
   Root cause: the `docker-compose sanity check` step runs `docker compose 
build superset-init`, which is meant to "reuse the CACHED image built in the 
previous steps" but actually rebuilds the image **from scratch**. 
`docker-compose.yml` builds with `cache_from: apache/superset-cache:<tag>`, and 
that registry cache image is consistently **not found** (`failed to configure 
registry cache importer: docker.io/apache/superset-cache:3.10-slim-trixie: not 
found`). The uncached full rebuild then exhausts the ~14 GB of free space on 
the runner's root disk:
   
   ```
   failed to solve: ... write /app/.venv/.../_duckdb...so: no space left on device
   ```
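
A quick way to confirm the missing-cache diagnosis is to ask the registry whether the cache tag exists at all. The following is a minimal sketch, not part of this PR: `cache_image_exists` and `report_cache` are hypothetical helper names, and it assumes a Docker CLI with `docker manifest inspect` available.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Return 0 if the registry serves a manifest for the given image reference.
cache_image_exists() {
  docker manifest inspect "$1" >/dev/null 2>&1
}

# Hypothetical helper: report whether a build can expect a registry cache hit.
report_cache() {
  local ref="$1"
  if cache_image_exists "$ref"; then
    echo "cache image found: $ref"
  else
    echo "cache image missing: $ref (expect an uncached rebuild)"
  fi
}

# Example call, using the tag from the error message above:
#   report_cache "apache/superset-cache:3.10-slim-trixie"
```

If the reference is reported missing, the `cache_from` entry cannot contribute any layers and the compose build will start from scratch.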
   
   This PR removes the large preinstalled toolchains the build doesn't use 
(dotnet, android, ghc, ghcup, CodeQL, boost) right after checkout in both the 
`docker-build` and `docker-compose-image-tag` jobs, recovering roughly 20 GB or 
more of headroom. It's a plain shell step: no new third-party action 
dependency, and no pinning/zizmor concerns.
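
For readers unfamiliar with this kind of cleanup, the step can be sketched roughly as below. This is an illustration under assumptions, not the PR's actual diff: the paths are the usual install locations for these toolchains on GitHub-hosted Ubuntu runners, and `reclaim_disk`, `UNUSED_TOOLCHAINS`, and the `SUDO` variable are hypothetical names chosen for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# ASSUMPTION: typical locations of the preinstalled toolchains on a
# GitHub-hosted Ubuntu runner (dotnet, android, ghc, ghcup, CodeQL, boost).
UNUSED_TOOLCHAINS=(
  /usr/share/dotnet
  /usr/local/lib/android
  /opt/ghc
  /usr/local/.ghcup
  /opt/hostedtoolcache/CodeQL
  /usr/local/share/boost
)

# Remove each path passed as an argument, logging disk usage before and after.
# Paths that do not exist are skipped. SUDO defaults to "sudo"; setting it to
# an empty string runs rm without elevation (useful when trying this locally).
reclaim_disk() {
  local path
  echo "Disk before cleanup:"
  df -h / || true
  for path in "$@"; do
    if [ -e "$path" ]; then
      ${SUDO-sudo} rm -rf -- "$path"
    fi
  done
  echo "Disk after cleanup:"
  df -h / || true
}

# In a workflow step one would call:
#   reclaim_disk "${UNUSED_TOOLCHAINS[@]}"
```

Keeping the path list in one array makes it easy to audit what is deleted, and the existence check means the step does not fail if a runner image stops shipping one of the toolchains.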
   
   This is a targeted mitigation for the DISK cluster. The remaining clusters 
(Docker Hub registry network errors on build pull/push, occasional websocket 
`npm run build` SIGSEGV) are tracked separately; restoring/refreshing the 
`apache/superset-cache` image so the sanity check actually hits cache is the 
deeper fix and is left for a follow-up.
   
   ### TESTING INSTRUCTIONS
   
   CI: the `docker-build (dev)` and `docker-compose-image-tag` jobs log `Disk 
before/after cleanup` and should no longer hit `no space left on device`.
   
   ### ADDITIONAL INFORMATION
   
   - [ ] Has associated issue:
   - [ ] Required feature flags:
   - [ ] Changes UI
   - [ ] Includes DB Migration (follow approval process in 
[SIP-59](https://github.com/apache/superset/issues/13351))
     - [ ] Migration is atomic, supports rollback & is backwards-compatible
     - [ ] Confirm DB migration upgrade and downgrade tested
     - [ ] Runtime estimates and downtime expectations provided
   - [ ] Introduces new feature or API
   - [ ] Removes existing feature or API
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

