rusackas opened a new pull request, #40880:
URL: https://github.com/apache/superset/pull/40880

   ### SUMMARY
   
   Groundwork for fixing CI's Docker Hub service-pull flakes **without breaking fork PRs**. Breaking fork PRs is the failure mode that #40875 hit and #40879 reverts.
   
   **Root cause of the #40875 fork breakage:** adding `credentials:` to a 
`services:` container looks safe, but on fork PRs the 
`DOCKERHUB_USER`/`DOCKERHUB_TOKEN` secrets are unavailable, so the templated 
values resolve to empty strings. GitHub Actions validates the `credentials:` 
block at job-setup time and rejects an empty `username`/`password` with a hard 
template error:
   
   ```
   The template is not valid. superset-python-integrationtest.yml
   (Line 55,56,69,70): Unexpected value ''
   ```
   
   So every fork PR's Python-Integration / E2E / Presto-Hive job died at **"Set 
up job"**. Empty creds do **not** fall back to anonymous pulls — they fail to 
parse. (See run 
[27179055813](https://github.com/apache/superset/actions/runs/27179055813/job/80234090697)
 on a fork.)
   
   **This PR (groundwork):** add a scheduled/dispatchable workflow that mirrors 
the four Docker Hub **service-container** images CI relies on — 
`postgres:17-alpine`, `redis:7-alpine`, `mysql:8.0`, 
`starburstdata/presto:350-e.6` — into this repo's **GHCR** namespace under a 
`ci/` prefix (`ghcr.io/apache/superset/ci/<name>`).
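   
   A minimal sketch of what such a mirror workflow could look like, for orientation only; it is not the file this PR adds. The cron cadence, the job and matrix key names, the plain `docker login` step, and the use of `docker buildx imagetools create` to copy manifests registry-to-registry are all assumptions:
   
   ```yaml
   # Hypothetical sketch; names and schedule are illustrative assumptions.
   name: Mirror service images to GHCR
   
   on:
     workflow_dispatch:
     schedule:
       - cron: "0 6 * * 1"  # assumed cadence: Mondays 06:00 UTC
   
   permissions:
     contents: read
     packages: write  # lets GITHUB_TOKEN push to ghcr.io
   
   jobs:
     mirror:
       runs-on: ubuntu-latest
       strategy:
         fail-fast: false
         matrix:
           include:
             - source: postgres:17-alpine
               target: postgres:17-alpine
             - source: redis:7-alpine
               target: redis:7-alpine
             - source: mysql:8.0
               target: mysql:8.0
             - source: starburstdata/presto:350-e.6
               target: presto:350-e.6
       steps:
         - name: Log in to GHCR
           env:
             GHCR_USER: ${{ github.actor }}
             GHCR_TOKEN: ${{ secrets.GITHUB_TOKEN }}
           run: echo "$GHCR_TOKEN" | docker login ghcr.io -u "$GHCR_USER" --password-stdin
   
         - name: Copy image from Docker Hub to GHCR
           env:
             SRC: ${{ matrix.source }}
             DST: ghcr.io/${{ github.repository }}/ci/${{ matrix.target }}
           run: |
             # Copies the manifest (all platforms) without a local pull.
             docker buildx imagetools create --tag "$DST" "$SRC"
             # Record the copy in the job summary.
             echo "- docker.io/$SRC → $DST" >> "$GITHUB_STEP_SUMMARY"
   ```
   
   Copying the manifest rather than running `pull`/`tag`/`push` would keep every platform variant of the upstream tag, assuming buildx is available on the runner. The sketch reads from Docker Hub anonymously; whether the real workflow authenticates there is not stated here.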
   
   **Why GHCR fixes it for everyone:** public GHCR images are pulled **without** Docker Hub's anonymous rate limit **and without any credentials**. Once CI points at the mirrored copies, the consuming workflows can drop their `credentials:` blocks entirely. With no `credentials:` block there is no empty-secret parse error, so **forks work unchanged**, and same-repo PRs and `master` builds stop flaking on Docker Hub pulls.
   
   This PR adds *only* the mirror workflow. The repoint of the 
`services.*.image` refs is the follow-up (ready-to-go diff below), staged so CI 
never points at images that don't exist yet.
   
   ### ⚠️ One-time bootstrap (maintainer) — do these before the repoint
   
   - [ ] Merge this PR so the workflow is on the default branch 
(`workflow_dispatch` requires that).
   - [ ] **Run** "Mirror service images to GHCR" once (Actions → Run workflow).
   - [ ] **Confirm `ghcr.io/apache/superset/*` push works under ASF infra.** 
This is the key unknown — the first run will tell us whether the repo's 
`GITHUB_TOKEN` has `packages: write` to the apache GHCR namespace. If it 
doesn't, that's the blocker to resolve with ASF infra (apache/airflow et al. 
publish to GHCR, so it's likely fine).
   - [ ] Set the four mirrored packages' visibility to **public** (this is what 
lets fork CI pull without auth).
   - [ ] Open the repoint follow-up (diff below).
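   
   Once the packages are public, the "pull without auth" step could be checked with a throwaway job along these lines. This is a hypothetical sketch: the workflow and job names are illustrative, and it assumes the runner's Docker client has no stored GHCR login:
   
   ```yaml
   # Hypothetical smoke check; not part of this PR.
   name: Check public GHCR mirrors
   
   on:
     workflow_dispatch:
   
   permissions: {}  # no token scopes, so a successful pull is truly anonymous
   
   jobs:
     anonymous-pull:
       runs-on: ubuntu-latest
       steps:
         - name: Pull each mirror without logging in
           run: |
             for ref in postgres:17-alpine redis:7-alpine mysql:8.0 presto:350-e.6; do
               docker pull "ghcr.io/apache/superset/ci/$ref"
             done
   ```
   
   A failed pull here would suggest a package is still private, which is the condition that would break fork CI after the repoint.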
   
   ### Follow-up repoint (PR B) — for reference, NOT in this PR
   
   For each `services:` image across `superset-e2e.yml`, 
`superset-python-integrationtest.yml`, `superset-python-presto-hive.yml`:
   
   ```diff
          postgres:
   -        image: postgres:17-alpine
   +        image: ghcr.io/apache/superset/ci/postgres:17-alpine
   -        credentials:
   -          username: ${{ secrets.DOCKERHUB_USER }}
   -          password: ${{ secrets.DOCKERHUB_TOKEN }}
   ```
   
   (…same for `redis:7-alpine` → `ci/redis:7-alpine`, `mysql:8.0` → 
`ci/mysql:8.0`, `starburstdata/presto:350-e.6` → `ci/presto:350-e.6`. Every 
`credentials:` block on a mirrored service is removed.)
   
   ### Out of scope
   
   The `bde2020/hive-metastore-postgresql` image is pulled via `docker compose` 
(not a `services:` block), so it never hit the parse error. Mirroring it is a 
separate, optional follow-up.
   
   ### TESTING INSTRUCTIONS
   
   The mirror workflow is exercised by running it from the Actions tab; its job 
summary lists each `docker.io/... → ghcr.io/...` copy. The repoint is validated 
by the existing integration/E2E/Presto-Hive suites once it lands.
   
   ### ADDITIONAL INFORMATION
   
   - [ ] Has associated issue:
   - [ ] Required feature flags:
   - [ ] Changes UI
   - [ ] Includes DB Migration
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

