RussellSpitzer opened a new pull request, #16259: URL: https://github.com/apache/iceberg/pull/16259
Includes 186 bats unit tests covering all shared libraries and end-to-end dry-runs of every script.

### Rationale for this change

The current Iceberg release process is documented in [How to Release](https://iceberg.apache.org/how-to-release/) and driven by hand plus the existing `dev/source-release.sh`, `dev/stage-binaries.sh`, and `dev/check-license` scripts. Most of it (Nexus close/release/drop, SVN promotion, old-release cleanup, dist/dev → dist/release move, final tag, GitHub Release, vote/announce emails) is still manual.

This PR adds a semi-automated release pipeline modelled after the [Apache Polaris release tooling](https://github.com/apache/polaris/tree/main/releasey) and the recent [Apache Parquet PR #3548](https://github.com/apache/parquet-java/pull/3548), adapted to Iceberg's Gradle build and existing release scripts. It does not change `dev/source-release.sh`, `dev/stage-binaries.sh`, or `site/docs/how-to-release.md`; the manual flow remains a fallback.

### What changes are included in this PR?

Scripts (`release/`):

- `prepare-rc.sh` (full pre-vote flow): branch validation, RC tag, GitHub CI check, source tarball + GPG signing, SVN dist/dev staging, multi-axis Gradle convenience-binary publish, Nexus staging-repo close, GitHub pre-release, `[VOTE]` email template.
- `publish-release.sh` (full post-vote flow): HEAD-vs-RC-tag guard, Nexus staging-repo verification, `svn mv` to dist/release, same-`MAJOR.MINOR` patch-release cleanup, final tag push, Nexus release to Maven Central, GitHub Release, `[ANNOUNCE]` email template.
- `cancel-rc.sh` (failed-vote rollback): same staging-repo verification, Nexus drop, dist/dev SVN cleanup, `[RESULT][VOTE]` failure email template. RC git tag is left in place per ASF policy.

Shared libraries (`release/libs/`):

- `_constants.sh`: config + version regexes.
The Scala/Flink/Spark/Kafka matrix is read directly from `gradle.properties` (`systemProp.knownXxxVersions` / `defaultScalaVersion`) so the pipeline cannot drift from the build.
- `_log.sh`, `_exec.sh`, `_version.sh`, `_github.sh`.
- `_nexus.sh`: Nexus 2 staging API helpers; `nexus_verify_staging_repo` checks profile / state / `iceberg-core-<version>.pom` presence / description.
- `_svn.sh`: `svn` helpers that pipe `SVN_PASSWORD` on stdin via `--password-from-stdin` so it never appears in argv (and is not visible via `/proc/<pid>/cmdline`); plus `filter_superseded_patch_releases` for the same-`MAJOR.MINOR` cleanup.
- `_gradle.sh`: `verify_jdk_17`, `build_source_tarball` (mirrors `dev/source-release.sh`), and `stage_convenience_binaries` (two-pass publish, see "Safety" below).

GitHub Actions workflows:

- `release-prepare-rc.yml`, `release-publish.yml`, `release-cancel-rc.yml`: dispatch-only entry points; default to dry-run.
- `ci-release-scripts.yml`: runs bats + shellcheck on every PR/push that touches `release/`.

All three release workflows share a single `concurrency` group (`iceberg-release`, `cancel-in-progress: false`) so two dispatches cannot overlap on git tags, the Nexus staging repo, or SVN paths.

### Are these changes tested?

Locally only:

- `bats release/tests` → 186 tests pass (constants, log, exec, version, github, nexus, gradle, svn, end-to-end branch management, `publish-release`, `cancel-rc`).
- `shellcheck release/*.sh release/libs/*.sh` → clean.
- Dry-run smoke of all three scripts succeeds end-to-end against a hermetic fixture.

**To actually exercise this in production we need an Infra ticket to land the appropriate secrets on `apache/iceberg`:**

- `NEXUS_USER` / `NEXUS_PW` (Apache Nexus, already used by `publish-snapshot.yml`).
- `ICEBERG_SVN_DEV_USERNAME` / `ICEBERG_SVN_DEV_PASSWORD` (`dist.apache.org` credentials).
- A `maven-publish` GitHub environment guarding all three release workflows.

### Are there any user-facing changes?

No.
The release pipeline is invisible to end users; this only affects release managers and contributors. `site/docs/how-to-release.md` is intentionally **not** updated in this PR; we want the workflow to land first, then update the docs in a follow-up PR once the secrets are in place and a real RC has been driven through it.

---

## Release Workflow

This PR layers three GitHub Actions workflows + locally-runnable Bash scripts on top of the existing manual flow. All workflows default to **dry-run mode**; the release manager has to explicitly uncheck the `Dry run` checkbox before any external service is touched.

### 1. Prepare RC (Pre-Vote)

The release manager launches the **"Release - Prepare Release Candidate"** workflow (`release-prepare-rc.yml`) via `workflow_dispatch` with:

- **version**: e.g. `1.10.0`
- **rc_number** *(optional)*: auto-detected from existing tags if empty
- **dry_run**: defaults to `true`

Branching follows Iceberg's existing convention: **X.Y.0 RCs are tagged from `main`**; **X.Y.Z (Z ≥ 1) requires the `X.Y.x` branch to already exist on the remote** (the script does not auto-create branches; the release manager creates `X.Y.x` after the X.Y.0 release per current practice).
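The branching convention and the RC-number auto-detection can be sketched in a few lines of Bash. This is a minimal illustration, not the code in `prepare-rc.sh`: the function names `release_ref_for_version` and `next_rc_number` are hypothetical, and the sketch does not check that the `X.Y.x` branch actually exists on the remote.

```shell
# Hypothetical helpers for illustration only; not taken from release/libs/.

# Map a release version to the ref an RC would be cut from:
#   X.Y.0          -> origin/main
#   X.Y.Z (Z >= 1) -> origin/X.Y.x
release_ref_for_version() {
  local version="$1"
  if [[ ! "$version" =~ ^([0-9]+)\.([0-9]+)\.([0-9]+)$ ]]; then
    echo "invalid version: ${version}" >&2
    return 1
  fi
  local major="${BASH_REMATCH[1]}"
  local minor="${BASH_REMATCH[2]}"
  local patch="${BASH_REMATCH[3]}"
  if (( 10#${patch} == 0 )); then
    echo "origin/main"
  else
    echo "origin/${major}.${minor}.x"
  fi
}

# Read tag names on stdin (one per line) and print the next RC number for
# the given version: highest existing rcN plus one, or 1 if none exist.
next_rc_number() {
  local version="$1" max=0 tag n
  while IFS= read -r tag; do
    # Only consider tags of the form apache-iceberg-<version>-rcN.
    if [[ "$tag" == "apache-iceberg-${version}-rc"* ]]; then
      n="${tag##*-rc}"
      if [[ "$n" =~ ^[0-9]+$ ]] && (( 10#${n} > max )); then
        max=$(( 10#${n} ))
      fi
    fi
  done
  echo $(( max + 1 ))
}
```

Assuming a local clone with the RC tags fetched, the second helper could be fed from `git tag --list 'apache-iceberg-1.10.0-rc*' | next_rc_number 1.10.0`. Reading tags from stdin keeps the function testable without a git repository.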

| # | Step in `prepare-rc.sh` | Replaces (docs / existing script) |
|---|--------------------------|-----------------------------------|
| 0 | Validate inputs, prerequisites (`gpg`, `svn`, JDK 17, gradlew) | Manual |
| 1 | Auto-detect RC number from `apache-iceberg-X.Y.Z-rc*` tags | Manual decision |
| 2 | Verify GitHub CI is green at HEAD (`gh api .../check-runs`) | Manual UI check |
| 3 | Resolve release commit (X.Y.0 → `origin/main`; X.Y.Z → `origin/X.Y.x`); align workspace via `git checkout --detach` | Manual |
| 4 | Create + push `apache-iceberg-X.Y.Z-rcN` tag | `dev/source-release.sh` |
| 5 | Build + GPG-sign source tarball + `.sha512` | `dev/source-release.sh` |
| 6 | SVN-stage tarball/asc/sha512 to `dist/dev/iceberg/<rc>` | `dev/source-release.sh` |
| 7 | Publish convenience binaries to Nexus across the Scala/Flink/Spark/Kafka matrix | `dev/stage-binaries.sh` |
| 8 | Close the Nexus staging repository | Manual Nexus UI |
| 9 | Create a GitHub pre-release at the RC tag | Manual GitHub UI |
| 10 | Emit `[VOTE]` email template into the workflow step summary | Manual email composition |

The release manager copies the `[VOTE]` email out of the step summary and **manually sends it** to `[email protected]`.

### 2. Vote

The community votes for at least 72 hours. The release manager tallies results manually.

---

### If the vote fails → Cancel RC

The release manager launches the **"Release - Cancel Release Candidate"** workflow (`release-cancel-rc.yml`) with:

- **version**: e.g. `1.10.0`
- **rc_number**: the RC number to cancel
- **staging_repo_id**: e.g.
`orgapacheiceberg-1234`
- **allow_description_mismatch** *(optional)*: recovery override
- **dry_run**: defaults to `true`

| # | Step in `cancel-rc.sh` | Replaces |
|---|-------------------------|----------|
| 0 | Verify the staging repository (see "Safety" below) | Manual visual check in Nexus UI |
| 1 | Drop the Nexus staging repo | Manual Nexus UI |
| 2 | Delete `dist/dev/iceberg/<rc>` from SVN | Manual `svn rm` |
| 3 | Emit `[RESULT][VOTE]` failure email template | Manual email composition |

The RC git tag is preserved per ASF release policy.

---

### If the vote passes → Publish Release

The release manager launches the **"Release - Publish After Vote"** workflow (`release-publish.yml`) with:

- **version**: e.g. `1.10.0`
- **rc_number** *(optional)*: RC that passed the vote (auto-detects latest; rejects older RCs)
- **staging_repo_id**: e.g. `orgapacheiceberg-1234`
- **allow_description_mismatch** *(optional)*
- **dry_run**: defaults to `true`

| # | Step in `publish-release.sh` | Replaces |
|---|-------------------------------|----------|
| 0 | Verify HEAD == the RC tag commit | Manual checkout |
| 1 | Verify the staging repository (see "Safety" below) | Manual visual check in Nexus UI |
| 2 | `svn mv` from `dist/dev/iceberg/<rc>` to `dist/release/iceberg/apache-iceberg-X.Y.Z` | Manual `svn mv` |
| 3 | Remove superseded patch releases on the same `MAJOR.MINOR` line (e.g. publishing 1.9.2 removes 1.9.0 / 1.9.1; older minor lines are left for the standard `archive.apache.org` flow). `--keep-old-releases` skips this. | Manual / not consistently done today |
| 4 | Create + push `apache-iceberg-X.Y.Z` final tag | Manual `git tag` |
| 5 | Release the Nexus staging repository to Maven Central | Manual Nexus UI |
| 6 | Create the GitHub Release at the final tag | Manual GitHub UI |
| 7 | Emit `[ANNOUNCE]` email template + reminders for `revapi`, issue templates, `doap.rdf`, versioned docs/Javadoc, site update | Manual checklist |

Iceberg has **no version-bump commit** on the release branch (the `gitVersion` plugin derives the development version from the latest tag), so unlike Parquet there's no `release:update-versions` step.

### Safety: Staging Repository Verification

Before any destructive Nexus action (promote in publish, drop in cancel), both scripts verify the staging repo against the requested `(version, rc_number)` pair:

1. **Profile** is `org.apache.iceberg` *(hard fail)*
2. **State** is `closed` *(hard fail)*
3. **Artifact** `iceberg-core-<version>.pom` exists in the repo *(hard fail)*
4. **Description** contains `Apache Iceberg <version> RC<rc>` *(warn; bypass with `--allow-description-mismatch` / the workflow input)*

This closes a gap in the current manual flow: typing a staging-repo ID into a form field replaces the human "look at the artifact tree in Nexus" check, so the script does that look-up explicitly.

### Safety: Convenience-binary publish matrix

The Scala/Flink/Spark/Kafka matrix is read at module-load time from `gradle.properties` (`systemProp.knownXxxVersions` / `defaultScalaVersion`). Two Gradle passes:

1. **Primary-Scala main pass**: full Flink/Kafka matrix + the Spark versions that still support Scala 2.12 (i.e. Spark 3.x). Spark 4+ is intentionally excluded because it dropped Scala 2.12 support.
2. **Secondary-Scala (Spark-only) sweep**: one `./gradlew` invocation per Spark version in `knownSparkVersions`, publishing the `iceberg-spark-<v>_2.13`, `iceberg-spark-extensions-<v>_2.13`, and `iceberg-spark-runtime-<v>_2.13` modules.
This covers Spark 3.x (adding the 2.13 variant) and Spark 4.x (which is published only here).

Net artifact set: `iceberg-spark-{3.4,3.5}_{2.12,2.13}` plus `iceberg-spark-{4.0,4.1}_2.13`.

A bats drift guard re-parses `gradle.properties` and asserts every Gradle invocation in the dry-run matches; another guard asserts no `gradlew` line ever combines `-DscalaVersion=2.12` with `-DsparkVersions=...4.x`.

### Safety: Credentials

- **SVN**: every `svn` call goes through `release/libs/_svn.sh` helpers that pipe `SVN_PASSWORD` on stdin via `--password-from-stdin`. Never on argv.
- **Nexus REST**: credentials are passed to `curl` via `-K` on stdin.
- **Gradle publish**: credentials are exported as `ORG_GRADLE_PROJECT_mavenUser` / `mavenPassword` env vars rather than `-PmavenUser=` / `-PmavenPassword=`. Never on argv.
- `_redact_secrets` covers every secret env var the workflows inject and is exercised by table-driven bats tests.

---

## What this PR is _not_

- **No changes to `dev/source-release.sh`, `dev/stage-binaries.sh`, or `dev/check-license`.** The manual flow remains a supported fallback.
- **No changes to `site/docs/how-to-release.md`.** Documentation will be updated in a follow-up PR once the workflows have driven a real RC end to end.
- **No new third-party dependencies.** Only requires `bash`, `bats`, `shellcheck`, `gh`, `svn`, `curl`, `jq`, `gpg`, `gradle`/`./gradlew`, Java 17, all already available on `ubuntu-24.04` runners or installed via `apt`.

## Follow-ups

1. **Apache Infra ticket** to land `NEXUS_USER` / `NEXUS_PW` / `ICEBERG_SVN_DEV_USERNAME` / `ICEBERG_SVN_DEV_PASSWORD` and the `maven-publish` environment on `apache/iceberg`.
2. Drive a real RC end-to-end (with `dry_run` unchecked) on the next release cycle.
3. **Documentation PR**: update `site/docs/how-to-release.md` to describe the semi-automated flow alongside the manual one.

-- This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
