andygrove opened a new pull request, #87:
URL: https://github.com/apache/datafusion-java/pull/87
## Which issue does this PR close?
- Continues #33 (the binary release JAR build pipeline deferred from #77:
"Cross-build CI matrix, Sonatype/Maven Central wiring, and release docs are
deferred to follow-up PRs."). Not a full close: the GitHub Actions matrix and
Maven Central `distributionManagement` wiring are still open follow-ups.
## Rationale for this change
PR #77 added the runtime side of the fat-JAR design: `NativeLibraryLoader`
reads `org/apache/datafusion/<os>/<arch>/lib<datafusion_jni>.<ext>` from the
JAR and extracts it on demand. But nothing in the repo actually produces a JAR
with more than one platform's lib inside — `core/pom.xml`'s host-activated
profile bundles only the host's lib, and the existing release scripts
(`dev/release/create-tarball.sh` etc.) only do source tarballs. A consumer
pulling the artifact from Maven Central today gets a JAR that works on the
platform it was built on and nowhere else.
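
As a rough illustration of the resource layout the loader expects, the sketch below maps `uname`-style OS and machine names to a path inside the JAR. The function name `native_resource_path` is hypothetical, and the mapping from `uname` values is an assumption inferred from the four paths listed under "Are these changes tested?"; the real lookup lives in the Java `NativeLibraryLoader` and may normalize names differently.

```shell
# Hypothetical helper: map `uname -s` / `uname -m` style values to the
# resource path a multi-platform JAR would carry for that platform.
native_resource_path() (
    os_name=$1
    machine=$2
    case "$os_name" in
        Linux)  os=linux;  ext=so ;;
        Darwin) os=darwin; ext=dylib ;;
        *) echo "unsupported OS: $os_name" >&2; exit 1 ;;
    esac
    # Directory names follow the four JAR entries listed in this PR.
    case "$os/$machine" in
        linux/x86_64)  arch=amd64 ;;
        linux/aarch64) arch=aarch64 ;;
        darwin/x86_64) arch=x86_64 ;;
        darwin/arm64)  arch=aarch64 ;;
        *) echo "unsupported arch: $machine" >&2; exit 1 ;;
    esac
    echo "org/apache/datafusion/$os/$arch/libdatafusion_jni.$ext"
)
```

Calling `native_resource_path "$(uname -s)" "$(uname -m)"` prints the entry the current host would need; a JAR built by the host-activated profile contains only that one entry.
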
This PR adds the build side: a release-manager script that drives two Docker
containers for the Linux arches and the host's own Rust toolchain for the macOS
arches, assembles all four `.so`/`.dylib` files into a single JAR, and installs
it into a temporary local Maven repo. A second script (not yet exercised in CI)
signs and uploads that repo to Apache Nexus staging.
The structure mirrors `datafusion-comet`'s release tooling
(`dev/release/comet-rm/Dockerfile`, `build-release-comet.sh`,
`publish-to-maven.sh`), simplified for this project: single module pair (no
Spark/Scala matrix) and macOS libs built natively on the RM's macOS host (no
OSXCross / Xcode SDK plumbing).
## What changes are included in this PR?
New release tooling under `dev/release/`:
- **`datafusion-java-rm/Dockerfile`** — Ubuntu 20.04 + Rust + protoc
(arch-aware download). Single-stage, no OSXCross. Built twice via
`--platform=linux/{amd64,arm64}`.
- **`datafusion-java-rm/build-native-libs.sh`** — runs inside the container:
clones the repo, `cargo build --release`. Container platform dictates target
arch.
- **`build-release.sh`** — host orchestrator. Detects host arch via `uname
-m`. Cleans + rebuilds both Docker images, runs each, docker-cp's the linux
libs into `core/target/classes/org/apache/datafusion/linux/{amd64,aarch64}/`,
then on the host runs `cargo build --release` (host arch native) and `cargo
build --release --target <other>-apple-darwin` (cross to the other arch) and
copies both `.dylib` files into
`core/target/classes/org/apache/datafusion/darwin/{x86_64,aarch64}/`. Finishes
with `./mvnw -Ddatafusion.native.profile=release -DskipTests install` into a
temp local Maven repo whose path is printed at the end. Pre-cleans leftover
named containers; traps SIGINT/SIGTERM/EXIT.
- **`publish-to-maven.sh`** — Nexus staging upload: creates staging repo via
REST, signs every artifact with GPG, uploads with `curl --fail-with-body`,
closes the staging repo. Not exercised in the dry run.
- **`README.md`** — new "Binary Release: Multi-Platform JAR" section after
the existing source-tarball flow.
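
The orchestration in `build-release.sh` can be pictured roughly as follows. This is a dry-run sketch, not the script itself: it only prints commands, the function names are hypothetical, and the image tag and Docker build context path are placeholders assumed for illustration.

```shell
# Hypothetical helper: pick the Rust target triple for the macOS arch the
# host is NOT running on, given the output of `uname -m`.
other_darwin_target() (
    case "$1" in
        arm64|aarch64) echo "x86_64-apple-darwin" ;;
        x86_64)        echo "aarch64-apple-darwin" ;;
        *) echo "unsupported host arch: $1" >&2; exit 1 ;;
    esac
)

# Print (do not run) the main build commands for a given host arch.
print_build_plan() (
    host_arch=$1
    cross_target=$(other_darwin_target "$host_arch") || exit 1
    for platform in amd64 arm64; do
        # Tag and context directory are assumed placeholders.
        echo "docker build --platform=linux/$platform -t datafusion-java-rm:$platform dev/release/datafusion-java-rm"
    done
    echo "cargo build --release"
    echo "cargo build --release --target $cross_target"
    echo "./mvnw -Ddatafusion.native.profile=release -DskipTests install"
)
```

On an Apple Silicon host, `print_build_plan "$(uname -m)"` lists the two image builds, the native `aarch64` build, the cross build for `x86_64-apple-darwin`, and the final Maven install. The container runs, `docker cp` steps, and cleanup traps are omitted here.
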
Two follow-on fixes uncovered by the dry run:
- **`pom.xml`** — add `.github/**` and `dev/release/rat_exclude_files.txt`
to the `apache-rat-plugin` excludes. The plugin runs at the `verify` phase,
which `mvn install` triggers but `make test` does not. These files were already
exempt in `dev/release/rat_exclude_files.txt` (used by the source-tarball
flow's `check-rat-report.py`); this brings the pom-level RAT check into
alignment.
- **`dev/release/build-release.sh`** — pass
`-Ddatafusion.native.profile=release` to `./mvnw install`. `core/pom.xml`
defaults this property to `debug`, so without the override the antrun
`copy-native-lib` step looks for the host's lib under `native/target/debug/`
and fails the `<fail>` precondition check.
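
To make the second fix concrete, here is a minimal sketch of the precondition in shell form. The function names are hypothetical; the `native/target/debug/` location comes from the description above, while `native/target/release/` is assumed from Cargo's usual output layout for `--release` builds.

```shell
# Hypothetical helper: directory where the host library is expected for a
# given value of datafusion.native.profile (debug is the pom default).
native_lib_dir() (
    profile=${1:-debug}
    case "$profile" in
        debug|release) echo "native/target/$profile" ;;
        *) echo "unknown profile: $profile" >&2; exit 1 ;;
    esac
)

# Fail early if no libdatafusion_jni.* exists for the chosen profile,
# similar in spirit to the antrun <fail> check.
require_native_lib() (
    dir=$(native_lib_dir "$1") || exit 1
    for f in "$dir"/libdatafusion_jni.*; do
        [ -e "$f" ] && exit 0
    done
    echo "no libdatafusion_jni.* under $dir for profile '$1'" >&2
    exit 1
)
```

After a release-only build, `require_native_lib debug` fails while `require_native_lib release` succeeds, which is the mismatch the `-Ddatafusion.native.profile=release` override avoids.
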
## Are these changes tested?
End-to-end dry run on macOS aarch64 produced
`datafusion-java-0.1.0-SNAPSHOT.jar` (175 MB) containing exactly the four
expected resource entries:
```
org/apache/datafusion/linux/amd64/libdatafusion_jni.so        ELF 64-bit LSB x86-64
org/apache/datafusion/linux/aarch64/libdatafusion_jni.so      ELF 64-bit LSB ARM aarch64
org/apache/datafusion/darwin/x86_64/libdatafusion_jni.dylib   Mach-O 64-bit x86_64
org/apache/datafusion/darwin/aarch64/libdatafusion_jni.dylib  Mach-O 64-bit arm64
```
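
A check along these lines could be repeated by anyone reproducing the dry run. The sketch below is an assumption about how to automate it, not part of this PR: the hypothetical `check_native_entries` reads a JAR listing on stdin and reports any of the four expected entries that are missing.

```shell
# Hypothetical helper: read JAR entry names on stdin (one per line) and
# confirm all four native libraries are present. Missing paths go to stderr
# and the function returns non-zero.
check_native_entries() (
    listing=$(cat)
    missing=0
    for entry in \
        org/apache/datafusion/linux/amd64/libdatafusion_jni.so \
        org/apache/datafusion/linux/aarch64/libdatafusion_jni.so \
        org/apache/datafusion/darwin/x86_64/libdatafusion_jni.dylib \
        org/apache/datafusion/darwin/aarch64/libdatafusion_jni.dylib
    do
        # -x matches the whole line, -F treats the path as a literal string.
        if ! printf '%s\n' "$listing" | grep -qxF "$entry"; then
            echo "missing: $entry" >&2
            missing=1
        fi
    done
    exit "$missing"
)
```

Typical use would be `jar tf path/to/datafusion-java-0.1.0-SNAPSHOT.jar | check_native_entries`, where the JAR path is a placeholder for wherever the temporary Maven repo put the artifact.
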
`./mvnw test` is green on the branch (308 tests run, 0 failures, 13 skipped)
after the standard `cargo build` (debug) precondition from CLAUDE.md.
`publish-to-maven.sh` is not exercised in this PR. Validating it requires
real Apache Nexus credentials and a real GPG-signed release candidate, both of
which are out of scope for the build-pipeline dry run.
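
For orientation only, the sign-and-upload portion of that script can be approximated by the dry-run sketch below. It prints commands rather than running them, so it needs neither credentials nor a GPG key. The function name and the staging URL argument are hypothetical, and the staging-repo create and close REST calls are left out because their details are not described here.

```shell
# Hypothetical dry run: for every artifact under a local Maven repo, print
# the gpg and curl commands that a sign-and-upload pass might issue.
# $1 = local repo directory (no trailing slash), $2 = staging base URL.
print_publish_plan() (
    repo_dir=$1
    staging_url=$2
    find "$repo_dir" -type f ! -name '*.asc' | sort | while IFS= read -r file; do
        rel=${file#"$repo_dir"/}
        # gpg --armor --detach-sign writes the signature next to the file as <file>.asc
        echo "gpg --armor --detach-sign $file"
        echo "curl --fail-with-body --upload-file $file $staging_url/$rel"
        echo "curl --fail-with-body --upload-file $file.asc $staging_url/$rel.asc"
    done
)
```

Authentication options for `curl` are deliberately omitted; a real run would also need to supply Nexus credentials.
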
## Are there any user-facing changes?
No code or API changes. Release-manager-facing tooling only.