andygrove opened a new pull request, #87:
URL: https://github.com/apache/datafusion-java/pull/87

   ## Which issue does this PR close?
   
   - Continues #33 (the binary release JAR build pipeline deferred from #77: 
"Cross-build CI matrix, Sonatype/Maven Central wiring, and release docs are 
deferred to follow-up PRs."). Not a full close: the GitHub Actions matrix and 
Maven Central distributionManagement wiring are still open follow-ups.
   
   ## Rationale for this change
   
   PR #77 added the runtime side of the fat-JAR design: `NativeLibraryLoader` 
reads `org/apache/datafusion/<os>/<arch>/libdatafusion_jni.<ext>` from the 
JAR and extracts it on demand. But nothing in the repo actually produces a JAR 
with more than one platform's lib inside — `core/pom.xml`'s host-activated 
profile bundles only the host's lib, and the existing release scripts 
(`dev/release/create-tarball.sh` etc.) only do source tarballs. A consumer 
pulling the artifact from Maven Central today gets a JAR that works on the 
platform it was built on and nowhere else.
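
   As an illustration of that resource layout, here is a minimal shell sketch 
that prints the path expected for a given platform. `resource_path` is a 
hypothetical helper, not part of this PR or of `NativeLibraryLoader`; the 
directory names and extensions are taken from the JAR listing in the testing 
section of this description.

```shell
#!/usr/bin/env bash
# Sketch only: print the JAR resource path expected for an OS/arch pair.
# resource_path is a hypothetical helper used for illustration.
resource_path() {
  local os="$1" arch="$2" ext
  case "$os" in
    linux)  ext="so" ;;
    darwin) ext="dylib" ;;
    *) echo "unsupported os: $os" >&2; return 1 ;;
  esac
  printf 'org/apache/datafusion/%s/%s/libdatafusion_jni.%s\n' "$os" "$arch" "$ext"
}

resource_path linux amd64
resource_path darwin aarch64
```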
   
   This PR adds the build side: a release-manager script that drives two Docker 
containers for the Linux arches and the host's own Rust toolchain for the macOS 
arches, assembles all four `.so`/`.dylib` files into a single JAR, and installs 
it into a temporary local Maven repo. A second script (not yet exercised in CI) 
signs and uploads that repo to Apache Nexus staging.
   
   The structure mirrors `datafusion-comet`'s release tooling 
(`dev/release/comet-rm/Dockerfile`, `build-release-comet.sh`, 
`publish-to-maven.sh`), simplified for this project: single module pair (no 
Spark/Scala matrix) and macOS libs built natively on the RM's macOS host (no 
OSXCross / Xcode SDK plumbing).
   
   ## What changes are included in this PR?
   
   New release tooling under `dev/release/`:
   
   - **`datafusion-java-rm/Dockerfile`** — Ubuntu 20.04 + Rust + protoc 
(arch-aware download). Single-stage, no OSXCross. Built twice via 
`--platform=linux/{amd64,arm64}`.
   - **`datafusion-java-rm/build-native-libs.sh`** — runs inside the container: 
clones the repo, `cargo build --release`. Container platform dictates target 
arch.
   - **`build-release.sh`** — host orchestrator. Detects host arch via `uname 
-m`. Cleans + rebuilds both Docker images, runs each, docker-cp's the linux 
libs into `core/target/classes/org/apache/datafusion/linux/{amd64,aarch64}/`, 
then on the host runs `cargo build --release` (host arch native) and `cargo 
build --release --target <other>-apple-darwin` (cross to the other arch) and 
copies both `.dylib` files into 
`core/target/classes/org/apache/datafusion/darwin/{x86_64,aarch64}/`. Finishes 
with `./mvnw -Ddatafusion.native.profile=release -DskipTests install` into a 
temp local Maven repo whose path is printed at the end. Pre-cleans leftover 
named containers; traps SIGINT/SIGTERM/EXIT.
   - **`publish-to-maven.sh`** — Nexus staging upload: creates staging repo via 
REST, signs every artifact with GPG, uploads with `curl --fail-with-body`, 
closes the staging repo. Not exercised in the dry run.
   - **`README.md`** — new "Binary Release: Multi-Platform JAR" section after 
the existing source-tarball flow.
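
   The host-side macOS step in `build-release.sh` can be pictured roughly as 
follows. This is a sketch under assumptions, not the script's actual code: 
`other_darwin_target` is a hypothetical helper, and the `arm64` spelling is 
what `uname -m` reports on Apple silicon hosts. The target triples are the 
standard Rust ones for macOS.

```shell
#!/usr/bin/env bash
# Sketch only: map a macOS `uname -m` value to the Rust target triple of the
# *other* macOS architecture. other_darwin_target is a hypothetical helper.
other_darwin_target() {
  case "$1" in
    arm64|aarch64) echo "x86_64-apple-darwin" ;;
    x86_64)        echo "aarch64-apple-darwin" ;;
    *) echo "unsupported host arch: $1" >&2; return 1 ;;
  esac
}

# cargo writes a plain `--release` build to <target-dir>/release and a
# `--target <triple>` build to <target-dir>/<triple>/release.
host_arch="$(uname -m)"
if other="$(other_darwin_target "$host_arch")"; then
  echo "native build: cargo build --release"
  echo "cross build:  cargo build --release --target $other"
  echo "cross output: <target-dir>/$other/release/libdatafusion_jni.dylib"
fi
```

   The cross build assumes the other target has been installed on the host, 
for example with `rustup target add`.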
   
   Two follow-on fixes uncovered by the dry run:
   
   - **`pom.xml`** — add `.github/**` and `dev/release/rat_exclude_files.txt` 
to the `apache-rat-plugin` excludes. The plugin runs at the `verify` phase, 
which `mvn install` triggers but `make test` does not. These files were already 
exempt in `dev/release/rat_exclude_files.txt` (used by the source-tarball 
flow's `check-rat-report.py`); this brings the pom-level RAT check into 
alignment.
   - **`dev/release/build-release.sh`** — pass 
`-Ddatafusion.native.profile=release` to `./mvnw install`. `core/pom.xml` 
defaults this property to `debug`, so without the override the antrun 
`copy-native-lib` step looks for the host's lib under `native/target/debug/` 
and fails the `<fail>` precondition check.
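
   The profile mismatch can be reproduced in miniature with the sketch below. 
Both function names are hypothetical, and the `native/target/release/` location 
is an assumption based on cargo's default layout for `--release` builds; only 
the `native/target/debug/` path is stated above.

```shell
#!/usr/bin/env bash
# Sketch only: map the value of datafusion.native.profile to the directory
# where the host lib is looked up. native_lib_dir is a hypothetical helper.
native_lib_dir() {
  case "$1" in
    debug|release) echo "native/target/$1" ;;
    *) echo "unknown profile: $1" >&2; return 1 ;;
  esac
}

# Mimics a precondition check: succeed only if the lib exists for the profile.
check_native_lib() {
  local profile="$1" lib_name="$2" dir
  dir="$(native_lib_dir "$profile")" || return 1
  if [ -f "$dir/$lib_name" ]; then
    echo "found $dir/$lib_name"
  else
    echo "missing $dir/$lib_name (profile and cargo build mode must match)" >&2
    return 1
  fi
}

# After `cargo build --release`, checking the debug profile would fail here.
check_native_lib debug libdatafusion_jni.dylib || echo "precondition failed"
```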
   
   ## Are these changes tested?
   
   End-to-end dry run on macOS aarch64 produced 
`datafusion-java-0.1.0-SNAPSHOT.jar` (175 MB) containing exactly the four 
expected resource entries:
   
   ```
   org/apache/datafusion/linux/amd64/libdatafusion_jni.so       ELF 64-bit LSB x86-64
   org/apache/datafusion/linux/aarch64/libdatafusion_jni.so     ELF 64-bit LSB ARM aarch64
   org/apache/datafusion/darwin/x86_64/libdatafusion_jni.dylib  Mach-O 64-bit x86_64
   org/apache/datafusion/darwin/aarch64/libdatafusion_jni.dylib Mach-O 64-bit arm64
   ```
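
   A check like this could be scripted for future dry runs. The sketch below 
is not part of the PR: `missing_native_entries` is a hypothetical helper that 
reads a JAR entry listing on stdin, such as the output of `jar tf <jar>` or 
`unzip -Z1 <jar>`, and prints any of the four expected entries that are absent.

```shell
#!/usr/bin/env bash
# Sketch only: read a JAR entry listing on stdin (one path per line) and print
# each expected native lib entry that is missing. Returns non-zero if any is.
missing_native_entries() {
  local listing entry status=0
  listing="$(cat)"
  for entry in \
    org/apache/datafusion/linux/amd64/libdatafusion_jni.so \
    org/apache/datafusion/linux/aarch64/libdatafusion_jni.so \
    org/apache/datafusion/darwin/x86_64/libdatafusion_jni.dylib \
    org/apache/datafusion/darwin/aarch64/libdatafusion_jni.dylib
  do
    # -x matches the whole line, -F treats the entry as a literal string.
    if ! grep -qxF -- "$entry" <<<"$listing"; then
      echo "$entry"
      status=1
    fi
  done
  return "$status"
}

# Demo with a listing that only contains the linux/amd64 lib.
printf '%s\n' 'org/apache/datafusion/linux/amd64/libdatafusion_jni.so' \
  | missing_native_entries || echo "JAR is incomplete"
```

   Against a real build, the listing would come from something like 
`jar tf path/to/datafusion-java-0.1.0-SNAPSHOT.jar | missing_native_entries`, 
with the path replaced by the location printed at the end of the build.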
   
   `./mvnw test` is green on the branch (308 tests run, 0 failures, 13 skipped) 
after the standard `cargo build` (debug) precondition from CLAUDE.md.
   
   `publish-to-maven.sh` is not exercised in this PR. Validating it requires 
real Apache Nexus credentials and a real GPG-signed release candidate, both of 
which are out of scope for the build-pipeline dry run.
   
   ## Are there any user-facing changes?
   
   No code or API changes. Release-manager-facing tooling only.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

