+1

On Thu, 19 Mar 2026 at 09:14, Matteo Merli <[email protected]> wrote:
> https://github.com/apache/pulsar/pull/25359
>
> PoC PR: https://github.com/merlimat/pulsar/pull/16
>
>
> ---------------------------------------------------------------------------------------------
>
> # PIP-463: Migrate Build System from Maven to Gradle
>
> # Background Knowledge
>
> Apache Pulsar currently uses Maven as its build system. The project has grown to over 100 modules
> with complex dependency relationships, shaded JARs, NAR packaging, and Docker image builds.
> Maven's sequential execution model and limited caching capabilities result in long build times
> that impact developer productivity and CI throughput.
>
> [Gradle](https://gradle.org/) is a modern build system used by large-scale Java projects
> (e.g., Spring Boot, Micronaut, Apache Kafka). It provides parallel task execution,
> fine-grained caching, and incremental compilation out of the box.
>
> # Motivation
>
> The current Maven build has several pain points that affect developer velocity and CI efficiency:
>
> **Slow local builds.** A full `mvn install -DskipTests` takes 5-8 minutes on a modern machine.
> Developers frequently wait for unrelated modules to rebuild when iterating on a single component.
> Maven has no built-in mechanism to skip unchanged modules; it rebuilds everything in the reactor.
>
> **Slow CI.** The CI pipeline takes 50-60 minutes end-to-end. Maven's lack of caching means
> each CI run starts from scratch. Test jobs must either rebuild everything or rely on fragile
> artifact-sharing workarounds.
>
> **Imprecise dependency tracking.** Maven treats the entire module as the unit of rebuild.
> Changing a test resource file triggers a full recompile of the module. There is no way to
> run "only the tests affected by my change"; developers must run the entire test suite
> for a module or manually specify test classes.
>
> **Limited parallelism.** Maven's `-T` flag enables module-level parallelism, but tasks within
> a module still run sequentially. The Pulsar build has several bottleneck modules (e.g.,
> `pulsar-broker`) where compilation, resource processing, and test execution could overlap
> with other modules but don't.
>
> **Complex shading and packaging.** The project uses Maven Shade plugin, NAR plugin, and
> custom Ant tasks for packaging. These configurations are verbose, hard to maintain, and
> have subtle interactions (e.g., the `ahc-default.properties` content replacement for
> AsyncHttpClient requires an Ant `<replace>` task in Maven but is a single `filesMatching`
> call in Gradle).
>
> **Poor IDE integration for multi-module builds.** IntelliJ IDEA's Maven import for a project
> of Pulsar's size is slow and memory-intensive. Gradle's tooling API provides faster,
> more reliable IDE synchronization.
>
> # Goals
>
> ## In Scope
>
> - **1:1 functional equivalence with Maven.** The Gradle build produces identical artifacts:
>   - Server distribution tarball (`apache-pulsar-X.Y.Z-bin.tar.gz`) with the same JARs
>   - Shell distribution tarball
>   - IO connectors distribution (NAR files)
>   - Offloaders distribution (NAR files)
>   - Docker images (`pulsar`, `pulsar-all`, `java-test-image`, `pulsar-test-latest-version`)
>   - Shaded client JARs (`pulsar-client`, `pulsar-client-admin`, `pulsar-client-all`)
>     verified to contain the same classes and relocations as Maven output
>
> - **All CI tests passing.** Unit tests, integration tests, system tests, shade tests (Java 17/21/24),
>   and backward compatibility tests all pass on the Gradle build.
>
> - **Enforced dependency management.** A `pulsar-dependencies` platform module (Gradle's equivalent
>   of Maven's `dependencyManagement`) ensures consistent dependency versions across all modules.
>
> - **Version catalog.** A single `gradle/libs.versions.toml` file defines all dependency coordinates
>   and versions, replacing scattered version properties across 100+ POM files.
>
> - **CI workflow migration.** All GitHub Actions workflows converted from Maven to Gradle commands.
>
> ## Out of Scope
>
> - Changing the project's module structure or merging/splitting modules
> - Migrating to Kotlin DSL for production source code
> - Gradle-specific optimizations beyond what Maven provides (e.g., build cache server,
>   remote caching); these are future enhancements enabled by the migration
> - Removing the ability to build individual modules in isolation
>
> # High Level Design
>
> The migration introduces Gradle build scripts alongside (and eventually replacing) the existing
> Maven POM files. The approach is:
>
> 1. **Add Gradle build files** for all modules (`build.gradle.kts`, `settings.gradle.kts`,
>    `gradle/libs.versions.toml`)
> 2. **Convert CI workflows** from Maven to Gradle commands
> 3. **Remove Maven files** (`pom.xml`, `mvnw`, `.mvn/`)
>
> The Gradle build is structured as:
>
> ```
> settings.gradle.kts            # Module includes and plugin repositories
> build.gradle.kts               # Root build: common config, enforced platform
> gradle/libs.versions.toml      # Version catalog (single source of truth for versions)
> pulsar-dependencies/           # Enforced platform module (replaces dependencyManagement)
> <module>/build.gradle.kts      # Per-module build script
> ```
>
> Key design decisions:
>
> - **Shadow plugin** for shaded JARs (replaces Maven Shade), with `filesMatching` for
>   property file content relocation
> - **NAR plugin** (`io.github.merlimat.nar`) for connector packaging
> - **LightProto plugin** for protobuf/lightproto code generation
> - **Conditional project includes** for shade test modules (avoids implicit parent project conflicts)
> - **Enforced platform** (`pulsar-dependencies`) for version pinning across all modules
>
> # Detailed Design
>
> ## Design & Implementation Details
>
> ### Build Performance Improvements
>
> | Aspect | Maven | Gradle |
> |--------|-------|--------|
> | Incremental compilation | No | Yes: only recompiles changed files |
> | Task-level caching | No | Yes: skips tasks whose inputs haven't changed |
> | Parallel execution | Module-level only (`-T`) | Task-level (automatic dependency graph) |
> | Configuration caching | No | Yes: reuses build configuration across runs |
> | Local build cache | No | Yes: persists across builds |
> | Remote build cache | No | Yes: shared across CI and developers (future) |
>
> **Expected impact:**
> - Local incremental builds (after initial): **seconds** instead of minutes
> - CI with caching: **30-50% faster** (exact numbers depend on cache hit rates)
> - "Build only what I need to test": `./gradlew :pulsar-broker:test` builds only
>   the broker and its dependencies, skipping unrelated modules entirely
>
> ### Develocity Integration
>
> Gradle provides native integration with [Develocity](https://gradle.com/develocity/)
> (formerly Gradle Enterprise), hosted by the ASF at `develocity.apache.org`. Every CI
> build automatically publishes a build scan that provides:
>
> - **Test execution details**: per-test timings, pass/fail status, output logs, and
>   stack traces, all searchable and filterable without downloading CI artifacts
> - **Task execution timeline**: visual breakdown of what ran, what was cached, and what
>   was up-to-date, making it easy to identify bottleneck tasks
> - **Dependency resolution**: full dependency tree with conflict resolution details
> - **Build comparison**: diff two builds to see what changed in task execution or outputs
> - **Failure analysis**: aggregated view of flaky tests across builds
>
> Example build scan from the PoC CI run:
> https://develocity.apache.org/s/h6ckzn3nn4w2s
>
> This level of observability is not available with the Maven build today.
>
> ### Dependency Management
>
> Maven's `dependencyManagement` in the root POM is replaced by:
>
> 1. **Version catalog** (`gradle/libs.versions.toml`): Defines all dependency coordinates
>    and version numbers in one file. Modules reference dependencies as `libs.netty.buffer`
>    instead of hardcoded group:artifact:version strings.
>
> 2. **Enforced platform** (`pulsar-dependencies`): A `java-platform` module that creates
>    version constraints from the catalog. Applied to all subprojects via
>    `implementation(enforcedPlatform(project(":pulsar-dependencies")))`. This ensures
>    transitive dependencies are pinned to the same versions Maven would resolve.
>
> ### Shaded JAR Configuration
>
> The Shadow plugin replaces Maven Shade. Key differences handled:
>
> - **AsyncHttpClient properties**: Maven uses Ant `<replace>` to fix property name prefixes
>   in `ahc-default.properties`. Gradle uses `filesMatching { filter { } }`.
> - **Dependency include/exclude**: Shadow's `dependencies { include/exclude }` DSL replaces
>   Maven Shade's `<includes>/<excludes>`.
> - **Relocation**: Shadow's `relocate()` is functionally identical to Maven Shade's.
>
> ### NAR Packaging
>
> A custom NAR Gradle plugin (`io.github.merlimat.nar`) handles connector packaging.
> Global exclusions for platform modules (provided by `java-instance.jar` at runtime)
> are configured in the root `build.gradle.kts`.
>
> ### Module-Specific Overrides
>
> Some modules require version overrides that differ from the enforced platform:
>
> - **`kinesis-kpl-shaded`**: Forces `protobuf-java:4.29.0` (KPL requires protobuf 4.x,
>   while Pulsar uses 3.x). The protobuf is relocated so no runtime conflict.
> - **`jclouds-shaded`**: Forces Guice 7.0.0, `jakarta.annotation-api:3.0.0`,
>   `jakarta.ws.rs-api:3.1.0`, `jakarta.inject-api:2.0.1` (jclouds 2.6.0 requires
>   Jakarta EE 10+ APIs). All are bundled in the shadow JAR.
>
> ## Public-facing Changes
>
> ### Configuration
>
> No new broker/client configuration options. The build system change is transparent to users.
>
> ### CLI
>
> - `mvn` commands replaced by `./gradlew` commands in documentation and scripts
> - `src/set-project-version.sh` updated to modify `gradle/libs.versions.toml`
>
> ### Binary Artifacts
>
> Artifacts are functionally identical. Minor differences:
> - Some shaded JARs may have slightly different class counts due to Shadow vs Shade plugin
>   differences in handling `package-info.class` files (no runtime impact)
>
> # Security Considerations
>
> No security implications. The build system change does not affect Pulsar's runtime
> security model, authentication, or authorization.
>
> The Gradle wrapper (`gradlew`) is committed to the repository with a checksum-verified
> distribution URL, following the same security model as the Maven wrapper.
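
Inline note on the Binary Artifacts point: for anyone who wants to spot-check the "same classes" claim locally, here is a minimal sketch that diffs the `.class` entries of two shaded JARs while ignoring `package-info.class`. The class `ShadedJarDiff` and its method names are hypothetical and not part of the PoC. It compares entry names only, so it says nothing about bytecode or whether relocations were applied identically.

```java
import java.io.IOException;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper: compares class entry names of two JARs.
class ShadedJarDiff {

    /** True for .class entries other than package-info.class. */
    static boolean isComparableClass(String entryName) {
        boolean isPackageInfo = entryName.equals("package-info.class")
                || entryName.endsWith("/package-info.class");
        return entryName.endsWith(".class") && !isPackageInfo;
    }

    /** Sorted names of all comparable class entries in the given JAR. */
    static Set<String> classEntries(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.stream()
                    .map(JarEntry::getName)
                    .filter(ShadedJarDiff::isComparableClass)
                    .collect(Collectors.toCollection(TreeSet::new));
        }
    }

    /** Entries present in 'left' but absent from 'right'. */
    static Set<String> missingFrom(Set<String> left, Set<String> right) {
        Set<String> diff = new TreeSet<>(left);
        diff.removeAll(right);
        return diff;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: ShadedJarDiff <maven-built.jar> <gradle-built.jar>");
            return;
        }
        Set<String> maven = classEntries(args[0]);
        Set<String> gradle = classEntries(args[1]);
        System.out.println("Only in first JAR:  " + missingFrom(maven, gradle));
        System.out.println("Only in second JAR: " + missingFrom(gradle, maven));
    }
}
```

Two empty sets would mean the class lists match once `package-info.class` is set aside; anything else is worth a closer look.
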
>
> # General Notes
>
> The implementation PR demonstrates full CI green status across all test suites,
> confirming functional equivalence with the Maven build.
>
> # Links
>
> * Proof of Concept PR (CI fully green): https://github.com/merlimat/pulsar/pull/16
> * Mailing List discussion thread: [link]
> * Mailing List voting thread: [link]
>
>
> --
> Matteo Merli
> <[email protected]>
>
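
One more note, on the AsyncHttpClient properties bullet in the Shaded JAR Configuration section: as I read it, the fix amounts to rewriting property key prefixes line by line so they match the relocated package. Below is a rough Java sketch of that per-line transformation, only to make the idea concrete. `PropertyPrefixRewriter` is a hypothetical name, and both prefixes in `main` are placeholders I made up, not values taken from the PR.

```java
import java.util.stream.Collectors;

// Hypothetical illustration of a per-line property key prefix rewrite.
class PropertyPrefixRewriter {

    /**
     * Rewrites the key prefix of one properties-file line.
     * Comment lines and lines whose key has a different prefix are returned unchanged.
     */
    static String rewriteLine(String line, String originalPrefix, String relocatedPrefix) {
        String trimmed = line.stripLeading();
        boolean isComment = trimmed.startsWith("#") || trimmed.startsWith("!");
        if (isComment || !trimmed.startsWith(originalPrefix)) {
            return line;
        }
        // Preserve any leading whitespace from the original line.
        String indent = line.substring(0, line.length() - trimmed.length());
        return indent + relocatedPrefix + trimmed.substring(originalPrefix.length());
    }

    /** Applies rewriteLine to every line of a properties file's content. */
    static String rewriteAll(String content, String originalPrefix, String relocatedPrefix) {
        return content.lines()
                .map(l -> rewriteLine(l, originalPrefix, relocatedPrefix))
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Placeholder prefixes, assumed for illustration only.
        String sample = "# defaults\nexample.client.threadPoolName=pool\nother.key=1";
        System.out.println(rewriteAll(sample, "example.client.", "shaded.example.client."));
    }
}
```

In the Gradle build the same transformation would presumably live inside the `filesMatching { filter { } }` call the proposal mentions.
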
