+1

On Thu, 19 Mar 2026 at 09:14, Matteo Merli <[email protected]> wrote:

> https://github.com/apache/pulsar/pull/25359
>
> PoC PR: https://github.com/merlimat/pulsar/pull/16
>
>
> ---------------------------------------------------------------------------------------------
>
> # PIP-463: Migrate Build System from Maven to Gradle
>
> # Background Knowledge
>
> Apache Pulsar currently uses Maven as its build system. The project
> has grown to over 100 modules
> with complex dependency relationships, shaded JARs, NAR packaging, and
> Docker image builds.
> Maven's sequential execution model and limited caching capabilities
> result in long build times
> that impact developer productivity and CI throughput.
>
> [Gradle](https://gradle.org/) is a modern build system used by
> large-scale Java projects
> (e.g., Spring Boot, Micronaut, Apache Kafka). It provides parallel
> task execution,
> fine-grained caching, and incremental compilation out of the box.
>
> # Motivation
>
> The current Maven build has several pain points that affect developer
> velocity and CI efficiency:
>
> **Slow local builds.** A full `mvn install -DskipTests` takes 5-8
> minutes on a modern machine.
> Developers frequently wait for unrelated modules to rebuild when
> iterating on a single component.
> Maven has no built-in mechanism to skip unchanged modules — it
> rebuilds everything in the reactor.
>
> **Slow CI.** The CI pipeline takes 50-60 minutes end-to-end. Maven's
> lack of caching means
> each CI run starts from scratch. Test jobs must either rebuild
> everything or rely on fragile
> artifact-sharing workarounds.
>
> **Imprecise dependency tracking.** Maven treats the entire module as
> the unit of rebuild.
> Changing a test resource file triggers a full recompile of the module.
> There is no way to
> run "only the tests affected by my change" — developers must run the
> entire test suite
> for a module or manually specify test classes.
>
> **Limited parallelism.** Maven's `-T` flag enables module-level
> parallelism, but tasks within
> a module still run sequentially. The Pulsar build has several
> bottleneck modules (e.g.,
> `pulsar-broker`) where compilation, resource processing, and test
> execution could overlap
> with other modules but don't.
>
> **Complex shading and packaging.** The project uses the Maven Shade
> plugin, the NAR plugin, and custom Ant tasks for packaging. These
> configurations are verbose, hard to maintain, and have subtle
> interactions. For example, the `ahc-default.properties` content
> replacement for AsyncHttpClient requires an Ant `<replace>` task in
> Maven but is a single `filesMatching` call in Gradle.
>
> **Poor IDE integration for multi-module builds.** IntelliJ IDEA's
> Maven import for a project
> of Pulsar's size is slow and memory-intensive. Gradle's tooling API
> provides faster,
> more reliable IDE synchronization.
>
> # Goals
>
> ## In Scope
>
> - **1:1 functional equivalence with Maven.** The Gradle build produces
> identical artifacts:
>   - Server distribution tarball (`apache-pulsar-X.Y.Z-bin.tar.gz`)
> with the same JARs
>   - Shell distribution tarball
>   - IO connectors distribution (NAR files)
>   - Offloaders distribution (NAR files)
>   - Docker images (`pulsar`, `pulsar-all`, `java-test-image`,
> `pulsar-test-latest-version`)
>   - Shaded client JARs (`pulsar-client`, `pulsar-client-admin`,
> `pulsar-client-all`)
>     verified to contain the same classes and relocations as Maven output
>
> - **All CI tests passing.** Unit tests, integration tests, system
> tests, shade tests (Java 17/21/24),
>   and backward compatibility tests all pass on the Gradle build.
>
> - **Enforced dependency management.** A `pulsar-dependencies` platform
> module (Gradle's equivalent
>   of Maven's `dependencyManagement`) ensures consistent dependency
> versions across all modules.
>
> - **Version catalog.** A single `gradle/libs.versions.toml` file
> defines all dependency coordinates
>   and versions, replacing scattered version properties across 100+ POM
> files.
>
> - **CI workflow migration.** All GitHub Actions workflows converted
> from Maven to Gradle commands.
>
> ## Out of Scope
>
> - Changing the project's module structure or merging/splitting modules
> - Migrating production source code to Kotlin (the Kotlin DSL is used for build scripts only)
> - Gradle-specific optimizations beyond what Maven provides (e.g.,
> build cache server,
>   remote caching) — these are future enhancements enabled by the migration
> - Removing the ability to build individual modules in isolation
>
> # High Level Design
>
> The migration introduces Gradle build scripts alongside (and
> eventually replacing) the existing
> Maven POM files. The approach is:
>
> 1. **Add Gradle build files** for all modules (`build.gradle.kts`,
> `settings.gradle.kts`,
>    `gradle/libs.versions.toml`)
> 2. **Convert CI workflows** from Maven to Gradle commands
> 3. **Remove Maven files** (`pom.xml`, `mvnw`, `.mvn/`)
>
> The Gradle build is structured as:
>
> ```
> settings.gradle.kts           # Module includes and plugin repositories
> build.gradle.kts              # Root build: common config, enforced platform
> gradle/libs.versions.toml     # Version catalog (single source of truth for versions)
> pulsar-dependencies/          # Enforced platform module (replaces dependencyManagement)
> <module>/build.gradle.kts     # Per-module build script
> ```
>
> Key design decisions:
>
> - **Shadow plugin** for shaded JARs (replaces Maven Shade), with
> `filesMatching` for
>   property file content relocation
> - **NAR plugin** (`io.github.merlimat.nar`) for connector packaging
> - **LightProto plugin** for protobuf/lightproto code generation
> - **Conditional project includes** for shade test modules (avoids
> implicit parent project conflicts)
> - **Enforced platform** (`pulsar-dependencies`) for version pinning
> across all modules
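>
> A minimal sketch of how the conditional includes could look in
> `settings.gradle.kts`. The property name `shadeTests` and the project
> name `pulsar-client-shade-test` are hypothetical and are not taken from
> the PoC; `pulsar-common` stands in for any regular module:
>
> ```kotlin
> // settings.gradle.kts (sketch; names marked "hypothetical" are assumptions)
>
> // Regular modules are always part of the build.
> include("pulsar-common")
> include("pulsar-broker")
>
> // Shade test modules join the build only when explicitly requested,
> // e.g. `./gradlew -PshadeTests=true ...` (hypothetical property).
> // A flat project name avoids creating an implicit parent project.
> if (providers.gradleProperty("shadeTests").isPresent) {
>     include("pulsar-client-shade-test") // hypothetical project name
> }
> ```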
>
> # Detailed Design
>
> ## Design & Implementation Details
>
> ### Build Performance Improvements
>
> | Aspect | Maven | Gradle |
> |--------|-------|--------|
> | Incremental compilation | No | Yes: only recompiles changed files |
> | Task-level caching | No | Yes: skips tasks whose inputs haven't changed |
> | Parallel execution | Module-level only (`-T`) | Task-level (automatic dependency graph) |
> | Configuration caching | No | Yes: reuses build configuration across runs |
> | Local build cache | No | Yes: persists across builds |
> | Remote build cache | No | Yes: shared across CI and developers (future) |
>
> **Expected impact:**
> - Local incremental builds (after the initial build): **seconds** instead of minutes
> - CI with caching: **30-50% faster** (exact numbers depend on cache hit rates)
> - "Build only what I need to test": `./gradlew :pulsar-broker:test` builds only
>   the broker and its dependencies, skipping unrelated modules entirely
>
> ### Develocity Integration
>
> Gradle provides native integration with
> [Develocity](https://gradle.com/develocity/)
> (formerly Gradle Enterprise), hosted by the ASF at
> `develocity.apache.org`. Every CI
> build automatically publishes a build scan that provides:
>
> - **Test execution details**: per-test timings, pass/fail status,
> output logs, and
>   stack traces — all searchable and filterable without downloading CI
> artifacts
> - **Task execution timeline**: visual breakdown of what ran, what was
> cached, and what
>   was up-to-date, making it easy to identify bottleneck tasks
> - **Dependency resolution**: full dependency tree with conflict
> resolution details
> - **Build comparison**: diff two builds to see what changed in task
> execution or outputs
> - **Failure analysis**: aggregated view of flaky tests across builds
>
> Example build scan from the PoC CI run:
> [https://develocity.apache.org/s/h6ckzn3nn4w2s](https://develocity.apache.org/s/h6ckzn3nn4w2s)
>
> This level of observability is not available with the Maven build today.
>
> ### Dependency Management
>
> Maven's `dependencyManagement` in the root POM is replaced by:
>
> 1. **Version catalog** (`gradle/libs.versions.toml`): Defines all
> dependency coordinates
>    and version numbers in one file. Modules reference dependencies as
> `libs.netty.buffer`
>    instead of hardcoded group:artifact:version strings.
>
> 2. **Enforced platform** (`pulsar-dependencies`): A `java-platform`
> module that creates
>    version constraints from the catalog. Applied to all subprojects via
>    `implementation(enforcedPlatform(project(":pulsar-dependencies")))`.
> This ensures
>    transitive dependencies are pinned to the same versions Maven would
> resolve.
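>
> The two pieces could fit together roughly as follows. This is a sketch
> under assumptions, not the PoC's actual scripts: it lists a single
> constraint by hand, whereas the real platform module may generate its
> constraints from the catalog programmatically.
>
> ```kotlin
> // pulsar-dependencies/build.gradle.kts (sketch)
> plugins {
>     `java-platform`
> }
>
> dependencies {
>     constraints {
>         // Coordinates and version come from gradle/libs.versions.toml.
>         api(libs.netty.buffer)
>     }
> }
> ```
>
> ```kotlin
> // <module>/build.gradle.kts (sketch)
> plugins {
>     `java-library`
> }
>
> dependencies {
>     // Pin direct and transitive versions to the platform's constraints.
>     implementation(enforcedPlatform(project(":pulsar-dependencies")))
>     // No version here: it is supplied by the catalog and the platform.
>     implementation(libs.netty.buffer)
> }
> ```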
>
> ### Shaded JAR Configuration
>
> The Shadow plugin replaces Maven Shade. Key differences handled:
>
> - **AsyncHttpClient properties**: Maven uses Ant `<replace>` to fix
> property name prefixes
>   in `ahc-default.properties`. Gradle uses `filesMatching { filter { } }`.
> - **Dependency include/exclude**: Shadow's `dependencies {
> include/exclude }` DSL replaces
>   Maven Shade's `<includes>/<excludes>`.
> - **Relocation**: Shadow's `relocate()` is functionally identical to
> Maven Shade's.
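>
> A hedged sketch of how these three pieces might look in a module's
> `build.gradle.kts`, assuming the Shadow plugin is already applied. The
> relocation prefix `org.apache.pulsar.shade` and the exact string
> replaced in the properties file are assumptions for illustration:
>
> ```kotlin
> import com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar
>
> tasks.named<ShadowJar>("shadowJar") {
>     // Bundle only the selected dependencies (replaces <includes>).
>     dependencies {
>         include(dependency("org.asynchttpclient:async-http-client:.*"))
>     }
>
>     // Move the bundled classes under a shaded package (assumed prefix).
>     relocate("org.asynchttpclient", "org.apache.pulsar.shade.org.asynchttpclient")
>
>     // Rewrite property name prefixes so they match the relocated classes
>     // (replaces the Ant <replace> task used in the Maven build).
>     filesMatching("**/ahc-default.properties") {
>         filter { line: String ->
>             line.replace(
>                 "org.asynchttpclient.",
>                 "org.apache.pulsar.shade.org.asynchttpclient."
>             )
>         }
>     }
> }
> ```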
>
> ### NAR Packaging
>
> A custom NAR Gradle plugin (`io.github.merlimat.nar`) handles
> connector packaging.
> Global exclusions for platform modules (provided by
> `java-instance.jar` at runtime)
> are configured in the root `build.gradle.kts`.
>
> ### Module-Specific Overrides
>
> Some modules require version overrides that differ from the enforced
> platform:
>
> - **`kinesis-kpl-shaded`**: Forces `protobuf-java:4.29.0` (KPL requires protobuf 4.x,
>   while Pulsar uses 3.x). Protobuf is relocated, so there is no runtime conflict.
> - **`jclouds-shaded`**: Forces Guice 7.0.0, `jakarta.annotation-api:3.0.0`,
>   `jakarta.ws.rs-api:3.1.0`, `jakarta.inject-api:2.0.1` (jclouds 2.6.0
> requires
>   Jakarta EE 10+ APIs). All are bundled in the shadow JAR.
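>
> One way such an override could be expressed is a dependency resolve
> rule in the module's own build script. This is only a sketch: the
> mechanism the PoC actually uses is not described here, and the
> interaction with the enforced platform should be checked against the
> resolved dependency graph.
>
> ```kotlin
> // kinesis-kpl-shaded/build.gradle.kts (sketch)
> configurations.configureEach {
>     resolutionStrategy.eachDependency {
>         // Rewrite every request for protobuf-java in this module only.
>         if (requested.group == "com.google.protobuf" &&
>             requested.name == "protobuf-java"
>         ) {
>             useVersion("4.29.0")
>             because("KPL requires protobuf 4.x")
>         }
>     }
> }
> ```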
>
> ## Public-facing Changes
>
> ### Configuration
>
> No new broker/client configuration options. The build system change is
> transparent to users.
>
> ### CLI
>
> - `mvn` commands replaced by `./gradlew` commands in documentation and
> scripts
> - `src/set-project-version.sh` updated to modify
> `gradle/libs.versions.toml`
>
> ### Binary Artifacts
>
> Artifacts are functionally identical. Minor differences:
> - Some shaded JARs may have slightly different class counts due to
> Shadow vs Shade plugin
>   differences in handling `package-info.class` files (no runtime impact)
>
> # Security Considerations
>
> No security implications. The build system change does not affect
> Pulsar's runtime
> security model, authentication, or authorization.
>
> The Gradle wrapper (`gradlew`) is committed to the repository with a
> checksum-verified
> distribution URL, following the same security model as the Maven wrapper.
>
> # General Notes
>
> The implementation PR demonstrates full CI green status across all test
> suites,
> confirming functional equivalence with the Maven build.
>
> # Links
>
> * Proof of Concept PR (CI fully green):
> https://github.com/merlimat/pulsar/pull/16
> * Mailing List discussion thread: [link]
> * Mailing List voting thread: [link]
>
>
> --
> Matteo Merli
> <[email protected]>
>
