Hi all,

I wanted to give some context for PR #4732, which adds the
`gradle/server-test-runner` plugin, and for follow-up work that migrates
some Spark, Ranger, OPA, and runtime Spark integration tests onto it.

The short version:
This is not intended to introduce another general testing framework.
It is a small Gradle utility for one recurring Polaris test problem:
Some integration tests need to exercise a real Polaris server, but they
should not have to become Quarkus application tests themselves.

Today, tests in areas like Spark, Ranger, and OPA tend to blur two concerns:

1. building and booting a Polaris server; and
2. running client/extension-specific integration tests against that server.

That coupling has a few practical costs:

- Test modules often need Quarkus test/application wiring even when the test is
  really about a client, extension, Spark runtime, or authorization integration.
- Quarkus dependencies and BOM constraints can leak into test classpaths that are
  already sensitive, especially Spark/Iceberg/Hadoop/Ranger classpaths.
- Moving integration tests into the module that owns the behavior is harder,
  because the module may then need to carry extra Quarkus runtime-test shape.
- We build more Quarkus test applications than we need, which hurts feedback time.
- IntelliJ and Gradle model the tests less cleanly than ordinary `JvmTestSuite`
  based integration tests.

The plugin addresses that by externalizing the Polaris server lifecycle
from the test JVM.
A test task declares a server artifact, usually the `quarkus-run.jar` from
`:polaris-server`:

```kotlin
dependencies {
  polarisServer(project(path = ":polaris-server", configuration = "quarkusRunner"))
}

tasks.named<Test>("intTest") {
  withPolarisServer(configurations.polarisServer) {
    systemProperties.put("polaris.features.\"ALLOW_INSECURE_STORAGE_TYPES\"", "true")
  }
}
```

At execution time, the plugin starts that server in a separate JVM with
random HTTP and management ports, waits until Quarkus reports the listen
URLs, passes the ports into the test JVM via system properties, and stops
the server when the test task finishes.
The tests themselves stay ordinary JVM tests.
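
On the test side, this amounts to reading the injected system properties. A minimal sketch follows; the property names `polaris.server.http.port` and `polaris.server.management.port` and both helper functions are hypothetical and only stand in for whatever names the plugin in the PR actually uses:

```kotlin
import java.net.URI

// Hypothetical property names, chosen for illustration only.
const val HTTP_PORT_PROPERTY = "polaris.server.http.port"
const val MANAGEMENT_PORT_PROPERTY = "polaris.server.management.port"

/** Reads a port that the Gradle test task is expected to inject as a system property. */
fun requiredPort(name: String): Int {
  val raw = System.getProperty(name)
    ?: error("System property '$name' is not set; was the test started by the server-backed task?")
  return raw.toIntOrNull()?.takeIf { it in 1..65535 }
    ?: error("System property '$name' is not a valid port: '$raw'")
}

/** Base URI of the externally started Polaris server, as seen from the test JVM. */
fun polarisBaseUri(): URI = URI.create("http://localhost:${requiredPort(HTTP_PORT_PROPERTY)}/")
```

A test would call `polarisBaseUri()` when building its client, with no Quarkus test annotations or extensions on its classpath.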

This gives us a cleaner split:

- `:polaris-server` remains responsible for building a runnable Polaris server.
- Spark/Ranger/OPA tests can live with the code they exercise.
- The test JVM only gets the dependencies it needs as a client/test.
- The server JVM gets Polaris runtime dependencies and configuration.
- Each test task can have an isolated server instance and isolated config.

The follow-up PRs would then make these changes:

- Spark integration tests move into the Spark projects and run against a
  Polaris server artifact.
- Ranger integration tests live in the Ranger extension.
- OPA integration tests live in the OPA extension.
- Runtime Spark tests still add task-specific setup, such as RustFS/Hive
  support, without baking those details into the plugin.

The startup-action SPI is deliberately small.
It exists for test tasks that need to start external resources or compute
dynamic server configuration before Polaris starts.
For example, OPA can start an OPA container, and runtime Spark tests
can prepare RustFS/Hive-specific properties.
The action is loaded from an isolated classpath and only receives mutable
server system properties/environment plus string parameters.
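
To make that contract concrete, here is a rough sketch of what such an action could look like. The names `StartupAction`, `StartupContext`, `beforeServerStart`, the use of a map for the string parameters, and the `example.opa.url` keys are all assumptions for illustration; the actual SPI in the PR may be shaped differently:

```kotlin
/** Hypothetical context: mutable server settings plus read-only string parameters. */
class StartupContext(
  val systemProperties: MutableMap<String, String>,
  val environment: MutableMap<String, String>,
  val parameters: Map<String, String>,
)

/** Hypothetical SPI: invoked once, before the Polaris server process is launched. */
interface StartupAction {
  fun beforeServerStart(context: StartupContext)
}

/** Example action: forwards the URL of an external OPA endpoint to the server configuration. */
class OpaEndpointAction : StartupAction {
  override fun beforeServerStart(context: StartupContext) {
    // "opa.url" is a made-up parameter name; a real action might start a container here.
    val url = context.parameters["opa.url"] ?: error("Missing parameter 'opa.url'")
    // "example.opa.url" is a placeholder, not a real Polaris configuration key.
    context.systemProperties["example.opa.url"] = url
  }
}
```

Because the action only sees maps and strings, it does not need Gradle or Polaris types on its classpath, which is what makes loading it from an isolated classpath practical.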

Why a Gradle plugin instead of repeated build-script snippets?

The lifecycle has enough sharp edges that copying it into each module would
be worse:

- selecting exactly one runnable server artifact;
- starting the process with `quarkus.http.port=0` and
  `quarkus.management.port=0`;
- detecting the actual ports from Quarkus output;
- passing those ports to the test JVM;
- cleaning up the process on normal completion and build-service cleanup;
- declaring inputs for server config/startup action classpaths;
- avoiding ad hoc task hooks in every consuming module.

Centralizing that in one small plugin gives us one place to test and review
those details.
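
As one example of those sharp edges, port detection could be sketched roughly as below. This is not the plugin's code: the regular expressions assume the usual Quarkus startup line ("Listening on: http://host:port" and "Management interface listening on http://host:port"), and `ServerPorts` and `detectPorts` are hypothetical names:

```kotlin
// Assumed log format, based on the usual Quarkus startup line; IPv6 hosts are not handled.
private val httpListen = Regex("""Listening on: https?://[^\s:]+:(\d+)""")
private val managementListen =
  Regex("""Management interface listening on https?://[^\s:]+:(\d+)""")

data class ServerPorts(val http: Int, val management: Int)

/** Scans server output until both ports are seen; returns null if the output ends first. */
fun detectPorts(lines: Sequence<String>): ServerPorts? {
  var http: Int? = null
  var management: Int? = null
  for (line in lines) {
    httpListen.find(line)?.let { http = it.groupValues[1].toInt() }
    managementListen.find(line)?.let { management = it.groupValues[1].toInt() }
    val h = http
    val m = management
    if (h != null && m != null) return ServerPorts(h, m)
  }
  return null
}
```

A real implementation also needs a startup timeout and has to keep draining the process output after the ports are found, which is exactly the kind of detail that is easier to get right once than in every module.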

There is a tradeoff:
This adds custom Gradle build logic.
I think that tradeoff is reasonable because the alternative is not "no
complexity"; the alternative is the same lifecycle complexity spread across
Spark, Ranger, OPA, and future server-backed integration tests.
Keeping it under `gradle/server-test-runner` also makes the scope explicit:
It is internal build/test infrastructure, not a Polaris user-facing API.

I also do not think this should replace all Quarkus tests.
Tests that are specifically validating CDI wiring, Quarkus test resources,
augmentation-time configuration, or in-process service behavior should stay
as Quarkus tests.
The server runner is for tests whose natural boundary is "start a Polaris
server and interact with it as an external client or extension integration."

My proposed direction is:

1. Keep the plugin in PR #4732 as a focused internal test utility.
2. Use it for the migrated Spark/Ranger/OPA/runtime Spark integration tests
   where the separation is clear.
3. Avoid expanding it into a general process-runner abstraction.
4. Keep startup hooks narrow and task-specific.
5. Continue to use Quarkus tests where in-process Quarkus behavior is what
   we are actually testing.

I think this gives us better module ownership, cleaner dependency boundaries,
and faster feedback without changing the Polaris runtime behavior under test.

Thoughts?
