tkaymak opened a new pull request, #38233:
URL: https://github.com/apache/beam/pull/38233
## What
Pure refactor of `runners/spark/`. Two changes:
1. **Hoist** `runners/spark/3/src/.../structuredstreaming/` (the only
   sources Spark 3 ships) into the shared `runners/spark/src/`.
   `runners/spark/3/src/` is removed entirely.
2. **Replace** the existing `copySourceBase` / `useCopiedSourceSet` block in
   `runners/spark/spark_runner.gradle` with the per-version source-overrides
   layering already used by `runners/flink/flink_runner.gradle`:
   - the lowest `spark_major` (currently `3`) builds straight from the
     shared base;
   - higher majors get a `Copy` task with `DuplicatesStrategy.INCLUDE` that
     merges shared + previous majors + `./src` so per-version files override.
`runners/spark/3/build.gradle` now sets `spark_major = '3'` instead of the
removed `copySourceBase = false` flag, and `gradle.properties` gains
`spark_versions=3` (mirrors the existing `flink_versions`).
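
The layering described above can be illustrated with a small model of the copy step. This is a minimal Python sketch, not the Gradle code in this PR: it assumes that copying layers in order into one directory, with duplicates allowed, leaves the last copy of each path in place, which is what lets per-version files override shared ones. The directory names (`shared`, `4`), the file names, and the helpers `merge_layers` and `build_demo` are hypothetical and chosen only for illustration.

```python
from pathlib import Path
import shutil
import tempfile


def merge_layers(layers, dest):
    """Copy each layer into dest in order; later layers overwrite earlier files."""
    dest = Path(dest)
    for layer in layers:
        layer = Path(layer)
        for src in sorted(layer.rglob("*")):
            if src.is_file():
                target = dest / src.relative_to(layer)
                target.parent.mkdir(parents=True, exist_ok=True)
                # Overwrites any file copied from an earlier layer (last one wins).
                shutil.copyfile(src, target)
    return dest


def build_demo():
    """Build a hypothetical shared base plus one override layer and merge them."""
    root = Path(tempfile.mkdtemp())
    shared = root / "shared"  # stands in for the shared source tree
    v4 = root / "4"           # stands in for a higher major's ./src overrides
    layers = (
        (shared, {"A.java": "shared A", "B.java": "shared B"}),
        (v4, {"B.java": "spark4 B"}),
    )
    for base, files in layers:
        base.mkdir(parents=True)
        for name, text in files.items():
            (base / name).write_text(text)
    merged = merge_layers([shared, v4], root / "merged")
    return {p.name: p.read_text() for p in sorted(merged.iterdir())}
```

In this model the lowest major would skip the merge and compile the shared tree directly, while each higher major passes the shared tree, any previous majors, and its own `./src` as the ordered `layers` list.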
## Why
Prepares the tree for the Spark 4 runner (#36841 / #38212), so it can land
as a small overrides layer on top of this restructuring instead of duplicating
the entire structured-streaming source tree. Addresses the review comment on
#38212 asking for the Flink-style layout.
## Compatibility
Pure refactor: Spark 3's compiled output and test classpath are unchanged.
The 73 hoisted files are Git renames (verified with `git diff --find-renames`).
## Stacked
The Spark 4 follow-up branch is ready:
https://github.com/tkaymak/beam/tree/spark4-runner-slim
Its diff against this branch is ~31 added files / ~3.3K lines (down from
#38212's ~94 files / ~10.5K lines).