junaiddshaukat opened a new issue, #38465:
URL: https://github.com/apache/beam/issues/38465

   Tracking issue: #18479
   
   ## Summary
   
   First sub-issue under the Kafka Streams Runner GSoC 2026 project. Scope is 
the
   **minimum surface area** needed for a portable pipeline to be *submittable*
   to a `KafkaStreamsRunner`. Pipeline execution will still fail at 
translate-time
   because no transform translators exist yet — that's intentional. Subsequent
   sub-issues will incrementally add translators (Impulse, ExecutableStage,
   Flatten, etc.), each shrinking the set of "unsupported transform" failures.
   
   ## Design doc reference
   
   [Portable Kafka Streams Runner for Apache Beam — design 
doc](https://docs.google.com/document/d/1BBMURhSG4SxPcvvnKMTrmnKCr_jhXL6R4TBDBW7zsy8/edit?usp=sharing)
   (co-authored with @je-ik).
   
   ## Scope of this issue
   
   Create `runners/kafka-streams/` as a new Gradle module with:
   
   - [ ] `runners/kafka-streams/build.gradle` — module build file, declares
         dependencies on `runners-core-java`, `runners-java-fn-execution`,
         `runners-java-job-service`, `kafka-clients` 3.9.x, `kafka-streams` 
3.9.x,
         and pulls in the standard Beam testing dependencies.
   - [ ] Update `settings.gradle.kts` to include `:runners:kafka-streams`.
   - [ ] `KafkaStreamsRunner` extending `PipelineRunner<PipelineResult>` —
         delegates to the portable pipeline runner path.
   - [ ] `KafkaStreamsPipelineOptions` extending `PortablePipelineOptions` —
         `bootstrapServers`, `applicationId`, `processingGuarantee`,
         `maxBundleSize`, `maxBundleTimeMs`, `stateDir`.
   - [ ] `KafkaStreamsRunnerRegistrar` — auto-discovery via
         
`META-INF/services/org.apache.beam.sdk.options.PipelineOptionsRegistrar`
         and `PipelineRunnerRegistrar`.
   - [ ] Empty stubs (just enough to compile):
         - `KafkaStreamsJobServerDriver` extending `JobServerDriver`
         - `KafkaStreamsJobInvoker` extending `JobInvoker`
         - `translation/KafkaStreamsPipelineTranslator`
         - `translation/KafkaStreamsTranslationContext`
   - [ ] `package-info.java` in each new package.
   
   ## Acceptance criteria
   
   - [ ] `./gradlew :runners:kafka-streams:compileJava` passes with no errors.
   - [ ] `./gradlew :runners:kafka-streams:check` passes (style, spotless,
         licence-header checks).
   - [ ] Submitting a trivial pipeline (e.g. `Impulse` only) to the runner via
         its `JobInvoker` does not crash before reaching the translator — it
         should fail *inside* the translator with a clear
         "no translator registered for URN ..." error.
   
   ## Out of scope (deferred to follow-up sub-issues)
   
   - Any actual transform translator (Impulse, ExecutableStage, GBK, etc.)
   - Watermark manager
   - State management
   - Tests beyond compile + style — full unit / integration tests come once at
     least Impulse + one stateless ParDo are wired up.
   
   ## Reference implementation pattern
   
   Follows the Flink portable runner module layout:
   - 
`runners/flink/2.0/src/main/java/org/apache/beam/runners/flink/FlinkPipelineRunner.java`
   - 
`runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobServerDriver.java`
   - 
`runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkJobInvoker.java`
   
   (I'm not copying code — using these as a structural reference for the URN
   dispatch + JobInvoker pattern. The Kafka Streams runner does not need
   multi-version build fan-out like Flink does, so the module layout will be
   single-version.)
   
   cc @je-ik


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to