junaiddshaukat opened a new issue, #38743:
URL: https://github.com/apache/beam/issues/38743
Tracking issue: #18479
Depends on: #38616 (translation framework + Impulse, merged via #38689)
## Summary
Third sub-issue. Adds the ExecutableStage translator
(beam:runner:executable_stage:v1) so fused stateless user code (ParDo etc.)
runs in the SDK harness over the Fn API and its outputs flow back into the
Kafka Streams topology. Per design doc §4.2.
## Scope
- ExecutableStageTranslator: register ExecutableStage.URN, add a
processor node wired to the parent via the PCollection→processor map.
- ExecutableStageProcessor (Kafka Streams Processor): on init build a
StageBundleFactory from the ExecutableStagePayload + JobInfo; for each
KStreamsPayload data record push the WindowedValue to the harness input
receiver and forward harness outputs downstream as data payloads; on a
watermark payload flush the open bundle and propagate the watermark.
- SDK-harness wiring via runners-java-fn-execution
(ExecutableStageContext.Factory, StageBundleFactory, RemoteBundle,
OutputReceiverFactory, FnDataReceiver) — same surface Flink/Spark use.
## Open question (need direction)
ExecutableStage needs a running SDK harness. For unit tests, do we
(a) use the embedded/loopback environment in-process, or
(b) keep this PR to structural translation tests via TopologyTestDriver and
defer true end-to-end execution to a later ValidatesRunner sub-issue? Leaning
(b) + a small embedded smoke test if it's cheap.
## Out of scope (later sub-issues)
- State, timers, side inputs.
- GroupByKey / repartition (first real topic boundary; that's where
KStreamsPayload gets a Coder-based byte[] Serde).
- Watermark manager (comes when the first stateful transform forces it).
- ctx.commit() bundle-boundary tuning (design doc §6).
## Acceptance criteria
- ./gradlew :runners:kafka-streams:check green.
- Impulse → ParDo pipeline translates; topology contains the
executable-stage processor wired to the Impulse output.
- (If embedded env in scope) a trivial DoFn round-trips one element.
cc @je-ik
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]