junaiddshaukat opened a new issue, #38743:
URL: https://github.com/apache/beam/issues/38743

   Tracking issue: #18479
   Depends on: #38616 (translation framework + Impulse, merged via #38689)
   
   ## Summary
   Third sub-issue. Adds the ExecutableStage translator 
(beam:runner:executable_stage:v1) so fused stateless user code (ParDo etc.) 
runs in the SDK harness over the Fn API and its outputs flow back into the 
Kafka Streams topology. Per design doc §4.2.
   
   ## Scope
   - ExecutableStageTranslator: register ExecutableStage.URN, add a
     processor node wired to the parent via the PCollection→processor map.
   - ExecutableStageProcessor (Kafka Streams Processor): on init build a
     StageBundleFactory from the ExecutableStagePayload + JobInfo; for each
     KStreamsPayload data record push the WindowedValue to the harness input
     receiver and forward harness outputs downstream as data payloads; on a
     watermark payload flush the open bundle and propagate the watermark.
   - SDK-harness wiring via runners-java-fn-execution
     (ExecutableStageContext.Factory, StageBundleFactory, RemoteBundle,
     OutputReceiverFactory, FnDataReceiver) — same surface Flink/Spark use.
   
   ## Open question (need direction)
   ExecutableStage needs a running SDK harness. For unit tests, do we
   (a) use the embedded/loopback environment in-process, or 
   (b) keep this PR to structural translation tests via TopologyTestDriver and 
defer true end-to-end execution to a later ValidatesRunner sub-issue? Leaning 
(b) + a small embedded smoke test if it's cheap.
   
   ## Out of scope (later sub-issues)
   - State, timers, side inputs.
   - GroupByKey / repartition (first real topic boundary; that's where
     KStreamsPayload gets a Coder-based byte[] Serde).
   - Watermark manager (comes when the first stateful transform forces it).
   - ctx.commit() bundle-boundary tuning (design doc §6).
   
   ## Acceptance criteria
   - ./gradlew :runners:kafka-streams:check green.
   - Impulse → ParDo pipeline translates; topology contains the
     executable-stage processor wired to the Impulse output.
   - (If embedded env in scope) a trivial DoFn round-trips one element.
   
   cc @je-ik
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to