MansiSingh17 commented on issue #18479: URL: https://github.com/apache/beam/issues/18479#issuecomment-3938357144
Hi @je-ik and @junaiddshaukat,

I've been studying the design document for the portable Kafka Streams runner and wanted to share a few observations and questions. I'm also interested in contributing to this project for GSoC 2026. I've been contributing to the Beam TypeScript SDK (merged PRs #37214 and #37466) and have been ramping up on the portability framework.

- On the Impulse bootstrap topic (Section 4.1): The `__beam_impulse` topic approach makes sense for single-pipeline scenarios, but I'm wondering how multi-pipeline isolation works. If two pipelines run concurrently, do they share the same topic and state store? Could this cause interference between pipelines, or is the applicationId sufficient to isolate state per pipeline?
- On watermark advancement without data (Section 12, Open Question 3): This is listed as an open question, but I didn't see a proposed direction. The Flink runner uses a dedicated watermark emit thread for this. Is that a viable approach here, or does Kafka Streams' single-threaded processor model make that problematic?
- On error handling for ExecutableStage: I didn't see anything in the design about what happens when the SDK harness fails to process a record. Is the expectation that Kafka's retry mechanism handles this, or would we need explicit dead letter queue handling similar to other runners?
- On the ValidatesRunner test scope (Section 10): The doc mentions running "a subset" of ValidatesRunner tests. Would it make sense to define the specific test classes that should pass as part of the GSoC deliverable? That would make the acceptance criteria clearer.

Happy to dig into any of these further or help with the design doc if useful.
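To make the multi-pipeline isolation question concrete: as far as I understand, Kafka Streams prefixes its internal changelog and repartition topics with the `application.id`, but a user-named topic such as `__beam_impulse` is not prefixed automatically. One possible direction, sketched below in plain Java, is to derive a per-pipeline impulse topic name from the application id. The `ImpulseTopics` class, the `impulseTopicFor` method, and the `<applicationId>-__beam_impulse` naming scheme are all hypothetical illustrations, not something taken from the design doc or from the Beam or Kafka Streams APIs.

```java
import java.util.Objects;

/** Hypothetical helper for illustration only; not part of Beam or Kafka Streams. */
public final class ImpulseTopics {
    // Base topic name as mentioned in the design doc (Section 4.1).
    public static final String BASE_TOPIC = "__beam_impulse";

    private ImpulseTopics() {}

    /**
     * Derives a per-pipeline impulse topic by prefixing the application id,
     * mirroring how Kafka Streams namespaces its internal topics.
     * The naming scheme itself is an assumption.
     */
    public static String impulseTopicFor(String applicationId) {
        Objects.requireNonNull(applicationId, "applicationId");
        if (applicationId.isEmpty()) {
            throw new IllegalArgumentException("applicationId must not be empty");
        }
        return applicationId + "-" + BASE_TOPIC;
    }

    public static void main(String[] args) {
        // Two concurrently running pipelines end up with distinct topics.
        System.out.println(impulseTopicFor("pipeline-a"));
        System.out.println(impulseTopicFor("pipeline-b"));
    }
}
```

With a scheme like this, two pipelines would only share an impulse topic if they were configured with the same application id. Whether that is the intended behavior is exactly the open question.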
