Hi everyone,

My name is Elia. I am preparing a proposal for GSoC 2026 to tackle the
project idea originally proposed by Yi Hu: bringing native streaming
primitives (specifically the UnboundedSource wrapper and the Watch
transform) to the Python SDK.

*The Problem:* Currently, Python developers lack convenient primitives for
continuous data ingestion. While the Java SDK has mature tools like
UnboundedSource and Watch, Python users often have to implement complex
Splittable DoFn (SDF) logic from scratch just to achieve basic streaming IO
patterns (e.g., polling a model registry or tracking a growing directory).

*The Proposal:* Based on this GSoC project idea, I have drafted a proposal
to implement an UnboundedSource wrapper built on SDF and a native Watch
transform for periodic polling.
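
As a rough illustration of the polling semantics a Watch-style transform is
meant to provide, here is a minimal sketch in plain Python. It is not Beam
code and not the proposed API: `watch_new_items`, `poll_fn`, and `max_polls`
are hypothetical names chosen for this example, and the "directory" is
simulated with a list of snapshots.

```python
from typing import Callable, Hashable, Iterable, Iterator, List


def watch_new_items(
    poll_fn: Callable[[], Iterable[Hashable]],
    max_polls: int,
) -> Iterator[List[Hashable]]:
    """Yield, for each poll round, only the items not seen in earlier rounds.

    Hypothetical helper for illustration only; a real transform would also
    wait between polls and checkpoint the `seen` state.
    """
    seen = set()
    for _ in range(max_polls):
        fresh = []
        for item in poll_fn():
            if item not in seen:
                seen.add(item)
                fresh.append(item)
        yield fresh


# Simulated growing directory: each poll returns the full current listing.
snapshots = iter([
    ["a.txt"],
    ["a.txt", "b.txt"],
    ["a.txt", "b.txt", "c.txt"],
])
rounds = list(watch_new_items(lambda: next(snapshots), max_polls=3))
print(rounds)
```

Each round reports only the newly appeared files, which is the behavior
users currently have to hand-write with SDF logic.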

The current draft incorporates some helpful early feedback from Yi Hu. I am
now sharing it here to gather broader community feedback, including
structural reviews and any concerns.

You can find the full proposal and technical design here:
https://docs.google.com/document/d/1gi7nqviUjvLIUdeg1I7KslW9qunD1lKEarMMONhyZeQ/edit?tab=t.0

I would deeply appreciate any thoughts or suggestions from the community.
Thank you for your time!

Best regards,

Elia (Zhenyu Liu)
