Eliaaazzz commented on PR #38724:
URL: https://github.com/apache/beam/pull/38724#issuecomment-4594146383

   After review, I updated the PR to keep the public surface closer to Beam 
conventions and the Java `UnboundedSource` contract:
   
   - Documented the intended user path as `beam.io.Read(MySource())`, with 
module-level examples for the required `UnboundedSource`, `UnboundedReader`, 
and `CheckpointMark` methods.
   - Removed the public polling option from `ReadFromUnboundedSource`; the 
wrapper now keeps only an internal defer delay for Python SDF rescheduling, 
and Java's bundle-finalization deadline stays a concern of the finalization 
callback.
   - Reworked the restriction coder/provider/DoFn shape so the restriction 
coder derives the checkpoint coder from the decoded source, allowing the 
provider and DoFn to stay module-level and stdlib-pickle friendly.
   - Removed the early `CHANGES.md` announcement until the larger milestone / 
ValidatesRunner coverage is ready.
   - Tightened SDF behavior and tests around source watermark vs. record 
timestamp, EOF watermark advancement to close event-time windows, separation 
of checkpoint state from finalization state, reader close paths, `iobase.Read` 
dispatch, and runner-api `IsBounded.UNBOUNDED`, plus lint cleanup.
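
   For readers less familiar with the Java contract, the three roles named in 
the first bullet can be sketched roughly as below. This is a Beam-free toy, 
not the code in this PR: the class names (`CounterSource`, `CounterReader`, 
`CounterCheckpointMark`) and the snake_case method names are assumptions that 
simply mirror the Java `UnboundedSource` methods, and the finite range exists 
only so the EOF watermark case is visible.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CounterCheckpointMark:
    """Hypothetical checkpoint mark: the next value the reader would emit."""
    next_value: int

    def finalize_checkpoint(self) -> None:
        # A real source would acknowledge consumed records upstream here.
        pass


class CounterReader:
    """Hypothetical reader over the integers [source.start, source.stop)."""

    def __init__(self, source: "CounterSource",
                 mark: Optional[CounterCheckpointMark]) -> None:
        self._stop = source.stop
        # Resume from the checkpoint if one is given, else from the beginning.
        self._next = mark.next_value if mark is not None else source.start
        self._current: Optional[int] = None

    def start(self) -> bool:
        return self.advance()

    def advance(self) -> bool:
        # Returns True only when a new record became current.
        if self._next >= self._stop:
            return False
        self._current = self._next
        self._next += 1
        return True

    def get_current(self) -> int:
        if self._current is None:
            raise ValueError("no current record; call start() first")
        return self._current

    def get_current_timestamp(self) -> float:
        # Toy assumption: the event time of a record equals its value.
        return float(self.get_current())

    def get_watermark(self) -> float:
        # Lower bound on future timestamps; +inf once the range is exhausted
        # so that downstream event-time windows can close.
        return math.inf if self._next >= self._stop else float(self._next)

    def get_checkpoint_mark(self) -> CounterCheckpointMark:
        return CounterCheckpointMark(self._next)

    def close(self) -> None:
        pass


class CounterSource:
    """Hypothetical source; a real one would usually have no upper bound."""

    def __init__(self, start: int, stop: int) -> None:
        self.start, self.stop = start, stop

    def split(self, desired_num_splits: int) -> List["CounterSource"]:
        size = self.stop - self.start
        n = max(1, min(desired_num_splits, size))
        bounds = [self.start + size * i // n for i in range(n + 1)]
        return [CounterSource(lo, hi) for lo, hi in zip(bounds, bounds[1:])]

    def create_reader(
        self, checkpoint_mark: Optional[CounterCheckpointMark] = None
    ) -> CounterReader:
        return CounterReader(self, checkpoint_mark)
```

   In this sketch a reader created from a checkpoint mark resumes at the 
record after the last one read, which is the resume behavior the Java 
contract expects.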
   
   I also compared the current Python wrapper against Java 
`Read.UnboundedSourceAsSDFWrapperFn` and the existing Python bounded-source SDF 
wrapper. The current shape is intentionally a minimal Python/portable-SDF 
equivalent: it preserves split/create-reader/checkpoint/watermark/finalize 
semantics and uses Beam's existing `iobase.Read` dispatch style. For now it 
deliberately omits Java-only pieces such as record-id deduplication, backlog 
progress, reader caching, and runner-initiated dynamic split fractions; those 
remain follow-up work under #19137.
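
   To make the checkpoint/watermark/finalize semantics above concrete, here is 
a rough, Beam-free model of one wrapper invocation. It is only an illustration 
under stated assumptions: `ListReader`, `process_bundle`, and the `acked` list 
are hypothetical names, the integer offset stands in for a checkpoint mark, 
and none of this is the PR's actual implementation.

```python
import math
from typing import Callable, List, Optional, Tuple

Record = Tuple[float, str]  # (event timestamp, value)


class ListReader:
    """Hypothetical reader over in-memory records; the offset is its checkpoint."""

    def __init__(self, records: List[Record], offset: int = 0) -> None:
        self.records = records
        self.offset = offset
        self.current: Optional[Record] = None

    def advance(self) -> bool:
        if self.offset >= len(self.records):
            return False
        self.current = self.records[self.offset]
        self.offset += 1
        return True

    def watermark(self) -> float:
        # Source watermark: a lower bound on the timestamps still to come.
        # It can be lower than the timestamp of the record just read.
        remaining = self.records[self.offset:]
        return min((ts for ts, _ in remaining), default=math.inf)


def process_bundle(
    records: List[Record],
    checkpoint: Optional[int],
    max_records: int,
    acked: List[int],
) -> Tuple[List[Record], Optional[int], float, Callable[[], None]]:
    """Reads up to max_records, then returns outputs, the residual checkpoint,
    the watermark estimate, and a finalization callback."""
    reader = ListReader(records, checkpoint or 0)
    outputs: List[Record] = []
    while len(outputs) < max_records and reader.advance():
        outputs.append(reader.current)

    consumed = reader.offset
    # Resume state: None means the source is drained and nothing is deferred.
    residual = None if consumed >= len(records) else consumed

    def on_finalize() -> None:
        # Finalization state is separate from resume state: it runs only
        # after the runner has durably committed the bundle's output.
        acked.append(consumed)

    return outputs, residual, reader.watermark(), on_finalize
```

   With records `[(5.0, "a"), (3.0, "b"), (9.0, "c")]` and `max_records=1`, the 
first call emits `"a"` at timestamp 5.0 while the watermark is held at 3.0, 
and nothing is acknowledged until the returned callback is invoked.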


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
