[
https://issues.apache.org/jira/browse/BEAM-14498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17542193#comment-17542193
]
Yi Hu commented on BEAM-14498:
------------------------------
The DoFn behind PeriodicImpulse calls restriction_tracker.defer_remainder,
which is supposed to make the output PCollection unbounded, so this looks
like a bug.
There is also an unbounded_per_element decorator in the SDK, but its value is
never used, and adding the decorator does not change the boundedness.
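To illustrate the intuition behind deferring the remainder, here is a toy
model in plain Python. It is only a sketch and does not use Beam's API:
ToyRange, process_once and run are hypothetical names made up for this
example, and the integer "clock" stands in for wall-clock time. Each call
emits the firings that are already due and hands the unprocessed part of
the range back to be resumed later, so output keeps arriving over time
instead of being produced in one finite pass.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class ToyRange:
    """Hypothetical restriction: half-open range [start, stop) of firing times."""
    start: int
    stop: int


def process_once(
    restriction: ToyRange, interval: int, now: int
) -> Tuple[List[int], Optional[ToyRange]]:
    """Emit every firing time that is already due, then defer the rest.

    Returns (emitted timestamps, residual restriction), where the residual
    is None once the whole range has been processed.
    """
    emitted = []
    t = restriction.start
    while t < restriction.stop and t <= now:
        emitted.append(t)
        t += interval
    if t < restriction.stop:
        # Analogous in spirit to deferring the remainder: the unprocessed
        # part of the range is handed back to be resumed later.
        return emitted, ToyRange(t, restriction.stop)
    return emitted, None


def run(
    restriction: ToyRange, interval: int, clock_ticks: Iterable[int]
) -> Tuple[List[List[int]], Optional[ToyRange]]:
    """Resume the residual restriction once per clock tick."""
    outputs = []
    residual: Optional[ToyRange] = restriction
    for now in clock_ticks:
        if residual is None:
            break
        emitted, residual = process_once(residual, interval, now)
        outputs.append(emitted)
    return outputs, residual


if __name__ == "__main__":
    batches, leftover = run(ToyRange(0, 10), interval=2, clock_ticks=[0, 3, 5])
    print(batches, leftover)
```

As long as a residual is returned, there is more output still to come,
which is why a transform built this way would be expected to report an
unbounded PCollection.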
> Python sdk's PeriodicImpulse generates a bounded PCollection
> ------------------------------------------------------------
>
> Key: BEAM-14498
> URL: https://issues.apache.org/jira/browse/BEAM-14498
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Yi Hu
> Priority: P2
>
> See the dev mail list thread for details:
> https://lists.apache.org/thread/ps3m0jc0ngqp1y2s0mv2n6hxhvgkr3vw
> The PeriodicImpulse transform in the Java SDK generates an unbounded
> PCollection, while in the Python SDK it generates a bounded PCollection.
> The latter may cause issues in streaming.
> Per Cham: Note that the primary use-case of PeriodicImpulse (according to
> the design doc) was to generate a fixed/bounded input that can slowly
> change over time, but changing over the time dimension would make it
> unbounded.
> It seems we need to make the Python PeriodicImpulse generate an unbounded
> PCollection, in alignment with the Java implementation, and also make sure
> that the change does not break the current implementation of its original
> use case (the stream enrichment problem).