[ 
https://issues.apache.org/jira/browse/BEAM-8088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chad Dombrova updated BEAM-8088:
--------------------------------
    Summary: [python] PCollection boundedness should be tracked and propagated  
(was: PCollection boundedness should be tracked and propagated)

> [python] PCollection boundedness should be tracked and propagated
> -----------------------------------------------------------------
>
>                 Key: BEAM-8088
>                 URL: https://issues.apache.org/jira/browse/BEAM-8088
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py-core
>            Reporter: Chad Dombrova
>            Assignee: Chad Dombrova
>            Priority: Major
>             Fix For: 2.16.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> As far as I can tell Python does not care about boundedness of PCollections 
> even in streaming mode, but external transforms _do_.  In my ongoing effort 
> to get PubsubIO external transforms working I discovered that I could not 
> generate an unbounded write. 
> My pipeline looks like this:
> {code:python}
>     (
>         pipe
>         | 'PubSubInflow' >> 
> external.pubsub.ReadFromPubSub(subscription=subscription, 
> with_attributes=True)
>         | 'PubSubOutflow' >> 
> external.pubsub.WriteToPubSub(topic=OUTPUT_TOPIC, with_attributes=True)
>     )
> {code}
> The PCollections returned from the external Read are Unbounded, as expected, 
> but python is responsible for creating the intermediate PCollection, which is 
> always Bounded, and thus external Write is always Bounded. 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

Reply via email to