[ 
https://issues.apache.org/jira/browse/BEAM-11906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-11906:
-----------------------------------
    Issue Type: Bug  (was: Improvement)

> No trigger early repeatedly for session windows
> -----------------------------------------------
>
>                 Key: BEAM-11906
>                 URL: https://issues.apache.org/jira/browse/BEAM-11906
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>    Affects Versions: 2.23.0, 2.28.0
>            Reporter: Ning Kang
>            Priority: P1
>
> Originated from: 
> https://stackoverflow.com/questions/66381608/apache-beam-does-not-trigger-early-repeatedly-for-session-windows-on-google-data
> The following pipeline fires early after each element when running locally 
> using DirectRunner, but there are no early triggers when running on google 
> cloud dataflow. On dataflow it triggers only after the session window has 
> closed.
> {code:python}
> ( p
>         | 'read'   >> beam.io.ReadFromPubSub(subscription = 
> 'projects/xxx/subscriptions/xxx-sub')
>         | 'json'   >> beam.Map(lambda x: json.loads(x.decode('utf-8')))
>         | 'kv'     >> beam.Map(lambda x: (x['id'], x['amount']))
>         | 'window' >> beam.WindowInto(window.Sessions(15*60), 
> trigger=trigger.Repeatedly(trigger.AfterCount(1)), 
> accumulation_mode=AccumulationMode.ACCUMULATING)
>         | 'group'  >> beam.GroupByKey()
>         | 'log'    >> beam.Map(lambda x: logging.info(x))
> )
> {code}
> Apache Beam versions tried: 2.23 and 2.28.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to