[
https://issues.apache.org/jira/browse/BEAM-11906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kenneth Knowles updated BEAM-11906:
-----------------------------------
Status: Open (was: Triage Needed)
> No trigger early repeatedly for session windows
> -----------------------------------------------
>
> Key: BEAM-11906
> URL: https://issues.apache.org/jira/browse/BEAM-11906
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Affects Versions: 2.23.0, 2.28.0
> Reporter: Ning Kang
> Priority: P1
>
> Originated from:
> https://stackoverflow.com/questions/66381608/apache-beam-does-not-trigger-early-repeatedly-for-session-windows-on-google-data
> The following pipeline fires early after each element when running locally
> using DirectRunner, but there are no early triggers when running on google
> cloud dataflow. On dataflow it triggers only after the session window has
> closed.
> {code:python}
> ( p
> | 'read' >> beam.io.ReadFromPubSub(subscription =
> 'projects/xxx/subscriptions/xxx-sub')
> | 'json' >> beam.Map(lambda x: json.loads(x.decode('utf-8')))
> | 'kv' >> beam.Map(lambda x: (x['id'], x['amount']))
> | 'window' >> beam.WindowInto(window.Sessions(15*60),
> trigger=trigger.Repeatedly(trigger.AfterCount(1)),
> accumulation_mode=AccumulationMode.ACCUMULATING)
> | 'group' >> beam.GroupByKey()
> | 'log' >> beam.Map(lambda x: logging.info(x))
> )
> {code}
> Apache Beam versions tried: 2.23 and 2.28.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)