Late event is not dropped due to lateness for session window in direct runner - bug or feature?

2022-01-27 Thread Marcin Kuthan
Hi I observed that on Direct Runner when watermark is advanced programatically (in tests) late event is not discarded for session window. The aggregation: type User = String type Action = String def activitiesInSessionWindow( userActions: SCollection[(User, Action)], gapDuration:

Re: [Question] Can Apache beam used for chaining IO transforms

2022-01-27 Thread Alexey Romanenko
Hi Wei, Thanks for details! Yes, defentively let’s chat about this in the related Jira issue. — Alexey > On 27 Jan 2022, at 00:13, Wei Hsia ☁ wrote: > > Hi Alexey, > > Thanks Cham! > > Sorry, i've been working on this in a vacuum, my apologies. > > My approach was slightly different,

Python pipeline on Flink not finishing cleanly

2022-01-27 Thread Janek Bevendorff
Hi, When I run a Python pipeline with multiple concurrent TaskManagers on Flink, the job hardly ever (or never) finishes properly. At the end, Beam (or Flink?) always throws a seemingly random gRPC IllegalStateException after my last GlobalCombine, so Beam goes into some weird error handling

Re: Unable to record metrics on Flink cluster from Python SDK

2022-01-27 Thread Janek Bevendorff
Hi Kyle, Hi Janek, I think this is a known issue: https://issues.apache.org/jira/browse/BEAM-11212 Yes, that does indeed seem to be the same or at least a similar issue. I was looking, but I didn't find that report earlier, the Jira search bar is terrible. Another thing I couldn't and