Re: Streaming pipeline "most-recent" join

2020-07-09 Thread Reza Rokni
Hiya, I never got a chance to finish this one; maybe I will get some time over the summer break. I think it will help with your use case, though: https://github.com/rezarokni/beam/blob/BEAM-7386/sdks/java/extensions/timeseries/src/main/java/org/apache/beam/sdk/extensions/timeseries/joins/BiTemporalS

Streaming pipeline "most-recent" join

2020-07-09 Thread Harrison Green
Hi Beam devs, I'm working on a streaming pipeline where we need to do a "most-recent" join between two PCollections. Specifically, something like:

    out = pcoll1 | beam.Map(lambda a,b: (a,b), b=beam.pvalue.AsSingleton(pcoll2))

The goal is to join each value in pcoll1 with only the most recent valu
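
To make the intended "most-recent" semantics concrete, here is a minimal plain-Python sketch. It is not Beam code and not the approach from either message. It assumes each element carries an explicit timestamp and that "most recent" means the latest right-hand value whose timestamp is at or before the left-hand element's timestamp. The function name `most_recent_join` and the `(timestamp, value)` tuple layout are hypothetical, chosen only for illustration.

```python
from bisect import bisect_right


def most_recent_join(left, right):
    """Pair each left value with the most recent right value.

    Both inputs are lists of (timestamp, value) tuples (an assumed layout).
    For every left element, pick the right value with the largest timestamp
    that is <= the left timestamp, or None if no such value exists yet.
    """
    # Sort the right side by timestamp so it can be binary-searched.
    right_sorted = sorted(right, key=lambda tv: tv[0])
    right_times = [t for t, _ in right_sorted]

    joined = []
    for ts, value in left:
        # Index of the first right timestamp strictly greater than ts.
        idx = bisect_right(right_times, ts)
        latest = right_sorted[idx - 1][1] if idx > 0 else None
        joined.append((value, latest))
    return joined


if __name__ == "__main__":
    left = [(1, "a"), (5, "b"), (9, "c")]
    right = [(0, "x"), (4, "y"), (8, "z")]
    print(most_recent_join(left, right))
```

In a real streaming pipeline the right-hand side is unbounded and arrives out of order, so this batch-style lookup only illustrates the desired output, not how to window or trigger the side input.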