Hi Sigalit,

There are several options; which one is best depends on the actual use case. You might:

 a) use the CoGroupByKey transform with a fixed window [1], or

 b) use stateful processing [2] with a timer that triggers your output

Which one is better depends on whether you can use the windowing semantics provided by Beam [3]. The CoGBK approach requires windowing, while the stateful approach also works with globally windowed PCollections.
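To make the windowing idea behind option a) more concrete, here is a minimal plain-Java sketch of what "co-grouping within a fixed window" means. It is not Beam API code: the class `FixedWindowCoGroupSketch`, the `Event` type, and the helper names are all made up for illustration. It assumes windows are aligned to epoch 0, which is roughly how fixed windows without an offset behave. Elements from two inputs land in the same group only when they share both the key and the window.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; all names here are hypothetical, not Beam API.
class FixedWindowCoGroupSketch {

    // One keyed, timestamped element from an input source.
    static final class Event {
        final String key;
        final String value;
        final long timestampMillis;

        Event(String key, String value, long timestampMillis) {
            this.key = key;
            this.value = value;
            this.timestampMillis = timestampMillis;
        }
    }

    // Start of the fixed window containing the timestamp (windows aligned to epoch 0).
    static long windowStart(long timestampMillis, long windowSizeMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, windowSizeMillis);
    }

    // Groups values from two inputs by "<windowStart>/<key>". Each group holds two
    // lists: index 0 for values from the first input, index 1 for the second.
    static Map<String, List<List<String>>> coGroup(
            List<Event> first, List<Event> second, long windowSizeMillis) {
        Map<String, List<List<String>>> result = new TreeMap<>();
        addAll(result, first, 0, windowSizeMillis);
        addAll(result, second, 1, windowSizeMillis);
        return result;
    }

    private static void addAll(Map<String, List<List<String>>> result,
                               List<Event> events, int side, long windowSizeMillis) {
        for (Event e : events) {
            String groupId = windowStart(e.timestampMillis, windowSizeMillis) + "/" + e.key;
            List<List<String>> sides = result.computeIfAbsent(groupId, id -> {
                List<List<String>> empty = new ArrayList<>();
                empty.add(new ArrayList<>());
                empty.add(new ArrayList<>());
                return empty;
            });
            sides.get(side).add(e.value);
        }
    }

    public static void main(String[] args) {
        long fiveMinutes = 5 * 60 * 1000L;
        List<Event> first = List.of(
                new Event("user1", "a", 10_000L),
                new Event("user1", "b", 310_000L));
        List<Event> second = List.of(new Event("user1", "x", 20_000L));

        // One line per (window, key) group; each group would become one output record.
        coGroup(first, second, fiveMinutes)
                .forEach((id, sides) -> System.out.println(id + " -> " + sides));
    }
}
```

In this sketch, "a" and "x" end up in the same group because they share a key and a five-minute window, while "b" falls into the next window on its own. In a real pipeline the window size would be your X minutes, and each group would be written to Kafka once its window is complete.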

Hope this helps. Feel free to ask any further questions you might have.

Best,

 Jan

[1] https://beam.apache.org/documentation/transforms/java/aggregation/cogroupbykey/

[2] https://beam.apache.org/blog/stateful-processing/

[3] https://beam.apache.org/documentation/programming-guide/#windowing

On 7/25/22 14:09, Sigalit Eliazov wrote:
Hi all,

I have a pipeline that reads input from a few sources, combines them and creates a view of the data.
I need to send an output to Kafka every X minutes.
What will be the best way to implement this?
Thanks
Sigalit
