Hi Sigalit,
there might be several options, which one is the best would depend on
the actual use-case. You might:
a) use the CoGroupByKey transform with a fixed window [1], or
b) use stateful processing [2] with timer triggering your output
Which one is the best depends on if you can use windowing semantics
provided by Beam [3]. Windowing is needed for the CoGBK approach, the
stateful approach works with globally windowed PCollections.
Hope this help, feel free to ask more questions you might have.
Best,
Jan
[1]
https://beam.apache.org/documentation/transforms/java/aggregation/cogroupbykey/
[2] https://beam.apache.org/blog/stateful-processing/
[3] https://beam.apache.org/documentation/programming-guide/#windowing
On 7/25/22 14:09, Sigalit Eliazov wrote:
Hi all,
I have a pipeline that reads input from a few sources, combines them
and creates a view of the data.
I need to send an output to kafka every X minutes.
What will be the best way to implement this?
Thanks
Sigalit