Re: Caching data inside DoFn

2020-06-26 Thread Praveen K Viswanathan
Thank you Luke. I will work on implementing my use case with Stateful ParDo itself and come back if I have any questions. Appreciate your help.

On Fri, Jun 26, 2020 at 8:14 AM Luke Cwik wrote:
> Use a stateful DoFn and buffer the elements in a bag state. You'll want to
> use a key that contains

Re: Caching data inside DoFn

2020-06-26 Thread Luke Cwik
Use a stateful DoFn and buffer the elements in a bag state. You'll want to use a key that contains enough data to match your join condition. For example, if you're trying to match on a customerId, then you would do something like: element 1 -> ParDo(extract customer id) -> KV -
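
To make the buffering idea above concrete, here is a minimal sketch in plain Python. It is not Beam code and does not use the Beam state API; it only imitates what per-key bag state does, namely keeping one bag of previously seen values per key so that later elements with the same key can be matched against them. The class `KeyedBuffer`, the helper `extract_customer_id`, and the element shape (a dict with a `customer_id` field) are all hypothetical names chosen for illustration.

```python
from collections import defaultdict


class KeyedBuffer:
    """Toy stand-in for per-key bag state: one list of buffered values per key."""

    def __init__(self):
        self._bags = defaultdict(list)

    def process(self, key, value):
        """Pair the new value with every value already buffered for this key,
        then add the new value to that key's bag."""
        matches = [(earlier, value) for earlier in self._bags[key]]
        self._bags[key].append(value)
        return matches


def extract_customer_id(element):
    # Hypothetical element shape: a dict carrying a "customer_id" field.
    # Returns a (key, value) pair, mirroring the "extract customer id" step.
    return element["customer_id"], element


elements = [
    {"customer_id": "c1", "event": "order"},
    {"customer_id": "c2", "event": "order"},
    {"customer_id": "c1", "event": "payment"},
]

buffer = KeyedBuffer()
joined = []
for element in elements:
    key, value = extract_customer_id(element)
    joined.extend(buffer.process(key, value))
```

In a real pipeline the runner would hold the bag per key (and per window) on your behalf, so the dictionary here is only an assumption made to keep the sketch self-contained. Note that nothing in this sketch ever clears the bags; deciding when buffered elements can be dropped is a separate question.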

Caching data inside DoFn

2020-06-26 Thread Praveen K Viswanathan
Hi All - I have a DoFn which generates data (a KV pair) for each element that it processes. It also has to read from that KV for other elements based on a key, which means the KV has to retain all the data that gets added to it while processing every element. I was thinking about the "slow-c