One solution would be to stabilize the data read from the Redshift DB. To
this end, sending your side input through a Reshuffle transform [1] before
it is turned into a view should work for some runners. Robin is working on
a more portable solution for supporting stable input [2].
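
As a rough, untested sketch of the Reshuffle idea with the Java SDK: the
Create.of inputs below are stand-ins for your unbounded source and for
whatever read you do against Redshift, and the class, method, and step
names are made up for illustration. It assumes the lookup data fits in
memory as a map and that both collections are in the global window.

```java
import java.util.Map;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.Reshuffle;
import org.apache.beam.sdk.transforms.View;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionView;

// Hypothetical class name; not part of Beam.
public class StableSideInputSketch {

  // Keeps only the events whose value appears as a key in the lookup data.
  static PCollection<String> filterByLookup(
      PCollection<String> events, PCollection<KV<String, String>> lookup) {

    // Send the lookup data through Reshuffle before building the view. On
    // runners where Reshuffle acts as a checkpoint, retried bundles should
    // then see the same lookup data. This is runner-dependent.
    final PCollectionView<Map<String, String>> lookupView =
        lookup
            .apply("StabilizeLookup", Reshuffle.<KV<String, String>>viaRandomKey())
            .apply("LookupAsMap", View.<String, String>asMap());

    return events.apply(
        "FilterByLookup",
        ParDo.of(
                new DoFn<String, String>() {
                  @ProcessElement
                  public void processElement(ProcessContext c) {
                    // Emit the element only if the side input knows its key.
                    if (c.sideInput(lookupView).containsKey(c.element())) {
                      c.output(c.element());
                    }
                  }
                })
            .withSideInputs(lookupView));
  }

  public static void main(String[] args) {
    Pipeline p = Pipeline.create();

    // Stand-in for the unbounded collection from the question.
    PCollection<String> events = p.apply("Events", Create.of("a", "b", "c"));

    // Stand-in for the rows read from Redshift.
    PCollection<KV<String, String>> lookup =
        p.apply("Lookup", Create.of(KV.of("a", "x"), KV.of("c", "y")));

    filterByLookup(events, lookup);
    p.run().waitUntilFinish();
  }
}
```

Note that this only makes the side input stable across retries; it does
not by itself pick up later changes in Redshift.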

Thanks,
Cham

[1]
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Reshuffle.java#L64
[2]
https://lists.apache.org/thread.html/f8093ad5512a7fce668550e1f9cf0921c5d1e7ff6656c7a6c9950165@%3Cdev.beam.apache.org%3E


On Mon, Jul 30, 2018 at 6:16 AM Jean-Baptiste Onofré <[email protected]>
wrote:

> Hi Jose,
>
> So basically, you create two PCollections with the same keys and then
> you join/filter/flatten?
>
> Regards
> JB
>
> On 30/07/2018 15:09, Jose Bermeo wrote:
> > Hi, question guys.
> >
> > I have to filter an unbounded collection based on data from a Redshift
> > DB. I cannot use a side input because the Redshift data could change.
> > One way to do it would be to group common elements, run a query to
> > filter each group, and finally flatten the pipeline again. Do you know
> > if this is the best way to do it? And how would I run the query
> > against Redshift?
> >
> > Thanks.
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
