Thank you for looking into this!

Our current workaround for updating the side input data is to restart the
pipeline. This hasn't been a frequent requirement so far, but it will become
more common in the future. We've considered using a Guava cache, but a
solution within the Beam programming model would be great.
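
A minimal sketch of the kind of in-process cache mentioned above, using only
the JDK rather than Guava. The names RefreshingLookup, loader, maxAge and
clock are hypothetical and are not taken from the actual pipeline; the loader
stands in for whatever reads the lookup data from its source.

```java
import java.time.Duration;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Holds a lookup map and reloads it once it is older than maxAge. */
final class RefreshingLookup<K, V> {
  // Hypothetical loader: reads the full lookup table from its source.
  private final Supplier<Map<K, V>> loader;
  private final long maxAgeNanos;
  // Injectable clock (nanoseconds) so the expiry logic can be exercised in tests.
  private final LongSupplier clock;

  private Map<K, V> current;   // null until the first load
  private long loadedAtNanos;

  RefreshingLookup(Supplier<Map<K, V>> loader, Duration maxAge, LongSupplier clock) {
    this.loader = loader;
    this.maxAgeNanos = maxAge.toNanos();
    this.clock = clock;
  }

  /** Returns the value for key, reloading the whole table first if it has expired. */
  synchronized V get(K key) {
    long now = clock.getAsLong();
    if (current == null || now - loadedAtNanos >= maxAgeNanos) {
      current = loader.get();
      loadedAtNanos = now;
    }
    return current.get(key);
  }
}
```

In a DoFn, an instance like this could be created in a method annotated with
@Setup, passing System::nanoTime as the clock, and consulted from
@ProcessElement. Each worker would then refresh on its own schedule, so
updates would only be eventually consistent across workers.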

On 2018/12/18 17:20:43, Scott Wegner <[email protected]> wrote: 
> Hi Lucas,
> 
> Thanks for the explanation and repro example. This is a bug in the Dataflow
> service; a fix is in progress and once rolled out will apply to all SDK
> versions. I've filed BEAM-6261 to track:
> https://issues.apache.org/jira/browse/BEAM-6261
> 
> On Wed, Dec 12, 2018 at 4:31 PM Bordwell, Lucas-CW <
> [email protected]> wrote:
> 
> > Greetings,
> >
> >
> >
> > I am trying to implement the “Slowly-changing lookup cache” pattern
> > described in this blog post:
> > https://cloud.google.com/blog/products/gcp/guide-to-common-cloud-dataflow-use-case-patterns-part-1
> > but the side inputs do not update when running on the DataflowRunner.
> > I am fine with the updates being eventually consistent in Dataflow.
> >
> >
> >
> > I see that there is an existing issue,
> > https://issues.apache.org/jira/browse/BEAM-2155, that seems to be
> > related, but I also saw a comment by Kenn Knowles on this Stack Overflow
> > answer: https://stackoverflow.com/a/41600466/2048988 where he mentions
> > that there was a side-input caching bug which was fixed. Has anyone else
> > gotten side inputs to update on Dataflow using a pattern similar to the
> > one above?
> >
> >
> >
> > Here is a simplified example pipeline project I created to illustrate the
> > issue using Beam 2.8.0: https://github.com/lbordwell/sideinput
> >
> >
> >
> > Thank you,
> >
> > Lucas Bordwell
> >
> 
> 
> -- 
> Got feedback? tinyurl.com/swegner-feedback
> 
