Thanks for the pointer, Luke. In our case it does impact downstream steps, as they are not seeing data. We currently have to drain and redeploy the pipeline; the drain closes the windows, and all elements are emitted then.
> On 31 Mar 2022, at 17:50, Luke Cwik <[email protected]> wrote:
>
> Another user did report an issue with data freshness when they used
> Impulse/Create + Wait.on, as seen here [1]. Note that the original pipeline was
> never impacted from a correctness point of view; it was a display issue in
> the Dataflow UI. The fix was to change the code to be something like:
>
>     impulseOut
>         .apply(WithTimestamps.of(... return GlobalWindow.INSTANCE.maxTimestamp())) // pseudo code
>         .apply(Wait.on(results))
>
> The workaround solved the UI timestamp issue in Dataflow with how data
> freshness appeared.
>
> 1: https://github.com/hengfengli/beam/commit/e23b78581635f863cd392e3bebf511268bc3ed52#r69435778
>
> On Thu, Mar 31, 2022 at 2:21 PM Reynaldo Baquerizo <[email protected]> wrote:
>
> Hi,
>
> One of our pipelines runs into data freshness issues sometimes. It is not
> consistent, so we are not sure what to attribute it to; however, the steps
> where it originates are `Wait.on` transforms on data written to Bigtable. We
> are using the BigtableIO connector. My hypothesis was that some writes fail,
> so the window is not closed and signals are not emitted. Does this sound
> plausible?
>
> Appreciate any pointers on how I could debug this.
>
> Thanks,
>
> Reynaldo
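For anyone else hitting this: a fuller sketch of the quoted workaround might look like the following. This is only my reading of the pseudo-code above, not a tested fix; the `Create.of("done")` signal stands in for the real BigtableIO write result, and the class/variable names are my own.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.Impulse;
import org.apache.beam.sdk.transforms.Wait;
import org.apache.beam.sdk.transforms.WithTimestamps;
import org.apache.beam.sdk.transforms.windowing.GlobalWindow;
import org.apache.beam.sdk.values.PCollection;

public class WaitOnTimestampWorkaround {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create();

    // Stand-in for the real signal collection; in the pipeline under
    // discussion this would come from the Bigtable write results.
    PCollection<String> results = p.apply("Signal", Create.of("done"));

    p.apply(Impulse.create())
        // Move the impulse element's timestamp to the end of the global
        // window, so the watermark/data-freshness reporting is not pinned
        // to the minimum timestamp that Impulse emits by default.
        .apply(WithTimestamps.of(ignored -> GlobalWindow.INSTANCE.maxTimestamp()))
        .apply(Wait.on(results));

    p.run().waitUntilFinish();
  }
}
```

Note that, per Luke's message, this only changed how data freshness appeared in the Dataflow UI; it is not a correctness fix for windows that never close.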
