Knowing when a window is "closed" is based upon having the watermark advance which is based upon even time.
On Wed, Jun 24, 2020 at 9:05 AM Sunny, Mani Kolbe <[email protected]> wrote: > Hi Luke, > > > > Sorry forgot to mention, we override the event timestamp to current using > WithTimestamps.of(Instant.now()) as we don’t really care actual event time. > So FixedWindow closes when current time passes window.end time. > > > > It is a standard practice in oozie world to trigger downstream jobs based > on marker files. So thought someone might have encountered this before > problem before me. > > > > Regards, > > Mani > > > > > > > > *From:* Luke Cwik <[email protected]> > *Sent:* Wednesday, June 24, 2020 4:23 PM > *To:* user <[email protected]> > *Subject:* Re: How to create a marker file when each window completes? > > > > *CAUTION:* This email originated from outside of D&B. Please do not click > links or open attachments unless you recognize the sender and know the > content is safe. > > > > What do you consider complete? (I ask this since you are using element > count and processing time triggers) > > > > Generally the idea is that you can feed the output > PCollection<WriteFilesResult> to a stateful DoFn with > an @OnWindowExpiration setup but this works only if you completeness is > controlled by watermark advancement. Note that @OnWindowExpiration is new > to Beam so it may not yet work with Spark. > > > > On Wed, Jun 24, 2020 at 4:22 AM Sunny, Mani Kolbe <[email protected]> wrote: > > Hello, > > > > We are developing a beam pipeline which runs on SparkRunner on streaming > mode. This pipeline read from Kinesis, do some translations, filtering and > finally output to S3 using AvroIO writer. We are using Fixed windows with > triggers based on element count and processing time intervals. Outputs path > is partitioned by window start timestamp. allowedLateness=0sec > > > > This is all working fine. But to support some legacy downstream services, > we need to ensure that output partitions has marker file to indicate that > data completeness and is ready for downstream conception. Something like > hadoop’s .SUCCESS file or a .COMPLETE. Is there way to create such a marker > file in Beam on window closing event? > > > > Regards, > > Mani > >
