Knowing when a window is "closed" is based upon having the watermark
advance which is based upon even time.

On Wed, Jun 24, 2020 at 9:05 AM Sunny, Mani Kolbe <[email protected]> wrote:

> Hi Luke,
>
>
>
> Sorry forgot to mention, we override the event timestamp to current using
> WithTimestamps.of(Instant.now()) as we don’t really care actual event time.
> So FixedWindow closes when current time passes window.end time.
>
>
>
> It is a standard practice in oozie world to trigger downstream jobs based
> on marker files. So thought someone might have encountered this before
> problem before me.
>
>
>
> Regards,
>
> Mani
>
>
>
>
>
>
>
> *From:* Luke Cwik <[email protected]>
> *Sent:* Wednesday, June 24, 2020 4:23 PM
> *To:* user <[email protected]>
> *Subject:* Re: How to create a marker file when each window completes?
>
>
>
> *CAUTION:* This email originated from outside of D&B. Please do not click
> links or open attachments unless you recognize the sender and know the
> content is safe.
>
>
>
> What do you consider complete? (I ask this since you are using element
> count and processing time triggers)
>
>
>
> Generally the idea is that you can feed the output
> PCollection<WriteFilesResult> to a stateful DoFn with
> an @OnWindowExpiration setup but this works only if you completeness is
> controlled by watermark advancement. Note that @OnWindowExpiration is new
> to Beam so it may not yet work with Spark.
>
>
>
> On Wed, Jun 24, 2020 at 4:22 AM Sunny, Mani Kolbe <[email protected]> wrote:
>
> Hello,
>
>
>
> We are developing a beam pipeline which runs on SparkRunner on streaming
> mode. This pipeline read from Kinesis, do some translations, filtering and
> finally output to S3 using AvroIO writer. We are using Fixed windows with
> triggers based on element count and processing time intervals. Outputs path
> is partitioned by window start timestamp. allowedLateness=0sec
>
>
>
> This is all working fine. But to support some legacy downstream services,
> we need to ensure that output partitions has marker file to indicate that
> data completeness and is ready for downstream conception. Something like
> hadoop’s .SUCCESS file or a .COMPLETE. Is there way to create such a marker
> file in Beam on window closing event?
>
>
>
> Regards,
>
> Mani
>
>

Reply via email to