We need to aggregate in precisely row order. Is there a safe way to do
this? Maybe with some sort of row time sequence number?
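
To make the sequence number idea concrete, here is a minimal sketch in plain Scala (no Flink dependency). It assumes that "compaction" means keeping only the latest version of a row, and that each row carries a per-key, monotonically increasing `seq` field. The `Row` type, the `seq` field and the helper names are hypothetical and only for illustration; they are not taken from our actual `CompactionAggregate`.

```scala
object SequencedCompactionSketch {
  // Hypothetical row type: `seq` is assumed to be assigned upstream and to
  // increase monotonically for each docId.
  final case class Row(docId: String, seq: Long, payload: String)

  // Fold step: keep whichever row has the higher sequence number, so the
  // result does not depend on the order in which rows arrive.
  def add(acc: Option[Row], row: Row): Option[Row] = acc match {
    case Some(current) if current.seq >= row.seq => acc
    case _                                       => Some(row)
  }

  // Combine two partial results, again keeping the higher sequence number.
  def merge(a: Option[Row], b: Option[Row]): Option[Row] = (a, b) match {
    case (Some(x), Some(y)) => if (x.seq >= y.seq) a else b
    case (None, _)          => b
    case (_, None)          => a
  }

  // Compact a batch of rows for one key into the latest row, if any.
  def compact(rows: Seq[Row]): Option[Row] =
    rows.foldLeft(Option.empty[Row])(add)
}
```

With an accumulator shaped like this, arrival order within a window would stop mattering, because the sequence number decides which row wins rather than the order of `add` calls.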

As written in another email, we're currently doing the following set of
operations:

    val compactedUserDocsStream = userDocsStream
      .window(TumblingProcessingTimeWindows.of(Time.seconds(1)))
      .aggregate(new CompactionAggregate())

My concern is what happens when we restore from a checkpoint or savepoint:
I don't understand how the window gets checkpointed or how window alignment
works between runs of a job. Will the window just start over from scratch
and re-process any rows that may have been in flight but had not finished
processing in the previous run's last window?

If so, then I guess everything will arrive in row order like we want it to.
But if a window gets checkpointed with its previous proctime, then it may
be misaligned in the next run and drop rows that were in that window.
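
To spell out what I mean by alignment, here is a small sketch of how a tumbling window's start could be derived from a timestamp. It assumes window boundaries are aligned to the epoch (plus an optional offset) and not to the job's start time, which is my understanding of how tumbling assigners usually compute window starts; I have not confirmed that this is what happens on restore. The object and function names are hypothetical.

```scala
object WindowAlignmentSketch {
  // Start of the tumbling window that contains `timestampMs`, for windows of
  // `sizeMs` milliseconds aligned to the epoch plus `offsetMs`.
  // Assumes non-negative timestamps, as with wall-clock processing time.
  def windowStart(timestampMs: Long, sizeMs: Long, offsetMs: Long = 0L): Long =
    timestampMs - (timestampMs - offsetMs + sizeMs) % sizeMs

  // End of that window (exclusive).
  def windowEnd(timestampMs: Long, sizeMs: Long, offsetMs: Long = 0L): Long =
    windowStart(timestampMs, sizeMs, offsetMs) + sizeMs
}
```

Under that assumption, a row seen at 1612190225123 ms with a 1 second window falls into the window starting at 1612190225000 ms regardless of when the job was started, so two runs of the job would agree on window boundaries. What I still don't know is what happens to the contents of a window that was open when the checkpoint was taken.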

On Mon, Feb 1, 2021 at 6:37 AM Timo Walther <twal...@apache.org> wrote:

> Hi Rex,
>
> processing-time gives you no alignment of operators across nodes. Each
> operation works with its local machine clock that might be interrupted
> by the OS, Java garbage collector, etc. It is always a best effort timing.
>
> Regards,
> Timo
>
>
> On 27.01.21 18:16, Rex Fenley wrote:
> > Hello,
> >
> > I'm looking at ways to deduplicate data and found [1], but does proctime
> > get committed with operators? How does this work against clock skew on
> > different machines?
> >
> > Thanks
> >
> > [1]
> >
> https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/sql/queries.html#deduplication
> >
>
>

-- 

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com <https://www.remind.com/>  |  BLOG <http://blog.remind.com/>  |
FOLLOW US <https://twitter.com/remindhq>  |  LIKE US
<https://www.facebook.com/remindhq>
