Hi Christophe,

Flink exposes event-time timestamps and watermarks not only in window operators.
The easiest solution would be to use a ProcessFunction [1], which can access
the timestamp of each record.

I would apply a ProcessFunction on a keyed stream (keyBy(id) [2]) and store
the max timestamp per key in a ValueState [3].
When you receive a new record, you check if its timestamp is larger than
the timestamp in the state. If that is the case, you update the state and
forward the record. Otherwise, you drop the record.
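
Here is a minimal sketch of what that could look like. The Event POJO and all
class, field, and state names below are just placeholders I made up for
illustration, not code from your pipeline:

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

/** Hypothetical POJO matching the events in your example stream. */
class Event {
    public String id;
    public int value;
    public long timestamp;
    public Event() {}  // Flink's POJO serializer needs a no-arg constructor
}

/** Forwards a record only if its timestamp is newer than the max seen for its key. */
public class LatestPerKeyFunction extends ProcessFunction<Event, Event> {

    // Largest event timestamp seen so far for the current key (keyed state).
    private transient ValueState<Long> maxTimestamp;

    @Override
    public void open(Configuration parameters) {
        maxTimestamp = getRuntimeContext().getState(
                new ValueStateDescriptor<>("maxTimestamp", Long.class));
    }

    @Override
    public void processElement(Event event, Context ctx, Collector<Event> out) throws Exception {
        Long currentMax = maxTimestamp.value();
        if (currentMax == null || event.timestamp > currentMax) {
            // Newer than anything seen for this key: remember it and forward it.
            maxTimestamp.update(event.timestamp);
            out.collect(event);
        }
        // Otherwise the record is an older, out-of-order update: drop it.
    }
}

You would apply it with something like
events.keyBy("id").process(new LatestPerKeyFunction()) and feed the output
into your DB sink. The function keeps only one Long per key in keyed state,
which Flink checkpoints for you.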

Hope that helps,
Fabian

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/process_function.html
[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/index.html#datastream-transformations
[3]
https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/state.html#using-managed-keyed-state

2018-01-09 12:23 GMT+01:00 Christophe Jolif <cjo...@gmail.com>:

> Hi everyone,
>
> Let's imagine I have a stream of events coming a bit like this:
>
> { id: "1", value: 1, timestamp: 1 }
> { id: "2", value: 2, timestamp: 1 }
> { id: "1", value: 4, timestamp: 3 }
> { id: "1", value: 5, timestamp: 2 }
> { id: "2", value: 5, timestamp: 3 }
> ...
>
> As you can see from the non-monotonically increasing timestamps, events
> can, for various reasons, arrive slightly out of order.
>
> Now I want to use Flink to process this stream, to compute by id (my key)
> the latest value and update it in a DB. But obviously that latest value
> must reflect the original event timestamp and not the processing time.
>
> I've seen that Flink can deal with event-time processing, in the sense
> that if I need to do a windowed operation I can ensure an event will be
> assigned to the "correct" window.
>
> But here the use-case seems slightly different. How would you proceed to
> do that in Flink?
>
> Thanks!
> --
> Christophe
>
