Thanks for initiating the discussion and wrapping up the conclusion, Andrey, and
thanks all for participating.
Just to confirm: for the out-of-order case, the conclusion is to
update the data and timestamp with the currently-being-processed record without
checking whether it is older data, right? In
Hi everybody,
Thanks a lot for your detailed feedback on this topic.
It looks like we can already do some preliminary wrap-up for this
discussion.
As far as I can see, we have the following trends:
* Last access timestamp: event timestamp of the currently-being-processed record
* Current timestamp to
Hi, Andrey
I think TTL state has another scenario: simulating a sliding window with a
process function. A user can define a state that stores the data from the latest
1 day and trigger a calculation on that state every 5 min. It is an operator
similar to a sliding window, but I think it is more efficient than
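The pattern described above (keep the latest day of data in per-key state, evaluate it on a recurring 5-minute trigger) could be sketched roughly as follows. This is a plain-Java illustration, not Flink's ProcessFunction API; the class and method names are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the per-key state a process function could keep
// to emulate a 1-day sliding window evaluated every 5 minutes.
// Timestamps are epoch millis.
class SlidingStateSketch {
    private static final long WINDOW_MS = 24L * 60 * 60 * 1000; // latest 1 day

    // Elements with their event timestamps, oldest first: [timestamp, value].
    private final Deque<long[]> buffer = new ArrayDeque<>();

    public void add(long eventTs, long value) {
        buffer.addLast(new long[]{eventTs, value});
    }

    // Called from a (simulated) 5-minute timer: evict entries older than
    // one day, then aggregate over what remains.
    public long onTimer(long currentWatermark) {
        while (!buffer.isEmpty()
                && buffer.peekFirst()[0] <= currentWatermark - WINDOW_MS) {
            buffer.removeFirst(); // expired: older than the 1-day window
        }
        long sum = 0;
        for (long[] e : buffer) sum += e[1];
        return sum;
    }
}
```

The point of the comparison to a sliding window operator is that only one copy of each element is kept, rather than one copy per overlapping window pane.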
I think so, I just wanted to bring it up again because the question was raised.
> On 8. Apr 2019, at 22:56, Elias Levy wrote:
>
> Hasn't this always been the end goal? It's certainly what we have been
> waiting on for jobs with very large TTLed state. Beyond timer storage,
> timer processing
Hasn't this always been the end goal? It's certainly what we have been
waiting on for jobs with very large TTLed state. Beyond timer storage,
timer processing simply to expire stale data that may not otherwise be
accessed is expensive.
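The alternative to one timer per entry is to check expiry lazily on access and clean up in occasional bulk sweeps. A minimal sketch of that idea, with invented names (this is not Flink's TTL implementation):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Sketch: TTL without per-entry timers. Expiry is checked lazily on read;
// a periodic bulk sweep (e.g. piggybacked on compaction) removes the rest.
class LazyTtlMap {
    static final class Entry {
        final long value;
        final long lastAccessTs; // event timestamp stored with the value
        Entry(long value, long ts) { this.value = value; this.lastAccessTs = ts; }
    }

    private final long ttlMs;
    private final Map<String, Entry> state = new HashMap<>();

    LazyTtlMap(long ttlMs) { this.ttlMs = ttlMs; }

    public void put(String key, long value, long eventTs) {
        state.put(key, new Entry(value, eventTs));
    }

    // Lazy check on read: stale data is simply not returned.
    public Long get(String key, long currentWatermark) {
        Entry e = state.get(key);
        if (e == null || e.lastAccessTs + ttlMs <= currentWatermark) return null;
        return e.value;
    }

    // Bulk sweep removes expired entries that were never read again.
    public void sweep(long currentWatermark) {
        for (Iterator<Entry> it = state.values().iterator(); it.hasNext(); ) {
            if (it.next().lastAccessTs + ttlMs <= currentWatermark) it.remove();
        }
    }

    public int size() { return state.size(); }
}
```

This keeps the storage cost at one timestamp per entry instead of one registered timer per entry, which is the cost concern raised above for jobs with very large TTLed state.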
On Mon, Apr 8, 2019 at 7:11 AM Aljoscha Krettek wrote:
> I
I had a discussion with Andrey and now think that the
event-time-timestamp/watermark-cleanup case is also valid: if you don't need this
for regulatory compliance but just for cleaning up old state, for example when
you have re-processing of old data.
I think the discussion about whether to
Hi all,
For GDPR: I am not sure about the regulatory requirements of GDPR, but I
would assume that the time for deletion starts counting from the time an
organisation received the data (i.e. the wall-clock ingestion time of the
data), and not the "event time" of the data. Otherwise, an
Oh boy, this is an interesting pickle.
For *last-access-timestamp*, I think only *event-time-of-current-record* makes
sense. I'm looking at this from a GDPR/regulatory-compliance perspective. If
you update a state, say by storing the event you just received in state, you
want to use the exact
Hi Andrey,
I agree with Elias. This would be the most natural behavior. I wouldn't add
additional slightly different notions of time to Flink.
As I can also see a use case for the combination
* Timestamp stored: Event timestamp
* Timestamp to check expiration: Processing Time
we could (maybe
My 2c:
Timestamp stored with the state value: Event timestamp
Timestamp used to check expiration: Last emitted watermark
That follows the event-time processing model used elsewhere in Flink. E.g.
events are segregated into windows based on their event time, but the
windows do not fire until the
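The proposal above (store the event timestamp with the value, check expiration against the last emitted watermark) can be sketched for a single state entry as follows. All names are illustrative, not Flink API; the out-of-order behavior follows the unconditional-overwrite conclusion mentioned earlier in the thread.

```java
// Sketch of the proposed semantics: the event timestamp of the record that
// last touched the state is stored alongside the value, and the entry counts
// as expired once the last emitted watermark passes timestamp + TTL.
class EventTimeTtlValue {
    private final long ttlMs;
    private long storedTs;
    private long storedValue;
    private boolean hasValue = false;

    EventTimeTtlValue(long ttlMs) { this.ttlMs = ttlMs; }

    public void update(long value, long recordEventTs) {
        // Per the wrap-up: overwrite unconditionally, even for an
        // out-of-order record (no check against the stored timestamp).
        storedValue = value;
        storedTs = recordEventTs;
        hasValue = true;
    }

    public Long get(long lastEmittedWatermark) {
        if (!hasValue || lastEmittedWatermark >= storedTs + ttlMs) return null;
        return storedValue;
    }
}
```

As with windows, expiry here only advances with the watermark, so re-processing historic data with historic watermarks does not prematurely expire state.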
Hi All,
As you might have already seen there is an effort tracked in FLINK-12005
[1] to support event time scale for state with time-to-live (TTL) [2].
While thinking about the design, we realised that there can be multiple options
for the semantics of this feature, depending on the use case. There is also