Re: Incremental state

2020-06-14 Thread Congxian Qiu
Hi

Can process function[1] can meet your needs here?, you can do the TTL logic
using timers in process functions.

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/process_function.html
Best,
Congxian


Timo Walther  于2020年6月10日周三 下午9:36写道:

> Hi Annemarie,
>
> if TTL is what you are looking for and queryable state is what limits
> you, it might make sense to come up with a custom implementation of
> queryable state? TTL might be more difficult to implement. As far as I
> know this feature is more of an experimental feature without any
> consistency guarantees. A Function could offer this functionality using
> some socket/web service library. Or you offer insights through a side
> output into a sink such as Elasticsearch.
>
> Otherwise, it might be useful to "batch" the cleanups. In Flink's SQL
> engine, a user can define a minimum and maximum retention time. So
> timers are always set based on the maximum retention time but during
> cleanup the elements that fall into the minimum retention time are also
> cleaned up on the way (see [1]). This could be a performance improvement.
>
> If the clean up happens based on event-time, it is also possible to use
> timers more efficiently and only set one timer per watermark [2].
>
> I hope this helps.
>
> Regards,
> Timo
>
> [1]
>
> https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/runtime/join/BaseTwoInputStreamOperatorWithStateRetention.scala
>
> [2]
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/process_function.html#timer-coalescing
>
>
> On 09.06.20 16:29, Annemarie Burger wrote:
> > Hi,
> >
> > What I'm trying to do is the following: I want to incrementally add and
> > delete elements to a state. If the element expires/goes out of the
> window,
> > it needs to be removed from the state. I basically want the
> functionality of
> > TTL, without using it, since I'm also using Queryable State and these two
> > features can't be combined. Ofcourse I can give a "valid untill" time to
> > each element when I'm adding it to the state using a ProcessFunction, and
> > periodically iterate over the state to remove expired elements, but I was
> > wondering if there is a more efficient way. For example to use a timer,
> > which we give the element as a parameter, so that when the timer fires, x
> > seconds after the timer was set, it can just look up the element directly
> > and remove it. But how would I implement this?
> >
> > Thanks!
> >
> >
> >
> > --
> > Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
> >
>
>


Re: Incremental state

2020-06-10 Thread Timo Walther

Hi Annemarie,

if TTL is what you are looking for and queryable state is what limits 
you, it might make sense to come up with a custom implementation of 
queryable state? TTL might be more difficult to implement. As far as I 
know this feature is more of an experimental feature without any 
consistency guarantees. A Function could offer this functionality using 
some socket/web service library. Or you offer insights through a side 
output into a sink such as Elasticsearch.


Otherwise, it might be useful to "batch" the cleanups. In Flink's SQL 
engine, a user can define a minimum and maximum retention time. So 
timers are always set based on the maximum retention time but during 
cleanup the elements that fall into the minimum retention time are also 
cleaned up on the way (see [1]). This could be a performance improvement.


If the clean up happens based on event-time, it is also possible to use 
timers more efficiently and only set one timer per watermark [2].


I hope this helps.

Regards,
Timo

[1] 
https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/runtime/join/BaseTwoInputStreamOperatorWithStateRetention.scala


[2] 
https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/process_function.html#timer-coalescing



On 09.06.20 16:29, Annemarie Burger wrote:

Hi,

What I'm trying to do is the following: I want to incrementally add and
delete elements to a state. If the element expires/goes out of the window,
it needs to be removed from the state. I basically want the functionality of
TTL, without using it, since I'm also using Queryable State and these two
features can't be combined. Ofcourse I can give a "valid untill" time to
each element when I'm adding it to the state using a ProcessFunction, and
periodically iterate over the state to remove expired elements, but I was
wondering if there is a more efficient way. For example to use a timer,
which we give the element as a parameter, so that when the timer fires, x
seconds after the timer was set, it can just look up the element directly
and remove it. But how would I implement this?

Thanks!



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/





Incremental state

2020-06-09 Thread Annemarie Burger
Hi,

What I'm trying to do is the following: I want to incrementally add and
delete elements to a state. If the element expires/goes out of the window,
it needs to be removed from the state. I basically want the functionality of
TTL, without using it, since I'm also using Queryable State and these two
features can't be combined. Ofcourse I can give a "valid untill" time to
each element when I'm adding it to the state using a ProcessFunction, and
periodically iterate over the state to remove expired elements, but I was
wondering if there is a more efficient way. For example to use a timer,
which we give the element as a parameter, so that when the timer fires, x
seconds after the timer was set, it can just look up the element directly
and remove it. But how would I implement this? 

Thanks!



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Re: Incremental state with purging

2020-05-18 Thread Thomas Huang
I’m wondering that why you use a beta feature for production. Why not push the 
latest state into down sink like redis or hbase with Apache phoenix .


From: Annemarie Burger 
Sent: Monday, May 18, 2020 11:19:23 PM
To: user@flink.apache.org 
Subject: Re: Incremental state with purging

Hi,

Thanks for your suggestions!
However, as I'm reading the docs for queryable state, it says that it can
only be used for Processing time, and my windows are defined using event
time. So, I guess this means I should use the KeyedProcessFunction. Could
you maybe suggest a rough implementation for this? I can't seem to get the
implementation working right.

Best,
Annemarie



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Re: Incremental state with purging

2020-05-18 Thread Annemarie Burger
Hi, 

Thanks for your suggestions!
However, as I'm reading the docs for queryable state, it says that it can
only be used for Processing time, and my windows are defined using event
time. So, I guess this means I should use the KeyedProcessFunction. Could
you maybe suggest a rough implementation for this? I can't seem to get the
implementation working right. 

Best, 
Annemarie 



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Re: Incremental state with purging

2020-05-15 Thread Congxian Qiu
Hi
>From your description,  you want to do two things:
1 update state and remote the state older than x
2 output the state every y second

>From my side, the first can be done by using TTL state as Yun said,
the second can be done by using KeyedProcessFunction[1]

If you want to have complex logic to remove the older state in step 1,
maybe you can also use the KeyedProcessFunction and timer()

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/stream/operators/process_function.html#the-keyedprocessfunction
Best,
Congxian


Yun Tang  于2020年5月13日周三 下午7:42写道:

> Hi
>
> From your description: "output the state every y seconds and remove old
> elements", I think TTL [1] is the proper solution for your scenario. And
> you could define the ttl of your state as y seconds so that processfunction
> could only print elements in the last y seconds.
>
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#state-time-to-live-ttl
>
> Best
> Yun Tang
> --
> *From:* Annemarie Burger 
> *Sent:* Wednesday, May 13, 2020 2:46
> *To:* user@flink.apache.org 
> *Subject:* Incremental state with purging
>
> Hi,
>
> I'm trying to implement the most efficient way to incrementally put
> incoming
> DataStream elements in my (map)state, while removing old elements (older
> that x) from that same state. I then want to output the state every y
> seconds. I've looked into using the ProcessFunction with onTimer, or
> building my own Trigger for a window function, but I struggle with putting
> all this together in a logical and efficient way. Since the state is very
> big I don't want to duplicate it over multiple (sliding)windows. Does
> anybody know the best way to achieve this? Some pseudo code would be very
> helpful.
>
> Thanks!
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>


Re: Incremental state with purging

2020-05-13 Thread Yun Tang
Hi

>From your description: "output the state every y seconds and remove old 
>elements", I think TTL [1] is the proper solution for your scenario. And you 
>could define the ttl of your state as y seconds so that processfunction could 
>only print elements in the last y seconds.


[1] 
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#state-time-to-live-ttl

Best
Yun Tang

From: Annemarie Burger 
Sent: Wednesday, May 13, 2020 2:46
To: user@flink.apache.org 
Subject: Incremental state with purging

Hi,

I'm trying to implement the most efficient way to incrementally put incoming
DataStream elements in my (map)state, while removing old elements (older
that x) from that same state. I then want to output the state every y
seconds. I've looked into using the ProcessFunction with onTimer, or
building my own Trigger for a window function, but I struggle with putting
all this together in a logical and efficient way. Since the state is very
big I don't want to duplicate it over multiple (sliding)windows. Does
anybody know the best way to achieve this? Some pseudo code would be very
helpful.

Thanks!



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Incremental state with purging

2020-05-12 Thread Annemarie Burger
Hi,

I'm trying to implement the most efficient way to incrementally put incoming
DataStream elements in my (map)state, while removing old elements (older
that x) from that same state. I then want to output the state every y
seconds. I've looked into using the ProcessFunction with onTimer, or
building my own Trigger for a window function, but I struggle with putting
all this together in a logical and efficient way. Since the state is very
big I don't want to duplicate it over multiple (sliding)windows. Does
anybody know the best way to achieve this? Some pseudo code would be very
helpful. 

Thanks!



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/