Re: How to use 'dynamic' state

2017-03-07 Thread Steve Jerman
Thanks for your reply. It makes things much clearer for me. I think you are 
right - Side Inputs are probably the right way long term (I looked at the FLIP 
definition), but I think I can construct something in the meantime.

Steve

On Mar 7, 2017, at 6:11 AM, Aljoscha Krettek wrote:

Hi Steve,
I think Gordon already gave a pretty good answer, I'll just try and go into the 
specifics a bit.

You mentioned that there would be different kinds of operators required for the 
rules (you hinted at FlatMap and maybe a tumbling window). Do you know each of 
those types before starting your program? If yes, you could have several of 
these "primitive" operations in your pipeline, with each of them listening only 
to the rule changes (on a second input) that are relevant to its operation.

Side inputs would be very good for this, but I think you can also get the same 
level of functionality by using a CoFlatMap (for the window case, you would use 
a CoFlatMap chained to a window operation).
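To make the CoFlatMap idea concrete, here is a minimal plain-Java sketch with no Flink dependency: the method names mirror Flink's CoFlatMapFunction (flatMap1 for the rules input, flatMap2 for the events input), but the rule shape (a single "minimum people" threshold) and all names are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for a CoFlatMapFunction: one input carries rule
// updates, the other carries events. The latest rule is kept as operator
// state and applied to every incoming event.
class RuleCoFlatMap {
    // Hypothetical rule: fire when at least `minPeople` are present.
    private int minPeople = Integer.MAX_VALUE; // no rule installed yet

    // Rules input (mirrors flatMap1): replace the active rule.
    public void flatMap1(int newMinPeople, List<String> out) {
        this.minPeople = newMinPeople;
    }

    // Events input (mirrors flatMap2): evaluate against the current rule.
    public void flatMap2(int peoplePresent, List<String> out) {
        if (peoplePresent >= minPeople) {
            out.add("ALERT: " + peoplePresent + " people present");
        }
    }

    public static void main(String[] args) {
        RuleCoFlatMap fn = new RuleCoFlatMap();
        List<String> out = new ArrayList<>();
        fn.flatMap2(10, out); // no rule installed yet, so nothing is emitted
        fn.flatMap1(5, out);  // install rule: trigger at >= 5 people
        fn.flatMap2(10, out); // now an alert is emitted
        System.out.println(out);
    }
}
```

In an actual Flink job the rule stream would be broadcast so every parallel instance sees each update; this sketch only shows the per-instance logic of reacting to rule changes without a restart.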

Does that help? I'm sure we can figure something out together.

Best,
Aljoscha


On Tue, Mar 7, 2017, at 07:44, Tzu-Li (Gordon) Tai wrote:
Hi Steve,

I’ll try to provide some input for the approaches you’re currently looking into 
(I’ll comment on your email below):

* API based stop and restart of job … ugly.

Yes, indeed ;) I think this is absolutely not necessary.

* Use a co-map function with the rules as one stream and the events as the 
other. This seems better; however, I would like to have several ‘trigger’ 
functions changed together, e.g. a tumbling window for one type of criteria 
and a flat map for a different sort. So I’m not sure how to hook this up for 
more than a simple co-map/flatmap. I did see this suggested in one answer.

Do you mean that operators should listen only to certain rule / criteria-setting 
changes? You could either have separate stream sources for each kind of 
criteria-rule trigger event, or have a single source and split it 
afterwards. Then you broadcast each of them to the corresponding co-map / 
co-flat-map.
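The split-then-route idea can be sketched in plain Java (no Flink; the kind tags and the dispatcher itself are invented for illustration — in Flink this would be a filter/split on the rule stream followed by broadcasting each branch into its co-operator):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntConsumer;

// Illustrative dispatcher: a single rule stream is split by kind, and each
// kind of rule update is delivered only to the operator that cares about it.
class RuleDispatcher {
    private final Map<String, IntConsumer> listeners = new HashMap<>();

    // Register an operator's interest in one kind of rule.
    public void subscribe(String kind, IntConsumer operator) {
        listeners.put(kind, operator);
    }

    // A rule update tagged with its kind, e.g. ("window", 5).
    public void onRuleUpdate(String kind, int value) {
        IntConsumer op = listeners.get(kind);
        if (op != null) {
            op.accept(value); // only the matching operator sees the change
        }
    }
}
```

So the tumbling-window operator would subscribe to its own rule kind and the flat-map operator to a different one, and neither is disturbed by updates meant for the other.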

* Use broadcast state: this seems reasonable; however, I couldn’t tell whether 
the broadcast state would be available to all of the processing functions. Is it 
generally available?

From the context of your description, I think what you want is for the 
injected rules stream to be visible to all operators (rather than “broadcast 
state”, which in Flink streaming refers to something else).

Aljoscha recently consolidated a FLIP for Side Inputs [1], which I think 
targets exactly what you have in mind here. Perhaps you can take a look at it 
and see if it makes sense for your use case? Of course, this isn’t available 
yet, as it is still under discussion. I think Side Inputs may be an ideal 
solution for what you have in mind, since I assume the rule triggers are 
slowly changing.

I’ve CCed Aljoscha so that he can provide more insights, as he has worked a 
lot on the topics mentioned here.

Cheers,
Gordon

[1] FLIP-17: 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-17+Side+Inputs+for+DataStream+API




On March 7, 2017 at 5:05:04 AM, Steve Jerman (st...@kloudspot.com) wrote:


I’ve been reading the code/user group/SO and haven’t really found a great answer 
to this… so I thought I’d ask.

I have a UI that allows the user to edit rules which include specific criteria, 
for example: trigger an event if X people are present for over a minute.
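Such a rule might be represented as a small value object plus a predicate over an aggregated observation; a minimal sketch (all names and fields are invented for the example):

```java
// Illustrative rule: "trigger if at least minPeople are present for at
// least minSeconds". In a real job this object would arrive on the rules
// stream whenever the user edits it in the UI.
class OccupancyRule {
    final int minPeople;
    final long minSeconds;

    OccupancyRule(int minPeople, long minSeconds) {
        this.minPeople = minPeople;
        this.minSeconds = minSeconds;
    }

    // Evaluate the rule against an aggregated observation.
    boolean triggers(int peoplePresent, long durationSeconds) {
        return peoplePresent >= minPeople && durationSeconds >= minSeconds;
    }
}
```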

I would like to have a Flink job that processes an event stream and triggers on 
these rules.

The catch is that I don’t want to have to restart the job if the rules change… 
(it would be easy otherwise :))

So I found four ways to proceed:

* API based stop and restart of job … ugly.

* Use a co-map function with the rules as one stream and the events as the 
other. This seems better; however, I would like to have several ‘trigger’ 
functions changed together, e.g. a tumbling window for one type of criteria 
and a flat map for a different sort. So I’m not sure how to hook this up for 
more than a simple co-map/flatmap. I did see this suggested in one answer.

* Use broadcast state: this seems reasonable; however, I couldn’t tell whether 
the broadcast state would be available to all of the processing functions. Is it 
generally available?

* Implement my own operators… seems complicated ;)

Are there other approaches?

Thanks for any advice
Steve




