Re: more complex patterns for CEP (was: CEP two transitions to the same state)

2016-12-15 Thread Dima Arbuzin
Hey there,

I was investigating CEP functionality and realized that I'm missing some,
which are discussed here, especially: accessing fields from previous
events.

Any progress regarding this question?

I'm working with streaming car location data trying to analyze different
traffic patterns.
Consider the following use-case:
1) detect a traffic jam (cluster) at one spot (event) and then
2) detect another traffic jam (another cluster) later (another event). The
question would be: did they move in space?

So is there is a way to compare these two clusters using CEP? I guess
customizing equals() function would help to some extend, but then what if I
want to compare some extra fields like location?

How hard is it to overload Filter Function storing previously triggered
Event information?

P.S.  Performance is not that critical in this case.

On Tue, Oct 11, 2016 at 5:39 PM,  wrote:

> Thanks, Till. I will wait for your response.
> - LF
>
>
>
>
> --
> *From:* Till Rohrmann 
> *To:* user@flink.apache.org; lg...@yahoo.com
> *Sent:* Tuesday, October 11, 2016 2:49 AM
>
> *Subject:* Re: more complex patterns for CEP (was: CEP two transitions to
> the same state)
>
> The timeline is hard to predict to be honest. It depends a little bit on
> how fast the community can proceed with these things. At the moment I'm
> personally involved in other issues and, thus, cannot work on the CEP
> library. I hope to get back to it soon.
>
> Cheers,
> Till
>
> On Sat, Oct 8, 2016 at 12:42 AM,  wrote:
>
> hi Till,
>
> Thanks for the detailed response.
>
> I'm looking forward to seeing these features implemented in Flink. Can
> anyone provide timelines for the 3 tickets that you mentioned in your
> response?
>
>
> - LF
>
>
>
>
> --
> *From:* Till Rohrmann 
> *To:* user@flink.apache.org
> *Sent:* Tuesday, September 20, 2016 7:13 AM
> *Subject:* Re: more complex patterns for CEP (was: CEP two transitions to
> the same state)
>
> Hi Frank,
>
> thanks for sharing your analysis. It indeed pinpoints some of the current
> CEP library's shortcomings.
>
> Let me address your points:
>
> 1. Lack of not operator
>
> The functionality to express events which must not occur in a pattern is
> missing. We've currently a JIRA [1] which addresses exactly this. For the
> notFollowedBy operator, we should discard all patterns where we've seen a
> matching event for the not state. I think it could be implemented like a
> special terminal state where we prune the partial pattern.
>
> For the notNext operator, we could think about keeping the event which has
> not matched the notNext state and return it as part of the fully matched
> pattern. Alternatively, we could simply forget about it once we've assured
> that it does not match.
>
> 2. Allow functions to access fields of previous events
>
> This hasn't been implemented yet because it is a quite expensive
> operation. Before calling the filter function you always have to
> reconstruct the current partial pattern and then give it to the filter
> function. But I agree that the user should be allowed to use such a
> functionality (and then pay the price for it in terms of efficiency).
> Giving access to the partially matched fields via a Map would be a way to
> solve the problem on the API level.
>
> I think that almost all functionality for this feature is already in
> place. We simply would have to check the filter condition whether they
> require access to previous events and then compute the partial pattern.
>
> 3. Support for recursive patterns
>
> The underlying SharedBuffer implementation should allow recursive event
> patterns. Once we have support for branching CEP patterns [2] which allow
> to connect different states this should also be possible with some minor
> changes.
>
> However, a more interesting way to specify recursive CEP patterns is to
> use regular expression syntax (Kleene star, bounded occurrences) to express
> recursive parts of a pattern. I think this makes specifying such a pattern
> easier and more intuitive for the user. We've also a JIRA issue to track
> the process there [3] and Ivan is already working on this.
>
> If you want to get involved in Flink's CEP development, then feel free to
> take over any free JIRA issue or create one yourself :-)
>
> [1] https://issues.apache.org/ jira/browse/FLINK-3320
> 
> [2] https://issues.apache.org/ jira/browse/FLINK-4641
> 
> [3] https://issues.apache.org/ jira/browse/FLINK-3318
> 
>
> Cheers,
> Till
>
> On Fri, Sep 16, 2016 at 10:04 PM, Frank Dekervel  wrote:
>
> Hello,
>
> i did some more analysis wrt the problem i'm facing and the flink CEP api.
>
> In order to complete the problem i'm facing using flink CEP i would need 3
> 

Re: more complex patterns for CEP (was: CEP two transitions to the same state)

2016-10-11 Thread lgfmt
Thanks, Till. I will wait for your response.
- LF




  From: Till Rohrmann 
 To: user@flink.apache.org; lg...@yahoo.com 
 Sent: Tuesday, October 11, 2016 2:49 AM
 Subject: Re: more complex patterns for CEP (was: CEP two transitions to the 
same state)
   
The timeline is hard to predict to be honest. It depends a little bit on how 
fast the community can proceed with these things. At the moment I'm personally 
involved in other issues and, thus, cannot work on the CEP library. I hope to 
get back to it soon.
Cheers,Till
On Sat, Oct 8, 2016 at 12:42 AM,  wrote:

hi Till,
Thanks for the detailed response.
I'm looking forward to seeing these features implemented in Flink. Can anyone 
provide timelines for the 3 tickets that you mentioned in your response?  - LF




  From: Till Rohrmann 
 To: user@flink.apache.org 
 Sent: Tuesday, September 20, 2016 7:13 AM
 Subject: Re: more complex patterns for CEP (was: CEP two transitions to the 
same state)
  
Hi Frank,
thanks for sharing your analysis. It indeed pinpoints some of the current CEP 
library's shortcomings.
Let me address your points:
1. Lack of not operator
The functionality to express events which must not occur in a pattern is 
missing. We've currently a JIRA [1] which addresses exactly this. For the 
notFollowedBy operator, we should discard all patterns where we've seen a 
matching event for the not state. I think it could be implemented like a 
special terminal state where we prune the partial pattern.
For the notNext operator, we could think about keeping the event which has not 
matched the notNext state and return it as part of the fully matched pattern. 
Alternatively, we could simply forget about it once we've assured that it does 
not match.
2. Allow functions to access fields of previous events
This hasn't been implemented yet because it is a quite expensive operation. 
Before calling the filter function you always have to reconstruct the current 
partial pattern and then give it to the filter function. But I agree that the 
user should be allowed to use such a functionality (and then pay the price for 
it in terms of efficiency). Giving access to the partially matched fields via a 
Map would be a way to solve the problem on the API level.
I think that almost all functionality for this feature is already in place. We 
simply would have to check the filter condition whether they require access to 
previous events and then compute the partial pattern.
3. Support for recursive patterns
The underlying SharedBuffer implementation should allow recursive event 
patterns. Once we have support for branching CEP patterns [2] which allow to 
connect different states this should also be possible with some minor changes.
However, a more interesting way to specify recursive CEP patterns is to use 
regular expression syntax (Kleene star, bounded occurrences) to express 
recursive parts of a pattern. I think this makes specifying such a pattern 
easier and more intuitive for the user. We've also a JIRA issue to track the 
process there [3] and Ivan is already working on this.
If you want to get involved in Flink's CEP development, then feel free to take 
over any free JIRA issue or create one yourself :-)
[1] https://issues.apache.org/ jira/browse/FLINK-3320
[2] https://issues.apache.org/ jira/browse/FLINK-4641[3] 
https://issues.apache.org/ jira/browse/FLINK-3318
Cheers,Till
On Fri, Sep 16, 2016 at 10:04 PM, Frank Dekervel  wrote:

Hello,
i did some more analysis wrt the problem i'm facing and the flink CEP api.
In order to complete the problem i'm facing using flink CEP i would need 3 
additions to the API (i think). I tried to understand the NFA logic, and i 
think 2 of them should be doable without fundamental changes.
First is to add a "negative" pattern (notFollowedBy / notNext):
Reason is the flow below: i have a start and a termination event, and an 
optional "failure" event in between. i want all succesful termination events, 
so i want to express there should not be a failure event between the start and 
the termination event. Note that there is no "success" event in this case on 
which i could match.


To implement, upon checking whether a transition would be possible, one would 
first need to check if it was not already dead-ended by a notFollowedBy / 
notNext. This would add a bit of complexity to the logic (when seeing if a 
transition is valid for a state, first check if on this state there was not 
already a match made to an notFollowedBy/notNext state. in that case one would 
reject the match)
Second is to allow the filterfunction to inspect the partial match made, so one 
would be able to filter based on the already-matched event. Reason is the 
following (hypothetical) example where we would match arrivals of a trains in a 
station. We cannot keyBy train (because the "occupied" events of the station 
don't have train information), neither can we keyBy 

Re: more complex patterns for CEP (was: CEP two transitions to the same state)

2016-10-11 Thread Till Rohrmann
The timeline is hard to predict to be honest. It depends a little bit on
how fast the community can proceed with these things. At the moment I'm
personally involved in other issues and, thus, cannot work on the CEP
library. I hope to get back to it soon.

Cheers,
Till

On Sat, Oct 8, 2016 at 12:42 AM,  wrote:

> hi Till,
>
> Thanks for the detailed response.
>
> I'm looking forward to seeing these features implemented in Flink. Can
> anyone provide timelines for the 3 tickets that you mentioned in your
> response?
>
>
> - LF
>
>
>
>
> --
> *From:* Till Rohrmann 
> *To:* user@flink.apache.org
> *Sent:* Tuesday, September 20, 2016 7:13 AM
> *Subject:* Re: more complex patterns for CEP (was: CEP two transitions to
> the same state)
>
> Hi Frank,
>
> thanks for sharing your analysis. It indeed pinpoints some of the current
> CEP library's shortcomings.
>
> Let me address your points:
>
> 1. Lack of not operator
>
> The functionality to express events which must not occur in a pattern is
> missing. We've currently a JIRA [1] which addresses exactly this. For the
> notFollowedBy operator, we should discard all patterns where we've seen a
> matching event for the not state. I think it could be implemented like a
> special terminal state where we prune the partial pattern.
>
> For the notNext operator, we could think about keeping the event which has
> not matched the notNext state and return it as part of the fully matched
> pattern. Alternatively, we could simply forget about it once we've assured
> that it does not match.
>
> 2. Allow functions to access fields of previous events
>
> This hasn't been implemented yet because it is a quite expensive
> operation. Before calling the filter function you always have to
> reconstruct the current partial pattern and then give it to the filter
> function. But I agree that the user should be allowed to use such a
> functionality (and then pay the price for it in terms of efficiency).
> Giving access to the partially matched fields via a Map would be a way to
> solve the problem on the API level.
>
> I think that almost all functionality for this feature is already in
> place. We simply would have to check the filter condition whether they
> require access to previous events and then compute the partial pattern.
>
> 3. Support for recursive patterns
>
> The underlying SharedBuffer implementation should allow recursive event
> patterns. Once we have support for branching CEP patterns [2] which allow
> to connect different states this should also be possible with some minor
> changes.
>
> However, a more interesting way to specify recursive CEP patterns is to
> use regular expression syntax (Kleene star, bounded occurrences) to express
> recursive parts of a pattern. I think this makes specifying such a pattern
> easier and more intuitive for the user. We've also a JIRA issue to track
> the process there [3] and Ivan is already working on this.
>
> If you want to get involved in Flink's CEP development, then feel free to
> take over any free JIRA issue or create one yourself :-)
>
> [1] https://issues.apache.org/jira/browse/FLINK-3320
> [2] https://issues.apache.org/jira/browse/FLINK-4641
> [3] https://issues.apache.org/jira/browse/FLINK-3318
>
> Cheers,
> Till
>
> On Fri, Sep 16, 2016 at 10:04 PM, Frank Dekervel  wrote:
>
> Hello,
>
> i did some more analysis wrt the problem i'm facing and the flink CEP api.
>
> In order to complete the problem i'm facing using flink CEP i would need 3
> additions to the API (i think). I tried to understand the NFA logic, and i
> think 2 of them should be doable without fundamental changes.
>
> First is to add a "negative" pattern (notFollowedBy / notNext):
>
> Reason is the flow below: i have a start and a termination event, and an
> optional "failure" event in between. i want all succesful termination
> events, so i want to express there should not be a failure event between
> the start and the termination event. Note that there is no "success" event
> in this case on which i could match.
>
> [image: Inline image 1]
>
> To implement, upon checking whether a transition would be possible, one
> would first need to check if it was not already dead-ended by a
> notFollowedBy / notNext. This would add a bit of complexity to the logic
> (when seeing if a transition is valid for a state, first check if on this
> state there was not already a match made to an notFollowedBy/notNext state.
> in that case one would reject the match)
>
> Second is to allow the filterfunction to inspect the partial match made,
> so one would be able to filter based on the already-matched event. Reason
> is the following (hypothetical) example where we would match arrivals of a
> trains in a station. We cannot keyBy train (because the "occupied" events
> of the station don't have train information), neither can we keyBy station
> (as the start of the sequence is outside the station), 

Re: more complex patterns for CEP (was: CEP two transitions to the same state)

2016-10-07 Thread lgfmt
hi Till,
Thanks for the detailed response.
I'm looking forward to seeing these features implemented in Flink. Can anyone 
provide timelines for the 3 tickets that you mentioned in your response?  - LF




  From: Till Rohrmann 
 To: user@flink.apache.org 
 Sent: Tuesday, September 20, 2016 7:13 AM
 Subject: Re: more complex patterns for CEP (was: CEP two transitions to the 
same state)
   
Hi Frank,
thanks for sharing your analysis. It indeed pinpoints some of the current CEP 
library's shortcomings.
Let me address your points:
1. Lack of not operator
The functionality to express events which must not occur in a pattern is 
missing. We've currently a JIRA [1] which addresses exactly this. For the 
notFollowedBy operator, we should discard all patterns where we've seen a 
matching event for the not state. I think it could be implemented like a 
special terminal state where we prune the partial pattern.
For the notNext operator, we could think about keeping the event which has not 
matched the notNext state and return it as part of the fully matched pattern. 
Alternatively, we could simply forget about it once we've assured that it does 
not match.
2. Allow functions to access fields of previous events
This hasn't been implemented yet because it is a quite expensive operation. 
Before calling the filter function you always have to reconstruct the current 
partial pattern and then give it to the filter function. But I agree that the 
user should be allowed to use such a functionality (and then pay the price for 
it in terms of efficiency). Giving access to the partially matched fields via a 
Map would be a way to solve the problem on the API level.
I think that almost all functionality for this feature is already in place. We 
simply would have to check the filter condition whether they require access to 
previous events and then compute the partial pattern.
3. Support for recursive patterns
The underlying SharedBuffer implementation should allow recursive event 
patterns. Once we have support for branching CEP patterns [2] which allow to 
connect different states this should also be possible with some minor changes.
However, a more interesting way to specify recursive CEP patterns is to use 
regular expression syntax (Kleene star, bounded occurrences) to express 
recursive parts of a pattern. I think this makes specifying such a pattern 
easier and more intuitive for the user. We've also a JIRA issue to track the 
process there [3] and Ivan is already working on this.
If you want to get involved in Flink's CEP development, then feel free to take 
over any free JIRA issue or create one yourself :-)
[1] https://issues.apache.org/jira/browse/FLINK-3320
[2] https://issues.apache.org/jira/browse/FLINK-4641[3] 
https://issues.apache.org/jira/browse/FLINK-3318
Cheers,Till
On Fri, Sep 16, 2016 at 10:04 PM, Frank Dekervel  wrote:

Hello,
i did some more analysis wrt the problem i'm facing and the flink CEP api.
In order to complete the problem i'm facing using flink CEP i would need 3 
additions to the API (i think). I tried to understand the NFA logic, and i 
think 2 of them should be doable without fundamental changes.
First is to add a "negative" pattern (notFollowedBy / notNext):
Reason is the flow below: i have a start and a termination event, and an 
optional "failure" event in between. i want all succesful termination events, 
so i want to express there should not be a failure event between the start and 
the termination event. Note that there is no "success" event in this case on 
which i could match.


To implement, upon checking whether a transition would be possible, one would 
first need to check if it was not already dead-ended by a notFollowedBy / 
notNext. This would add a bit of complexity to the logic (when seeing if a 
transition is valid for a state, first check if on this state there was not 
already a match made to an notFollowedBy/notNext state. in that case one would 
reject the match)
Second is to allow the filterfunction to inspect the partial match made, so one 
would be able to filter based on the already-matched event. Reason is the 
following (hypothetical) example where we would match arrivals of a trains in a 
station. We cannot keyBy train (because the "occupied" events of the station 
don't have train information), neither can we keyBy station (as the start of 
the sequence is outside the station), so we need to add an additional condition 
for the second event: the train number must equal the train number of the first 
one. And in the third event, the station number should equal the station number 
of the second one.
I think this could be accomplished by overloading the where function with a 
second filterfunction variant that takes 2 parameters: the event considered + 
the partial match (as a Map with T the class of the event)


Third one is - i think - more difficult to accomplish, and that's more complex 
graphs i asked in my