I’m delighted that Flink is getting full SQL support for MATCH_RECOGNIZE.

Sounds like it might be challenging to share the implementation, but could we 
perhaps share the test suite? (I.e. a set of SQL queries and their expected 
results.)

I added a simple test in
https://github.com/julianhyde/calcite/commit/ee460847643ec17544f310088affd99be4028bb6
that could be extended.

Julian
 

> On Jul 31, 2018, at 8:07 AM, Fabian Hueske <[email protected]> wrote:
> 
> Hi everyone,
> 
> I'd like to share the plans for MATCH_RECOGNIZE support in Flink.
> 
> Flink has featured a so-called CEP library for quite some time [1]. The CEP
> library is a popular feature and is frequently used.
> In a nutshell, the library provides a domain-specific API to define event
> patterns. The patterns are translated into a state machine and evaluated in
> a streaming program.
> 
> Even before we learned about MATCH_RECOGNIZE, Till (another Flink
> committer) and I gave a few talks about unifying SQL and CEP [2].
> Hence, we were quite excited when we learned about MATCH_RECOGNIZE and even
> more when it was added to Calcite.
> Shortly after that, we got a PR [3] which translated the parsed
> MATCH_RECOGNIZE clause into patterns of our CEP library.
> However, we never really got to the point of merging that contribution,
> mainly because there were some inconsistencies between the semantics of
> MATCH_RECOGNIZE and those of Flink's CEP library.
> 
> Recently, a Flink committer picked up this feature again, validated the
> semantics, and made a few corrections [4].
> The CEP library is now ready to support a subset of the MATCH_RECOGNIZE
> features.
> Unfortunately, MATCH_RECOGNIZE support won't make it into the upcoming
> 1.6.0 release, but the plan is to add it in the 1.7.0 release.
> 
> Regarding the idea of sharing parts of the evaluation logic:
> Flink has runtime support for a subset of the MATCH_RECOGNIZE clause.
> Unfortunately, I am not familiar with the internals of Flink's CEP library
> and don't know how portable it is.
> 
> Best, Fabian
> 
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/libs/cep.html
> [2] https://www.slideshare.net/tillrohrmann/streaming-analytics-cep-two-sides-of-the-same-coin
> [3] https://github.com/apache/flink/pull/4502
> [4] https://issues.apache.org/jira/browse/FLINK-9593
> 
> 2018-07-23 21:03 GMT+02:00 Sergey Nuyanzin <[email protected]>:
> 
>> Looks exciting.
>> If possible, I would like to take part in it; however, I'm not sure
>> about this week (I could starting in August).
>> 
>> On Mon, Jul 23, 2018 at 9:10 PM, Michael Mior <[email protected]> wrote:
>> 
>>> This does sound like my idea of fun, but unfortunately I won't have
>>> the time to contribute in the near future. I'll keep this on my radar
>>> though. I also shared this message with all the students in our
>>> research group and I wouldn't be surprised if there was someone
>>> willing to jump in. Thanks for keeping this moving Julian!
>>> 
>>> --
>>> Michael Mior
>>> [email protected]
>>> On Mon, Jul 23, 2018 at 13:54, Julian Hyde <[email protected]> wrote:
>>>> 
>>>> For quite a while we have had partial support for MATCH_RECOGNIZE. We
>>> support it in the parser and validator, but there is no runtime
>>> implementation. It’s a shame, because MATCH_RECOGNIZE is an incredibly
>>> powerful SQL feature for both traditional SQL (it’s in Oracle 12c) and for
>>> continuous query (aka complex event processing - CEP).
>>>> 
>>>> I figure it’s time to change that. My plan is to implement it
>>> incrementally, getting simple queries working to start with, then allow
>>> people to add more complex queries.
>>>> 
>>>> In a dev branch [1], I’ve added a method Enumerables.match [2]. The idea
>>> is that you supply an Enumerable of input data, a finite state machine
>>> that figures out when a sequence of rows makes a match (represented by a
>>> transition function: (state, row) -> state), and a function to convert a
>>> matched set of rows to a set of output rows. The match method is fairly
>>> straightforward, and I almost have it finished.
>>>> 
>>>> The complexity is in generating the finite state machine, emitter
>>> function, and so forth.
>>>> 
>>>> Can someone help me with this task? If your idea of fun is implementing
>>> database algorithms, this is about as much fun as it gets. You learned
>>> about finite state machines in college - this is your chance to actually
>>> write one!
>>>> 
>>>> This might be a good joint project with the Flink community. I know
>>> Flink are thinking of implementing CEP, and the algorithm we write here
>>> could be shared with Flink (for use via Flink SQL or via the Flink API).
>>>> 
>>>> Julian
>>>> 
>>>> [1] https://github.com/julianhyde/calcite/commits/1935-match-recognize
>>>> 
>>>> [2] https://github.com/julianhyde/calcite/commit/4dfaf1bbee718aa6694a8ce67d829c32d04c7e87#diff-8a97a64204db631471c563df7551f408R73
>>> 
>> 
>> 
>> 
>> --
>> Best regards,
>> Sergey
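
The Enumerables.match idea described in the original message of this thread (an input sequence, a state machine expressed as a transition function (state, row) -> state, and a function that turns a matched set of rows into output rows) can be illustrated with a minimal Java sketch. This is not the code from the dev branch: the class name MatchSketch, the method signature, the isFinal predicate, and the convention that state 0 is the start state and a negative state means "no match" are all assumptions made for illustration. The sketch is also deliberately simplified: it does no backtracking and emits non-overlapping matches as soon as a final state is reached.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.IntPredicate;

/** Hypothetical sketch of a match operator; not Calcite's actual API. */
final class MatchSketch {
  private MatchSketch() {}

  /**
   * Feeds each row to the transition function and emits output rows
   * whenever a final state is reached.
   *
   * Assumed conventions: state 0 is the start state, and a negative
   * state returned by the transition function means "dead end".
   */
  static <R, O> List<O> match(Iterable<R> rows,
      BiFunction<Integer, R, Integer> transition,
      IntPredicate isFinal,
      Function<List<R>, List<O>> emitter) {
    List<O> out = new ArrayList<>();
    List<R> buffer = new ArrayList<>(); // rows of the current partial match
    int state = 0;
    for (R row : rows) {
      int next = transition.apply(state, row);
      if (next < 0) {
        // Dead end: drop the partial match and retry this row from the start.
        buffer.clear();
        next = transition.apply(0, row);
        if (next < 0) {
          state = 0;
          continue;
        }
      }
      buffer.add(row);
      state = next;
      if (isFinal.test(state)) {
        // Complete match: convert the matched rows to output rows.
        out.addAll(emitter.apply(new ArrayList<>(buffer)));
        buffer.clear();
        state = 0;
      }
    }
    return out;
  }
}
```

For example, take the pattern "a negative number followed by a positive number", encoded as the transition (s, r) -> s == 0 ? (r < 0 ? 1 : -1) : (r > 0 ? 2 : -1) with final state 2, and an emitter that formats the first and last matched rows as "first..last". Over the rows 3, -1, 4, -2, -5, 6 this sketch yields the two output rows "-1..4" and "-5..6". The hard part mentioned in the thread, generating the state machine and emitter from a MATCH_RECOGNIZE clause, is not covered here.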
