Hi everyone,

I'd like to share the plans for MATCH_RECOGNIZE support in Flink.

Flink has had a so-called CEP library for quite some time [1]. It is a
popular and frequently used feature.
In a nutshell, the library provides a domain-specific API to define event
patterns. The patterns are translated into a state machine and evaluated in
a streaming program.
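
To make the "pattern to state machine" idea concrete, here is a minimal
sketch in plain Java. It is not the Flink CEP API and not Calcite code:
the class MatchSketch, the match method, the DEAD marker and the example
pattern ("a value above 10 immediately followed by a value below 5") are
all made up for illustration. It assumes strict contiguity and restarts
matching after every emitted match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical illustration only; not part of Flink or Calcite.
class MatchSketch {
    static final int DEAD = -1; // transition result meaning "row does not fit"

    // Runs a state machine over the input and emits one output per match.
    static <R, O> List<O> match(List<R> input,
                                int start,
                                int accept,
                                BiFunction<Integer, R, Integer> transition,
                                Function<List<R>, O> emitter) {
        List<O> out = new ArrayList<>();
        List<R> buffer = new ArrayList<>(); // rows of the current partial match
        int state = start;
        for (R row : input) {
            int next = transition.apply(state, row);
            if (next == DEAD && state != start) {
                // Partial match failed: drop it and retry this row as a new start.
                buffer.clear();
                next = transition.apply(start, row);
            }
            if (next == DEAD) {
                buffer.clear();
                state = start;
                continue;
            }
            buffer.add(row);
            state = next;
            if (state == accept) {
                out.add(emitter.apply(new ArrayList<>(buffer)));
                buffer.clear();
                state = start;
            }
        }
        return out;
    }

    // Example pattern: state 0 -(v > 10)-> state 1 -(v < 5)-> state 2 (accept).
    static List<String> demo() {
        List<Integer> rows = List.of(3, 12, 15, 4, 20, 1, 7);
        BiFunction<Integer, Integer, Integer> transition = (state, v) -> {
            if (state == 0) return v > 10 ? 1 : DEAD;
            if (state == 1) return v < 5 ? 2 : DEAD;
            return DEAD;
        };
        return match(rows, 0, 2, transition,
                matched -> matched.get(0) + "->" + matched.get(1));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A real implementation would also have to handle quantifiers, partitioning,
event time, and the different after-match skip strategies, none of which
this sketch attempts.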

Even before we learned about MATCH_RECOGNIZE, Till (another Flink
committer) and I gave a few talks about unifying SQL and CEP [2].
Hence, we were quite excited when we learned about MATCH_RECOGNIZE and even
more when it was added to Calcite.
Shortly after that, we got a PR [3] which translated the parsed
MATCH_RECOGNIZE clause into patterns of our CEP library.
However, we never really got to the point of merging that contribution,
mainly because there were some inconsistencies in the semantics of
MATCH_RECOGNIZE and Flink's CEP library.

Recently, a Flink committer picked up this feature again, validated the
semantics, and made a few corrections [4].
The CEP library is now ready to support a subset of the MATCH_RECOGNIZE
features.
Unfortunately, MATCH_RECOGNIZE support won't make it into the upcoming
1.6.0 release, but the plans are to add it for the 1.7.0 release.

Regarding the idea of sharing parts of the evaluation logic: Flink has
runtime support for a subset of the MATCH_RECOGNIZE clause.
Unfortunately, I am not familiar with the internals of Flink's CEP library
and don't know how portable it is.

Best, Fabian

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/libs/cep.html
[2]
https://www.slideshare.net/tillrohrmann/streaming-analytics-cep-two-sides-of-the-same-coin
[3] https://github.com/apache/flink/pull/4502
[4] https://issues.apache.org/jira/browse/FLINK-9593

2018-07-23 21:03 GMT+02:00 Sergey Nuyanzin <[email protected]>:

> looks exciting.
> If possible, I would like to take part in it. However, I'm not sure
> about this week (I could start in August).
>
> On Mon, Jul 23, 2018 at 9:10 PM, Michael Mior <[email protected]> wrote:
>
> > This does sound like my idea of fun, but unfortunately I won't have
> > the time to contribute in the near future. I'll keep this on my radar
> > though. I also shared this message with all the students in our
> > research group and I wouldn't be surprised if there was someone
> > willing to jump in. Thanks for keeping this moving Julian!
> >
> > --
> > Michael Mior
> > [email protected]
> > Le lun. 23 juil. 2018 à 13:54, Julian Hyde <[email protected]> a écrit :
> > >
> > > For quite a while we have had partial support for MATCH_RECOGNIZE. We
> > support it in the parser and validator, but there is no runtime
> > implementation. It’s a shame, because MATCH_RECOGNIZE is an incredibly
> > powerful SQL feature for both traditional SQL (it’s in Oracle 12c) and for
> > continuous query (aka complex event processing - CEP).
> > >
> > > I figure it’s time to change that. My plan is to implement it
> > incrementally, getting simple queries working to start with, then allow
> > people to add more complex queries.
> > >
> > > In a dev branch [1], I’ve added a method Enumerables.match [2]. The idea
> > is that you supply an Enumerable of input data, a finite state machine
> > to figure out when a sequence of rows makes a match (represented by a
> > transition function: (state, row) -> state), and a function to convert a
> > matched set of rows to a set of output rows. The match method is fairly
> > straightforward, and I almost have it finished.
> > >
> > > The complexity is in generating the finite state machine, emitter
> > function, and so forth.
> > >
> > > Can someone help me with this task? If your idea of fun is implementing
> > database algorithms, this is about as much fun as it gets. You learned
> > about finite state machines in college - this is your chance to actually
> > write one!
> > >
> > > This might be a good joint project with the Flink community. I know
> > Flink are thinking of implementing CEP, and the algorithm we write here
> > could be shared with Flink (for use via Flink SQL or via the Flink API).
> > >
> > > Julian
> > >
> > > [1] https://github.com/julianhyde/calcite/commits/1935-match-recognize
> > >
> > > [2] https://github.com/julianhyde/calcite/commit/4dfaf1bbee718aa6694a8ce67d829c32d04c7e87#diff-8a97a64204db631471c563df7551f408R73
> >
>
>
>
> --
> Best regards,
> Sergey
>
