Hi Martijn,

There are many features available in the DataStream API of CEP not
supported in SQL, e.g. followedBy, notFollowedBy, followedByAny, etc. The
main reason is that the MATCH_RECOGNIZE clause which comes from SQL
standard doesn't define grammars for these semantics.

There are two ways to handle this:
1) Only support the features defined in MATCH_RECOGNIZE
2) Extend MATCH_RECOGNIZE to align with the other features in the
DataStream API of CEP

Regarding to 2), it means that we need to extend MATCH_RECOGNIZE in
calcite firstly and then support it in Flink. Personally I think it's worth
to do it. However, we need to think about it thoroughly, e.g. which
functionalities should be supported, what the grammar will be, etc. This
definitely deserves a separate FLIP and maybe should be discussed in
calcite community instead of Flink.

Regarding to the scope of DataStream API of CEP VS CEP SQL, I don't think
that SQL should align with all the functionalities provided in DataStream
API. DataStream API is more flexibility and so usually should be more
powerful.

Regards,
Dian


On Tue, Jun 7, 2022 at 8:37 PM Martijn Visser <martijnvis...@apache.org>
wrote:

> Hi Nicholas,
>
> It is disappointing that we can't support this in SQL. I am under the
> impression that currently all CEP capabilities are supported in both
> DataStream/Table API as well as SQL. If that's indeed the case, then I
> would rather have this also fixed for SQL to avoid introducing feature
> sparsity.
>
> Best regards,
>
> Martijn
>
> Op ma 6 jun. 2022 om 09:49 schreef Dian Fu <dian0511...@gmail.com>:
>
> > Hi Nicholas,
> >
> > Thanks a lot for the update.
> >
> > Regarding the pattern API, should we also introduce APIs such as
> > Pattern.times(int from, int to, Time windowTime) to indicate the time
> > interval between events matched in the loop?
> >
> > Regarding the naming of the classes, does it make sense to rename
> > `WithinType` to `InternalType` or `WindowType`? For the enum values
> inside
> > it, the current values(`BEFORE_AND_AFTER` and `FIRST_AND_LAST`) are not
> > intuitive for me. The candidates that come to my mind:
> > - `RELATIVE_TO_FIRST` and `RELATIVE_TO_PREVIOUS`
> > - `WHOLE_MATCH` and `RELATIVE_TO_PREVIOUS`
> >
> > Regards,
> > Dian
> >
> > On Tue, May 31, 2022 at 2:56 PM Nicholas Jiang <nicholasji...@apache.org
> >
> > wrote:
> >
> > > Hi Martijn,
> > >
> > > Sorry for later reply. This feature is only supported in DataStream and
> > > doesn't be supported in MATCH_RECOGNIZE because the SQL syntax of
> > > MATCH_RECOGNIZE does not contain the semantics of this feature, which
> > > requires modification of the SQL syntax. The support above
> > MATCH_RECOGNIZE
> > > is suitable for new FLIP to discuss.
> > >
> > > Regards,
> > > Nicholas Jiang
> > >
> > > On 2022/05/25 11:36:33 Martijn Visser wrote:
> > > > Hi Nicholas,
> > > >
> > > > Thanks for creating the FLIP, I can imagine that there will be many
> use
> > > > cases who can be created using this new feature.
> > > >
> > > > The FLIP doesn't mention anything with regards to SQL, could this
> > feature
> > > > also be supported when using MATCH_RECOGNIZE?
> > > >
> > > > Best regards,
> > > >
> > > > Martijn
> > > > https://twitter.com/MartijnVisser82
> > > > https://github.com/MartijnVisser
> > > >
> > > >
> > > > On Sat, 7 May 2022 at 11:17, Dian Fu <dian0511...@gmail.com> wrote:
> > > >
> > > > > Hi Nicholas,
> > > > >
> > > > > Thanks a lot for bringing up this discussion. If I recall it
> > correctly,
> > > > > this feature has been requested many times by the users and is
> among
> > > one of
> > > > > the most requested features in CEP. So big +1 to this feature
> > overall.
> > > > >
> > > > > Regarding the API, the name `partialWithin` sounds a little weird.
> Is
> > > it
> > > > > possible to find a name which is more intuitive? Other possible
> > > solutions:
> > > > > - Reuse the existing `Pattern.within` method and change its
> semantic
> > > to the
> > > > > maximum time interval between patterns. Currently `Pattern.within`
> is
> > > used
> > > > > to define the maximum time interval between the first event and the
> > > last
> > > > > event. However, the Pattern object represents only one node in a
> > > pattern
> > > > > sequence and so it doesn't make much sense to define the maximum
> time
> > > > > interval between the first event and the last event on the Pattern
> > > object,
> > > > > e.g. we could move it to  `PatternStreamBuilder`. However, if we
> > choose
> > > > > this option, we'd better consider how to keep backward
> compatibility.
> > > > > - Introduce a series of methods when appending a new pattern to the
> > > > > existing one, e.g. `Pattern.followedBy(Pattern<T, F> group, Time
> > > > > timeInterval)`. As timeInterval is a property between patterns and
> so
> > > it
> > > > > makes sense to define this property when appending a new pattern.
> > > However,
> > > > > the drawback is that we need to introduce a series of methods
> instead
> > > of
> > > > > only one method.
> > > > >
> > > > > We need also to make the semantic clear in a few corner cases, e.g.
> > > > > - What's the semantic of `A.followedBy(B).times(3).partialWithin(1
> > > min)`?
> > > > > Doesn't it mean that all three B events should occur in 1 minute or
> > > only
> > > > > the first B event should occur in 1 minute?
> > > > > - What's the semantic of
> > > > >
> > `A.followedBy(GroupPattern.begin("B").followedBy("C")).partialWithin(1
> > > > > min)``? Doesn't it mean that B and C should occur after A in 1
> > minute?
> > > > >
> > > > > Besides, this FLIP only describes how the newly introduced API will
> > be
> > > > > used, however, it lacks details about how you will implement it. It
> > > doesn't
> > > > > need to be very detailed, however, you should describe the basic
> > ideas
> > > > > behind it, e.g. how will you support
> > A.notFollowedBy(B).partialWithin(1
> > > > > min)? It could make sure that you have considered it thoroughly and
> > > also
> > > > > makes others confident that this feature could be implemented in a
> > > clean
> > > > > way.
> > > > >
> > > > > Regards,
> > > > > Dian
> > > > >
> > > > >
> > > > >
> > > > > On Fri, May 6, 2022 at 7:32 PM yue ma <mayuefi...@gmail.com>
> wrote:
> > > > >
> > > > > > hi Nicholas,
> > > > > >
> > > > > > Thanks for bringing this discussion, we also think it's a useful
> > > feature.
> > > > > > Some fine-grained timeout pattern matching  can be implemented in
> > CEP
> > > > > which
> > > > > > makes Flink CEP more powerful
> > > > > >
> > > > > > Nicholas <programg...@163.com> 于2022年5月5日周四 14:28写道:
> > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Pattern#withIn interface in CEP defines the maximum time
> interval
> > > in
> > > > > > which
> > > > > > > a matching pattern has to be completed in order to be
> considered
> > > valid,
> > > > > > > which interval corresponds to the maximum time gap between
> first
> > > and
> > > > > the
> > > > > > > last event. The interval representing the maximum time gap
> > between
> > > > > events
> > > > > > > is required to define in the scenario like purchasing good
> > within a
> > > > > > maximum
> > > > > > > of 5 minutes after browsing.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > I would like to start a discussion about FLIP-228[1], in which
> > > within
> > > > > > > between events is proposed in Pattern to support the definition
> > of
> > > the
> > > > > > > maximum time interval in which a completed partial matching
> > > pattern is
> > > > > > > considered valid, which interval represents the maximum time
> gap
> > > > > between
> > > > > > > events for partial matching Pattern.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Hence we propose the Pattern#partialWithin interface to define
> > the
> > > > > > maximum
> > > > > > > time interval in which a completed partial matching pattern is
> > > > > considered
> > > > > > > valid. Please take a look at the FLIP page [1] to get more
> > > details. Any
> > > > > > > feedback about the FLIP-228 would be appreciated!
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > [1]
> > > > > > >
> > > > > >
> > > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-228%3A+Support+Within+between+events+in+CEP+Pattern
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Best regards,
> > > > > > >
> > > > > > > Nicholas Jiang
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to