Hi everyone,

I would very much value these improvements; thanks for bringing this up for
discussion!

I recalled that timeouts had been discussed before, so I looked up the
thread. See
https://lists.apache.org/thread/tw0q4cno3og7hjo5ok1o5w4ytogh5o7l for
details.

Does anyone know whether timeouts and their syntax were discussed during the
ISO/IEC review https://www.iso.org/standard/84485.html or perhaps for the
one currently under development, https://www.iso.org/standard/76583.html ?

Thanks,

Martijn

On Mon, 18 Jul 2022 at 07:53, Nicholas Jiang <[email protected]> wrote:

> Hi Julian Hyde,
>
> Thanks for your feedback about the MATCH_RECOGNIZE functionality
> enhancement. Here are some explanations of the enhancement:
>
> - In MATCH_RECOGNIZE, the WITHIN clause is an optional clause that
> outputs a pattern_clause match if and only if the match occurs within the
> specified time duration. Hence, for the case where the match does not
> complete within the specified time, an optional clause that outputs a
> timed-out pattern_clause match should be introduced.
>
> - [^ <symbol>], ?? and {<symbol>} are proposed as enhancements to the
> pattern expression to support the notNext, opposite-of-consecutive, and
> until semantics for CEP scenarios. PTAL.
>
> Regards,
> Nicholas Jiang
>
> On 2022/07/13 18:17:59 Julian Hyde wrote:
> > I couldn’t tell whether timeout is the only enhancement proposed. If
> there are others let me know.
> >
> > Timeout is controversial. Some streaming systems use timeout, whereas
> others have more declarative ways of making progress, such as watermarks.
> In my experience, timeout-based logic in distributed systems tends to
> accumulate like duct tape. Therefore I would like to see evidence that the
> timeout-based approach is the right one for a significant fraction of
> Calcite projects.
> >
> > Julian
> >
> >
> > > On Jul 12, 2022, at 7:09 PM, Nicholas <[email protected]> wrote:
> > >
> > > Hi everyone,
> > >
> > >
> > >
> > >
> > > After investigating the usage of MATCH_RECOGNIZE, I have created a
> JIRA ticket '[CALCITE-5202] Support for MATCH_RECOGNIZE functionality
> enhancement'.
> > >
> > >
> > >
> > >
> > > A MATCH_RECOGNIZE clause enables the following tasks:
> > >
> > >
> > >
> > >
> > > - Logically partition and order the data that is used with the
> PARTITION BY and ORDER BY clauses.
> > >
> > >
> > >
> > >
> > > - Define patterns of rows to seek using the PATTERN clause. These
> patterns use a syntax similar to that of regular expressions.
> > >
> > >
> > >
> > >
> > > - The logical components of the row pattern variables are specified in
> the DEFINE clause.
> > >
> > >
> > >
> > >
> > > - Define measures, which are expressions usable in other parts of the
> SQL query, in the MEASURES clause.
> > >
> > >
> > >
> > >
> > > MATCH_RECOGNIZE currently cannot output timed-out matches, which is a
> common requirement in CEP scenarios. It also doesn't support the notNext,
> opposite-of-consecutive, and until semantics:
> > >
> > >
> > >
> > >
> > > - notNext means that no event matching this pattern may occur right
> after the preceding matched event.
> > >
> > >
> > >
> > >
> > > - consecutive works in conjunction with looping (multiple-times)
> matching and specifies that any non-matching element breaks the loop.
> > >
> > >
> > >
> > >
> > > - until applies a stop condition to a looping state, which allows the
> underlying state to be cleaned up.
> > >
> > >
> > >
> > >
> > > The syntax of enhanced MATCH_RECOGNIZE is proposed as follows:
> > >
> > >
> > >
> > >
> > > MATCH_RECOGNIZE (
> > >    [ PARTITION BY <expr> [, ... ] ]
> > >    [ ORDER BY <expr> [, ... ] ]
> > >    [ MEASURES <expr> [AS] <alias> [, ... ] ]
> > >    [ ONE ROW PER MATCH [ { SHOW TIMEOUT MATCHES } ] |
> > >      ALL ROWS PER MATCH [ { SHOW TIMEOUT MATCHES } ]
> > >    ]
> > >    [ AFTER MATCH SKIP
> > >          {
> > >          PAST LAST ROW   |
> > >          TO NEXT ROW   |
> > >          TO [ { FIRST | LAST} ] <symbol>
> > >          }
> > >    ]
> > >    PATTERN ( <pattern> )
> > >    DEFINE <symbol> AS <expr> [, ... ]
> > > )
> > >
> > >
> > >
> > >
> > > - SHOW TIMEOUT MATCHES is introduced to add timeout matches to the
> output.
> > >
> > >
> > >
> > >
> > > - [^ <symbol>] is proposed in <pattern> to express the notNext
> semantic. For example, A [^B] is translated to A.notNext(B).
> > >
> > >
> > >
> > >
> > > Usage Example:
> > >
> > >
> > >
> > >
> > > MEASURES
> > > A.id as aid
> > > ONE ROW PER MATCH
> > > PATTERN (A [^B])
> > > DEFINE
> > >    A as A.id = 'a',
> > >    B as B.id = 'b'
> > >
> > >
> > >
> > >
> > > - ?? is introduced in <pattern> to support the opposite of consecutive
> semantic. For example, A B+?? is translated to A.next(B).oneOrMore(). By
> contrast, A B+ is translated to A.next(B).oneOrMore().consecutive().
> > >
> > >
> > >
> > >
> > > Usage Example:
> > >
> > >
> > >
> > >
> > > MEASURES
> > > SUM(B.price) as amount
> > > ONE ROW PER MATCH
> > > PATTERN (A B+??)
> > > DEFINE
> > >    A as A.id = 'a',
> > >    B as B.id = 'b'
> > >
> > >
> > >
> > >
> > > - {<symbol>} is proposed in <pattern> to represent the until semantic.
> For example, A {- B*? -} C+ {D} is translated to
> A.followedBy(C).oneOrMore().until(D).
> > >
> > >
> > >
> > >
> > > Usage Example:
> > >
> > >
> > >
> > >
> > > MEASURES
> > > A.id as aid,
> > > SUM(C.price) as amount
> > > ONE ROW PER MATCH
> > > PATTERN (A {- B*? -} C+{D})
> > > DEFINE
> > >    A as A.id = 'a',
> > >    C as C.id = 'c',
> > >    D as SUM(C.price) > 100
> > >
> > >
> > >
> > >
> > > The above is the proposed syntax for the MATCH_RECOGNIZE
> enhancements. I look forward to any feedback on it.
> > >
> > >
> > >
> > >
> > > Best Regards,
> > >
> > > Nicholas Jiang
> >
> >
>
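
For readers following the thread, the difference between the proposed A B+ (strict contiguity inside the loop) and A B+?? (the opposite of consecutive) may be easier to see in a toy form. The sketch below is only an illustration under stated assumptions: the function name match_a_b_plus is hypothetical, events are reduced to their id strings, and only the single longest match starting at the first qualifying 'a' is returned. It is not how Calcite or any CEP library implements matching.

```python
def match_a_b_plus(ids, consecutive):
    """Return the indexes of the longest `A B+` match, or None.

    ids: event ids in arrival order (assumption: 'a' matches A, 'b' matches B).
    consecutive=True  -> any non-matching event ends the B loop (proposed `A B+`).
    consecutive=False -> non-matching events are skipped (proposed `A B+??`).
    """
    for start in range(len(ids) - 1):
        # A must be immediately followed by the first B (the `next` step).
        if ids[start] != 'a' or ids[start + 1] != 'b':
            continue
        matched = [start, start + 1]
        for i in range(start + 2, len(ids)):
            if ids[i] == 'b':
                matched.append(i)
            elif consecutive:
                break  # strict: a non-matching event breaks the loop
            # relaxed: ignore the non-matching event and keep looking
        return matched
    return None


events = ['a', 'b', 'b', 'x', 'b']
print(match_a_b_plus(events, consecutive=True))   # [0, 1, 2]
print(match_a_b_plus(events, consecutive=False))  # [0, 1, 2, 4]
```

With the sample input, the strict variant stops at the 'x' event, while the relaxed variant skips it and also picks up the trailing 'b'.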
