[
https://issues.apache.org/jira/browse/CALCITE-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566455#comment-17566455
]
Julian Hyde commented on CALCITE-5202:
--------------------------------------
I couldn’t tell whether timeout is the only enhancement proposed. If there are
others let me know.
Timeout is controversial. Some streaming systems use timeout, whereas others
have more declarative ways of making progress, such as watermarks. In my
experience, timeout-based logic in distributed systems tends to accumulate like
duct tape. Therefore I would like to see evidence that the timeout-based
approach is the right one for a significant fraction of Calcite projects.
> Support for MATCH_RECOGNIZE functionality enhancement
> -----------------------------------------------------
>
> Key: CALCITE-5202
> URL: https://issues.apache.org/jira/browse/CALCITE-5202
> Project: Calcite
> Issue Type: New Feature
> Reporter: Nicholas Jiang
> Priority: Major
>
> A MATCH_RECOGNIZE clause enables the following tasks:
> * Logically partition and order the data that is used with the PARTITION BY
> and ORDER BY clauses.
> * Define patterns of rows to seek using the PATTERN clause. These patterns
> use a syntax similar to that of regular expressions.
> * The logical components of the row pattern variables are specified in the
> DEFINE clause.
> * Define measures, which are expressions usable in other parts of the SQL
> query, in the MEASURES clause.
> MATCH_RECOGNIZE cannot currently output timed-out matches, which is a common
> requirement in CEP scenarios. It also does not support the notNext, opposite
> of consecutive, and until semantics:
> * notNext requires that no event matching this pattern occurs right after the
> preceding matched event.
> * consecutive works in conjunction with multiple-times (looping) matching and
> specifies that any non-matching element breaks the loop.
> * until applies a stop condition to a looping state, which allows the
> underlying state to be cleaned up.
> The syntax of enhanced MATCH_RECOGNIZE is proposed as follows:
> {code:sql}
> MATCH_RECOGNIZE (
> [ PARTITION BY <expr> [, ... ] ]
> [ ORDER BY <expr> [, ... ] ]
> [ MEASURES <expr> [AS] <alias> [, ... ] ]
> [ ONE ROW PER MATCH [ { SHOW TIMEOUT MATCHES } ] |
> ALL ROWS PER MATCH [ { SHOW TIMEOUT MATCHES } ]
> ]
> [ AFTER MATCH SKIP
> {
> PAST LAST ROW |
> TO NEXT ROW |
> TO [ { FIRST | LAST} ] <symbol>
> }
> ]
> PATTERN ( <pattern> )
> DEFINE <symbol> AS <expr> [, ... ]
> )
> {code}
> * SHOW TIMEOUT MATCHES is introduced to add timed-out matches to the output.
> * [^ <symbol>] is proposed in <pattern> to express the notNext semantics. For
> example, A [^B] is translated to A.notNext(B).
> * ?? is introduced in <pattern> to express the opposite of the consecutive
> semantics. For example, A B+?? is translated to A.next(B).oneOrMore(). By
> contrast, A B+ is translated to A.next(B).oneOrMore().consecutive().
> * {<symbol>} is proposed in <pattern> to represent the until semantics. For
> example, A {- B*? -} C+{D} is translated to
> A.followedBy(C).oneOrMore().until(D).
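
As a rough illustration of what SHOW TIMEOUT MATCHES might emit for a pattern
A B, here is a minimal Python sketch. It is not Calcite code and nothing in it
comes from the issue: Event, match_a_then_b and the within interval are
hypothetical names, and it assumes that a partial match times out once event
time moves past its deadline. The proposed grammar does not say how the time
limit is declared, so the sketch simply takes it as a parameter.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    ts: int    # event time, in arbitrary units
    kind: str  # pattern variable the row satisfies, e.g. "A" or "B"


def match_a_then_b(
    events: List[Event], within: int
) -> Tuple[List[Tuple[Event, Event]], List[Event]]:
    """Return (complete matches, timed-out partial matches) for pattern A B.

    Assumed semantics (hypothetical, for illustration only):
    - an A row starts a partial match if none is pending;
    - the next B row completes it if it arrives at most `within` after A;
    - a pending A is reported as timed out once a later row shows that
      event time has passed its deadline.
    """
    matches: List[Tuple[Event, Event]] = []
    timeouts: List[Event] = []
    pending: Optional[Event] = None

    for ev in sorted(events, key=lambda e: e.ts):
        # Event time of the current row acts like a watermark: it expires
        # a pending partial match whose deadline has already passed.
        if pending is not None and ev.ts - pending.ts > within:
            timeouts.append(pending)
            pending = None

        if ev.kind == "A":
            if pending is None:
                pending = ev  # start a new partial match
        elif ev.kind == "B" and pending is not None:
            matches.append((pending, ev))  # complete the match
            pending = None

    # A partial match still pending here is neither complete nor timed out:
    # no later row has advanced event time past its deadline.
    return matches, timeouts
```

With rows A@0, B@3, A@10, B@25, A@30 and within=5, the sketch returns one
complete match (A@0, B@3) and one timed-out partial match (A@10). The final
A@30 stays pending because nothing arrives after it, which is the kind of
progress question raised in the comment above: something, whether a timer or
a watermark, has to decide when a partial match is finished.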