[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690908#comment-16690908 ] ASF GitHub Bot commented on FLINK-10625: dawidwys closed pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/docs/dev/table/sql.md b/docs/dev/table/sql.md index b1bd572d4d2..90e20065726 100644 --- a/docs/dev/table/sql.md +++ b/docs/dev/table/sql.md @@ -163,6 +163,7 @@ joinCondition: tableReference: tablePrimary + [ matchRecognize ] [ [ AS ] alias [ '(' columnAlias [, columnAlias ]* ')' ] ] tablePrimary: @@ -196,6 +197,45 @@ windowSpec: ] ')' +matchRecognize: + MATCH_RECOGNIZE '(' + [ PARTITION BY expression [, expression ]* ] + [ ORDER BY orderItem [, orderItem ]* ] + [ MEASURES measureColumn [, measureColumn ]* ] + [ ONE ROW PER MATCH ] + [ AFTER MATCH +( SKIP TO NEXT ROW +| SKIP PAST LAST ROW +| SKIP TO FIRST variable +| SKIP TO LAST variable +| SKIP TO variable ) + ] + PATTERN '(' pattern ')' + DEFINE variable AS condition [, variable AS condition ]* + ')' + +measureColumn: + expression AS alias + +pattern: + patternTerm [ '|' patternTerm ]* + +patternTerm: + patternFactor [ patternFactor ]* + +patternFactor: + variable [ patternQuantifier ] + +patternQuantifier: + '*' + | '*?' + | '+' + | '+?' + | '?' + | '??' + | '{' { [ minRepeat ], [ maxRepeat ] } '}' ['?'] + | '{' repeat '}' + {% endhighlight %} Flink SQL uses a lexical policy for identifier (table, attribute, function names) similar to Java: @@ -756,7 +796,6 @@ Group windows are defined in the `GROUP BY` clause of a SQL query. 
Just like que - Time Attributes For SQL queries on streaming tables, the `time_attr` argument of the group window function must refer to a valid time attribute that specifies the processing time or event time of rows. See the [documentation of time attributes](streaming/time_attributes.html) to learn how to define time attributes. @@ -902,6 +941,52 @@ val result4 = tableEnv.sqlQuery( {% top %} +### Pattern Recognition + + + + + + Operation + Description + + + + + +MATCH_RECOGNIZE +Streaming + + +Searches for a given pattern in a streaming table according to the MATCH_RECOGNIZE <a href="https://standards.iso.org/ittf/PubliclyAvailableStandards/c065143_ISO_IEC_TR_19075-5_2016.zip">ISO standard</a>. This makes it possible to express complex event processing (CEP) logic in SQL queries. +For a more detailed description, see the dedicated page for detecting patterns in tables. + +{% highlight sql %} +SELECT T.aid, T.bid, T.cid +FROM MyTable +MATCH_RECOGNIZE ( + PARTITION BY userid + ORDER BY proctime + MEASURES +A.id AS aid, +B.id AS bid, +C.id AS cid + PATTERN (A B C) + DEFINE +A AS name = 'a', +B AS name = 'b', +C AS name = 'c' +) AS T +{% endhighlight %} + + + + + + + +{% top %} + Data Types -- diff --git a/docs/dev/table/streaming/match_recognize.md b/docs/dev/table/streaming/match_recognize.md new file mode 100644 index 000..b12cbe5e0d9 --- /dev/null +++ b/docs/dev/table/streaming/match_recognize.md @@ -0,0 +1,842 @@ +--- +title: 'Detecting Patterns in Tables' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting Patterns' +nav-pos: 5 +is_beta: true +--- + + +It is a common use case to search for a set of event patterns, especially in case of data streams. Flink +comes with a [complex event processing (CEP) library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. 
Furthermore, Flink's +SQL API provides a relational way of expressing queries with a large set of built-in functions and rule-based optimizations that can be used out of the box. + +In December 2016, the International Organization for Standardization (ISO) released a new version of the SQL standard which includes _Row Pattern Recognition in SQL_ ([ISO/IEC TR 19075-5:2016](https://standards.iso.org/ittf/PubliclyAvailableStandards/c065143_ISO_IEC_TR_19075-5_2016.zip)). It allows Flink to consolidate CEP and SQL API using the `MATCH_RECOGNIZE` clause for complex event processing in SQL. + +A `MATCH_RECOGNIZE` clause enables the following tasks: +* Logically partition and order the data that is used with the `PARTITION BY` and `ORDER BY` clauses. +* Define patterns of rows to seek using the `PATTERN` clause. These patterns
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688345#comment-16688345 ] ASF GitHub Bot commented on FLINK-10625: alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r23328 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,843 @@ +--- +title: 'Detecting Patterns in Tables' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting Patterns' +nav-pos: 5 +is_beta: true +--- + + +It is a common use-case to search for a set of event patterns, especially in case of data streams. Flink +comes with a [complex event processing (CEP) library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. Furthermore, Flink's +SQL API provides a relational way of expressing queries with a large set of built-in functions and rule-based optimizations that can be used out of the box. + +In December 2016, the International Organization for Standardization (ISO) released a new version of the SQL standard which includes _Row Pattern Recognition in SQL_ ([ISO/IEC TR 19075-5:2016](https://standards.iso.org/ittf/PubliclyAvailableStandards/c065143_ISO_IEC_TR_19075-5_2016.zip)). It allows Flink to consolidate CEP and SQL API using the `MATCH_RECOGNIZE` clause for complex event processing in SQL. + +A `MATCH_RECOGNIZE` clause enables the following tasks: +* Logically partition and order the data that is used with `PARTITION BY` and `ORDER BY` clauses. +* Define patterns of rows to seek using the `PATTERN` clause. These patterns use a syntax similar to that of regular expressions. +* Specify logical conditions required to map a row to a row pattern variable in the `DEFINE` clause. Review comment: ```suggestion * The logical components of the row pattern variables are specified in the `DEFINE` clause. 
``` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add MATCH_RECOGNIZE documentation > - > > Key: FLINK-10625 > URL: https://issues.apache.org/jira/browse/FLINK-10625 > Project: Flink > Issue Type: Sub-task > Components: Documentation, Table API SQL >Affects Versions: 1.7.0 >Reporter: Till Rohrmann >Assignee: Dawid Wysakowicz >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > The newly added {{MATCH_RECOGNIZE}} functionality needs to be documented. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
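The bullets quoted in the message above say that `PATTERN` uses a regular-expression-like syntax and that `DEFINE` maps rows to pattern variables. A rough way to picture this, as a plain-Python sketch rather than Flink code: label each row with the variable whose condition it satisfies, then run an ordinary regular expression over the labels. The names `rows`, `define`, and `classify` are hypothetical, the `+` quantifier is taken from the grammar in the diff, and the sketch assumes conditions that look at a single row and never overlap, which the real clause does not require.

```python
import re

# Hypothetical rows, assumed to be already partitioned and ordered.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 3, "name": "b"},
    {"id": 4, "name": "c"},
    {"id": 5, "name": "x"},
]

# Stand-in for DEFINE: one row-local condition per pattern variable (assumption).
define = {
    "A": lambda r: r["name"] == "a",
    "B": lambda r: r["name"] == "b",
    "C": lambda r: r["name"] == "c",
}


def classify(row):
    """Return the first variable whose condition holds, or '_' if none does."""
    for var, cond in define.items():
        if cond(row):
            return var
    return "_"


# One label per row, in row order.
symbols = "".join(classify(r) for r in rows)

# Stand-in for PATTERN (A B+ C): a regular expression over the label string.
matches = [
    [rows[i]["id"] for i in range(m.start(), m.end())]
    for m in re.finditer(r"AB+C", symbols)
]
print(matches)
```

The sketch only conveys the analogy; conditions that refer to previously matched rows, or rows that satisfy several variables at once, would need a real matcher instead of a label string.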
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688370#comment-16688370 ] ASF GitHub Bot commented on FLINK-10625: alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233922000 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,843 @@ +--- +title: 'Detecting Patterns in Tables' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting Patterns' +nav-pos: 5 +is_beta: true +--- + + +It is a common use-case to search for a set of event patterns, especially in case of data streams. Flink +comes with a [complex event processing (CEP) library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. Furthermore, Flink's +SQL API provides a relational way of expressing queries with a large set of built-in functions and rule-based optimizations that can be used out of the box. + +In December 2016, the International Organization for Standardization (ISO) released a new version of the SQL standard which includes _Row Pattern Recognition in SQL_ ([ISO/IEC TR 19075-5:2016](https://standards.iso.org/ittf/PubliclyAvailableStandards/c065143_ISO_IEC_TR_19075-5_2016.zip)). It allows Flink to consolidate CEP and SQL API using the `MATCH_RECOGNIZE` clause for complex event processing in SQL. + +A `MATCH_RECOGNIZE` clause enables the following tasks: +* Logically partition and order the data that is used with `PARTITION BY` and `ORDER BY` clauses. +* Define patterns of rows to seek using the `PATTERN` clause. These patterns use a syntax similar to that of regular expressions. +* Specify logical conditions required to map a row to a row pattern variable in the `DEFINE` clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the `MEASURES` clause. 
+ +The following example illustrates the syntax for basic pattern recognition: + +{% highlight sql %} +SELECT T.aid, T.bid, T.cid +FROM MyTable +MATCH_RECOGNIZE ( + PARTITION BY userid + ORDER BY proctime + MEASURES +A.id AS aid, +B.id AS bid, +C.id AS cid + PATTERN (A B C) + DEFINE +A AS name = 'a', +B AS name = 'b', +C AS name = 'c' +) AS T +{% endhighlight %} + +This page will explain each keyword in more detail and will illustrate more complex examples. + +Attention The `MATCH_RECOGNIZE` clause implementation in Flink is just a subset of the bigger pattern recognition standard. Only the features that are documented in the following sections are supported so far. Since the development is still in an early phase, please also take a look at the [known limitations](#known-limitations). + +* This will be replaced by the TOC +{:toc} + +Introduction and Examples +- + +### Installation Guide + +The pattern recognition feature uses Apache Flink's CEP library internally. In order to be able to use the `MATCH_RECOGNIZE` clause, +the library needs to be added as a dependency to your Maven project. + +{% highlight xml %} +<dependency> + <groupId>org.apache.flink</groupId> + <artifactId>flink-cep{{ site.scala_version_suffix }}</artifactId> + <version>{{ site.version }}</version> +</dependency> +{% endhighlight %} + +Alternatively, you can also add the dependency to the cluster classpath (see the [dependency section]({{ site.baseurl}}/projectsetup/dependencies.html) for more information). + +If you want to use the `MATCH_RECOGNIZE` clause in the [SQL Client]({{ site.baseurl}}/dev/table/sqlClient.html), +you don't have to do anything as all the dependencies are included by default. + +### SQL Semantics + +Every `MATCH_RECOGNIZE` query consists of the following clauses: + +* [PARTITION BY](#partitioning) - defines the logical partitioning of the table; similar to a `GROUP BY` operation. +* [ORDER BY](#order-of-events) - specifies how the incoming rows should be ordered; this is essential as patterns depend on an order. 
+* [MEASURES](#define--measures) - defines the output of the clause; similar to a `SELECT` clause. +* [ONE ROW PER MATCH](#output-mode) - output mode which defines how many rows per match should be produced. +* [AFTER MATCH SKIP](#after-match-strategy) - specifies where the next match should start; this is also a way to control how many distinct matches a single event can belong to. +* [PATTERN](#defining-pattern) - allows constructing patterns that will be searched for using a _regular expression_-like syntax. +* [DEFINE](#define--measures) - defines the conditions a row must satisfy in order to be mapped to the corresponding pattern variable. + +Attention Currently, the `MATCH_RECOGNIZE` clause can only be applied to an [append table](dynamic_tables.html#update-and-append-queries). Furthermore, it always produces +an append table as well. + +### Examples + +For our examples, we assume that a
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688347#comment-16688347 ] ASF GitHub Bot commented on FLINK-10625: alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233889947 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,843 @@ +--- +title: 'Detecting Patterns in Tables' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting Patterns' +nav-pos: 5 +is_beta: true +--- + + +It is a common use-case to search for a set of event patterns, especially in case of data streams. Flink +comes with a [complex event processing (CEP) library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. Furthermore, Flink's +SQL API provides a relational way of expressing queries with a large set of built-in functions and rule-based optimizations that can be used out of the box. + +In December 2016, the International Organization for Standardization (ISO) released a new version of the SQL standard which includes _Row Pattern Recognition in SQL_ ([ISO/IEC TR 19075-5:2016](https://standards.iso.org/ittf/PubliclyAvailableStandards/c065143_ISO_IEC_TR_19075-5_2016.zip)). It allows Flink to consolidate CEP and SQL API using the `MATCH_RECOGNIZE` clause for complex event processing in SQL. + +A `MATCH_RECOGNIZE` clause enables the following tasks: +* Logically partition and order the data that is used with `PARTITION BY` and `ORDER BY` clauses. +* Define patterns of rows to seek using the `PATTERN` clause. These patterns use a syntax similar to that of regular expressions. +* Specify logical conditions required to map a row to a row pattern variable in the `DEFINE` clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the `MEASURES` clause. 
+ +The following example illustates the syntax for basic pattern recognition: + +{% highlight sql %} +SELECT T.aid, T.bid, T.cid +FROM MyTable +MATCH_RECOGNIZE ( + PARTITION BY userid + ORDER BY proctime + MEASURES +A.id AS aid, +B.id AS bid, +C.id AS cid + PATTERN (A B C) + DEFINE +A AS name = 'a', +B AS name = 'b', +C AS name = 'c' +) AS T +{% endhighlight %} + +This page will explain each keyword in more detail and will illustrate more complex examples. + +Attention The `MATCH_RECOGNIZE` clause implementation in Flink is just a subset of the bigger pattern recognition standard. Only the features that are documented in the following sections are supported so far. Since the development is still in an early phase, please also take a look at the [known limitations](#known-limitations). Review comment: ```suggestion Attention Flink's implementation of the `MATCH_RECOGNIZE` clause is a subset of the full standard. Only those features documented in the following sections are supported. Since the development is still in an early phase, please also take a look at the [known limitations](#known-limitations). ``` This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add MATCH_RECOGNIZE documentation > - > > Key: FLINK-10625 > URL: https://issues.apache.org/jira/browse/FLINK-10625 > Project: Flink > Issue Type: Sub-task > Components: Documentation, Table API SQL >Affects Versions: 1.7.0 >Reporter: Till Rohrmann >Assignee: Dawid Wysakowicz >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > The newly added {{MATCH_RECOGNIZE}} functionality needs to be documented. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
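As an illustration of the clauses quoted in the message above, here is a slightly fuller variant of the basic example. It is a minimal sketch and not part of the pull request: it reuses `MyTable` and the `userid`, `proctime`, `id`, and `name` columns assumed by the quoted query, and it assumes that the optional `ONE ROW PER MATCH` and `AFTER MATCH SKIP PAST LAST ROW` clauses, the `+` quantifier, and the `LAST` function are among the supported features.

```sql
-- Sketch only: MyTable and its columns are assumed from the quoted example.
SELECT T.aid, T.last_bid, T.cid
FROM MyTable
MATCH_RECOGNIZE (
  PARTITION BY userid              -- match patterns independently per user
  ORDER BY proctime                -- rows are evaluated in processing-time order
  MEASURES
    A.id AS aid,
    LAST(B.id) AS last_bid,        -- B may map to several rows; take the last one
    C.id AS cid
  ONE ROW PER MATCH                -- emit a single summary row for each match
  AFTER MATCH SKIP PAST LAST ROW   -- resume matching after the row mapped to C
  PATTERN (A B+ C)                 -- one 'a', one or more 'b', then one 'c'
  DEFINE
    A AS name = 'a',
    B AS name = 'b',
    C AS name = 'c'
) AS T
```

With `AFTER MATCH SKIP PAST LAST ROW`, the next match can only start after the last row of the previous one, so within a partition a row belongs to at most one match.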
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688364#comment-16688364 ] ASF GitHub Bot commented on FLINK-10625: alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233916241 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,843 @@ +--- +title: 'Detecting Patterns in Tables' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting Patterns' +nav-pos: 5 +is_beta: true +--- + + +It is a common use-case to search for a set of event patterns, especially in case of data streams. Flink +comes with a [complex event processing (CEP) library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. Furthermore, Flink's +SQL API provides a relational way of expressing queries with a large set of built-in functions and rule-based optimizations that can be used out of the box. + +In December 2016, the International Organization for Standardization (ISO) released a new version of the SQL standard which includes _Row Pattern Recognition in SQL_ ([ISO/IEC TR 19075-5:2016](https://standards.iso.org/ittf/PubliclyAvailableStandards/c065143_ISO_IEC_TR_19075-5_2016.zip)). It allows Flink to consolidate CEP and SQL API using the `MATCH_RECOGNIZE` clause for complex event processing in SQL. + +A `MATCH_RECOGNIZE` clause enables the following tasks: +* Logically partition and order the data that is used with `PARTITION BY` and `ORDER BY` clauses. +* Define patterns of rows to seek using the `PATTERN` clause. These patterns use a syntax similar to that of regular expressions. +* Specify logical conditions required to map a row to a row pattern variable in the `DEFINE` clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the `MEASURES` clause. 
+ +The following example illustrates the syntax for basic pattern recognition: + +{% highlight sql %} +SELECT T.aid, T.bid, T.cid +FROM MyTable +MATCH_RECOGNIZE ( + PARTITION BY userid + ORDER BY proctime + MEASURES +A.id AS aid, +B.id AS bid, +C.id AS cid + PATTERN (A B C) + DEFINE +A AS name = 'a', +B AS name = 'b', +C AS name = 'c' +) AS T +{% endhighlight %} + +This page will explain each keyword in more detail and will illustrate more complex examples. + +Attention The `MATCH_RECOGNIZE` clause implementation in Flink is just a subset of the bigger pattern recognition standard. Only the features that are documented in the following sections are supported so far. Since the development is still in an early phase, please also take a look at the [known limitations](#known-limitations). + +* This will be replaced by the TOC +{:toc} + +Introduction and Examples +- + +### Installation Guide + +The pattern recognition feature uses the Apache Flink's CEP library internally. In order to be able to use the `MATCH_RECOGNIZE` clause, +the library needs to be added as a dependency to your Maven project. + +{% highlight xml %} + + org.apache.flink + flink-cep{{ site.scala_version_suffix }} + {{ site.version }} + +{% endhighlight %} + +Alternatively, you can also add the dependency to the cluster classpath (see the [dependency section]({{ site.baseurl}}/projectsetup/dependencies.html) for more information). + +If you want to use the `MATCH_RECOGNIZE` clause in the [SQL Client]({{ site.baseurl}}/dev/table/sqlClient.html), +you don't have to do anything as all the dependencies are included by default. + +### SQL Semantics + +Every `MATCH_RECOGNIZE` query consists of the following clauses: + +* [PARTITION BY](#partitioning) - defines the logical partitioning of the table; similar to a `GROUP BY` operation. +* [ORDER BY](#order-of-events) - specifies how the incoming rows should be ordered; this is essential as patterns depend on an order. 
+* [MEASURES](#define--measures) - defines output of the clause; similar to a `SELECT` clause. +* [ONE ROW PER MATCH](#output-mode) - output mode which defines how many rows per match should be produced. +* [AFTER MATCH SKIP](#after-match-strategy) - allows to specify where the next match should start; this is also a way to control how many distinct matches a single event can belong to. +* [PATTERN](#defining-pattern) - allows constructing patterns that will be searched for using a _regular expression_-like syntax. +* [DEFINE](#define--measures) - this section defines conditions on rows that should be met in order to be qualified to the corresponding pattern variable. + +Attention Currently, the `MATCH_RECOGNIZE` clause can only be applied to an [append table](dynamic_tables.html#update-and-append-queries). Furthermore, it always produces +an append table as well. + +### Examples + +For our examples, we assume that a
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688342#comment-16688342 ] ASF GitHub Bot commented on FLINK-10625: alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233884735 ## File path: docs/dev/table/sql.md ## @@ -902,6 +941,52 @@ val result4 = tableEnv.sqlQuery( {% top %} +### Pattern Recognition + + + + + + Operation + Description + + + + + +MATCH_RECOGNIZE +Streaming + + +Searches for a given pattern in a streaming table according to the MATCH_RECOGNIZE <a href="https://standards.iso.org/ittf/PubliclyAvailableStandards/c065143_ISO_IEC_TR_19075-5_2016.zip">ISO standard</a>. This enables to express complex event processing (CEP) logic in SQL queries. Review comment: ```suggestion Searches for a given pattern in a streaming table according to the MATCH_RECOGNIZE <a href="https://standards.iso.org/ittf/PubliclyAvailableStandards/c065143_ISO_IEC_TR_19075-5_2016.zip">ISO standard</a>. This makes it possible to express complex event processing (CEP) logic in SQL queries. ```
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688358#comment-16688358 ] ASF GitHub Bot commented on FLINK-10625: alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233902190 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,843 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688352#comment-16688352 ] ASF GitHub Bot commented on FLINK-10625: alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233898951 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,843 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688346#comment-16688346 ] ASF GitHub Bot commented on FLINK-10625: alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233895200 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,843 @@ +--- +title: 'Detecting Patterns in Tables' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting Patterns' +nav-pos: 5 +is_beta: true +--- + + +It is a common use-case to search for a set of event patterns, especially in case of data streams. Flink +comes with a [complex event processing (CEP) library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. Furthermore, Flink's +SQL API provides a relational way of expressing queries with a large set of built-in functions and rule-based optimizations that can be used out of the box. + +In December 2016, the International Organization for Standardization (ISO) released a new version of the SQL standard which includes _Row Pattern Recognition in SQL_ ([ISO/IEC TR 19075-5:2016](https://standards.iso.org/ittf/PubliclyAvailableStandards/c065143_ISO_IEC_TR_19075-5_2016.zip)). It allows Flink to consolidate CEP and SQL API using the `MATCH_RECOGNIZE` clause for complex event processing in SQL. + +A `MATCH_RECOGNIZE` clause enables the following tasks: +* Logically partition and order the data that is used with `PARTITION BY` and `ORDER BY` clauses. +* Define patterns of rows to seek using the `PATTERN` clause. These patterns use a syntax similar to that of regular expressions. +* Specify logical conditions required to map a row to a row pattern variable in the `DEFINE` clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the `MEASURES` clause. 
+ +The following example illustates the syntax for basic pattern recognition: + +{% highlight sql %} +SELECT T.aid, T.bid, T.cid +FROM MyTable +MATCH_RECOGNIZE ( + PARTITION BY userid + ORDER BY proctime + MEASURES +A.id AS aid, +B.id AS bid, +C.id AS cid + PATTERN (A B C) + DEFINE +A AS name = 'a', +B AS name = 'b', +C AS name = 'c' +) AS T +{% endhighlight %} + +This page will explain each keyword in more detail and will illustrate more complex examples. + +Attention The `MATCH_RECOGNIZE` clause implementation in Flink is just a subset of the bigger pattern recognition standard. Only the features that are documented in the following sections are supported so far. Since the development is still in an early phase, please also take a look at the [known limitations](#known-limitations). + +* This will be replaced by the TOC +{:toc} + +Introduction and Examples +- + +### Installation Guide + +The pattern recognition feature uses the Apache Flink's CEP library internally. In order to be able to use the `MATCH_RECOGNIZE` clause, +the library needs to be added as a dependency to your Maven project. + +{% highlight xml %} + + org.apache.flink + flink-cep{{ site.scala_version_suffix }} + {{ site.version }} + +{% endhighlight %} + +Alternatively, you can also add the dependency to the cluster classpath (see the [dependency section]({{ site.baseurl}}/projectsetup/dependencies.html) for more information). + +If you want to use the `MATCH_RECOGNIZE` clause in the [SQL Client]({{ site.baseurl}}/dev/table/sqlClient.html), +you don't have to do anything as all the dependencies are included by default. + +### SQL Semantics + +Every `MATCH_RECOGNIZE` query consists of the following clauses: + +* [PARTITION BY](#partitioning) - defines the logical partitioning of the table; similar to a `GROUP BY` operation. +* [ORDER BY](#order-of-events) - specifies how the incoming rows should be ordered; this is essential as patterns depend on an order. 
+* [MEASURES](#define--measures) - defines output of the clause; similar to a `SELECT` clause. +* [ONE ROW PER MATCH](#output-mode) - output mode which defines how many rows per match should be produced. +* [AFTER MATCH SKIP](#after-match-strategy) - allows to specify where the next match should start; this is also a way to control how many distinct matches a single event can belong to. +* [PATTERN](#defining-pattern) - allows constructing patterns that will be searched for using a _regular expression_-like syntax. +* [DEFINE](#define--measures) - this section defines conditions on rows that should be met in order to be qualified to the corresponding pattern variable. Review comment: ```suggestion * [DEFINE](#define--measures) - this section defines the conditions that the pattern variables must satisfy. ```
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688343#comment-16688343 ] ASF GitHub Bot commented on FLINK-10625: alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233885365 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,843 @@ +--- +title: 'Detecting Patterns in Tables' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting Patterns' +nav-pos: 5 +is_beta: true +--- + + +It is a common use-case to search for a set of event patterns, especially in case of data streams. Flink Review comment: ```suggestion It is a common use case to search for a set of event patterns, especially in case of data streams. Flink ```
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688349#comment-16688349 ] ASF GitHub Bot commented on FLINK-10625: alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233899832 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,843 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688362#comment-16688362 ] ASF GitHub Bot commented on FLINK-10625: alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233917454 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,843 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688350#comment-16688350 ]

ASF GitHub Bot commented on FLINK-10625:

alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233892257

## File path: docs/dev/table/streaming/match_recognize.md ##
@@ -0,0 +1,843 @@
+---
+title: 'Detecting Patterns in Tables'
+nav-parent_id: streaming_tableapi
+nav-title: 'Detecting Patterns'
+nav-pos: 5
+is_beta: true
+---
+
+
+It is a common use-case to search for a set of event patterns, especially in case of data streams. Flink
+comes with a [complex event processing (CEP) library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. Furthermore, Flink's
+SQL API provides a relational way of expressing queries with a large set of built-in functions and rule-based optimizations that can be used out of the box.
+
+In December 2016, the International Organization for Standardization (ISO) released a new version of the SQL standard which includes _Row Pattern Recognition in SQL_ ([ISO/IEC TR 19075-5:2016](https://standards.iso.org/ittf/PubliclyAvailableStandards/c065143_ISO_IEC_TR_19075-5_2016.zip)). It allows Flink to consolidate CEP and SQL API using the `MATCH_RECOGNIZE` clause for complex event processing in SQL.
+
+A `MATCH_RECOGNIZE` clause enables the following tasks:
+* Logically partition and order the data that is used with `PARTITION BY` and `ORDER BY` clauses.
+* Define patterns of rows to seek using the `PATTERN` clause. These patterns use a syntax similar to that of regular expressions.
+* Specify logical conditions required to map a row to a row pattern variable in the `DEFINE` clause.
+* Define measures, which are expressions usable in other parts of the SQL query, in the `MEASURES` clause.
+
+The following example illustrates the syntax for basic pattern recognition:
+
+{% highlight sql %}
+SELECT T.aid, T.bid, T.cid
+FROM MyTable
+MATCH_RECOGNIZE (
+  PARTITION BY userid
+  ORDER BY proctime
+  MEASURES
+    A.id AS aid,
+    B.id AS bid,
+    C.id AS cid
+  PATTERN (A B C)
+  DEFINE
+    A AS name = 'a',
+    B AS name = 'b',
+    C AS name = 'c'
+) AS T
+{% endhighlight %}
+
+This page will explain each keyword in more detail and will illustrate more complex examples.
+
+Attention The `MATCH_RECOGNIZE` clause implementation in Flink is just a subset of the bigger pattern recognition standard. Only the features that are documented in the following sections are supported so far. Since the development is still in an early phase, please also take a look at the [known limitations](#known-limitations).
+
+* This will be replaced by the TOC
+{:toc}
+
+Introduction and Examples
+-
+
+### Installation Guide
+
+The pattern recognition feature uses the Apache Flink's CEP library internally. In order to be able to use the `MATCH_RECOGNIZE` clause,
+the library needs to be added as a dependency to your Maven project.
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-cep{{ site.scala_version_suffix }}</artifactId>
+  <version>{{ site.version }}</version>
+</dependency>
+{% endhighlight %}
+
+Alternatively, you can also add the dependency to the cluster classpath (see the [dependency section]({{ site.baseurl}}/projectsetup/dependencies.html) for more information).
+
+If you want to use the `MATCH_RECOGNIZE` clause in the [SQL Client]({{ site.baseurl}}/dev/table/sqlClient.html),
+you don't have to do anything as all the dependencies are included by default.
+
+### SQL Semantics
+
+Every `MATCH_RECOGNIZE` query consists of the following clauses:
+
+* [PARTITION BY](#partitioning) - defines the logical partitioning of the table; similar to a `GROUP BY` operation.
+* [ORDER BY](#order-of-events) - specifies how the incoming rows should be ordered; this is essential as patterns depend on an order.
+* [MEASURES](#define--measures) - defines output of the clause; similar to a `SELECT` clause.
+* [ONE ROW PER MATCH](#output-mode) - output mode which defines how many rows per match should be produced.
+* [AFTER MATCH SKIP](#after-match-strategy) - allows to specify where the next match should start; this is also a way to control how many distinct matches a single event can belong to.

Review comment:
```suggestion
* [AFTER MATCH SKIP](#after-match-strategy) - specifies where the next match should start; this is also a way to control how many distinct matches a single event can belong to.
```

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Add
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688359#comment-16688359 ]

ASF GitHub Bot commented on FLINK-10625:

alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233920041

## File path: docs/dev/table/streaming/match_recognize.md ##
@@ -0,0 +1,843 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688363#comment-16688363 ]

ASF GitHub Bot commented on FLINK-10625:

alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233920293

## File path: docs/dev/table/streaming/match_recognize.md ##
@@ -0,0 +1,843 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688368#comment-16688368 ]

ASF GitHub Bot commented on FLINK-10625:

alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233914439

## File path: docs/dev/table/streaming/match_recognize.md ##
@@ -0,0 +1,843 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688355#comment-16688355 ]

ASF GitHub Bot commented on FLINK-10625:

alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233903497

## File path: docs/dev/table/streaming/match_recognize.md ##
@@ -0,0 +1,843 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688367#comment-16688367 ]

ASF GitHub Bot commented on FLINK-10625:

alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233922552

## File path: docs/dev/table/streaming/match_recognize.md ##
@@ -0,0 +1,843 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688366#comment-16688366 ]

ASF GitHub Bot commented on FLINK-10625:

alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233913599

## File path: docs/dev/table/streaming/match_recognize.md
##
@@ -0,0 +1,843 @@
+---
+title: 'Detecting Patterns in Tables'
+nav-parent_id: streaming_tableapi
+nav-title: 'Detecting Patterns'
+nav-pos: 5
+is_beta: true
+---
+
+
+It is a common use-case to search for a set of event patterns, especially in case of data streams. Flink
+comes with a [complex event processing (CEP) library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. Furthermore, Flink's
+SQL API provides a relational way of expressing queries with a large set of built-in functions and rule-based optimizations that can be used out of the box.
+
+In December 2016, the International Organization for Standardization (ISO) released a new version of the SQL standard which includes _Row Pattern Recognition in SQL_ ([ISO/IEC TR 19075-5:2016](https://standards.iso.org/ittf/PubliclyAvailableStandards/c065143_ISO_IEC_TR_19075-5_2016.zip)). It allows Flink to consolidate CEP and SQL API using the `MATCH_RECOGNIZE` clause for complex event processing in SQL.
+
+A `MATCH_RECOGNIZE` clause enables the following tasks:
+* Logically partition and order the data that is used with `PARTITION BY` and `ORDER BY` clauses.
+* Define patterns of rows to seek using the `PATTERN` clause. These patterns use a syntax similar to that of regular expressions.
+* Specify logical conditions required to map a row to a row pattern variable in the `DEFINE` clause.
+* Define measures, which are expressions usable in other parts of the SQL query, in the `MEASURES` clause.
+
+The following example illustrates the syntax for basic pattern recognition:
+
+{% highlight sql %}
+SELECT T.aid, T.bid, T.cid
+FROM MyTable
+MATCH_RECOGNIZE (
+  PARTITION BY userid
+  ORDER BY proctime
+  MEASURES
+    A.id AS aid,
+    B.id AS bid,
+    C.id AS cid
+  PATTERN (A B C)
+  DEFINE
+    A AS name = 'a',
+    B AS name = 'b',
+    C AS name = 'c'
+) AS T
+{% endhighlight %}
+
+This page will explain each keyword in more detail and will illustrate more complex examples.
+
+Attention The `MATCH_RECOGNIZE` clause implementation in Flink is just a subset of the bigger pattern recognition standard. Only the features that are documented in the following sections are supported so far. Since the development is still in an early phase, please also take a look at the [known limitations](#known-limitations).
+
+* This will be replaced by the TOC
+{:toc}
+
+Introduction and Examples
+-------------------------
+
+### Installation Guide
+
+The pattern recognition feature uses the Apache Flink's CEP library internally. In order to be able to use the `MATCH_RECOGNIZE` clause,
+the library needs to be added as a dependency to your Maven project.
+
+{% highlight xml %}
+<dependency>
+  <groupId>org.apache.flink</groupId>
+  <artifactId>flink-cep{{ site.scala_version_suffix }}</artifactId>
+  <version>{{ site.version }}</version>
+</dependency>
+{% endhighlight %}
+
+Alternatively, you can also add the dependency to the cluster classpath (see the [dependency section]({{ site.baseurl}}/projectsetup/dependencies.html) for more information).
+
+If you want to use the `MATCH_RECOGNIZE` clause in the [SQL Client]({{ site.baseurl}}/dev/table/sqlClient.html),
+you don't have to do anything as all the dependencies are included by default.
+
+### SQL Semantics
+
+Every `MATCH_RECOGNIZE` query consists of the following clauses:
+
+* [PARTITION BY](#partitioning) - defines the logical partitioning of the table; similar to a `GROUP BY` operation.
+* [ORDER BY](#order-of-events) - specifies how the incoming rows should be ordered; this is essential as patterns depend on an order.
+* [MEASURES](#define--measures) - defines output of the clause; similar to a `SELECT` clause.
+* [ONE ROW PER MATCH](#output-mode) - output mode which defines how many rows per match should be produced.
+* [AFTER MATCH SKIP](#after-match-strategy) - allows to specify where the next match should start; this is also a way to control how many distinct matches a single event can belong to.
+* [PATTERN](#defining-pattern) - allows constructing patterns that will be searched for using a _regular expression_-like syntax.
+* [DEFINE](#define--measures) - this section defines conditions on rows that should be met in order to be qualified to the corresponding pattern variable.
+
+Attention Currently, the `MATCH_RECOGNIZE` clause can only be applied to an [append table](dynamic_tables.html#update-and-append-queries). Furthermore, it always produces
+an append table as well.
+
+### Examples
+
+For our examples, we assume that a
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688369#comment-16688369 ]

ASF GitHub Bot commented on FLINK-10625:

alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233918263

## File path: docs/dev/table/streaming/match_recognize.md
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688356#comment-16688356 ]

ASF GitHub Bot commented on FLINK-10625:

alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233905360

## File path: docs/dev/table/streaming/match_recognize.md
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688361#comment-16688361 ]

ASF GitHub Bot commented on FLINK-10625:

alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233908441

## File path: docs/dev/table/streaming/match_recognize.md
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688360#comment-16688360 ]

ASF GitHub Bot commented on FLINK-10625:

alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233921745

## File path: docs/dev/table/streaming/match_recognize.md
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688353#comment-16688353 ]

ASF GitHub Bot commented on FLINK-10625:

alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233900892

## File path: docs/dev/table/streaming/match_recognize.md
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688344#comment-16688344 ]

ASF GitHub Bot commented on FLINK-10625:

alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233885804

## File path: docs/dev/table/streaming/match_recognize.md
##
@@ -0,0 +1,843 @@
+---
+title: 'Detecting Patterns in Tables'
+nav-parent_id: streaming_tableapi
+nav-title: 'Detecting Patterns'
+nav-pos: 5
+is_beta: true
+---
+
+
+It is a common use-case to search for a set of event patterns, especially in case of data streams. Flink
+comes with a [complex event processing (CEP) library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. Furthermore, Flink's
+SQL API provides a relational way of expressing queries with a large set of built-in functions and rule-based optimizations that can be used out of the box.
+
+In December 2016, the International Organization for Standardization (ISO) released a new version of the SQL standard which includes _Row Pattern Recognition in SQL_ ([ISO/IEC TR 19075-5:2016](https://standards.iso.org/ittf/PubliclyAvailableStandards/c065143_ISO_IEC_TR_19075-5_2016.zip)). It allows Flink to consolidate CEP and SQL API using the `MATCH_RECOGNIZE` clause for complex event processing in SQL.
+
+A `MATCH_RECOGNIZE` clause enables the following tasks:
+* Logically partition and order the data that is used with `PARTITION BY` and `ORDER BY` clauses.

Review comment:

```suggestion
* Logically partition and order the data that is used with the `PARTITION BY` and `ORDER BY` clauses.
```

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Add MATCH_RECOGNIZE documentation
> ---------------------------------
>
>                 Key: FLINK-10625
>                 URL: https://issues.apache.org/jira/browse/FLINK-10625
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Documentation, Table API & SQL
>    Affects Versions: 1.7.0
>            Reporter: Till Rohrmann
>            Assignee: Dawid Wysakowicz
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.7.0
>
>
> The newly added {{MATCH_RECOGNIZE}} functionality needs to be documented.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
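
The bullet under review says that the data is logically partitioned and ordered before the pattern is applied. The following is a minimal Python sketch of that behaviour for the basic `PATTERN (A B C)` query from the quoted document. It assumes in-memory rows with the hypothetical fields `userid`, `ts` (standing in for `proctime`), `id` and `name`, and it assumes that consecutive pattern variables must match consecutive rows and that matching resumes after the last row of a match. The function name `match_abc` is made up for illustration; this is not how Flink evaluates the clause, which, per the quoted document, uses the CEP library internally.

```python
from collections import defaultdict
from operator import itemgetter


def match_abc(rows):
    """Illustrative stand-in for PATTERN (A B C) with one output row per match.

    rows: iterable of dicts with the hypothetical keys userid, ts, id, name.
    Returns a list of (aid, bid, cid) tuples, mirroring the MEASURES columns.
    """
    # PARTITION BY userid: every user is matched independently.
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["userid"]].append(row)

    results = []
    for part in partitions.values():
        # ORDER BY ts: the pattern is only meaningful on ordered rows.
        part.sort(key=itemgetter("ts"))
        i = 0
        while i + 3 <= len(part):
            a, b, c = part[i:i + 3]
            # DEFINE A AS name = 'a', B AS name = 'b', C AS name = 'c'
            if (a["name"], b["name"], c["name"]) == ("a", "b", "c"):
                # MEASURES A.id AS aid, B.id AS bid, C.id AS cid
                results.append((a["id"], b["id"], c["id"]))
                i += 3  # assumed: continue after the last matched row
            else:
                i += 1
    return results


if __name__ == "__main__":
    events = [
        {"userid": 1, "ts": 3, "id": 12, "name": "c"},
        {"userid": 1, "ts": 1, "id": 10, "name": "a"},
        {"userid": 2, "ts": 1, "id": 20, "name": "a"},
        {"userid": 1, "ts": 2, "id": 11, "name": "b"},
        {"userid": 2, "ts": 2, "id": 21, "name": "c"},
    ]
    print(match_abc(events))  # prints [(10, 11, 12)]
```

The rows of user 1 arrive out of order and interleaved with user 2, yet only user 1 yields a match: partitioning keeps the two users apart, and ordering puts the `a`, `b`, `c` rows next to each other.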
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688348#comment-16688348 ]

ASF GitHub Bot commented on FLINK-10625:

alpinegizmo commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233896842

## File path: docs/dev/table/streaming/match_recognize.md
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16688178#comment-16688178 ]

ASF GitHub Bot commented on FLINK-10625:

twalthr commented on issue #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#issuecomment-439070596

Thanks for the update @dawidwys. I did a full pass through the document again and corrected a lot of minor things and some bugs. I propose to cherry-pick my changes on top of this PR so that other contributors have a chance to do a final pass. You can find my changes here: https://github.com/twalthr/flink/tree/FLINK-10625

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Add MATCH_RECOGNIZE documentation
> ---------------------------------
>
> Key: FLINK-10625
> URL: https://issues.apache.org/jira/browse/FLINK-10625
> Project: Flink
> Issue Type: Sub-task
> Components: Documentation, Table API SQL
> Affects Versions: 1.7.0
> Reporter: Till Rohrmann
> Assignee: Dawid Wysakowicz
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.7.0
>
> The newly added {{MATCH_RECOGNIZE}} functionality needs to be documented.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685295#comment-16685295 ]

ASF GitHub Bot commented on FLINK-10625:

twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r233059805

## File path: docs/dev/table/streaming/match_recognize.md ##

@@ -0,0 +1,654 @@
+---
+title: 'Detecting event patterns Experimental'
+nav-parent_id: streaming_tableapi
+nav-title: 'Detecting event patterns'
+nav-pos: 5
+---
+
+
+It is a common use case to search for a set of event patterns, especially in case of data streams. Apache Flink
+comes with a [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand, Flink's
+Table API & SQL provides a relational way to express queries that comes with multiple functions and
+optimizations that can be used out of the box. In December 2016, ISO released a new version of the
+international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing,
+which makes it possible to consolidate those two APIs using the MATCH_RECOGNIZE clause.
+
+* This will be replaced by the TOC
+{:toc}
+
+Example query
+-------------
+
+Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks:
+* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses.
+* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause.
+  These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define.
+* Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause.
+* Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause.
+
+For example, to find periods of constantly decreasing price of a Ticker, one could write a query like this:
+
+{% highlight sql %}
+SELECT *
+FROM Ticker
+MATCH_RECOGNIZE (
+    PARTITION BY symbol
+    ORDER BY rowtime
+    MEASURES
+       STRT_ROW.rowtime AS start_tstamp,
+       LAST(PRICE_DOWN.rowtime) AS bottom_tstamp,
+       LAST(PRICE_UP.rowtime) AS end_tstamp
+    ONE ROW PER MATCH
+    AFTER MATCH SKIP TO LAST PRICE_UP
+    PATTERN (STRT_ROW PRICE_DOWN+ PRICE_UP+)
+    DEFINE
+       PRICE_DOWN AS PRICE_DOWN.price < LAST(PRICE_DOWN.price, 1) OR
+               (LAST(PRICE_DOWN.price, 1) IS NULL AND PRICE_DOWN.price < STRT_ROW.price),
+       PRICE_UP AS PRICE_UP.price > LAST(PRICE_UP.price, 1) OR LAST(PRICE_UP.price, 1) IS NULL
+) MR;
+{% endhighlight %}
+
+This query, given the following input data:
+
+{% highlight text %}
+SYMBOL  ROWTIME               PRICE
+======  ====================  =====
+'ACME'  '01-Apr-11 10:00:00'  12
+'ACME'  '01-Apr-11 10:00:01'  17
+'ACME'  '01-Apr-11 10:00:02'  19
+'ACME'  '01-Apr-11 10:00:03'  21
+'ACME'  '01-Apr-11 10:00:04'  25
+'ACME'  '01-Apr-11 10:00:05'  12
+'ACME'  '01-Apr-11 10:00:06'  15
+'ACME'  '01-Apr-11 10:00:07'  20
+'ACME'  '01-Apr-11 10:00:08'  24
+'ACME'  '01-Apr-11 10:00:09'  25
+'ACME'  '01-Apr-11 10:00:10'  19
+{% endhighlight %}
+
+will produce a summary row for each found period in which the price was constantly decreasing.
+
+{% highlight text %}
+SYMBOL  START_TST           BOTTOM_TS           END_TSTAM
+======  ==================  ==================  ==================
+ACME    01-APR-11 10:00:04  01-APR-11 10:00:05  01-APR-11 10:00:09
+{% endhighlight %}
+
+The aforementioned query consists of the following clauses:
+
+* [PARTITION BY](#partitioning) - defines the logical partitioning of the stream, similar to `GROUP BY` operations.
+* [ORDER BY](#order-of-events) - specifies how the incoming events should be ordered; this is essential as patterns depend on an order.
+* [MEASURES](#define--measures) - defines the output of the clause, similar to a `SELECT` clause
+* [ONE ROW PER MATCH](#output-mode) - output mode which defines how many rows per match will be produced
+* [AFTER MATCH SKIP](#after-match-skip) - specifies where the next match should start; this is also a way to control how many distinct matches a single event can belong to
+* [PATTERN](#defining-pattern) - clause that allows constructing patterns that will be searched for, pro
+* [DEFINE](#define--measures) - defines the conditions that an event must meet in order to be mapped to the corresponding pattern variable
+
+
+Installation guide
+------------------
+
+Match recognize uses Apache Flink's CEP library internally. In order to be able to use this clause one has to add
+this library as a dependency, either by adding it to your uber-jar by adding a dependency on:
+
+{% highlight
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685293#comment-16685293 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233059298 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685287#comment-16685287 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233057636 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685286#comment-16685286 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233057636 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685241#comment-16685241 ] ASF GitHub Bot commented on FLINK-10625: dawidwys commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233047753 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685253#comment-16685253 ] ASF GitHub Bot commented on FLINK-10625: dawidwys commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233049601 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685248#comment-16685248 ] ASF GitHub Bot commented on FLINK-10625: dawidwys commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233048748 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: +* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses. +* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. + These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define. +* Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause. 
+ +For example to find periods of constantly decreasing price of a Ticker one could write a query like this: + +{% highlight sql %} +SELECT * +FROM Ticker +MATCH_RECOGNIZE ( +PARTITION BY symbol +ORDER BY rowtime +MEASURES + STRT_ROW.rowtime AS start_tstamp, + LAST(PRICE_DOWN.rowtime) AS bottom_tstamp, + LAST(PRICE_UP.rowtime) AS end_tstamp +ONE ROW PER MATCH +AFTER MATCH SKIP TO LAST UP +PATTERN (STRT_ROW PRICE_DOWN+ PRICE_UP+) +DEFINE + PRICE_DOWN AS PRICE_DOWN.price < LAST(PRICE_DOWN.price, 1) OR + (LAST(PRICE_DOWN.price, 1) IS NULL AND PRICE_DOWN.price < STRT_ROW.price)) + PRICE_UP AS PRICE_UP.price > LAST(PRICE_UP.price, 1) OR LAST(PRICE_UP.price, 1) IS NULL +) MR; +{% endhighlight %} + +This query given following input data: + +{% highlight text %} +SYMBOL ROWTIME PRICE +== === +'ACME' '01-Apr-11 10:00:00' 12 +'ACME' '01-Apr-11 10:00:01' 17 +'ACME' '01-Apr-11 10:00:02' 19 +'ACME' '01-Apr-11 10:00:03' 21 +'ACME' '01-Apr-11 10:00:04' 25 +'ACME' '01-Apr-11 10:00:05' 12 +'ACME' '01-Apr-11 10:00:06' 15 +'ACME' '01-Apr-11 10:00:07' 20 +'ACME' '01-Apr-11 10:00:08' 24 +'ACME' '01-Apr-11 10:00:09' 25 +'ACME' '01-Apr-11 10:00:10' 19 +{% endhighlight %} + +will produce a summary row for each found period in which the price was constantly decreasing. + +{% highlight text %} +SYMBOL START_TST BOTTOM_TS END_TSTAM += == == == +ACME 01-APR-11 10:00:04 01-APR-11 10:00:05 01-APR-11 10:00:09 +{% endhighlight %} + +The aforementioned query consists of following clauses: + +* [PARTITION BY](#partitioning) - defines logical partitioning of the stream, similar to `GROUP BY` operations. +* [ORDER BY](#order-of-events) - specifies how should the incoming events be order, this is essential as patterns define order. 
+* [MEASURES](#define--measures) - defines output of the clause, similar to `SELECT` clause +* [ONE ROW PER MATCH](#output-mode) - output mode which defines how many rows per match will be produced +* [AFTER MATCH SKIP](#after-match-skip) - allows to specify where next match should start, this is also a way to control to how many distinct matches a single event can belong +* [PATTERN](#defining-pattern) - clause that allows constructing patterns that will be searched for, pro +* [DEFINE](#define--measures) - this section defines conditions on events that should be met in order to be qualified to corresponding pattern variable + + +Installation guide +-- + +Match recognize uses Apache Flink's CEP library internally. In order to be able to use this clause one has to add +this library as dependency. Either by adding it to your uber-jar by adding dependency on: + +{%
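As a sanity check on the example quoted above, here is a minimal Python sketch (not Flink code) that replays the sample ticker rows through the pattern `STRT_ROW PRICE_DOWN+ PRICE_UP+`. It assumes that `SKIP TO LAST UP` is meant to refer to `PRICE_UP`, that the first `PRICE_UP` row is accepted unconditionally (as the quoted `DEFINE` reads), and that greedy matching without backtracking is good enough for this data, which is a simplification of real row pattern matching. The names `TICKER`, `match_at` and `find_matches` are made up for illustration.

```python
from typing import List, Optional, Tuple

Row = Tuple[str, int]  # (rowtime, price) for the single symbol 'ACME'

# Sample input rows from the quoted example, already ordered by rowtime.
TICKER: List[Row] = [
    ("10:00:00", 12), ("10:00:01", 17), ("10:00:02", 19), ("10:00:03", 21),
    ("10:00:04", 25), ("10:00:05", 12), ("10:00:06", 15), ("10:00:07", 20),
    ("10:00:08", 24), ("10:00:09", 25), ("10:00:10", 19),
]


def match_at(rows: List[Row], start: int) -> Optional[Tuple[int, int, int]]:
    """Try to match STRT_ROW PRICE_DOWN+ PRICE_UP+ with STRT_ROW at `start`.

    Returns the indices (start, bottom, end) or None if there is no match.
    Greedy and without backtracking: an assumption made for this sketch.
    """
    i = start + 1
    # PRICE_DOWN+: every row is strictly cheaper than the row before it.
    while i < len(rows) and rows[i][1] < rows[i - 1][1]:
        i += 1
    bottom = i - 1
    if bottom == start or i >= len(rows):
        return None  # no PRICE_DOWN row, or no row left for PRICE_UP
    # First PRICE_UP row: unconditional (LAST(PRICE_UP.price, 1) IS NULL).
    end = i
    i += 1
    # Further PRICE_UP rows must be strictly more expensive than the previous one.
    while i < len(rows) and rows[i][1] > rows[i - 1][1]:
        end = i
        i += 1
    return start, bottom, end


def find_matches(rows: List[Row]) -> List[Tuple[str, str, str]]:
    """Return (start_tstamp, bottom_tstamp, end_tstamp) for every match."""
    matches = []
    start = 0
    while start < len(rows):
        found = match_at(rows, start)
        if found is None:
            start += 1
        else:
            matches.append((rows[found[0]][0], rows[found[1]][0], rows[found[2]][0]))
            start = found[2]  # resume at the last PRICE_UP row of the match
    return matches


print(find_matches(TICKER))
```

On the sample data this yields a single match that starts at 10:00:04, bottoms out at 10:00:05 and ends at 10:00:09, which agrees with the result row quoted in the diff.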
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685246#comment-16685246 ] ASF GitHub Bot commented on FLINK-10625: dawidwys commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r233048167 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: +* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses. +* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. + These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define. +* Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause. 
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683626#comment-16683626 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232599246 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: Review comment: Move this explanation into an `Overview` section before the example and add a clause skeleton with all the keywords. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org > Add MATCH_RECOGNIZE documentation > - > > Key: FLINK-10625 > URL: https://issues.apache.org/jira/browse/FLINK-10625 > Project: Flink > Issue Type: Sub-task > Components: Documentation, Table API SQL >Affects Versions: 1.7.0 >Reporter: Till Rohrmann >Assignee: Dawid Wysakowicz >Priority: Major > Labels: pull-request-available > Fix For: 1.7.0 > > > The newly added {{MATCH_RECOGNIZE}} functionality needs to be documented. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683628#comment-16683628 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232604689 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: +* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses. +* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. + These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define. +* Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause. 
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683611#comment-16683611 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232597810 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: Review comment: Use backticks throughout the document to mark clauses. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683616#comment-16683616 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232608080 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: +* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses. +* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. + These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define. +* Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause. 
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683614#comment-16683614 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232598253 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: +* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses. +* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. Review comment: remove "of/in the MATCH_RECOGNIZE clause" for all list items This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683621#comment-16683621 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232595246 ## File path: docs/dev/table/sql.md ## @@ -756,6 +796,51 @@ Group windows are defined in the `GROUP BY` clause of a SQL query. Just like que +### Match_recognize + + + + + + Operation + Description + + + + + +MATCH_RECOGNIZE +Streaming + + +Search for given event pattern in an incoming stream. For more though description see Detecting event patterns Review comment: "Searches for a given pattern in a streaming table according to the `MATCH_RECOGNIZE` standard [link]." "For a more detailed description see..." This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment.
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683634#comment-16683634 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232619371 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: +* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses. +* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. + These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define. +* Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause. 
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683623#comment-16683623 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232595474 ## File path: docs/dev/table/sql.md ## @@ -756,6 +796,51 @@ Group windows are defined in the `GROUP BY` clause of a SQL query. Just like que +### Match_recognize + + + + + + Operation + Description + + + + + +MATCH_RECOGNIZE +Streaming + + +Search for given event pattern in an incoming stream. For more though description see Detecting event patterns + +{% highlight sql %} +SELECT T.aid, T.bid, T.cid +FROM MyTable +MATCH_RECOGNIZE ( + ORDER BY proctime + MEASURES +A.id AS aid, +B.id AS bid, +C.id AS cid + PATTERN (A B C) + DEFINE +A AS name = 'a', +B AS name = 'b', +C AS name = 'c' +) AS T +{% endhighlight %} + + + + + + + +{% top %} + Review comment: nit: triple empty line This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment.
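For readers unfamiliar with the clause, the quoted `PATTERN (A B C)` example can be read as "three consecutive rows named a, b and c". The following is a rough Python sketch of that reading, not Flink code. It assumes rows arrive already ordered by `proctime`, that there is no partitioning (the quoted query has no `PARTITION BY`), and that matching resumes after the last row of a match (the `SKIP PAST LAST ROW` option from the grammar). `ROWS` and `match_abc` are hypothetical names.

```python
from typing import List, Tuple

# (id, name) pairs, assumed to be already ordered by proctime.
ROWS: List[Tuple[int, str]] = [
    (1, "a"), (2, "b"), (3, "c"), (4, "x"), (5, "a"), (6, "b"), (7, "c"),
]


def match_abc(rows: List[Tuple[int, str]]) -> List[Tuple[int, int, int]]:
    """Return (aid, bid, cid) for each run of three consecutive rows a, b, c."""
    results = []
    i = 0
    while i + 2 < len(rows):  # rows i, i+1 and i+2 must all exist
        a, b, c = rows[i], rows[i + 1], rows[i + 2]
        if (a[1], b[1], c[1]) == ("a", "b", "c"):
            results.append((a[0], b[0], c[0]))  # MEASURES A.id, B.id, C.id
            i += 3  # assumed: continue after the last row of the match
        else:
            i += 1
    return results


print(match_abc(ROWS))
```

Because the pattern variables are strictly contiguous, the row named `x` simply prevents a match from starting at ids 2 to 4; it is not skipped over inside a match.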
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683625#comment-16683625 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232596586 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' Review comment: Should we just call it `Detecting Patterns`? Because there are no events in the table world but only rows that are appended to a table.
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683631#comment-16683631 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232601307 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: +* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses. +* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. + These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define. +* Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause. 
+ +For example to find periods of constantly decreasing price of a Ticker one could write a query like this: + +{% highlight sql %} +SELECT * +FROM Ticker +MATCH_RECOGNIZE ( +PARTITION BY symbol +ORDER BY rowtime +MEASURES + STRT_ROW.rowtime AS start_tstamp, + LAST(PRICE_DOWN.rowtime) AS bottom_tstamp, + LAST(PRICE_UP.rowtime) AS end_tstamp +ONE ROW PER MATCH +AFTER MATCH SKIP TO LAST UP +PATTERN (STRT_ROW PRICE_DOWN+ PRICE_UP+) +DEFINE + PRICE_DOWN AS PRICE_DOWN.price < LAST(PRICE_DOWN.price, 1) OR + (LAST(PRICE_DOWN.price, 1) IS NULL AND PRICE_DOWN.price < STRT_ROW.price)) + PRICE_UP AS PRICE_UP.price > LAST(PRICE_UP.price, 1) OR LAST(PRICE_UP.price, 1) IS NULL +) MR; +{% endhighlight %} + +This query given following input data: + +{% highlight text %} +SYMBOL ROWTIME PRICE +== === +'ACME' '01-Apr-11 10:00:00' 12 +'ACME' '01-Apr-11 10:00:01' 17 +'ACME' '01-Apr-11 10:00:02' 19 +'ACME' '01-Apr-11 10:00:03' 21 +'ACME' '01-Apr-11 10:00:04' 25 +'ACME' '01-Apr-11 10:00:05' 12 +'ACME' '01-Apr-11 10:00:06' 15 +'ACME' '01-Apr-11 10:00:07' 20 +'ACME' '01-Apr-11 10:00:08' 24 +'ACME' '01-Apr-11 10:00:09' 25 +'ACME' '01-Apr-11 10:00:10' 19 +{% endhighlight %} + +will produce a summary row for each found period in which the price was constantly decreasing. + +{% highlight text %} +SYMBOL START_TST BOTTOM_TS END_TSTAM += == == == +ACME 01-APR-11 10:00:04 01-APR-11 10:00:05 01-APR-11 10:00:09 +{% endhighlight %} + +The aforementioned query consists of following clauses: + +* [PARTITION BY](#partitioning) - defines logical partitioning of the stream, similar to `GROUP BY` operations. +* [ORDER BY](#order-of-events) - specifies how should the incoming events be order, this is essential as patterns define order. 
+* [MEASURES](#define--measures) - defines output of the clause, similar to `SELECT` clause +* [ONE ROW PER MATCH](#output-mode) - output mode which defines how many rows per match will be produced +* [AFTER MATCH SKIP](#after-match-skip) - allows to specify where next match should start, this is also a way to control to how many distinct matches a single event can belong +* [PATTERN](#defining-pattern) - clause that allows constructing patterns that will be searched for, pro +* [DEFINE](#define--measures) - this section defines conditions on events that should be met in order to be qualified to corresponding pattern variable + + +Installation guide +-- + +Match recognize uses Apache Flink's CEP library internally. In order to be able to use this clause one has to add +this library as dependency. Either by adding it to your uber-jar by adding dependency on: + +{% highlight
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683624#comment-16683624 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232597283 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and Review comment: only SQL not Table API
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683627#comment-16683627 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232604261 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: +* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses. +* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. + These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define. +* Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause. 
+ +For example to find periods of constantly decreasing price of a Ticker one could write a query like this: + +{% highlight sql %} +SELECT * +FROM Ticker +MATCH_RECOGNIZE ( +PARTITION BY symbol +ORDER BY rowtime +MEASURES + STRT_ROW.rowtime AS start_tstamp, + LAST(PRICE_DOWN.rowtime) AS bottom_tstamp, + LAST(PRICE_UP.rowtime) AS end_tstamp +ONE ROW PER MATCH +AFTER MATCH SKIP TO LAST UP +PATTERN (STRT_ROW PRICE_DOWN+ PRICE_UP+) +DEFINE + PRICE_DOWN AS PRICE_DOWN.price < LAST(PRICE_DOWN.price, 1) OR + (LAST(PRICE_DOWN.price, 1) IS NULL AND PRICE_DOWN.price < STRT_ROW.price)) + PRICE_UP AS PRICE_UP.price > LAST(PRICE_UP.price, 1) OR LAST(PRICE_UP.price, 1) IS NULL +) MR; +{% endhighlight %} + +This query given following input data: + +{% highlight text %} +SYMBOL ROWTIME PRICE +== === +'ACME' '01-Apr-11 10:00:00' 12 +'ACME' '01-Apr-11 10:00:01' 17 +'ACME' '01-Apr-11 10:00:02' 19 +'ACME' '01-Apr-11 10:00:03' 21 +'ACME' '01-Apr-11 10:00:04' 25 +'ACME' '01-Apr-11 10:00:05' 12 +'ACME' '01-Apr-11 10:00:06' 15 +'ACME' '01-Apr-11 10:00:07' 20 +'ACME' '01-Apr-11 10:00:08' 24 +'ACME' '01-Apr-11 10:00:09' 25 +'ACME' '01-Apr-11 10:00:10' 19 +{% endhighlight %} + +will produce a summary row for each found period in which the price was constantly decreasing. + +{% highlight text %} +SYMBOL START_TST BOTTOM_TS END_TSTAM += == == == +ACME 01-APR-11 10:00:04 01-APR-11 10:00:05 01-APR-11 10:00:09 +{% endhighlight %} + +The aforementioned query consists of following clauses: + +* [PARTITION BY](#partitioning) - defines logical partitioning of the stream, similar to `GROUP BY` operations. +* [ORDER BY](#order-of-events) - specifies how should the incoming events be order, this is essential as patterns define order. 
+* [MEASURES](#define--measures) - defines output of the clause, similar to `SELECT` clause +* [ONE ROW PER MATCH](#output-mode) - output mode which defines how many rows per match will be produced +* [AFTER MATCH SKIP](#after-match-skip) - allows to specify where next match should start, this is also a way to control to how many distinct matches a single event can belong +* [PATTERN](#defining-pattern) - clause that allows constructing patterns that will be searched for, pro Review comment: `pro`?
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683619#comment-16683619 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232603954 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: Review comment: Also mention that match recognize can only be applied on an append table with time attributes and produces an append table.
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683633#comment-16683633 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232623515 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: +* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses. +* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. + These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define. +* Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause. 
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683636#comment-16683636 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232617404 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: +* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses. +* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. + These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define. +* Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause. 
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683635#comment-16683635 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232621932 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: +* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses. +* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. + These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define. +* Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause. 
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683630#comment-16683630 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232601742 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: +* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses. +* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. + These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define. +* Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause. 
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683617#comment-16683617 ]

ASF GitHub Bot commented on FLINK-10625:

twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r232593513

## File path: docs/dev/table/sql.md ##
@@ -756,6 +796,51 @@ Group windows are defined in the `GROUP BY` clause of a SQL query. Just like que
+### Match_recognize

Review comment:
  `Match_recognize` -> `Pattern Recognition`

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Add MATCH_RECOGNIZE documentation
> ---------------------------------
>
>                 Key: FLINK-10625
>                 URL: https://issues.apache.org/jira/browse/FLINK-10625
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Documentation, Table API & SQL
>    Affects Versions: 1.7.0
>            Reporter: Till Rohrmann
>            Assignee: Dawid Wysakowicz
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.7.0
>
> The newly added {{MATCH_RECOGNIZE}} functionality needs to be documented.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683632#comment-16683632 ]

ASF GitHub Bot commented on FLINK-10625:

twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r232618583

## File path: docs/dev/table/streaming/match_recognize.md ##
@@ -0,0 +1,654 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683622#comment-16683622 ]

ASF GitHub Bot commented on FLINK-10625:

twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r232607491

## File path: docs/dev/table/streaming/match_recognize.md ##
@@ -0,0 +1,654 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683629#comment-16683629 ]

ASF GitHub Bot commented on FLINK-10625:

twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r232602260

## File path: docs/dev/table/streaming/match_recognize.md ##
@@ -0,0 +1,654 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683615#comment-16683615 ]

ASF GitHub Bot commented on FLINK-10625:

twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r232619065

## File path: docs/dev/table/streaming/match_recognize.md ##
@@ -0,0 +1,654 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683612#comment-16683612 ]

ASF GitHub Bot commented on FLINK-10625:

twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r232595772

## File path: docs/dev/table/streaming/match_recognize.md ##
@@ -0,0 +1,654 @@
+---
+title: 'Detecting event patterns Experimental'

Review comment:
  Use the `isBeta` tag instead. See SQL Client.
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683637#comment-16683637 ]

ASF GitHub Bot commented on FLINK-10625:

twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r232617901

## File path: docs/dev/table/streaming/match_recognize.md ##
@@ -0,0 +1,654 @@
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683618#comment-16683618 ]

ASF GitHub Bot commented on FLINK-10625:

twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070#discussion_r232605051

## File path: docs/dev/table/streaming/match_recognize.md ##
@@ -0,0 +1,654 @@
+The aforementioned query consists of following clauses:
+
+* [PARTITION BY](#partitioning) - defines logical partitioning of the stream, similar to `GROUP BY` operations.

Review comment:
  This should also be part of an `Overview` section, because it nicely summarizes the feature.
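As a rough illustration of the first two clauses in the list under review, the Python sketch below mimics what `PARTITION BY symbol` and `ORDER BY rowtime` do conceptually: rows are split by key, and each partition is sorted by its time attribute before any pattern is evaluated. The function name `partition_and_order` and the second symbol `XYZ` are made up for this illustration; this is not a description of how Flink implements the clauses.

```python
from collections import defaultdict


def partition_and_order(rows):
    """Group (symbol, rowtime, price) tuples by symbol, then sort each group by rowtime."""
    partitions = defaultdict(list)
    for symbol, rowtime, price in rows:
        partitions[symbol].append((rowtime, price))
    # Sorting the (rowtime, price) tuples orders each partition by rowtime first.
    return {symbol: sorted(events) for symbol, events in partitions.items()}


rows = [
    ("ACME", "10:00:01", 17),
    ("XYZ", "10:00:00", 40),   # hypothetical second symbol, not in the quoted data
    ("ACME", "10:00:00", 12),
    ("XYZ", "10:00:01", 38),
]

partitions = partition_and_order(rows)
for symbol, events in sorted(partitions.items()):
    print(symbol, events)
```

Pattern matching would then run independently on each partition, which is why a match never mixes rows of different symbols.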
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16683620#comment-16683620 ] ASF GitHub Bot commented on FLINK-10625: twalthr commented on a change in pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause URL: https://github.com/apache/flink/pull/7070#discussion_r232600187 ## File path: docs/dev/table/streaming/match_recognize.md ## @@ -0,0 +1,654 @@ +--- +title: 'Detecting event patterns Experimental' +nav-parent_id: streaming_tableapi +nav-title: 'Detecting event patterns' +nav-pos: 5 +--- + + +It is a common use-case to search for set event patterns, especially in case of data streams. Apache Flink +comes with [CEP library]({{ site.baseurl }}/dev/libs/cep.html) which allows for pattern detection in event streams. On the other hand Flink's +Table API & SQL provides a relational way to express queries that comes with multiple functions and +optimizations that can be used out of the box. In December 2016, ISO released a new version of the +international SQL standard (ISO/IEC 9075:2016) including the Row Pattern Recognition for complex event processing, +which allowed to consolidate those two APIs using MATCH_RECOGNIZE clause. + +* This will be replaced by the TOC +{:toc} + +Example query +- + +Row Pattern Recognition in SQL is performed using the MATCH_RECOGNIZE clause. MATCH_RECOGNIZE enables you to do the following tasks: +* Logically partition and order the data that is used in the MATCH_RECOGNIZE clause with its PARTITION BY and ORDER BY clauses. +* Define patterns of rows to seek using the PATTERN clause of the MATCH_RECOGNIZE clause. + These patterns use regular expression syntax, a powerful and expressive feature, applied to the pattern variables you define. +* Specify the logical conditions required to map a row to a row pattern variable in the DEFINE clause. +* Define measures, which are expressions usable in other parts of the SQL query, in the MEASURES clause. 
+
+For example to find periods of constantly decreasing price of a Ticker one could write a query like this:
+
+{% highlight sql %}
+SELECT *
+FROM Ticker
+MATCH_RECOGNIZE (
+PARTITION BY symbol
+ORDER BY rowtime
+MEASURES
+  STRT_ROW.rowtime AS start_tstamp,
+  LAST(PRICE_DOWN.rowtime) AS bottom_tstamp,
+  LAST(PRICE_UP.rowtime) AS end_tstamp
+ONE ROW PER MATCH
+AFTER MATCH SKIP TO LAST UP
+PATTERN (STRT_ROW PRICE_DOWN+ PRICE_UP+)
+DEFINE
+  PRICE_DOWN AS PRICE_DOWN.price < LAST(PRICE_DOWN.price, 1) OR
+  (LAST(PRICE_DOWN.price, 1) IS NULL AND PRICE_DOWN.price < STRT_ROW.price))
+  PRICE_UP AS PRICE_UP.price > LAST(PRICE_UP.price, 1) OR LAST(PRICE_UP.price, 1) IS NULL
+) MR;
+{% endhighlight %}
+
+This query given following input data:

Review comment:
Introduce the input data first and explain each column. Esp. that rowtime is a time attribute. Then show the query with explanation. Then the output data with explanation.

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Add MATCH_RECOGNIZE documentation
> -
>
> Key: FLINK-10625
> URL: https://issues.apache.org/jira/browse/FLINK-10625
> Project: Flink
> Issue Type: Sub-task
> Components: Documentation, Table API SQL
> Affects Versions: 1.7.0
> Reporter: Till Rohrmann
> Assignee: Dawid Wysakowicz
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.7.0
>
>
> The newly added {{MATCH_RECOGNIZE}} functionality needs to be documented.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
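
For readers of the review above, the shape that `PATTERN (STRT_ROW PRICE_DOWN+ PRICE_UP+)` looks for can be sketched outside of SQL. The following is a minimal Python illustration and not Flink's implementation: the function name `find_v_shapes` and the sample prices are hypothetical, it handles a single partition that is already ordered by time, it requires every `PRICE_UP` row (including the first) to be strictly higher than the row before it, and it resumes after the last matched row rather than reproducing the `AFTER MATCH SKIP TO LAST` strategy in the quoted query.

```python
from typing import List, Tuple


def find_v_shapes(prices: List[float]) -> List[Tuple[int, int, int]]:
    """Return (start, bottom, end) index triples for a start row,
    one or more strictly falling prices, then one or more strictly
    rising prices. Assumption: matching resumes after the last row
    of each match (skip past last row)."""
    matches = []
    n = len(prices)
    i = 0
    while i < n - 2:  # a match needs at least three rows
        # PRICE_DOWN+: greedily consume strictly decreasing prices
        j = i
        while j + 1 < n and prices[j + 1] < prices[j]:
            j += 1
        if j == i:  # no falling row follows the candidate start row
            i += 1
            continue
        # PRICE_UP+: greedily consume strictly increasing prices
        k = j
        while k + 1 < n and prices[k + 1] > prices[k]:
            k += 1
        if k == j:  # prices fell but never rose: pattern incomplete
            i += 1
            continue
        matches.append((i, j, k))
        i = k + 1  # resume after the last row of the match
    return matches


# Made-up prices for one symbol, ordered by time.
print(find_v_shapes([12, 17, 19, 21, 25, 18, 15, 14, 24, 25, 19]))
# prints [(4, 7, 9)]
```

The three indices correspond to the rows that the `MEASURES` clause reports as `start_tstamp`, `bottom_tstamp` and `end_tstamp`; with input data and a time attribute introduced first, as the review comment requests, the same triple could be read directly off the table.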
[jira] [Commented] (FLINK-10625) Add MATCH_RECOGNIZE documentation
[ https://issues.apache.org/jira/browse/FLINK-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681311#comment-16681311 ]

ASF GitHub Bot commented on FLINK-10625:

dawidwys opened a new pull request #7070: [FLINK-10625] Documentation for MATCH_RECOGNIZE clause
URL: https://github.com/apache/flink/pull/7070

This PR adds documentation for MATCH_RECOGNIZE clause in SQL.