[
https://issues.apache.org/jira/browse/CALCITE-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15893803#comment-15893803
]
Zhiqiang He edited comment on CALCITE-1641 at 3/3/17 6:20 AM:
--------------------------------------------------------------
[~julianhyde] I have added a commit for this request. Please review it.
Items 1, 2, 3, 5, and 6 are already finished.
4: Fix SqlValidatorTest.testMatchRecognizeInternals (operator "FIRST" is being
seen in regular SQL)
I do not understand what you mean. I added some test cases for FIRST in
SqlValidatorTest.
Can you give me more details if this test case is not right?
7: In reference.md, describe measureColumn and condition
I have already added the measure column definition to reference.md. I will write
more detailed information on MATCH_RECOGNIZE after all functions are finished.
8: Run "mvn site" under JDK 1.7 and 9 and fix javadoc errors
I have already run mvn site under JDK 7 but did not find any errors. Can you give
me some details or an example?
> Base functions support for MATCH_RECOGNIZE
> ------------------------------------------
>
> Key: CALCITE-1641
> URL: https://issues.apache.org/jira/browse/CALCITE-1641
> Project: Calcite
> Issue Type: Sub-task
> Components: core
> Affects Versions: 1.11.0
> Reporter: Zhiqiang He
> Assignee: Zhiqiang He
> Labels: features
>
> The MATCH_RECOGNIZE syntax follows the Oracle documentation:
> https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8980
> Only PATTERN and DEFINE are supported in the first step.
> h1. PATTERN: Defining the Row Pattern to Be Matched
> The PATTERN keyword specifies the pattern to be recognized in the ordered
> sequence of rows in a partition. Each variable name in a pattern corresponds
> to a Boolean condition, which is specified later using the DEFINE component
> of the syntax.
> The PATTERN clause is used to specify a regular expression. It is outside the
> scope of this material to explain regular expression concepts and details. If
> you are not familiar with regular expressions, you are encouraged to
> familiarize yourself with the topic using other sources.
> The regular expression in a PATTERN clause is enclosed in parentheses.
> PATTERN may use the following operators:
> * Concatenation
> Concatenation is used to list two or more items in a pattern to be matched in
> that order. Items are concatenated when there is no operator sign between two
> successive items. For example: PATTERN (A B C).
> * Quantifiers
> Quantifiers define the number of iterations accepted for a match. Quantifiers
> are postfix operators with the following choices:
> ** \* — 0 or more iterations
> ** + — 1 or more iterations
> ** ? — 0 or 1 iterations
> ** {n} — n iterations (n > 0)
> ** {n,} — n or more iterations (n >= 0)
> ** {n,m} — between n and m (inclusive) iterations (0 <= n <= m, 0 < m)
> ** {,m} — between 0 and m (inclusive) iterations (m > 0)
> ** reluctant quantifiers — indicated by an additional question mark following
> a quantifier (*?, +?, ??, {n,}?, {n,m}?, {,m}?). See "Reluctant Versus
> Greedy Quantifier" for the difference between reluctant and non-reluctant
> quantifiers.
> The following are examples of using quantifier operators:
> ** A* matches 0 or more iterations of A
> ** A{3,6} matches 3 to 6 iterations of A
> ** A{,4} matches 0 to 4 iterations of A
> * Alternation
> Alternation matches a single regular expression from a list of several
> possible regular expressions. The alternation list is created by placing a
> vertical bar (|) between each regular expression. Alternatives are preferred
> in the order they are specified. As an example, PATTERN (A | B | C) attempts
> to match A first. If A is not matched, it attempts to match B. If B is not
> matched, it attempts to match C.
> * Grouping
> Grouping treats a portion of the regular expression as a single unit,
> enabling you to apply regular expression operators such as quantifiers to
> that group. Grouping is created with parentheses. As an example, PATTERN ((A
> B){3} C) attempts to match the group (A B) three times and then seeks one
> occurrence of C.
> * PERMUTE
> See "How to Express All Permutations" for more information.
> * Exclusion
> Parts of the pattern to be excluded from the output of ALL ROWS PER MATCH are
> enclosed between {- and -}. See "How to Exclude Portions of the Pattern from
> the Output".
> * Anchors
> Anchors work in terms of positions rather than rows. They match a position
> either at the start or end of a partition.
> ** ^ matches the position before the first row in the partition.
> ** $ matches the position after the last row in the partition.
> As an example, PATTERN (^A+$) will match only if all rows in a partition
> satisfy the condition for A. The resulting match spans the entire partition.
> * Empty pattern (), matches an empty set of rows
> This section contains the following topics:
> Reluctant Versus Greedy Quantifier
> Operator Precedence
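As a rough analogy (not Calcite code), the operators listed above behave much like ordinary regular-expression operators once each row is reduced to a single letter naming the one variable whose condition it satisfies. The sketch below uses Python's `re` module under that simplifying assumption; the helper name `partition_matches` is hypothetical. Real MATCH_RECOGNIZE conditions can overlap and can depend on earlier rows, which this encoding ignores.

```python
import re


def partition_matches(pattern: str, rows: str) -> bool:
    """Return True if the regex matches the entire row string.

    Assumption: `rows` is a partition in which every row has already been
    classified as exactly one pattern variable (one letter per row).
    """
    return re.fullmatch(pattern, rows) is not None


# Concatenation: PATTERN (A B C)
print(partition_matches(r"ABC", "ABC"))
# Quantifiers: A{3,6} accepts 3 to 6 rows, A{,4} accepts 0 to 4 rows
print(partition_matches(r"A{3,6}", "AAAA"))
print(partition_matches(r"A{,4}", "AAAAA"))
# Alternation: PATTERN (A | B | C)
print(partition_matches(r"A|B|C", "B"))
# Grouping: PATTERN ((A B){3} C)
print(partition_matches(r"(AB){3}C", "ABABABC"))
# Anchors: PATTERN (^A+$) matches only if every row in the partition is an A
print(re.search(r"^A+$", "AAB") is not None)
```

Only the anchored example uses `re.search`; the other calls use `re.fullmatch`, so they already require the pattern to cover the whole partition.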
> h2. Reluctant Versus Greedy Quantifier
> Pattern quantifiers are referred to as greedy; they will attempt to match as
> many instances of the regular expression on which they are applied as
> possible. The exception is pattern quantifiers that have a question mark ? as
> a suffix, and those are referred to as reluctant. They will attempt to match
> as few instances as possible of the regular expression on which they are
> applied.
> The difference between greedy and reluctant quantifiers appended to a single
> pattern variable is illustrated as follows: A* tries to map as many rows as
> possible to A, whereas A*? tries to map as few rows as possible to A. For
> example:
> {code:sql}
> PATTERN (X Y* Z)
> {code}
> The pattern consists of three variable names, X, Y, and Z, with Y quantified
> with *. This means a pattern match will be recognized and reported when the
> following condition is met by consecutive incoming input rows:
> ** A row satisfies the condition that defines variable X followed by zero or
> more rows that satisfy the condition that defines the variable Y followed by
> a row that satisfies the condition that defines the variable Z.
> During the pattern matching process, after a row was mapped to X and 0 or
> more rows were mapped to Y, if the following row can be mapped to both
> variables Y and Z (which satisfies the defining condition of both Y and Z),
> then, because the quantifier * for Y is greedy, the row is preferentially
> mapped to Y instead of Z. Due to this greedy property, Y gets preference over
> Z and a greater number of rows are mapped to Y. If the pattern expression was
> PATTERN (X Y*? Z), which uses a reluctant quantifier *? over Y, then Z gets
> preference over Y.
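The greedy versus reluctant behavior described above can be reproduced with ordinary regular expressions. This is a sketch under an assumed encoding, not Calcite code: each row is one letter, where `x`, `y`, and `z` satisfy only the condition for X, Y, or Z, and `b` satisfies the conditions for both Y and Z. The helper `rows_mapped` is hypothetical.

```python
import re

# 'b' rows satisfy both Y and Z, so they may be mapped to either variable.
GREEDY = re.compile(r"x([yb]*)([zb])")      # stands in for PATTERN (X Y* Z)
RELUCTANT = re.compile(r"x([yb]*?)([zb])")  # stands in for PATTERN (X Y*? Z)


def rows_mapped(regex, rows):
    """Return the rows mapped to Y and Z, or None if there is no match."""
    m = regex.match(rows)
    if m is None:
        return None
    return {"Y": m.group(1), "Z": m.group(2)}


print(rows_mapped(GREEDY, "xbbz"))     # {'Y': 'bb', 'Z': 'z'}
print(rows_mapped(RELUCTANT, "xbbz"))  # {'Y': '', 'Z': 'b'}
```

With the greedy quantifier both ambiguous rows go to Y; with the reluctant quantifier the first ambiguous row is mapped to Z and the match ends there.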
> h2. Operator Precedence
> The precedence of the elements in a regular expression, in decreasing order,
> is as follows:
> * row_pattern_primary
> These elements include primary pattern variables (pattern variables not
> created with the SUBSET clause described in "SUBSET: Defining Union Row
> Pattern Variables"), anchors, PERMUTE, parenthetic expressions, exclusion
> syntax, and empty pattern
> * Quantifier
> A row_pattern_primary may have zero or one quantifier.
> * Concatenation
> * Alternation
> Precedence of alternation is illustrated by PATTERN(A B | C D), which is
> equivalent to PATTERN ((A B) | (C D)). It is not, however, equivalent to
> PATTERN (A (B | C) D).
> Precedence of quantifiers is illustrated by PATTERN (A B *), which is
> equivalent to PATTERN (A (B*)). It is not, however, equivalent to PATTERN ((A B)*).
> A quantifier may not immediately follow another quantifier. For example,
> PATTERN(A**) is prohibited.
> It is permitted for a primary pattern variable to occur more than once in a
> pattern, for example, PATTERN (X Y X).
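The precedence rules above mirror those of common regex engines, so they can be checked by analogy. The sketch below assumes one letter per row and uses Python's `re` module; `full` and `is_valid` are hypothetical helper names, and the rejection of a doubled quantifier is Python's own behavior, shown only as a parallel to the rule that PATTERN(A**) is prohibited.

```python
import re


def full(pattern: str, rows: str) -> bool:
    """True if the regex matches the whole row string."""
    return re.fullmatch(pattern, rows) is not None


def is_valid(pattern: str) -> bool:
    """True if Python's regex compiler accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Alternation binds loosest: AB|CD behaves like (AB)|(CD), not A(B|C)D.
print(full(r"AB|CD", "CD"), full(r"(AB)|(CD)", "CD"), full(r"A(B|C)D", "CD"))
# A quantifier binds to the nearest primary: AB* behaves like A(B*), not (AB)*.
print(full(r"AB*", "ABBB"), full(r"A(B*)", "ABBB"), full(r"(AB)*", "ABBB"))
# A quantifier may not directly follow another quantifier.
print(is_valid(r"A**"))
```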
> h1. DEFINE: Defining Primary Pattern Variables
> DEFINE is a mandatory clause, used to specify the conditions that define
> primary pattern variables. In the example:
> {code:sql}
> DEFINE UP AS UP.Price > PREV(UP.Price),
> DOWN AS DOWN.Price < PREV(DOWN.Price)
> {code}
> UP is defined by the condition UP.Price > PREV (UP.Price), and DOWN is
> defined by the condition DOWN.Price < PREV (DOWN.Price). (PREV is a row
> pattern navigation operation which evaluates an expression in the previous
> row; see "Row Pattern Navigation Operations" regarding the complete set of
> row pattern navigation operations.)
> A pattern variable does not require a definition; if there is no definition,
> any row can be mapped to the pattern variable.
> A union row pattern variable (see discussion of SUBSET in "SUBSET: Defining
> Union Row Pattern Variables") cannot be defined by DEFINE, but can be
> referenced in the definition of a pattern variable.
> The definition of a pattern variable can reference another pattern variable,
> which is illustrated in Example 20-6.
> Example 20-6 Defining Pattern Variables
> {code:sql}
> SELECT *
> FROM Ticker MATCH_RECOGNIZE (
>   PARTITION BY Symbol
>   ORDER BY tstamp
>   MEASURES FIRST (A.tstamp) AS A_Firstday,
>            LAST (D.tstamp) AS D_Lastday,
>            AVG (B.Price) AS B_Avgprice,
>            AVG (D.Price) AS D_Avgprice
>   PATTERN (A B+ C+ D)
>   SUBSET BC = (B,C)
>   DEFINE A AS Price > 100,
>          B AS B.Price > A.Price,
>          C AS C.Price < AVG (B.Price),
>          D AS D.Price > MAX (BC.Price)
> ) M
> {code}
> In this example:
> The definition of A implicitly references the universal row pattern variable
> (because of the unqualified column reference Price).
> The definition of B references the pattern variable A.
> The definition of C references the pattern variable B.
> The definition of D references the union row pattern variable BC.
> The conditions are evaluated on successive rows of a partition in a trial
> match, with the current row being tentatively mapped to a pattern variable as
> permitted by the pattern. To be successfully mapped, the condition must
> evaluate to true.
> In the previous example:
> {code:sql}
> A AS Price > 100
> {code}
> Price refers to the Price in the current row, because the last row mapped to
> any primary row pattern variable is the current row, which is tentatively
> mapped to A. Alternatively, in this example, using A.Price would have led to
> the same results.
> {code:sql}
> B AS B.Price > A.Price
> {code}
> B.Price refers to the Price in the current row (because B is being defined),
> whereas A.Price refers to the last row mapped to A. In view of the pattern,
> the only row mapped to A is the first row to be mapped.
> {code:sql}
> C AS C.Price < AVG(B.Price)
> {code}
> Here C.Price refers to the Price in the current row, because C is being
> defined. The aggregate AVG (B.Price) is computed as the average of Price
> over all rows that are already mapped to B.
> {code:sql}
> D AS D.Price > MAX(BC.Price)
> {code}
> The pattern variable D is similar to pattern variable C, though it
> illustrates the use of a union row pattern variable in the Boolean condition.
> In this case, MAX(BC.Price) returns the maximum price value of the rows
> matched to variable B or variable C. The semantics of Boolean conditions are
> discussed in more detail in "Expressions in MEASURES and DEFINE".
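To make the evaluation order concrete, here is a minimal sketch of how the four conditions from Example 20-6 could be evaluated row by row for PATTERN (A B+ C+ D). It is a hand-written backtracking matcher over a plain list of prices, not how Calcite or Oracle implements MATCH_RECOGNIZE; the function name `match_from` and the sample prices are assumptions made for illustration.

```python
from statistics import mean

ORDER = ["A", "B", "C", "D"]   # variables in pattern order: A B+ C+ D
REPEATABLE = {"B", "C"}        # variables carrying the + quantifier


def condition(var, price, mapped):
    """Evaluate the DEFINE condition of `var` for the current row's price."""
    if var == "A":
        return price > 100                           # A AS Price > 100
    if var == "B":
        return price > mapped["A"][-1]               # B AS B.Price > A.Price
    if var == "C":
        return price < mean(mapped["B"])             # C AS C.Price < AVG(B.Price)
    return price > max(mapped["B"] + mapped["C"])    # D AS D.Price > MAX(BC.Price)


def match_from(prices, start):
    """Try to match the pattern beginning at row `start`.

    Returns a dict of the prices mapped to each variable, or None.
    """
    def step(i, k, mapped):
        # k is the index in ORDER of the variable the previous row was mapped to.
        if k == len(ORDER) - 1:
            return mapped                  # D has been mapped: match complete
        if i == len(prices):
            return None                    # ran out of rows before reaching D
        candidates = []
        if k >= 0 and ORDER[k] in REPEATABLE:
            candidates.append(k)           # greedy: first try to stay on B or C
        candidates.append(k + 1)           # then try the next variable
        for c in candidates:
            var = ORDER[c]
            if condition(var, prices[i], mapped):
                extended = {v: list(rows) for v, rows in mapped.items()}
                extended[var].append(prices[i])
                result = step(i + 1, c, extended)
                if result is not None:
                    return result
        return None

    return step(start, -1, {v: [] for v in ORDER})


print(match_from([101, 110, 120, 105, 100, 125], 0))
# {'A': [101], 'B': [110, 120, 105], 'C': [100], 'D': [125]}
```

Each condition sees only the rows mapped so far, so the average over B and the maximum over the union of B and C are running aggregates at the moment the current row is tested.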
> h1. MATCH_RECOGNIZE syntax
> {code:sql}
> table_reference ::=
> {only (query_table_expression) | query_table_expression
> }[flashback_query_clause]
> [pivot_clause|unpivot_clause|row_pattern_recognition_clause] [t_alias]
> row_pattern_recognition_clause ::=
> MATCH_RECOGNIZE (
> PATTERN (row_pattern)
> DEFINE row_pattern_definition_list
> )
> row_pattern ::=
> row_pattern_term
> | row_pattern "|" row_pattern_term
> row_pattern_term ::=
> row_pattern_factor
> | row_pattern_term row_pattern_factor
> row_pattern_factor ::=
> row_pattern_primary [row_pattern_quantifier]
> row_pattern_quantifier ::=
> *[?]
> |+[?]
> |?[?]
> |"{"[unsigned_integer ],[unsigned_integer]"}"[?]
> |"{"unsigned_integer "}"
> row_pattern_primary ::=
> variable_name
> |$
> |^
> |([row_pattern])
> |"{-" row_pattern"-}"
> | row_pattern_permute
> row_pattern_permute ::=
> PERMUTE (row_pattern [, row_pattern] ...)
> row_pattern_subset_clause ::=
> SUBSET row_pattern_subset_item [, row_pattern_subset_item] ...
> row_pattern_subset_item ::=
> variable_name = (variable_name[ , variable_name]...)
> row_pattern_definition_list ::=
> row_pattern_definition[, row_pattern_definition]...
> row_pattern_definition ::=
> variable_name AS condition
> {code}
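The row_pattern productions above map naturally onto a recursive-descent parser. The following is a sketch for a subset of the grammar only (variables, parentheses, the empty pattern, alternation, concatenation, and the `*`, `+`, `?` quantifiers with an optional reluctant `?`). It is not Calcite's parser; the tuple-based tree shape and the name `parse_pattern` are assumptions made for illustration.

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[()|*+?])")
QUANTIFIERS = ("*", "+", "?")


def tokenize(text):
    """Split a pattern string into variable names and operator tokens."""
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_pattern(text):
    """Parse a row_pattern into nested tuples; parentheses leave no node."""
    tokens, pos = tokenize(text), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def row_pattern():                     # row_pattern_term ("|" row_pattern_term)*
        terms = [term()]
        while peek() == "|":
            take()
            terms.append(term())
        return terms[0] if len(terms) == 1 else ("alt", *terms)

    def term():                            # one or more row_pattern_factor
        factors = []
        while peek() is not None and peek() not in ("|", ")"):
            factors.append(factor())
        if not factors:
            raise ValueError("expected a pattern element")
        return factors[0] if len(factors) == 1 else ("concat", *factors)

    def factor():                          # row_pattern_primary [quantifier]
        node = primary()
        if peek() in QUANTIFIERS:
            quantifier = take()
            if peek() == "?":
                quantifier += take()       # reluctant form, e.g. *?
            if peek() in QUANTIFIERS:
                raise ValueError("a quantifier may not follow a quantifier")
            node = ("quant", quantifier, node)
        return node

    def primary():                         # variable_name | ([row_pattern])
        token = take()
        if token == "(":
            if peek() == ")":
                take()
                return ("empty",)
            inner = row_pattern()
            if peek() != ")":
                raise ValueError("expected ')'")
            take()
            return inner
        if token in QUANTIFIERS:
            raise ValueError("quantifier without a preceding element")
        return ("var", token)

    tree = row_pattern()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return tree


print(parse_pattern("A B | C D"))
print(parse_pattern("A B *") == parse_pattern("A (B*)"))
```

Because the functions are layered as alternation, then concatenation, then quantifier, then primary, the resulting trees follow the operator precedence stated in the description.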