Hi Ishii-san,

Thank you for the quick feedback and helpful spec clarifications!

2026년 1월 3일 (토) AM 10:05, Tatsuo Ishii <[email protected]>님이 작성:

> Hi Henson,
>
> Thank you for the detailed design documentation. While learning the
> document, just a quick comment and question.
>
> > Hi hackers,
> >
> > I've been working on an alternative implementation approach for Row
> > Pattern Recognition (RPR) that addresses some limitations in the
> > current regex-based patch.
> >
> > Key differences from the existing approach:
> >
> > 1. Streaming NFA instead of regex engine
> >    - Process rows incrementally without accumulating encoded strings
> >    - Avoids O(V^N) combinatorial explosion in multi-variable scenarios
>
> Sounds good. As you already pointed out, current approach has a memory
> space problem. If Streaming NFA could solve the issue, it would be
> really good.
>
> > 2. No 26-variable limit
> >    - Variables identified by index, not alphabet encoding
>
> Sounds good.
>
> > 3. Proper Lexical Order support
> >    - Respects PATTERN alternative order for ONE ROW PER MATCH
>
> Here do you have "ONE ROW PER MATCH" option of RPR in your mind? In my
> understanding it's in MATCH_RECOGNIZE clause, but not in RPR in Window
> clause. (ALL ROWS PER MATCH is not applicable either in RPR in
> Window clause).
>
>
You're absolutely right - thank you for the correction!

I'm currently exploring what this NFA structure could theoretically
support, rather than strictly binding to the spec yet. Since I'm
still learning from Oracle manuals and your code without access to
the full SQL:2016 standard, I mixed up MATCH_RECOGNIZE syntax with
Window clause RPR - sorry for the confusion!

Once I have a clearer understanding of the spec, I'll properly
define how it integrates with PostgreSQL's grammar.

Your spec knowledge is invaluable as I continue learning. Thank you!


> > 4. GREEDY/RELUCTANT quantifier support
> >    - Longest vs shortest match semantics
>
> Sounds good.
>
> > 5. Incremental MEASURES computation
> >    - Aggregate values computed during matching, no rescan needed
> >
> > The attached document describes the design in detail, including:
> > - Data structures (Pattern, MatchState, MatchContext)
> > - Execution flow and state transitions
> > - Context absorption optimization for greedy patterns
> > - AST optimization passes
> >
> > This is still a work in progress. I'd appreciate any feedback on
> > the approach before proceeding with PostgreSQL integration.
>
> I understand you are still in the design phase and sorry for jumping
> into an implementation question. but If you have an idea, please
> advice:
>
>
Not at all - implementation questions are valuable advice that
help catch what I might miss in the design phase!


> How do you handle '{' and '}' in the PATTERN clause in the raw parser?
> I am not a parser expert but it seems it's not easy to handle '{' and
> '}' in gram.y.
>
>
You're right - it's not straightforward. Both gram.y and scan.l
need to be modified together.

I've attached a rough patch showing one possible approach. However,
since the OR handling has also been modified in SQL/PGQ, there will
likely be conflicts that need to be resolved when integrating.


> Best regards,
> --
> Tatsuo Ishii
> SRA OSS K.K.
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp
>
From 987a3fce6fb7823b42631658079e8ec2d0b432e4 Mon Sep 17 00:00:00 2001
From: Henson Choi <[email protected]>
Date: Sat, 3 Jan 2026 19:51:38 +0900
Subject: [PATCH] parser: Add "|" and "{}"

---
 src/backend/parser/gram.y | 27 ++++++++++++++++++++++++++-
 src/backend/parser/scan.l |  4 ++--
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 99c27dc178c..bd7c2820265 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -900,7 +900,7 @@ static Node *makeRecursiveViewSelect(char *relname, List 
*aliases, Node *query);
 %nonassoc      IDENT PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE 
ROLLUP
                        SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH
                        AFTER INITIAL SEEK PATTERN_P
-%left          Op OPERATOR             /* multi-character ops and user-defined 
operators */
+%left          Op OPERATOR '|' /* multi-character ops and user-defined 
operators */
 %left          '+' '-'
 %left          '*' '/' '%'
 %left          '^'
@@ -16871,6 +16871,7 @@ opt_row_pattern_initial_or_seek:
 row_pattern:
                        row_pattern_term                                        
{ $$ = list_make1($1); }
                        | row_pattern row_pattern_term          { $$ = 
lappend($1, $2); }
+                       | row_pattern '|' row_pattern_term      { $$ = 
lappend($1, $3); /* TODO: mark as alternation */ }
                ;
 
 row_pattern_term:
@@ -16892,6 +16893,27 @@ row_pattern_term:
 
                                        $$ = (Node *) 
makeSimpleA_Expr(AEXPR_OP, "?", (Node *)makeString($1), NULL, @1);
                                }
+                       /* {m} - exactly m times */
+                       | ColId '{' Iconst '}'
+                               {
+                                       $$ = (Node *) 
makeSimpleA_Expr(AEXPR_OP, "{m}", (Node *)makeString($1), (Node 
*)makeInteger($3), @1);
+                               }
+                       /* {m,} - at least m times */
+                       | ColId '{' Iconst ',' '}'
+                               {
+                                       $$ = (Node *) 
makeSimpleA_Expr(AEXPR_OP, "{m,}", (Node *)makeString($1), (Node 
*)makeInteger($3), @1);
+                               }
+                       /* {m,n} - between m and n times */
+                       | ColId '{' Iconst ',' Iconst '}'
+                               {
+                                       List *bounds = 
list_make2(makeInteger($3), makeInteger($5));
+                                       $$ = (Node *) 
makeSimpleA_Expr(AEXPR_OP, "{m,n}", (Node *)makeString($1), (Node *)bounds, @1);
+                               }
+                       /* {,n} - at most n times (0 to n) */
+                       | ColId '{' ',' Iconst '}'
+                               {
+                                       $$ = (Node *) 
makeSimpleA_Expr(AEXPR_OP, "{,n}", (Node *)makeString($1), (Node 
*)makeInteger($4), @1);
+                               }
                ;
 
 row_pattern_definition_list:
@@ -16953,12 +16975,15 @@ MathOp:                '+'                            
                                        { $$ = "+"; }
                        | LESS_EQUALS                                           
        { $$ = "<="; }
                        | GREATER_EQUALS                                        
        { $$ = ">="; }
                        | NOT_EQUALS                                            
        { $$ = "<>"; }
+                       | '|'                                                   
                { $$ = "|"; }
                ;
 
 qual_Op:       Op
                                        { $$ = list_make1(makeString($1)); }
                        | OPERATOR '(' any_operator ')'
                                        { $$ = $3; }
+                       | '|'
+                                       { $$ = list_make1(makeString("|")); }
                ;
 
 qual_all_Op:
diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l
index a67815339b7..c7b984df34c 100644
--- a/src/backend/parser/scan.l
+++ b/src/backend/parser/scan.l
@@ -363,7 +363,7 @@ not_equals          "!="
  * If you change either set, adjust the character lists appearing in the
  * rule for "operator"!
  */
-self                   [,()\[\].;\:\+\-\*\/\%\^\<\>\=]
+self                   [,()\[\].;\:\+\-\*\/\%\^\<\>\=\|\{\}]
 op_chars               [\~\!\@\#\^\&\|\`\?\+\-\*\/\%\<\>\=]
 operator               {op_chars}+
 
@@ -955,7 +955,7 @@ other                       .
                                                 * that the "self" rule would 
have.
                                                 */
                                                if (nchars == 1 &&
-                                                       
strchr(",()[].;:+-*/%^<>=", yytext[0]))
+                                                       
strchr(",()[].;:+-*/%^<>=|{}", yytext[0]))
                                                        return yytext[0];
                                                /*
                                                 * Likewise, if what we have 
left is two chars, and
-- 
2.50.1 (Apple Git-155)

Reply via email to