Hi hackers
I converted PostgreSQL RPR regression test queries to Oracle
MATCH_RECOGNIZE syntax and executed them on both systems to perform
cross-validation.
The tests were based on the following PostgreSQL regression test files:
rpr_base.sql
rpr_nfa.sql
PostgreSQL bug: zero-min reluctant quantifier
During cross-validation a PostgreSQL bug was discovered involving
reluctant quantifiers whose minimum repetition is 0.
Example pattern:
PATTERN (A*?)
DEFINE A AS val > 0
Result comparison:
pattern PostgreSQL (cnt) Oracle (cnt)
A*? 1,1,1 0,0,0
A?? 1,1,1 0,0,0
A+? 1,1,1 1,1,1
For reluctant quantifiers with min=0 (such as *? and ??), PostgreSQL
always consumes at least one row, while Oracle allows a zero-length
match. When min>=1 (e.g., A+?), both systems behave the same.
This behavior was consistently observed across the converted tests.
Design difference: unused DEFINE variables
Example:
PATTERN (A+)
DEFINE A AS id > 0, B AS id > 5
PostgreSQL executes the query successfully and ignores the unused
variable B.
Oracle raises:
ORA-62503: illegal variable definition
Oracle limitations observed
Bounded quantifier limit
A{200} -> works
A{201} -> ORA-62518
Oracle appears to limit the upper bound of bounded quantifiers to 200,
while PostgreSQL does not impose this restriction.
Nested nullable quantifiers
Examples:
(A*)*
(A*)+
(((A)*)*)*
(A?|B){1,2}
((A?){2,3}){2,3}
(A?){n,m}
(A? B?){2,3}
Oracle raises:
ORA-62513
when a nullable subpattern is wrapped by an outer quantifier, while
PostgreSQL executes these patterns successfully.
These results come from running the converted PostgreSQL RPR regression
tests on Oracle for comparison.
Best regards,
SugJun