beliefer opened a new pull request #32477: URL: https://github.com/apache/spark/pull/32477
### What changes were proposed in this pull request?

`ANSI SQL: SIMILAR TO ... ESCAPE` is very useful, and several mainstream databases support the syntax:

**PostgreSQL**: https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-SIMILARTO-REGEXP
**Redshift**: https://docs.aws.amazon.com/redshift/latest/dg/pattern-matching-conditions-similar-to.html
**Sybase**: http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.help.sqlanywhere.12.0.0/dbreference/like-regexp-similarto.html
**Firebird**: http://firebirdsql.org/file/documentation/html/en/refdocs/fblangref25/firebird-25-language-reference.html#fblangref25-commons-predsiimilarto

This util supports the following pattern-matching metacharacters:

Operator | Description
-- | --
% | Matches any sequence of zero or more characters.
_ | Matches any single character.
\| | Denotes alternation (either of two alternatives).
\* | Repeats the previous item zero or more times.
\+ | Repeats the previous item one or more times.
? | Repeats the previous item zero or one time.
{m} | Repeats the previous item exactly m times.
{m,} | Repeats the previous item m or more times.
{m,n} | Repeats the previous item at least m and not more than n times.
() | Parentheses group items into a single logical item.
[...] | A bracket expression specifies a character class, just as in POSIX regular expressions.

**Note**
`SIMILAR TO` is similar to `RLIKE`, but with the following differences:
1. The `SIMILAR TO` operator returns true only if its pattern matches the entire string, unlike `RLIKE`, where the pattern can match any portion of the string.
2. The pattern may use `_` and `%` as wildcard characters denoting any single character and any string, respectively (these are comparable to `.` and `.*` in POSIX regular expressions).
3. The pattern may use an escape character, as with `LIKE`.
4. `.`, `^` and `$` are not metacharacters for `SIMILAR TO`.

### Why are the changes needed?

`ANSI SQL: SIMILAR TO ... ESCAPE` is very useful.

### Does this PR introduce _any_ user-facing change?

Yes, a new feature.

### How was this patch tested?

New tests
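
To make the four differences from `RLIKE` listed in the note concrete, here is a minimal Python sketch that translates a `SIMILAR TO` pattern into a regular expression. It is not the implementation in this PR: the function names `similar_to_regex` and `similar_to` are hypothetical, it uses Python's `re` module rather than Java regex, and it assumes that `%` and `_` are literal inside a bracket expression and that POSIX named classes such as `[[:alpha:]]` are out of scope.

```python
import re
from typing import Optional


def similar_to_regex(pattern: str, escape: Optional[str] = None) -> str:
    """Translate a SIMILAR TO pattern into a Python regex (illustrative only)."""
    out = []
    in_bracket = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if escape is not None and ch == escape:
            # The escape character makes the next character a literal.
            if i + 1 >= len(pattern):
                raise ValueError("escape character at end of pattern")
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if in_bracket:
            if ch == "]":
                in_bracket = False
                out.append(ch)
            elif ch in "\\[":
                # Keep these literal inside the class (assumption: no nested classes).
                out.append("\\" + ch)
            else:
                out.append(ch)
        elif ch == "%":
            out.append(".*")   # any sequence of zero or more characters
        elif ch == "_":
            out.append(".")    # any single character
        elif ch == "[":
            in_bracket = True
            out.append(ch)
        elif ch in "|*+?(){}":
            out.append(ch)     # metacharacters shared with regular expressions
        else:
            # Everything else is literal, including '.', '^' and '$'.
            out.append(re.escape(ch))
        i += 1
    return "".join(out)


def similar_to(value: str, pattern: str, escape: Optional[str] = None) -> bool:
    """Return True only if the pattern matches the entire string."""
    regex = similar_to_regex(pattern, escape)
    # fullmatch enforces whole-string matching; DOTALL lets '%' and '_' span newlines.
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None
```

Under these assumptions, `similar_to("abc", "a")` is false because the pattern must cover the whole string, while `similar_to("abc", "%(b|d)%")` is true. With `escape="#"`, the pattern `50#%` matches only the literal string `50%`.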
