Fix conversion of SIMILAR TO regexes for character classes
The code that translates SIMILAR TO pattern matching expressions to
POSIX-style regular expressions did not consider that square brackets
can be nested. For example, in an expression like [[:alpha:]%_], the
logic replaced the placeholders '_' and '%', although inside a character
class they are ordinary characters and must be left untouched.
This commit fixes the conversion logic by tracking the nesting level of
square brackets marking character class areas, while considering that
in expressions like []] or [^]] the first closing square bracket is a
regular character. Multiple tests are added to show how the conversions
should or should not apply while in a character class area, with
specific cases added for all the characters converted outside character
classes, like an opening parenthesis '(', dollar sign '$', etc.
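
The bracket tracking described above can be illustrated with a small
standalone sketch. This is not the code in regexp.c: the function name
sketch_convert and its signature are hypothetical, escape handling is
omitted, and only '%' and '_' are translated. It assumes, as a
simplification, that any '[' inside a character class opens a nested
level.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper for illustration only (not the regexp.c code).
 * Copies "pattern" to "out", translating '%' to ".*" and '_' to "."
 * only outside square brackets.  Returns 0 on success, -1 if "out"
 * is too small.
 */
int
sketch_convert(const char *pattern, char *out, size_t outlen)
{
    int         depth = 0;      /* nesting level of square brackets */
    int         at_start = 0;   /* nonzero while a ']' is still literal */
    size_t      n = 0;

    if (outlen == 0)
        return -1;

    for (const char *p = pattern; *p != '\0'; p++)
    {
        const char *repl = NULL;    /* replacement text, if any */
        char        c = *p;
        size_t      len;

        if (depth > 0)
        {
            /* Inside a character class: only track the nesting level. */
            if (c == ']' && !at_start)
                depth--;
            else if (c == '[')
                depth++;

            /* Only a caret right after the opening '[' keeps us at start. */
            at_start = (at_start && c == '^' && p[-1] == '[');
        }
        else if (c == '[')
        {
            /* Start of a character class; a following ']' is literal. */
            depth = 1;
            at_start = 1;
        }
        else if (c == '%')
            repl = ".*";
        else if (c == '_')
            repl = ".";

        len = (repl != NULL) ? strlen(repl) : 1;
        if (n + len >= outlen)
            return -1;          /* keep room for the terminator */
        if (repl != NULL)
            memcpy(out + n, repl, len);
        else
            out[n] = c;
        n += len;
    }

    out[n] = '\0';
    return 0;
}
```

With this sketch, "a%b_c" becomes "a.*b.c", while "[[:alpha:]%_]" is
copied unchanged. In "[]%]_" the first ']' is taken literally, so the
class ends at the second ']' and only the trailing '_' is converted,
giving "[]%].".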
Author: Laurenz Albe <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 13
Branch
------
REL_16_STABLE
Details
-------
https://git.postgresql.org/pg/commitdiff/e9e535d611204266a3c2b587afd9ffbe346fc067
Modified Files
--------------
src/backend/utils/adt/regexp.c | 38 +++++++++++++++++----
src/test/regress/expected/strings.out | 62 +++++++++++++++++++++++++++++++++++
src/test/regress/sql/strings.sql | 20 +++++++++++
3 files changed, 114 insertions(+), 6 deletions(-)