Le 30/07/2021 à 23:38, Tom Lane a écrit :
> Gilles Darold <[email protected]> writes:
>> Le 26/07/2021 à 21:56, Tom Lane a écrit :
>>> I'm inclined to just drop the regexp_replace additions. I don't think
>>> that the extra parameters Oracle provides here are especially useful.
>>> They're definitely not useful enough to justify creating compatibility
>>> hazards for.
>> I would not say that being able to replace the Nth occurrence of a
>> pattern matching is not useful but i agree that this is not a common
>> case with replacement. Both Oracle [1] and IBM DB2 [2] propose this form
>> and I have though that we can not have compatibility issues because of
>> the different data type at the 4th parameter.
> Well, here's an example of the potential issues:
>
> [...]
Thanks for pointing me this case, I did not think that the prepared
statement could lead to this confusion.
>> Anyway, maybe we can just
>> rename the function even if I would prefer that regexp_replace() be
>> extended. For example:
>> regexp_replace(source, pattern, replacement [, flags ]);
>> regexp_substitute(source, pattern, replacement [, position ] [,
>> occurrence ] [, flags ]);
> Hmm. Of course the entire selling point of this patch seems to be
> bug-compatibility with Oracle, so using different names is largely
> defeating the point :-(
>
> Maybe we should just hold our noses and do it. The point that
> you'd get a recognizable failure if the wrong function were chosen
> reassures me a little bit. We've seen a lot of cases where this
> sort of ambiguity results in the system just silently doing something
> different from what you expected, and I was afraid that that could
> happen here.
I join a new version of the patch that include a check of the option
parameter in the basic form of regexp_replace() and return an error in
ambiguous cases.
PREPARE rr AS SELECT regexp_replace('healthy, wealthy, and
wise','(\w+)thy', '\1ish', $1);
EXECUTE rr(1);
ERROR: ambiguous use of the option parameter in regex_replace(),
value: 1
HINT: you might set the occurrence parameter to force the use of
the extended form of regex_replace()
This is done by checking if the option parameter value is an integer and
throw the error in this case. I don't think of anything better.
Best regards,
--
Gilles Darold
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index a5b6adc4bb..02d1f72e1e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3108,6 +3108,66 @@ repeat('Pg', 4) <returnvalue>PgPgPgPg</returnvalue>
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>regexp_count</primary>
+ </indexterm>
+ <function>regexp_count</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> [, <parameter>position</parameter> <type>integer</type> ] [, <parameter>flags</parameter> <type>text</type> ] )
+ <returnvalue>integer</returnvalue>
+ </para>
+ <para>
+ Returns the number of times a pattern occurs for a match of a POSIX
+ regular expression to the <parameter>string</parameter>; see
+ <xref linkend="functions-posix-regexp"/>.
+ </para>
+ <para>
+ <literal>regexp_count('123456789012', '\d{3}', 3)</literal>
+ <returnvalue>3</returnvalue>
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>regexp_instr</primary>
+ </indexterm>
+ <function>regexp_instr</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> [, <parameter>position</parameter> <type>integer</type> ] [, <parameter>occurence</parameter> <type>integer</type> ] [, <parameter>returnopt</parameter> <type>integer</type> ] [, <parameter>flags</parameter> <type>text</type> ] [, <parameter>group</parameter> <type>integer</type> ] )
+ <returnvalue>integer</returnvalue>
+ </para>
+ <para>
+ Returns the position within <parameter>string</parameter> where the
+ match of a POSIX regular expression occurs. It returns an integer
+ indicating the beginning or ending position of the matched substring,
+ depending on the value of the <parameter>returnopt</parameter> argument
+ (default beginning). If no match is found the function returns 0;
+ see <xref linkend="functions-posix-regexp"/>.
+ </para>
+ <para>
+ <literal>regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 4)</literal>
+ <returnvalue>7</returnvalue>
+ </para></entry>
+ </row>
+
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>regexp_like</primary>
+ </indexterm>
+ <function>regexp_like</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> [, <parameter>flags</parameter> <type>text</type> ] )
+ <returnvalue>boolean</returnvalue>
+ </para>
+ <para>
+ Evaluates the existence of a match to a POSIX regular expression
+ in <parameter>string</parameter>; see <xref linkend="functions-posix-regexp"/>.
+ </para>
+ <para>
+ <literal>regexp_like('Hello'||chr(10)||'world', '^world$', 'm')</literal>
+ <returnvalue>t</returnvalue>
+ </para></entry>
+ </row>
+
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
@@ -3156,7 +3216,7 @@ repeat('Pg', 4) <returnvalue>PgPgPgPg</returnvalue>
<indexterm>
<primary>regexp_replace</primary>
</indexterm>
- <function>regexp_replace</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>, <parameter>replacement</parameter> <type>text</type> [, <parameter>flags</parameter> <type>text</type> ] )
+ <function>regexp_replace</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>, <parameter>replacement</parameter> <type>text</type> [, <parameter>position</parameter> <type>integer</type> ] [, <parameter>occurence</parameter> <type>integer</type> ] [, <parameter>flags</parameter> <type>text</type> ] )
<returnvalue>text</returnvalue>
</para>
<para>
@@ -3171,6 +3231,24 @@ repeat('Pg', 4) <returnvalue>PgPgPgPg</returnvalue>
</para></entry>
</row>
+ <row>
+ <entry role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>regexp_substr</primary>
+ </indexterm>
+ <function>regexp_substr</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type> [, <parameter>position</parameter> <type>integer</type> ] [, <parameter>occurence</parameter> <type>integer</type> ] [, <parameter>flags</parameter> <type>text</type> ] [, <parameter>group</parameter> <type>integer</type> ] )
+ <returnvalue>text</returnvalue>
+ </para>
+ <para>
+ Return the substring within <parameter>string</parameter> corresponding to the
+ match of a POSIX regular expression; see <xref linkend="functions-posix-regexp"/>.
+ </para>
+ <para>
+ <literal>regexp_substr('1234567890 1234557890', '(123)(4(5[56])(78))', 1, 2, 'i', 3)</literal>
+ <returnvalue>55</returnvalue>
+ </para></entry>
+ </row>
+
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
@@ -5392,6 +5470,18 @@ substring('foobar' similar '#"o_b#"%' escape '#') <lineannotation>NULL</linea
<indexterm>
<primary>regexp_split_to_array</primary>
</indexterm>
+ <indexterm>
+ <primary>regexp_like</primary>
+ </indexterm>
+ <indexterm>
+ <primary>regexp_count</primary>
+ </indexterm>
+ <indexterm>
+ <primary>regexp_instr</primary>
+ </indexterm>
+ <indexterm>
+ <primary>regexp_substr</primary>
+ </indexterm>
<para>
<xref linkend="functions-posix-table"/> lists the available
@@ -5650,6 +5740,8 @@ SELECT col1, (SELECT regexp_matches(col2, '(bar)(beque)')) FROM tab;
It has the syntax
<function>regexp_replace</function>(<replaceable>source</replaceable>,
<replaceable>pattern</replaceable>, <replaceable>replacement</replaceable>
+ <optional>, <replaceable>position</replaceable> </optional>
+ <optional>, <replaceable>occurrence</replaceable> </optional>
<optional>, <replaceable>flags</replaceable> </optional>).
The <replaceable>source</replaceable> string is returned unchanged if
there is no match to the <replaceable>pattern</replaceable>. If there is a
@@ -5663,12 +5755,19 @@ SELECT col1, (SELECT regexp_matches(col2, '(bar)(beque)')) FROM tab;
substring matching the entire pattern should be inserted. Write
<literal>\\</literal> if you need to put a literal backslash in the replacement
text.
- The <replaceable>flags</replaceable> parameter is an optional text
- string containing zero or more single-letter flags that change the
+ <replaceable>pattern</replaceable> is searched in <replaceable>string</replaceable> starting
+ from an optional <replaceable>position</replaceable> or from the beginning
+ of <replaceable>source</replaceable> by default. Optional <replaceable>occurrence</replaceable>
+ parameter indicates which occurrence of <replaceable>pattern</replaceable> in
+ <replaceable>source</replaceable> should be replaced. Default value for <replaceable>occurrence</replaceable>
+ is 1, replace only the first occurrence. The <replaceable>flags</replaceable> parameter
+ is an optional text string containing zero or more single-letter flags that change the
function's behavior. Flag <literal>i</literal> specifies case-insensitive
matching, while flag <literal>g</literal> specifies replacement of each matching
- substring rather than only the first one. Supported flags (though
- not <literal>g</literal>) are
+ substring rather than only the first one. When <replaceable>occurrence</replaceable>
+ is set flag <literal>g</literal> has no effect. If <replaceable>occurrence</replaceable>
+ is set to zero, all occurrences are replaced which is similar to flag <literal>g</literal>.
+ Supported flags (though not <literal>g</literal>) are
described in <xref linkend="posix-embedded-options-table"/>.
</para>
@@ -5681,6 +5780,10 @@ regexp_replace('foobarbaz', 'b..', 'X', 'g')
<lineannotation>fooXX</lineannotation>
regexp_replace('foobarbaz', 'b(..)', 'X\1Y', 'g')
<lineannotation>fooXarYXazY</lineannotation>
+regexp_replace('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 0, 'i')
+ <lineannotation>X PXstgrXSQL fXnctXXn</lineannotation>
+regexp_replace('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 3, 'i')
+ <lineannotation>A PostgrXSQL function</lineannotation>
</programlisting>
</para>
@@ -5766,6 +5869,163 @@ SELECT foo FROM regexp_split_to_table('the quick brown fox', '\s*') AS foo;
in practice. Other software systems such as Perl use similar definitions.
</para>
+ <para>
+ The <function>regexp_like</function> function evaluates the existence of a match
+ to a POSIX regular expression in <parameter>string</parameter>; returns a boolean
+ resulting from matching a POSIX regular expression pattern to a string. It has
+ the syntax <function>regexp_like</function>(<replaceable>string</replaceable>,
+ <replaceable>pattern</replaceable> <optional>, <replaceable>flags</replaceable> </optional>).
+ <replaceable>pattern</replaceable> is searched in <replaceable>string</replaceable> starting
+ from the beginning of <replaceable>string</replaceable>.
+ The <replaceable>flags</replaceable> parameter is an optional text string
+ containing zero or more single-letter flags that change the function's behavior.
+ <function>regexp_like</function> accepts all the flags
+ shown in <xref linkend="posix-embedded-options-table"/>.
+ This function is similar to regexp operator <literal>~</literal> when used without
+ <replaceable>flags</replaceable> and similar to operator <literal>~*</literal> when
+ used with <replaceable>flags</replaceable> <literal>'i'</literal> only.
+ </para>
+
+ <para>
+ Some examples:
+<programlisting>
+SELECT 'found' FROM (values('Hello'||chr(10)||'world') as f(col) WHERE regexp_like(col, '^world$', 'm');
+ ?column?
+----------
+ found
+(1 row)
+</programlisting>
+ </para>
+
+
+ <para>
+ The <function>regexp_count</function> function returns the number of
+ captured substring(s) resulting from matching a POSIX regular
+ expression pattern to a string. It has the syntax <function>regexp_count</function>(
+ <replaceable>string</replaceable>, <replaceable>pattern</replaceable> <optional>,
+ <replaceable>position</replaceable> </optional> <optional>, <replaceable>flags</replaceable> </optional>).
+ <replaceable>pattern</replaceable> is searched in <replaceable>string</replaceable> starting
+ from an optional <replaceable>position</replaceable> or from the beginning of <replaceable>string</replaceable>
+ by default. The <replaceable>flags</replaceable> parameter is an optional text string
+ containing zero or more single-letter flags that change the function's behavior.
+ <function>regexp_count</function> accepts all the flags
+ shown in <xref linkend="posix-embedded-options-table"/>.
+ The <literal>g</literal> flag is forced internally to count all matches.
+ This function returns 0 if there is no match or the number of matches as
+ an integer.
+ </para>
+
+ <para>
+ Some examples:
+<programlisting>
+SELECT regexp_count('123123123123', '\d{3}', 1);
+ regexp_count
+--------------
+ 4
+(1 row)
+
+SELECT regexp_count('Hello'||CHR(10)||'world!', '^world!$', 1, 'm');
+ regexp_count
+--------------
+ 1
+(1 row)
+</programlisting>
+ </para>
+
+ <para>
+ The <function>regexp_instr</function> function returns the beginning or ending
+ position of the matched substring resulting from matching a POSIX regular
+ expression pattern to a string. It has the syntax <function>regexp_instr</function>(
+ <replaceable>string</replaceable>, <replaceable>pattern</replaceable> <optional>,
+ <replaceable>position</replaceable> </optional> <optional>, <replaceable>occurrence</replaceable> </optional>
+ <optional>, <replaceable>returnopt</replaceable> </optional>
+ <optional>, <replaceable>flags</replaceable> </optional>
+ <optional>, <replaceable>group</replaceable> </optional>).
+ <replaceable>pattern</replaceable> is searched in <replaceable>string</replaceable> starting
+ from an optional <replaceable>position</replaceable> or from the beginning
+ of <replaceable>string</replaceable> by default. Optional <replaceable>occurrence</replaceable> parameter
+ indicates which occurrence of <replaceable>pattern</replaceable> in <replaceable>string</replaceable>
+ should be searched. When optional <replaceable>returnopt</replaceable> parameter is set to 0 (default)
+ the function returns the position of the first character of the occurrence. When set to 1 returns
+ the position of the character after the occurrence.
+ The <replaceable>flags</replaceable> parameter is an optional text string
+ containing zero or more single-letter flags that change the function's behavior.
+ <function>regexp_instr</function> accepts all the flags
+ shown in <xref linkend="posix-embedded-options-table"/>.
+ The <literal>g</literal> flag is forced internally to track all matches.
+ For a pattern with capture groups, <replaceable>group</replaceable> is an integer indicating
+ which capture in <replaceable>pattern</replaceable> is the target of the function.
+ A capture group is a part of the pattern enclosed in parentheses. Capture groups can be nested.
+ They are numbered in the order in which their left parentheses appear in <replaceable>pattern</replaceable>.
+ If <replaceable>group</replaceable> is zero, then the position
+ of the entire substring that matches the pattern is returned. If <replaceable>pattern</replaceable>
+ does not have at least <replaceable>group</replaceable> capture groups, the function returns zero.
+ This function returns 0 if there is no match or the starting or ending position
+ of a match as an integer.
+ </para>
+
+ <para>
+ Some examples:
+<programlisting>
+SELECT regexp_instr('number of your street, zipcode thetown, FR', '[^ ]+', 1, 6);
+ regexp_instr
+--------------
+ 32
+(1 row)
+
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 3);
+ regexp_instr
+--------------
+ 5
+(1 row)
+</programlisting>
+ </para>
+
+ <para>
+ The <function>regexp_substr</function> function returns the
+ matched substring resulting from matching a POSIX regular
+ expression pattern to a string. It has the syntax <function>regexp_substr</function>(
+ <replaceable>string</replaceable>, <replaceable>pattern</replaceable> <optional>,
+ <replaceable>position</replaceable> </optional> <optional>, <replaceable>occurrence</replaceable> </optional>
+ <optional>, <replaceable>flags</replaceable> </optional>
+ <optional>, <replaceable>group</replaceable> </optional>).
+ <replaceable>pattern</replaceable> is searched in <replaceable>string</replaceable> starting
+ from an optional <replaceable>position</replaceable> or from the beginning
+ of <replaceable>string</replaceable> by default. Optional parameter <replaceable>occurrence</replaceable>
+ indicates which occurrence of <replaceable>pattern</replaceable> in <replaceable>string</replaceable>
+ should be searched. The <replaceable>flags</replaceable> parameter is an optional text string
+ containing zero or more single-letter flags that change the function's behavior.
+ <function>regexp_substr</function> accepts all the flags
+ shown in <xref linkend="posix-embedded-options-table"/>.
+ The <literal>g</literal> flag is forced internally to track all matches.
+ For a pattern with capture groups, optional <replaceable>group</replaceable> is an integer indicating
+ which capture in <replaceable>pattern</replaceable> is the target of the function. A capture group
+ is a part of the pattern enclosed in parentheses. Capture groups can be nested. They are numbered in
+ the order in which their left parentheses appear in <replaceable>pattern</replaceable>.
+ If <replaceable>group</replaceable> is zero, then the position
+ of the entire substring that matches the pattern is returned. If <replaceable>pattern</replaceable>
+ does not have at least <replaceable>group</replaceable> capture groups, the function returns zero.
+ This function returns NULL if there is no match or the substring of the match.
+ </para>
+
+ <para>
+ Some examples:
+<programlisting>
+SELECT regexp_substr('number of your street, zipcode town, FR', ',[^,]+', 1, 2);
+ regexp_substr
+---------------
+ , FR
+(1 row)
+
+SELECT regexp_substr('1234567890 1234557890', '(123)(4(5[56])(78))', 1, 2, 'i', 3);
+ regexp_substr
+---------------
+ 55
+(1 row)
+</programlisting>
+ </para>
+
+
<!-- derived from the re_syntax.n man page -->
<sect3 id="posix-syntax-details">
diff --git a/src/backend/utils/adt/regexp.c b/src/backend/utils/adt/regexp.c
index a32c5c82ab..f34cce18ef 100644
--- a/src/backend/utils/adt/regexp.c
+++ b/src/backend/utils/adt/regexp.c
@@ -120,6 +120,7 @@ static regexp_matches_ctx *setup_regexp_matches(text *orig_str, text *pattern,
static ArrayType *build_regexp_match_result(regexp_matches_ctx *matchctx);
static Datum build_regexp_split_result(regexp_matches_ctx *splitctx);
+static text *enclose_with_parenthesis(text *str);
/*
* RE_compile_and_cache - compile a RE, caching if possible
@@ -555,7 +556,6 @@ texticregexne(PG_FUNCTION_ARGS)
0, NULL));
}
-
/*
* textregexsubstr()
* Return a substring matched by a regular expression.
@@ -646,6 +646,32 @@ textregexreplace(PG_FUNCTION_ARGS)
regex_t *re;
pg_re_flags flags;
+
+ /*
+ * When the function is called with four parameters in a prepared
+ * statement the basic regexp_replace() function can be called
+ * instead of the extended form. This is because the parameter type
+ * is unknown at prepare time and the basic form is chosen. Check
+ * that the fourth parameter is not an integer otherwise return an
+ * error that we call the extended form.
+ */
+ if (PG_NARGS() == 4)
+ {
+ char *badopt;
+
+ (void) strtol(TextDatumGetCString(opt), &badopt, 10);
+ if (strcmp(badopt, "") == 0)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("ambiguous use of the option parameter"
+ " in regex_replace(), value: %s",
+ TextDatumGetCString(opt)),
+ errhint("you might set the occurrence parameter to force"
+ " the use of the extended form of regex_replace()")));
+ }
+ }
+
parse_re_flags(&flags, opt);
re = RE_compile_and_cache(p, flags.cflags, PG_GET_COLLATION());
@@ -1063,6 +1089,508 @@ regexp_matches_no_flags(PG_FUNCTION_ARGS)
return regexp_matches(fcinfo);
}
+/*
+ * regexp_like()
+ * Return the true if a pattern match within a string.
+ */
+Datum
+regexp_like(PG_FUNCTION_ARGS)
+{
+ text *str = PG_GETARG_TEXT_PP(0);
+ text *pattern = PG_GETARG_TEXT_PP(1);
+ text *flags = PG_GETARG_TEXT_PP_IF_EXISTS(2);
+
+ pg_re_flags re_flags;
+ regexp_matches_ctx *matchctx;
+
+ /* Determine options */
+ parse_re_flags(&re_flags, flags);
+
+ matchctx = setup_regexp_matches(str, pattern, &re_flags,
+ PG_GET_COLLATION(), true, false, false);
+
+ if (matchctx->nmatches > 0)
+ PG_RETURN_BOOL(true);
+
+ PG_RETURN_BOOL(false);
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+regexp_like_no_flags(PG_FUNCTION_ARGS)
+{
+ return regexp_like(fcinfo);
+}
+
+/*
+ * regexp_count()
+ * Return the number of matches of a pattern within a string.
+ */
+Datum
+regexp_count(PG_FUNCTION_ARGS)
+{
+ text *orig_str = NULL;
+ text *pattern = PG_GETARG_TEXT_PP(1);
+ int start = 1;
+ text *flags = PG_GETARG_TEXT_PP_IF_EXISTS(3);
+
+ pg_re_flags re_flags;
+ regexp_matches_ctx *matchctx;
+
+ if (PG_NARGS() > 2)
+ start = PG_GETARG_INT32(2);
+
+ if (start < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": %d", "start_position", start)));
+
+ /* regexp_count(string, pattern, start[, flags]) */
+ if (start > 1)
+ orig_str = DatumGetTextPP(DirectFunctionCall2(text_substr_no_len,
+ PG_GETARG_DATUM(0),
+ Int32GetDatum(start)
+ ));
+ else
+ orig_str = PG_GETARG_TEXT_PP(0);
+
+ /* Determine options */
+ parse_re_flags(&re_flags, flags);
+
+ /* this function require flag 'g' */
+ re_flags.glob = true;
+
+ matchctx = setup_regexp_matches(orig_str, pattern, &re_flags,
+ PG_GET_COLLATION(), true, false, false);
+
+ PG_RETURN_INT32(matchctx->nmatches);
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+regexp_count_no_start(PG_FUNCTION_ARGS)
+{
+ return regexp_count(fcinfo);
+}
+
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+regexp_count_no_flags(PG_FUNCTION_ARGS)
+{
+ return regexp_count(fcinfo);
+}
+
+/*
+ * textregexreplace_extended()
+ * Return a string matched by a regular expression, with replacement.
+ * Extends textregexreplace by allowing a start position and the
+ * choice of the occurrence to replace (0 means all occurrences).
+ */
+Datum
+textregexreplace_extended(PG_FUNCTION_ARGS)
+{
+ text *s = NULL;
+ text *p = PG_GETARG_TEXT_PP(1);
+ text *pattern = enclose_with_parenthesis(PG_GETARG_TEXT_PP(1));
+ text *r = PG_GETARG_TEXT_PP(2);
+ int start = 1;
+ int occurrence = 1;
+ int pos = 0;
+ text *flags = PG_GETARG_TEXT_PP_IF_EXISTS(5);
+ StringInfoData str;
+ regex_t *re;
+ pg_re_flags re_flags;
+ regexp_matches_ctx *matchctx;
+ text *after = NULL;
+
+ /* start position */
+ if (PG_NARGS() > 3)
+ start = PG_GETARG_INT32(3);
+ if (start < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": %d", "start_position", start)));
+ if (start > 1)
+ s = DatumGetTextPP(DirectFunctionCall2(text_substr_no_len,
+ PG_GETARG_DATUM(0),
+ Int32GetDatum(start)
+ ));
+ else
+ s = PG_GETARG_TEXT_PP(0);
+
+ /* occurrence to replace */
+ if (PG_NARGS() > 4)
+ occurrence = PG_GETARG_INT32(4);
+
+ if (occurrence < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": %d", "occurrence", occurrence)));
+
+ /* Determine options */
+ parse_re_flags(&re_flags, flags);
+ re_flags.glob = true;
+
+ /* lookup for pattern */
+ matchctx = setup_regexp_matches(s, pattern, &re_flags,
+ PG_GET_COLLATION(), true,
+ false, re_flags.glob);
+
+ /* If no match is found, then the function returns the original string */
+ if (matchctx->nmatches == 0)
+ PG_RETURN_DATUM(PG_GETARG_DATUM(0));
+
+ /* Get the position of the occurence to replace */
+ if (PG_NARGS() > 4 && occurrence > 0)
+ {
+ /* When occurrence exceed matches return the original string */
+ if (occurrence > matchctx->nmatches)
+ PG_RETURN_DATUM(PG_GETARG_DATUM(0));
+
+ pos = (occurrence*matchctx->npatterns*2)-(matchctx->npatterns*2);
+ pos = matchctx->match_locs[pos]+start;
+ }
+ else
+ pos = start;
+
+ /* Normal case without explicit VARIADIC marker */
+ initStringInfo(&str);
+
+ /* Get the string before the occurrence starting */
+ if (pos > 1)
+ {
+ text *before = DatumGetTextPP(DirectFunctionCall3(text_substr,
+ PG_GETARG_DATUM(0),
+ Int32GetDatum(1),
+ Int32GetDatum(pos - 1)));
+ appendStringInfoString(&str, TextDatumGetCString(before));
+ }
+
+ /* all occurences must be replaced? */
+ if (occurrence == 0)
+ re_flags.glob = true;
+ else
+ re_flags.glob = false;
+
+ /* Compile the regular expression */
+ re = RE_compile_and_cache(p, re_flags.cflags, PG_GET_COLLATION());
+
+ /* Get the substring starting at the right occurrence position */
+ after = DatumGetTextPP(DirectFunctionCall2(text_substr_no_len,
+ PG_GETARG_DATUM(0),
+ Int32GetDatum(pos)
+ ));
+
+ appendStringInfoString(&str, TextDatumGetCString(replace_text_regexp(
+ after,
+ (void *) re,
+ r,
+ re_flags.glob)));
+
+ PG_RETURN_TEXT_P(CStringGetTextDatum(str.data));
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+textregexreplace_extended_no_occurrence(PG_FUNCTION_ARGS)
+{
+ return textregexreplace_extended(fcinfo);
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+textregexreplace_extended_no_flags(PG_FUNCTION_ARGS)
+{
+ return textregexreplace_extended(fcinfo);
+}
+
+/*
+ * Return text string between parenthesis
+ */
+static text *
+enclose_with_parenthesis(text *str)
+{
+ int len = VARSIZE_ANY_EXHDR(str);
+ text *result;
+ char *ptr;
+
+ result = palloc(len + VARHDRSZ + 2);
+ SET_VARSIZE(result, len + VARHDRSZ + 2);
+ ptr = VARDATA(result);
+ memcpy(ptr, "(", 1);
+ memcpy(ptr+1, VARDATA_ANY(str), len);
+ memcpy(ptr+len+1, ")", 1);
+
+ return result;
+}
+
+/*
+ * regexp_instr()
+ * Return the position within the string where the match was located
+ */
+Datum
+regexp_instr(PG_FUNCTION_ARGS)
+{
+ text *orig_str = NULL;
+ text *pattern = NULL;
+ text *flags = PG_GETARG_TEXT_PP_IF_EXISTS(5);
+ int start = 1;
+ int occurrence = 1;
+ int return_opt = 0;
+ int subexpr = 0;
+ int pos;
+
+ pg_re_flags re_flags;
+ regexp_matches_ctx *matchctx;
+
+ /* regexp_instr(string, pattern, start) */
+ if (PG_NARGS() > 2)
+ start = PG_GETARG_INT32(2);
+
+ if (start < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": %d", "start_position", start)));
+
+ if (start > 1)
+ orig_str = DatumGetTextPP(DirectFunctionCall2(text_substr_no_len,
+ PG_GETARG_DATUM(0),
+ Int32GetDatum(start)
+ ));
+ else
+ /* regexp_instr(string, pattern) */
+ orig_str = PG_GETARG_TEXT_PP(0);
+
+ if (orig_str == NULL)
+ PG_RETURN_NULL();
+
+ /* regexp_instr(string, pattern, start, occurrence) */
+ if (PG_NARGS() > 3)
+ occurrence = PG_GETARG_INT32(3);
+
+ if (occurrence <= 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": %d", "occurrence", occurrence)));
+
+ /* regexp_instr(string, pattern, start, occurrence, return_opt) */
+ if (PG_NARGS() > 4)
+ return_opt = PG_GETARG_INT32(4);
+
+ if (return_opt != 0 && return_opt != 1)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": %d", "return_option", return_opt)));
+
+ /* regexp_instr(string, pattern, start, occurrence, return_opt, flags, subexpr) */
+ if (PG_NARGS() > 6)
+ subexpr = PG_GETARG_INT32(6);
+ if (subexpr < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": %d", "group", subexpr)));
+
+ /*
+ * If subexpr is zero (default), then the position of the entire
+ * substring that matches the pattern is returned. Otherwise we
+ * will exactly register the subexpressions given in the pattern.
+ * Enclose pattern between parenthesis to register the position
+ * of the entire substring.
+ */
+ pattern = enclose_with_parenthesis(PG_GETARG_TEXT_PP(1));
+
+ /* Determine options */
+ parse_re_flags(&re_flags, flags);
+ /* this function require flag 'g' */
+ re_flags.glob = true;
+
+ matchctx = setup_regexp_matches(orig_str, pattern, &re_flags,
+ PG_GET_COLLATION(), true, false, false);
+
+ /* If no match is found, then the function returns 0 */
+ if (matchctx->nmatches == 0)
+ PG_RETURN_INT32(0);
+
+ /* When occurrence exceed matches return 0 */
+ if (occurrence > matchctx->nmatches)
+ PG_RETURN_INT32(0);
+
+ /* When subexpr exceed number of subexpression return 0 */
+ if (subexpr > matchctx->npatterns - 1)
+ PG_RETURN_INT32(0);
+
+ /*
+ * Returns the position of the first character of the occurrence
+ * or for subexpression in this occurrence.
+ */
+ pos = (occurrence*matchctx->npatterns*2)-((matchctx->npatterns-subexpr)*2);
+ if (return_opt == 1)
+ pos += 1;
+
+ PG_RETURN_INT32(matchctx->match_locs[pos]+start);
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+regexp_instr_no_start(PG_FUNCTION_ARGS)
+{
+ return regexp_instr(fcinfo);
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+regexp_instr_no_occurrence(PG_FUNCTION_ARGS)
+{
+ return regexp_instr(fcinfo);
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+regexp_instr_no_return_opt(PG_FUNCTION_ARGS)
+{
+ return regexp_instr(fcinfo);
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+regexp_instr_no_flags(PG_FUNCTION_ARGS)
+{
+ return regexp_instr(fcinfo);
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+regexp_instr_no_subexpr(PG_FUNCTION_ARGS)
+{
+ return regexp_instr(fcinfo);
+}
+
+/*
+ * regexp_substr()
+ * Return the substring within the string that match a regular
+ * expression pattern
+ */
+Datum
+regexp_substr(PG_FUNCTION_ARGS)
+{
+ text *orig_str = NULL;
+ text *pattern = NULL;
+ text *flags = PG_GETARG_TEXT_PP_IF_EXISTS(4);
+ int start = 1;
+ int occurrence = 1;
+ int subexpr = 0;
+ int so, eo, pos;
+
+ pg_re_flags re_flags;
+ regexp_matches_ctx *matchctx;
+
+ if (PG_NARGS() > 2)
+ start = PG_GETARG_INT32(2);
+
+ if (start < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": %d", "start_position", start)));
+
+ /* regexp_substr(string, pattern, start) */
+ if (start > 1)
+ orig_str = DatumGetTextPP(DirectFunctionCall2(text_substr_no_len,
+ PG_GETARG_DATUM(0),
+ Int32GetDatum(start)
+ ));
+ else
+ /* regexp_substr(string, pattern) */
+ orig_str = PG_GETARG_TEXT_PP(0);
+
+ if (orig_str == NULL)
+ PG_RETURN_NULL();
+
+ /* regexp_substr(string, pattern, start, occurrence) */
+ if (PG_NARGS() > 3)
+ occurrence = PG_GETARG_INT32(3);
+
+ if (occurrence <= 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": %d", "occurrence", occurrence)));
+
+ /* regexp_substr(string, pattern, start, occurrence, flags, subexpr) */
+ if (PG_NARGS() > 5)
+ subexpr = PG_GETARG_INT32(5);
+ if (subexpr < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for parameter \"%s\": %d", "group", subexpr)));
+
+ /*
+ * If subexpr is zero (default), then the position of the entire
+ * substring that matches the pattern is returned. Otherwise we
+ * will exactly register the subexpressions given in the pattern.
+ * Enclose pattern between parenthesis to register the position
+ * of the entire substring.
+ */
+ pattern = enclose_with_parenthesis(PG_GETARG_TEXT_PP(1));
+
+ /* Determine options */
+ parse_re_flags(&re_flags, flags);
+ /* this function require flag 'g' */
+ re_flags.glob = true;
+
+ matchctx = setup_regexp_matches(orig_str, pattern, &re_flags,
+ PG_GET_COLLATION(), true, false, false);
+
+ /* If no match is found, then the function returns NULL */
+ if (matchctx->nmatches == 0)
+ PG_RETURN_NULL();
+
+ /* When occurrence exceed matches return NULL */
+ if (occurrence > matchctx->nmatches)
+ PG_RETURN_NULL();
+
+ /* When subexpr exceed number of subexpression return NULL */
+ if (subexpr > matchctx->npatterns - 1)
+ PG_RETURN_NULL();
+
+ /* Returns the substring corresponding to the occurrence. */
+ pos = (occurrence*matchctx->npatterns*2)-((matchctx->npatterns-subexpr)*2);
+ so = matchctx->match_locs[pos]+1;
+ eo = matchctx->match_locs[pos+1]+1;
+
+ PG_RETURN_DATUM(DirectFunctionCall3(text_substr,
+ PointerGetDatum(matchctx->orig_str),
+ Int32GetDatum(so),
+ Int32GetDatum(eo - so)));
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+regexp_substr_no_start(PG_FUNCTION_ARGS)
+{
+ return regexp_substr(fcinfo);
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+regexp_substr_no_occurrence(PG_FUNCTION_ARGS)
+{
+ return regexp_substr(fcinfo);
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+regexp_substr_no_flags(PG_FUNCTION_ARGS)
+{
+ return regexp_substr(fcinfo);
+}
+
+/* This is separate to keep the opr_sanity regression test from complaining */
+Datum
+regexp_substr_no_subexpr(PG_FUNCTION_ARGS)
+{
+ return regexp_substr(fcinfo);
+}
+
/*
* setup_regexp_matches --- do the initial matching for regexp_match
* and regexp_split functions
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 8cd0252082..f45c98c05f 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -3579,6 +3579,63 @@
proname => 'regexp_matches', prorows => '10', proretset => 't',
prorettype => '_text', proargtypes => 'text text text',
prosrc => 'regexp_matches' },
+{ oid => '9614', descr => 'count match(es) for regexp',
+ proname => 'regexp_count', prorettype => 'int4', proargtypes => 'text text',
+ prosrc => 'regexp_count_no_start' },
+{ oid => '9615', descr => 'count match(es) for regexp',
+ proname => 'regexp_count', prorettype => 'int4', proargtypes => 'text text int4',
+ prosrc => 'regexp_count_no_flags' },
+{ oid => '9616', descr => 'count match(es) for regexp',
+ proname => 'regexp_count', prorettype => 'int4',
+ proargtypes => 'text text int4 text', prosrc => 'regexp_count' },
+{ oid => '9617', descr => 'position where the match for regexp was located',
+ proname => 'regexp_instr', prorettype => 'int4', proargtypes => 'text text',
+ prosrc => 'regexp_instr_no_start' },
+{ oid => '9618', descr => 'position where the match for regexp was located',
+ proname => 'regexp_instr', prorettype => 'int4', proargtypes => 'text text int4',
+ prosrc => 'regexp_instr_no_occurrence' },
+{ oid => '9619', descr => 'position where the match for regexp was located',
+ proname => 'regexp_instr', prorettype => 'int4', proargtypes => 'text text int4 int4',
+ prosrc => 'regexp_instr_no_return_opt' },
+{ oid => '9620', descr => 'position where the match for regexp was located',
+ proname => 'regexp_instr', prorettype => 'int4', proargtypes => 'text text int4 int4 int4',
+ prosrc => 'regexp_instr_no_flags' },
+{ oid => '9621', descr => 'position where the match for regexp was located',
+ proname => 'regexp_instr', prorettype => 'int4', proargtypes => 'text text int4 int4 int4 text',
+ prosrc => 'regexp_instr_no_subexpr' },
+{ oid => '9622', descr => 'position where the match for regexp was located',
+ proname => 'regexp_instr', prorettype => 'int4', proargtypes => 'text text int4 int4 int4 text int4',
+ prosrc => 'regexp_instr' },
+{ oid => '9623', descr => 'substring that matches the regexp pattern',
+ proname => 'regexp_substr', prorettype => 'text', proargtypes => 'text text',
+ prosrc => 'regexp_substr_no_start' },
+{ oid => '9624', descr => 'substring that matchies the regexp pattern',
+ proname => 'regexp_substr', prorettype => 'text', proargtypes => 'text text int4',
+ prosrc => 'regexp_substr_no_occurrence' },
+{ oid => '9625', descr => 'substring that matches the regexp pattern',
+ proname => 'regexp_substr', prorettype => 'text', proargtypes => 'text text int4 int4',
+ prosrc => 'regexp_substr_no_flags' },
+{ oid => '9626', descr => 'substring that matches the regexp pattern',
+ proname => 'regexp_substr', prorettype => 'text', proargtypes => 'text text int4 int4 text',
+ prosrc => 'regexp_substr_no_subexpr' },
+{ oid => '9627', descr => 'substring that matches the regexp pattern',
+ proname => 'regexp_substr', prorettype => 'text', proargtypes => 'text text int4 int4 text int4',
+ prosrc => 'regexp_substr' },
+{ oid => '9628', descr => 'evaluate match(es) for regexp',
+ proname => 'regexp_like', prorettype => 'bool', proargtypes => 'text text',
+ prosrc => 'regexp_like_no_flags' },
+{ oid => '9629', descr => 'evaluate match(es) for regexp',
+ proname => 'regexp_like', prorettype => 'bool',
+ proargtypes => 'text text text', prosrc => 'regexp_like' },
+{ oid => '9606', descr => 'replace text using regexp',
+ proname => 'regexp_replace', prorettype => 'text',
+ proargtypes => 'text text text int4 int4 text', prosrc => 'textregexreplace_extended' },
+{ oid => '9607', descr => 'replace text using regexp',
+ proname => 'regexp_replace', prorettype => 'text',
+ proargtypes => 'text text text int4 int4', prosrc => 'textregexreplace_extended_no_flags' },
+{ oid => '9608', descr => 'replace text using regexp',
+ proname => 'regexp_replace', prorettype => 'text',
+ proargtypes => 'text text text int4', prosrc => 'textregexreplace_extended_no_occurrence' },
{ oid => '2088', descr => 'split string by field_sep and return field_num',
proname => 'split_part', prorettype => 'text',
proargtypes => 'text text int4', prosrc => 'split_part' },
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index 91aa819804..b4e3501556 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -905,6 +905,757 @@ SELECT foo, length(foo) FROM regexp_split_to_table('thE QUick bROWn FOx jUMPs ov
ERROR: regexp_split_to_table() does not support the "global" option
SELECT regexp_split_to_array('thE QUick bROWn FOx jUMPs ovEr The lazy dOG', 'e', 'g');
ERROR: regexp_split_to_array() does not support the "global" option
+-- regexp_like tests
+SELECT regexp_like('a'||CHR(10)||'d', 'a.d');
+ regexp_like
+-------------
+ t
+(1 row)
+
+SELECT regexp_like('a'||CHR(10)||'d', 'a.d', 'm');
+ regexp_like
+-------------
+ f
+(1 row)
+
+SELECT regexp_like('a'||CHR(10)||'d', 'a.d', 'n');
+ regexp_like
+-------------
+ f
+(1 row)
+
+SELECT regexp_like('Steven', '^Ste(v|ph)en$');
+ regexp_like
+-------------
+ t
+(1 row)
+
+SELECT regexp_like('foo' || chr(10) || 'bar' || chr(10) || 'bequq' || chr(10) || 'baz', '^bar');
+ regexp_like
+-------------
+ f
+(1 row)
+
+SELECT regexp_like('foo' || chr(10) || 'bar' || chr(10) || 'bequq' || chr(10) || 'baz', 'bar');
+ regexp_like
+-------------
+ t
+(1 row)
+
+SELECT regexp_like('foo' || chr(10) || 'bar' || chr(10) || 'bequq' || chr(10) || 'baz', '^bar', 'm');
+ regexp_like
+-------------
+ t
+(1 row)
+
+SELECT regexp_like('foo' || chr(10) || 'bar' || chr(10) || 'bequq' || chr(10) || 'baz', '^bar', 'n');
+ regexp_like
+-------------
+ t
+(1 row)
+
+SELECT regexp_like('GREEN', '([aeiou])\1');
+ regexp_like
+-------------
+ f
+(1 row)
+
+SELECT regexp_like('GREEN', '([aeiou])\1', 'i');
+ regexp_like
+-------------
+ t
+(1 row)
+
+SELECT regexp_like('ORANGE' || chr(10) || 'GREEN', '([aeiou])\1', 'i');
+ regexp_like
+-------------
+ t
+(1 row)
+
+SELECT regexp_like('ORANGE' || chr(10) || 'GREEN', '^..([aeiou])\1', 'i');
+ regexp_like
+-------------
+ f
+(1 row)
+
+SELECT regexp_like('ORANGE' || chr(10) || 'GREEN', '([aeiou])\1', 'in');
+ regexp_like
+-------------
+ t
+(1 row)
+
+SELECT regexp_like('ORANGE' || chr(10) || 'GREEN', '^..([aeiou])\1', 'in');
+ regexp_like
+-------------
+ t
+(1 row)
+
+SELECT regexp_like('ORANGE' || chr(10) || 'GREEN', '^..([aeiou])\1', 'im');
+ regexp_like
+-------------
+ t
+(1 row)
+
+SELECT REGEXP_LIKE('abc', 'a b c');
+ regexp_like
+-------------
+ f
+(1 row)
+
+SELECT REGEXP_LIKE('abc', 'a b c','x');
+ regexp_like
+-------------
+ t
+(1 row)
+
+-- count all matches for regexp
+SELECT regexp_count('123123123123123', '(12)3');
+ regexp_count
+--------------
+ 5
+(1 row)
+
+-- count all matches with start position
+SELECT regexp_count('123123123123', '123', 0);
+ regexp_count
+--------------
+ 4
+(1 row)
+
+SELECT regexp_count('123123123123', '123', 1);
+ regexp_count
+--------------
+ 4
+(1 row)
+
+SELECT regexp_count('123123123123', '123', 3);
+ regexp_count
+--------------
+ 3
+(1 row)
+
+SELECT regexp_count('123123123123', '123', -3);
+ERROR: invalid value for parameter "start_position": -3
+-- count all matches in NULL string with a start position
+SELECT regexp_count(NULL, '123', 3);
+ regexp_count
+--------------
+ \N
+(1 row)
+
+-- count all matches with a start position greater than string length
+SELECT regexp_count('123', '123', 10);
+ regexp_count
+--------------
+ 0
+(1 row)
+
+-- count all matches from different regexp
+SELECT regexp_count('ABC123', '[A-Z]'), regexp_count('A1B2C3', '[A-Z]');
+ regexp_count | regexp_count
+--------------+--------------
+ 3 | 3
+(1 row)
+
+SELECT regexp_count('ABC123', '[A-Z][0-9]'), regexp_count('A1B2C3', '[A-Z][0-9]');
+ regexp_count | regexp_count
+--------------+--------------
+ 1 | 3
+(1 row)
+
+SELECT regexp_count('ABC123', '[A-Z][0-9]{2}'), regexp_count('A1B2C3', '[A-Z][0-9]{2}');
+ regexp_count | regexp_count
+--------------+--------------
+ 1 | 0
+(1 row)
+
+SELECT regexp_count('ABC123', '([A-Z][0-9]){2}'), regexp_count('A1B2C3', '([A-Z][0-9]){2}');
+ regexp_count | regexp_count
+--------------+--------------
+ 0 | 1
+(1 row)
+
+SELECT regexp_count('ABC123A5', '^[A-Z][0-9]'), regexp_count('A1B2C3', '^[A-Z][0-9]');
+ regexp_count | regexp_count
+--------------+--------------
+ 0 | 1
+(1 row)
+
+SELECT regexp_count('ABC123', '[A-Z][0-9]{2}'), regexp_count('A1B2C34', '[A-Z][0-9]{2}');
+ regexp_count | regexp_count
+--------------+--------------
+ 1 | 1
+(1 row)
+
+-- count matches with newline case insensivity
+SELECT regexp_count('a'||CHR(10)||'d', 'a.d');
+ regexp_count
+--------------
+ 1
+(1 row)
+
+SELECT regexp_count('a'||CHR(10)||'d', 'a.d', 1, 's');
+ regexp_count
+--------------
+ 1
+(1 row)
+
+-- count matches with newline case sensivity
+SELECT regexp_count('a'||CHR(10)||'d', 'a.d', 1, 'n');
+ regexp_count
+--------------
+ 0
+(1 row)
+
+-- count not multiline matches
+SELECT regexp_count('a'||CHR(10)||'d', '^d$');
+ regexp_count
+--------------
+ 0
+(1 row)
+
+-- count multiline matches
+SELECT regexp_count('a'||CHR(10)||'d', '^d$', 1, 'm');
+ regexp_count
+--------------
+ 1
+(1 row)
+
+SELECT regexp_count('Hello'||CHR(10)||'world!', '^world!$', 1, 'm'); -- 1
+ regexp_count
+--------------
+ 1
+(1 row)
+
+-- Count the number of occurrences of the substring an in the string.
+SELECT regexp_count('a man, a plan, a canal: Panama', 'an');
+ regexp_count
+--------------
+ 4
+(1 row)
+
+-- Find the number of occurrences of the substring an in the string starting with the fifth character.
+SELECT regexp_count('a man, a plan, a canal: Panama', 'an',5);
+ regexp_count
+--------------
+ 3
+(1 row)
+
+-- Find the number of occurrences of a substring containing a lower-case character
+-- followed by an. In the first example, do not use a modifier. In the second example,
+-- use the i modifier to force the regular expression to ignore case.
+SELECT regexp_count('a man, a plan, a canal: Panama', '[a-z]an');
+ regexp_count
+--------------
+ 3
+(1 row)
+
+SELECT regexp_count('a man, a plan, a canal: Panama', '[a-z]an', 1, 'i');
+ regexp_count
+--------------
+ 4
+(1 row)
+
+DROP TABLE IF EXISTS regexp_temp;
+NOTICE: table "regexp_temp" does not exist, skipping
+CREATE TABLE regexp_temp(fullname varchar(20), email varchar(20));
+INSERT INTO regexp_temp (fullname, email) VALUES ('John Doe', '[email protected]');
+INSERT INTO regexp_temp (fullname, email) VALUES ('Jane Doe', 'janedoe');
+-- count matches case sensitive
+SELECT fullname, regexp_count(fullname, 'e', 1, 'c') FROM regexp_temp;
+ fullname | regexp_count
+----------+--------------
+ John Doe | 1
+ Jane Doe | 2
+(2 rows)
+
+SELECT fullname, regexp_count(fullname, 'D', 1, 'c') FROM regexp_temp;
+ fullname | regexp_count
+----------+--------------
+ John Doe | 1
+ Jane Doe | 1
+(2 rows)
+
+SELECT fullname, regexp_count(fullname, 'd', 1, 'c') FROM regexp_temp;
+ fullname | regexp_count
+----------+--------------
+ John Doe | 0
+ Jane Doe | 0
+(2 rows)
+
+-- count matches case insensitive
+SELECT fullname, regexp_count(fullname, 'E', 1, 'i') FROM regexp_temp;
+ fullname | regexp_count
+----------+--------------
+ John Doe | 1
+ Jane Doe | 2
+(2 rows)
+
+SELECT fullname, regexp_count(fullname, 'do', 1, 'i') FROM regexp_temp;
+ fullname | regexp_count
+----------+--------------
+ John Doe | 1
+ Jane Doe | 1
+(2 rows)
+
+-- return the start position of the 6th occurence starting at beginning of the string
+SELECT regexp_instr('number of your street, zipcode thetown, FR', '[^ ]+', 1, 6);
+ regexp_instr
+--------------
+ 32
+(1 row)
+
+-- return the start position of the 5th occurence starting after the first word
+SELECT regexp_instr('number of your street, zipcode thetown, FR', '[^ ]+', 7, 5, 0);
+ regexp_instr
+--------------
+ 32
+(1 row)
+
+-- return the ending position of the 5th occurence starting after the first word
+SELECT regexp_instr('number of your street, zipcode thetown, FR', '[^ ]+', 7, 5, 0);
+ regexp_instr
+--------------
+ 32
+(1 row)
+
+-- return the ending position of the 2nd occurence starting after the first word
+SELECT regexp_instr('number of your street, zipcode thetown, FR', '[T|Z|S][[:alpha:]]{5}', 7, 2, 1, 'i');
+ regexp_instr
+--------------
+ 30
+(1 row)
+
+-- return the starting position corresponding to the different capture group
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 0);
+ regexp_instr
+--------------
+ 1
+(1 row)
+
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 1);
+ regexp_instr
+--------------
+ 1
+(1 row)
+
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 2);
+ regexp_instr
+--------------
+ 4
+(1 row)
+
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 3);
+ regexp_instr
+--------------
+ 5
+(1 row)
+
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 4);
+ regexp_instr
+--------------
+ 7
+(1 row)
+
+-- return the starting position corresponding to a non existant capture group
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 5);
+ regexp_instr
+--------------
+ 0
+(1 row)
+
+-- Same but with the ending position
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 1, 'i', 0);
+ regexp_instr
+--------------
+ 9
+(1 row)
+
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 1, 'i', 1);
+ regexp_instr
+--------------
+ 4
+(1 row)
+
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 1, 'i', 2);
+ regexp_instr
+--------------
+ 9
+(1 row)
+
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 1, 'i', 3);
+ regexp_instr
+--------------
+ 7
+(1 row)
+
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 1, 'i', 4);
+ regexp_instr
+--------------
+ 9
+(1 row)
+
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 1, 'i', 5);
+ regexp_instr
+--------------
+ 0
+(1 row)
+
+-- start position of a valid email
+SELECT email, regexp_instr(email, '\w+@\w+(\.\w+)+') "valid_email" FROM regexp_temp;
+ email | valid_email
+---------------------+-------------
+ [email protected] | 1
+ janedoe | 0
+(2 rows)
+
+-- ending position of a valid email
+SELECT email, regexp_instr(email, '\w+@\w+(\.\w+)+', 1, 1, 1) "valid_email" FROM regexp_temp;
+ email | valid_email
+---------------------+-------------
+ [email protected] | 20
+ janedoe | 0
+(2 rows)
+
+-- start position of first capture group in the email (the dot part)
+SELECT email, regexp_instr(email, '\w+@\w+(\.\w+)+', 1, 1, 0, 'i', 1) FROM regexp_temp;
+ email | regexp_instr
+---------------------+--------------
+ [email protected] | 16
+ janedoe | 0
+(2 rows)
+
+-- ending position of first capture group in the email (the dot part)
+SELECT email, regexp_instr(email, '\w+@\w+(\.\w+)+', 1, 1, 1, 'i', 1) FROM regexp_temp;
+ email | regexp_instr
+---------------------+--------------
+ [email protected] | 20
+ janedoe | 0
+(2 rows)
+
+-- test negative values
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', -1, 1, 1, 'i', 1);
+ERROR: invalid value for parameter "start_position": -1
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, -1, 1, 'i', 1);
+ERROR: invalid value for parameter "occurrence": -1
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, -1, 'i', 1);
+ERROR: invalid value for parameter "return_option": -1
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 1, 'i', -1);
+ERROR: invalid value for parameter "group": -1
+-- Find the first occurrence of a sequence of letters starting with the letter e
+-- and ending with the letter y in the phrase "easy come, easy go."
+SELECT regexp_instr('easy come, easy go','e\w*y');
+ regexp_instr
+--------------
+ 1
+(1 row)
+
+-- Find the first sequence of letters starting with the letter e and ending with
+-- the letter y in the string "easy come, easy go" starting at the second character
+SELECT regexp_instr('easy come, easy go','e\w*y',2);
+ regexp_instr
+--------------
+ 12
+(1 row)
+
+-- Find the second sequence of letters starting with the letter e and ending with
+-- the letter y in the string "easy come, easy go" starting at the first character.
+SELECT regexp_instr('easy come, easy go','e\w*y',1,2);
+ regexp_instr
+--------------
+ 12
+(1 row)
+
+-- Find the position of the first character after the first whitespace in the string "easy come, easy go."
+SELECT regexp_instr('easy come, easy go','\s',1,1,1);
+ regexp_instr
+--------------
+ 6
+(1 row)
+
+-- Find the position of the start of the third word in a string by capturing each
+-- word as a subexpression, and returning the third subexpression's start position.
+SELECT regexp_instr('one two three','(\w+)\s+(\w+)\s+(\w+)', 1,1,0,'',3);
+ regexp_instr
+--------------
+ 9
+(1 row)
+
+-- return the substring matching the regexp
+SELECT regexp_substr('number of your street, zipcode town, FR', ',[^,]+');
+ regexp_substr
+----------------
+ , zipcode town
+(1 row)
+
+-- return the substring matching the regexp
+SELECT regexp_substr('number of your street, zipcode town, FR', ',[^,]+', 24);
+ regexp_substr
+---------------
+ , FR
+(1 row)
+
+-- return the substring matching the regexp at the first occurrence
+SELECT regexp_substr('number of your street, zipcode town, FR', ',[^,]+', 1, 1);
+ regexp_substr
+----------------
+ , zipcode town
+(1 row)
+
+-- return the substring matching the regexp at the second occurrence
+SELECT regexp_substr('number of your street, zipcode town, FR', ',[^,]+', 1, 2);
+ regexp_substr
+---------------
+ , FR
+(1 row)
+
+-- case sensitivity substring search
+SELECT regexp_substr('number of your street, zipcode town, FR', ',\s+[Zf][^,]+', 1, 1);
+ regexp_substr
+---------------
+ \N
+(1 row)
+
+SELECT regexp_substr('number of your street, zipcode town, FR', ',\s+[Zf][^,]+', 1, 1, 'i');
+ regexp_substr
+----------------
+ , zipcode town
+(1 row)
+
+-- case sensitivity substring search with no capture group
+SELECT regexp_substr('number of your street, zipcode town, FR', ',\s+[Zf][^,]+', 1, 1, 'i', 0);
+ regexp_substr
+----------------
+ , zipcode town
+(1 row)
+
+-- case sensitivity substring search with non existing capture group
+SELECT regexp_substr('number of your street, zipcode town, FR', ',\s+[Zf][^,]+', 1, 1, 'i', 0);
+ regexp_substr
+----------------
+ , zipcode town
+(1 row)
+
+-- return the substring matching the regexp at different occurrence and capture group
+SELECT regexp_substr('1234567890 1234567890', '(123)(4(56)(78))', 1, 1, 'i', 4);
+ regexp_substr
+---------------
+ 78
+(1 row)
+
+SELECT regexp_substr('1234567890 1234557890', '(123)(4(5[56])(78))', 1, 2, 'i', 3);
+ regexp_substr
+---------------
+ 55
+(1 row)
+
+SELECT regexp_substr('1234567890 1234567890', '(123)(4(56)(78))', 1, 1, 'i', 0);
+ regexp_substr
+---------------
+ 12345678
+(1 row)
+
+-- test negative values
+SELECT regexp_substr('1234567890 1234567890', '(123)(4(56)(78))', -1, 1, 'i', 4);
+ERROR: invalid value for parameter "start_position": -1
+SELECT regexp_substr('1234567890 1234567890', '(123)(4(56)(78))', 1, -1, 'i', 4);
+ERROR: invalid value for parameter "occurrence": -1
+SELECT regexp_substr('1234567890 1234567890', '(123)(4(56)(78))', 1, 1, 'i', -4);
+ERROR: invalid value for parameter "group": -4
+-- Select the first substring of letters that end with "thy."
+SELECT regexp_substr('healthy, wealthy, and wise','\w+thy');
+ regexp_substr
+---------------
+ healthy
+(1 row)
+
+-- Select the first substring of letters that ends with "thy" starting at the second character in the string.
+SELECT regexp_substr('healthy, wealthy, and wise','\w+thy',2);
+ regexp_substr
+---------------
+ ealthy
+(1 row)
+
+-- Return the contents of the third captured subexpression, which captures the third word in the string.
+SELECT regexp_substr('one two three', '(\w+)\s+(\w+)\s+(\w+)', 1, 1, '', 3);
+ regexp_substr
+---------------
+ three
+(1 row)
+
+DROP TABLE IF EXISTS regexp_temp;
+-- Regression tests for extended regexp_replace() function with start position and occurrence
+SELECT regexp_replace('512.123.4567', '([[:digit:]]{3})\.([[:digit:]]{3})\.([[:digit:]]{4})', '(\1) \2-\3', 1);
+ regexp_replace
+----------------
+ (512) 123-4567
+(1 row)
+
+SELECT regexp_replace('512.123.4567 612.123.4567', '([[:digit:]]{3})\.([[:digit:]]{3})\.([[:digit:]]{4})', '(\1) \2-\3', 1, 0);
+ regexp_replace
+-------------------------------
+ (512) 123-4567 (612) 123-4567
+(1 row)
+
+SELECT regexp_replace('number your street,'||CHR(10)||' zipcode town, FR', '( ){2,}', ' ', 1, 0);
+ regexp_replace
+---------------------
+ number your street,+
+ zipcode town, FR
+(1 row)
+
+SELECT regexp_replace('number your street, zipcode town, FR', '( ){2,}', ' ', 9);
+ regexp_replace
+--------------------------------------------
+ number your street, zipcode town, FR
+(1 row)
+
+SELECT regexp_replace('number your street, zipcode town, FR', '( ){2,}', ' ', 9, 0);
+ regexp_replace
+----------------------------------------
+ number your street, zipcode town, FR
+(1 row)
+
+SELECT regexp_replace('number your street, zipcode town, FR', '( ){2,}', ' ', 9, 2);
+ regexp_replace
+---------------------------------------------
+ number your street, zipcode town, FR
+(1 row)
+
+SELECT regexp_replace('number your street, zipcode town, FR', '( ){2,}', ' ', 9, 2, 'm');
+ regexp_replace
+---------------------------------------------
+ number your street, zipcode town, FR
+(1 row)
+
+SELECT regexp_replace('number your street, zipcode town, FR', '([EURT]){2,}', '[\1]', 9, 1, 'i');
+ regexp_replace
+-------------------------------------------------
+ number yo[r] street, zipcode town, FR
+(1 row)
+
+SELECT regexp_replace('number your street, zipcode town, FR', '([EURT]){2,}', '[\1]', 9, 2, 'i');
+ regexp_replace
+----------------------------------------------
+ number your s[t], zipcode town, FR
+(1 row)
+
+SELECT regexp_replace ('A PostgreSQL function', 'A|e|i|o|u', 'X', 1, 2);
+ regexp_replace
+-----------------------
+ A PXstgreSQL function
+(1 row)
+
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 0, 'i');
+ regexp_replace
+-----------------------
+ X PXstgrXSQL fXnctXXn
+(1 row)
+
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 1, 'i');
+ regexp_replace
+-----------------------
+ X PostgreSQL function
+(1 row)
+
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 2, 'i');
+ regexp_replace
+-----------------------
+ A PXstgreSQL function
+(1 row)
+
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 3, 'i');
+ regexp_replace
+-----------------------
+ A PostgrXSQL function
+(1 row)
+
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 9, 'i');
+ regexp_replace
+-----------------------
+ A PostgreSQL function
+(1 row)
+
+SELECT regexp_replace ('A PostgreSQL function', 'A|e|i|o|u', 'X', 1, 9);
+ regexp_replace
+-----------------------
+ A PostgreSQL function
+(1 row)
+
+-- Invalid parameter values
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', -1, 0, 'i');
+ERROR: invalid value for parameter "start_position": -1
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, -1, 'i');
+ERROR: invalid value for parameter "occurrence": -1
+-- Modifier 'g' should not be taken in account, we have an occurrence to replace
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 1, 'g');
+ regexp_replace
+-----------------------
+ A PXstgreSQL function
+(1 row)
+
+-- Find groups of "word characters" (letters, numbers and underscore) ending with
+-- "thy" in the string "healthy, wealthy, and wise" and replace them with nothing.
+SELECT regexp_replace('healthy, wealthy, and wise','\w+thy', '', 'g');
+ regexp_replace
+----------------
+ , , and wise
+(1 row)
+
+SELECT regexp_replace('healthy, wealthy, and wise','\w+thy', '', 1, 0);
+ regexp_replace
+----------------
+ , , and wise
+(1 row)
+
+SELECT regexp_replace('healthy, wealthy, and wise','\w+thy', '', 1, 0, 'g');
+ regexp_replace
+----------------
+ , , and wise
+(1 row)
+
+-- Find groups of word characters ending with "thy" and replace with "something."
+SELECT regexp_replace('healthy, wealthy, and wise','\w+thy', 'something', 'g');
+ regexp_replace
+--------------------------------
+ something, something, and wise
+(1 row)
+
+-- Find groups of word characters ending with "thy" and replace with the string
+-- "something" starting at the third character in the string.
+SELECT regexp_replace('healthy, wealthy, and wise','\w+thy', 'something', 3, 0);
+ regexp_replace
+----------------------------------
+ hesomething, something, and wise
+(1 row)
+
+-- Replace the second group of word characters ending with "thy" with "something."
+SELECT regexp_replace('healthy, wealthy, and wise','\w+thy', 'something', 1, 2);
+ regexp_replace
+------------------------------
+ healthy, something, and wise
+(1 row)
+
+-- Find groups of word characters ending with "thy" capturing the letters before
+-- the "thy", and replace with the captured letters plus the letters "ish."
+SELECT regexp_replace('healthy, wealthy, and wise','(\w+)thy', '\1ish', 'g');
+ regexp_replace
+----------------------------
+ healish, wealish, and wise
+(1 row)
+
+SELECT regexp_replace('healthy, wealthy, and wise','(\w+)thy', '\1ish', 1, 0);
+ regexp_replace
+----------------------------
+ healish, wealish, and wise
+(1 row)
+
+-- Ambiguous use of option parameter with regexp_replace()
+PREPARE rr AS SELECT regexp_replace('healthy, wealthy, and wise','(\w+)thy', '\1ish', $1);
+EXECUTE rr(1);
+ERROR: ambiguous use of the option parameter in regex_replace(), value: 1
+HINT: you might set the occurrence parameter to force the use of the extended form of regex_replace()
+DEALLOCATE rr;
-- change NULL-display back
\pset null ''
-- E021-11 position expression
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index 2c502534c2..f50131896f 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -255,6 +255,202 @@ SELECT regexp_split_to_array('thE QUick bROWn FOx jUMPs ovEr The lazy dOG', 'e',
SELECT foo, length(foo) FROM regexp_split_to_table('thE QUick bROWn FOx jUMPs ovEr The lazy dOG', 'e', 'g') AS foo;
SELECT regexp_split_to_array('thE QUick bROWn FOx jUMPs ovEr The lazy dOG', 'e', 'g');
+-- regexp_like tests
+SELECT regexp_like('a'||CHR(10)||'d', 'a.d');
+SELECT regexp_like('a'||CHR(10)||'d', 'a.d', 'm');
+SELECT regexp_like('a'||CHR(10)||'d', 'a.d', 'n');
+SELECT regexp_like('Steven', '^Ste(v|ph)en$');
+SELECT regexp_like('foo' || chr(10) || 'bar' || chr(10) || 'bequq' || chr(10) || 'baz', '^bar');
+SELECT regexp_like('foo' || chr(10) || 'bar' || chr(10) || 'bequq' || chr(10) || 'baz', 'bar');
+SELECT regexp_like('foo' || chr(10) || 'bar' || chr(10) || 'bequq' || chr(10) || 'baz', '^bar', 'm');
+SELECT regexp_like('foo' || chr(10) || 'bar' || chr(10) || 'bequq' || chr(10) || 'baz', '^bar', 'n');
+SELECT regexp_like('GREEN', '([aeiou])\1');
+SELECT regexp_like('GREEN', '([aeiou])\1', 'i');
+SELECT regexp_like('ORANGE' || chr(10) || 'GREEN', '([aeiou])\1', 'i');
+SELECT regexp_like('ORANGE' || chr(10) || 'GREEN', '^..([aeiou])\1', 'i');
+SELECT regexp_like('ORANGE' || chr(10) || 'GREEN', '([aeiou])\1', 'in');
+SELECT regexp_like('ORANGE' || chr(10) || 'GREEN', '^..([aeiou])\1', 'in');
+SELECT regexp_like('ORANGE' || chr(10) || 'GREEN', '^..([aeiou])\1', 'im');
+SELECT REGEXP_LIKE('abc', 'a b c');
+SELECT REGEXP_LIKE('abc', 'a b c','x');
+
+-- count all matches for regexp
+SELECT regexp_count('123123123123123', '(12)3');
+-- count all matches with start position
+SELECT regexp_count('123123123123', '123', 0);
+SELECT regexp_count('123123123123', '123', 1);
+SELECT regexp_count('123123123123', '123', 3);
+SELECT regexp_count('123123123123', '123', -3);
+-- count all matches in NULL string with a start position
+SELECT regexp_count(NULL, '123', 3);
+-- count all matches with a start position greater than string length
+SELECT regexp_count('123', '123', 10);
+-- count all matches from different regexp
+SELECT regexp_count('ABC123', '[A-Z]'), regexp_count('A1B2C3', '[A-Z]');
+SELECT regexp_count('ABC123', '[A-Z][0-9]'), regexp_count('A1B2C3', '[A-Z][0-9]');
+SELECT regexp_count('ABC123', '[A-Z][0-9]{2}'), regexp_count('A1B2C3', '[A-Z][0-9]{2}');
+SELECT regexp_count('ABC123', '([A-Z][0-9]){2}'), regexp_count('A1B2C3', '([A-Z][0-9]){2}');
+SELECT regexp_count('ABC123A5', '^[A-Z][0-9]'), regexp_count('A1B2C3', '^[A-Z][0-9]');
+SELECT regexp_count('ABC123', '[A-Z][0-9]{2}'), regexp_count('A1B2C34', '[A-Z][0-9]{2}');
+-- count matches with newline case insensivity
+SELECT regexp_count('a'||CHR(10)||'d', 'a.d');
+SELECT regexp_count('a'||CHR(10)||'d', 'a.d', 1, 's');
+-- count matches with newline case sensivity
+SELECT regexp_count('a'||CHR(10)||'d', 'a.d', 1, 'n');
+-- count not multiline matches
+SELECT regexp_count('a'||CHR(10)||'d', '^d$');
+-- count multiline matches
+SELECT regexp_count('a'||CHR(10)||'d', '^d$', 1, 'm');
+SELECT regexp_count('Hello'||CHR(10)||'world!', '^world!$', 1, 'm'); -- 1
+-- Count the number of occurrences of the substring an in the string.
+SELECT regexp_count('a man, a plan, a canal: Panama', 'an');
+-- Find the number of occurrences of the substring an in the string starting with the fifth character.
+SELECT regexp_count('a man, a plan, a canal: Panama', 'an',5);
+-- Find the number of occurrences of a substring containing a lower-case character
+-- followed by an. In the first example, do not use a modifier. In the second example,
+-- use the i modifier to force the regular expression to ignore case.
+SELECT regexp_count('a man, a plan, a canal: Panama', '[a-z]an');
+SELECT regexp_count('a man, a plan, a canal: Panama', '[a-z]an', 1, 'i');
+
+DROP TABLE IF EXISTS regexp_temp;
+CREATE TABLE regexp_temp(fullname varchar(20), email varchar(20));
+INSERT INTO regexp_temp (fullname, email) VALUES ('John Doe', '[email protected]');
+INSERT INTO regexp_temp (fullname, email) VALUES ('Jane Doe', 'janedoe');
+-- count matches case sensitive
+SELECT fullname, regexp_count(fullname, 'e', 1, 'c') FROM regexp_temp;
+SELECT fullname, regexp_count(fullname, 'D', 1, 'c') FROM regexp_temp;
+SELECT fullname, regexp_count(fullname, 'd', 1, 'c') FROM regexp_temp;
+-- count matches case insensitive
+SELECT fullname, regexp_count(fullname, 'E', 1, 'i') FROM regexp_temp;
+SELECT fullname, regexp_count(fullname, 'do', 1, 'i') FROM regexp_temp;
+
+-- return the start position of the 6th occurence starting at beginning of the string
+SELECT regexp_instr('number of your street, zipcode thetown, FR', '[^ ]+', 1, 6);
+-- return the start position of the 5th occurence starting after the first word
+SELECT regexp_instr('number of your street, zipcode thetown, FR', '[^ ]+', 7, 5, 0);
+-- return the ending position of the 5th occurence starting after the first word
+SELECT regexp_instr('number of your street, zipcode thetown, FR', '[^ ]+', 7, 5, 0);
+-- return the ending position of the 2nd occurence starting after the first word
+SELECT regexp_instr('number of your street, zipcode thetown, FR', '[T|Z|S][[:alpha:]]{5}', 7, 2, 1, 'i');
+-- return the starting position corresponding to the different capture group
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 0);
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 1);
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 2);
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 3);
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 4);
+-- return the starting position corresponding to a non existant capture group
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 0, 'i', 5);
+-- Same but with the ending position
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 1, 'i', 0);
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 1, 'i', 1);
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 1, 'i', 2);
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 1, 'i', 3);
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 1, 'i', 4);
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 1, 'i', 5);
+-- start position of a valid email
+SELECT email, regexp_instr(email, '\w+@\w+(\.\w+)+') "valid_email" FROM regexp_temp;
+-- ending position of a valid email
+SELECT email, regexp_instr(email, '\w+@\w+(\.\w+)+', 1, 1, 1) "valid_email" FROM regexp_temp;
+-- start position of first capture group in the email (the dot part)
+SELECT email, regexp_instr(email, '\w+@\w+(\.\w+)+', 1, 1, 0, 'i', 1) FROM regexp_temp;
+-- ending position of first capture group in the email (the dot part)
+SELECT email, regexp_instr(email, '\w+@\w+(\.\w+)+', 1, 1, 1, 'i', 1) FROM regexp_temp;
+-- test negative values
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', -1, 1, 1, 'i', 1);
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, -1, 1, 'i', 1);
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, -1, 'i', 1);
+SELECT regexp_instr('1234567890', '(123)(4(56)(78))', 1, 1, 1, 'i', -1);
+
+-- Find the first occurrence of a sequence of letters starting with the letter e
+-- and ending with the letter y in the phrase "easy come, easy go."
+SELECT regexp_instr('easy come, easy go','e\w*y');
+-- Find the first sequence of letters starting with the letter e and ending with
+-- the letter y in the string "easy come, easy go" starting at the second character
+SELECT regexp_instr('easy come, easy go','e\w*y',2);
+-- Find the second sequence of letters starting with the letter e and ending with
+-- the letter y in the string "easy come, easy go" starting at the first character.
+SELECT regexp_instr('easy come, easy go','e\w*y',1,2);
+-- Find the position of the first character after the first whitespace in the string "easy come, easy go."
+SELECT regexp_instr('easy come, easy go','\s',1,1,1);
+-- Find the position of the start of the third word in a string by capturing each
+-- word as a subexpression, and returning the third subexpression's start position.
+SELECT regexp_instr('one two three','(\w+)\s+(\w+)\s+(\w+)', 1,1,0,'',3);
+
+-- return the substring matching the regexp
+SELECT regexp_substr('number of your street, zipcode town, FR', ',[^,]+');
+-- return the substring matching the regexp
+SELECT regexp_substr('number of your street, zipcode town, FR', ',[^,]+', 24);
+-- return the substring matching the regexp at the first occurrence
+SELECT regexp_substr('number of your street, zipcode town, FR', ',[^,]+', 1, 1);
+-- return the substring matching the regexp at the second occurrence
+SELECT regexp_substr('number of your street, zipcode town, FR', ',[^,]+', 1, 2);
+-- case sensitivity substring search
+SELECT regexp_substr('number of your street, zipcode town, FR', ',\s+[Zf][^,]+', 1, 1);
+SELECT regexp_substr('number of your street, zipcode town, FR', ',\s+[Zf][^,]+', 1, 1, 'i');
+-- case sensitivity substring search with no capture group
+SELECT regexp_substr('number of your street, zipcode town, FR', ',\s+[Zf][^,]+', 1, 1, 'i', 0);
+-- case sensitivity substring search with non existing capture group
+SELECT regexp_substr('number of your street, zipcode town, FR', ',\s+[Zf][^,]+', 1, 1, 'i', 0);
+-- return the substring matching the regexp at different occurrence and capture group
+SELECT regexp_substr('1234567890 1234567890', '(123)(4(56)(78))', 1, 1, 'i', 4);
+SELECT regexp_substr('1234567890 1234557890', '(123)(4(5[56])(78))', 1, 2, 'i', 3);
+SELECT regexp_substr('1234567890 1234567890', '(123)(4(56)(78))', 1, 1, 'i', 0);
+-- test negative values
+SELECT regexp_substr('1234567890 1234567890', '(123)(4(56)(78))', -1, 1, 'i', 4);
+SELECT regexp_substr('1234567890 1234567890', '(123)(4(56)(78))', 1, -1, 'i', 4);
+SELECT regexp_substr('1234567890 1234567890', '(123)(4(56)(78))', 1, 1, 'i', -4);
+-- Select the first substring of letters that end with "thy."
+SELECT regexp_substr('healthy, wealthy, and wise','\w+thy');
+-- Select the first substring of letters that ends with "thy" starting at the second character in the string.
+SELECT regexp_substr('healthy, wealthy, and wise','\w+thy',2);
+-- Return the contents of the third captured subexpression, which captures the third word in the string.
+SELECT regexp_substr('one two three', '(\w+)\s+(\w+)\s+(\w+)', 1, 1, '', 3);
+
+DROP TABLE IF EXISTS regexp_temp;
+
+-- Regression tests for extended regexp_replace() function with start position and occurrence
+SELECT regexp_replace('512.123.4567', '([[:digit:]]{3})\.([[:digit:]]{3})\.([[:digit:]]{4})', '(\1) \2-\3', 1);
+SELECT regexp_replace('512.123.4567 612.123.4567', '([[:digit:]]{3})\.([[:digit:]]{3})\.([[:digit:]]{4})', '(\1) \2-\3', 1, 0);
+SELECT regexp_replace('number your street,'||CHR(10)||' zipcode town, FR', '( ){2,}', ' ', 1, 0);
+SELECT regexp_replace('number your street, zipcode town, FR', '( ){2,}', ' ', 9);
+SELECT regexp_replace('number your street, zipcode town, FR', '( ){2,}', ' ', 9, 0);
+SELECT regexp_replace('number your street, zipcode town, FR', '( ){2,}', ' ', 9, 2);
+SELECT regexp_replace('number your street, zipcode town, FR', '( ){2,}', ' ', 9, 2, 'm');
+SELECT regexp_replace('number your street, zipcode town, FR', '([EURT]){2,}', '[\1]', 9, 1, 'i');
+SELECT regexp_replace('number your street, zipcode town, FR', '([EURT]){2,}', '[\1]', 9, 2, 'i');
+SELECT regexp_replace ('A PostgreSQL function', 'A|e|i|o|u', 'X', 1, 2);
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 0, 'i');
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 1, 'i');
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 2, 'i');
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 3, 'i');
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 9, 'i');
+SELECT regexp_replace ('A PostgreSQL function', 'A|e|i|o|u', 'X', 1, 9);
+-- Invalid parameter values
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', -1, 0, 'i');
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, -1, 'i');
+-- Modifier 'g' should not be taken in account, we have an occurrence to replace
+SELECT regexp_replace ('A PostgreSQL function', 'a|e|i|o|u', 'X', 1, 1, 'g');
+-- Find groups of "word characters" (letters, numbers and underscore) ending with
+-- "thy" in the string "healthy, wealthy, and wise" and replace them with nothing.
+SELECT regexp_replace('healthy, wealthy, and wise','\w+thy', '', 'g');
+SELECT regexp_replace('healthy, wealthy, and wise','\w+thy', '', 1, 0);
+SELECT regexp_replace('healthy, wealthy, and wise','\w+thy', '', 1, 0, 'g');
+-- Find groups of word characters ending with "thy" and replace with "something."
+SELECT regexp_replace('healthy, wealthy, and wise','\w+thy', 'something', 'g');
+-- Find groups of word characters ending with "thy" and replace with the string
+-- "something" starting at the third character in the string.
+SELECT regexp_replace('healthy, wealthy, and wise','\w+thy', 'something', 3, 0);
+-- Replace the second group of word characters ending with "thy" with "something."
+SELECT regexp_replace('healthy, wealthy, and wise','\w+thy', 'something', 1, 2);
+-- Find groups of word characters ending with "thy" capturing the letters before
+-- the "thy", and replace with the captured letters plus the letters "ish."
+SELECT regexp_replace('healthy, wealthy, and wise','(\w+)thy', '\1ish', 'g');
+SELECT regexp_replace('healthy, wealthy, and wise','(\w+)thy', '\1ish', 1, 0);
+-- Ambiguous use of option parameter with regexp_replace()
+PREPARE rr AS SELECT regexp_replace('healthy, wealthy, and wise','(\w+)thy', '\1ish', $1);
+EXECUTE rr(1);
+DEALLOCATE rr;
+
-- change NULL-display back
\pset null ''