On Wed, Apr 3, 2024 at 4:45 AM Tom Lane <[email protected]> wrote:
>
> jian he <[email protected]> writes:
> > On Thu, Jan 18, 2024 at 4:17 PM Peter Eisentraut <[email protected]>
> > wrote:
> >> Reading back through the discussion, I wasn't quite able to interpret
> >> the resolution regarding Oracle compatibility. From the patch, it looks
> >> like you chose not to adopt the parameter names from Oracle. Was that
> >> your intention?
>
> > per committee message:
> > https://git.postgresql.org/cgit/postgresql.git/commit/?id=6424337073589476303b10f6d7cc74f501b8d9d7
> > Even if the names are all the same, our function is still not the same
> > as oracle.
>
> The fact that there's minor discrepancies in the regex languages
> doesn't seem to me to have a lot of bearing on whether we should
> follow Oracle's choices of parameter names.
>
> However, if we do follow Oracle, it seems like we should do that
> consistently, which this patch doesn't. For instance, per [1]
> Oracle calls the arguments of regex_substr
>
> source_char,
> pattern,
> position,
> occurrence,
> match_param,
> subexpr
>
> while we have
>
> string,
> pattern,
> start,
> N,
> flags,
> subexpr
>
> The patch proposes to replace "N" with "occurrence" but not touch
> the other discrepancies, which seems to me to be a pretty poor
> choice. "occurrence" is very long and difficult to spell correctly,
> and if you're not following Oracle slavishly, exactly what is the
> argument in its favor? I quite agree that Oracle's other choices
> aren't improvements over ours, but neither is that one.
>
> On the whole my inclination would be to stick to the names we have
> in the documentation. There might be an argument for changing "N"
> to something lower-case so you don't have to quote it; but if we do,
> I'd go for, say, "count".
>
we have
---------------------------------------------------------------
The replacement string can contain \n, where n is 1 through 9, to
indicate that the source substring matching the n'th parenthesized
subexpression of the pattern should be inserted, and it can contain \&
to indicate that the substring matching the entire pattern should be
inserted.
----------------------------------------------------------------------------
in the regexp_replace explanation section.
changing "N" to lower-case would be misleading for regexp_replace?
so I choose "count".
By the way, I think the above is so hard to comprehend.
I can only find related test in src/test/regress/sql/strings.sql are:
SELECT regexp_replace('1112223333', E'(\\d{3})(\\d{3})(\\d{4})',
E'(\\1) \\2-\\3');
SELECT regexp_replace('foobarrbazz', E'(.)\\1', E'X\\&Y', 'g');
SELECT regexp_replace('foobarrbazz', E'(.)\\1', E'X\\\\Y', 'g');
but these tests seem not friendly.
maybe we should have some simple examples to demonstrate the above paragraph.
From 372a0c6cb894194b819fc380efda179bf6d1055d Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Thu, 4 Apr 2024 21:50:13 +0800
Subject: [PATCH v4 1/1] add regex functions argument names.
Specifically add function argument names to the following funtions:
regexp_replace,
regexp_match,
regexp_matches,
regexp_count,
regexp_instr,
regexp_like,
regexp_substr,
regexp_split_to_table,
regexp_split_to_array
So it would be easier to understand these functions in psql via \df.
now these functions can be called in different notaions.
discussion: https://postgr.es/m/CACJufxG3NFKKsh6x4fRLv8h3V-HvN4W5dA%3DzNKMxsNcDwOKang%40mail.gmail.com
---
doc/src/sgml/func.sgml | 50 +++++++++++------------
src/include/catalog/pg_proc.dat | 71 ++++++++++++++++++++++++++-------
2 files changed, 82 insertions(+), 39 deletions(-)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index ff690113..57ad1624 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3332,7 +3332,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in
</indexterm>
<function>regexp_instr</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>
[, <parameter>start</parameter> <type>integer</type>
- [, <parameter>N</parameter> <type>integer</type>
+ [, <parameter>count</parameter> <type>integer</type>
[, <parameter>endoption</parameter> <type>integer</type>
[, <parameter>flags</parameter> <type>text</type>
[, <parameter>subexpr</parameter> <type>integer</type> ] ] ] ] ] )
@@ -3340,7 +3340,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in
</para>
<para>
Returns the position within <parameter>string</parameter> where
- the <parameter>N</parameter>'th match of the POSIX regular
+ the <parameter>count</parameter>'th match of the POSIX regular
expression <parameter>pattern</parameter> occurs, or zero if there is
no such match; see <xref linkend="functions-posix-regexp"/>.
</para>
@@ -3446,14 +3446,14 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in
<entry role="func_table_entry"><para role="func_signature">
<function>regexp_replace</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>, <parameter>replacement</parameter> <type>text</type>,
<parameter>start</parameter> <type>integer</type>,
- <parameter>N</parameter> <type>integer</type>
+ <parameter>count</parameter> <type>integer</type>
[, <parameter>flags</parameter> <type>text</type> ] )
<returnvalue>text</returnvalue>
</para>
<para>
- Replaces the substring that is the <parameter>N</parameter>'th
+ Replaces the substring that is the <parameter>count</parameter>'th
match to the POSIX regular expression <parameter>pattern</parameter>,
- or all such matches if <parameter>N</parameter> is zero; see
+ or all such matches if <parameter>count</parameter> is zero; see
<xref linkend="functions-posix-regexp"/>.
</para>
<para>
@@ -3511,14 +3511,14 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in
</indexterm>
<function>regexp_substr</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>
[, <parameter>start</parameter> <type>integer</type>
- [, <parameter>N</parameter> <type>integer</type>
+ [, <parameter>count</parameter> <type>integer</type>
[, <parameter>flags</parameter> <type>text</type>
[, <parameter>subexpr</parameter> <type>integer</type> ] ] ] ] )
<returnvalue>text</returnvalue>
</para>
<para>
Returns the substring within <parameter>string</parameter> that
- matches the <parameter>N</parameter>'th occurrence of the POSIX
+ matches the <parameter>count</parameter>'th occurrence of the POSIX
regular expression <parameter>pattern</parameter>,
or <literal>NULL</literal> if there is no such match; see
<xref linkend="functions-posix-regexp"/>.
@@ -5921,13 +5921,13 @@ regexp_count('ABCABCAXYaxy', 'A.', 1, 'i') <lineannotation>4</lineannotation>
<para>
The <function>regexp_instr</function> function returns the starting or
- ending position of the <replaceable>N</replaceable>'th match of a
+ ending position of the <replaceable>count</replaceable>'th match of a
POSIX regular expression pattern to a string, or zero if there is no
such match. It has the syntax
<function>regexp_instr</function>(<replaceable>string</replaceable>,
<replaceable>pattern</replaceable>
<optional>, <replaceable>start</replaceable>
- <optional>, <replaceable>N</replaceable>
+ <optional>, <replaceable>count</replaceable>
<optional>, <replaceable>endoption</replaceable>
<optional>, <replaceable>flags</replaceable>
<optional>, <replaceable>subexpr</replaceable>
@@ -5936,8 +5936,8 @@ regexp_count('ABCABCAXYaxy', 'A.', 1, 'i') <lineannotation>4</lineannotation>
in <replaceable>string</replaceable>, normally from the beginning of
the string, but if the <replaceable>start</replaceable> parameter is
provided then beginning from that character index.
- If <replaceable>N</replaceable> is specified
- then the <replaceable>N</replaceable>'th match of the pattern
+ If <replaceable>count</replaceable> is specified
+ then the <replaceable>count</replaceable>'th match of the pattern
is located, otherwise the first match is located.
If the <replaceable>endoption</replaceable> parameter is omitted or
specified as zero, the function returns the position of the first
@@ -6057,8 +6057,8 @@ SELECT (regexp_match('foobarbequebaz', 'bar.*que'))[1];
expression pattern to a string. It has the same syntax as
<function>regexp_match</function>.
This function returns no rows if there is no match, one row if there is
- a match and the <literal>g</literal> flag is not given, or <replaceable>N</replaceable>
- rows if there are <replaceable>N</replaceable> matches and the <literal>g</literal> flag
+ a match and the <literal>g</literal> flag is not given, or <replaceable>count</replaceable>
+ rows if there are <replaceable>count</replaceable> matches and the <literal>g</literal> flag
is given. Each returned row is a text array containing the whole
matched substring or the substrings matching parenthesized
subexpressions of the <replaceable>pattern</replaceable>, just as described above
@@ -6109,18 +6109,18 @@ SELECT col1, (SELECT regexp_matches(col2, '(bar)(beque)')) FROM tab;
The <function>regexp_replace</function> function provides substitution of
new text for substrings that match POSIX regular expression patterns.
It has the syntax
- <function>regexp_replace</function>(<replaceable>source</replaceable>,
+ <function>regexp_replace</function>(<replaceable>string</replaceable>,
<replaceable>pattern</replaceable>, <replaceable>replacement</replaceable>
<optional>, <replaceable>start</replaceable>
- <optional>, <replaceable>N</replaceable>
+ <optional>, <replaceable>count</replaceable>
</optional></optional>
<optional>, <replaceable>flags</replaceable> </optional>).
- (Notice that <replaceable>N</replaceable> cannot be specified
+ (Notice that <replaceable>count</replaceable> cannot be specified
unless <replaceable>start</replaceable> is,
but <replaceable>flags</replaceable> can be given in any case.)
- The <replaceable>source</replaceable> string is returned unchanged if
+ The <replaceable>string</replaceable> is returned unchanged if
there is no match to the <replaceable>pattern</replaceable>. If there is a
- match, the <replaceable>source</replaceable> string is returned with the
+ match, the <replaceable>string</replaceable> is returned with the
<replaceable>replacement</replaceable> string substituted for the matching
substring. The <replaceable>replacement</replaceable> string can contain
<literal>\</literal><replaceable>n</replaceable>, where <replaceable>n</replaceable> is 1
@@ -6135,14 +6135,14 @@ SELECT col1, (SELECT regexp_matches(col2, '(bar)(beque)')) FROM tab;
the string, but if the <replaceable>start</replaceable> parameter is
provided then beginning from that character index.
By default, only the first match of the pattern is replaced.
- If <replaceable>N</replaceable> is specified and is greater than zero,
- then the <replaceable>N</replaceable>'th match of the pattern
+ If <replaceable>count</replaceable> is specified and is greater than zero,
+ then the <replaceable>count</replaceable>'th match of the pattern
is replaced.
If the <literal>g</literal> flag is given, or
- if <replaceable>N</replaceable> is specified and is zero, then all
+ if <replaceable>count</replaceable> is specified and is zero, then all
matches at or after the <replaceable>start</replaceable> position are
replaced. (The <literal>g</literal> flag is ignored
- when <replaceable>N</replaceable> is specified.)
+ when <replaceable>count</replaceable> is specified.)
The <replaceable>flags</replaceable> parameter is an optional text
string containing zero or more single-letter flags that change the
function's behavior. Supported flags (though
@@ -6253,7 +6253,7 @@ SELECT foo FROM regexp_split_to_table('the quick brown fox', '\s*') AS foo;
<function>regexp_substr</function>(<replaceable>string</replaceable>,
<replaceable>pattern</replaceable>
<optional>, <replaceable>start</replaceable>
- <optional>, <replaceable>N</replaceable>
+ <optional>, <replaceable>count</replaceable>
<optional>, <replaceable>flags</replaceable>
<optional>, <replaceable>subexpr</replaceable>
</optional></optional></optional></optional>).
@@ -6261,8 +6261,8 @@ SELECT foo FROM regexp_split_to_table('the quick brown fox', '\s*') AS foo;
in <replaceable>string</replaceable>, normally from the beginning of
the string, but if the <replaceable>start</replaceable> parameter is
provided then beginning from that character index.
- If <replaceable>N</replaceable> is specified
- then the <replaceable>N</replaceable>'th match of the pattern
+ If <replaceable>count</replaceable> is specified
+ then the <replaceable>count</replaceable>'th match of the pattern
is returned, otherwise the first match is returned.
The <replaceable>flags</replaceable> parameter is an optional text
string containing zero or more single-letter flags that change the
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 153d816a..be28950c 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -3623,105 +3623,148 @@
prosrc => 'replace_text' },
{ oid => '2284', descr => 'replace text using regexp',
proname => 'regexp_replace', prorettype => 'text',
- proargtypes => 'text text text', prosrc => 'textregexreplace_noopt' },
+ proargtypes => 'text text text',
+ proargnames => '{string, pattern, replacement}',
+ prosrc => 'textregexreplace_noopt' },
{ oid => '2285', descr => 'replace text using regexp',
proname => 'regexp_replace', prorettype => 'text',
- proargtypes => 'text text text text', prosrc => 'textregexreplace' },
+ proargtypes => 'text text text text',
+ proargnames => '{string, pattern, replacement, flags}',
+ prosrc => 'textregexreplace' },
{ oid => '6251', descr => 'replace text using regexp',
proname => 'regexp_replace', prorettype => 'text',
proargtypes => 'text text text int4 int4 text',
+ proargnames => '{string, pattern, replacement, start, count, flags}',
prosrc => 'textregexreplace_extended' },
{ oid => '6252', descr => 'replace text using regexp',
proname => 'regexp_replace', prorettype => 'text',
proargtypes => 'text text text int4 int4',
+ proargnames => '{string, pattern, replacement, start, count}',
prosrc => 'textregexreplace_extended_no_flags' },
{ oid => '6253', descr => 'replace text using regexp',
proname => 'regexp_replace', prorettype => 'text',
proargtypes => 'text text text int4',
+ proargnames => '{string, pattern, replacement, start}',
prosrc => 'textregexreplace_extended_no_n' },
{ oid => '3396', descr => 'find first match for regexp',
proname => 'regexp_match', prorettype => '_text', proargtypes => 'text text',
+ proargnames => '{string, pattern}',
prosrc => 'regexp_match_no_flags' },
{ oid => '3397', descr => 'find first match for regexp',
proname => 'regexp_match', prorettype => '_text',
- proargtypes => 'text text text', prosrc => 'regexp_match' },
+ proargtypes => 'text text text',
+ proargnames => '{string, pattern, flags}',
+ prosrc => 'regexp_match' },
{ oid => '2763', descr => 'find match(es) for regexp',
proname => 'regexp_matches', prorows => '1', proretset => 't',
prorettype => '_text', proargtypes => 'text text',
+ proargnames => '{string, pattern}',
prosrc => 'regexp_matches_no_flags' },
{ oid => '2764', descr => 'find match(es) for regexp',
proname => 'regexp_matches', prorows => '10', proretset => 't',
prorettype => '_text', proargtypes => 'text text text',
+ proargnames => '{string, pattern, flags}',
prosrc => 'regexp_matches' },
{ oid => '6254', descr => 'count regexp matches',
proname => 'regexp_count', prorettype => 'int4', proargtypes => 'text text',
+ proargnames => '{string, pattern}',
prosrc => 'regexp_count_no_start' },
{ oid => '6255', descr => 'count regexp matches',
proname => 'regexp_count', prorettype => 'int4',
- proargtypes => 'text text int4', prosrc => 'regexp_count_no_flags' },
+ proargtypes => 'text text int4',
+ proargnames => '{string, pattern, start}',
+ prosrc => 'regexp_count_no_flags' },
{ oid => '6256', descr => 'count regexp matches',
proname => 'regexp_count', prorettype => 'int4',
- proargtypes => 'text text int4 text', prosrc => 'regexp_count' },
+ proargtypes => 'text text int4 text',
+ proargnames => '{string, pattern, start, flags}',
+ prosrc => 'regexp_count' },
{ oid => '6257', descr => 'position of regexp match',
proname => 'regexp_instr', prorettype => 'int4', proargtypes => 'text text',
+ proargnames => '{string, pattern}',
prosrc => 'regexp_instr_no_start' },
{ oid => '6258', descr => 'position of regexp match',
proname => 'regexp_instr', prorettype => 'int4',
- proargtypes => 'text text int4', prosrc => 'regexp_instr_no_n' },
+ proargtypes => 'text text int4',
+ proargnames => '{string, pattern, start}',
+ prosrc => 'regexp_instr_no_n' },
{ oid => '6259', descr => 'position of regexp match',
proname => 'regexp_instr', prorettype => 'int4',
- proargtypes => 'text text int4 int4', prosrc => 'regexp_instr_no_endoption' },
+ proargtypes => 'text text int4 int4',
+ proargnames => '{string, pattern, start, count}',
+ prosrc => 'regexp_instr_no_endoption' },
{ oid => '6260', descr => 'position of regexp match',
proname => 'regexp_instr', prorettype => 'int4',
proargtypes => 'text text int4 int4 int4',
+ proargnames => '{string, pattern, start, count, endoption}',
prosrc => 'regexp_instr_no_flags' },
{ oid => '6261', descr => 'position of regexp match',
proname => 'regexp_instr', prorettype => 'int4',
proargtypes => 'text text int4 int4 int4 text',
+ proargnames => '{string, pattern, start, count, endoption, flags}',
prosrc => 'regexp_instr_no_subexpr' },
{ oid => '6262', descr => 'position of regexp match',
proname => 'regexp_instr', prorettype => 'int4',
proargtypes => 'text text int4 int4 int4 text int4',
+ proargnames => '{string, pattern, start, count, endoption, flags, subexpr}',
prosrc => 'regexp_instr' },
{ oid => '6263', descr => 'test for regexp match',
- proname => 'regexp_like', prorettype => 'bool', proargtypes => 'text text',
+ proname => 'regexp_like', prorettype => 'bool',
+ proargtypes => 'text text',
+ proargnames => '{string, pattern}',
prosrc => 'regexp_like_no_flags' },
{ oid => '6264', descr => 'test for regexp match',
proname => 'regexp_like', prorettype => 'bool',
- proargtypes => 'text text text', prosrc => 'regexp_like' },
+ proargtypes => 'text text text',
+ proargnames => '{string, pattern,flags}',
+ prosrc => 'regexp_like' },
{ oid => '6265', descr => 'extract substring that matches regexp',
proname => 'regexp_substr', prorettype => 'text', proargtypes => 'text text',
+ proargnames => '{string, pattern}',
prosrc => 'regexp_substr_no_start' },
{ oid => '6266', descr => 'extract substring that matches regexp',
proname => 'regexp_substr', prorettype => 'text',
- proargtypes => 'text text int4', prosrc => 'regexp_substr_no_n' },
+ proargtypes => 'text text int4',
+ proargnames => '{string, pattern, start}',
+ prosrc => 'regexp_substr_no_n' },
{ oid => '6267', descr => 'extract substring that matches regexp',
proname => 'regexp_substr', prorettype => 'text',
- proargtypes => 'text text int4 int4', prosrc => 'regexp_substr_no_flags' },
+ proargtypes => 'text text int4 int4',
+ proargnames => '{string, pattern, start, count}',
+ prosrc => 'regexp_substr_no_flags' },
{ oid => '6268', descr => 'extract substring that matches regexp',
proname => 'regexp_substr', prorettype => 'text',
proargtypes => 'text text int4 int4 text',
+ proargnames => '{string, pattern, start, count, flags}',
prosrc => 'regexp_substr_no_subexpr' },
{ oid => '6269', descr => 'extract substring that matches regexp',
proname => 'regexp_substr', prorettype => 'text',
- proargtypes => 'text text int4 int4 text int4', prosrc => 'regexp_substr' },
+ proargtypes => 'text text int4 int4 text int4',
+ proargnames => '{string, pattern, start, count, flags, subexpr}',
+ prosrc => 'regexp_substr' },
{ oid => '2088', descr => 'split string by field_sep and return field_num',
proname => 'split_part', prorettype => 'text',
proargtypes => 'text text int4', prosrc => 'split_part' },
{ oid => '2765', descr => 'split string by pattern',
proname => 'regexp_split_to_table', prorows => '1000', proretset => 't',
prorettype => 'text', proargtypes => 'text text',
+ proargnames => '{string, pattern}',
prosrc => 'regexp_split_to_table_no_flags' },
{ oid => '2766', descr => 'split string by pattern',
proname => 'regexp_split_to_table', prorows => '1000', proretset => 't',
prorettype => 'text', proargtypes => 'text text text',
+ proargnames => '{string, pattern, flags}',
prosrc => 'regexp_split_to_table' },
{ oid => '2767', descr => 'split string by pattern',
proname => 'regexp_split_to_array', prorettype => '_text',
- proargtypes => 'text text', prosrc => 'regexp_split_to_array_no_flags' },
+ proargtypes => 'text text',
+ proargnames => '{string, pattern}',
+ prosrc => 'regexp_split_to_array_no_flags' },
{ oid => '2768', descr => 'split string by pattern',
proname => 'regexp_split_to_array', prorettype => '_text',
- proargtypes => 'text text text', prosrc => 'regexp_split_to_array' },
+ proargtypes => 'text text text',
+ proargnames => '{string, pattern, flags}',
+ prosrc => 'regexp_split_to_array' },
{ oid => '9030', descr => 'convert int4 number to binary',
proname => 'to_bin', prorettype => 'text', proargtypes => 'int4',
prosrc => 'to_bin32' },
base-commit: 6f4d63e989ffbdb44f5856a17fae5ae68c40327d
--
2.34.1