Re: add function argument names to regex* functions.

jian he Wed, 10 Jan 2024 06:18:36 -0800

On Tue, Jan 9, 2024 at 8:52 AM Dian Fay <[email protected]> wrote:
>
> On Mon Jan 8, 2024 at 9:26 AM EST, jian he wrote:
> > On Mon, Jan 8, 2024 at 8:44 AM Dian Fay <[email protected]> wrote:
> > > The `regexp_replace` summary in table 9.10 is mismatched and still
> > > specifies the first parameter name as `string` instead of `source`.
> > > Since all the other functions use `string`, should `regexp_replace` do
> > > the same or is this a case where an established "standard" diverges?
> >
> > got it. Thanks for pointing it out.
> >
> > in functions-matching.html
> > if I change <replaceable>source</replaceable> to
> > <replaceable>string</replaceable> then
> > there are no markup "string" and markup "string", it's kind of
> > slightly confusing.
> >
> > So does the following refactored description of regexp_replace make sense:
> >
> >      The <replaceable>string</replaceable> is returned unchanged if
> >      there is no match to the <replaceable>pattern</replaceable>.  If there 
> > is a
> >      match, the <replaceable>string</replaceable> is returned with the
> >      <replaceable>replacement</replaceable> string substituted for the 
> > matching
> >      substring.  The <replaceable>replacement</replaceable> string can 
> > contain
> >      <literal>\</literal><replaceable>n</replaceable>, where
> > <replaceable>n</replaceable> is 1
> >      through 9, to indicate that the source substring matching the
> >      <replaceable>n</replaceable>'th parenthesized subexpression of
> > the pattern should be
> >      inserted, and it can contain <literal>\&amp;</literal> to indicate 
> > that the
> >      substring matching the entire pattern should be inserted.  Write
> >      <literal>\\</literal> if you need to put a literal backslash in
> > the replacement
> >      text.
>
> That change makes sense to me! I'll see about the section refactoring
> after this lands.


I put the changes into the new patch.

From 8fa04ed1fecb48ca8254d2ed7e60a9013fa130a3 Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Wed, 10 Jan 2024 17:46:18 +0800
Subject: [PATCH v3 1/1] add function argument names to regex.* functions.

Specifically add function argument names to the following funtions:
    regexp_replace,
    regexp_match,
    regexp_matches
    regexp_count,
    regexp_instr,
    regexp_like,
    regexp_substr,
    regexp_split_to_table
    regexp_split_to_array
So it would be easier to understand these functions in psql via \df.
Also now these functions can be called in different notaions.
---
 doc/src/sgml/func.sgml          | 50 +++++++++++------------
 src/include/catalog/pg_proc.dat | 71 ++++++++++++++++++++++++++-------
 2 files changed, 82 insertions(+), 39 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index de78d58d..23ef07a5 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3299,7 +3299,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in
         </indexterm>
         <function>regexp_instr</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>
          [, <parameter>start</parameter> <type>integer</type>
-         [, <parameter>N</parameter> <type>integer</type>
+         [, <parameter>occurrence</parameter> <type>integer</type>
          [, <parameter>endoption</parameter> <type>integer</type>
          [, <parameter>flags</parameter> <type>text</type>
          [, <parameter>subexpr</parameter> <type>integer</type> ] ] ] ] ] )
@@ -3307,7 +3307,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in
        </para>
        <para>
         Returns the position within <parameter>string</parameter> where
-        the <parameter>N</parameter>'th match of the POSIX regular
+        the <parameter>occurrence</parameter>'th match of the POSIX regular
         expression <parameter>pattern</parameter> occurs, or zero if there is
         no such match; see <xref linkend="functions-posix-regexp"/>.
        </para>
@@ -3413,14 +3413,14 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in
        <entry role="func_table_entry"><para role="func_signature">
         <function>regexp_replace</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>, <parameter>replacement</parameter> <type>text</type>,
          <parameter>start</parameter> <type>integer</type>,
-         <parameter>N</parameter> <type>integer</type>
+         <parameter>occurrence</parameter> <type>integer</type>
          [, <parameter>flags</parameter> <type>text</type> ] )
         <returnvalue>text</returnvalue>
        </para>
        <para>
-        Replaces the substring that is the <parameter>N</parameter>'th
+        Replaces the substring that is the <parameter>occurrence</parameter>'th
         match to the POSIX regular expression <parameter>pattern</parameter>,
-        or all such matches if <parameter>N</parameter> is zero; see
+        or all such matches if <parameter>occurrence</parameter> is zero; see
         <xref linkend="functions-posix-regexp"/>.
        </para>
        <para>
@@ -3478,14 +3478,14 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in
         </indexterm>
         <function>regexp_substr</function> ( <parameter>string</parameter> <type>text</type>, <parameter>pattern</parameter> <type>text</type>
          [, <parameter>start</parameter> <type>integer</type>
-         [, <parameter>N</parameter> <type>integer</type>
+         [, <parameter>occurrence</parameter> <type>integer</type>
          [, <parameter>flags</parameter> <type>text</type>
          [, <parameter>subexpr</parameter> <type>integer</type> ] ] ] ] )
         <returnvalue>text</returnvalue>
        </para>
        <para>
         Returns the substring within <parameter>string</parameter> that
-        matches the <parameter>N</parameter>'th occurrence of the POSIX
+        matches the <parameter>occurrence</parameter>'th occurrence of the POSIX
         regular expression <parameter>pattern</parameter>,
         or <literal>NULL</literal> if there is no such match; see
         <xref linkend="functions-posix-regexp"/>.
@@ -5888,13 +5888,13 @@ regexp_count('ABCABCAXYaxy', 'A.', 1, 'i')  <lineannotation>4</lineannotation>
 
     <para>
      The <function>regexp_instr</function> function returns the starting or
-     ending position of the <replaceable>N</replaceable>'th match of a
+     ending position of the <replaceable>occurrence</replaceable>'th match of a
      POSIX regular expression pattern to a string, or zero if there is no
      such match.  It has the syntax
      <function>regexp_instr</function>(<replaceable>string</replaceable>,
      <replaceable>pattern</replaceable>
      <optional>, <replaceable>start</replaceable>
-     <optional>, <replaceable>N</replaceable>
+     <optional>, <replaceable>occurrence</replaceable>
      <optional>, <replaceable>endoption</replaceable>
      <optional>, <replaceable>flags</replaceable>
      <optional>, <replaceable>subexpr</replaceable>
@@ -5903,8 +5903,8 @@ regexp_count('ABCABCAXYaxy', 'A.', 1, 'i')  <lineannotation>4</lineannotation>
      in <replaceable>string</replaceable>, normally from the beginning of
      the string, but if the <replaceable>start</replaceable> parameter is
      provided then beginning from that character index.
-     If <replaceable>N</replaceable> is specified
-     then the <replaceable>N</replaceable>'th match of the pattern
+     If <replaceable>occurrence</replaceable> is specified
+     then the <replaceable>occurrence</replaceable>'th match of the pattern
      is located, otherwise the first match is located.
      If the <replaceable>endoption</replaceable> parameter is omitted or
      specified as zero, the function returns the position of the first
@@ -6024,8 +6024,8 @@ SELECT (regexp_match('foobarbequebaz', 'bar.*que'))[1];
      expression pattern to a string.  It has the same syntax as
      <function>regexp_match</function>.
      This function returns no rows if there is no match, one row if there is
-     a match and the <literal>g</literal> flag is not given, or <replaceable>N</replaceable>
-     rows if there are <replaceable>N</replaceable> matches and the <literal>g</literal> flag
+     a match and the <literal>g</literal> flag is not given, or <replaceable>occurrence</replaceable>
+     rows if there are <replaceable>occurrence</replaceable> matches and the <literal>g</literal> flag
      is given.  Each returned row is a text array containing the whole
      matched substring or the substrings matching parenthesized
      subexpressions of the <replaceable>pattern</replaceable>, just as described above
@@ -6076,18 +6076,18 @@ SELECT col1, (SELECT regexp_matches(col2, '(bar)(beque)')) FROM tab;
      The <function>regexp_replace</function> function provides substitution of
      new text for substrings that match POSIX regular expression patterns.
      It has the syntax
-     <function>regexp_replace</function>(<replaceable>source</replaceable>,
+     <function>regexp_replace</function>(<replaceable>string</replaceable>,
      <replaceable>pattern</replaceable>, <replaceable>replacement</replaceable>
      <optional>, <replaceable>start</replaceable>
-     <optional>, <replaceable>N</replaceable>
+     <optional>, <replaceable>occurrence</replaceable>
      </optional></optional>
      <optional>, <replaceable>flags</replaceable> </optional>).
-     (Notice that <replaceable>N</replaceable> cannot be specified
+     (Notice that <replaceable>occurrence</replaceable> cannot be specified
      unless <replaceable>start</replaceable> is,
      but <replaceable>flags</replaceable> can be given in any case.)
-     The <replaceable>source</replaceable> string is returned unchanged if
+     The <replaceable>string</replaceable> is returned unchanged if
      there is no match to the <replaceable>pattern</replaceable>.  If there is a
-     match, the <replaceable>source</replaceable> string is returned with the
+     match, the <replaceable>string</replaceable> is returned with the
      <replaceable>replacement</replaceable> string substituted for the matching
      substring.  The <replaceable>replacement</replaceable> string can contain
      <literal>\</literal><replaceable>n</replaceable>, where <replaceable>n</replaceable> is 1
@@ -6102,14 +6102,14 @@ SELECT col1, (SELECT regexp_matches(col2, '(bar)(beque)')) FROM tab;
      the string, but if the <replaceable>start</replaceable> parameter is
      provided then beginning from that character index.
      By default, only the first match of the pattern is replaced.
-     If <replaceable>N</replaceable> is specified and is greater than zero,
-     then the <replaceable>N</replaceable>'th match of the pattern
+     If <replaceable>occurrence</replaceable> is specified and is greater than zero,
+     then the <replaceable>occurrence</replaceable>'th match of the pattern
      is replaced.
      If the <literal>g</literal> flag is given, or
-     if <replaceable>N</replaceable> is specified and is zero, then all
+     if <replaceable>occurrence</replaceable> is specified and is zero, then all
      matches at or after the <replaceable>start</replaceable> position are
      replaced.  (The <literal>g</literal> flag is ignored
-     when <replaceable>N</replaceable> is specified.)
+     when <replaceable>occurrence</replaceable> is specified.)
      The <replaceable>flags</replaceable> parameter is an optional text
      string containing zero or more single-letter flags that change the
      function's behavior.  Supported flags (though
@@ -6220,7 +6220,7 @@ SELECT foo FROM regexp_split_to_table('the quick brown fox', '\s*') AS foo;
      <function>regexp_substr</function>(<replaceable>string</replaceable>,
      <replaceable>pattern</replaceable>
      <optional>, <replaceable>start</replaceable>
-     <optional>, <replaceable>N</replaceable>
+     <optional>, <replaceable>occurrence</replaceable>
      <optional>, <replaceable>flags</replaceable>
      <optional>, <replaceable>subexpr</replaceable>
      </optional></optional></optional></optional>).
@@ -6228,8 +6228,8 @@ SELECT foo FROM regexp_split_to_table('the quick brown fox', '\s*') AS foo;
      in <replaceable>string</replaceable>, normally from the beginning of
      the string, but if the <replaceable>start</replaceable> parameter is
      provided then beginning from that character index.
-     If <replaceable>N</replaceable> is specified
-     then the <replaceable>N</replaceable>'th match of the pattern
+     If <replaceable>occurrence</replaceable> is specified
+     then the <replaceable>occurrence</replaceable>'th match of the pattern
      is returned, otherwise the first match is returned.
      The <replaceable>flags</replaceable> parameter is an optional text
      string containing zero or more single-letter flags that change the
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 79793927..3b4330a6 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -3611,105 +3611,148 @@
   prosrc => 'replace_text' },
 { oid => '2284', descr => 'replace text using regexp',
   proname => 'regexp_replace', prorettype => 'text',
-  proargtypes => 'text text text', prosrc => 'textregexreplace_noopt' },
+  proargtypes => 'text text text',
+  proargnames => '{string, pattern, replacement}',
+  prosrc => 'textregexreplace_noopt' },
 { oid => '2285', descr => 'replace text using regexp',
   proname => 'regexp_replace', prorettype => 'text',
-  proargtypes => 'text text text text', prosrc => 'textregexreplace' },
+  proargtypes => 'text text text text',
+  proargnames => '{string, pattern, replacement, flags}',
+  prosrc => 'textregexreplace' },
 { oid => '6251', descr => 'replace text using regexp',
   proname => 'regexp_replace', prorettype => 'text',
   proargtypes => 'text text text int4 int4 text',
+  proargnames => '{string, pattern, replacement, start, occurrence, flags}',
   prosrc => 'textregexreplace_extended' },
 { oid => '6252', descr => 'replace text using regexp',
   proname => 'regexp_replace', prorettype => 'text',
   proargtypes => 'text text text int4 int4',
+  proargnames => '{string, pattern, replacement, start, occurrence}',
   prosrc => 'textregexreplace_extended_no_flags' },
 { oid => '6253', descr => 'replace text using regexp',
   proname => 'regexp_replace', prorettype => 'text',
   proargtypes => 'text text text int4',
+  proargnames => '{string, pattern, replacement, start}',
   prosrc => 'textregexreplace_extended_no_n' },
 { oid => '3396', descr => 'find first match for regexp',
   proname => 'regexp_match', prorettype => '_text', proargtypes => 'text text',
+  proargnames => '{string, pattern}',
   prosrc => 'regexp_match_no_flags' },
 { oid => '3397', descr => 'find first match for regexp',
   proname => 'regexp_match', prorettype => '_text',
-  proargtypes => 'text text text', prosrc => 'regexp_match' },
+  proargtypes => 'text text text',
+  proargnames => '{string, pattern, flags}',
+  prosrc => 'regexp_match' },
 { oid => '2763', descr => 'find match(es) for regexp',
   proname => 'regexp_matches', prorows => '1', proretset => 't',
   prorettype => '_text', proargtypes => 'text text',
+  proargnames => '{string, pattern}',
   prosrc => 'regexp_matches_no_flags' },
 { oid => '2764', descr => 'find match(es) for regexp',
   proname => 'regexp_matches', prorows => '10', proretset => 't',
   prorettype => '_text', proargtypes => 'text text text',
+  proargnames => '{string, pattern, flags}',
   prosrc => 'regexp_matches' },
 { oid => '6254', descr => 'count regexp matches',
   proname => 'regexp_count', prorettype => 'int4', proargtypes => 'text text',
+  proargnames => '{string, pattern}',
   prosrc => 'regexp_count_no_start' },
 { oid => '6255', descr => 'count regexp matches',
   proname => 'regexp_count', prorettype => 'int4',
-  proargtypes => 'text text int4', prosrc => 'regexp_count_no_flags' },
+  proargtypes => 'text text int4',
+  proargnames => '{string, pattern, start}',
+  prosrc => 'regexp_count_no_flags' },
 { oid => '6256', descr => 'count regexp matches',
   proname => 'regexp_count', prorettype => 'int4',
-  proargtypes => 'text text int4 text', prosrc => 'regexp_count' },
+  proargtypes => 'text text int4 text',
+  proargnames => '{string, pattern, start, flags}',
+  prosrc => 'regexp_count' },
 { oid => '6257', descr => 'position of regexp match',
   proname => 'regexp_instr', prorettype => 'int4', proargtypes => 'text text',
+  proargnames => '{string, pattern}',
   prosrc => 'regexp_instr_no_start' },
 { oid => '6258', descr => 'position of regexp match',
   proname => 'regexp_instr', prorettype => 'int4',
-  proargtypes => 'text text int4', prosrc => 'regexp_instr_no_n' },
+  proargtypes => 'text text int4',
+  proargnames => '{string, pattern, start}',
+  prosrc => 'regexp_instr_no_n' },
 { oid => '6259', descr => 'position of regexp match',
   proname => 'regexp_instr', prorettype => 'int4',
-  proargtypes => 'text text int4 int4', prosrc => 'regexp_instr_no_endoption' },
+  proargtypes => 'text text int4 int4',
+  proargnames => '{string, pattern, start, occurrence}',
+  prosrc => 'regexp_instr_no_endoption' },
 { oid => '6260', descr => 'position of regexp match',
   proname => 'regexp_instr', prorettype => 'int4',
   proargtypes => 'text text int4 int4 int4',
+  proargnames => '{string, pattern, start, occurrence, endoption}',
   prosrc => 'regexp_instr_no_flags' },
 { oid => '6261', descr => 'position of regexp match',
   proname => 'regexp_instr', prorettype => 'int4',
   proargtypes => 'text text int4 int4 int4 text',
+  proargnames => '{string, pattern, start, occurrence, endoption, flags}',
   prosrc => 'regexp_instr_no_subexpr' },
 { oid => '6262', descr => 'position of regexp match',
   proname => 'regexp_instr', prorettype => 'int4',
   proargtypes => 'text text int4 int4 int4 text int4',
+  proargnames => '{string, pattern, start, occurrence, endoption, flags, subexpr}',
   prosrc => 'regexp_instr' },
 { oid => '6263', descr => 'test for regexp match',
-  proname => 'regexp_like', prorettype => 'bool', proargtypes => 'text text',
+  proname => 'regexp_like', prorettype => 'bool',
+  proargtypes => 'text text',
+  proargnames => '{string, pattern}',
   prosrc => 'regexp_like_no_flags' },
 { oid => '6264', descr => 'test for regexp match',
   proname => 'regexp_like', prorettype => 'bool',
-  proargtypes => 'text text text', prosrc => 'regexp_like' },
+  proargtypes => 'text text text',
+  proargnames => '{string, pattern,flags}',
+  prosrc => 'regexp_like' },
 { oid => '6265', descr => 'extract substring that matches regexp',
   proname => 'regexp_substr', prorettype => 'text', proargtypes => 'text text',
+  proargnames => '{string, pattern}',
   prosrc => 'regexp_substr_no_start' },
 { oid => '6266', descr => 'extract substring that matches regexp',
   proname => 'regexp_substr', prorettype => 'text',
-  proargtypes => 'text text int4', prosrc => 'regexp_substr_no_n' },
+  proargtypes => 'text text int4',
+  proargnames => '{string, pattern, start}',
+  prosrc => 'regexp_substr_no_n' },
 { oid => '6267', descr => 'extract substring that matches regexp',
   proname => 'regexp_substr', prorettype => 'text',
-  proargtypes => 'text text int4 int4', prosrc => 'regexp_substr_no_flags' },
+  proargtypes => 'text text int4 int4',
+  proargnames => '{string, pattern, start, occurrence}',
+  prosrc => 'regexp_substr_no_flags' },
 { oid => '6268', descr => 'extract substring that matches regexp',
   proname => 'regexp_substr', prorettype => 'text',
   proargtypes => 'text text int4 int4 text',
+  proargnames => '{string, pattern, start, occurrence, flags}',
   prosrc => 'regexp_substr_no_subexpr' },
 { oid => '6269', descr => 'extract substring that matches regexp',
   proname => 'regexp_substr', prorettype => 'text',
-  proargtypes => 'text text int4 int4 text int4', prosrc => 'regexp_substr' },
+  proargtypes => 'text text int4 int4 text int4',
+  proargnames => '{string, pattern, start, occurrence, flags, subexpr}',
+  prosrc => 'regexp_substr' },
 { oid => '2088', descr => 'split string by field_sep and return field_num',
   proname => 'split_part', prorettype => 'text',
   proargtypes => 'text text int4', prosrc => 'split_part' },
 { oid => '2765', descr => 'split string by pattern',
   proname => 'regexp_split_to_table', prorows => '1000', proretset => 't',
   prorettype => 'text', proargtypes => 'text text',
+  proargnames => '{string, pattern}',
   prosrc => 'regexp_split_to_table_no_flags' },
 { oid => '2766', descr => 'split string by pattern',
   proname => 'regexp_split_to_table', prorows => '1000', proretset => 't',
   prorettype => 'text', proargtypes => 'text text text',
+  proargnames => '{string, pattern, flags}',
   prosrc => 'regexp_split_to_table' },
 { oid => '2767', descr => 'split string by pattern',
   proname => 'regexp_split_to_array', prorettype => '_text',
-  proargtypes => 'text text', prosrc => 'regexp_split_to_array_no_flags' },
+  proargtypes => 'text text',
+  proargnames => '{string, pattern}',
+  prosrc => 'regexp_split_to_array_no_flags' },
 { oid => '2768', descr => 'split string by pattern',
   proname => 'regexp_split_to_array', prorettype => '_text',
-  proargtypes => 'text text text', prosrc => 'regexp_split_to_array' },
+  proargtypes => 'text text text',
+  proargnames => '{string, pattern, flags}',
+  prosrc => 'regexp_split_to_array' },
 { oid => '9030', descr => 'convert int4 number to binary',
   proname => 'to_bin', prorettype => 'text', proargtypes => 'int4',
   prosrc => 'to_bin32' },
-- 
2.34.1

Re: add function argument names to regex* functions.

Reply via email to