I have updated the patch, attached, to clarify that this returns text
arrays, and that you can force it to always return one row using
COALESCE() and a '|' pattern (the later suggested by Daniele Varrazzo).
---------------------------------------------------------------------------
Bruce Momjian wrote:
> Daniele Varrazzo wrote:
> > "If there is no match to the pattern, the function returns no rows" is
> > easily overlooked as "it returns null", or some other behaviour that
> > don't change the returned set. The point is, because the function is
> > listed in the string function, you would expect the function to
> > manipulate text, not the dataset. The function as it is is not safe to
> > be used in a construct
> >
> > SELECT foo, bar, regexp_matches(bar, pattern) FROM table;
> >
> > unless you really wanted:
> >
> > SELECT foo, bar, regexp_matches(bar, pattern) FROM table WHERE bar
> > ~ pattern;
> >
> > otherwise you have to take measures to be able to deal with records in
> > which the pattern is not matched, for example:
> >
> > SELECT foo, bar, regexp_matches(bar, pattern || '|') FROM table;
> >
> > the latter still doesn't work when bar is NULL: in this case the
> > record is dropped anyway, so I don't think it can be proposed as
> > general solution.
> >
> > The characteristics of returning a set of text[] is useful when the
> > user wants all the matches, not only the first one: the behaviour is
> > selected specifying the flag 'g' as third argument.
> >
> > >From this point of view, I hope it can be stated that in its current
> > form the regexp_matches() has not the most optimal interface. Please
> > accept my apology for the tone being too rude in my previous message.
>
> I found the description in the documentation quite confusing also. I
> have created the attached documention patch which is clearer about the
> behavior of regexp_matches().
--
Bruce Momjian <[email protected]> http://momjian.us
EnterpriseDB http://enterprisedb.com
Index: doc/src/sgml/func.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/func.sgml,v
retrieving revision 1.513
diff -c -c -r1.513 func.sgml
*** doc/src/sgml/func.sgml 7 Apr 2010 06:12:52 -0000 1.513
--- doc/src/sgml/func.sgml 29 May 2010 20:55:23 -0000
***************
*** 3445,3463 ****
</para>
<para>
! The <function>regexp_matches</> function returns all of the captured
! substrings resulting from matching a POSIX regular expression pattern.
! It has the syntax
<function>regexp_matches</function>(<replaceable>string</>, <replaceable>pattern</>
<optional>, <replaceable>flags</> </optional>).
! If there is no match to the <replaceable>pattern</>, the function returns
! no rows. If there is a match, the function returns a text array whose
<replaceable>n</>'th element is the substring matching the
<replaceable>n</>'th parenthesized subexpression of the pattern
(not counting <quote>non-capturing</> parentheses; see below for
! details). If the pattern does not contain any parenthesized
! subexpressions, then the result is a single-element text array containing
! the substring matching the whole pattern.
The <replaceable>flags</> parameter is an optional text
string containing zero or more single-letter flags that change the
function's behavior. Flag <literal>g</> causes the function to find
--- 3445,3466 ----
</para>
<para>
! The <function>regexp_matches</> function returns a text array of
! all of the captured substrings resulting from matching a POSIX
! regular expression pattern. It has the syntax
<function>regexp_matches</function>(<replaceable>string</>, <replaceable>pattern</>
<optional>, <replaceable>flags</> </optional>).
! The function can return no rows, one row, or multiple rows (see
! the <literal>g</> flag below). If the <replaceable>pattern</>
! does not match, the function returns no rows. If the pattern
! contains no parenthesized subexpressions, then each row
! returned is a single-element text array containing the substring
! matching the whole pattern. If the pattern contains parenthesized
! subexpressions, the function returns a text array whose
<replaceable>n</>'th element is the substring matching the
<replaceable>n</>'th parenthesized subexpression of the pattern
(not counting <quote>non-capturing</> parentheses; see below for
! details).
The <replaceable>flags</> parameter is an optional text
string containing zero or more single-letter flags that change the
function's behavior. Flag <literal>g</> causes the function to find
***************
*** 3490,3495 ****
--- 3493,3509 ----
</programlisting>
</para>
+ <para>
+ It is possible to force <function>regexp_matches()</> to always
+ return one row by using <function>COALESCE()</> and an empty
+ <literal>|</> pattern; this is particularly useful in a
+ <literal>SELECT</> target list when you want all rows returned,
+ even non-matching ones:
+ <programlisting>
+ SELECT col1, regexp_matches(COALESCE(col2, ''), '(bar)(beque)|') FROM tab;
+ </programlisting>
+ </para>
+
<para>
The <function>regexp_split_to_table</> function splits a string using a POSIX
regular expression pattern as a delimiter. It has the syntax
--
Sent via pgsql-bugs mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs