Re: [PATCHES] patch adding new regexp functions

2007-03-28 Thread Jeremy Drake
 Jeremy Drake wrote:
  On Thu, 22 Mar 2007, Tom Lane wrote:
 
   I'd vote for making this new code look like the rest of it, to wit
   hardwire the values.
 
  Attached please find a patch which does this.

I just realized that the last patch removed all usage of fcinfo in the
setup_regexp_matches function, so this version of the patch also removes
it as a parameter to that function.

-- 
Think of it!  With VLSI we can pack 100 ENIACs in 1 sq. cm.!Index: src/backend/utils/adt/regexp.c
===
RCS file: 
/home/jeremyd/local/postgres/cvsuproot/pgsql/src/backend/utils/adt/regexp.c,v
retrieving revision 1.70
diff -c -r1.70 regexp.c
*** src/backend/utils/adt/regexp.c  20 Mar 2007 05:44:59 -  1.70
--- src/backend/utils/adt/regexp.c  28 Mar 2007 18:57:28 -
***
*** 30,35 
--- 30,36 
  #include postgres.h
  
  #include access/heapam.h
+ #include catalog/pg_type.h
  #include funcapi.h
  #include regex/regex.h
  #include utils/builtins.h
***
*** 95,106 
size_toffset;
  
re_comp_flags flags;
- 
-   /* text type info */
-   Oid   param_type;
-   int16 typlen;
-   bool  typbyval;
-   char  typalign;
  } regexp_matches_ctx;
  
  typedef struct regexp_split_ctx
--- 96,101 
***
*** 119,126 
  static intnum_res = 0;/* # of cached re's */
  static cached_re_str re_array[MAX_CACHED_RES];/* cached re's */
  
! static regexp_matches_ctx *setup_regexp_matches(FunctionCallInfo fcinfo,
!   
text *orig_str, text *pattern,

text *flags);
  static ArrayType *perform_regexp_matches(regexp_matches_ctx *matchctx);
  
--- 114,120 
  static intnum_res = 0;/* # of cached re's */
  static cached_re_str re_array[MAX_CACHED_RES];/* cached re's */
  
! static regexp_matches_ctx *setup_regexp_matches(text *orig_str, text *pattern,

text *flags);
  static ArrayType *perform_regexp_matches(regexp_matches_ctx *matchctx);
  
***
*** 760,767 
oldcontext = 
MemoryContextSwitchTo(funcctx-multi_call_memory_ctx);
  
/* be sure to copy the input string into the multi-call ctx */
!   matchctx = setup_regexp_matches(fcinfo, 
PG_GETARG_TEXT_P_COPY(0),
!   
pattern, flags);
  
MemoryContextSwitchTo(oldcontext);
funcctx-user_fctx = (void *) matchctx;
--- 754,761 
oldcontext = 
MemoryContextSwitchTo(funcctx-multi_call_memory_ctx);
  
/* be sure to copy the input string into the multi-call ctx */
!   matchctx = setup_regexp_matches(PG_GETARG_TEXT_P_COPY(0), 
pattern,
!   
flags);
  
MemoryContextSwitchTo(oldcontext);
funcctx-user_fctx = (void *) matchctx;
***
*** 822,828 
  }
  
  static regexp_matches_ctx *
! setup_regexp_matches(FunctionCallInfo fcinfo, text *orig_str, text *pattern, 
text *flags)
  {
regexp_matches_ctx  *matchctx = palloc(sizeof(regexp_matches_ctx));
  
--- 816,822 
  }
  
  static regexp_matches_ctx *
! setup_regexp_matches(text *orig_str, text *pattern, text *flags)
  {
regexp_matches_ctx  *matchctx = palloc(sizeof(regexp_matches_ctx));
  
***
*** 835,845 
matchctx-pmatch = palloc(sizeof(regmatch_t) * 
(matchctx-cpattern-re_nsub + 1));
matchctx-offset = 0;
  
-   /* get text type oid, too lazy to do it some other way */
-   matchctx-param_type = get_fn_expr_argtype(fcinfo-flinfo, 0);
-   get_typlenbyvalalign(matchctx-param_type, matchctx-typlen,
-matchctx-typbyval, 
matchctx-typalign);
- 
matchctx-wide_str = palloc(sizeof(pg_wchar) * (matchctx-orig_len + 
1));
matchctx-wide_len = pg_mb2wchar_with_len(VARDATA(matchctx-orig_str),

  matchctx-wide_str, matchctx-orig_len);
--- 829,834 
***
*** 915,923 
dims[0] = 1;
}
  
return construct_md_array(elems, nulls, ndims, dims, lbs,
! matchctx-param_type, 
matchctx-typlen,
! matchctx-typbyval, 
matchctx-typalign);
  }
  
  Datum
--- 904,912 
dims[0] = 1;
}
  
+   /* XXX: this hardcodes 

Re: [PATCHES] patch adding new regexp functions

2007-03-28 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


Jeremy Drake wrote:
  Jeremy Drake wrote:
   On Thu, 22 Mar 2007, Tom Lane wrote:
  
I'd vote for making this new code look like the rest of it, to wit
hardwire the values.
  
   Attached please find a patch which does this.
 
 I just realized that the last patch removed all usage of fcinfo in the
 setup_regexp_matches function, so this version of the patch also removes
 it as a parameter to that function.
 
 -- 
 Think of it!  With VLSI we can pack 100 ENIACs in 1 sq. cm.!
Content-Description: 

[ Attachment, skipping... ]

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-03-28 Thread Neil Conway

Jeremy Drake wrote:

I just realized that the last patch removed all usage of fcinfo in the
setup_regexp_matches function, so this version of the patch also removes
it as a parameter to that function.

Applied, thanks.


-Neil


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PATCHES] patch adding new regexp functions

2007-03-27 Thread Bruce Momjian
Jeremy Drake wrote:
 On Mon, 26 Mar 2007, Bruce Momjian wrote:
 
  Tom Lane wrote:
   Or were you speaking to the question of whether to adjust the regexp
   patch to conform more nearly to the coding practices found elsewhere?
   I agree with that, but I thought there was already a submitted patch
   for it.
 
  Yes, regex patch adjustment, and I have not seen a patch which makes
  such adjustments.
 
 http://archives.postgresql.org/pgsql-patches/2007-03/msg00285.php

Ah, yes, I still have that in my mailbox, but didn't think it was
related because there was discussion _after_ the patch was posted.  Will
add the patch to the queue now.  Thanks.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] patch adding new regexp functions

2007-03-27 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


Jeremy Drake wrote:
 On Thu, 22 Mar 2007, Tom Lane wrote:
 
  I'd vote for making this new code look like the rest of it, to wit
  hardwire the values.
 
 Attached please find a patch which does this.
 
 
 
 -- 
 Although written many years ago, Lady Chatterley's Lover has just been
 reissued by the Grove Press, and this pictorial account of the
 day-to-day life of an English gamekeeper is full of considerable
 interest to outdoor minded readers, as it contains many passages on
 pheasant-raising, the apprehending of poachers, ways to control vermin,
 and other chores and duties of the professional gamekeeper.
 Unfortunately, one is obliged to wade through many pages of extraneous
 material in order to discover and savour those sidelights on the
 management of a midland shooting estate, and in this reviewer's opinion
 the book cannot take the place of J. R. Miller's Practical
 Gamekeeping.
   -- Ed Zern, Field and Stream (Nov. 1959)
Content-Description: 

[ Attachment, skipping... ]

 
 ---(end of broadcast)---
 TIP 5: don't forget to increase your free space map settings

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] patch adding new regexp functions

2007-03-26 Thread Bruce Momjian
Jeremy Drake wrote:
 On Thu, 22 Mar 2007, Tom Lane wrote:
 
  AFAIR, the reason there's no TextPGetDatum (and ditto for lots of other
  datatypes) is lack of obvious usefulness.  A function dealing with a
  text * doesn't normally have reason to convert that to a Datum until
  it returns --- and at that point PG_RETURN_TEXT_P is the thing to use.
  Do you have a counterexample, or does this just suggest that the regexp
  function patch needs some refactoring?
 
 If you are asking why I have reason to convert text * to a Datum in cases
 other than PG_RETURN_TEXT_P, it is used for calling text_substr functions
 using DirectFunctionCallN.  BTW, this usage of text_substr using
 PointerGetDatum was copied from the pre-existing textregexsubstr function.

Is there a follup patch based on this discussion?

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] patch adding new regexp functions

2007-03-26 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Jeremy Drake wrote:
 On Thu, 22 Mar 2007, Tom Lane wrote:
 AFAIR, the reason there's no TextPGetDatum (and ditto for lots of other
 datatypes) is lack of obvious usefulness.

 If you are asking why I have reason to convert text * to a Datum in cases
 other than PG_RETURN_TEXT_P, it is used for calling text_substr functions
 using DirectFunctionCallN.  BTW, this usage of text_substr using
 PointerGetDatum was copied from the pre-existing textregexsubstr function.

 Is there a follup patch based on this discussion?

Not at the moment.  I suppose someone could run around and replace
PointerGetDatum by (exactly-equivalent) TextPGetDatum etc, but it seems
like mostly make-work.  I definitely don't want to spend time on such
a project for 8.3.

Or were you speaking to the question of whether to adjust the regexp
patch to conform more nearly to the coding practices found elsewhere?
I agree with that, but I thought there was already a submitted patch
for it.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] patch adding new regexp functions

2007-03-26 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Jeremy Drake wrote:
  On Thu, 22 Mar 2007, Tom Lane wrote:
  AFAIR, the reason there's no TextPGetDatum (and ditto for lots of other
  datatypes) is lack of obvious usefulness.
 
  If you are asking why I have reason to convert text * to a Datum in cases
  other than PG_RETURN_TEXT_P, it is used for calling text_substr functions
  using DirectFunctionCallN.  BTW, this usage of text_substr using
  PointerGetDatum was copied from the pre-existing textregexsubstr function.
 
  Is there a follup patch based on this discussion?
 
 Not at the moment.  I suppose someone could run around and replace
 PointerGetDatum by (exactly-equivalent) TextPGetDatum etc, but it seems
 like mostly make-work.  I definitely don't want to spend time on such
 a project for 8.3.
 
 Or were you speaking to the question of whether to adjust the regexp
 patch to conform more nearly to the coding practices found elsewhere?
 I agree with that, but I thought there was already a submitted patch
 for it.

Yes, regex patch adjustment, and I have not seen a patch which makes
such adjustments.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] patch adding new regexp functions

2007-03-26 Thread Jeremy Drake
On Mon, 26 Mar 2007, Bruce Momjian wrote:

 Tom Lane wrote:
  Or were you speaking to the question of whether to adjust the regexp
  patch to conform more nearly to the coding practices found elsewhere?
  I agree with that, but I thought there was already a submitted patch
  for it.

 Yes, regex patch adjustment, and I have not seen a patch which makes
 such adjustments.

http://archives.postgresql.org/pgsql-patches/2007-03/msg00285.php


-- 
Pope Goestheveezl was the shortest reigning pope in the history of the
Church, reigning for two hours and six minutes on 1 April 1866.  The
white smoke had hardly faded into the blue of the Vatican skies before
it dawned on the assembled multitudes in St. Peter's Square that his
name had hilarious possibilities.  The crowds fell about, helpless with
laughter, singing

Half a pound of tuppenny rice
Half a pound of treacle
That's the way the chimney smokes
Pope Goestheveezl

The square was finally cleared by armed carabineri with tears of
laughter streaming down their faces.  The event set a record for
hilarious civic functions, smashing the previous record set when Baron
Hans Neizant Bompzidaize was elected Landburgher of Koln in 1653.
-- Mike Harding, The Armchair Anarchist's Almanac

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-03-21 Thread Gregory Stark

Jeremy Drake [EMAIL PROTECTED] writes:

 BTW, should I be calling get_typlenbyvalalign on TEXTOID or are there macros
 for those also?

Hardcoding -1 for typlen of varlenas is one of the few (the only?) magic
constants used throughout the source code. I'm surprised there isn't a macro
for it though.

Do you need the alignment? If so I want to check the code against the packed
varlena patch. Just in case.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] patch adding new regexp functions

2007-03-21 Thread Gregory Stark

Gregory Stark [EMAIL PROTECTED] writes:

 Jeremy Drake [EMAIL PROTECTED] writes:

 BTW, should I be calling get_typlenbyvalalign on TEXTOID or are there macros
 for those also?

 Hardcoding -1 for typlen of varlenas is one of the few (the only?) magic
 constants used throughout the source code. I'm surprised there isn't a macro
 for it though.

 Do you need the alignment? If so I want to check the code against the packed
 varlena patch. Just in case.

Ah, it's just to construct an array, that's not a concern at all. And you're
detoasting the text data types before using or storing them so that's fine.


The only thing I would say is that this should maybe be a TextGetDatum() just
for code hygiene. It should be exactly equivalent though:

+ PointerGetDatum(matchctx-orig_str),

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-03-21 Thread Tom Lane
Jeremy Drake [EMAIL PROTECTED] writes:
 BTW, should I be calling
 get_typlenbyvalalign on TEXTOID or are there macros for those also?

By and large we tend to hard-wire those properties too, eg there are
plenty of places that do things like this:

/* XXX: this hardcodes assumptions about the regtype type */
result = construct_array(tmp_ary, num_params, REGTYPEOID, 4, true, 'i');

Some are better commented than others ...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-03-21 Thread Jeremy Drake
On Wed, 21 Mar 2007, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  BTW, should I be calling
  get_typlenbyvalalign on TEXTOID or are there macros for those also?

 By and large we tend to hard-wire those properties too, eg there are
 plenty of places that do things like this:

 /* XXX: this hardcodes assumptions about the regtype type */
 result = construct_array(tmp_ary, num_params, REGTYPEOID, 4, true, 'i');

 Some are better commented than others ...

So, what action (if any) should be taken?  Should all (or some) of these
values be hardcoded, or should the current calls to determine them be left
in place?



   regards, tom lane

 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?

http://archives.postgresql.org



-- 
Fortune's Real-Life Courtroom Quote #19:

Q:  Doctor, how many autopsies have you performed on dead people?
A:  All my autopsies have been performed on dead people.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] patch adding new regexp functions

2007-03-21 Thread Tom Lane
Jeremy Drake [EMAIL PROTECTED] writes:
 On Wed, 21 Mar 2007, Tom Lane wrote:
 By and large we tend to hard-wire those properties too, eg there are
 plenty of places that do things like this:

 So, what action (if any) should be taken?  Should all (or some) of these
 values be hardcoded, or should the current calls to determine them be left
 in place?

I'd vote for making this new code look like the rest of it, to wit
hardwire the values.  As-is, you're expending code space and cycles
on flexibility that's entirely illusory --- if we ever decided to
change these properties of TEXT, we'd have way more work to do than
just fixing the new regexp functions.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-03-21 Thread Jeremy Drake
On Wed, 21 Mar 2007, Gregory Stark wrote:

 The only thing I would say is that this should maybe be a TextGetDatum() just
 for code hygiene. It should be exactly equivalent though:

I could not find such a thing using ctags, nor TextPGetDatum().  I looked
at PG_RETURN_TEXT_P and it just uses PG_RETURN_POINTER, which in turn uses
PointerGetDatum.  If there is such a thing, it is well camoflaged...

-- 
If you think the problem is bad now, just wait until we've solved it.
-- Arthur Kasspe

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] patch adding new regexp functions

2007-03-21 Thread Jeremy Drake
On Thu, 22 Mar 2007, Tom Lane wrote:

 I'd vote for making this new code look like the rest of it, to wit
 hardwire the values.

Attached please find a patch which does this.



-- 
Although written many years ago, Lady Chatterley's Lover has just been
reissued by the Grove Press, and this pictorial account of the
day-to-day life of an English gamekeeper is full of considerable
interest to outdoor minded readers, as it contains many passages on
pheasant-raising, the apprehending of poachers, ways to control vermin,
and other chores and duties of the professional gamekeeper.
Unfortunately, one is obliged to wade through many pages of extraneous
material in order to discover and savour those sidelights on the
management of a midland shooting estate, and in this reviewer's opinion
the book cannot take the place of J. R. Miller's Practical
Gamekeeping.
-- Ed Zern, Field and Stream (Nov. 1959)Index: src/backend/utils/adt/regexp.c
===
RCS file: 
/home/jeremyd/local/postgres/cvsuproot/pgsql/src/backend/utils/adt/regexp.c,v
retrieving revision 1.70
diff -c -r1.70 regexp.c
*** src/backend/utils/adt/regexp.c  20 Mar 2007 05:44:59 -  1.70
--- src/backend/utils/adt/regexp.c  22 Mar 2007 05:17:15 -
***
*** 30,35 
--- 30,36 
  #include postgres.h
  
  #include access/heapam.h
+ #include catalog/pg_type.h
  #include funcapi.h
  #include regex/regex.h
  #include utils/builtins.h
***
*** 95,106 
size_toffset;
  
re_comp_flags flags;
- 
-   /* text type info */
-   Oid   param_type;
-   int16 typlen;
-   bool  typbyval;
-   char  typalign;
  } regexp_matches_ctx;
  
  typedef struct regexp_split_ctx
--- 96,101 
***
*** 835,845 
matchctx-pmatch = palloc(sizeof(regmatch_t) * 
(matchctx-cpattern-re_nsub + 1));
matchctx-offset = 0;
  
-   /* get text type oid, too lazy to do it some other way */
-   matchctx-param_type = get_fn_expr_argtype(fcinfo-flinfo, 0);
-   get_typlenbyvalalign(matchctx-param_type, matchctx-typlen,
-matchctx-typbyval, 
matchctx-typalign);
- 
matchctx-wide_str = palloc(sizeof(pg_wchar) * (matchctx-orig_len + 
1));
matchctx-wide_len = pg_mb2wchar_with_len(VARDATA(matchctx-orig_str),

  matchctx-wide_str, matchctx-orig_len);
--- 830,835 
***
*** 915,923 
dims[0] = 1;
}
  
return construct_md_array(elems, nulls, ndims, dims, lbs,
! matchctx-param_type, 
matchctx-typlen,
! matchctx-typbyval, 
matchctx-typalign);
  }
  
  Datum
--- 905,913 
dims[0] = 1;
}
  
+   /* XXX: this hardcodes assumptions about the text type */
return construct_md_array(elems, nulls, ndims, dims, lbs,
! TEXTOID, -1, false, 
'i');
  }
  
  Datum
***
*** 976,991 
  {
ArrayBuildState *astate = NULL;
regexp_split_ctx*splitctx;
-   Oid  param_type;
int  nitems;
  
splitctx = setup_regexp_split(PG_GETARG_TEXT_P(0),
  
PG_GETARG_TEXT_P(1),
  
PG_GETARG_TEXT_P_IF_EXISTS(2));
  
-   /* get text type oid, too lazy to do it some other way */
-   param_type = get_fn_expr_argtype(fcinfo-flinfo, 0);
- 
for (nitems = 0; splitctx-offset  splitctx-wide_len; nitems++)
{
if (nitems  splitctx-wide_len)
--- 966,977 
***
*** 995,1001 
astate = accumArrayResult(astate,
  
get_next_split(splitctx),
  false,
! param_type,
  
CurrentMemoryContext);
}
  
--- 981,987 
astate = accumArrayResult(astate,
  
get_next_split(splitctx),
  false,
! TEXTOID,
  
CurrentMemoryContext);
}
  

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] patch adding new regexp functions

2007-03-21 Thread Tom Lane
Jeremy Drake [EMAIL PROTECTED] writes:
 I could not find such a thing using ctags, nor TextPGetDatum().  I looked
 at PG_RETURN_TEXT_P and it just uses PG_RETURN_POINTER, which in turn uses
 PointerGetDatum.  If there is such a thing, it is well camoflaged...

AFAIR, the reason there's no TextPGetDatum (and ditto for lots of other
datatypes) is lack of obvious usefulness.  A function dealing with a
text * doesn't normally have reason to convert that to a Datum until
it returns --- and at that point PG_RETURN_TEXT_P is the thing to use.
Do you have a counterexample, or does this just suggest that the regexp
function patch needs some refactoring?

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] patch adding new regexp functions

2007-03-21 Thread Jeremy Drake
On Thu, 22 Mar 2007, Tom Lane wrote:

 AFAIR, the reason there's no TextPGetDatum (and ditto for lots of other
 datatypes) is lack of obvious usefulness.  A function dealing with a
 text * doesn't normally have reason to convert that to a Datum until
 it returns --- and at that point PG_RETURN_TEXT_P is the thing to use.
 Do you have a counterexample, or does this just suggest that the regexp
 function patch needs some refactoring?

If you are asking why I have reason to convert text * to a Datum in cases
other than PG_RETURN_TEXT_P, it is used for calling text_substr functions
using DirectFunctionCallN.  BTW, this usage of text_substr using
PointerGetDatum was copied from the pre-existing textregexsubstr function.


-- 
Malek's Law:
Any simple idea will be worded in the most complicated way.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] patch adding new regexp functions

2007-03-20 Thread Neil Conway

Jeremy Drake wrote:

The patch has been sitting in the unapplied patches queue for a while, and
the inevitable bitrot finally caught up with it.  This version of the
patch is exactly the same as the one in the patches queue, except for
using new, non-conflicting oids for the functions.
  

Applied -- thanks for the patch. BTW, a few cosmetic comments:

* #includes should be sorted alphabetically (unless another #include 
sort order rules take precedence, like including postgres.h first).


* patch included some trailing CR line endings

* IMHO using ++var in the increment component of a for loop is bad 
style in C (if only because it is needlessly inconsistent; good C++ 
style is not necessarily good C style)


* Comments like /* get text type oid, too lazy to do it some other way 
*/ do not exactly inspire confidence :)


-Neil


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-03-20 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 * Comments like /* get text type oid, too lazy to do it some other way 
 */ do not exactly inspire confidence :)

Surely the code was just using the TEXTOID macro?  If so, it does not
require a comment.  If not, it should be fixed ...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-03-20 Thread Jeremy Drake
On Wed, 21 Mar 2007, Tom Lane wrote:

 Neil Conway [EMAIL PROTECTED] writes:
  * Comments like /* get text type oid, too lazy to do it some other way
  */ do not exactly inspire confidence :)

 Surely the code was just using the TEXTOID macro?  If so, it does not
 require a comment.  If not, it should be fixed ...

Nope, it does get_fn_expr_argtype(fcinfo-flinfo, 0);  The too lazy remark
was that I thought there may be a better way (like the macro you
mentioned) but did not go looking for it because I already had something
that worked that I found in the manual.  If you like, I can put together
another patch that removes the param_type members from the structs and
hardcodes TEXTOID in the proper places.  BTW, should I be calling
get_typlenbyvalalign on TEXTOID or are there macros for those also?


   regards, tom lane

 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?

http://archives.postgresql.org



-- 
We're only in it for the volume.
-- Black Sabbath

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] patch adding new regexp functions

2007-03-17 Thread Neil Conway

Jeremy Drake wrote:

The patch has been sitting in the unapplied patches queue for a while

Barring any objections, I'll apply this tomorrow.

-Neil


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PATCHES] patch adding new regexp functions

2007-02-20 Thread Bruce Momjian

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---


Jeremy Drake wrote:
 On Sun, 18 Feb 2007, Tom Lane wrote:
 
  Jeremy Drake [EMAIL PROTECTED] writes:
   I will rename the functions regexp_split_to_(table|array) and I will add
   an optional limit parameter to the regexp_split_to_table function, for
   consistency and to avoid ordering concerns with LIMIT.
 
  I'd go the other way: get rid of the limit option all 'round.  It seems
  like fairly useless complexity.
 
 OK, here is what is hopefully the final version of this patch.  I have
 renamed the functions as above, and removed the optional limit parameter.
 If there are no further complaints, this should be ready to apply.
 
 -- 
 It is the business of little minds to shrink.
   -- Carl Sandburg
Content-Description: 

[ Attachment, skipping... ]

 
 ---(end of broadcast)---
 TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] patch adding new regexp functions

2007-02-18 Thread Peter Eisentraut
Jeremy Drake wrote:
  As for the argument about array vs setof, I could see doing both to
  end the argument of which one is really superior for any particular
  problem.

 regexp_split(string text, pattern text[, flags text]) returns setof
 text

 regexp_split_array(string text, pattern text[. flags text[, limit
 int]]) returns text[]

Since you are not splitting an array but returning an array, I would 
think that regexp_split_to_array would be better, and the other 
should then be regexp_split_to_table.

But why does the second one have a limit and the first one doesn't?  Is 
this because you rely on the LIMIT clause to do the same?  Is there a 
guarantee that LIMIT on a table function makes a consistent order?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] patch adding new regexp functions

2007-02-18 Thread Jeremy Drake
On Sun, 18 Feb 2007, Jeremy Drake wrote:

 On Sun, 18 Feb 2007, Peter Eisentraut wrote:

   regexp_split(string text, pattern text[, flags text]) returns setof
   text
  
   regexp_split_array(string text, pattern text[. flags text[, limit
   int]]) returns text[]
 
  Since you are not splitting an array but returning an array, I would
  think that regexp_split_to_array would be better, and the other
  should then be regexp_split_to_table.

 OK

  But why does the second one have a limit and the first one doesn't?

I will rename the functions regexp_split_to_(table|array) and I will add
an optional limit parameter to the regexp_split_to_table function, for
consistency and to avoid ordering concerns with LIMIT.

-- 
Sometimes I worry about being a success in a mediocre world.
-- Lily Tomlin

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] patch adding new regexp functions

2007-02-18 Thread Tom Lane
Jeremy Drake [EMAIL PROTECTED] writes:
 I will rename the functions regexp_split_to_(table|array) and I will add
 an optional limit parameter to the regexp_split_to_table function, for
 consistency and to avoid ordering concerns with LIMIT.

I'd go the other way: get rid of the limit option all 'round.  It seems
like fairly useless complexity.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] patch adding new regexp functions

2007-02-17 Thread Peter Eisentraut
Jeremy Drake wrote:
 In case you haven't noticed, I am rather averse to making this return
 text[] because it is much easier in my experience to use the results
 when returned in SETOF rather than text[],

The primary use case I know for string splitting is parsing  
comma/pipe/whatever separated fields into a row structure, and the way 
I see it your API proposal makes that exceptionally difficult.

I don't know what your use case is, though.  All of this is missing 
actual use cases.

 While, if you
 really really wanted a text[], you could use the (fully documented)
 ARRAY(select resultstr from regexp_split(...) order by startpos)
 construct.

I think, however, that we should be providing simple primitives that can 
be combined into complex expressions rather than complex primitives 
that have to be dissected apart to get simple results.

  As for the regexp_matches() function, it seems to me that it
  returns too much information at once.  What is the use case for
  getting all of prematch, fullmatch, matches, and postmatch in one
  call?

 It was requested by David Fetter:
 http://archives.postgresql.org/pgsql-hackers/2007-02/msg00056.php

 It was not horribly difficult to provide, and it seemed reasonable to
 me. I have no need for them personally.

David Fetter has also repeated failed to offer a use case for this, so I 
hesitate to accept this.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-02-17 Thread David Fetter
On Sat, Feb 17, 2007 at 09:02:24AM +0100, Peter Eisentraut wrote:
 Jeremy Drake wrote:
  In case you haven't noticed, I am rather averse to making this return
  text[] because it is much easier in my experience to use the results
  when returned in SETOF rather than text[],
 
 The primary use case I know for string splitting is parsing  
 comma/pipe/whatever separated fields into a row structure, and the way 
 I see it your API proposal makes that exceptionally difficult.
 
 I don't know what your use case is, though.  All of this is missing 
 actual use cases.
 
  While, if you
  really really wanted a text[], you could use the (fully documented)
  ARRAY(select resultstr from regexp_split(...) order by startpos)
  construct.
 
 I think, however, that we should be providing simple primitives that can 
 be combined into complex expressions rather than complex primitives 
 that have to be dissected apart to get simple results.
 
   As for the regexp_matches() function, it seems to me that it
   returns too much information at once.  What is the use case for
   getting all of prematch, fullmatch, matches, and postmatch in one
   call?
 
  It was requested by David Fetter:
  http://archives.postgresql.org/pgsql-hackers/2007-02/msg00056.php
 
  It was not horribly difficult to provide, and it seemed reasonable
  to me. I have no need for them personally.
 
 David Fetter has also repeated failed to offer a use case for this,
 so I hesitate to accept this.

What is it about having the whole match, pre-match and post-match
available that you're objecting to?  Are you saying there aren't
common uses for any or all of these?  Regular expression users use
them all over the place, and adding this capability to SQL seems like
a reasonable next step :)

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] patch adding new regexp functions

2007-02-17 Thread Jeremy Drake
On Sat, 17 Feb 2007, Peter Eisentraut wrote:

 Jeremy Drake wrote:
  In case you haven't noticed, I am rather averse to making this return
  text[] because it is much easier in my experience to use the results
  when returned in SETOF rather than text[],

 The primary use case I know for string splitting is parsing
 comma/pipe/whatever separated fields into a row structure, and the way
 I see it your API proposal makes that exceptionally difficult.

For this case see string_to_array:
http://developer.postgresql.org/pgdocs/postgres/functions-array.html
select string_to_array('a|b|c', '|');
 string_to_array
-
 {a,b,c}
(1 row)


 I don't know what your use case is, though.  All of this is missing
 actual use cases.

The particular use case I had for this function was at a previous
employer, and I am not sure exactly how much detail is appropriate to
divulge.  Basically, the project was doing some text processing inside of
postgres, and getting all of the words from a string into a table with
some processing (excluding stopwords and so forth) as efficiently as
possible was a big concern.

The regexp_split function code was based on some code that a friend of
mine wrote which used PCRE rather than postgres' internal regexp support.
I don't know exactly what his use-case was, but he probably had
one because he wrote the function and had it returning SETOF text ;)
Perhaps he can share a general idea of what it was (nudge nudge)?

  While, if you
  really really wanted a text[], you could use the (fully documented)
  ARRAY(select resultstr from regexp_split(...) order by startpos)
  construct.

 I think, however, that we should be providing simple primitives that can
 be combined into complex expressions rather than complex primitives
 that have to be dissected apart to get simple results.

The most simple primitive is string_to_array(text, text) returns text[],
but it was not sufficient for our needs.

   As for the regexp_matches() function, it seems to me that it
   returns too much information at once.  What is the use case for
   getting all of prematch, fullmatch, matches, and postmatch in one
   call?
 
  It was requested by David Fetter:
  http://archives.postgresql.org/pgsql-hackers/2007-02/msg00056.php
 
  It was not horribly difficult to provide, and it seemed reasonable to
  me. I have no need for them personally.

 David Fetter has also repeated failed to offer a use case for this, so I
 hesitate to accept this.

I have no strong opinion either way, so I will let those who do argue it
out and wait for the dust to settle ;)

-- 
The Law, in its majestic equality, forbids the rich, as well as the
poor, to sleep under the bridges, to beg in the streets, and to steal
bread.
-- Anatole France

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] patch adding new regexp functions

2007-02-17 Thread Peter Eisentraut
David Fetter wrote:
 What is it about having the whole match, pre-match and post-match
 available that you're objecting to?  Are you saying there aren't
 common uses for any or all of these?  Regular expression users use
 them all over the place,

You keep saying that, and I keep saying please show a use case.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-02-17 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 David Fetter wrote:
 What is it about having the whole match, pre-match and post-match
 available that you're objecting to?  Are you saying there aren't
 common uses for any or all of these?  Regular expression users use
 them all over the place,

 You keep saying that, and I keep saying please show a use case.

Maybe I'm missing something, but ISTM that given the ability to return
multiple match substrings there is no need for such features.  Given
an arbitrary regex 'xyz', write '^(.*)(xyz)(.*)$' and you'll get back
the pre-match, whole match, and post-match in addition to any
parenthesized substrings that 'xyz' contains.  If you only need some
of them, you can leave out the corresponding parts of this pattern.

So I'd vote against complicating the API in order to make special
provision for these results.  I claim that all we need is a function
taking (string text, pattern text, flags text) and returning either
array of text or setof text containing the matched substrings in
whatever order is standard (order by left-parenthesis position,
I think).  In the degenerate case where there are no parenthesized
subpatterns, just return the whole match as a single element.

As for the argument about array vs setof, I could see doing both to
end the argument of which one is really superior for any particular
problem.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] patch adding new regexp functions

2007-02-17 Thread Peter Eisentraut
Jeremy Drake wrote:
 For this case see string_to_array:

That only works with fixed delimiters, and I was hoping that 
regexp_split would be an alternative with regexp delimiters.  It's 
certainly another argument that all the split functions in PostgreSQL 
should offer analogous interfaces.  There is no reason being offered 
why splitting on a fixed string should result in an array and splitting 
on a regexp should result in a table.

 The particular use case I had for this function was at a previous
 employer, and I am not sure exactly how much detail is appropriate to
 divulge.  Basically, the project was doing some text processing
 inside of postgres, and getting all of the words from a string into a
 table with some processing (excluding stopwords and so forth) as
 efficiently as possible was a big concern.

We already have tsearch for that.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] patch adding new regexp functions

2007-02-17 Thread Jeremy Drake
Finally the voice of reason :)
On Sat, 17 Feb 2007, Tom Lane wrote:

 So I'd vote against complicating the API in order to make special
 provision for these results.  I claim that all we need is a function
 taking (string text, pattern text, flags text) and returning either
 array of text or setof text

For this function, it would be setof array of text, as the capture groups
would definitely go in an array, but if you asked for global in the flags,
there could be more than one match in the string.

 containing the matched substrings in
 whatever order is standard (order by left-parenthesis position,
 I think).  In the degenerate case where there are no parenthesized
 subpatterns, just return the whole match as a single element.

Good idea, that did not occur to me.  I was planning to throw an error in
that case.

 As for the argument about array vs setof, I could see doing both to
 end the argument of which one is really superior for any particular
 problem.

The array vs setof argument was on the split function.  I will work on
doing both.  Any idea how the name and/or arguments should differ?  I
think that an optional limit argument for the array version like perl has
would be reasonable, but a name difference in the functions is probably
necessary to avoid confusion.


-- 
Speaking of Godzilla and other things that convey horror:

With a purposeful grimace and a Mongo-like flair
He throws the spinning disk drives in the air!
And he picks up a Vax and he throws it back down
As he wades through the lab making terrible sounds!
Helpless users with projects due
Scream My God! as he stomps on the tape drives, too!

Oh, no!  He says Unix runs too slow!  Go, go, DECzilla!
Oh, yes!  He's gonna bring up VMS!  Go, go, DECzilla!

* VMS is a trademark of Digital Equipment Corporation
* DECzilla is a trademark of Hollow Chocolate Bunnies of Death, Inc.
-- Curtis Jackson

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] patch adding new regexp functions

2007-02-17 Thread Mark Dilger

Jeremy Drake wrote:

The regexp_split function code was based on some code that a friend of
mine wrote which used PCRE rather than postgres' internal regexp support.
I don't know exactly what his use-case was, but he probably had
one because he wrote the function and had it returning SETOF text ;)
Perhaps he can share a general idea of what it was (nudge nudge)?


db=# CREATE OR REPLACE FUNCTION split(p TEXT, t TEXT) RETURNS SETOF TEXT AS $$
db$# my ($p, $t) = @_;
db$# return [ split(/$p/,$t) ];
db$# $$ LANGUAGE plperl;
CREATE FUNCTION
Time: 1.254 ms
db=# select distinct word from (select * from split('\\W+','mary had a little 
lamb, whose fleece was black as soot') as word) as ss;

  word

 a
 as
 black
 fleece
 had
 lamb
 little
 mary
 soot
 was
 whose
(11 rows)

Time: 30.517 ms



As you can see, this can easily be done with a plperl function.  Some people may 
not want to install plperl, or may not want to allow arbitrary patterns to be 
handed to perl in this fashion.  That was not my concern.  I was simply trying 
to see if I could make it faster in a C-language coded function.


In the end I dropped the project because the plperl function works fast enough 
for me and I don't have any objection to plperl from a security standpoint, etc.


mark

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-02-17 Thread Tom Lane
Jeremy Drake [EMAIL PROTECTED] writes:
 On Sat, 17 Feb 2007, Tom Lane wrote:
 So I'd vote against complicating the API in order to make special
 provision for these results.  I claim that all we need is a function
 taking (string text, pattern text, flags text) and returning either
 array of text or setof text

 For this function, it would be setof array of text, as the capture groups
 would definitely go in an array, but if you asked for global in the flags,
 there could be more than one match in the string.

Oh, right.  And you could do a 2-D array if you wanted it all in one
blob (or a guarantee of order).  So no need for record-returning functions?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] patch adding new regexp functions

2007-02-17 Thread Jeremy Drake
On Sat, 17 Feb 2007, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  On Sat, 17 Feb 2007, Tom Lane wrote:
  So I'd vote against complicating the API in order to make special
  provision for these results.  I claim that all we need is a function
  taking (string text, pattern text, flags text) and returning either
  array of text or setof text

  For this function, it would be setof array of text, as the capture groups
  would definitely go in an array, but if you asked for global in the flags,
  there could be more than one match in the string.

 Oh, right.  And you could do a 2-D array if you wanted it all in one
 blob (or a guarantee of order).

I don't think that there is much of an argument for having this one return
a 2d array, in this particular case, even perl requires you to build a
loop, IIRC.

my $str = 'foobarbecuebazilbarf';
while($str=~/(b[^b]+)(b[^b]+)/g) {
print $1, \t, length($1), \n;
print $2, \t, length($2), \n;
print ---\n;
}

bar 3
becue   5
---
bazil   5
barf4
---

 So no need for record-returning functions?
No.


-- 
Go placidly amid the noise and waste, and remember what value there may
be in owning a piece thereof.
-- National Lampoon, Deteriorata

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] patch adding new regexp functions

2007-02-16 Thread Peter Eisentraut
Am Freitag, 16. Februar 2007 08:02 schrieb Jeremy Drake:
 On Thu, 15 Feb 2007, Peter Eisentraut wrote:
  I have no strong opinion about how matches are returned.  Seeing the
  definitional difficulties that you point out, it may be fine to return
  them unordered.  But then all matches functions should do that.
 
  For the split functions, however, providing the order is clearly
  important.

 Does this version sufficiently address your concerns?

I don't think anyone asked for the start position and length in the result of 
regexp_split().  The result should be an array of text.  That is what Perl et 
al. do.

As for the regexp_matches() function, it seems to me that it returns too much 
information at once.  What is the use case for getting all of prematch, 
fullmatch, matches, and postmatch in one call?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] patch adding new regexp functions

2007-02-16 Thread David Fetter
On Fri, Feb 16, 2007 at 05:54:47PM +0100, Peter Eisentraut wrote:
 Am Freitag, 16. Februar 2007 17:11 schrieb David Fetter:
   As for the regexp_matches() function, it seems to me that it
   returns too much information at once.  What is the use case for
   getting all of prematch, fullmatch, matches, and postmatch in
   one call?
 
  If not in one call, how would you get it?  Perl, for example,
  makes these available to any regex match in the form of variables
  it sets.
 
 The question is, what is the use case?  If there is one in Perl, can
 this proposed function API support it?

Perl makes the following variables available in any regex match,
although it optimizes some cases for when they're not there:

$1, ... $n (captured matches in parentheses)
$` (pre-match)
$' (post-match)
$ (whole match)

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] patch adding new regexp functions

2007-02-16 Thread Andrew Dunstan

David Fetter wrote:


The question is, what is the use case?  If there is one in Perl, can
this proposed function API support it?



Perl makes the following variables available in any regex match,
although it optimizes some cases for when they're not there:

$1, ... $n (captured matches in parentheses)
$` (pre-match)
$' (post-match)
$ (whole match)

  


Use of any of these is notoriously costly, BTW, especially pre-match and 
post-match. See perlre man page.


cheers

andrew

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PATCHES] patch adding new regexp functions

2007-02-16 Thread Peter Eisentraut
David Fetter wrote:
  The question is, what is the use case?  If there is one in Perl,
  can this proposed function API support it?

 Perl makes the following variables available in any regex match,

That is not a use case.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] patch adding new regexp functions

2007-02-16 Thread Jeremy Drake
On Fri, 16 Feb 2007, Peter Eisentraut wrote:

 Am Freitag, 16. Februar 2007 08:02 schrieb Jeremy Drake:
  Does this version sufficiently address your concerns?

 I don't think anyone asked for the start position and length in the result of
 regexp_split().  The result should be an array of text.  That is what Perl et
 al. do.

The length is not returned, I simply call length() on the string result to
make sure that no whitespace gets in.

The start position was suggested in these two messages from Alvaro Herrera:
http://archives.postgresql.org/pgsql-patches/2007-02/msg00276.php
http://archives.postgresql.org/pgsql-patches/2007-02/msg00281.php

This was meant to address your concern about the order not necessarily
being preserved of the split results when being returned as SETOF.  This
gives the benefits of returning SETOF while still allowing the order to be
preserved if desired (simply add ORDER BY startpos to guarantee the
correct order).

In case you haven't noticed, I am rather averse to making this return
text[] because it is much easier in my experience to use the results when
returned in SETOF rather than text[], and in all of the code that I have
experience with where this would be useful I would end up using
information_schema._pg_expandarray (a function that, AFAIK, is
documented nowhere) to convert it into SETOF text.  While, if you really
really wanted a text[], you could use the (fully documented) ARRAY(select
resultstr from regexp_split(...) order by startpos) construct.

 As for the regexp_matches() function, it seems to me that it returns too much
 information at once.  What is the use case for getting all of prematch,
 fullmatch, matches, and postmatch in one call?

It was requested by David Fetter:
http://archives.postgresql.org/pgsql-hackers/2007-02/msg00056.php

It was not horribly difficult to provide, and it seemed reasonable to me.
I have no need for them personally.

-- 
Some people in this department wouldn't recognize subtlety if it hit
them on the head.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Peter Eisentraut
Neil Conway wrote:
 On Wed, 2007-02-14 at 16:49 -0800, Jeremy Drake wrote:
  What was the status of this?  Was there anything else I needed to
  do with this patch, or is it ready to be applied?  Let me know if
  there is anything else I need to do on this...

 Will do -- I'm planning to apply this as soon as I have the free
 cycles to do so, likely tomorrow or Friday.

I don't know which patch is actually being proposed now.  It would be 
good to make this more explicit and maybe include a synopsis of the 
functions in the email, so we know what's going on.

What confuses me about some of the functions I've seen in earlier 
patches in this thread is that they return setof something.  But in my 
mind, regular expression matches or string splits are inherently 
ordered, so an array would be the correct return type.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Jeremy Drake
On Thu, 15 Feb 2007, Peter Eisentraut wrote:

 Neil Conway wrote:
  On Wed, 2007-02-14 at 16:49 -0800, Jeremy Drake wrote:
   What was the status of this?  Was there anything else I needed to
   do with this patch, or is it ready to be applied?  Let me know if
   there is anything else I need to do on this...
 
  Will do -- I'm planning to apply this as soon as I have the free
  cycles to do so, likely tomorrow or Friday.

 I don't know which patch is actually being proposed now.  It would be
 good to make this more explicit and maybe include a synopsis of the
 functions in the email, so we know what's going on.

Sorry, my intent was just to check to see if I had gotten the patch
sufficiently fixed for Neil to apply and he just hadn't gotten to it yet
(which seems to be the case), or if there was something else he still
expected me to fix that I had missed in the prior discussions.  I suppose
I should have emailed him privately.

The patch in question can be seen in the archives here:
http://archives.postgresql.org/pgsql-patches/2007-02/msg00214.php

The functions added are:
* regexp_split(str text, pattern text) RETURNS SETOF text
  regexp_split(str text, pattern text, flags text) RETURNS SETOF text
   returns each section of the string delimited by the pattern.
* regexp_matches(str text, pattern text) RETURNS text[]
   returns all capture groups when matching pattern against string in an
   array
* regexp_matches(str text, pattern text, flags text) RETURNS SETOF
(prematch text, fullmatch text, matches text[], postmatch text)
   returns all capture groups when matching pattern against string in an
   array.  also returns the entire match in fullmatch.  if the 'g' option
   is given, returns all matches in the string.  if the 'r' option is
   given, also return the text before and after the match in prematch and
   postmatch respectively.

 What confuses me about some of the functions I've seen in earlier
 patches in this thread is that they return setof something.  But in my
 mind, regular expression matches or string splits are inherently
 ordered, so an array would be the correct return type.

They do return SETOF.  Addressing them separately:

regexp_matches uses a text[] for the match groups.  If you specify the
global flag, it could return multiple matches.  Couple this with the
requested feature of pre- and postmatch returns (with its own flag) and
the return would turn into some sort of nasty array of tuples, or multiple
arrays.  It seems much cleaner to me to return a set of the matches found,
and I find which order the matches occur in to be much less important than
the fact that they did occur and their contents.


regexp_split returns setof text.  This has, in my opinion, a much greater
case to return an array.  However, there are several issues with this
approach:

# My experience with the array code leads me to believe that building up
an array is an expensive proposition.  I know I could code it smarter so
that the array is only constructed in the end.

# With a set-returning function, it is possible to add a LIMIT clause, to
prevent splitting up more of the string than is necessary.  It is also
immediately possible to insert them into a table, or do grouping on them,
or call a function on each value.  Most of the time when I do a split, I
intend to do something like this with each split value.

# You can still get an array if you really want it:
#* SELECT ARRAY(SELECT * FROM regexp_split('first, second, third', E',\\s*'))



-- 
No problem is so formidable that you can't just walk away from it.
-- C. Schulz

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Peter Eisentraut
Jeremy Drake wrote:
 regexp_matches uses a text[] for the match groups.  If you specify
 the global flag, it could return multiple matches.  Couple this with
 the requested feature of pre- and postmatch returns (with its own
 flag) and the return would turn into some sort of nasty array of
 tuples, or multiple arrays.  It seems much cleaner to me to return a
 set of the matches found, and I find which order the matches occur in
 to be much less important than the fact that they did occur and their
 contents.

Then the fact that the flag-less matches function returns an array would 
be a mistake.  They have to return the same category of object.

 regexp_split returns setof text.  This has, in my opinion, a much
 greater case to return an array.  However, there are several issues
 with this approach:

Any programming language I have ever seen returns the result of a 
regular expression split as a structure with order.  That in turn 
implies that there are use cases for having the order, which your 
proposed function could not address.

 # My experience with the array code leads me to believe that building
 up an array is an expensive proposition.  I know I could code it
 smarter so that the array is only constructed in the end.

You can make any code arbitrarily fast if it doesn't have to give the 
right answer.

 # With a set-returning function, it is possible to add a LIMIT
 clause, to prevent splitting up more of the string than is necessary.

You can also add such functionality to a function in form of a 
parameter.  In fact, relying on a LIMIT clause to do this seems pretty 
fragile.  We argue elsewhere that LIMIT without ORDER BY is not 
well-defined, and while it's hard to imagine in the current 
implementation why the result of a set returning function would come 
back in arbitrary order, it is certainly possible in theory, so you 
still need to order the result set if you want reliable limits, but 
that is not possible of the order is lost in the result.

  It is also immediately possible to insert them into a table, or do
 grouping on them, or call a function on each value.  Most of the time
 when I do a split, I intend to do something like this with each split
 value.

These sort of arguments remind me of the contrib/xml2 module, which also 
has a very, uh, pragmatic API.  Sure, these operations may seem useful 
to you.  But when we offer a function as part of the core API, it is 
also important that we offer a clean design that allows other users to 
combine reasonably orthogonal functionality into tools that are useful 
to them.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Alvaro Herrera
Jeremy Drake wrote:

 The functions added are:
 * regexp_split(str text, pattern text) RETURNS SETOF text
   regexp_split(str text, pattern text, flags text) RETURNS SETOF text
returns each section of the string delimited by the pattern.
 * regexp_matches(str text, pattern text) RETURNS text[]
returns all capture groups when matching pattern against string in an
array
 * regexp_matches(str text, pattern text, flags text) RETURNS SETOF
 (prematch text, fullmatch text, matches text[], postmatch text)
returns all capture groups when matching pattern against string in an
array.  also returns the entire match in fullmatch.  if the 'g' option
is given, returns all matches in the string.  if the 'r' option is
given, also return the text before and after the match in prematch and
postmatch respectively.

I think the position the match is in could be important.  I'm wondering
if you could define them like

create type re_match(match text, position int)
regexp_split(str text, pattern text) returns setof re_match

or maybe
regexp_split(str text, pattern text, OUT match text, OUT position int);
(not sure of the exact syntax for this one)

so that you would have the position for each match, automatically.  Is
this information available in the regex code?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 Jeremy Drake wrote:
 # My experience with the array code leads me to believe that building
 up an array is an expensive proposition.  I know I could code it
 smarter so that the array is only constructed in the end.

 You can make any code arbitrarily fast if it doesn't have to give the 
 right answer.

Even more to the point, it's folly to suppose that the overhead of
processing a SETOF result is less than that of array construction.

I tend to agree with Peter's concern that returning a set is going to
make it hard to track which result is which.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 so that you would have the position for each match, automatically.  Is
 this information available in the regex code?

Certainly, that's where we got the text snippets from to begin with.
However, I'm not sure that this is important enough to justify a special
type --- for one thing, since we don't have arrays of composites, that
would foreclose responding to Peter's concern that SETOF is the wrong
thing.  If you look at the Perl and Tcl APIs for regexes, they return
just the strings, not the numerical positions; and I've not heard anyone
complaining about that.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  so that you would have the position for each match, automatically.  Is
  this information available in the regex code?
 
 Certainly, that's where we got the text snippets from to begin with.
 However, I'm not sure that this is important enough to justify a special
 type --- for one thing, since we don't have arrays of composites, that
 would foreclose responding to Peter's concern that SETOF is the wrong
 thing.

My point is that if you want to have the order in which the matches were
found, you can do that easily by looking at the positions; no need to
create an ordered array.  Which does respond to Peter's concern, since
the point was to keep the ordering of matches, which an array does; but
if we provide the positions, the SETOF way does as well.

On the other hand, I don't think it's impossible to have matches that
start earlier than others in the string, but are actually found later
(say, because they are a parentized expression that ends later).  So
giving the starting positions allows one to know where are they located,
rather than where were they reported.  (I don't really know if the
matches are sorted before reporting though.)

 If you look at the Perl and Tcl APIs for regexes, they return
 just the strings, not the numerical positions; and I've not heard anyone
 complaining about that.

I know, but that may be just because it would be too much extra
complexity for them (in terms of user API) to be returning the positions
along the text.  I know I'd be fairly annoyed if =~ in Perl returned an
array of hashes { text = 'foo', position = 42} instead of array of
text.  We don't have that problem.

In fact, I would claim that's much easier to deal with a SETOF function
than is to deal with text[].

Regarding the nobody complains argument, I don't find that
particularly compelling; witness how people gets used to working around
limitations in MySQL ... ;-)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Peter Eisentraut
Alvaro Herrera wrote:
 On the other hand, I don't think it's impossible to have matches that
 start earlier than others in the string, but are actually found later
 (say, because they are a parentized expression that ends later).  So
 giving the starting positions allows one to know where are they
 located, rather than where were they reported.  (I don't really know
 if the matches are sorted before reporting though.)

I have no strong opinion about how matches are returned.  Seeing the 
definitional difficulties that you point out, it may be fine to return 
them unordered.  But then all matches functions should do that.

For the split functions, however, providing the order is clearly 
important.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread David Fetter
On Thu, Feb 15, 2007 at 10:37:26AM -0500, Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  so that you would have the position for each match, automatically.  Is
  this information available in the regex code?
 
 Certainly, that's where we got the text snippets from to begin with.
 However, I'm not sure that this is important enough to justify a
 special type --- for one thing, since we don't have arrays of
 composites,

This is a TODO :)

I've obviously misunderstood the scope of the TODO because it appears
that an INSERT into pg_type at creation time for compound types that
looks something like the below would do it.  What have I missed?

INSERT INTO pg_type
VALUES (
'_foo', /* Generated by makeArrayTypeName */
16744,  /* OID of schema */
10, /* OID of owner of the base type */
-1, /* typlen indicates varlena */
'f',/* not passed by value */
'c',/* typtype is composite */
't',/* type is already defined */
',',/* typdelim */
0,  /* should this actually refer to the type? */
'foo'::regtype, /* typelem */
'array_in', /* typinput */
'array_out',/* typoutput */
'array_recv',   /* typreceive */
'array_send',   /* typsend */
0,  /* typanalyze */
'i',/* typalign.  Should this be 'd'? */
'x',/* typstorage */
'f',/* not a DOMAIN, but while we're at it, why not arrays of 
DOMAIN? */
0,  /* base type. should this be different? */
-1, /* no typmod */
0   /* dims not specified */
);


 that would foreclose responding to Peter's concern that SETOF is the
 wrong thing.  If you look at the Perl and Tcl APIs for regexes, they
 return just the strings, not the numerical positions; and I've not
 heard anyone complaining about that.

They do return them in the order in which they appear, though, which,
as far as I can tell, Jeremy's functions also do.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Jeremy Drake
On Thu, 15 Feb 2007, Alvaro Herrera wrote:

 Jeremy Drake wrote:

  The functions added are:
  * regexp_split(str text, pattern text) RETURNS SETOF text
regexp_split(str text, pattern text, flags text) RETURNS SETOF text
 returns each section of the string delimited by the pattern.
  * regexp_matches(str text, pattern text) RETURNS text[]
 returns all capture groups when matching pattern against string in an
 array
  * regexp_matches(str text, pattern text, flags text) RETURNS SETOF
  (prematch text, fullmatch text, matches text[], postmatch text)
 returns all capture groups when matching pattern against string in an
 array.  also returns the entire match in fullmatch.  if the 'g' option
 is given, returns all matches in the string.  if the 'r' option is
 given, also return the text before and after the match in prematch and
 postmatch respectively.

 I think the position the match is in could be important.  I'm wondering
 if you could define them like

 create type re_match(match text, position int)
 regexp_split(str text, pattern text) returns setof re_match

So it looks like the issues are:
1. regexp_matches without flags has a different return type than
   regexp_matches with flags.  I can make them return the same OUT
   parameters, but should I declare it as returning SETOF when I know
   for a fact that the no-flags version will never return more than one
   row?

2. regexp_split does not represent the order of the results.  I can do
   something like:

 regexp_split(str text, pattern text[, flags text], OUT result text, OUT
startpos int) RETURNS SETOF record;

It could also have the int being a simple counter to represent the
relative order, rather than the position.


Thoughts?  Do these changes address the issues recently expressed?




-- 
I have yet to see any problem, however complicated, which, when looked
at in the right way, did not become still more complicated.
-- Poul Anderson

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Tom Lane
David Fetter [EMAIL PROTECTED] writes:
 I've obviously misunderstood the scope of the TODO because it appears
 that an INSERT into pg_type at creation time for compound types that
 looks something like the below would do it.  What have I missed?

There are a couple of issues.  One is that we probably don't want two
pg_type entries for every single table.  Will you be satisfied if only
CREATE TYPE AS ... makes an array type?  The other is that, at least at
the time they were written, the array support routines couldn't handle
composite array values.  Things might or might not be easier today;
I don't think we had record_in and record_out in their current form
then.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread David Fetter
On Thu, Feb 15, 2007 at 07:35:46PM -0500, Tom Lane wrote:
 David Fetter [EMAIL PROTECTED] writes:
  I've obviously misunderstood the scope of the TODO because it appears
  that an INSERT into pg_type at creation time for compound types that
  looks something like the below would do it.  What have I missed?
 
 There are a couple of issues.  One is that we probably don't want
 two pg_type entries for every single table.

Now that you mention it, I would want that if that's what it takes to
get arrays for them.  The long-term goal here is to make all of
PostgreSQL's types play nicely together.  I'm guessing that SETOF
will eventually be a way to describe a collection because MULTISET is
in SQL:2003.

 Will you be satisfied if only CREATE TYPE AS ... makes an array
 type?  The other is that, at least at the time they were written,
 the array support routines couldn't handle composite array values.
 Things might or might not be easier today; I don't think we had
 record_in and record_out in their current form then.

OK.  What about pg_depend?

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Andrew Dunstan



Tom Lane wrote:

David Fetter [EMAIL PROTECTED] writes:
  

I've obviously misunderstood the scope of the TODO because it appears
that an INSERT into pg_type at creation time for compound types that
looks something like the below would do it.  What have I missed?



There are a couple of issues.  One is that we probably don't want two
pg_type entries for every single table.  Will you be satisfied if only
CREATE TYPE AS ... makes an array type?  
  


There should be some better way to create the array type for tables than 
directly mangling pg_type, though. Maybe a builtin function that took an 
oid?


cheers

andrew

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PATCHES] patch adding new regexp functions

2007-02-14 Thread Jeremy Drake
What was the status of this?  Was there anything else I needed to do with
this patch, or is it ready to be applied?  Let me know if there is
anything else I need to do on this...

-- 
What this country needs is a good five cent nickel.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] patch adding new regexp functions

2007-02-14 Thread Neil Conway
On Wed, 2007-02-14 at 16:49 -0800, Jeremy Drake wrote:
 What was the status of this?  Was there anything else I needed to do with
 this patch, or is it ready to be applied?  Let me know if there is
 anything else I need to do on this...

Will do -- I'm planning to apply this as soon as I have the free cycles
to do so, likely tomorrow or Friday.

-Neil



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-02-09 Thread Neil Conway
On Fri, 2007-02-09 at 01:08 -0800, Jeremy Drake wrote:
 Yeah, I try to do that, but this one just barely passed my personal
 compression threshold.  Guess I should raise my threshold :)

No, don't pay any attention to me, I'm just lazy :)

 Here is a new version of the patch which fixes up the documentation a
 little (should have read it over again before posting).

The doing_srf business is rather tortured, and seems an invitation for
bugs. ISTM there should be a cleaner way to implement this. For example,
would it be possible to put all the common logic into one or more
additional functions, and then have SRF vs. non-SRF cases that call into
those functions after doing the appropriate initialization?

-Neil



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] patch adding new regexp functions

2007-02-09 Thread Neil Conway
On Fri, 2007-02-09 at 16:33 -0800, Jeremy Drake wrote:
 Here is a new version of the patch which eliminates the doing_srf stuff.

* C89 require constant-sized stack allocated arrays, so the coding in
perform_regexp_matches() is non-portable.

* I'm not clear about the control flow in regexp_matches() and
regexp_split(). Presumably it's not possible for the call_cntr to
actually exceed max_calls, so the error message in these cases should be
elog(ERROR), not ereport (the former is for shouldn't happen bug
scenarios, the latter is for user-facing errors). Can you describe the
logic here (e.g. via comments) a bit more?

* The logic in regexp_split (incremented_offset, first_match, etc.) is
pretty ugly and hard to follow. The if condition on line 1037 is
particularly objectionable. Again, ISTM there should be a cleaner way to
do this.

* Try to keep lines to 80 characters or fewer. If the code is wandering
off the right side of the screen all the time, that might be a hint that
it needs simplification.

Attached is a cleaned up version of your patch -- various improvements
throughout, but mostly cosmetic stuff. Do you want to take a look at the
above?

-Neil

Index: doc/src/sgml/func.sgml
===
RCS file: /home/neilc/postgres/cvs_root/pgsql/doc/src/sgml/func.sgml,v
retrieving revision 1.357
diff -c -p -r1.357 func.sgml
*** doc/src/sgml/func.sgml	1 Feb 2007 00:28:16 -	1.357
--- doc/src/sgml/func.sgml	10 Feb 2007 03:20:14 -
***
*** 1446,1451 
--- 1446,1464 
/row
  
row
+entryliteralfunctionregexp_matches/function(parameterstring/parameter typetext/type, parameterpattern/parameter typetext/type [,parameterflags/parameter typetext/type])/literal/entry
+entrytypetext[]/type or typesetof record/type (if flags are given)/entry
+entry
+ Return all capture groups resulting from matching POSIX regular
+ expression against the parameterstring/parameter. See
+ xref linkend=functions-matching for more information on pattern
+ matching.
+/entry
+entryliteralregexp_matches('foobarbequebaz', '(bar)(beque)')/literal/entry
+entryliteral{bar,beque}/literal/entry
+   /row
+ 
+   row
 entryliteralfunctionregexp_replace/function(parameterstring/parameter typetext/type, parameterpattern/parameter typetext/type, parameterreplacement/parameter typetext/type [,parameterflags/parameter typetext/type])/literal/entry
 entrytypetext/type/entry
 entry
***
*** 1458,1463 
--- 1471,1488 
/row
  
row
+entryliteralfunctionregexp_split/function(parameterstring/parameter typetext/type, parameterpattern/parameter typetext/type [,parameterflags/parameter typetext/type])/literal/entry
+entrytypesetof text/type/entry
+entry
+ Splits parameterstring/parameter using POSIX regular expression as
+ the delimiter.  See xref linkend=functions-matching for more
+ information on pattern matching.
+/entry
+entryliteralregexp_split('hello world', E'\\s+')/literal/entry
+entryliteralhello/literalparaliteralworld/literal/para (2 rows)/entry
+   /row
+ 
+   row
 entryliteralfunctionrepeat/function(parameterstring/parameter typetext/type, parameternumber/parameter typeint/type)/literal/entry
 entrytypetext/type/entry
 entryRepeat parameterstring/parameter the specified
*** cast(-44 as bit(12))   lineanno
*** 2864,2869 
--- 2889,2900 
 indexterm
  primaryregexp_replace/primary
 /indexterm
+indexterm
+ primaryregexp_matches/primary
+/indexterm
+indexterm
+ primaryregexp_split/primary
+/indexterm
  
  synopsis
  replaceablestring/replaceable SIMILAR TO replaceablepattern/replaceable optionalESCAPE replaceableescape-character/replaceable/optional
*** substring('foobar' from 'o(.)b')   line
*** 3112,3118 
   string containing zero or more single-letter flags that change the
   function's behavior.  Flag literali/ specifies case-insensitive
   matching, while flag literalg/ specifies replacement of each matching
!  substring rather than only the first one.
  /para
  
 para
--- 3143,3152 
   string containing zero or more single-letter flags that change the
   function's behavior.  Flag literali/ specifies case-insensitive
   matching, while flag literalg/ specifies replacement of each matching
!  substring rather than only the first one.  Other supported flags are
!  literalm/, literaln/, literalp/, literalw/ and
!  literalx/, whose meanings correspond to those shown in
!  xref linkend=posix-embedded-options-table.
  /para
  
 para
*** regexp_replace('foobarbaz', 'b(..)', E'X
*** 3127,3132 
--- 3161,3341 
  /programlisting
 /para
  
+ para
+  The 

Re: [PATCHES] patch adding new regexp functions

2007-02-08 Thread Neil Conway
On Thu, 2007-02-08 at 19:22 -0800, Jeremy Drake wrote:
 I have added documentation for the functions in this patch.  At this point
 I am ready to submit this patch for the review and application process.
 Please let me know if you find any issues with it.

Barring any objections, I'll review and apply this patch in the next few
days.

BTW, unless the patch is truly large, leaving the patch uncompressed
makes it easier to eyeball it without leaving my mail client. That's
just personal preference, though...

-Neil



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-02-08 Thread Alvaro Herrera
Neil Conway wrote:

 BTW, unless the patch is truly large, leaving the patch uncompressed
 makes it easier to eyeball it without leaving my mail client. That's
 just personal preference, though...

FWIW, I just do a | zcat | less, which shows it uncompressed.  No big
deal.  I don't know if Evolution can do that ... (it surprised me that
Thunderbird doesn't seem to be capable of searching a message based on
Message-Id).

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] patch adding new regexp functions

2007-02-08 Thread Bruce Momjian
Alvaro Herrera wrote:
 Neil Conway wrote:
 
  BTW, unless the patch is truly large, leaving the patch uncompressed
  makes it easier to eyeball it without leaving my mail client. That's
  just personal preference, though...
 
 FWIW, I just do a | zcat | less, which shows it uncompressed.  No big
 deal.  I don't know if Evolution can do that ... (it surprised me that
 Thunderbird doesn't seem to be capable of searching a message based on
 Message-Id).

My .mailcap calls a script runs 'file' and tries to display it properly
automatically, and it worked fine for his attachment.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly