Re: [PATCHES] regexp_replace
Patch applied.  Thanks.

---

Atsushi Ogawa wrote:
>
> Bruce Momjian wrote:
> > I have applied your patch, with slight adjustments in spacing and
> > documentation.
> >
> > Patch applied.  Thanks.
>
> Thank you for applying the patch.
> The attached patch is a small additional improvement.
>
> This patch uses appendStringInfoText instead of appendStringInfoString.
> There is the overhead of PG_TEXT_GET_STR when appendStringInfoString is
> called on a text value. appendStringInfoText reduces this overhead.
>
> regards,
>
> Atsushi Ogawa

[ Attachment, skipping... ]

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org
Re: [PATCHES] regexp_replace
Bruce Momjian wrote:
> I have applied your patch, with slight adjustments in spacing and
> documentation.
>
> Patch applied.  Thanks.

Thank you for applying the patch.
The attached patch is a small additional improvement.

This patch uses appendStringInfoText instead of appendStringInfoString.
There is the overhead of PG_TEXT_GET_STR when appendStringInfoString is
called on a text value. appendStringInfoText reduces this overhead.

regards,

Atsushi Ogawa

[ Attachment: varlena.patch ]
Re: [PATCHES] regexp_replace
The change below has broken tsearch2. See for example
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=shrew&dt=2005-07-10%2015:02:01

cheers

andrew

Bruce Momjian wrote:

I have applied your patch, with slight adjustments in spacing and
documentation.

Patch applied.  Thanks.

[snip]

Index: src/include/regex/regex.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/regex/regex.h,v
retrieving revision 1.26
diff -c -c -r1.26 regex.h
*** src/include/regex/regex.h   29 Nov 2003 22:41:10 -0000   1.26
--- src/include/regex/regex.h   10 Jul 2005 04:52:51 -0000
***************
*** 163,169 ****
   * the prototypes for exported functions
   */
  extern int  pg_regcomp(regex_t *, const pg_wchar *, size_t, int);
! extern int  pg_regexec(regex_t *, const pg_wchar *, size_t, rm_detail_t *, size_t, regmatch_t[], int);
  extern void pg_regfree(regex_t *);
  extern size_t pg_regerror(int, const regex_t *, char *, size_t);

--- 163,169 ----
   * the prototypes for exported functions
   */
  extern int  pg_regcomp(regex_t *, const pg_wchar *, size_t, int);
! extern int  pg_regexec(regex_t *, const pg_wchar *, size_t, size_t, rm_detail_t *, size_t, regmatch_t[], int);
  extern void pg_regfree(regex_t *);
  extern size_t pg_regerror(int, const regex_t *, char *, size_t);
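The extra size_t argument in the new pg_regexec prototype is the search start offset that the applied patch names search_start. A minimal sketch of why a replace-all loop wants an offset rather than a shortened string, written in Python for brevity and not taken from the patch: Python's Pattern.search(string, pos) likewise searches from an offset while anchors such as '^' still refer to the real beginning of the string. The function name replace_all_with_offset is hypothetical, and Python's regex dialect differs from the backend's.

```python
import re


def replace_all_with_offset(source, pattern, replacement):
    """Replace every match by searching repeatedly from a moving offset.

    Hypothetical helper for illustration only; `replacement` is inserted
    literally (no back references are expanded here).
    """
    regex = re.compile(pattern)
    out = []
    search_start = 0  # plays the role of the patch's search_start argument
    while search_start <= len(source):
        # Search the ORIGINAL string from the offset, not a slice of it,
        # so '^' keeps meaning "start of the whole string".
        m = regex.search(source, search_start)
        if m is None:
            break
        out.append(source[search_start:m.start()])
        out.append(replacement)
        if m.end() == m.start():
            # Empty match: copy one character and step past it,
            # otherwise the loop would never advance.
            out.append(source[m.start():m.start() + 1])
            search_start = m.end() + 1
        else:
            search_start = m.end()
    out.append(source[search_start:])
    return "".join(out)


print(replace_all_with_offset("aaa", "^a", "b"))
print(replace_all_with_offset("a1b22c", r"\d+", "#"))
```

Had the loop searched source[search_start:] instead, the '^a' pattern would have matched again at the start of every slice, which is the kind of surprise an explicit start offset avoids.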
Re: [PATCHES] regexp_replace
I have applied your patch, with slight adjustments in spacing and
documentation.

Patch applied.  Thanks.

---

Atsushi Ogawa wrote:
>
> I made the patch that implements regexp_replace again.
> The specification of this function is as follows.
>
> regexp_replace(source text, pattern text, replacement text, [flags text])
> returns text
>
> Replaces the substring of source that matches the regular expression
> pattern with replacement.
>
>  - pattern is a regular expression pattern.
>  - replacement is the replacement string, which can use '\1'-'\9' and '\&'.
>     '\1'-'\9': back reference to the n'th subexpression.
>     '\&'     : the entire matched string.
>  - flags can use the following values:
>     g: global (replace all)
>     i: ignore case
>     When flags is not specified, matching is case sensitive and only
>     the first instance is replaced.
>
> regards,
>
> --- Atsushi Ogawa

[ Attachment, skipping... ]

--
  Bruce Momjian

Index: doc/src/sgml/func.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/func.sgml,v
retrieving revision 1.263
diff -c -c -r1.263 func.sgml
*** doc/src/sgml/func.sgml   6 Jul 2005 19:02:52 -0000   1.263
--- doc/src/sgml/func.sgml   10 Jul 2005 04:52:43 -0000
***************
*** 1257,1262 ****
--- 1257,1282 ----
+ regexp_replace(source text,
+ pattern text,
+ replacement text
+ , flags text)
+ text
+ Replace string that matches the regular expression
+ pattern in source to
+ replacement.
+ replacement can use \1-\9 and \&.
+ \1-\9 is a back reference to the n'th subexpression, and
+ \& is the entire matched string.
+ flags can use g(global) and i(ignore case).
+ When flags is not specified, case sensitive matching is used, and it replaces
+ only the first instance.
+
+ regexp_replace('1112223333', '(\\d{3})(\\d{3})(\\d{4})', '(\\1) \\2-\\3')
+ (111) 222-3333
+
+
+
  repeat(string text, number integer)
  text
  Repeat string the specified
Index: src/backend/regex/regexec.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/regex/regexec.c,v
retrieving revision 1.24
diff -c -c -r1.24 regexec.c
*** src/backend/regex/regexec.c   29 Nov 2003 19:51:55 -0000   1.24
--- src/backend/regex/regexec.c   10 Jul 2005 04:52:44 -0000
***************
*** 110,115 ****
--- 110,116 ----
      regmatch_t *pmatch;
      rm_detail_t *details;
      chr        *start;          /* start of string */
+     chr        *search_start;   /* search start of string */
      chr        *stop;           /* just past end of string */
      int         err;            /* error code if any (0 none) */
      regoff_t   *mem;            /* memory vector for backtracking */
***************
*** 168,173 ****
--- 169,175 ----
  pg_regexec(regex_t *re,
             const chr *string,
             size_t len,
+            size_t search_start,
             rm_detail_t *details,
             size_t nmatch,
             regmatch_t pmatch[],
***************
*** 219,224 ****
--- 221,227 ----
      v->pmatch = pmatch;
      v->details = details;
      v->start = (chr *) string;
+     v->search_start = (chr *) string + search_start;
      v->stop = (chr *) string + len;
      v->err = 0;
      if (backref)
***************
*** 288,294 ****
      NOERR();
      MDEBUG(("\nsearch at %ld\n", LOFF(v->start)));
      cold = NULL;
!     close = shortest(v, s, v->start, v->start, v->stop, &cold, (int *) NULL);
      freedfa(s);
      NOERR();
      if (v->g->cflags & REG_EXPECT)
--- 291,298 ----
      NOERR();
      MDEBUG(("\nsearch at %ld\n", LOFF(v->start)));
      cold = NULL;
!     close = shortest(v, s, v->search_start, v->search_start, v->stop,
!                      &cold, (int *) NULL);
      freedfa(s);
      NOERR();
      if (v->g->cflags & REG_EXPECT)
***************
*** 415,421 ****

      assert(d != NULL && s != NULL);
      cold = NULL;
!     close = v->start;
      do
      {
          MDEBUG(("\ncsearch at %ld\n", LOFF(close)));
--- 419,425 ----

      assert(d != NULL && s != NULL);
      cold = NULL;
!     close = v->search_start;
      do
      {
          MDEBUG(("\n
Re: [PATCHES] regexp_replace
> >  pg_catalog | replace | text             | text, text, text
> >
>
> I think that regexp_replace is a good name. It is easy to understand.
>
> regards,

I prefer this name too.

Regards
Pavel Stehule
Re: [PATCHES] regexp_replace
Atsushi Ogawa wrote:

> I think that regexp_replace is a good name. It is easy to understand.

I'll go with the flow.

cheers

andrew
Re: [PATCHES] regexp_replace
Bruce Momjian wrote:
> Andrew Dunstan wrote:
> > I'm very glad to see this. But is a nicer name possible? To perl
> > programmers at least, "substitute" should make sense.
>
> What is the matter with replace?  We already have replace:
>
> test=> \df replace
>                    List of functions
>    Schema   |  Name   | Result data type | Argument data types
> ------------+---------+------------------+---------------------
>  pg_catalog | replace | text             | text, text, text
>

I think that regexp_replace is a good name. It is easy to understand.

regards,

--- Atsushi Ogawa
Re: [PATCHES] regexp_replace
Andrew Dunstan wrote:
> I'm very glad to see this. But is a nicer name possible? To perl
> programmers at least, "substitute" should make sense.

What is the matter with replace?  We already have replace:

test=> \df replace
                   List of functions
   Schema   |  Name   | Result data type | Argument data types
------------+---------+------------------+---------------------
 pg_catalog | replace | text             | text, text, text

test=> \df replace

--
  Bruce Momjian
Re: [PATCHES] regexp_replace
I'm very glad to see this. But is a nicer name possible? To perl
programmers at least, "substitute" should make sense.

cheers

andrew

Atsushi Ogawa wrote:

> I made the patch that implements regexp_replace again.
> The specification of this function is as follows.
>
> regexp_replace(source text, pattern text, replacement text, [flags text])
> returns text
>
> Replaces the substring of source that matches the regular expression
> pattern with replacement.
>
>  - pattern is a regular expression pattern.
>  - replacement is the replacement string, which can use '\1'-'\9' and '\&'.
>     '\1'-'\9': back reference to the n'th subexpression.
>     '\&'     : the entire matched string.
>  - flags can use the following values:
>     g: global (replace all)
>     i: ignore case
>     When flags is not specified, matching is case sensitive and only
>     the first instance is replaced.
>
> regards,
>
> --- Atsushi Ogawa
[PATCHES] regexp_replace
I made the patch that implements regexp_replace again.
The specification of this function is as follows.

regexp_replace(source text, pattern text, replacement text, [flags text])
returns text

Replaces the substring of source that matches the regular expression
pattern with replacement.

 - pattern is a regular expression pattern.
 - replacement is the replacement string, which can use '\1'-'\9' and '\&'.
    '\1'-'\9': back reference to the n'th subexpression.
    '\&'     : the entire matched string.
 - flags can use the following values:
    g: global (replace all)
    i: ignore case
    When flags is not specified, matching is case sensitive and only
    the first instance is replaced.

regards,

--- Atsushi Ogawa

[ Attachment: regexp_replace.patch ]
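The specification above can be sketched as a small emulation, assuming Python's re module stands in for the backend regex engine (its syntax differs from PostgreSQL's in places). The function below is hypothetical and is not the patch's code; how it treats unknown flag letters, references to groups that do not exist, and other backslash sequences are assumptions the specification does not cover.

```python
import re


def regexp_replace(source, pattern, replacement, flags=""):
    """Emulate regexp_replace(source, pattern, replacement [, flags])."""
    re_flags = 0
    count = 1  # no flags: replace the first match only
    for ch in flags:
        if ch == "g":
            count = 0  # re.sub treats count=0 as "replace every match"
        elif ch == "i":
            re_flags |= re.IGNORECASE
        else:
            # Assumption: reject anything other than 'g' and 'i'.
            raise ValueError(f"invalid option: {ch!r}")

    def expand(match):
        # Expand the replacement by hand: \1-\9 are back references,
        # \& is the entire matched string, anything else is literal.
        out = []
        i = 0
        while i < len(replacement):
            c = replacement[i]
            if c == "\\" and i + 1 < len(replacement):
                nxt = replacement[i + 1]
                if nxt == "&":
                    out.append(match.group(0))
                    i += 2
                    continue
                if nxt in "123456789":
                    n = int(nxt)
                    # Assumption: a reference to a missing or unmatched
                    # group expands to the empty string.
                    if n <= match.re.groups:
                        out.append(match.group(n) or "")
                    i += 2
                    continue
            out.append(c)
            i += 1
        return "".join(out)

    return re.sub(pattern, expand, source, count=count, flags=re_flags)


print(regexp_replace("ABC-DEF", r"(\w+)-(\w+)", r"\2-\1"))
print(regexp_replace("aAa", "a", "x", "gi"))
```

Keeping the options in one text argument means further letters can be added later without introducing new overloads.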
Re: [HACKERS] [PATCHES] regexp_replace
Robert Treat wrote:
> On Tuesday 07 June 2005 10:57, David Fetter wrote:
> > On Tue, Jun 07, 2005 at 10:27:28PM +0900, Atsushi Ogawa wrote:
> > > My idea is the opposite. I think that regexp_replace() should make
> > > "replace all" the default, because replace() in pgsql replaces all
> > > occurrences, and regexp_replace() in Oracle 10g is similar.
> >
> > I respectfully disagree.  Although Oracle does things this way, no
> > other regular expression search and replace does.  Historically, you
> > can find that "Oracle does it this way" is not a reason why we would
> > do it.  Text editors, programming languages, etc., etc. do "replace
> > the first" by default and "replace globally" only when told to.
> >
>
> You don't think it will be confusing to have a function called replace which
> replaces all occurrences and a function called regex_replace which only
> replaces the first occurrence?  There's something to be said for consistency
> within pgsql itself.

Huh?  I am confused.  If both support regex, why does regex_replace only
do the first one?

--
  Bruce Momjian
Re: [HACKERS] [PATCHES] regexp_replace
On Tuesday 07 June 2005 10:57, David Fetter wrote:
> On Tue, Jun 07, 2005 at 10:27:28PM +0900, Atsushi Ogawa wrote:
> > My idea is the opposite. I think that regexp_replace() should make
> > "replace all" the default, because replace() in pgsql replaces all
> > occurrences, and regexp_replace() in Oracle 10g is similar.
>
> I respectfully disagree.  Although Oracle does things this way, no
> other regular expression search and replace does.  Historically, you
> can find that "Oracle does it this way" is not a reason why we would
> do it.  Text editors, programming languages, etc., etc. do "replace
> the first" by default and "replace globally" only when told to.
>

You don't think it will be confusing to have a function called replace which
replaces all occurrences and a function called regex_replace which only
replaces the first occurrence?  There's something to be said for consistency
within pgsql itself.

--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
Re: [HACKERS] [PATCHES] regexp_replace
On Tue, Jun 07, 2005 at 10:27:28PM +0900, Atsushi Ogawa wrote:
>
> David Fetter wrote:
> > On Tue, Jun 07, 2005 at 09:35:56AM +0900, a_ogawa wrote:
> > > David Fetter wrote:
> > > > We don't yet have this functionality, as the patch allows for
> > > > using second and later regex matches "()" in the replacement
> > > > pattern.
> > > >
> > > > The function is misnamed.  It should be called
> > > > regex_replace_all() or some such, as it violates the principle
> > > > of least astonishment by replacing all instances by default.
> > > > Every other regex replacement defaults to "replace first," not
> > > > "replace all."  Or maybe it should take a bool for "replace
> > > > all," or...?  Anyhow, it's worth a discussion :)
> > >
> > > I think that usage will increase if "replace all" or "replace first"
> > > can be specified for this function.
> >
> > Ogawa-san,
> >
> > I think that this would be a case for function overloading:
> >
> > function regexp_replace(
> >     string text, pattern text, replacement text
> > ) RETURNS TEXT; /* First only */
> >
> > regexp_replace(
> >     string text, pattern text, replacement text, global bool
> > ) RETURNS TEXT; /* Global if global is TRUE, first only otherwise */
> >
> > What do you think of this idea?  One trouble is that there are some
> > other options.  For example, one could add switches for all
> > combinations of "global," "case insensitive," "compile once," "exclude
> > whitespace," etc. as perl does.  Do we want to go this route?
>
> My idea is the opposite. I think that regexp_replace() should make
> "replace all" the default, because replace() in pgsql replaces all
> occurrences, and regexp_replace() in Oracle 10g is similar.

I respectfully disagree.  Although Oracle does things this way, no
other regular expression search and replace does.  Historically, you
can find that "Oracle does it this way" is not a reason why we would
do it.  Text editors, programming languages, etc., etc. do "replace
the first" by default and "replace globally" only when told to.

> I also think that it is better to be able to specify the options as text.
> I am thinking of this function specification:
>
> regexp_replace(
>     string text, pattern text, replacement text
> ) RETURNS TEXT; /* Replace all */
>
> regexp_replace(
>     string text, pattern text, replacement text, options text
> ) RETURNS TEXT; /* Change operation by the option. */
>
> The options can use the following values.
>     f: Replace first only
>     i: Case insensitive
>
> Any comments?

I think that "case insensitive" is a good thing to add separately as a
boolean :)

Cheers,
David.
--
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!
Re: [HACKERS] [PATCHES] regexp_replace
David Fetter wrote:
> On Tue, Jun 07, 2005 at 09:35:56AM +0900, a_ogawa wrote:
> > David Fetter wrote:
> > > We don't yet have this functionality, as the patch allows for
> > > using second and later regex matches "()" in the replacement
> > > pattern.
> > >
> > > The function is misnamed.  It should be called
> > > regex_replace_all() or some such, as it violates the principle
> > > of least astonishment by replacing all instances by default.
> > > Every other regex replacement defaults to "replace first," not
> > > "replace all."  Or maybe it should take a bool for "replace
> > > all," or...?  Anyhow, it's worth a discussion :)
> >
> > I think that usage will increase if "replace all" or "replace first"
> > can be specified for this function.
>
> Ogawa-san,
>
> I think that this would be a case for function overloading:
>
> function regexp_replace(
>     string text, pattern text, replacement text
> ) RETURNS TEXT; /* First only */
>
> regexp_replace(
>     string text, pattern text, replacement text, global bool
> ) RETURNS TEXT; /* Global if global is TRUE, first only otherwise */
>
> What do you think of this idea?  One trouble is that there are some
> other options.  For example, one could add switches for all
> combinations of "global," "case insensitive," "compile once," "exclude
> whitespace," etc. as perl does.  Do we want to go this route?

My idea is the opposite. I think that regexp_replace() should make
"replace all" the default, because replace() in pgsql replaces all
occurrences, and regexp_replace() in Oracle 10g is similar.

I also think that it is better to be able to specify the options as text.
I am thinking of this function specification:

regexp_replace(
    string text, pattern text, replacement text
) RETURNS TEXT; /* Replace all */

regexp_replace(
    string text, pattern text, replacement text, options text
) RETURNS TEXT; /* Change operation by the option. */

The options can use the following values.
    f: Replace first only
    i: Case insensitive

Any comments?

regards,

--- Atsushi Ogawa
Re: [HACKERS] [PATCHES] regexp_replace
On Tue, Jun 07, 2005 at 09:35:56AM +0900, a_ogawa wrote:
>
> Bruce Momjian wrote:
> > David Fetter wrote:
> > > On Mon, Jun 06, 2005 at 12:02:18PM -0400, Bruce Momjian wrote:
> > > >
> > > > Patch removed because we already have this functionality.
> > >
> > > We don't yet have this functionality, as the patch allows for
> > > using second and later regex matches "()" in the replacement
> > > pattern.
> > >
> > > The function is misnamed.  It should be called
> > > regex_replace_all() or some such, as it violates the principle
> > > of least astonishment by replacing all instances by default.
> > > Every other regex replacement defaults to "replace first," not
> > > "replace all."  Or maybe it should take a bool for "replace
> > > all," or...?  Anyhow, it's worth a discussion :)
> >
> > Does anyone want to argue that this additional functionality is
> > significant and deserves its own function or an additional
> > argument to the existing function?
>
> Oracle 10g has similar functionality. The name is regexp_replace.
> This functionality has the following uses:
>  - Format the ZIP code and the telephone number, etc.
>    Example: select regexp_replace('1112223333', '(\\d{3})(\\d{3})(\\d{4})',
>             '(\\1) \\2-\\3');
>    result: (111) 222-3333
>  - Delete unnecessary white space.
>    Example: select regexp_replace('A B C', '\\s+', ' ');
>    result: A B C
>
> I think that usage will increase if "replace all" or "replace first"
> can be specified for this function.

Ogawa-san,

I think that this would be a case for function overloading:

function regexp_replace(
    string text, pattern text, replacement text
) RETURNS TEXT; /* First only */

regexp_replace(
    string text, pattern text, replacement text, global bool
) RETURNS TEXT; /* Global if global is TRUE, first only otherwise */

What do you think of this idea?  One trouble is that there are some
other options.  For example, one could add switches for all
combinations of "global," "case insensitive," "compile once," "exclude
whitespace," etc. as perl does.  Do we want to go this route?

Cheers,
D
--
David Fetter
Re: [HACKERS] [PATCHES] regexp_replace
Bruce Momjian wrote:
> David Fetter wrote:
> > On Mon, Jun 06, 2005 at 12:02:18PM -0400, Bruce Momjian wrote:
> > >
> > > Patch removed because we already have this functionality.
> >
> > We don't yet have this functionality, as the patch allows for using
> > second and later regex matches "()" in the replacement pattern.
> >
> > The function is misnamed.  It should be called regex_replace_all() or
> > some such, as it violates the principle of least astonishment by
> > replacing all instances by default.  Every other regex replacement
> > defaults to "replace first," not "replace all."  Or maybe it should
> > take a bool for "replace all," or...?  Anyhow, it's worth a discussion
> > :)
>
> Does anyone want to argue that this additional functionality is
> significant and deserves its own function or an additional argument to
> the existing function?

Oracle 10g has similar functionality. The name is regexp_replace.
This functionality has the following uses:
 - Format the ZIP code and the telephone number, etc.
   Example: select regexp_replace('1112223333', '(\\d{3})(\\d{3})(\\d{4})',
            '(\\1) \\2-\\3');
   result: (111) 222-3333
 - Delete unnecessary white space.
   Example: select regexp_replace('A B C', '\\s+', ' ');
   result: A B C

I think that usage will increase if "replace all" or "replace first"
can be specified for this function.

regards,

--- Atsushi Ogawa
Re: [PATCHES] regexp_replace
David Fetter wrote:
> On Mon, Jun 06, 2005 at 12:02:18PM -0400, Bruce Momjian wrote:
> >
> > Patch removed because we already have this functionality.
>
> We don't yet have this functionality, as the patch allows for using
> second and later regex matches "()" in the replacement pattern.
>
> The function is misnamed.  It should be called regex_replace_all() or
> some such, as it violates the principle of least astonishment by
> replacing all instances by default.  Every other regex replacement
> defaults to "replace first," not "replace all."  Or maybe it should
> take a bool for "replace all," or...?  Anyhow, it's worth a discussion
> :)

Does anyone want to argue that this additional functionality is
significant and deserves its own function or an additional argument to
the existing function?

--
  Bruce Momjian
Re: [PATCHES] regexp_replace
On Mon, Jun 06, 2005 at 12:02:18PM -0400, Bruce Momjian wrote:
>
> Patch removed because we already have this functionality.

We don't yet have this functionality, as the patch allows for using
second and later regex matches "()" in the replacement pattern.

The function is misnamed.  It should be called regex_replace_all() or
some such, as it violates the principle of least astonishment by
replacing all instances by default.  Every other regex replacement
defaults to "replace first," not "replace all."  Or maybe it should
take a bool for "replace all," or...?  Anyhow, it's worth a discussion
:)

Cheers,
D
>
> ---
>
> a_ogawa00 wrote:
> >
> > This patch provides a new function regexp_replace.
> > regexp_replace extends a replace function and enables text search
> > by the regular expression. And, a back reference can be used within
> > a replace string.
> > (This patch for PostgreSQL 7.4.3)
> >
> > Function: regexp_replace(str, pattern, replace_str)
> > Return Type: text
> > Description: Replace all matched string in str.
> >              pattern is regular expression pattern.
> >              replace_str is replace string that can use '\1' - '\9', and '\&'.
> >              '\1' - '\9' is back reference to the n'th subexpression.
> >              '\&' is matched string.
> >
> > (example1)
> > select regexp_replace('ABC-DEF', '(\\w+)-(\\w+)', '\\2-\\1')
> > result: DEF-ABC
> >
> > (example2)
> > update tab1 set col1 = regexp_replace(col1, '[A-Z]', '');
> >
> > ---
> > Atsushi Ogawa
> > [EMAIL PROTECTED]
> >
> > --- cut here ---
> >
> > *** ./src/backend/regex/regexec.c.orig  Tue Jul 20 08:45:39 2004
> > --- ./src/backend/regex/regexec.c       Tue Jul 20 08:49:36 2004
> > ***************
> > *** 110,115 ****
> > --- 110,116 ----
> >       regmatch_t *pmatch;
> >       rm_detail_t *details;
> >       chr        *start;          /* start of string */
> > +     chr        *search_start;   /* search start of string */
> >       chr        *stop;           /* just past end of string */
> >       int         err;            /* error code if any (0 none) */
> >       regoff_t   *mem;            /* memory vector for backtracking */
> > ***************
> > *** 168,173 ****
> > --- 169,175 ----
> >   pg_regexec(regex_t *re,
> >              const chr *string,
> >              size_t len,
> > +            size_t search_start,
> >              rm_detail_t *details,
> >              size_t nmatch,
> >              regmatch_t pmatch[],
> > ***************
> > *** 219,224 ****
> > --- 221,227 ----
> >       v->pmatch = pmatch;
> >       v->details = details;
> >       v->start = (chr *) string;
> > +     v->search_start = (chr *) string + search_start;
> >       v->stop = (chr *) string + len;
> >       v->err = 0;
> >       if (backref)
> > ***************
> > *** 288,294 ****
> >       NOERR();
> >       MDEBUG(("\nsearch at %ld\n", LOFF(v->start)));
> >       cold = NULL;
> > !     close = shortest(v, s, v->start, v->start, v->stop, &cold, (int *) NULL);
> >       freedfa(s);
> >       NOERR();
> >       if (v->g->cflags & REG_EXPECT)
> > --- 291,298 ----
> >       NOERR();
> >       MDEBUG(("\nsearch at %ld\n", LOFF(v->start)));
> >       cold = NULL;
> > !     close = shortest(v, s, v->search_start, v->search_start, v->stop,
> > !                      &cold, (int *) NULL);
> >       freedfa(s);
> >       NOERR();
> >       if (v->g->cflags & REG_EXPECT)
> > ***************
> > *** 415,421 ****
> >
> >       assert(d != NULL && s != NULL);
> >       cold = NULL;
> > !     close = v->start;
> >       do
> >       {
> >           MDEBUG(("\ncsearch at %ld\n", LOFF(close)));
> > --- 419,425 ----
> >
> >       assert(d != NULL && s != NULL);
> >       cold = NULL;
> > !     close = v->search_start;
> >       do
> >       {
> >           MDEBUG(("\ncsearch at %ld\n", LOFF(close)));
> > *** ./src/backend/utils/adt/regexp.c.orig  Tue Jul 20 08:50:08 2004
> > --- ./src/backend/utils/adt/regexp.c       Tue Jul 20 09:00:05 2004
> > ***************
> > *** 80,116 ****
> >
> >
> >   /*
> > !  * RE_compile_and_execute - compile and execute a RE, caching if possible
> >    *
> > !  * Returns TRUE on match, FALSE on no match
> >    *
> > !  *  text_re --- the pattern, expressed as an *untoasted* TEXT object
> > !  *  dat --- the data to match against (need not be null-terminated)
> > !  *  dat_len --- the length of the data string
> > !  *  cflags --- compile options for the pattern
> > !  *  nmatch, pmatch --- optional return area for match details
> >    *
> > !  * Both pattern and data are given in the database encoding.  We internally
> > !  * convert to array of pg_wchar which is what Spencer's regex package wants.
> >    */
> > ! static bool
> > ! RE_compile_and_execute(text *text_re, unsigned char *dat, int dat_len,
Re: [PATCHES] regexp_replace
Nice.  Patch removed.

---

Tom Lane wrote:
> Bruce Momjian writes:
> > Tom Lane wrote:
> >> Don't we have this functionality already?  It's even SQL-spec ...
>
> > Uh, all I see it replace(), which isn't regex:
>
> The SQL-spec function is substring(string from pattern for escape-char);
> see
> http://www.postgresql.org/docs/8.0/static/functions-matching.html#FUNCTIONS-SIMILARTO-REGEXP
>
> and we also have a variant of that for POSIX rather than SQL-style
> regexps:
> http://www.postgresql.org/docs/8.0/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP
>
> regards, tom lane

--
  Bruce Momjian
Re: [PATCHES] regexp_replace
Patch removed because we already have this functionality.

---

a_ogawa00 wrote:
>
> This patch provides a new function regexp_replace.
> regexp_replace extends a replace function and enables text search
> by the regular expression. And, a back reference can be used within
> a replace string.
> (This patch for PostgreSQL 7.4.3)
>
> Function: regexp_replace(str, pattern, replace_str)
> Return Type: text
> Description: Replace all matched string in str.
>              pattern is regular expression pattern.
>              replace_str is replace string that can use '\1' - '\9', and '\&'.
>              '\1' - '\9' is back reference to the n'th subexpression.
>              '\&' is matched string.
>
> (example1)
> select regexp_replace('ABC-DEF', '(\\w+)-(\\w+)', '\\2-\\1')
> result: DEF-ABC
>
> (example2)
> update tab1 set col1 = regexp_replace(col1, '[A-Z]', '');
>
> ---
> Atsushi Ogawa
> [EMAIL PROTECTED]
>
> --- cut here ---
>
> *** ./src/backend/regex/regexec.c.orig  Tue Jul 20 08:45:39 2004
> --- ./src/backend/regex/regexec.c       Tue Jul 20 08:49:36 2004
> ***************
> *** 110,115 ****
> --- 110,116 ----
>       regmatch_t *pmatch;
>       rm_detail_t *details;
>       chr        *start;          /* start of string */
> +     chr        *search_start;   /* search start of string */
>       chr        *stop;           /* just past end of string */
>       int         err;            /* error code if any (0 none) */
>       regoff_t   *mem;            /* memory vector for backtracking */
> ***************
> *** 168,173 ****
> --- 169,175 ----
>   pg_regexec(regex_t *re,
>              const chr *string,
>              size_t len,
> +            size_t search_start,
>              rm_detail_t *details,
>              size_t nmatch,
>              regmatch_t pmatch[],
> ***************
> *** 219,224 ****
> --- 221,227 ----
>       v->pmatch = pmatch;
>       v->details = details;
>       v->start = (chr *) string;
> +     v->search_start = (chr *) string + search_start;
>       v->stop = (chr *) string + len;
>       v->err = 0;
>       if (backref)
> ***************
> *** 288,294 ****
>       NOERR();
>       MDEBUG(("\nsearch at %ld\n", LOFF(v->start)));
>       cold = NULL;
> !     close = shortest(v, s, v->start, v->start, v->stop, &cold, (int *) NULL);
>       freedfa(s);
>       NOERR();
>       if (v->g->cflags & REG_EXPECT)
> --- 291,298 ----
>       NOERR();
>       MDEBUG(("\nsearch at %ld\n", LOFF(v->start)));
>       cold = NULL;
> !     close = shortest(v, s, v->search_start, v->search_start, v->stop,
> !                      &cold, (int *) NULL);
>       freedfa(s);
>       NOERR();
>       if (v->g->cflags & REG_EXPECT)
> ***************
> *** 415,421 ****
>
>       assert(d != NULL && s != NULL);
>       cold = NULL;
> !     close = v->start;
>       do
>       {
>           MDEBUG(("\ncsearch at %ld\n", LOFF(close)));
> --- 419,425 ----
>
>       assert(d != NULL && s != NULL);
>       cold = NULL;
> !     close = v->search_start;
>       do
>       {
>           MDEBUG(("\ncsearch at %ld\n", LOFF(close)));
> *** ./src/backend/utils/adt/regexp.c.orig  Tue Jul 20 08:50:08 2004
> --- ./src/backend/utils/adt/regexp.c       Tue Jul 20 09:00:05 2004
> ***************
> *** 80,116 ****
>
>
>   /*
> !  * RE_compile_and_execute - compile and execute a RE, caching if possible
>    *
> !  * Returns TRUE on match, FALSE on no match
>    *
> !  *  text_re --- the pattern, expressed as an *untoasted* TEXT object
> !  *  dat --- the data to match against (need not be null-terminated)
> !  *  dat_len --- the length of the data string
> !  *  cflags --- compile options for the pattern
> !  *  nmatch, pmatch --- optional return area for match details
>    *
> !  * Both pattern and data are given in the database encoding.  We internally
> !  * convert to array of pg_wchar which is what Spencer's regex package wants.
>    */
> ! static bool
> ! RE_compile_and_execute(text *text_re, unsigned char *dat, int dat_len,
> !                        int cflags, int nmatch, regmatch_t *pmatch)
>   {
>       int         text_re_len = VARSIZE(text_re);
> -     pg_wchar   *data;
> -     size_t      data_len;
>       pg_wchar   *pattern;
>       size_t      pattern_len;
>       int         i;
>       int         regcomp_result;
> -     int         regexec_result;
>       cached_re_str re_temp;
>
> -     /* Convert data string to wide characters */
> -     data = (pg_wchar *) palloc((dat_len + 1) * sizeof(pg_wchar));
> -     data_len = pg_mb2wchar_with_len(dat, data, dat_len);
> -
>       /*
>        * Look for a match among previously compiled REs.  Since the data
>        * structure is self-organiz
Re: [PATCHES] regexp_replace
Bruce Momjian writes:
> Tom Lane wrote:
>> Don't we have this functionality already?  It's even SQL-spec ...

> Uh, all I see it replace(), which isn't regex:

The SQL-spec function is substring(string from pattern for escape-char);
see
http://www.postgresql.org/docs/8.0/static/functions-matching.html#FUNCTIONS-SIMILARTO-REGEXP

and we also have a variant of that for POSIX rather than SQL-style
regexps:
http://www.postgresql.org/docs/8.0/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP

regards, tom lane
Re: [PATCHES] regexp_replace
Tom Lane wrote:
> Bruce Momjian writes:
> > Your patch has been added to the PostgreSQL unapplied patches list at:
>
> > a_ogawa00 wrote:
> >> This patch provides a new function regexp_replace.
> >> regexp_replace extends a replace function and enables text search
> >> by the regular expression. And, a back reference can be used within
> >> a replace string.
> >> (This patch for PostgreSQL 7.4.3)
>
> Don't we have this functionality already?  It's even SQL-spec ...

Uh, all I see is replace(), which isn't regex:

	replace(string text, from text, to text)    text
	Replace all occurrences in string of substring from with substring to
	replace('abcdefabcdef', 'cd', 'XX')         abXXefabXXef

	test=> SELECT replace('abc','a','d');
	 replace
	---------
	 dbc
	(1 row)

	test=> SELECT replace('abc','[a-c]','d');
	 replace
	---------
	 abc
	(1 row)

-- 
  Bruce Momjian
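The two queries above behave that way because replace() compares the search string byte for byte, so '[a-c]' is just five ordinary characters. A minimal C sketch of literal replacement shows the same effect. The function replace_literal is a hypothetical stand-in written for this illustration, not the backend's implementation of replace().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not PostgreSQL code: replace every literal,
 * non-overlapping occurrence of `from` in `src` with `to`.
 * No character in `from` is treated as a pattern.  Caller frees the result.
 */
char *
replace_literal(const char *src, const char *from, const char *to)
{
    size_t      src_len, from_len, to_len;
    size_t      count = 0;
    const char *p;
    char       *result;
    char       *out;

    assert(src != NULL && from != NULL && to != NULL);
    src_len = strlen(src);
    from_len = strlen(from);
    to_len = strlen(to);

    /* Count the matches first so the result can be allocated in one step. */
    if (from_len > 0)
        for (p = src; (p = strstr(p, from)) != NULL; p += from_len)
            count++;

    result = malloc(src_len - count * from_len + count * to_len + 1);
    if (result == NULL)
        return NULL;

    out = result;
    p = src;
    while (count-- > 0)
    {
        const char *hit = strstr(p, from);  /* same match the count loop saw */

        memcpy(out, p, (size_t) (hit - p)); /* text before the match */
        out += hit - p;
        memcpy(out, to, to_len);            /* the replacement */
        out += to_len;
        p = hit + from_len;
    }
    strcpy(out, p);                         /* remaining tail */
    return result;
}
```

Here replace_literal("abc", "a", "d") gives "dbc", while replace_literal("abc", "[a-c]", "d") returns "abc" unchanged, mirroring the psql output above.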
Re: [PATCHES] regexp_replace
Bruce Momjian writes:
> Your patch has been added to the PostgreSQL unapplied patches list at:

> a_ogawa00 wrote:
>> This patch provides a new function regexp_replace.
>> regexp_replace extends a replace function and enables text search
>> by the regular expression. And, a back reference can be used within
>> a replace string.
>> (This patch for PostgreSQL 7.4.3)

Don't we have this functionality already?  It's even SQL-spec ...

			regards, tom lane
Re: [PATCHES] regexp_replace
I will add the documentation and make sure your oids are not duplicates.

Your patch has been added to the PostgreSQL unapplied patches list at:

	http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---

a_ogawa00 wrote:
>
> This patch provides a new function regexp_replace.
> regexp_replace extends a replace function and enables text search
> by the regular expression. And, a back reference can be used within
> a replace string.
> (This patch for PostgreSQL 7.4.3)
>
[snip]
Re: [PATCHES] regexp_replace
This has been saved for the 8.1 release:

	http://momjian.postgresql.org/cgi-bin/pgpatches2

---

a_ogawa00 wrote:
>
> This patch provides a new function regexp_replace.
> regexp_replace extends a replace function and enables text search
> by the regular expression. And, a back reference can be used within
> a replace string.
> (This patch for PostgreSQL 7.4.3)
>
[snip]
[PATCHES] regexp_replace
This patch provides a new function regexp_replace.
regexp_replace extends a replace function and enables text search
by the regular expression. And, a back reference can be used within
a replace string.
(This patch for PostgreSQL 7.4.3)

Function: regexp_replace(str, pattern, replace_str)
Return Type: text
Description: Replaces all matched strings in str.
             pattern is a regular expression pattern.
             replace_str is the replacement string; it can use
             '\1' - '\9' and '\&'.
             '\1' - '\9' is a back reference to the n'th subexpression.
             '\&' is the entire matched string.

(example1)
select regexp_replace('ABC-DEF', '(\\w+)-(\\w+)', '\\2-\\1')
result: DEF-ABC

(example2)
update tab1 set col1 = regexp_replace(col1, '[A-Z]', '');

---
Atsushi Ogawa
[EMAIL PROTECTED]

--- cut here ---

*** ./src/backend/regex/regexec.c.orig  Tue Jul 20 08:45:39 2004
--- ./src/backend/regex/regexec.c       Tue Jul 20 08:49:36 2004
***************
*** 110,115 ****
--- 110,116 ----
      regmatch_t *pmatch;
      rm_detail_t *details;
      chr        *start;          /* start of string */
+     chr        *search_start;   /* search start of string */
      chr        *stop;           /* just past end of string */
      int         err;            /* error code if any (0 none) */
      regoff_t   *mem;            /* memory vector for backtracking */
***************
*** 168,173 ****
--- 169,175 ----
  pg_regexec(regex_t *re,
             const chr *string,
             size_t len,
+            size_t search_start,
             rm_detail_t *details,
             size_t nmatch,
             regmatch_t pmatch[],
***************
*** 219,224 ****
--- 221,227 ----
      v->pmatch = pmatch;
      v->details = details;
      v->start = (chr *) string;
+     v->search_start = (chr *) string + search_start;
      v->stop = (chr *) string + len;
      v->err = 0;
      if (backref)
***************
*** 288,294 ****
      NOERR();
      MDEBUG(("\nsearch at %ld\n", LOFF(v->start)));
      cold = NULL;
!     close = shortest(v, s, v->start, v->start, v->stop, &cold, (int *) NULL);
      freedfa(s);
      NOERR();
      if (v->g->cflags & REG_EXPECT)
--- 291,298 ----
      NOERR();
      MDEBUG(("\nsearch at %ld\n", LOFF(v->start)));
      cold = NULL;
!     close = shortest(v, s, v->search_start, v->search_start, v->stop,
!                      &cold, (int *) NULL);
      freedfa(s);
      NOERR();
      if (v->g->cflags & REG_EXPECT)
***************
*** 415,421 ****

      assert(d != NULL && s != NULL);
      cold = NULL;
!     close = v->start;
      do
      {
          MDEBUG(("\ncsearch at %ld\n", LOFF(close)));
--- 419,425 ----

      assert(d != NULL && s != NULL);
      cold = NULL;
!     close = v->search_start;
      do
      {
          MDEBUG(("\ncsearch at %ld\n", LOFF(close)));
*** ./src/backend/utils/adt/regexp.c.orig       Tue Jul 20 08:50:08 2004
--- ./src/backend/utils/adt/regexp.c    Tue Jul 20 09:00:05 2004
***************
*** 80,116 ****


  /*
!  * RE_compile_and_execute - compile and execute a RE, caching if possible
   *
!  * Returns TRUE on match, FALSE on no match
   *
!  *  text_re --- the pattern, expressed as an *untoasted* TEXT object
!  *  dat --- the data to match against (need not be null-terminated)
!  *  dat_len --- the length of the data string
!  *  cflags --- compile options for the pattern
!  *  nmatch, pmatch --- optional return area for match details
   *
!  * Both pattern and data are given in the database encoding.  We internally
!  * convert to array of pg_wchar which is what Spencer's regex package wants.
   */
! static bool
! RE_compile_and_execute(text *text_re, unsigned char *dat, int dat_len,
!                        int cflags, int nmatch, regmatch_t *pmatch)
  {
      int         text_re_len = VARSIZE(text_re);
-     pg_wchar   *data;
-     size_t      data_len;
      pg_wchar   *pattern;
      size_t      pattern_len;
      int         i;
      int         regcomp_result;
-     int         regexec_result;
      cached_re_str re_temp;

-     /* Convert data string to wide characters */
-     data = (pg_wchar *) palloc((dat_len + 1) * sizeof(pg_wchar));
-     data_len = pg_mb2wchar_with_len(dat, data, dat_len);
-
      /*
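For readers who want to try the described semantics outside the backend, here is a rough standalone sketch of "replace all matches, with '\1' - '\9' and '\&' in the replacement string". It is an assumption-laden approximation, not the patch's code: it uses the POSIX <regex.h> API instead of pg_regexec, all names (Buf, buf_append, append_replacement, regex_replace_all) are hypothetical, and where the patch passes a search_start offset into the same string, this sketch simply advances the pointer and sets REG_NOTBOL. POSIX extended regexps do not portably support \w, so the example pattern uses [A-Z]+ instead.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Growable output buffer; a hypothetical helper for this sketch only. */
typedef struct
{
    char   *data;
    size_t  len;
    size_t  cap;
    int     failed;         /* set once an allocation fails */
} Buf;

static void
buf_append(Buf *b, const char *s, size_t n)
{
    if (b->failed)
        return;
    if (b->len + n + 1 > b->cap)
    {
        size_t  cap = (b->len + n + 1) * 2;
        char   *p = realloc(b->data, cap);

        if (p == NULL)
        {
            b->failed = 1;
            return;
        }
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, s, n);
    b->len += n;
    b->data[b->len] = '\0';
}

/* Expand the replacement: \1..\9 are groups, \& is the whole match. */
static void
append_replacement(Buf *out, const char *repl, const char *base,
                   const regmatch_t *m)
{
    const char *p;

    for (p = repl; *p != '\0'; p++)
    {
        int idx = -1;

        if (p[0] == '\\' && p[1] == '&')
            idx = 0;
        else if (p[0] == '\\' && p[1] >= '1' && p[1] <= '9')
            idx = p[1] - '0';
        if (idx < 0)
        {
            buf_append(out, p, 1);      /* ordinary character */
            continue;
        }
        p++;                            /* skip the char after the backslash */
        if (m[idx].rm_so >= 0)          /* group took part in the match */
            buf_append(out, base + m[idx].rm_so,
                       (size_t) (m[idx].rm_eo - m[idx].rm_so));
    }
}

/* Replace every match of pattern in src; caller frees the result. */
char *
regex_replace_all(const char *src, const char *pattern, const char *repl)
{
    regex_t     re;
    regmatch_t  m[10];
    Buf         out = {NULL, 0, 0, 0};
    const char *pos = src;
    int         eflags = 0;

    assert(src != NULL && pattern != NULL && repl != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return NULL;
    buf_append(&out, "", 0);            /* start with a valid empty string */

    while (regexec(&re, pos, 10, m, eflags) == 0)
    {
        buf_append(&out, pos, (size_t) m[0].rm_so);     /* text before match */
        append_replacement(&out, repl, pos, m);
        pos += m[0].rm_eo;
        if (m[0].rm_eo == m[0].rm_so)
        {
            /* Empty match: copy one character so the loop makes progress. */
            if (*pos == '\0')
                break;
            buf_append(&out, pos, 1);
            pos++;
        }
        eflags = REG_NOTBOL;            /* pos is no longer the string start */
    }
    buf_append(&out, pos, strlen(pos));
    regfree(&re);
    if (out.failed)
    {
        free(out.data);
        return NULL;
    }
    return out.data;
}
```

Under these assumptions, regex_replace_all("ABC-DEF", "([A-Z]+)-([A-Z]+)", "\\2-\\1") produces "DEF-ABC", matching example1, and regex_replace_all("aBcD", "[A-Z]", "") produces "ac", which is the per-row effect of example2. Restarting the search on a pointer into the middle of the string loses the context before that point, which is presumably why the patch adds a separate search_start to pg_regexec instead of moving the string start.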