Re: [HACKERS] text_position worst case runtime

2006-05-19 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes: > If it did that might be a nice solution, but I'm not sure that it does > use B-M ... I can't find either "Boyer" or "Moore" in its source code. > > There's no particular reason to suppose offhand that a regex engine > would be faster than the naive code for

Re: [HACKERS] text_position worst case runtime

2006-05-19 Thread Alvaro Herrera
Hannu Krosing wrote: > I had a (false ?) memory that we used some variant of pcre, and that > pcre uses BM. I may be false on both accounts. (I know that python > borrowed its re module from pcre). Our code is a derivative from Henry Spencer's code found in Tcl. It certainly isn't Boyer Moore,

Re: [HACKERS] text_position worst case runtime

2006-05-19 Thread Hannu Krosing
Ühel kenal päeval, L, 2006-05-20 kell 01:34, kirjutas Hannu Krosing: > Ühel kenal päeval, R, 2006-05-19 kell 18:18, kirjutas Tom Lane: > > Hannu Krosing <[EMAIL PROTECTED]> writes: > > > I guess our regex implementation already uses boyer-moore or similar. > > > Why not just expose the match positi

Re: [HACKERS] text_position worst case runtime

2006-05-19 Thread Hannu Krosing
Ühel kenal päeval, R, 2006-05-19 kell 18:18, kirjutas Tom Lane: > Hannu Krosing <[EMAIL PROTECTED]> writes: > > I guess our regex implementation already uses boyer-moore or similar. > > Why not just expose the match position of substring('text' in 'regex') > > using some function, called match_pos

Re: [HACKERS] text_position worst case runtime

2006-05-19 Thread Tom Lane
Hannu Krosing <[EMAIL PROTECTED]> writes: > I guess our regex implementation already uses boyer-moore or similar. > Why not just expose the match position of substring('text' in 'regex') > using some function, called match_position(int searched_text, int > regex, int matchnum) ? If it did that mi

Re: [HACKERS] text_position worst case runtime

2006-05-19 Thread Mark Dilger
Tom Lane wrote: > Greg Stark <[EMAIL PROTECTED]> writes: > >>Tom Lane <[EMAIL PROTECTED]> writes: >> >>>And how much code would those take? The bottom line here is that we >>>don't have a pile of complaints about the performance of text_position, >>>so it's difficult to justify making it much mor

Re: [HACKERS] text_position worst case runtime

2006-05-19 Thread Hannu Krosing
Ühel kenal päeval, R, 2006-05-19 kell 11:20, kirjutas Jim C. Nasby: > On Thu, May 18, 2006 at 06:49:38PM -0700, Mark Dilger wrote: > > > I would think that the worst-case times would be fairly improbable. > > > I'm disinclined to push something as complicated as Boyer-Moore matching > > > into this

Re: [HACKERS] text_position worst case runtime

2006-05-19 Thread Tom Lane
Greg Stark <[EMAIL PROTECTED]> writes: > Tom Lane <[EMAIL PROTECTED]> writes: >> And how much code would those take? The bottom line here is that we >> don't have a pile of complaints about the performance of text_position, >> so it's difficult to justify making it much more complicated than it >>

Re: [HACKERS] text_position worst case runtime

2006-05-19 Thread Greg Stark
Tom Lane <[EMAIL PROTECTED]> writes: > Alvaro Herrera <[EMAIL PROTECTED]> writes: > > Tom Lane wrote: > >> You've obviously missed the point of my concern, which is code bloat. > > > So why not just replace our code with better algorithms? We could use > > Shift-Or or Shift-And which AFAIK are e

Re: [HACKERS] text_position worst case runtime

2006-05-19 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> You've obviously missed the point of my concern, which is code bloat. > So why not just replace our code with better algorithms? We could use > Shift-Or or Shift-And which AFAIK are even better than Boyer-Moore. And how much code wo

Re: [HACKERS] text_position worst case runtime

2006-05-19 Thread Alvaro Herrera
Tom Lane wrote: > "Jim C. Nasby" <[EMAIL PROTECTED]> writes: > > Perhaps it would be best to add a seperate set of functions that use > > boyer-moore, and reference them in appropriate places in the > > documentation. Unless someone has a better idea on how we can find out > > what people are actua

Re: [HACKERS] text_position worst case runtime

2006-05-19 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > Perhaps it would be best to add a seperate set of functions that use > boyer-moore, and reference them in appropriate places in the > documentation. Unless someone has a better idea on how we can find out > what people are actually doing in the field...

Re: [HACKERS] text_position worst case runtime

2006-05-19 Thread Jim C. Nasby
On Thu, May 18, 2006 at 06:49:38PM -0700, Mark Dilger wrote: > > I would think that the worst-case times would be fairly improbable. > > I'm disinclined to push something as complicated as Boyer-Moore matching > > into this function without considerable evidence that it's a performance > > bottlene

Re: [HACKERS] text_position worst case runtime

2006-05-18 Thread Mark Dilger
Tom Lane wrote: > Mark Dilger <[EMAIL PROTECTED]> writes: > >>The function >> static int32 text_position(text *t1, text *t2, int matchnum) >>defined in src/backend/utils/adt/varlena.c uses repeated calls to >>strncmp (or pg_wchar_strncmp) to find the location of the pattern in >>the text. The wo

Re: [HACKERS] text_position worst case runtime

2006-05-18 Thread Tom Lane
Mark Dilger <[EMAIL PROTECTED]> writes: > The function > static int32 text_position(text *t1, text *t2, int matchnum) > defined in src/backend/utils/adt/varlena.c uses repeated calls to > strncmp (or pg_wchar_strncmp) to find the location of the pattern in > the text. The worst case runtime for