On Mon, Feb 8, 2010 at 6:07 PM, David E. Wheeler da...@kineticode.com wrote:
On Feb 8, 2010, at 5:15 AM, Magnus Hagander wrote:
The text is about 180Kb. PostgreSQL takes ~40 seconds without the
patch, ~36 seconds with it, to extract the match from it. Perl takes
0.016 seconds.
Obviously we
2010/2/1 Michael Glaesemann michael.glaesem...@myyearbook.com:
On Jan 31, 2010, at 22:14 , Tom Lane wrote:
The Tcl folk accepted that patch, so I went ahead and applied it to
our code. It would still be a good idea for us to do any testing we
can on it, though.
I applied the patch and ran
On Feb 8, 2010, at 5:15 AM, Magnus Hagander wrote:
The text is about 180Kb. PostgreSQL takes ~40 seconds without the
patch, ~36 seconds with it, to extract the match from it. Perl takes
0.016 seconds.
Obviously we need to support Perl regular expressions in core. Not PCRE, but
Perl. ;-P
On Jan 31, 2010, at 22:14 , Tom Lane wrote:
The Tcl folk accepted that patch, so I went ahead and applied it to
our code. It would still be a good idea for us to do any testing we
can on it, though.
I applied the patch and ran both the test query I submitted as well as
original
On Sat, Jan 30, 2010 at 08:07, Tom Lane t...@sss.pgh.pa.us wrote:
Michael Glaesemann michael.glaesem...@myyearbook.com writes:
We came across a regexp that takes very much longer than expected.
I poked into this a little bit. What seems to be happening is that the
use of non-greedy
I wrote:
I found a possible patch that seems to improve matters: AFAICS the DFA
matching is independent of the checking that cdissect() and friends do,
so we can apply that check first instead of second. This brings the
runtime down from minutes to sub-second on my machine. However I don't
2010/1/29 Alvaro Herrera alvhe...@commandprompt.com:
Hi Michael,
Michael Glaesemann wrote:
We came across a regexp that takes very much longer than expected.
PostgreSQL 8.4.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC)
4.1.2 20080704 (Red Hat 4.1.2-44), 64-bit
SELECT 'ooo...'
Magnus Hagander wrote:
2010/1/29 Alvaro Herrera alvhe...@commandprompt.com:
(There's a badly needed CHECK_FOR_INTERRUPTS in this code BTW)
Incidentally, I ran across the exact same issue with a non-greedy
regexp with a client earlier this week, and put on my TODO to figure
out a good place
Michael Glaesemann michael.glaesem...@myyearbook.com writes:
We came across a regexp that takes very much longer than expected.
I poked into this a little bit. What seems to be happening is that the
use of non-greedy quantifiers plus backreferences turns off most of the
optimization that the
We came across a regexp that takes very much longer than expected.
PostgreSQL 8.4.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2
20080704 (Red Hat 4.1.2-44), 64-bit
SELECT 'ooo...' ~ $r$Z(Q)[^Q]*A.*?(\1)$r$; -- omitted for email brevity
?column?
--
t
(1 row)
Time:
Hi Michael,
Michael Glaesemann wrote:
We came across a regexp that takes very much longer than expected.
PostgreSQL 8.4.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2
20080704 (Red Hat 4.1.2-44), 64-bit
SELECT 'ooo...' ~ $r$Z(Q)[^Q]*A.*?(\1)$r$; -- omitted for email
On Jan 28, 2010, at 21:59 , Alvaro Herrera wrote:
Hi Michael,
Michael Glaesemann wrote:
We came across a regexp that takes very much longer than expected.
PostgreSQL 8.4.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc
(GCC) 4.1.2 20080704 (Red Hat 4.1.2-44), 64-bit
SELECT 'ooo...' ~
Michael Glaesemann wrote:
On Jan 28, 2010, at 21:59 , Alvaro Herrera wrote:
Hi Michael,
Michael Glaesemann wrote:
We came across a regexp that takes very much longer than expected.
PostgreSQL 8.4.1 on x86_64-unknown-linux-gnu, compiled by GCC
gcc (GCC) 4.1.2 20080704 (Red Hat
Michael Glaesemann wrote:
However, as you point out, Postgres doesn't appear to take this into
account:
postgres=# select regexp_replace('oooZQoooAoooQooQooQooo', $r$(Z(Q)
[^Q]*A.*(\2))$r$, $s$X$s$);
regexp_replace
oooXooo
(1 row)
postgres=# select
On Jan 28, 2010, at 23:21 , Alvaro Herrera wrote:
I think the reason for this is that the first * is greedy and thus the
entire expression is considered greedy. The fact that you've made the
second * non-greedy does not ungreedify the RE ... Note the docs say:
The above rules
15 matches
Mail list logo