I reran my trials with the same data set on 7.2.3 built --with-multibyte and
I get the same brutal performance hit, so it is definitely a
multibyte-specific problem.
  As for the distribution of the data in the table, I used the following:
All g-words in /usr/share/dict, each stored with a different treatment applied:
  no treatment (the word as-is)
  initial caps
  word || row_id

There are only about 1000 words that appear more than once (2 or 3 times)
in 27k rows.
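
A minimal sketch of how a data set with this shape could be generated, for anyone who wants to reproduce the distribution. Everything here is an assumption rather than the script actually used: the file name /usr/share/dict/words, matching "g-words" case-insensitively, one row per word per treatment, and the helper names build_rows and duplicate_count are all hypothetical.

```python
from collections import Counter
from pathlib import Path


def build_rows(words):
    """Build (row_id, text) rows from the words that start with 'g' or 'G'.

    Each word yields three rows: the word as-is, the word with an
    initial capital, and the word concatenated with its row id
    (the equivalent of word || row_id in SQL).
    """
    g_words = [w for w in words if w[:1].lower() == "g"]
    rows = []
    for word in g_words:
        # No treatment: store the word unchanged.
        rows.append((len(rows) + 1, word))
        # Initial caps: first letter upper case, the rest lower case.
        rows.append((len(rows) + 1, word.capitalize()))
        # word || row_id: append the id of the row being created.
        row_id = len(rows) + 1
        rows.append((row_id, f"{word}{row_id}"))
    return rows


def duplicate_count(rows):
    """Count the distinct values that appear in more than one row."""
    counts = Counter(text for _, text in rows)
    return sum(1 for n in counts.values() if n > 1)


if __name__ == "__main__":
    # Assumed location of the word list; adjust for the local system.
    dict_file = Path("/usr/share/dict/words")
    if dict_file.exists():
        words = dict_file.read_text(errors="replace").split()
        rows = build_rows(words)
        print(len(rows), "rows,", duplicate_count(rows), "duplicated values")
```

Under these assumptions, duplicates arise where the initial-caps form collides with a word that is already capitalized in the dictionary, which would be consistent with a value appearing 2 or 3 times.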
  -Wade Klaver

At 11:08 PM 2/3/03 -0500, Tom Lane wrote:
>Next question: may I guess that you weren't using MULTIBYTE in 7.2?
>After still more digging, I'm coming round to the opinion that the
>problem is that MULTIBYTE is forced on in 7.3, and this imposes a
>factor-of-256 overhead in a bunch of the operations in regcomp.c.
>In particular, compiling a case-independent regex is now hugely
>more expensive than it used to be.
>The parties who wanted to force MULTIBYTE on promised that there 
>would be no such penalties :-(
>                       regards, tom lane
