Re: [HACKERS] like/ilike improvements

Andrew Dunstan Wed, 19 Sep 2007 11:40:45 -0700


Guillaume Smet wrote:

Andrew, All,

On 5/22/07, Andrew Dunstan <[EMAIL PROTECTED]> wrote:

But before I commit this I'd appreciate seeing some more testing, both
for correctness and performance.


I finally found some time to test this patch on our data. As our
production database is still using 8.1, I made my tests with 8.1.10
and 8.3devel. As I had very weird results, I also tested 8.2.5.

The patch seems to work as expected in my locale. I didn't notice
problems during the tests I made except for the performance problem I
describe below.

The box is a recent dual core box using CentOS 5. It's a test box
installed specifically to test PostgreSQL 8.3. Every version is
compiled with the same compiler. Locale is fr_FR.UTF-8 and database is
UTF-8 too.
The table used to make the tests fits entirely in RAM.

I tested a simple ILIKE query on our data with 8.3devel and it was far
slower than with 8.1.10 (about twice as slow). That was obviously not
the expected result, as it should have been faster considering your
work. So I decided to test with 8.2.5 as well, and it seems a
performance regression was introduced in 8.2 (and not in 8.3, which is
in fact a bit faster than 8.2).

I saw this item in 8.2 release notes:
Allow ILIKE to work for multi-byte encodings (Tom)
Internally, ILIKE now calls lower() and then uses LIKE.
Locale-specific regular expression patterns still do not work in these
encodings.

Could it be responsible for such a slowdown?

I attached the results of my tests. If anyone needs more information,
I'll be glad to provide it.


Ugh.

It's at least good to see that the LIKE case has some useful speedup in
8.3.


Can you run the same set of tests in a single byte encoding like latin1?

We might have to look at doing on-demand lowering, but in a case like
yours it looks like we'd still end up lowering almost every character
anyway, so I'm not quite sure what to do. Note that the 8.2 change was a
bug fix, so we can't just revert it. Maybe we need to look closely at
the efficiency of lower().
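
To make the two strategies concrete, here is a rough sketch in Python
rather than the backend's C. It is only an illustration under stated
assumptions: like_match, ilike_eager, ilike_lazy and count_lazy_folds
are hypothetical names, not PostgreSQL functions, escape characters are
ignored, and str.lower() stands in for the locale-aware lower().

```python
def like_match(text, pattern, fold=lambda c: c):
    """Match text against a LIKE-style pattern.

    '%' matches any run of characters, '_' matches exactly one.
    fold is applied to a character only at the moment it is compared.
    Escape handling is deliberately left out of this sketch.
    """
    t = p = 0
    star_p, star_t = -1, 0  # position of the last '%' and where it started matching
    while t < len(text):
        if p < len(pattern) and pattern[p] == '%':
            star_p, star_t = p, t
            p += 1
        elif p < len(pattern) and (
            pattern[p] == '_' or fold(pattern[p]) == fold(text[t])
        ):
            p += 1
            t += 1
        elif star_p >= 0:
            # Mismatch: let the last '%' absorb one more character and retry.
            star_t += 1
            t = star_t
            p = star_p + 1
        else:
            return False
    # Only trailing '%' may remain in the pattern.
    while p < len(pattern) and pattern[p] == '%':
        p += 1
    return p == len(pattern)


def ilike_eager(text, pattern):
    # Lower both strings completely up front, then run a plain LIKE match.
    return like_match(text.lower(), pattern.lower())


def ilike_lazy(text, pattern):
    # On-demand lowering: fold characters only when they are compared.
    return like_match(text, pattern, fold=str.lower)


def count_lazy_folds(text, pattern):
    # Count how many single-character folds the lazy variant performs.
    calls = 0

    def counting_fold(c):
        nonlocal calls
        calls += 1
        return c.lower()

    like_match(text, pattern, fold=counting_fold)
    return calls
```

With a pattern such as '%x%' that starts with a wildcard, the lazy
variant compares the first literal against every character of a
non-matching text, so count_lazy_folds("ABCDEF", "%x%") is at least
len("ABCDEF"). That is the concern above: for this kind of query,
on-demand lowering saves little or nothing over lowering the whole
string once.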


cheers

andrew



