Re: [HACKERS] ILIKE vs indices

2012-12-29 Thread James Cloos
 TL == Tom Lane t...@sss.pgh.pa.us writes:

JC Is there any contraindication to recasting:
JC foo ILIKE 'bar'
JC into:
JC LOWER(foo) LIKE LOWER('bar')

TL In some locales those are not equivalent, I believe, or at least
TL shouldn't be.  (What the current code actually does is a separate
TL question.)

I see.  After determining indexing based on th existance of an initial
fixed string, exluding anything matching isalpha(), it uses tolower(3)
and friends to do the actual match.

So my proposal wouldn't change what matches, but might make fixing any
bugs in what *should* match more difficult?

TL In any case it's not obvious why LOWER rather than UPPER.

I suggested lower() because that matched all of the suggestions I found.
And as it turns out matches the current behaviour, too.

The footnote about adding explicit mention to the docs was expressly
because it is not otherwise obvious whether indices should use lower()
or upper().

I'll ask on one of the unicode lists whether there are any locales where
a case-insensitive match should be different than a case-preserving match
of tolower() vs tolower().

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 1024D/ED7DAEA6


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ILIKE vs indices

2012-12-29 Thread Greg Stark
On Fri, Dec 28, 2012 at 11:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 James Cloos cl...@jhcloos.com writes:
 Is there any contraindication to recasting:
   foo ILIKE 'bar'
 into:
   LOWER(foo) LIKE LOWER('bar')

 In some locales those are not equivalent, I believe, or at least
 shouldn't be.  (What the current code actually does is a separate
 question.)

What it actually does is actually *precisely* the above.

I can't quite wrap my head around the idea of LIKE and collations
having any meaningful interaction anyways. I certainly can't come up
with anything better than lower() like lower() (or upper() like
upper()).

It would be nice to document what ILIKE actually means. Right now it's
kind of mysterious. And if we can't come up with anything better than
lower() like lower() then why not go ahead and document it and take
advantage of it.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ILIKE vs indices

2012-12-29 Thread Greg Stark
On Sat, Dec 29, 2012 at 2:05 PM, Greg Stark st...@mit.edu wrote:
 I can't quite wrap my head around the idea of LIKE and collations
 having any meaningful interaction anyways. I certainly can't come up
 with anything better than lower() like lower() (or upper() like
 upper()).

Hm. Maybe I spoke too fast. Perhaps we should just call strcasecmp()
character by character, or even call strcasecmp() on any substring of
the pattern that doesn't contain _ or % ? The latter would be pretty
hopeless to ever use a btree index though.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] ILIKE vs indices

2012-12-28 Thread James Cloos
While tuning an application, I found the posts from 2003 recomending the
use of LOWER() and LIKE in place of ILIKE to take advantage of indices.

For this app, given the limitations of the upper-layer protocol it must
support, that change replaced about 30 minutes of repeated seq scans with
about 1 minute of repeated index scans!  On a query-set often repeated
several times per day.  (Probably more times per day now.)

Is there any contraindication to recasting:

  foo ILIKE 'bar'

into:

  LOWER(foo) LIKE LOWER('bar')

and documenting that an index has to be on LOWER(column) to benefit ILIKE?

Perhaps the parser could read the former as the latter?

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 1024D/ED7DAEA6


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ILIKE vs indices

2012-12-28 Thread Tom Lane
James Cloos cl...@jhcloos.com writes:
 Is there any contraindication to recasting:
   foo ILIKE 'bar'
 into:
   LOWER(foo) LIKE LOWER('bar')

In some locales those are not equivalent, I believe, or at least
shouldn't be.  (What the current code actually does is a separate
question.)

 Perhaps the parser could read the former as the latter?

Not unless the equivalence can be shown to be exact, which I doubt.
In any case it's not obvious why LOWER rather than UPPER.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers