Re: [PERFORM] LIKE search and performance

Richard Huxton Fri, 25 May 2007 11:17:06 -0700

Gregory Stark wrote:

"Richard Huxton" <[EMAIL PROTECTED]> writes:

Now you and I can look at a substring and probably make a good guess how common
it is (assuming we know the targets are British surnames or Japanese towns). PG
needs one number - or rather, it picks one number for each length of
search-string (afaik).


I don't think that's true. Postgres calculates the lower and upper bound
implied by the search pattern and then uses the histogram to estimate how
selective that range is. It's sometimes surprisingly good but obviously it's
not perfect.


Sorry - I'm obviously picking my words badly today.

I meant for the "contains" substring match. It gives different (goesaway and checks...yes) predictions based on string length. So it guessesthat LIKE '%aaa%' will match more than LIKE '%aaaa%'. Of course, if wewere matching surnames you and I could say that this is very unlikely,but without some big statistics table I guess there's not much more PGcan do.

For a trailing wildcard LIKE 'aaa%' it can and does as you say convertthis into something along the lines of (>= 'aaa' AND < 'aab'). AlthoughIIRC that depends if your locale allows such (not sure, I don't reallyuse non-C/non-English locales enough).


--
  Richard Huxton
  Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Re: [PERFORM] LIKE search and performance

Reply via email to