Alexander Korotkov <aekorot...@gmail.com> writes:
> Next revision of patch is attached. Changes are so:
> 1) Notion "penalty" is used instead of "size".
> 2) We try to reduce total penalty to WISH_TRGM_PENALTY, but restriction is
> MAX_TRGM_COUNT total trigrams count.
> 3) Penalties are assigned to particular color trigram classes. I.e.
> separate penalties for __a, _aa, _a_, aa_. It's based on analysis of
> trigram frequencies in Oscar Wilde writings. We can end up with different
> numbers, but I don't think they will be dramatically different.
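
Points 2 and 3 above can be read roughly as the following selection loop. This is only a simplified sketch under stated assumptions, not the patch itself: the struct, the function name select_color_trgms, the per-class weights, the "removable" flag (a stand-in for whatever check keeps the trigram graph usable), and the value given to MAX_TRGM_COUNT are all hypothetical. Only the names WISH_TRGM_PENALTY and MAX_TRGM_COUNT and the value 16 come from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define WISH_TRGM_PENALTY 16.0  /* soft target; 16 is the value discussed here */
#define MAX_TRGM_COUNT    256   /* hard limit; this number is an assumption */

/* Whitespace classes of a trigram ('_' = blank, 'a' = non-blank). */
typedef enum { CLS_aaa, CLS__aa, CLS_aa_, CLS__a_, CLS___a, N_CLASSES } TrgmClass;

/* Hypothetical weights, NOT the values from the patch. */
static const double class_penalty[N_CLASSES] = {1.0, 2.0, 2.0, 4.0, 4.0};

typedef struct
{
    TrgmClass cls;
    int       count;      /* simple trigrams this color trigram expands to */
    bool      removable;  /* assumed: dropping it keeps the graph usable */
    bool      kept;       /* output: still selected after pruning */
    double    penalty;    /* output: class weight * count */
} ColorTrgm;

/* qsort comparator: highest penalty first. */
static int
cmp_penalty_desc(const void *a, const void *b)
{
    double pa = ((const ColorTrgm *) a)->penalty;
    double pb = ((const ColorTrgm *) b)->penalty;

    return (pa < pb) - (pa > pb);
}

/*
 * Drop the most expensive removable color trigrams until the total penalty
 * is no more than WISH_TRGM_PENALTY.  Missing that target is tolerated;
 * exceeding MAX_TRGM_COUNT is not, and makes the function return false.
 */
bool
select_color_trgms(ColorTrgm *t, int n, int *kept_count, double *kept_penalty)
{
    int    total_count = 0;
    double total_penalty = 0.0;
    int    i;

    assert(n >= 0);

    for (i = 0; i < n; i++)
    {
        t[i].penalty = class_penalty[t[i].cls] * t[i].count;
        t[i].kept = true;
        total_count += t[i].count;
        total_penalty += t[i].penalty;
    }

    if (n > 0)
        qsort(t, (size_t) n, sizeof(ColorTrgm), cmp_penalty_desc);

    for (i = 0; i < n && total_penalty > WISH_TRGM_PENALTY; i++)
    {
        if (!t[i].removable)
            continue;
        t[i].kept = false;
        total_count -= t[i].count;
        total_penalty -= t[i].penalty;
    }

    *kept_count = total_count;
    *kept_penalty = total_penalty;
    return total_count <= MAX_TRGM_COUNT;
}
```

In this model a lower WISH_TRGM_PENALTY makes the loop run longer and keep fewer trigrams, while MAX_TRGM_COUNT only decides whether the final set is acceptable at all.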

Committed with cosmetic improvements (adjusting the comments mostly).

The new whitespace penalties look reasonably sane to me. I wonder though
if WISH_TRGM_PENALTY is too small --- it seems like this code will tend
to select many fewer trigrams than the old code did. What testing did
you do that led you to select the specific value of 16?

			regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers