Re: [HACKERS] pg_trgm

Greg Stark Sun, 30 May 2010 10:00:18 -0700

On Sun, May 30, 2010 at 3:41 PM, Tom Lane <[email protected]> wrote:
> I don't think it's unreasonable to insist that behavioral changes be
> made in an upward compatible fashion ... especially ones that seem at
> least as likely to break some current usages as to enable new usages.


Fwiw I don't think we've traditionally been so tense about contrib
modules. With the advent of extensions that users can easily install
with a single command, that might be about to change, though.

There seem to be three behaviours on the table here:

1) Status quo -- only alpha and digit characters for the current
locale are considered word elements

2) All characters aside from space characters for the current locale
are considered word elements

3) Alpha and digit characters for the current locale, and for C locale
any non-ascii (high bit set) character is considered a word element
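
To make the difference concrete, here is a rough sketch of the three rules
as byte-level predicates. It is only a model, not pg_trgm's code: it assumes
the C locale (so only ASCII letters and digits count as alphanumeric), and
the function names and the sample string are made up for illustration.

```python
# Illustrative model only: these helpers are hypothetical, not pg_trgm's API.
# Assumption: C locale, so only ASCII letters/digits are alphanumeric and
# only the six ASCII whitespace bytes count as space.

C_LOCALE_SPACE = b" \t\n\v\f\r"


def is_word_status_quo(b):
    """Behaviour 1: ASCII letters and digits only."""
    return bytes([b]).isalnum()  # bytes.isalnum() is ASCII-only


def is_word_non_space(b):
    """Behaviour 2: every byte that is not whitespace."""
    return b not in C_LOCALE_SPACE


def is_word_high_bit(b):
    """Behaviour 3: ASCII letters/digits, plus any byte with the high bit set."""
    return is_word_status_quo(b) or b >= 0x80


def split_words(data, is_word_char):
    """Split a byte string into maximal runs of word bytes."""
    words, current = [], bytearray()
    for b in data:
        if is_word_char(b):
            current.append(b)
        elif current:
            words.append(bytes(current))
            current.clear()
    if current:
        words.append(bytes(current))
    return words


sample = "café-au lait".encode("utf-8")  # b'caf\xc3\xa9-au lait'
for rule in (is_word_status_quo, is_word_non_space, is_word_high_bit):
    print(rule.__name__, split_words(sample, rule))
```

Under this model, rule 1 drops the non-ascii bytes from the word, rule 2
keeps the hyphen as part of the word, and rule 3 keeps the non-ascii bytes
but still splits on the hyphen.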

1 -> 3 seems like a pretty safe change considering that anyone using
non-ascii characters in C locale probably isn't using pg_trgm or they
would be complaining about it already. How big a user-base do we think
pg_trgm has anyways? How many of those are using it on non-ascii
characters in C locale? And of those how many expect the non-ascii
characters to be considered non-word characters? It doesn't sound like
terribly useful behaviour to me.

Behaviour 2 also seems like it would be useful, so providing it as well
is a perfectly reasonable option. But I agree that 1 -> 2 would be
a user-visible change for basically all users, so it would have to be
an additional option.

-- 
greg
