Alvaro Herrera wrote:

> What I was really suggesting was having a table mapping locale names
> into "tsearch languages".  Then the configuration could be made based on
> the language, not on the locale name.  So the stopword list is for
> "russian", regardless of whether the locale is Russian_Russia or ru_RU.
Agreed. But I'm afraid we couldn't map all of the locale names
correctly. Man, it's a large list. ;)
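
To make the idea concrete, here is a rough sketch of such a mapping in
Python (not tsearch code). The table entries, the name
LOCALE_TO_LANGUAGE and the function language_for_locale are all
hypothetical, and the fallback to the bare language part is just one
possible way to keep the table small:

```python
import re

# Hypothetical mapping table: normalized locale name -> "tsearch language".
# The entries are illustrative only, not a complete or authoritative list.
LOCALE_TO_LANGUAGE = {
    "ru": "russian",
    "russian": "russian",
    "pt": "portuguese",
    "portuguese": "portuguese",
    "pt_br": "portuguese_brazil",
    "portuguese_brazil": "portuguese_brazil",
    "es": "spanish",
    "spanish": "spanish",
}


def language_for_locale(locale_name):
    """Return the language for a locale name, or None if it is unknown."""
    # Drop the encoding/modifier part: "ru_RU.KOI8-R" -> "ru_ru",
    # "Russian_Russia.1251" -> "russian_russia".
    base = re.split(r"[.@]", locale_name, maxsplit=1)[0].lower()
    if base in LOCALE_TO_LANGUAGE:
        return LOCALE_TO_LANGUAGE[base]
    # Fall back to the language part alone, so that es_AR, es_MX, ...
    # all end up as "spanish" without one table row per country.
    language_part = base.split("_", 1)[0]
    return LOCALE_TO_LANGUAGE.get(language_part)
```

With this, only the exceptions (like pt_BR) need their own row; every
other country variant falls through to the plain language entry.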

> Is this only for the stopword list, or does it also affect selecting a
> stemmer?

> Note: it's possible that the stopword list is different for brazilian
> portuguese than portuguese portuguese, which is why I was suggesting
> using a language "portuguese_brazil" and not just "portuguese".  Whereas
> you need a single stopword list for all the countries speaking spanish,
> which is why you need only one language called spanish.
Indeed, it's possible for Portuguese, because we have some words that are
written differently in each variant, e.g.,
pt_BR     pt_PT     english
Mônica    Mónica    Monica
ação      acção     action
Irã       Irão      Iran

Will it be possible to disable stemming or stopword removal? I'm asking
because sometimes stemming doesn't lead to good results and/or the
stopwords are relevant. Maybe these could be GUC variables
('enable_stemming' and 'enable_stopwords').
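
Just to illustrate what I mean, a minimal sketch in Python (again, not
tsearch code). The function normalize, the toy stemmer and the two flags
are hypothetical; the flags only borrow the names I proposed above and
do not exist as settings today:

```python
def toy_stem(word):
    """Toy stemmer for illustration only: strips a trailing 's'."""
    if len(word) > 3 and word.endswith("s"):
        return word[:-1]
    return word


def normalize(tokens, stopwords, stem=toy_stem,
              enable_stemming=True, enable_stopwords=True):
    """Lowercase tokens, then optionally drop stopwords and stem the rest."""
    result = []
    for token in tokens:
        word = token.lower()
        # Stopword removal can be switched off when stopwords matter.
        if enable_stopwords and word in stopwords:
            continue
        # Stemming can be switched off when it hurts the results.
        if enable_stemming:
            word = stem(word)
        result.append(word)
    return result
```

For example, normalize(["The", "cats"], {"the"}) drops "the" and stems
"cats", while passing enable_stopwords=False keeps "the" in the output.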

  Euler Taveira de Oliveira
