Alvaro Herrera wrote:

> What I was really suggesting was having a table mapping locale names
> into "tsearch languages".  Then the configuration could be made based on
> the language, not on the locale name.  So the stopword list is for
> "russian", regardless of whether the locale is Russian_Russia or ru_RU.
>
Agreed. But I'm afraid we couldn't map all of the locale names correctly.
Man, it's a large list. ;)

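To make the mapping idea concrete, here is a rough sketch in Python. It is
not tsearch code: the table contents, the name LOCALE_TO_LANGUAGE, and the
function tsearch_language() are all made up for illustration. It assumes
that locale names look like "lang_TERRITORY.encoding" and it falls back
from the full name to the language part alone.

```python
import re

# Hypothetical table: normalized locale prefix -> tsearch language name.
# A real table would be much longer; these entries are only examples.
LOCALE_TO_LANGUAGE = {
    "ru": "russian",
    "russian": "russian",
    "es": "spanish",
    "spanish": "spanish",
    "pt": "portuguese",
    "portuguese": "portuguese",
    # A territory-specific entry overrides the language-only entry.
    "pt_br": "portuguese_brazil",
    "portuguese_brazil": "portuguese_brazil",
}


def tsearch_language(locale_name):
    """Return the language name for a locale name, or None if unknown."""
    # Drop the encoding and modifier: "pt_BR.UTF-8@euro" -> "pt_br"
    base = re.split(r"[.@]", locale_name, maxsplit=1)[0].lower()
    if base in LOCALE_TO_LANGUAGE:
        return LOCALE_TO_LANGUAGE[base]
    # Fall back to the language part only: "ru_ru" -> "ru"
    language_part = base.split("_", 1)[0]
    return LOCALE_TO_LANGUAGE.get(language_part)


if __name__ == "__main__":
    for name in ("Russian_Russia", "ru_RU", "pt_BR.UTF-8", "es_MX", "xx_YY"):
        print(name, "->", tsearch_language(name))
```

With a fallback like this, only the locales that need their own language
(pt_BR in the sketch) need an explicit entry, which keeps the list shorter.
Unknown locales return None, so the caller has to decide on a default.
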
> Is this only for the stopword list, or does it also affect selecting a
> stemmer?
>
Both.

> Note: it's possible that the stopword list is different for brazilian
> portuguese than portuguese portuguese, which is why I was suggesting
> using a language "portuguese_brazil" and not just "portuguese".  Whereas
> you need a single stopword list for all the countries speaking spanish,
> which is why you need only one language called spanish.
>
Indeed it's possible for Portuguese, because we have some words that are
written differently, e.g.,

pt_BR    pt_PT    english
Mônica   Mónica   Monica
ação     acção    action
Irã      Irão     Iran
. . .

Will it be possible to disable stemming or stopword removal? I'm asking
because sometimes stemming doesn't lead to good results and/or stopwords
are relevant. Maybe it could be done with GUC variables ('enable_stemming'
and 'enable_stopwords').

-- 
Euler Taveira de Oliveira
http://www.timbira.com/