Re: [HACKERS] tsearch in core patch

Alvaro Herrera Fri, 22 Jun 2007 12:32:16 -0700

[EMAIL PROTECTED] wrote:
> > Why not do it the other way around?
> > es_ES               spanish
> > Spanish_Spain       spanish
> > ru_RU               russian
> > pt_BR               portuguese_brazil
> >
> > That way you don't need any funny index.  Or do you need the list of
> > locales for each language? (but even if you do, you can easily obtain it
> > by indexing both columns separately using btrees anyway)
> 
> Yes, that's possible, but that increases the number of identical configurations:
> russian_win     Russian_Russia
> russian_unix    ru_RU
> 
> They don't differ except for the locale name.


But why do you need them to be different at all?  Just make it

russian     Russian_Russia
russian     ru_RU

Does that not work for some reason?

What I was really suggesting was having a table mapping locale names
into "tsearch languages".  Then the configuration could be made based on
the language, not on the locale name.  So the stopword list is for
"russian", regardless of whether the locale is Russian_Russia or ru_RU.

Is this only for the stopword list, or does it also affect selecting a
stemmer?

Note: it's possible that the stopword list is different for Brazilian
Portuguese than for European Portuguese, which is why I was suggesting
using a language "portuguese_brazil" and not just "portuguese".  Whereas
you need a single stopword list for all the Spanish-speaking countries,
which is why you need only one language called spanish.
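
To make the mapping idea concrete, here is a rough sketch in Python (not
anything from the patch itself).  The table contents come from the examples
in this thread; the function names and the stripping of an encoding suffix
such as ".UTF-8" are assumptions made purely for illustration.

```python
# Hypothetical mapping from OS locale names to "tsearch languages".
# Several locale names may point at the same language, so only one
# configuration per language is needed.
LOCALE_TO_LANGUAGE = {
    "es_ES": "spanish",
    "Spanish_Spain": "spanish",
    "ru_RU": "russian",
    "Russian_Russia": "russian",
    "pt_BR": "portuguese_brazil",
}


def language_for_locale(locale_name):
    """Return the tsearch language for a locale name, or None if unknown.

    Assumption: an encoding suffix (everything after the first '.')
    is irrelevant for choosing the language and is dropped.
    """
    base = locale_name.split(".", 1)[0]
    return LOCALE_TO_LANGUAGE.get(base)


def locales_for_language(language):
    """Reverse lookup: all locale names that map to the given language."""
    return sorted(
        loc for loc, lang in LOCALE_TO_LANGUAGE.items() if lang == language
    )


if __name__ == "__main__":
    print(language_for_locale("ru_RU"))           # Unix-style name
    print(language_for_locale("Russian_Russia"))  # Windows-style name
    print(locales_for_language("russian"))
```

With this shape, the stopword list (and possibly the stemmer) would be
keyed on the language alone, and the reverse lookup covers the case where
you do need the list of locales for a language.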

-- 
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"Llegará una época en la que una investigación diligente y prolongada sacará
a la luz cosas que hoy están ocultas" (Séneca, siglo I)
