On Thu, 27 Jun 2013, [email protected] wrote:
> started to work on it, but I had several doubts: on DDD, I didn't know
> which language to choose, because, although I have no precise
> statistics, maybe we have one third among Catalan, Spanish and English,
> and some more.

Yes, right now an index can have only one stemming language defined.  It
would be nice to be able to specify the language of every document on
the system and call the appropriate stemmer at indexing time.  But the
same stemming would then have to be applied to the search query
patterns, so that matches would be found.
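
To make the idea concrete, here is a minimal sketch of per-document-language
stemming applied at both indexing and query time.  It is not how the system
works today; the suffix lists are toy stand-ins for real stemmers, and the
names toy_stem, build_index and search are hypothetical.

```python
from collections import defaultdict

# Toy suffix lists standing in for real per-language stemmers.
# Illustrative assumption only, not linguistically accurate.
TOY_SUFFIXES = {
    "en": ("ing", "es", "s"),
    "es": ("os", "as", "es", "o", "a"),
    "ca": ("es", "s", "a"),
}


def toy_stem(word, lang):
    """Strip the first matching suffix, keeping at least 3 letters."""
    word = word.lower()
    for suffix in TOY_SUFFIXES[lang]:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """docs: iterable of (doc_id, lang, text); terms are keyed by (lang, stem)."""
    index = defaultdict(set)
    for doc_id, lang, text in docs:
        for word in text.split():
            index[(lang, toy_stem(word, lang))].add(doc_id)
    return index


def search(index, query, langs):
    """Stem the query once per language, AND the terms, union across languages."""
    hits = set()
    for lang in langs:
        per_lang = None
        for word in query.split():
            ids = index.get((lang, toy_stem(word, lang)), set())
            per_lang = ids if per_lang is None else per_lang & ids
        hits |= per_lang or set()
    return hits


docs = [
    (1, "en", "indexing documents"),
    (2, "es", "documentos indexados"),
    (3, "ca", "documents indexats"),
]
index = build_index(docs)
```

The query has to go through the same stemmer as the documents it should
match: searching for "documentos" with the Spanish stemmer finds document 2,
while stemming the same query with the English rules finds nothing.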

Alternatively, you can try choosing the stemmer appropriate to the
majority language of your documents, and test whether such a stemmer
handles the minority-language documents acceptably.
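
One rough way to run such a test is to check how often inflected variants of
the same word end up with the same stem.  The sketch below assumes English is
the majority language and uses a toy suffix stripper in place of the real
stemmer; english_toy_stem, conflation_rate and the word lists are all
hypothetical, and a real test would sample words from the actual collection.

```python
def english_toy_stem(word):
    """Toy stand-in for the majority-language (here English) stemmer."""
    word = word.lower()
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def conflation_rate(stem, variant_groups):
    """Fraction of variant groups whose members all reduce to a single stem."""
    merged = sum(
        1 for group in variant_groups if len({stem(w) for w in group}) == 1
    )
    return merged / len(variant_groups)


# Hand-picked variant groups, for illustration only.
ENGLISH_GROUPS = [
    ("document", "documents"),
    ("index", "indexes"),
    ("search", "searching"),
]
SPANISH_GROUPS = [
    ("documento", "documentos"),
    ("biblioteca", "bibliotecas"),
    ("buscar", "buscando"),
]

english_rate = conflation_rate(english_toy_stem, ENGLISH_GROUPS)
spanish_rate = conflation_rate(english_toy_stem, SPANISH_GROUPS)
```

If the rate for the minority language comes out much lower than for the
majority language, searches in that language will miss inflected forms, which
may or may not be acceptable for your users.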

Best regards
--
Tibor Simko
