I don't think this is something to consider across the board for all languages. The same grammatical units that are part of a word in one language (and removed by stemmers) are independent morphemes in others (and should be stopwords),
so please take this advice on a case-by-case basis for each language.

On Tue, Jan 12, 2010 at 9:20 PM, Lance Norskog <goks...@gmail.com> wrote:
> There are a lot of projects that don't use stopwords any more. You
> might consider dropping them altogether.
>
> On Mon, Jan 11, 2010 at 2:25 PM, Don Werve <d...@madwombat.com> wrote:
>> This is the way I've implemented multilingual search as well.
>>
>> 2010/1/11 Markus Jelsma <mar...@buyways.nl>
>>
>>> Hello,
>>>
>>> We have implemented language-specific search in Solr using
>>> language-specific fields and field types. For instance, an en_text
>>> field type can use an English stemmer and a list of stopwords and
>>> synonyms. We did not, however, use language-specific stopwords;
>>> instead we used one list shared by both languages.
>>>
>>> So you would have a field type like:
>>>
>>> <fieldType name="en_text" class="solr.TextField" ...
>>>   <analyzer type="">
>>>     <filter class="solr.StopFilterFactory" words="stopwords.en.txt"/>
>>>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.en.txt"/>
>>>
>>> etc etc.
>>>
>>> Cheers,
>>>
>>> Markus Jelsma
>>> Buyways B.V.
>>>
>>> On Mon, 2010-01-11 at 13:45 +0100, Daniel Persson wrote:
>>>
>>> > Hi Solr users.
>>> >
>>> > I'm trying to set up a site with Solr search integrated, and I use
>>> > the SolrJ API to feed the index with search documents. At the
>>> > moment I have only activated search on the English portion of the
>>> > site. I'm interested in using as many features of Solr as possible.
>>> > Synonyms, stopwords and stemming all sound quite interesting and
>>> > useful, but how do I set this up in a good way for a multilingual
>>> > site?
>>> >
>>> > The site doesn't have a huge text mass, so performance issues don't
>>> > really bother me, but I'd still like to hear your suggestions
>>> > before I try to implement a solution.
>>> >
>>> > Best regards
>>> >
>>> > Daniel
>
> --
> Lance Norskog
> goks...@gmail.com

--
Robert Muir
rcm...@gmail.com
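
For readers following the thread, the per-language field types discussed above might look roughly like the sketch below. It is an illustration only, not the configuration anyone in this thread actually runs. The nl_text type, the choice of Dutch as the second language, the tokenizer, and the file names are all assumptions. Whether a given language gets a StopFilterFactory line at all is the case-by-case decision described at the top of this message.

```xml
<types>
  <!-- English: lowercase, remove stopwords, expand synonyms, then stem. -->
  <fieldType name="en_text" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- File names are assumptions; each language gets its own lists. -->
      <filter class="solr.StopFilterFactory" words="stopwords.en.txt" ignoreCase="true"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.en.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.SnowballPorterFilterFactory" language="English"/>
    </analyzer>
  </fieldType>

  <!-- Second language (Dutch is an assumption): same chain, different resources. -->
  <fieldType name="nl_text" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- Drop or keep this filter depending on what the stemmer already handles. -->
      <filter class="solr.StopFilterFactory" words="stopwords.nl.txt" ignoreCase="true"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.nl.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.SnowballPorterFilterFactory" language="Dutch"/>
    </analyzer>
  </fieldType>
</types>
```

Each document would then be indexed into the field whose type matches its language (for example, hypothetical body_en and body_nl fields), and queries would target the field for the language being searched.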