I would merge stop_en.txt and stop_fr.txt. Use the same set of stop words for all fields that you search on, so that a term is either removed from every field or kept in every field.
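To see why merging helps, here is a toy model in Python. It is not Solr code, only a sketch of the clause structure shown in the parsed query quoted below: one clause per query term, listing the fields in which the term survived analysis. The stop word sets and the function name `parse` are made up for illustration; the real contents of stop_en.txt and stop_fr.txt will differ.

```python
# Toy model only: a simplified picture of per-term clause building, not Solr's parser.
EN_STOP = {"a", "the", "of"}   # assumed stand-in for stop_en.txt
FR_STOP = {"le", "la", "de"}   # assumed stand-in for stop_fr.txt


def parse(query, stopwords_by_field):
    """Return one clause per query term; each clause lists the fields that kept the term."""
    clauses = []
    for term in query.lower().split():
        fields = [
            f"{field}:{term}"
            for field, stops in stopwords_by_field.items()
            if term not in stops
        ]
        if fields:  # a term removed in every field produces no clause at all
            clauses.append(fields)
    return clauses


# Separate lists: "a" is dropped for body but survives as its own clause on body_fr.
separate = parse("a result", {"body": EN_STOP, "body_fr": FR_STOP})

# Merged list: "a" is dropped everywhere, so only the "result" clause remains.
merged_stops = EN_STOP | FR_STOP
merged = parse("a result", {"body": merged_stops, "body_fr": merged_stops})

print(separate)
print(merged)
```

With separate lists the model yields a clause that only body_fr can satisfy. If every clause has to match, as the behaviour you describe suggests, English documents can never be returned. With the merged list that clause disappears.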
You might find this useful: http://bibwild.wordpress.com/2010/04/14/solr-stop-wordsdismax-gotcha/

--- On Wed, 3/13/13, Burgmans, Tom <tom.burgm...@wolterskluwer.com> wrote:

> From: Burgmans, Tom <tom.burgm...@wolterskluwer.com>
> Subject: strange edismax parsing when searching in multiple fields (#TB)
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Date: Wednesday, March 13, 2013, 5:22 PM
>
> Hi group,
>
> Background:
> I have a collection containing English and French documents.
> I made sure to index the English content in field "body"
> (fieldType=text_en) and the French content in field
> "body_fr" (fieldType=text_fr).
>
> The user could be either English or French, so the goal is to
> execute the queries against both fields simultaneously
> without knowing the query language upfront. The query is
> analyzed differently for each field. For both fields a
> stopFilter is configured, each with its own list of stopwords
> (different per language).
>
> The issue:
> When I search for 'a result' (without single quotes) in
> fields "body" and "body_fr" at the same time, "a" is
> considered a stopword in English and removed for field
> "body", but not in French, so both terms are still searched
> inside "body_fr". What happens is that the query is parsed
> (edismax) into this construction:
>
> ((body_fr:a)~1.0 (body:result | body_fr:result)~1.0)
>
> This query returns only French documents, although there are
> many English documents in the index that contain the term
> 'result' as well. How can that happen? I think it is related
> to the way my query is parsed: there seems to be an
> AND-relationship between (body_fr:a) and (body:result |
> body_fr:result). There is no English document that has
> (body_fr:a), so that's why they don't show up. For me a much
> more logical parsed query would be:
>
> ((body:result)~1.0 | (body_fr:a body_fr:result)~1.0)
>
> How should I interpret this? Is it a bug in edismax? Is it
> intended, and if yes: why?
>
> Thanks for any hint,
> Tom