Hi group,

Background:
I have a collection containing English and French documents. I made sure to 
index the English content in field "body" (fieldType=text_en) and the French 
content in field "body_fr" (fieldType=text_fr).

The user could be either English or French, so the goal is to execute the
queries against both fields simultaneously without knowing the query language
upfront. The query is analyzed differently for each field. Both fields have a
StopFilter configured, each with its own language-specific list of
stopwords.

The issue:
When I search for 'a result' (without single quotes) in field "body" and 
"body_fr" at the same time, then "a" is considered a stopword in English and 
removed for field "body", but not in French so both terms are still searched 
inside "body_fr". What happens is that the query is parsed (edismax) into this 
construction:

((body_fr:a)~1.0 (body:result | body_fr:result)~1.0)

This query returns only French documents, although there are many English 
documents in the index that contain the term 'result' as well. How can that 
happen? I think it is related to the way my query is parsed: there seems to be 
an AND-relationship between (body_fr:a) and (body:result | body_fr:result). 
There is no English document that has (body_fr:a), which is why they don't
show up. To me, a much more logical parsed query would be:

((body:result)~1.0 | (body_fr:a body_fr:result)~1.0)
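
To make the difference concrete, here is a toy Python sketch of the two
query shapes. It is not Solr code: the stopword lists, the two sample
documents and all function names are made up for illustration, and it
assumes that both top-level clauses of the first query are required, which
is what the results I see suggest.

```python
# Toy stopword lists (assumed; the real ones live in the Solr stopword files).
STOPWORDS = {"body": {"a", "the", "of"}, "body_fr": {"le", "la", "de"}}


def analyze(field, text):
    """Lowercase, split on whitespace, drop the stopwords of this field."""
    return [t for t in text.lower().split() if t not in STOPWORDS[field]]


def index(doc):
    """Build a per-field set of indexed tokens for one document."""
    return {field: set(analyze(field, text)) for field, text in doc.items()}


def term_centric_clauses(query, fields):
    """One clause per query term; each clause lists (field, token) alternatives.

    A term that is a stopword in one field simply has no alternative there.
    """
    clauses = []
    for term in query.split():
        alts = [(f, tok) for f in fields for tok in analyze(f, term)]
        if alts:
            clauses.append(alts)
    return clauses


def matches_term_centric(doc_index, clauses):
    """Every clause is required; any alternative inside a clause is enough."""
    return all(
        any(tok in doc_index.get(f, set()) for f, tok in alts)
        for alts in clauses
    )


def matches_field_centric(doc_index, query, fields):
    """A document matches if one field contains all of its own analyzed terms."""
    for f in fields:
        toks = analyze(f, query)
        if toks and all(t in doc_index.get(f, set()) for t in toks):
            return True
    return False


FIELDS = ["body", "body_fr"]
english = index({"body": "This is a result of the test"})
french = index({"body_fr": "il a un result ici"})

clauses = term_centric_clauses("a result", FIELDS)
# clauses has the same shape as the first parsed query:
# [[("body_fr", "a")], [("body", "result"), ("body_fr", "result")]]

term_hits = {
    "english": matches_term_centric(english, clauses),
    "french": matches_term_centric(french, clauses),
}
field_hits = {
    "english": matches_field_centric(english, "a result", FIELDS),
    "french": matches_field_centric(french, "a result", FIELDS),
}
print(term_hits)
print(field_hits)
```

In this sketch the term-centric form drops the English document, because its
only alternative for "a" is body_fr:a, while the field-centric form keeps
both documents.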

How should I interpret this? Is it a bug in edismax? Or is it intended, and
if so, why?

Thanks for any hint,
Tom
