Re: Multi words query time synonyms

Steve Rowe Sat, 10 Feb 2018 11:31:39 -0800

Hi Dominique,

Looks like it’s a bug, not sure where exactly though.  Can you please create a 
JIRA?


I can see the same behavior on master too, not just on the 
releases/lucene-solr/6.6.2 tag.

One interesting thing I found is that if I remove the stop filter from the 
query analyzer, I get the following for qq=“maillot om”:

+((name_text_gp:maillot) (((+name_text_gp:olympiqu +name_text_gp:de 
+name_text_gp:marseil) name_text_gp:om)))

(btw my stop list only has “de” on it)
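
For anyone who wants to reproduce the comparison, here is a minimal Python sketch. It builds the debug request for each word order and extracts the terms from a parsedquery string, so the missing synonym terms are easy to spot. The host, port, collection name, and both helper names are assumptions for illustration, not taken from this thread; `debugQuery=true` is the standard Solr parameter that returns the parsed query.

```python
import re
from urllib.parse import urlencode

# Assumed values, not from this thread: adjust to your own setup.
SOLR_BASE = "http://localhost:8983/solr"  # hypothetical host and port
COLLECTION = "mycollection"               # hypothetical collection name


def debug_query_url(qq, field="name_text_gp"):
    """Build a /select URL that asks Solr to report the parsed edismax query."""
    params = {
        "q": "{!edismax qf='%s' v=$qq}" % field,
        "sow": "false",
        "qq": qq,
        "rows": "0",           # only the debug output is of interest
        "debugQuery": "true",
        "wt": "json",
    }
    return "%s/%s/select?%s" % (SOLR_BASE, COLLECTION, urlencode(params))


def terms_in_parsed_query(parsed, field="name_text_gp"):
    """Return the set of terms that appear for `field` in a parsedquery string."""
    return set(re.findall(re.escape(field) + r":(\w+)", parsed))


if __name__ == "__main__":
    # The four word orders discussed in this thread.
    for qq in ("om maillot", "maillot om",
               "olympique de marseille maillot",
               "maillot olympique de marseille"):
        print(debug_query_url(qq))
```

Feeding the `parsedquery_toString` values quoted below into `terms_in_parsed_query` makes the difference explicit: the queries that start with the synonym contain `olympiqu` and `marseil`, while the ones that start with "maillot" do not.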

Thanks,

--
Steve
www.lucidworks.com

> On Feb 10, 2018, at 2:12 AM, Dominique Bejean <dominique.bej...@eolya.fr> 
> wrote:
> 
> Hi,
> 
> More info.
> 
> When I test the analysis for the field type, the synonyms are correctly
> expanded for all of the following expressions:
> 
> om maillot
> maillot om
> olympique de marseille maillot
> maillot olympique de marseille
> 
> The resulting outputs always include the following terms (obviously not
> always in the same order):
> 
> olympiqu om marseil maillot
> 
> 
> So, I suspect an issue with the edismax query parser.
> 
> Regards.
> 
> Dominique
> 
> 
> On Fri, Feb 9, 2018 at 18:25, Dominique Bejean <dominique.bej...@eolya.fr>
> wrote:
> 
>> Hi,
>> 
>> I am trying multi-word query-time synonyms with Solr 6.6.2 and the
>> SynonymGraphFilterFactory filter, as explained in this article:
>> 
>> https://lucidworks.com/2017/04/18/multi-word-synonyms-solr-adds-query-time-support/
>> 
>> My field type is:
>> 
>> <fieldType name="textSyn" class="solr.TextField" positionIncrementGap="100">
>>    <analyzer type="index">
>>      <tokenizer class="solr.StandardTokenizerFactory"/>
>>      <filter class="solr.ElisionFilterFactory" ignoreCase="true"
>>            articles="lang/contractions_fr.txt"/>
>>      <filter class="solr.LowerCaseFilterFactory"/>
>>      <filter class="solr.ASCIIFoldingFilterFactory"/>
>>      <filter class="solr.StopFilterFactory" words="stopwords.txt"
>>            ignoreCase="true"/>
>>      <filter class="solr.FrenchMinimalStemFilterFactory"/>
>>    </analyzer>
>>    <analyzer type="query">
>>      <tokenizer class="solr.StandardTokenizerFactory"/>
>>      <filter class="solr.ElisionFilterFactory" ignoreCase="true"
>>            articles="lang/contractions_fr.txt"/>
>>      <filter class="solr.LowerCaseFilterFactory"/>
>>      <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt"
>>            ignoreCase="true" expand="true"/>
>>      <filter class="solr.ASCIIFoldingFilterFactory"/>
>>      <filter class="solr.StopFilterFactory" words="stopwords.txt"
>>            ignoreCase="true"/>
>>      <filter class="solr.FrenchMinimalStemFilterFactory"/>
>>    </analyzer>
>>  </fieldType>
>> 
>> 
>> synonyms.txt contains the line
>> 
>> om, olympique de marseille
>> 
>> 
>> The order of the words in my query affects the query that edismax
>> generates:
>> 
>> q={!edismax qf='name_text_gp' v=$qq}
>> &sow=false
>> &qq=...
>> 
>> With "qq=om maillot" or "qq=olympique de marseille maillot", I can see the
>> synonym expansion. It works as expected:
>> 
>> "parsedquery_toString":"+(((+name_text_gp:olympiqu +name_text_gp:marseil
>> +name_text_gp:maillot) name_text_gp:om))",
>> "parsedquery_toString":"+((name_text_gp:om (+name_text_gp:olympiqu
>> +name_text_gp:marseil +name_text_gp:maillot)))",
>> 
>> 
>> With "qq=maillot om" or "qq=maillot olympique de marseille", I get the
>> same generated query for both:
>> 
>> "parsedquery_toString":"+((name_text_gp:maillot) (name_text_gp:om))",
>> "parsedquery_toString":"+((name_text_gp:maillot) (name_text_gp:om))",
>> 
>> I don't understand these generated queries. The first one looks like the
>> synonym expansion is ignored, but the second one shows it is not ignored
>> and only the synonym term is used.
>> 
>> 
>> What is wrong with the way I am doing this?
>> 
>> Regards
>> 
>> Dominique
>> 
>> --
>> Dominique Béjean
>> 06 08 46 12 43
>> 
> -- 
> Dominique Béjean
> 06 08 46 12 43
