[
https://issues.apache.org/jira/browse/SOLR-11022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jim Ferenczi moved LUCENE-7893 to SOLR-11022:
---------------------------------------------
Affects Version/s: 6.6 (was: 6.6)
Security: Public
Lucene Fields: (was: New)
Key: SOLR-11022 (was: LUCENE-7893)
Project: Solr (was: Lucene - Core)
> SynonymGraphFilterFactory proximity search error
> ------------------------------------------------
>
> Key: SOLR-11022
> URL: https://issues.apache.org/jira/browse/SOLR-11022
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 6.6
> Reporter: Diogo Guilherme Leão Edelmuth
>
> There seems to be an issue when doing proximity searches that include terms
> that have multi-word synonyms.
> Example:
> Consider the following entries configured in synonyms.txt:
> (
> grand mother, grandmother
> grandfather, granddad
> )
> and an indexed field containing: (My mother and my grandmother went...)
> A proximity search with ("mother grandmother"~8)
> won't return the document, while ("father grandfather"~8) does return the
> analogous document.
> I am not a Solr developer, so pardon me if I am wrong, but I ran the query
> with debug=query and saw that when a proximity search involves multi-term
> synonyms, the parsed query is a SpanNearQuery:
> "parsedquery":"SpanNearQuery(spanNear([laudo:mother,
> spanOr([laudo:grand mother, laudo:grandmother])],*0*, true))"
> while a proximity search involving only one-term synonyms is parsed as:
> "MultiPhraseQuery(laudo:\"father (grandfather granddad)\"~10)"
> Note that the SpanNearQuery is built with a slop parameter of 0, no matter
> what is passed after the tilde. So if I search for the exact phrase, it does
> match.
> Here is my field-type, just in case:
> <fieldType name="text_pt_synonyms_ascii_minimal_lightStem"
>            class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.StopFilterFactory" format="snowball"
>             words="lang/stopwords_pt.txt" ignoreCase="true"/>
>     <filter class="solr.PortugueseLightStemFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.StopFilterFactory" format="snowball"
>             words="lang/stopwords_pt.txt" ignoreCase="true"/>
>     <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="true"/>
>     <filter class="solr.SynonymGraphFilterFactory" expand="true"
>             ignoreCase="true" synonyms="synonyms_radex.txt"/>
>     <filter class="solr.PortugueseLightStemFilterFactory"/>
>   </analyzer>
> </fieldType>
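
The slop behaviour described in the report can be illustrated with a small, self-contained model of ordered proximity matching. This is only a sketch and not Lucene's implementation: the helper names `find_spans` and `ordered_near` are hypothetical, the text is assumed to be lowercased and whitespace-tokenized, and stemming and stop-word removal are ignored. It treats slop as the number of positions allowed between the end of the first span and the start of the second, which is an assumption about how an ordered span-near match with two clauses behaves.

```python
def find_spans(tokens, phrase):
    """Return (start, end) spans where `phrase` occurs contiguously.

    `end` is exclusive, so a one-token match at position 4 is (4, 5).
    """
    n = len(phrase)
    return [
        (i, i + n)
        for i in range(len(tokens) - n + 1)
        if tuple(tokens[i:i + n]) == tuple(phrase)
    ]


def ordered_near(tokens, first, alternatives, slop):
    """Simplified ordered near-match (hypothetical helper, not a Lucene API).

    Matches when `first` is followed by any of the `alternatives`
    (the expanded synonyms) with at most `slop` positions in between.
    """
    first_spans = find_spans(tokens, first)
    second_spans = [s for alt in alternatives for s in find_spans(tokens, alt)]
    return any(
        0 <= start2 - end1 <= slop
        for (_, end1) in first_spans
        for (start2, _) in second_spans
    )


# Example text from the report, lowercased and split on whitespace.
doc = "my mother and my grandmother went".split()
# Query-time expansion of "grandmother" from the synonyms file.
synonyms = [("grand", "mother"), ("grandmother",)]

slop0 = ordered_near(doc, ("mother",), synonyms, 0)  # slop as seen in the parsed query
slop8 = ordered_near(doc, ("mother",), synonyms, 8)  # slop as requested after the tilde
print(slop0, slop8)  # False True
```

In this model, "mother" ends at position 2 and "grandmother" starts at position 4, so two positions separate them. A slop of 0 therefore rejects the document while a slop of 8 accepts it, which matches the reported symptom that only the exact phrase is found.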