Chad Siongco created SOLR-11700:
-----------------------------------
Summary: WordDelimiterGraphFilterFactory token positions
Key: SOLR-11700
URL: https://issues.apache.org/jira/browse/SOLR-11700
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: Schema and Analysis
Affects Versions: 7.1
Environment: Mac OSX, JDK 8
Reporter: Chad Siongco
The token positions generated by WordDelimiterGraphFilterFactory are incorrect.
This causes problems with phrase searches.
The reference guide at the following link documents this example:
https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html
<analyzer type="query">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.WordDelimiterGraphFilterFactory" catenateAll="1"/>
</analyzer>
In: "XL-4000/ES"
Tokenizer to Filter: "XL-4000/ES"(1)
Out: "XL"(1), "4000"(2), "ES"(3), "XL4000ES"(3)
On my machine, however, the concatenated token is emitted at position 1,
whereas according to the guide it should be at position 3:
Out: "XL4000ES"(1), "XL"(1), "4000"(2), "ES"(3)
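
To make the difference concrete, here is a minimal Python sketch (not Solr or Lucene code) of how absolute positions follow from position increments: each token's position is the running sum of the increments before and including it. The helper name absolute_positions and the increment values are assumptions, inferred from the positions quoted in this report rather than captured from a running Solr instance.

```python
from itertools import accumulate


def absolute_positions(tokens):
    """Turn (term, position_increment) pairs into (term, position) pairs.

    A token's position is the cumulative sum of the increments up to and
    including that token, so an increment of 0 stacks a token on the
    position of the one emitted before it.
    """
    terms = [term for term, _ in tokens]
    increments = [inc for _, inc in tokens]
    return list(zip(terms, accumulate(increments)))


# Increments inferred from the output shown in the reference guide (assumption).
documented = [("XL", 1), ("4000", 1), ("ES", 1), ("XL4000ES", 0)]

# Increments inferred from the output observed in this report (assumption).
observed = [("XL4000ES", 1), ("XL", 0), ("4000", 1), ("ES", 1)]

print(absolute_positions(documented))
print(absolute_positions(observed))
```

Under these assumed increments, the only difference between the two streams is where the catenated token is emitted: last with an increment of 0 in the documented output, first with an increment of 1 in the observed output.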
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)