Chad Siongco created SOLR-11700:
-----------------------------------

             Summary: WordDelimiterGraphFilterFactory token positions
                 Key: SOLR-11700
                 URL: https://issues.apache.org/jira/browse/SOLR-11700
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: Schema and Analysis
    Affects Versions: 7.1
         Environment: Mac OSX, JDK 8
            Reporter: Chad Siongco


The token positions generated by WordDelimiterGraphFilterFactory are incorrect.

This causes problems when doing phrase searches.

The filter descriptions page of the Solr Reference Guide gives the following example:
https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html

<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.WordDelimiterGraphFilterFactory" catenateAll="1"/>
</analyzer>

In: "XL-4000/ES"
Tokenizer to Filter: "XL-4000/ES"(1)
Out: "XL"(1), "4000"(2), "ES"(3), "XL4000ES"(3)

On my machine, however, the concatenated token is emitted at position 1 when it
should be at position 3:
Out: "XL4000ES"(1), "XL"(1), "4000"(2), "ES"(3)
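
To make the difference between the two outputs concrete, here is a minimal Python sketch (not Solr or Lucene code). It models each token as a (term, position increment) pair, which is how Lucene token streams encode positions, and derives the absolute positions from them. The increment values are assumptions chosen only to reproduce the two "Out:" lines above, and the helper names are hypothetical.

```python
from itertools import accumulate


def absolute_positions(tokens):
    """Turn (term, position_increment) pairs into (term, position) pairs."""
    terms = [term for term, _ in tokens]
    increments = [inc for _, inc in tokens]
    # A token's position is the running sum of the increments up to it.
    return list(zip(terms, accumulate(increments)))


def position_of(tokens, term):
    """Return the absolute position of a single term in the stream."""
    return dict(absolute_positions(tokens))[term]


# Assumed increments that reproduce the output listed in the guide.
documented = [("XL", 1), ("4000", 1), ("ES", 1), ("XL4000ES", 0)]

# Assumed increments that reproduce the output observed on 7.1.
observed = [("XL4000ES", 1), ("XL", 0), ("4000", 1), ("ES", 1)]

if __name__ == "__main__":
    print("documented:", absolute_positions(documented))
    print("observed:  ", absolute_positions(observed))
```

In this model the only difference is where the concatenated token sits: stacked on the last part in the documented stream, stacked on the first part in the observed one.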




