Holger Bruch created LUCENE-8132:
------------------------------------

             Summary: HyphenationDecompoundTokenFilter does not set 
position/offset attributes correctly
                 Key: LUCENE-8132
                 URL: https://issues.apache.org/jira/browse/LUCENE-8132
             Project: Lucene - Core
          Issue Type: Bug
          Components: modules/analysis
    Affects Versions: 7.2.1, 6.6.1
            Reporter: Holger Bruch


HyphenationDecompoundTokenFilter and DictionaryDecompoundTokenFilter set 
positionIncrement to 0 for all subwords, reuse the start/end offsets of the 
original token, and ignore positionLength completely.

As a consequence, the QueryBuilder generates a SynonymQuery comprising all 
subwords, which should rather be treated as individual terms.
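
To illustrate, here is a minimal sketch that models the attributes with a 
plain Java class instead of the real Lucene attribute API. The Token class, 
the example word and its decomposition, and the "expected" values are all 
assumptions made for illustration; the actual subwords depend on the 
dictionary or hyphenation grammar in use, and the expected layout is only one 
possible reading of "individual terms".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DecompoundPositions {

    /** Hypothetical stand-in for a token's term, position and offset attributes. */
    static final class Token {
        final String term;
        final int positionIncrement;
        final int positionLength;
        final int startOffset;
        final int endOffset;

        Token(String term, int positionIncrement, int positionLength,
              int startOffset, int endOffset) {
            this.term = term;
            this.positionIncrement = positionIncrement;
            this.positionLength = positionLength;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /** Groups terms by absolute position; terms sharing a position look like synonyms. */
    static Map<Integer, List<String>> termsByPosition(List<Token> tokens) {
        Map<Integer, List<String>> byPosition = new LinkedHashMap<>();
        int position = -1; // the first increment of 1 yields position 0
        for (Token t : tokens) {
            position += t.positionIncrement;
            byPosition.computeIfAbsent(position, p -> new ArrayList<>()).add(t.term);
        }
        return byPosition;
    }

    /** Behaviour described in this report: increment 0 and original offsets everywhere. */
    static List<Token> reported() {
        return Arrays.asList(
            new Token("Donaudampfschiff", 1, 1, 0, 16),
            new Token("Donau", 0, 1, 0, 16),
            new Token("dampf", 0, 1, 0, 16),
            new Token("schiff", 0, 1, 0, 16));
    }

    /** Assumed alternative: subwords advance the position and carry their own offsets. */
    static List<Token> expected() {
        return Arrays.asList(
            new Token("Donaudampfschiff", 1, 3, 0, 16),
            new Token("Donau", 0, 1, 0, 5),
            new Token("dampf", 1, 1, 5, 10),
            new Token("schiff", 1, 1, 10, 16));
    }

    public static void main(String[] args) {
        System.out.println(termsByPosition(reported()));
        System.out.println(termsByPosition(expected()));
    }
}
```

In the reported layout all four terms land on position 0, which is why a 
query builder that groups by position would treat them as synonyms. In the 
assumed layout only the original token and the first subword share a position.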


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
