[ https://issues.apache.org/jira/browse/SOLR-13233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16763700#comment-16763700 ]
Alan Woodward commented on SOLR-13233:
--------------------------------------
I'm honestly not sure what the correct fix here is - possibly we should change
WordDelimiterGraphFilter to emit its original token first? And check our other
TokenFilters to ensure that they all have this behaviour?
> SpellCheckCollator ignores stacked tokens
> -----------------------------------------
>
> Key: SOLR-13233
> URL: https://issues.apache.org/jira/browse/SOLR-13233
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Alan Woodward
> Priority: Major
>
> When building collations, SpellCheckCollator ignores any tokens with a
> position increment of 0, assuming that they've been injected and may
> therefore have incorrect offsets (injected terms generally keep the offsets
> of the terms they're replacing, as they don't themselves appear anywhere in
> the original source). However, this assumption is not necessarily correct -
> for example, WordDelimiterGraphFilter emits stacked tokens *before* the
> original token, because it needs to iterate through all stacked tokens to
> correctly set the original token's position length.
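
To make the failure mode concrete, here is a minimal plain-Java sketch of the filtering described above. It does not use the Lucene or Solr APIs and is not the actual SpellCheckCollator code: the Tok class, the keepNonStacked method, and the "wi-fi" token stream are all hypothetical stand-ins chosen for illustration. The only behaviour it models is "skip every token whose position increment is 0".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class StackedTokenSketch {

    /** Hypothetical stand-in for a token: term text, position increment, offsets. */
    static final class Tok {
        final String term;
        final int posInc;
        final int start;
        final int end;

        Tok(String term, int posInc, int start, int end) {
            this.term = term;
            this.posInc = posInc;
            this.start = start;
            this.end = end;
        }
    }

    /** Models the assumption in the issue: any token with posInc == 0 is treated as injected and dropped. */
    static List<String> keepNonStacked(List<Tok> stream) {
        List<String> kept = new ArrayList<>();
        for (Tok t : stream) {
            if (t.posInc == 0) {
                continue; // assumed to be an injected token, so it is ignored
            }
            kept.add(t.term);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Illustrative stream for the source text "wi-fi" (offsets 0-5).
        // Both tokens share the same offsets, as injected terms generally do.
        List<Tok> injectedFirst = Arrays.asList(
                new Tok("wifi", 1, 0, 5),    // injected token emitted first
                new Tok("wi-fi", 0, 0, 5));  // original token stacked on top of it
        List<Tok> originalFirst = Arrays.asList(
                new Tok("wi-fi", 1, 0, 5),   // original token emitted first
                new Tok("wifi", 0, 0, 5));   // injected token stacked on top of it

        System.out.println(keepNonStacked(injectedFirst)); // [wifi]
        System.out.println(keepNonStacked(originalFirst)); // [wi-fi]
    }
}
```

When the injected token comes first, the original token is the one carrying a position increment of 0, so the filter discards it and keeps the injected term instead. Emitting the original token first, as suggested in the comment, would leave the same filter working as intended.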