[ https://issues.apache.org/jira/browse/SOLR-11976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361452#comment-16361452 ]
Tim Allison commented on SOLR-11976:
------------------------------------

I'm happy to open a separate issue/PR to factor out {{TextField}}'s {{analyzeMultiTerm}} in favor of {{Analyzer#normalize()}}.

> TokenizerChain is overwriting, not chaining in normalize()
> ----------------------------------------------------------
>
> Key: SOLR-11976
> URL: https://issues.apache.org/jira/browse/SOLR-11976
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: search
> Affects Versions: master (8.0)
> Reporter: Tim Allison
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> TokenizerChain is overwriting, not chaining tokenfilters in {{normalize}}.
> This doesn't currently break search because {{normalize}} is not currently
> being used at the Solr level (AFAICT); rather, TextField has its own
> {{analyzeMultiTerm()}} that duplicates code from the newer {{normalize}}.
> Code as is:
> {noformat}
>     TokenStream result = in;
>     for (TokenFilterFactory filter : filters) {
>       if (filter instanceof MultiTermAwareComponent) {
>         filter = (TokenFilterFactory) ((MultiTermAwareComponent) filter).getMultiTermComponent();
>         result = filter.create(in);
>       }
>     }
> {noformat}
> The fix is simple:
> {noformat}
> -        result = filter.create(in);
> +        result = filter.create(result);
> {noformat}
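
To illustrate why passing {{in}} instead of {{result}} overwrites rather than chains, here is a minimal, self-contained sketch. It does not use Lucene or Solr classes: plain {{UnaryOperator<String>}} functions stand in for token filters, and the class name, method names, and sample filters are all hypothetical, chosen only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

public class NormalizeChainSketch {

    // Hypothetical stand-ins for two multi-term-aware filters:
    // one lowercases, one strips hyphens.
    static final List<UnaryOperator<String>> FILTERS = Arrays.asList(
            s -> s.toLowerCase(Locale.ROOT),
            s -> s.replace("-", ""));

    // Mirrors the reported loop: every filter is applied to the original
    // input, so each iteration discards the previous filter's output.
    static String overwriting(String in, List<UnaryOperator<String>> filters) {
        String result = in;
        for (UnaryOperator<String> filter : filters) {
            result = filter.apply(in);
        }
        return result;
    }

    // Mirrors the proposed fix: each filter is applied to the output of
    // the previous one, so the effects accumulate.
    static String chaining(String in, List<UnaryOperator<String>> filters) {
        String result = in;
        for (UnaryOperator<String> filter : filters) {
            result = filter.apply(result);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(overwriting("Wi-Fi", FILTERS)); // prints "WiFi"
        System.out.println(chaining("Wi-Fi", FILTERS));    // prints "wifi"
    }
}
```

In the overwriting variant only the last filter has any effect, which matches the behavior described in the issue; with a single filter in the chain the two variants are indistinguishable, so the bug only shows up once two or more multi-term-aware filters are configured.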