[ https://issues.apache.org/jira/browse/SOLR-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641404#action_12641404 ]
Grant Ingersoll commented on SOLR-532:
--------------------------------------

I consolidated this down to take advantage of Lucene's new clone method:

Index: src/java/org/apache/solr/analysis/WordDelimiterFilter.java
===================================================================
--- src/java/org/apache/solr/analysis/WordDelimiterFilter.java (revision 706648)
+++ src/java/org/apache/solr/analysis/WordDelimiterFilter.java (working copy)
@@ -236,11 +236,7 @@
       startOff += start;
     }

-    Token newTok = new Token(startOff,
-            endOff,
-            orig.type());
-    newTok.setTermBuffer(orig.termBuffer(), start, (end - start));
-    return newTok;
+    return (Token)orig.clone(orig.termBuffer(), start, (end - start), startOff, endOff);
   }

I will likely commit today or tomorrow. Let me know if this works for you, Tricia. The tests pass for me.

> WordDelimiterFilter ignores payloads
> ------------------------------------
>
>                 Key: SOLR-532
>                 URL: https://issues.apache.org/jira/browse/SOLR-532
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Tricia Williams
>            Assignee: Grant Ingersoll
>            Priority: Minor
>         Attachments: SOLR-532-WordDelimiterFilter.patch
>
>
> When a WordDelimiterFilter ingests a token stream and creates a new token
> (newTok) it appears to copy most of the old token attributes, except the
> payload. I believe this is a bug. My solution is for the
> WordDelimiterFilter to use the Token clone() method to create a carbon copy
> and then modify the appropriate attributes (offsets and term text).
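
For anyone following along, here is a minimal, self-contained sketch of why the clone-based version keeps the payload while the constructor-based version drops it. It does not use the real Lucene classes: SimpleToken, cloneWith, splitByConstructor and splitByClone are hypothetical names made up for illustration, and the stand-in token only models term text, offsets, type and payload. The assumption is that cloning copies every attribute first and then overrides just the term and offsets, which is the behavior the issue description asks for.

```java
import java.util.Arrays;

// Illustration only: SimpleToken is a hypothetical stand-in, not Lucene's Token.
final class PayloadCloneSketch {

    static final class SimpleToken {
        final String term;
        final int startOffset;
        final int endOffset;
        final String type;
        byte[] payload; // optional per-token metadata; null when absent

        SimpleToken(String term, int startOffset, int endOffset, String type) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.type = type;
        }

        // Copies all attributes (including the payload), then swaps in
        // the new term text and offsets.
        SimpleToken cloneWith(char[] buffer, int start, int length,
                              int newStartOffset, int newEndOffset) {
            SimpleToken copy = new SimpleToken(
                    new String(buffer, start, length), newStartOffset, newEndOffset, type);
            if (payload != null) {
                copy.payload = payload.clone();
            }
            return copy;
        }
    }

    // Old approach: build a fresh token and copy attributes by hand.
    // The payload is never copied, so it is silently lost.
    static SimpleToken splitByConstructor(SimpleToken orig, int start, int end,
                                          int startOff, int endOff) {
        return new SimpleToken(orig.term.substring(start, end), startOff, endOff, orig.type);
    }

    // New approach: clone the original, overriding only term and offsets.
    static SimpleToken splitByClone(SimpleToken orig, int start, int end,
                                    int startOff, int endOff) {
        return orig.cloneWith(orig.term.toCharArray(), start, end - start, startOff, endOff);
    }

    public static void main(String[] args) {
        SimpleToken orig = new SimpleToken("wi-fi", 0, 5, "word");
        orig.payload = new byte[] {42};

        SimpleToken lost = splitByConstructor(orig, 0, 2, 0, 2);
        SimpleToken kept = splitByClone(orig, 0, 2, 0, 2);

        System.out.println(lost.term + " payload: " + Arrays.toString(lost.payload));
        System.out.println(kept.term + " payload: " + Arrays.toString(kept.payload));
    }
}
```

The point of the sketch is the design choice rather than the exact API: copying attributes one by one has to be updated every time a token gains a new attribute, whereas cloning picks up new attributes automatically.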