[ 
https://issues.apache.org/jira/browse/SOLR-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641404#action_12641404
 ] 

Grant Ingersoll commented on SOLR-532:
--------------------------------------

I consolidated this down to take advantage of Lucene's new clone method:
Index: src/java/org/apache/solr/analysis/WordDelimiterFilter.java
===================================================================
--- src/java/org/apache/solr/analysis/WordDelimiterFilter.java  (revision 706648)
+++ src/java/org/apache/solr/analysis/WordDelimiterFilter.java  (working copy)
@@ -236,11 +236,7 @@
       startOff += start;     
     }
 
-    Token newTok = new Token(startOff,
-            endOff,
-            orig.type());
-    newTok.setTermBuffer(orig.termBuffer(), start, (end - start));
-    return newTok;
+    return (Token)orig.clone(orig.termBuffer(), start, (end - start), startOff, endOff);
   }
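
To illustrate why the clone-based version keeps the payload while the hand-built version drops it, here is a minimal, self-contained sketch. Note that SketchToken, cloneWith, and PayloadSketch are hypothetical stand-ins written for this example; they are not Lucene's Token class or API, and only mimic the shape of the calls in the diff above.

```java
import java.util.Arrays;

// Hypothetical stand-in for a token; NOT Lucene's Token class.
final class SketchToken implements Cloneable {
    char[] term = new char[0];
    int startOffset;
    int endOffset;
    String type;
    byte[] payload; // the attribute the hand-built copy loses

    SketchToken(int startOffset, int endOffset, String type) {
        this.startOffset = startOffset;
        this.endOffset = endOffset;
        this.type = type;
    }

    void setTermBuffer(char[] buffer, int offset, int length) {
        term = Arrays.copyOfRange(buffer, offset, offset + length);
    }

    // Copy every attribute first, then overwrite only term text and offsets.
    SketchToken cloneWith(char[] buffer, int offset, int length,
                          int newStart, int newEnd) {
        try {
            SketchToken copy = (SketchToken) super.clone();
            copy.payload = (payload == null) ? null : payload.clone();
            copy.setTermBuffer(buffer, offset, length);
            copy.startOffset = newStart;
            copy.endOffset = newEnd;
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: class is Cloneable
        }
    }
}

final class PayloadSketch {
    // Old approach: build a fresh token and copy attributes by hand.
    // Anything not copied explicitly (here, the payload) is silently lost.
    static SketchToken rebuild(SketchToken orig, int start, int end,
                               int startOff, int endOff) {
        SketchToken t = new SketchToken(startOff, endOff, orig.type);
        t.setTermBuffer(orig.term, start, end - start);
        return t;
    }

    // New approach: clone, so attributes carry over by default.
    static SketchToken viaClone(SketchToken orig, int start, int end,
                                int startOff, int endOff) {
        return orig.cloneWith(orig.term, start, end - start, startOff, endOff);
    }

    public static void main(String[] args) {
        SketchToken orig = new SketchToken(0, 9, "word");
        orig.setTermBuffer("PowerShot".toCharArray(), 0, 9);
        orig.payload = new byte[] {42};

        System.out.println(rebuild(orig, 0, 5, 0, 5).payload == null); // prints true
        System.out.println(viaClone(orig, 0, 5, 0, 5).payload[0]);     // prints 42
    }
}
```

The design point is that cloning makes "copy everything" the default, so an attribute added to the token later is preserved without having to touch the filter again.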

I will likely commit today or tomorrow.  Let me know if this works for you, 
Tricia.  The tests pass for me.

> WordDelimiterFilter ignores payloads
> ------------------------------------
>
>                 Key: SOLR-532
>                 URL: https://issues.apache.org/jira/browse/SOLR-532
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Tricia Williams
>            Assignee: Grant Ingersoll
>            Priority: Minor
>         Attachments: SOLR-532-WordDelimiterFilter.patch
>
>
> When a WordDelimiterFilter ingests a token stream and creates a new token 
> (newTok) it appears to copy most of the old token attributes, except the 
> payload.  I believe this is a bug.  My solution is for the 
> WordDelimiterFilter to use the Token clone() method to create a carbon copy 
> and then modify the appropriate attributes (offsets and term text). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.