[ 
https://issues.apache.org/jira/browse/SOLR-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-2800:
------------------------------

    Issue Type: Improvement  (was: Bug)
       Summary: optimize RemoveDuplicatesTokenFilterFactory  (was: RemoveDuplicatesTokenFilterFactory can not remove the duplicated term)
    
> optimize RemoveDuplicatesTokenFilterFactory
> -------------------------------------------
>
>                 Key: SOLR-2800
>                 URL: https://issues.apache.org/jira/browse/SOLR-2800
>             Project: Solr
>          Issue Type: Improvement
>          Components: Schema and Analysis
>    Affects Versions: 3.4
>         Environment: Windows
>            Reporter: Han Hui Wen 
>            Assignee: Robert Muir
>              Labels: RemoveDuplicatesTokenFilterFactory, Solr
>
> RemoveDuplicatesTokenFilterFactory does not remove duplicated terms.
> The current implementation, in
> http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_3_4/solr/core/src/java/org/apache/solr/analysis/RemoveDuplicatesTokenFilter.java?view=markup
> is:
> @Override
> public boolean incrementToken() throws IOException {
>   while (input.incrementToken()) {
>     final char term[] = termAttribute.buffer();
>     final int length = termAttribute.length();
>     final int posIncrement = posIncAttribute.getPositionIncrement();
>
>     if (posIncrement > 0) {
>       previous.clear();
>     }
>
>     boolean duplicate = (posIncrement == 0 && previous.contains(term, 0, length));
>
>     // clone the term, and add to the set of seen terms.
>     char saved[] = new char[length];
>     System.arraycopy(term, 0, saved, 0, length);
>     previous.add(saved);
>
>     if (!duplicate) {
>       return true;
>     }
>   }
>   return false;
> }
> It should instead look like the following:
> @Override
> public boolean incrementToken() throws IOException {
>   while (input.incrementToken()) {
>     final char term[] = termAttribute.buffer();
>     final int length = termAttribute.length();
>     final int posIncrement = posIncAttribute.getPositionIncrement();
>
>     if (posIncrement > 0) {
>       previous.clear();
>     }
>
>     boolean duplicate = (posIncrement == 0 && previous.contains(term, 0, length));
>     if (duplicate) {
>       // skip this token and keep reading; returning false here would end the stream
>       continue;
>     }
>
>     // clone the term, and add to the set of seen terms.
>     char saved[] = new char[length];
>     System.arraycopy(term, 0, saved, 0, length);
>     previous.add(saved);
>     return true;
>   }
>   // input is exhausted
>   return false;
> }
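
The filtering rule discussed in the quoted report can be tried outside Lucene with a minimal sketch. Everything here is an assumption for illustration: the `Token` class and `removeDuplicates` method are hypothetical stand-ins, a plain `HashSet<String>` replaces the filter's `previous` set, and a list replaces the token stream. The sketch drops a token only when its position increment is 0 and the same term was already seen at that position, and it adds a term to the set only when the token is kept, which appears to be the saving behind the "optimize" summary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RemoveDuplicatesSketch {

    /** Hypothetical stand-in for a token: a term plus its position increment. */
    public static final class Token {
        public final String term;
        public final int posIncrement;

        public Token(String term, int posIncrement) {
            this.term = term;
            this.posIncrement = posIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + posIncrement + ")";
        }
    }

    /** Keeps the first occurrence of each term at a given position, drops the rest. */
    public static List<Token> removeDuplicates(List<Token> input) {
        List<Token> output = new ArrayList<>();
        Set<String> previous = new HashSet<>(); // terms seen at the current position

        for (Token token : input) {
            if (token.posIncrement > 0) {
                // A new position starts: forget the terms of the old one.
                previous.clear();
            }
            boolean duplicate = token.posIncrement == 0 && previous.contains(token.term);
            if (duplicate) {
                continue; // skip the token, but keep consuming the input
            }
            // Only kept tokens are recorded, so no redundant inserts happen.
            previous.add(token.term);
            output.add(token);
        }
        return output;
    }

    public static void main(String[] args) {
        List<Token> input = Arrays.asList(
            new Token("quick", 1),
            new Token("fast", 0),   // different term, same position: kept
            new Token("quick", 0),  // same term, same position: dropped
            new Token("quick", 1)); // same term, new position: kept
        System.out.println(removeDuplicates(input));
        // prints [quick(+1), fast(+0), quick(+1)]
    }
}
```

Note that the original 3.4 loop also drops the duplicate in this input, since it returns true only for non-duplicates; the difference is that it copies and re-adds the term even when it is already in the set.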

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira