[
https://issues.apache.org/jira/browse/SOLR-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir updated SOLR-2800:
------------------------------
Issue Type: Improvement (was: Bug)
Summary: optimize RemoveDuplicatesTokenFilterFactory (was:
RemoveDuplicatesTokenFilterFactory can not remove the duplicated term)
> optimize RemoveDuplicatesTokenFilterFactory
> -------------------------------------------
>
> Key: SOLR-2800
> URL: https://issues.apache.org/jira/browse/SOLR-2800
> Project: Solr
> Issue Type: Improvement
> Components: Schema and Analysis
> Affects Versions: 3.4
> Environment: Windows
> Reporter: Han Hui Wen
> Assignee: Robert Muir
> Labels: RemoveDuplicatesTokenFilterFactory, Solr
>
> RemoveDuplicatesTokenFilterFactory does not remove duplicated terms.
> The relevant code is in
> http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_3_4/solr/core/src/java/org/apache/solr/analysis/RemoveDuplicatesTokenFilter.java?view=markup
>   @Override
>   public boolean incrementToken() throws IOException {
>     while (input.incrementToken()) {
>       final char term[] = termAttribute.buffer();
>       final int length = termAttribute.length();
>       final int posIncrement = posIncAttribute.getPositionIncrement();
>
>       if (posIncrement > 0) {
>         previous.clear();
>       }
>
>       boolean duplicate = (posIncrement == 0 && previous.contains(term, 0, length));
>
>       // clone the term, and add to the set of seen terms.
>       char saved[] = new char[length];
>       System.arraycopy(term, 0, saved, 0, length);
>       previous.add(saved);
>
>       if (!duplicate) {
>         return true;
>       }
>     }
>     return false;
>   }
> It should look like the following:
>   @Override
>   public boolean incrementToken() throws IOException {
>     while (input.incrementToken()) {
>       final char term[] = termAttribute.buffer();
>       final int length = termAttribute.length();
>       final int posIncrement = posIncAttribute.getPositionIncrement();
>
>       if (posIncrement > 0) {
>         previous.clear();
>       }
>
>       boolean duplicate = (posIncrement == 0 && previous.contains(term, 0, length));
>       if (duplicate) {
>         // skip this token and keep consuming the stream
>         continue;
>       }
>
>       // clone the term, and add to the set of seen terms.
>       char saved[] = new char[length];
>       System.arraycopy(term, 0, saved, 0, length);
>       previous.add(saved);
>       return true;
>     }
>     // the input stream is exhausted
>     return false;
>   }
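
For illustration only, the intended behaviour (drop a token when the same
term was already seen at the same position, i.e. with a position increment
of 0) can be modelled without Lucene. This is a minimal sketch, not the
filter's actual code: the RemoveDuplicatesSketch class, its Token type, and
the removeDuplicates method are hypothetical stand-ins for the TokenStream
and its attributes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RemoveDuplicatesSketch {

    /** Hypothetical stand-in for a token: term text plus position increment. */
    static final class Token {
        final String term;
        final int posInc;

        Token(String term, int posInc) {
            this.term = term;
            this.posInc = posInc;
        }
    }

    /** Keeps the first occurrence of each term at a given position. */
    static List<Token> removeDuplicates(List<Token> input) {
        List<Token> out = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (Token t : input) {
            if (t.posInc > 0) {
                // a new position starts, so forget the terms seen so far
                seen.clear();
            }
            // Set.add returns false when the term is already present,
            // which can only happen for a token stacked at the same position
            if (seen.add(t.term)) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Token> in = Arrays.asList(
            new Token("running", 1),
            new Token("run", 0),
            new Token("run", 0),   // duplicate at the same position
            new Token("fast", 1),
            new Token("run", 1));  // same term, but at a new position
        List<String> terms = new ArrayList<>();
        for (Token t : removeDuplicates(in)) {
            terms.add(t.term);
        }
        System.out.println(terms); // prints [running, run, fast, run]
    }
}
```

Only the second "run" at position increment 0 is dropped. The later "run"
survives because the set of seen terms is cleared whenever the position
advances.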