StandardFilter only works with ClassicTokenizer and only when version < 3.1
---------------------------------------------------------------------------

                 Key: LUCENE-3366
                 URL: https://issues.apache.org/jira/browse/LUCENE-3366
             Project: Lucene - Java
          Issue Type: Improvement
          Components: modules/analysis
    Affects Versions: 3.3
            Reporter: David Smiley


The StandardFilter used to remove periods from acronyms and trailing 
apostrophe-s ('s) from words, and it worked in conjunction with the 
StandardTokenizer.  Presently, it does this only with ClassicTokenizer, and only 
when the Lucene match version is before 3.1. Here is an excerpt from the code:
{code:lang=java}
  public final boolean incrementToken() throws IOException {
    if (matchVersion.onOrAfter(Version.LUCENE_31))
      return input.incrementToken(); // TODO: add some niceties for the new grammar
    else
      return incrementTokenClassic();
  }
{code}

It seems to me that in the great refactor of the standard tokenizer, 
LUCENE-2167, something was forgotten here. I think that if someone uses the 
ClassicTokenizer then no matter what the version is, this filter should do what 
it used to do. And the TODO suggests someone forgot to make this filter do 
something useful for the StandardTokenizer.  Or perhaps that idea should be 
discarded and this class should be named ClassicTokenFilter.
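
To make the suggestion concrete, here is a rough sketch of the version-independent 
behavior I have in mind. It is plain Java and uses no Lucene classes. The class 
name, the normalize helper, and the assumption that classic-grammar tokens carry 
the type labels "<ACRONYM>" and "<APOSTROPHE>" are all illustrative, not the 
actual filter code:

```java
// Plain-Java sketch only; names here are hypothetical, not Lucene API.
final class ClassicNormalizationSketch {
    // Assumed type labels for tokens produced by the classic grammar.
    static final String ACRONYM_TYPE = "<ACRONYM>";
    static final String APOSTROPHE_TYPE = "<APOSTROPHE>";

    // Normalizes one term based only on its token type.
    // Note that there is no match-version check anywhere.
    static String normalize(String term, String type) {
        if (APOSTROPHE_TYPE.equals(type) && endsWithApostropheS(term)) {
            // Strip the trailing 's or 'S.
            return term.substring(0, term.length() - 2);
        }
        if (ACRONYM_TYPE.equals(type)) {
            // Remove the periods from the acronym.
            return term.replace(".", "");
        }
        // Any other token type passes through unchanged.
        return term;
    }

    private static boolean endsWithApostropheS(String term) {
        int n = term.length();
        return n >= 2
                && term.charAt(n - 2) == '\''
                && (term.charAt(n - 1) == 's' || term.charAt(n - 1) == 'S');
    }

    public static void main(String[] args) {
        System.out.println(normalize("I.B.M.", ACRONYM_TYPE));
        System.out.println(normalize("David's", APOSTROPHE_TYPE));
        System.out.println(normalize("example.com", "<HOST>"));
    }
}
```

Because the decision depends only on the token type, tokens that do not carry 
one of the two classic labels would pass through untouched.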

In any event, the javadocs for this class appear out of date as there is no 
mention of ClassicTokenizer, and the wiki is out of date too.
