The token types of the standard tokenizer are not accessible
------------------------------------------------------------

                 Key: LUCENE-1150
                 URL: https://issues.apache.org/jira/browse/LUCENE-1150
             Project: Lucene - Java
          Issue Type: Bug
          Components: Analysis
    Affects Versions: 2.3
            Reporter: Nicolas Lalevée


Because StandardTokenizerImpl is not public, these token types are not 
accessible:

{code:java}
public static final int ALPHANUM          = 0;
public static final int APOSTROPHE        = 1;
public static final int ACRONYM           = 2;
public static final int COMPANY           = 3;
public static final int EMAIL             = 4;
public static final int HOST              = 5;
public static final int NUM               = 6;
public static final int CJ                = 7;
/**
 * @deprecated this solves a bug where HOSTs that end with '.' are identified
 *             as ACRONYMs. It is deprecated and will be removed in the next
 *             release.
 */
public static final int ACRONYM_DEP       = 8;

public static final String [] TOKEN_TYPES = new String [] {
    "<ALPHANUM>",
    "<APOSTROPHE>",
    "<ACRONYM>",
    "<COMPANY>",
    "<EMAIL>",
    "<HOST>",
    "<NUM>",
    "<CJ>",
    "<ACRONYM_DEP>"
};
{code}

So no custom TokenFilter can be based on the token type. In fact, even the 
StandardFilter cannot be written outside the org.apache.lucene.analysis.standard 
package.
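
To illustrate the kind of type-based filtering this blocks, here is a minimal 
standalone sketch. It deliberately uses no Lucene classes: TypedToken, keepTypes 
and TokenTypeFilterSketch are hypothetical stand-ins, and the type names are 
copied by hand from the TOKEN_TYPES array above, which is the duplication a 
caller outside the package is currently forced into.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TokenTypeFilterSketch {

    // Type names copied by hand from the TOKEN_TYPES array quoted above,
    // because the original constants cannot be referenced from outside.
    public static final String ACRONYM = "<ACRONYM>";
    public static final String HOST = "<HOST>";
    public static final String EMAIL = "<EMAIL>";

    // Hypothetical stand-in for a token that carries its text and type name.
    public static final class TypedToken {
        public final String text;
        public final String type;

        public TypedToken(String text, String type) {
            this.text = text;
            this.type = type;
        }
    }

    // Keeps only the tokens whose type is one of the wanted type names.
    public static List<TypedToken> keepTypes(List<TypedToken> tokens, String... wanted) {
        Set<String> wantedTypes = new HashSet<String>(Arrays.asList(wanted));
        List<TypedToken> kept = new ArrayList<TypedToken>();
        for (TypedToken token : tokens) {
            if (wantedTypes.contains(token.type)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<TypedToken> tokens = Arrays.asList(
            new TypedToken("lucene.apache.org", HOST),
            new TypedToken("index", "<ALPHANUM>"),
            new TypedToken("dev@example.org", EMAIL));
        // Print only the HOST and EMAIL tokens.
        for (TypedToken token : keepTypes(tokens, HOST, EMAIL)) {
            System.out.println(token.text + " " + token.type);
        }
    }
}
```

If the constants were public, the hand-copied strings at the top of this sketch 
could be replaced by references to them, and a typo in a type name would become 
a compile error instead of a silently empty result.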


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

