What is the #LETTER definition in StandardTokenizer.jj?

I saw:

| <#P: ("_"|"-"|"/"|"."|",") >
| <#HAS_DIGIT:                                    // at least one digit
    (<LETTER>|<DIGIT>)*
    <DIGIT>
    (<LETTER>|<DIGIT>)*
  >


Should I remove "_" from #P and recompile the source code?

Sincerely,


Anh Ngo

-----Original Message-----
From: Daniel Naber [mailto:[EMAIL PROTECTED] 
Sent: Friday, July 21, 2006 2:49 PM
To: java-user@lucene.apache.org
Subject: Re: StandardAnalyzer question

On Friday 21 July 2006 16:16, Ngo, Anh (ISS Southfield) wrote:

> The Lucene 2.0.0 StandardAnalyzer does treat "_" (underscore) as a
> token.  Is there a way to make StandardAnalyzer not tokenize on
> "_" or any other given character?

You need to add "_" to the #LETTER definition in StandardTokenizer.jj, then 
rebuild StandardTokenizer.java using the appropriate ant task.
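
A minimal sketch of that change, assuming the #LETTER definition in your copy
of StandardTokenizer.jj is a JavaCC character list of Unicode ranges. The two
ranges shown are only illustrative (basic Latin upper and lower case); keep
whatever ranges your file already has and just append "_":

```
| < #LETTER:                       // letters, plus underscore
      [
       "\u0041"-"\u005a",          // A-Z (illustrative; keep your file's ranges)
       "\u0061"-"\u007a",          // a-z (illustrative; keep your file's ranges)
       // ... any further ranges from your copy stay unchanged ...
       "_"                         // added: treat underscore as a letter
      ]
  >
```

Since "_" also appears in the #P definition of the same grammar, it is
probably worth checking how the two definitions interact before rebuilding.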

Regards
 Daniel

-- 
http://www.danielnaber.de

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

