Re: JavaCC Tokenizer

Peter Carlson Wed, 29 May 2002 07:17:13 -0700

Hi Christian,

You will need to create your own Tokenizer.
Use the StandardTokenizer.jj file as a guide and, instead of using a token
like



  // basic word: a sequence of digits & letters
  <ALPHANUM: (<LETTER>|<DIGIT>)+ >


use

  <ALPHAONLY: (<LETTER>)+ >

and

  <NUMONLY: (<DIGIT>)+ >


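I don't know what your patterns are, but this should get you started. With
separate letter and digit tokens, a string like "IC35L060AVER07" should break
at each letter/digit boundary, provided no other token rule in the grammar
matches a longer span.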
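
To illustrate the splitting behaviour the two rules are aiming for, here is a
rough sketch in plain Java. It is not the JavaCC-generated tokenizer and does
not use any Lucene classes; the class name AlphaNumSplitter and the method
tokenize are made up for this example. It assumes that anything that is
neither a letter nor a digit is just a separator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (not part of Lucene or JavaCC).
public class AlphaNumSplitter {

    // Returns maximal runs of letters and maximal runs of digits, in order.
    // Any other character is treated as a separator and dropped (assumption).
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int start = i;
            if (Character.isLetter(text.charAt(i))) {
                // Mirrors <ALPHAONLY: (<LETTER>)+ >
                while (i < text.length() && Character.isLetter(text.charAt(i))) {
                    i++;
                }
                tokens.add(text.substring(start, i));
            } else if (Character.isDigit(text.charAt(i))) {
                // Mirrors <NUMONLY: (<DIGIT>)+ >
                while (i < text.length() && Character.isDigit(text.charAt(i))) {
                    i++;
                }
                tokens.add(text.substring(start, i));
            } else {
                i++; // skip whitespace, punctuation, etc.
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [IBM, Deskstar, IC, 35, L, 060, AVER, 07]
        System.out.println(tokenize("IBM Deskstar IC35L060AVER07"));
    }
}
```

The real grammar may behave differently if other token rules (for example
ones for host names or e-mail addresses) are left in place, so treat this
only as a way to check the expected output for your own sample strings.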

Also, you may have to make the same change in QueryParser.jj.

--Peter




On 5/29/02 2:19 AM, "Christian Schrader" <[EMAIL PROTECTED]> wrote:

> I need to construct a Tokenizer that tokenizes at word/number boundaries, so
> that "IBM Deskstar IC35L060AVER07" would result in the following tokens:
> IBM
> Deskstar
> IC
> 35
> L
> 060
> AVER
> 07
> 
> Has anybody solved this with the StandardTokenizer?
> 
> Christian
> 
> 

