Hi,
When I use StandardTokenizer
to parse, for example, "I.B.M",
the type of the resulting Token is HOST and not ACRONYM.
Why?
In StandardTokenizer.jj:
// acronyms: U.S.A., I.B.M., etc.
// use a post-filter to remove dots
| <ACRONYM: <ALPHA> "." (<ALPHA> ".")+ >
// hostname
| <HOST: <ALPHANUM> ("." <ALPHANUM>)+ >
"I.B.M" can be a host or an acronym, so there is a problem, no?
----- Original Message -----
From: "petite_abeille" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Thursday, September 04, 2003 3:19 PM
Subject: Re: Lucene app to index Java code
> Hi Erik,
>
> On Thursday, Sep 4, 2003, at 15:03 Europe/Zurich, Erik Hatcher wrote:
>
> > - XDoclet could be used to sweep through Java code and build a
> > text/XML file as richly as you'd like from the information there
> > (complete with JavaDoc tags, which Zapata will miss :)),
>
> Correct. This happen to be on purpose :) Does XDoclet build an
> "intertwingled" object graph of your code along the way? Performing a
> plain search on a code base is pretty trivial... what seems to be more
> interesting would be to put that in context.
>
> Zapata does something along the line of what MagicHat does for
> Objective-C:
>
> http://homepage.mac.com/petite_abeille/MagicHat/
>
> But from the sound of what Otis is saying this is not what you guys are
> looking for... back to the pampa then...
>
> Cheers,
>
> PA.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>