I got the same warning when I compiled the patch. I haven't yet tried my patch together with the patch for Bug 30844 (or against the latest CVS) to see whether that removes the warning. I assume it would, but I haven't tested that. I'll get to it once I finish my current work (which uses Lucene to index Japanese documents) and its looming deadline. :-)

Joey

On Tuesday, September 7, 2004, at 01:19  PM, Daniel Naber wrote:

On Monday 23 August 2004 13:46, Joey Lawrance wrote:

I've attached the HTMLParser.jj file that successfully parses Japanese
HTML for indexing.

Joey,

thanks for the patch. When I compile it with "ant javacc-HTMLParser" I get
this warning:


"Warning: Line 364, Column 3: Non-ASCII characters used in regular
expression.
Please make sure you use the correct Reader when you create the parser that
can handle your character set."
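
[The warning's advice about using "the correct Reader" can be illustrated with a minimal sketch. It assumes only standard java.io classes; the class name ReaderCharsetSketch and the helper decode are hypothetical, and how the parser in this thread actually receives its Reader is not shown here. The sketch uses UTF-8 because every JVM supports it; a real Japanese page may well be encoded as Shift_JIS or EUC-JP instead.]

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

public class ReaderCharsetSketch {

    // Hypothetical helper: reads all characters from a Reader that is
    // bound to an explicitly named charset instead of the platform default.
    static String decode(byte[] bytes, String charsetName) throws IOException {
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append((char) c);
        }
        reader.close();
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "\u65E5\u672C\u8A9E"; // the word "Japanese" written in Japanese
        byte[] bytes = text.getBytes("UTF-8");

        // Reader charset matches the bytes: the text survives intact.
        System.out.println(decode(bytes, "UTF-8").equals(text));      // prints "true"

        // Reader charset does not match: each byte becomes its own character.
        System.out.println(decode(bytes, "ISO-8859-1").equals(text)); // prints "false"
    }
}
```

[A Reader built this way is the kind of object the warning refers to: the charset handed to InputStreamReader has to match the encoding of the document being parsed, otherwise non-ASCII tokens such as the CJK range will not be recognized.]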


Is it okay to get this warning? The line the warning refers to is this one:

| < CJK:                                          // non-alphabets

Besides that, the patch seems to work, i.e. the parser doesn't stop on
Japanese HTML files anymore, but that's all I can say, as I don't speak
Japanese.

Regards
 Daniel

--
http://www.danielnaber.de

