Re: CJK Support for HTMLParser.jj

Joey Lawrance Tue, 07 Sep 2004 14:42:30 -0700

I got the same warning when I compiled the patch. I haven't yet tried my patch together with the patch for Bug 30844 (or the latest CVS) to see whether that removes the warning. I assume it would, but I haven't tested that. I'll get to it once I finish my current work (which uses Lucene to index Japanese documents) under a looming deadline. :-)

Joey

On Tuesday, September 7, 2004, at 01:19 PM, Daniel Naber wrote:

On Monday 23 August 2004 13:46, Joey Lawrance wrote:

I've attached the HTMLParser.jj file that successfully parses Japanese
HTML for indexing.

Joey,

thanks for the patch. When I compile it with "ant javacc-HTMLParser" I get this warning:

"Warning: Line 364, Column 3: Non-ASCII characters used in regular expression. Please make sure you use the correct Reader when you create the parser that can handle your character set."

Is it okay to get this warning? The line the warning refers to is this one:

| < CJK:                                          // non-alphabets

Besides that, the patch seems to work, i.e. the parser doesn't stop on
Japanese HTML files anymore, but that's all I can say, as I don't speak
Japanese.

Regards
 Daniel
--
http://www.danielnaber.de
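
The JavaCC warning quoted above asks for a Reader that matches the document's character set. As a rough illustration only (this is not code from the patch, and the class and method names here are made up for the example), the sketch below decodes the same bytes through an InputStreamReader twice, once with the charset they were written in and once with a wrong one. It assumes the Japanese text is UTF-8; real pages may well use another encoding such as Shift_JIS or EUC-JP.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Lucene or of the patch.
public class ReaderCharsetDemo {

    // Decode raw bytes through a Reader that uses the given charset.
    public static String decode(byte[] bytes, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader =
                 new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Three CJK characters, written as escapes so the source file stays ASCII.
        String japanese = "\u65e5\u672c\u8a9e";
        byte[] bytes = japanese.getBytes(StandardCharsets.UTF_8);

        // Assumption: the input really is UTF-8, so this Reader is the "correct" one.
        String right = decode(bytes, StandardCharsets.UTF_8);
        // A Reader with the wrong charset turns every byte into its own character.
        String wrong = decode(bytes, StandardCharsets.ISO_8859_1);

        System.out.println("correct charset round-trips: " + right.equals(japanese));
        System.out.println("wrong charset round-trips:   " + wrong.equals(japanese));
        System.out.println("chars seen with wrong charset: " + wrong.length());
    }
}
```

A parser that is handed the second kind of Reader never sees the CJK characters at all, only the mis-decoded bytes, so a token rule containing non-ASCII ranges could not match them.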

