Hi folks, Remember that a little while ago Olivier Perrin was having trouble indexing and searching text in Greek or Russian.
I dug into the source code and learned a little about JavaCC. Here is what I found:

When you don't specify the option UNICODE_INPUT, the character table that JavaCC creates is an ASCII table. (A character table is different from an encoding format! Unicode 3.0 is a character table and UTF-8 is a character encoding.) So when it has to deal with characters outside the ASCII table, JavaCC does not recognize them. For example, a Russian character is not in the table, so when the QueryParser or the StandardAnalyzer receives one, it doesn't know what to do with it and aborts.

Just adding UNICODE_INPUT = true; to both .jj files fixed the problem.

Sorry for my poor English, I hope you got the idea. Maybe I can submit a little patch if wanted, but I'm not used to diff :)

--
trémont romain <[EMAIL PROTECTED]>
A.I.S.
http://www.xml-ais.com
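
For reference, the change described above can be sketched as the fragment below. It is only an illustration: it assumes each of the two .jj files already has an options block near the top, and that any options already in that block stay unchanged.

```
options {
  // Any options already present in the file stay as they are.
  // The added line: have JavaCC generate a Unicode-aware character
  // table instead of the default ASCII one.
  UNICODE_INPUT = true;
}
```

After editing the grammar files, the parser sources have to be regenerated with JavaCC for the change to take effect.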
