Hi all, I've a matter with indexing then searching docs written in non-latin languages and encoded in utf-8 (Russian, by example).
I have a web application, with a simple form to search in the contents of the docs. When I submit the form, I encode the query term in utf-8 with encodeURI(String) but I match no doc. I think that is due to a bad indexing but I'm not sure. Lucene is normally indexing docs in writing Terms in the 'xxx.tis' file, encoding it in utf-8, I believe. So when it reads the file, it correctly gets russian characters (2 bytes) but when writing them in the index, they seem different (I've listed the terms in my application console). If someone has a solution to resolve my problem, all advices are welcome. Thanks. Alex --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
