For instance, look at http://www.zilverline.org/zilverlineweb/space/faq
Michael
Karl Øie wrote:
If you use a servlet and an HTML form to feed queries to the
QueryParser, take good care of the encoding configuration of the servlet
container. If you, like me, use Tomcat, you might have to re-decode the
query as UTF-8 before you pass it to Lucene.
Read this:
http://www.crazysquirrel.com/compgen/form-encoding.php
Then, in your receiving servlet:
String raw_query = request.getParameter("query");
// The container decodes parameters as ISO-8859-1 by default; re-decode the bytes as UTF-8.
String query_string = new String(raw_query.getBytes("ISO-8859-1"), "UTF-8");
Then pass query_string to Lucene. This ensures that the string fetched
by getParameter() is decoded with the right encoding.
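The re-decoding step can be tried outside a servlet container. The sketch below is only an illustration: the class QueryDecoding and its methods are hypothetical names, and it assumes the container decoded UTF-8 form bytes as ISO-8859-1 (the servlet default) and that a Java version with java.nio.charset.StandardCharsets is available.

```java
import java.nio.charset.StandardCharsets;

public class QueryDecoding {
    // Simulates a container that decodes UTF-8 form bytes as ISO-8859-1.
    public static String misdecode(String typed) {
        byte[] utf8 = typed.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Hypothetical helper: recover the original bytes and decode them as UTF-8.
    // This is lossless because ISO-8859-1 maps every byte to exactly one char.
    public static String repair(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Øie" followed by two Chinese characters, written as escapes
        // so the source file encoding does not matter.
        String typed = "\u00D8ie \u4E2D\u6587";
        String seen = misdecode(typed);   // what getParameter() might return
        String fixed = repair(seen);      // what should be passed to Lucene
        System.out.println("garbled differs from input: " + !seen.equals(typed));
        System.out.println("repaired equals input: " + fixed.equals(typed));
    }
}
```

For POST forms, calling request.setCharacterEncoding("UTF-8") before the first getParameter() call may make the repair unnecessary; whether that also covers GET query strings depends on the container configuration.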
Hope this helps!
Best regards, Karl Øie
On 11. apr. 2005, at 11.54, Eric Chow wrote:
Hello,
I am a beginner in using Lucene.
My files contain several languages (English, Chinese,
Portuguese, Japanese, and other Asian, non-Latin languages),
and these are always mixed together in one file.
Therefore, I have to use UTF-8 to save the contents.
I am now developing a web-based search engine. I use Lucene to create
an index for those files and search it from the web. The charset of the web
page is UTF-8, but searches do not find anything.
I tried several analyzers (CJKAnalyzer, ChineseAnalyzer,
StandardAnalyzer, SimpleAnalyzer), but they all failed.
Finally, I tested with the original charsets; for example, for the Chinese
contents I used BIG5, and searching worked very well. For the
English contents, of course, there is no problem.
But I can't get UTF-8 to work as the charset for the documents. Any suggestions or
examples?
Best regards,
Eric
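One thing worth checking on the indexing side is whether the files are read with an explicit charset rather than the platform default. The following is a minimal sketch, not a Lucene recipe: the class Utf8Reading and the method readAll are hypothetical names, and the decoded text would then be handed to whatever Lucene field constructor is in use.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf8Reading {
    // Hypothetical helper: decode a whole stream with an explicit charset
    // instead of relying on the platform default encoding.
    public static String readAll(InputStream in, Charset charset) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buffer = new char[4096];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Mixed English and Chinese content, stored as UTF-8 bytes.
        String original = "Lucene \u4E2D\u6587";
        byte[] stored = original.getBytes(StandardCharsets.UTF_8);
        String decoded = readAll(new ByteArrayInputStream(stored), StandardCharsets.UTF_8);
        System.out.println("decoded equals original: " + decoded.equals(original));
    }
}
```

For a real file, the ByteArrayInputStream would be replaced by a FileInputStream on the document being indexed.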
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
- ...I wonder if the really nerdy Klingons learn how to speak english?