Re: lucene web-app russian language
Hi,

Sorry, Lucene supports other languages, but the webapp was written for English. Swap out the analyzer. If you can adapt the webapp to make the analyzer configurable, I'd be happy to update the getting started guide and commit the changes.

Thanks,
Andy

On Fri, 2002-03-01 at 15:49, Ype Kingma wrote:
> Philipp,
>
> > Hi! I was trying the Lucene web-app (lucene-1.2-rc5-dev.jar). I created and indexed a simple HTML document containing both English and Russian words. It was ANSI encoded. If I check _3.fdt in the created index, I can see that my document is indexed, with both the Russian and English terms (it opens in UTF encoding, I suppose). But the problem starts when searching. If I search with a Russian word, it returns nothing; if I search with an English word, it returns a result, but all Russian words are returned as ? signs. I've changed the .jsp content types to return UTF-8 encoding, but the result is still the same. So, finally, does Lucene do multilingual search or not? What am I doing wrong? I have been trying to make it work with Russian docs since version 1.0, but still no idea and no results :((
>
> Did you read the FAQ on the use of the StandardAnalyzer during indexing and query parsing? You might need to replace it with a RussianAnalyzer, which you'll have to write yourself if no one has done so before you. Have a look at the GermanAnalyzer for some inspiration.
>
> Good luck,
> Ype

--
http://www.superlinksoftware.com
http://jakarta.apache.org - port of Excel/Word/OLE 2 Compound Document format to java
http://developer.java.sun.com/developer/bugParade/bugs/4487555.html - fix java generics!

The avalanche has already started. It is too late for the pebbles to vote.
-Ambassador Kosh

--
To unsubscribe, e-mail: mailto:[EMAIL PROTECTED]
For additional commands, e-mail: mailto:[EMAIL PROTECTED]
Re: lucene web-app russian language
Ok, I'll try to make the Russian analyzer and report back to you in 2-3 days. Hopefully with success. But if I fail, I'll report anyway :)

----- Original Message -----
From: Andrew C. Oliver [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Saturday, March 02, 2002 9:28 PM
Subject: Re: lucene web-app russian language

> Hi,
>
> Sorry, Lucene supports other languages, but the webapp was written for English. Swap out the analyzer. If you can adapt the webapp to make the analyzer configurable, I'd be happy to update the getting started guide and commit the changes.
>
> Thanks,
> Andy
Re: lucene web-app russian language
Philipp,

> Hi! I was trying the Lucene web-app (lucene-1.2-rc5-dev.jar). I created and indexed a simple HTML document containing both English and Russian words. It was ANSI encoded. If I check _3.fdt in the created index, I can see that my document is indexed, with both the Russian and English terms (it opens in UTF encoding, I suppose). But the problem starts when searching. If I search with a Russian word, it returns nothing; if I search with an English word, it returns a result, but all Russian words are returned as ? signs. I've changed the .jsp content types to return UTF-8 encoding, but the result is still the same. So, finally, does Lucene do multilingual search or not? What am I doing wrong? I have been trying to make it work with Russian docs since version 1.0, but still no idea and no results :((

Did you read the FAQ on the use of the StandardAnalyzer during indexing and query parsing? You might need to replace it with a RussianAnalyzer, which you'll have to write yourself if no one has done so before you. Have a look at the GermanAnalyzer for some inspiration.

Good luck,
Ype
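To illustrate the point about using the same analysis during indexing and query parsing, here is a minimal sketch in plain Java. It is not Lucene's Analyzer API and not the GermanAnalyzer; the class name SimpleCyrillicTokenizer and its tokenize method are hypothetical, made up for this example. It only shows the idea: split on non-letters, lowercase the tokens, and run the identical routine over both the document text and the query so that the terms can match.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Lucene.
public class SimpleCyrillicTokenizer {

    // Splits text into lowercase tokens made of letters or digits.
    // Character.isLetterOrDigit and Character.toLowerCase are Unicode-aware,
    // so Cyrillic letters are handled the same way as Latin ones.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "Lucene Poisk!" with the second word in Cyrillic, written as escapes
        // so the source file itself stays ASCII.
        String document = "Lucene \u041f\u043e\u0438\u0441\u043a!";
        String query = "\u043f\u043e\u0438\u0441\u043a";

        List<String> indexedTerms = tokenize(document);
        List<String> queryTerms = tokenize(query);

        // The query term can only match if both sides went through
        // the same tokenizing and lowercasing steps.
        System.out.println(indexedTerms.containsAll(queryTerms));
    }
}
```

A real RussianAnalyzer would presumably also need stop words and stemming, as the GermanAnalyzer does for German; those parts are left out here.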
lucene web-app russian language
Hi! I was trying the Lucene web-app (lucene-1.2-rc5-dev.jar). I created and indexed a simple HTML document containing both English and Russian words. It was ANSI encoded. If I check _3.fdt in the created index, I can see that my document is indexed, with both the Russian and English terms (it opens in UTF encoding, I suppose). But the problem starts when searching. If I search with a Russian word, it returns nothing; if I search with an English word, it returns a result, but all Russian words are returned as ? signs. I've changed the .jsp content types to return UTF-8 encoding, but the result is still the same. So, finally, does Lucene do multilingual search or not? What am I doing wrong? I have been trying to make it work with Russian docs since version 1.0, but still no idea and no results :((
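The "? signs" described above are what Java typically produces when text is encoded with a charset that cannot represent it. The sketch below uses only the JDK, not Lucene or the webapp. It assumes that "ANSI encoded" means windows-1251, which the original message does not state, and the class name EncodingDemo is hypothetical. It shows how the same Russian word survives or breaks depending on which charset is named when bytes are read and written.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Lucene or its webapp.
public class EncodingDemo {

    // Decodes raw bytes using an explicitly named charset.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    // Encodes text with the given charset and decodes it again.
    // String.getBytes(Charset) substitutes the charset's replacement
    // byte for every character the charset cannot represent.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // A Russian word written as escapes so the source stays ASCII.
        String word = "\u043f\u043e\u0438\u0441\u043a";

        // Assumption: the "ANSI" file on disk is windows-1251.
        byte[] fileBytes = word.getBytes(Charset.forName("windows-1251"));

        // Reading with the right charset recovers the word.
        System.out.println(decode(fileBytes, "windows-1251").equals(word));

        // Reading the same bytes as ISO-8859-1 gives different characters,
        // so a search for the Russian word would find nothing.
        System.out.println(decode(fileBytes, "ISO-8859-1").equals(word));

        // Writing the word out as ISO-8859-1 turns each letter into '?'.
        System.out.println(roundTrip(word, StandardCharsets.ISO_8859_1));

        // UTF-8 can represent the word, so it round-trips unchanged.
        System.out.println(roundTrip(word, StandardCharsets.UTF_8).equals(word));
    }
}
```

If this is what is happening, the charset has to be named explicitly at every step: when the HTML file is read for indexing, when the query parameter is decoded, and when the result page is written.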