Russian Analyzer

Boris Okner Mon, 01 Apr 2002 22:09:07 -0800

Hi all,

I have just finished an implementation of the Russian stemming algorithm (described at
http://snowball.sourceforge.net/russian/stemmer.html). Today it passed all tests
on a sample Russian vocabulary of almost 50,000 words
(http://snowball.sourceforge.net/russian/voc.txt); that is, every stem it generates for
this vocabulary matches the corresponding Snowball stem
(http://snowball.sourceforge.net/russian/output.txt). The stemmer supports Russian
text in the Unicode, KOI8 and Win1251 charsets. I'm planning to finish a full-featured
Russian Analyzer for Lucene by the end of next week. Could you please tell me how I can
contribute my source code to Lucene?
Thanks,
Boris Okner
