I can't share any experience with K-Stem, but I do remember the K-Stem people contributing code that integrated K-Stem with Lucene a few (2?) years ago. Their code had an unusual license attached, so it never made it into Lucene, but it was available for download, so you should be able to try both K-Stem and Porter and compare the results on your own data.
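A minimal sketch of what such a side-by-side comparison could look like. The two stemmers here are toy suffix strippers built only from the example suffixes mentioned in this thread; they are stand-ins, not the real K-Stem or Porter rule sets, and the class and method names (StemmerComparison, weakStem, strongStem, differences) are made up for illustration. The `differences` helper accepts any two String-to-String functions, so real stemmers could be plugged in instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

class StemmerComparison {

    // Toy suffix lists taken from the examples in this thread.
    // ASSUMPTION: these are NOT the actual K-Stem or Porter2 rules.
    static final String[] INFLECTIONAL = {"es", "ed", "s"};
    static final String[] DERIVATIONAL = {"aciousness", "ability", "able"};

    // Strip the first matching suffix, keeping a stem of at least three characters.
    static String strip(String word, String[] suffixes) {
        for (String suffix : suffixes) {
            if (word.endsWith(suffix) && word.length() - suffix.length() >= 3) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    // "Weak" stand-in: removes inflectional suffixes only.
    static String weakStem(String word) {
        return strip(word, INFLECTIONAL);
    }

    // "Strong" stand-in: removes inflectional, then derivational suffixes.
    static String strongStem(String word) {
        return strip(strip(word, INFLECTIONAL), DERIVATIONAL);
    }

    // Report every word on which the two stemmers disagree.
    static List<String> differences(List<String> words,
                                    UnaryOperator<String> first,
                                    UnaryOperator<String> second) {
        List<String> report = new ArrayList<>();
        for (String word : words) {
            String a = first.apply(word);
            String b = second.apply(word);
            if (!a.equals(b)) {
                report.add(word + ": " + a + " vs " + b);
            }
        }
        return report;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("walked", "boxes", "readable", "readability");
        for (String line : differences(words,
                StemmerComparison::weakStem, StemmerComparison::strongStem)) {
            System.out.println(line);
        }
    }
}
```

Running the same word list (ideally terms drawn from your own index) through both stemmers shows where the stronger one conflates terms that the weaker one keeps apart, which is the behaviour that matters for precision.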
Otis

----- Original Message ----
From: "Yilmazel, Sibel" <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Mon 13 Feb 2006 01:41:52 PM EST
Subject: Stemmer algorithms

Hello all,

We have done some preliminary research on Porter2 and K-stem algorithms and have some questions.

Porter2 was found to be a 'strong' stemming algorithm where it strips off both inflectional suffixes (-s, -es, -ed) and derivational suffixes (-able, -aciousness, -ability).

K-Stem seemed to be a weak stemming algorithm as it strips off only the inflectional suffixes (-s, -es, -ed).

In IR, it is usually recommended using a "weak" stemmer, as the "weak" stemmer seldom hurts performance, but it usually provides significant improvement with precision.

However, Porter2 is the most widely used stemming algorithm AND it is a 'strong' stemmer which is contrary to what is said above.

Can you share your ideas, experiences with stemmer algorithms?

Thanks in advance.

Sibel

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]