Re: stemmer

2006-11-18 Thread Erick Erickson
Thomas: There are some rather extensive threads on this list about the "interesting" issues that exist when indexing/searching other languages. I think you'd find it worthwhile to search the list archive for foreign language or some such... The short answer as I remember is that there *is* a bui

Re: Stemmer Implementation Strategy - feedback?

2006-08-08 Thread eks dev
I would suggest you have a look at the Egothor stemmer (http://www.egothor.org/book/bk01ch01s06.html); it can be trained rather easily (if your only use of "roots" is for searching). I have only heard good things about it, but have never tried it. On Aug 4, 2006, at 1:29 PM, Marios Skounakis wrote: > > > >

Re: Stemmer Implementation Strategy - feedback?

2006-08-07 Thread Marvin Humphrey
On Aug 7, 2006, at 11:23 PM, Marios Skounakis wrote: I directed the question to the lucene list in order to find out what people think about the general case. Martin Porter touches on some of the pros and cons of a dictionary-based approach to stemming at

Re: Stemmer Implementation Strategy - feedback?

2006-08-07 Thread Marios Skounakis
Hi Grant, Thanks for the interesting reply. Grant Ingersoll wrote: Hey Marios, It sounds like you have a reasonable plan and you have thought through the ideas. And the answer to many of your questions below is "it depends". Do you have enough memory to hold the whole lexicon in memory?

Re: Stemmer Implementation Strategy - feedback?

2006-08-07 Thread Grant Ingersoll
Hey Marios, It sounds like you have a reasonable plan and you have thought through the ideas. And the answer to many of your questions below is "it depends". Do you have enough memory to hold the whole lexicon in memory? Is this lexicon going to grow significantly over time? I have, in

Re: Stemmer algorithms

2006-02-13 Thread jason
Hi, I have tested some stemmer algorithms in my application. However, I think we'd be better off writing a weaker algorithm. I mean, Porter and some other algorithms are too strong. Maybe an algorithm that converts plural nouns to singular is enough. On 2/14/06, Yilmazel, Sibel <[EMAIL PROTECTED]> wro
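
For illustration only, the kind of "weaker" stemmer suggested in the message above might look like the sketch below. It strips a few common English plural endings and leaves every other word form alone. The function name `singularize` and the specific suffix rules are assumptions made for this example; they do not come from the thread, from Lucene, or from any published stemmer.

```python
def singularize(word: str) -> str:
    """Very light stemmer: undo common English plural endings only.

    Hypothetical rule set for illustration; real plural handling needs
    many more exceptions (irregular nouns, loanwords, and so on).
    """
    w = word.lower()

    # Leave very short words untouched ("is", "as", "bus").
    if len(w) <= 3:
        return w

    # "ponies" -> "pony"
    if w.endswith("ies"):
        return w[:-3] + "y"

    # "classes" -> "class", "indexes" -> "index", "matches" -> "match"
    if w.endswith(("sses", "xes", "ches", "shes")):
        return w[:-2]

    # "stems" -> "stem", but keep "class", "corpus", "analysis" as they are.
    if w.endswith("s") and not w.endswith(("ss", "us", "is")):
        return w[:-1]

    return w


if __name__ == "__main__":
    for term in ["ponies", "classes", "indexes", "stems", "running", "corpus"]:
        print(term, "->", singularize(term))
```

Unlike a Porter-style stemmer, this sketch does not touch verb or adjective endings, so a word such as "running" passes through unchanged; that is the sense in which it is "weaker".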

Re: Stemmer algorithms

2006-02-13 Thread Otis Gospodnetic
I can't share any experiences with K-Stem, but I can share that I do remember the K-Stem people contributing a piece of code that integrated their K-Stem work with Lucene a few (2?) years ago. Their code had some funky license attached, so it never made it into Lucene, but it was available for down