If you need a full-fidelity solution that handles the many edge cases, it could be worth looking at commercial solutions.
http://www.basistech.com/ has one, including a free-level SaaS plan.

Regards,
   Alex.
----
http://www.solr-start.com/ - Resources for Solr users, new and experienced

On 15 December 2016 at 21:28, Lasitha Wattaladeniya <watt...@gmail.com> wrote:
> Hi all,
>
> Thanks for the replies,
>
> @eric, ahmet : since those are algorithmic stemmers, they won't work on
> irregular words such as caught, ran and so on. So they won't work in our
> case.
>
> @susheel : Yes, I thought about it, but the problem is that the documents
> we index are fairly large texts, so copying them into duplicate fields
> (copyField) will affect both index time (we have jobs that index data
> periodically) and query time. I wonder why there isn't a correct solution
> to this.
>
> Regards,
> Lasitha
>
> Lasitha Wattaladeniya
> Software Engineer
>
> Mobile : +6593896893
> Blog : techreadme.blogspot.com
>
> On Fri, Dec 16, 2016 at 12:58 AM, Susheel Kumar <susheel2...@gmail.com>
> wrote:
>
>> We did an extensive comparison of Snowball, KStem and Hunspell in the
>> past, and there are cases where one of them works better than the
>> others. You can use all three by defining 3 different fields
>> (fieldTypes) and searching all of them at query time.
>>
>> For the cases where none of them works (e.g. wolves, wolf, etc.), use
>> StemmerOverrideFilterFactory.
>>
>> HTH.
>>
>> Thanks,
>> Susheel
>>
>> On Thu, Dec 15, 2016 at 11:32 AM, Ahmet Arslan <iori...@yahoo.com.invalid>
>> wrote:
>>
>> > Hi,
>> >
>> > KStemFilter returns legitimate English words, please use it.
>> >
>> > Ahmet
>> >
>> >
>> >
>> > On Thursday, December 15, 2016 6:17 PM, Lasitha Wattaladeniya <
>> > watt...@gmail.com> wrote:
>> > Hello devs,
>> >
>> > I'm trying to develop an indexing and querying flow that converts
>> > words to their original form (lemmatization). I have been doing a bit
>> > of research lately, but the information on the internet is very
>> > limited. I tried using HunspellStemFilterFactory, but it doesn't
>> > convert a word to its original form; instead it gives suggestions for
>> > some words (Hunspell handles some English words correctly, but for
>> > others it gives multiple suggestions or no suggestions; I used the
>> > en_US.dic provided by OpenOffice).
>> >
>> > I know this is a generic problem in search, so is there anyone who can
>> > point me in the right direction or to some information? :)
>> >
>> > Best regards,
>> > Lasitha Wattaladeniya
>> >
>>
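
For readers of the archive: the override idea Susheel mentions (consult a small table of irregular forms first, and fall back to a stemmer only when there is no entry) can be sketched outside Solr as below. This is only an illustration and not Solr code. The OVERRIDES table, naive_stem, normalize and analyze are hypothetical names made up for this sketch, and the toy suffix stripper merely stands in for a real stemmer such as KStem or Snowball.

```python
# Hypothetical override table: irregular surface form -> desired lemma.
# In Solr the analogous data would live in a stemmer override dictionary file.
OVERRIDES = {
    "caught": "catch",
    "ran": "run",
    "wolves": "wolf",
}


def naive_stem(token):
    """Toy suffix stripper standing in for an algorithmic stemmer (assumption)."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters of the word after stripping.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def normalize(token):
    """Look the token up in the override table first; stem only on a miss."""
    token = token.lower()
    if token in OVERRIDES:
        return OVERRIDES[token]
    return naive_stem(token)


def analyze(text):
    """Whitespace-tokenize the text and normalize every token."""
    return [normalize(t) for t in text.split()]


if __name__ == "__main__":
    print(analyze("Wolves ran and dogs jumped"))
```

The same table has to be applied at both index and query time so that "wolves" and "wolf" meet on the same term. Because it works inside a single field, it avoids the extra copies of large text that the multi-field approach would need, at the cost of maintaining the table by hand.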