Re: Stemming with SOLR

Lasitha Wattaladeniya Sun, 18 Dec 2016 17:55:28 -0800

Thank you all for the replies. I am considering the suggestions.

On 17 Dec 2016 01:50, "Susheel Kumar" <[email protected]> wrote:


> To handle irregular nouns
> (http://www.ef.com/english-resources/english-grammar/singular-and-plural-nouns/),
> the simplest way is to handle them using StemmerOverrideFilterFactory. The
> list is not that long. Otherwise, go for a commercial solution such as
> basistech, as Alex suggested, or customize Hunspell extensively to handle
> most of them.
>
> Thanks,
> Susheel
>
> On Thu, Dec 15, 2016 at 9:46 PM, Alexandre Rafalovitch <[email protected]>
> wrote:
>
> > If you need a full-fidelity solution that takes care of multiple
> > edge cases, it could be worth looking at commercial solutions.
> >
> > http://www.basistech.com/ has one, including a free-tier SaaS plan.
> >
> > Regards,
> >    Alex.
> > ----
> > http://www.solr-start.com/ - Resources for Solr users, new and experienced
> >
> >
> > On 15 December 2016 at 21:28, Lasitha Wattaladeniya <[email protected]>
> > wrote:
> > > Hi all,
> > >
> > > Thanks for the replies,
> > >
> > > @eric, ahmet: since those stemmers are logical stemmers, they won't work
> > > on words such as caught, ran and so on. So they won't work in our case.
> > >
> > > @susheel: Yes, I thought about it, but the problem we have is that the
> > > documents we index are somewhat large texts, so copying them into
> > > duplicate fields (copyField) will affect both index time (we have jobs
> > > that index data periodically) and query time. I wonder why there isn't a
> > > proper solution to this.
> > > Regards,
> > > Lasitha
> > >
> > > Lasitha Wattaladeniya
> > > Software Engineer
> > >
> > > Mobile : +6593896893
> > > Blog : techreadme.blogspot.com
> > >
> > > On Fri, Dec 16, 2016 at 12:58 AM, Susheel Kumar <[email protected]>
> > > wrote:
> > >
> > >> We did an extensive comparison of Snowball, KStem and Hunspell in the
> > >> past, and there are cases where one of them works better than the
> > >> others. You may use all three of them by having 3 different fields
> > >> (fieldTypes) and, at query time, searching in all of them.
> > >>
> > >> For the cases where none of them works (e.g. wolves, wolf, etc.), use
> > >> StemmerOverrideFilterFactory.
> > >>
> > >> HTH.
> > >>
> > >> Thanks,
> > >> Susheel
> > >>
> > >> On Thu, Dec 15, 2016 at 11:32 AM, Ahmet Arslan <[email protected]>
> > >> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > KStemFilter returns legitimate English words, please use it.
> > >> >
> > >> > Ahmet
> > >> >
> > >> >
> > >> >
> > >> > On Thursday, December 15, 2016 6:17 PM, Lasitha Wattaladeniya <
> > >> > [email protected]> wrote:
> > >> > Hello devs,
> > >> >
> > >> > I'm trying to develop an indexing and querying flow that converts
> > >> > words to their original form (lemmatization). I have been doing a bit
> > >> > of research lately, but the information on the internet is very
> > >> > limited. I tried using HunspellStemFilterFactory, but it doesn't
> > >> > convert a word to its original form; instead it gives suggestions for
> > >> > some words (Hunspell works correctly for some English words, but for
> > >> > others it gives multiple suggestions or none; I used the en_us.dic
> > >> > provided by OpenOffice).
> > >> >
> > >> > I know this is a generic problem in search, so can anyone point me
> > >> > in the right direction or to some information? :)
> > >> >
> > >> > Best regards,
> > >> > Lasitha Wattaladeniya
> > >> > Software Engineer
> > >> >
> > >> > Mobile : +6593896893
> > >> > Blog : techreadme.blogspot.com
> > >> >
> > >>
> >
>
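
For reference, the override-first idea suggested in this thread (look a token up in a hand-maintained list of irregular forms, and only fall back to an algorithmic stemmer when there is no entry) can be sketched as below. This is a toy illustration in plain Python, not Solr or Lucene code: the OVERRIDES entries, naive_suffix_stem, normalize and analyze are all hypothetical names, and the suffix stripper is only a stand-in for a real stemmer such as Snowball or KStem.

```python
# Toy sketch of "override first, then stem". Not Solr/Lucene code.
# All names here are hypothetical and exist only for this illustration.

# Hand-maintained irregular forms (assumed examples; a real list would be
# longer and would normally live in a dictionary file).
OVERRIDES = {
    "wolves": "wolf",
    "caught": "catch",
    "ran": "run",
    "children": "child",
    "mice": "mouse",
}


def naive_suffix_stem(token):
    """Stand-in for an algorithmic stemmer: strips a few plural suffixes."""
    if token.endswith("sses"):
        return token[:-2]            # classes -> class
    if token.endswith("ies") and len(token) > 4:
        return token[:-3] + "y"      # ponies -> pony
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]            # documents -> document
    return token


def normalize(token, overrides=OVERRIDES):
    """Return the override if one exists, otherwise the fallback stem."""
    token = token.lower()
    if token in overrides:
        # An overridden token bypasses the fallback stemmer entirely.
        return overrides[token]
    return naive_suffix_stem(token)


def analyze(text):
    """Whitespace-tokenize and normalize each token."""
    return [normalize(t) for t in text.split()]


if __name__ == "__main__":
    print(analyze("Wolves caught ponies"))
```

On its own the fallback turns "wolves" into "wolve", which is exactly the kind of case the override list is meant to catch. The list only needs entries for words the stemmer gets wrong, so it stays short, and no duplicate fields are required.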
