On Fri, Jul 8, 2016 at 8:56 PM, Tom Ivar Helbekkmo <[email protected]> wrote:
> Abhinav Upadhyay <[email protected]> writes:
>
>> We just need to handle the special cases where we don't want to stem :)
>
> ...or perhaps do the stemming only when the resulting stem is found in
> /usr/share/dict/words?

Yes, that's probably a good idea. I first need to write the custom tokenizer and I can probably use that dictionary to decide what to stem and what not to stem.

- Abhinav
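
For illustration, here is a rough sketch of the dictionary check suggested above: stem a token, and keep the stem only if it appears in the word list. It is not the actual tokenizer. `naive_stem` is a hypothetical stand-in for a real stemmer such as Porter, and the names `load_words` and `stem_if_known` are made up for this example; only the /usr/share/dict/words path comes from the thread.

```python
def load_words(path="/usr/share/dict/words"):
    """Read a word list (one word per line) into a lowercase set."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return {line.strip().lower() for line in f if line.strip()}


# Assumption: a tiny suffix stripper standing in for a real stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(token):
    """Strip the first matching suffix, leaving at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[:-len(suffix)]
    return token


def stem_if_known(token, words):
    """Return the stem only if it is a dictionary word; else keep the token."""
    lowered = token.lower()
    stem = naive_stem(lowered)
    return stem if stem in words else lowered
```

With a word list containing "mount", `stem_if_known("mounted", words)` gives "mount", while a token like "named" stays as it is because its stem "nam" is not a dictionary word. Tokens that should never be stemmed but whose stems happen to be real words would still need separate handling.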
