Re: Problem with porter stemming

Benson Margulies Mon, 14 Mar 2016 09:20:04 -0700

Stemming is an inherently limited process. It doesn't know about the
word 'news'; it just has a rule about a trailing 's'.
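
To make that concrete, here is a minimal sketch of a rule-only suffix
stripper with a protected-word list. It is plain Java, not Lucene code and
not the actual Porter algorithm; the class name ToySuffixStemmer, the
single "strip one trailing 's'" rule, and the PROTECTED set are all
assumptions made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ToySuffixStemmer {
    // Hypothetical protected-word list: words the stemmer must leave alone.
    private static final Set<String> PROTECTED =
            new HashSet<>(Arrays.asList("news"));

    // Naive rule (an assumption, much simpler than Porter):
    // drop one trailing 's' unless the word ends in "ss".
    static String stripS(String word) {
        if (word.length() > 2 && word.endsWith("s") && !word.endsWith("ss")) {
            return word.substring(0, word.length() - 1);
        }
        return word;
    }

    // Apply the rule only to words that are not protected.
    static String stem(String word) {
        return PROTECTED.contains(word) ? word : stripS(word);
    }

    public static void main(String[] args) {
        for (String w : new String[] {"news", "cats", "glass"}) {
            System.out.println(w + ": rule only = " + stripS(w)
                    + ", with protection = " + stem(w));
        }
    }
}
```

The rule alone has no way to tell 'news' from a plural, so it yields 'new'.
The protected set plays roughly the role that the keyword attribute plays
in the KeywordMarkerFilter suggestion quoted below: the word is flagged
before stemming and passes through unchanged.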


Some of us sell commercial products that do more complex linguistic
processing that knows which words are which.

There may be open source implementations of similar technology.


On Mon, Mar 14, 2016 at 12:13 PM, Ahmet Arslan
<[email protected]> wrote:
> Hi Dwaipayan,
>
> Another way is to use KeywordMarkerFilter. Stemmer implementations respect 
> this attribute.
> If you want to supply your own mappings, StemmerOverrideTokenFilter could be 
> used as well.
>
> ahmet
>
>
> On Monday, March 14, 2016 4:31 PM, Dwaipayan Roy <[email protected]> 
> wrote:
>
>
>
> I am using EnglishAnalyzer with my own stopword list. EnglishAnalyzer uses
> the Porter stemmer (Snowball) to stem the words. But using the
> EnglishAnalyzer, I am getting an erroneous result for 'news': 'news' is
> getting stemmed into 'new'.
>
> Any help would be appreciated.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>

