Stemming is an inherently limited process. The stemmer doesn't know about the
word 'news'; it just has a rule about a trailing 's'.
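
To make that concrete, here is a minimal sketch of plural-stripping rules in
the style of step 1a of the original Porter algorithm. It is a toy, not
Lucene's stemmer code: the class ToyStemmer and the methods stripPlural and
stem are hypothetical names invented for this illustration. The protected-word
set is only analogous in spirit to the keyword marking suggested in the quoted
message below.

```java
import java.util.Set;

// Toy illustration only: this is NOT Lucene's stemmer implementation.
// All names here (ToyStemmer, stripPlural, stem) are hypothetical.
final class ToyStemmer {

    // Suffix rules in the style of step 1a of the original Porter algorithm.
    // They look only at the ending of the word; there is no dictionary.
    static String stripPlural(String word) {
        if (word.endsWith("sses")) {
            return word.substring(0, word.length() - 2); // "caresses" -> "caress"
        }
        if (word.endsWith("ies")) {
            return word.substring(0, word.length() - 2); // "ponies" -> "poni"
        }
        if (word.endsWith("ss")) {
            return word;                                  // "caress" is left alone
        }
        if (word.endsWith("s")) {
            return word.substring(0, word.length() - 1); // "cats" -> "cat", "news" -> "new"
        }
        return word;
    }

    // Words in the protected set bypass the suffix rules entirely.
    static String stem(String word, Set<String> protectedWords) {
        return protectedWords.contains(word) ? word : stripPlural(word);
    }

    public static void main(String[] args) {
        Set<String> protectedWords = Set.of("news");
        System.out.println(stripPlural("news"));          // new
        System.out.println(stem("news", protectedWords)); // news
        System.out.println(stem("cats", protectedWords)); // cat
    }
}
```

The rule cannot tell 'news' from an ordinary plural, so the only fix at this
level is to list the exceptions by hand.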

Some of us sell commercial products that do more complex linguistic
processing and know which words are which.

There may be open source implementations of similar technology.


On Mon, Mar 14, 2016 at 12:13 PM, Ahmet Arslan
<iori...@yahoo.com.invalid> wrote:
> Hi Dwaipayan,
>
> Another way is to use KeywordMarkerFilter, which marks selected tokens as
> keywords. Stemmer implementations respect this attribute and leave marked
> tokens unstemmed.
> If you want to supply your own mappings, StemmerOverrideFilter could be
> used as well.
>
> ahmet
>
>
> On Monday, March 14, 2016 4:31 PM, Dwaipayan Roy <dwaipayan....@gmail.com> 
> wrote:
>
>
>
> I am using EnglishAnalyzer with my own stopword list. EnglishAnalyzer uses
> the porter stemmer (snowball) to stem the words. But with the
> EnglishAnalyzer, I am getting an erroneous result for 'news': it is
> getting stemmed into 'new'.
>
> Any help would be appreciated.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>

