Robert Muir created LUCENE-5818:
-----------------------------------
Summary: Fix hunspell zero-string overgeneration
Key: LUCENE-5818
URL: https://issues.apache.org/jira/browse/LUCENE-5818
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Currently, it's allowed to strip suffixes/prefixes all the way down to the empty
string. But this should not be allowed, and it creates overgeneration in some
cases (especially where endings can be standalone; typically these are
stopwords, so it causes a lot of damage).
An example is Czech 'už', which should just stem to itself, but today also stems to
'úžit' because it has a flag compatible with that.
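
To illustrate the kind of guard described above, here is a minimal, self-contained sketch. It is not Lucene's actual stemmer code: the class `ZeroStringDemo`, the `SuffixRule` type, the `allowEmptyRemainder` switch, and the strip/add pair used for 'už' are all hypothetical, and the dictionary and flag lookup is skipped entirely. It only shows how removing a suffix that equals the whole word leaves an empty string, and how rejecting that case stops the extra stem.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ZeroStringDemo {
    // Hypothetical suffix rule: when stemming, `add` is removed from the end
    // of the surface word and `strip` is appended to what remains.
    static final class SuffixRule {
        final String strip;
        final String add;

        SuffixRule(String strip, String add) {
            this.strip = strip;
            this.add = add;
        }
    }

    // Returns candidate stems for `word`. Assumption: the word itself is a
    // dictionary entry, and every rule's flag is compatible (no real lookup).
    static List<String> stems(String word, List<SuffixRule> rules, boolean allowEmptyRemainder) {
        List<String> out = new ArrayList<>();
        out.add(word);
        for (SuffixRule rule : rules) {
            if (!word.endsWith(rule.add)) {
                continue;
            }
            String remainder = word.substring(0, word.length() - rule.add.length());
            // The guard: do not strip all the way down to the empty string.
            if (remainder.isEmpty() && !allowEmptyRemainder) {
                continue;
            }
            out.add(remainder + rule.strip);
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up rule chosen so that the whole word 'už' matches the suffix.
        List<SuffixRule> rules = Arrays.asList(new SuffixRule("úžit", "už"));
        System.out.println(stems("už", rules, true));  // overgenerates: [už, úžit]
        System.out.println(stems("už", rules, false)); // with the guard: [už]
    }
}
```

With the guard enabled, a word that is itself a complete ending only stems to itself in this sketch.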