Hi Adrien,

To me, this simply sounds like a bug. Could you please open a JIRA issue (with a patch if possible)?
Tomoko

On Tue, Jul 23, 2019 at 22:05, Adrien Gallou <adriengal...@gmail.com> wrote:
>
> Hi,
>
> I'm using both the light and minimal French stemmers and encountered an
> issue when using the minimal stemmer.
>
> The light stemmer removes the last character of a word if the last two
> characters are identical.
> We can see that here:
> https://github.com/apache/lucene-solr/blob/master/lucene/analysis/common/src/java/org/apache/lucene/analysis/fr/FrenchLightStemmer.java#L263
> In this light stemmer, there is a check to avoid altering the token if the
> token is a number.
>
> The minimal stemmer also removes the last character of a word if the last
> two characters are identical.
> We can see that here:
> https://github.com/apache/lucene-solr/blob/master/lucene/analysis/common/src/java/org/apache/lucene/analysis/fr/FrenchMinimalStemmer.java#L77
>
> But in this minimal stemmer there is no check to see whether the character
> is a letter or not.
> So numeric tokens whose last two characters are identical are altered.
>
> Is there a reason for this?
> Should I file an issue on Jira to add this check?
>
> Thanks,
>
> Adrien Gallou
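
For anyone following the thread, here is a minimal sketch of the behaviour described above and of one possible guard. It is not the Lucene source: the class and method names (DoubledCharGuard, stripUnguarded, stripGuarded) are hypothetical, and the only library call used is the standard Character.isLetter. It isolates just the "drop the last character when the last two are identical" rule, so the rest of the stemmer is assumed away.

```java
// Illustrative only: a stand-alone sketch, not the code of FrenchMinimalStemmer.
class DoubledCharGuard {

    // Hypothetical helper mirroring the reported behaviour:
    // drop the last character whenever the last two are identical.
    static int stripUnguarded(char[] s, int len) {
        if (len >= 2 && s[len - 1] == s[len - 2]) {
            len--;
        }
        return len;
    }

    // Hypothetical helper with the suggested check:
    // only drop the last character when it is a letter.
    static int stripGuarded(char[] s, int len) {
        if (len >= 2 && s[len - 1] == s[len - 2] && Character.isLetter(s[len - 1])) {
            len--;
        }
        return len;
    }

    public static void main(String[] args) {
        char[] number = "1234500".toCharArray();
        char[] word = "appell".toCharArray();

        // Numeric token: the unguarded rule shortens it, the guarded rule does not.
        System.out.println(new String(number, 0, stripUnguarded(number, number.length))); // 123450
        System.out.println(new String(number, 0, stripGuarded(number, number.length)));   // 1234500

        // Alphabetic token: both rules drop the doubled final letter.
        System.out.println(new String(word, 0, stripGuarded(word, word.length)));         // appel
    }
}
```

With a guard like this, a token such as "1234500" passes through unchanged, while alphabetic tokens ending in a doubled letter are still shortened. Whether a letter check or a digit check is the better fit for the actual stemmer is a design decision for the patch.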