[ https://issues.apache.org/jira/browse/LUCENE-8416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16611862#comment-16611862 ]
Peter Cseh commented on LUCENE-8416:
------------------------------------

I've created a PR for this. I could not find any tests that need to change as a result.

> Add tokenized version of o.o. to Stempel stopwords
> --------------------------------------------------
>
>                 Key: LUCENE-8416
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8416
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Trey Jones
>            Priority: Trivial
>              Labels: easyfix, newbie
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The Stempel stopword list (
> lucene-solr/lucene/analysis/stempel/src/resources/org/apache/lucene/analysis/pl/stopwords.txt
> ) contains "o.o.", which is a good stopword (it's part of the abbreviation
> for "limited liability company", which is "[sp. z
> o.o.|https://en.wiktionary.org/wiki/sp._z_o.o.]"). However, the standard
> tokenizer changes "o.o." to "o.o", so the stopword filter has no effect.
> Add "o.o" to the stopword list. (It's probably okay to leave "o.o." in the
> list, though, in case a different tokenizer is used.)
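
For anyone following along, here is a rough sketch of the mismatch the issue describes. It does not use Lucene: `tokenize` is a hypothetical, much simplified stand-in for the standard tokenizer (split on whitespace, lowercase, strip trailing punctuation), and the stopword sets are illustrative rather than the real contents of stopwords.txt. It only shows why an entry of "o.o." never matches once the token has become "o.o".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopwordDemo {

    // Hypothetical stand-in for the standard tokenizer, NOT Lucene's algorithm:
    // split on whitespace, lowercase, and strip trailing punctuation.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            String token = raw.toLowerCase(Locale.ROOT).replaceAll("[.,;:!?]+$", "");
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Drop every token that appears verbatim in the stopword set.
    public static List<String> removeStopwords(List<String> tokens, Set<String> stopwords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopwords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("Firma sp. z o.o.");

        // Only "o.o." listed: the token is "o.o", so nothing is removed.
        System.out.println(removeStopwords(tokens, Set.of("o.o.")));
        // prints [firma, sp, z, o.o]

        // With "o.o" added alongside "o.o.", the token is filtered out.
        System.out.println(removeStopwords(tokens, Set.of("o.o.", "o.o")));
        // prints [firma, sp, z]
    }
}
```

Keeping both "o.o." and "o.o" in the set mirrors the suggestion in the issue: the extra entry is harmless and still covers a tokenizer that preserves the trailing period.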