[ https://issues.apache.org/jira/browse/LUCENE-7960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461695#comment-16461695 ]
Shawn Heisey commented on LUCENE-7960:
--------------------------------------

My original idea would have been handled by one boolean -- keeping terms shorter than minGram. On more than one occasion, I've fielded questions where it turns out the user is trying to search for terms shorter than their minGram size.

In discussing it, the notion of *long* terms being removed by the min/max range also came up. It was an idea I had not originally considered, but I have since encountered someone who had ngram on the index side but not the query side, and wanted to search for terms longer than their maxGram size.

It could be reduced to one "keep" boolean to keep both short and long terms, but I think we're going to have people who want to keep short terms but not long terms, and vice versa.

> NGram filters -- add option to keep short terms
> -----------------------------------------------
>
>                 Key: LUCENE-7960
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7960
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Shawn Heisey
>            Priority: Major
>         Attachments: LUCENE-7960.patch, LUCENE-7960.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When ngram or edgengram filters are used, any terms that are shorter than the
> minGramSize are completely removed from the token stream.
> This is probably 100% what was intended, but I've seen it cause a lot of
> problems for users.  I am not suggesting that the default behavior be
> changed.  That would be far too disruptive to the existing user base.
> I do think there should be a new boolean option, with a name like
> keepShortTerms, that defaults to false, to allow the short terms to be
> preserved.
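
To make the two cases concrete, here is a minimal sketch in plain Python of the behavior being discussed. It is not Lucene code and does not use the Lucene API: the function name `ngram_filter` and the flags `keep_short` and `keep_long` are hypothetical, chosen only to mirror the two separate booleans suggested in the comment. The order in which grams are emitted is a simplification as well.

```python
def ngram_filter(terms, min_gram, max_gram, keep_short=False, keep_long=False):
    """Toy model of an n-gram token filter (illustration only, not Lucene).

    keep_short and keep_long are hypothetical flags:
      keep_short -- pass through terms shorter than min_gram unchanged
      keep_long  -- also emit the whole term when it is longer than max_gram
    With both flags False, short terms vanish from the output and long
    terms are represented only by their grams.
    """
    out = []
    for term in terms:
        n = len(term)
        if n < min_gram:
            # No gram of the required size exists, so the term is dropped
            # unless the caller asked to keep short terms.
            if keep_short:
                out.append(term)
            continue
        # Emit every substring whose length is within [min_gram, max_gram].
        for start in range(n - min_gram + 1):
            for size in range(min_gram, min(max_gram, n - start) + 1):
                out.append(term[start:start + size])
        # The full term is not among the grams when it exceeds max_gram.
        if n > max_gram and keep_long:
            out.append(term)
    return out


if __name__ == "__main__":
    terms = ["to", "solr"]
    print(ngram_filter(terms, 3, 3))
    print(ngram_filter(terms, 3, 3, keep_short=True))
    print(ngram_filter(terms, 3, 3, keep_long=True))
```

With minGram and maxGram both set to 3, the default call loses "to" entirely. The `keep_long` case is the index-side-only scenario from the comment: an unanalyzed query for "solr" can only match if the whole term was indexed alongside its grams.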