[ https://issues.apache.org/jira/browse/LUCENE-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253086#comment-13253086 ]
David Mason commented on LUCENE-3979:
-------------------------------------

I'm happy to submit a patch for this, but I haven't done so for this or similar projects, so it will take me a while to go through the wiki and get set up to make a patch.

> NGramTokenizer
> --------------
>
>                 Key: LUCENE-3979
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3979
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: modules/analysis
>    Affects Versions: 2.9.2, 3.0
>         Environment: n/a
>            Reporter: David Mason
>            Priority: Minor
>              Labels: tokenizer, whitespace
>   Original Estimate: 5m
>  Remaining Estimate: 5m
>
> org.apache.lucene.analysis.ngram.NGramTokenizer removes whitespace, making a
> search for literal strings like " test" and "test " equivalent to "test".
> Searching with relevant whitespace is sometimes desired, particularly where
> ngrams are used.
> This could be fixed by either removing .trim() from the line shown below, or
> by providing a flag to specifically set trimming behaviour (keeping trim=true
> as the default so that existing code using this analyzer is not broken).
> 111: inStr = new String(chars).trim(); // remove any trailing empty strings
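
To illustrate the behaviour described in the issue, here is a minimal standalone sketch in Java. It is not Lucene's NGramTokenizer code and does not use any Lucene API: the class name NGramTrimSketch, the ngrams method, and the boolean trim parameter are all hypothetical, chosen only to show how a trim() call like the one on the quoted line collapses " test" and "test " into "test", and how a flag along the lines proposed in the issue might preserve the whitespace.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration only; NOT Lucene's NGramTokenizer.
// The `trim` parameter is a hypothetical flag modelled on the issue's proposal.
class NGramTrimSketch {

    /** Returns all n-grams of length minGram..maxGram, shorter grams first. */
    static List<String> ngrams(String input, int minGram, int maxGram, boolean trim) {
        // With trim=true this mimics the quoted line: new String(chars).trim()
        String inStr = trim ? input.trim() : input;
        List<String> grams = new ArrayList<>();
        for (int n = minGram; n <= maxGram; n++) {
            for (int start = 0; start + n <= inStr.length(); start++) {
                grams.add(inStr.substring(start, start + n));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(ngrams(" test", 2, 2, true));   // [te, es, st]
        System.out.println(ngrams(" test", 2, 2, false));  // [ t, te, es, st]
        System.out.println(ngrams("test ", 2, 2, false));  // [te, es, st, t ]
    }
}
```

With trimming enabled, " test", "test " and "test" all yield the same grams, so they cannot be told apart at search time. With trimming disabled, the leading or trailing space shows up in its own gram (" t" or "t "), which is what a whitespace-sensitive search would need.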