NGramTokenizer
--------------
Key: LUCENE-3979
URL: https://issues.apache.org/jira/browse/LUCENE-3979
Project: Lucene - Java
Issue Type: Bug
Components: modules/analysis
Affects Versions: 3.0, 2.9.2
Environment: n/a
Reporter: David Mason
Priority: Minor
org.apache.lucene.analysis.ngram.NGramTokenizer trims leading and trailing
whitespace from its input, so searches for literal strings like " test" and
"test " become equivalent to a search for "test". Whitespace-sensitive
searching is sometimes desired, particularly where ngrams are used.
This could be fixed either by removing .trim() from the line shown below, or by
providing a flag that sets the trimming behaviour explicitly (keeping trim=true
as the default so that existing code using this tokenizer is not broken):
111: inStr = new String(chars).trim(); // remove any trailing empty strings
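
To illustrate the effect and the proposed flag, here is a minimal standalone
sketch. It is not the Lucene implementation: the class NGramTrimSketch, the
ngrams helper and its trim parameter are hypothetical names chosen for
illustration, and the emission order (all grams of one size before the next)
is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class NGramTrimSketch {

    // Hypothetical helper, not part of Lucene: returns every n-gram of the
    // input with length between minGram and maxGram (inclusive). When trim is
    // true, leading and trailing whitespace is dropped first, mirroring the
    // .trim() call quoted in this issue.
    public static List<String> ngrams(String input, int minGram, int maxGram, boolean trim) {
        String s = trim ? input.trim() : input;
        List<String> grams = new ArrayList<>();
        for (int n = minGram; n <= maxGram; n++) {
            for (int start = 0; start + n <= s.length(); start++) {
                grams.add(s.substring(start, start + n));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // With trimming, " test" and "test" yield the same bigrams.
        System.out.println(ngrams(" test", 2, 2, true));  // [te, es, st]
        // Without trimming, the leading space survives in the first bigram.
        System.out.println(ngrams(" test", 2, 2, false)); // [ t, te, es, st]
    }
}
```

With trim=true the leading space never reaches the index, so a query for
" test" cannot be distinguished from one for "test"; with trim=false the
bigram " t" is kept and the two queries differ.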