Trejkaz created LUCENE-6584:
-------------------------------

             Summary: Docs on StandardTokenizer don't mention the behaviour change in Version.LUCENE_4_7_0
                 Key: LUCENE-6584
                 URL: https://issues.apache.org/jira/browse/LUCENE-6584
             Project: Lucene - Core
          Issue Type: Bug
          Components: modules/analysis
    Affects Versions: 4.10.4
            Reporter: Trejkaz
            Priority: Minor

The following test shows that the behaviour of StandardTokenizer changes once you pass Version.LUCENE_4_7_0 or greater. The two test methods run the same assertion against the same input, yet they do not give the same result:

{code}
import java.io.StringReader;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.util.Version;
import org.junit.Test;

import static org.hamcrest.Matchers.is;
import static org.junit.Assert.assertThat;

public class TestStandardTokenizerStandalone {

    @Test
    public void testLucene4_6_1() throws Exception {
        doTest(Version.LUCENE_4_6_1);
    }

    @Test
    public void testLucene4_7_0() throws Exception {
        doTest(Version.LUCENE_4_7_0);
    }

    // Tokenises one unbroken run of 2550 'x' characters and asserts
    // that the tokenizer emits no tokens at all for it.
    public void doTest(Version version) throws Exception {
        try (TokenStream stream = new StandardTokenizer(version,
                new StringReader(makeLongString(2550)))) {
            stream.reset();
            assertThat(stream.incrementToken(), is(false));
        }
    }

    // Builds a string of the given length consisting only of 'x'.
    private String makeLongString(int length) {
        StringBuilder builder = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            builder.append('x');
        }
        return builder.toString();
    }
}
{code}

However, the Javadoc only mentions the behaviour changes in versions 3.1 and 3.4. The constructor that takes the Version is deprecated, presumably under the false impression that no behaviour changes occurred during Lucene 4.

I know the Version parameter was removed entirely in version 5, which presumably means that anyone who tokenised text with Lucene 4.6 or earlier is now trapped and has to copy the tokeniser from Lucene 4 to keep their queries working.
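
The test above only checks whether the first incrementToken() call returns false; it does not show what the tokenizer emits instead. As an illustration only, here is a toy sketch in plain Java (no Lucene involved) of two ways a tokenizer could treat a run longer than its maximum token length: drop it, or chop it into pieces. The report does not confirm that this is the difference between the two versions, so treat it as an assumption. The class name, the method names, and the limit of 255 used in main are all hypothetical choices made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, NOT Lucene code: two hypothetical policies for over-long tokens.
final class LongTokenPolicies {

    // Hypothetical policy A: a whitespace-delimited run longer than maxLen
    // is dropped entirely.
    static List<String> discardLong(String text, int maxLen) {
        List<String> out = new ArrayList<>();
        for (String run : text.split("\\s+")) {
            if (!run.isEmpty() && run.length() <= maxLen) {
                out.add(run);
            }
        }
        return out;
    }

    // Hypothetical policy B: a run longer than maxLen is chopped into
    // consecutive pieces of at most maxLen characters.
    static List<String> splitLong(String text, int maxLen) {
        List<String> out = new ArrayList<>();
        for (String run : text.split("\\s+")) {
            for (int i = 0; i < run.length(); i += maxLen) {
                out.add(run.substring(i, Math.min(run.length(), i + maxLen)));
            }
        }
        return out;
    }

    // Same idea as makeLongString in the report: n copies of one character.
    static String repeat(char c, int n) {
        StringBuilder builder = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            builder.append(c);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        String input = repeat('x', 2550);
        // Policy A yields no tokens, so a "no tokens" assertion would pass.
        System.out.println(discardLong(input, 255).size()); // prints 0
        // Policy B yields tokens, so the same assertion would fail.
        System.out.println(splitLong(input, 255).size());   // prints 10
    }
}
```

If the real difference is of this kind, an index built under one policy and queried under the other would disagree on every over-long term, which is the compatibility concern raised above.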