[ 
https://issues.apache.org/jira/browse/LUCENE-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592934#comment-14592934
 ] 

Trejkaz commented on LUCENE-6584:
---------------------------------

http://lucene.apache.org/core/4_10_4/analyzers-common/org/apache/lucene/analysis/standard/StandardTokenizer.html


> Docs on StandardTokenizer don't mention the behaviour change in 
> Version.LUCENE_4_7_0
> ------------------------------------------------------------------------------------
>
>                 Key: LUCENE-6584
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6584
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/analysis
>    Affects Versions: 4.10.4
>            Reporter: Trejkaz
>            Priority: Minor
>
> The following test shows that the behaviour of StandardTokenizer differs once 
> you start passing Version.LUCENE_4_7_0 or greater:
> {code}
> import java.io.StringReader;
> import org.apache.lucene.analysis.TokenStream;
> import org.apache.lucene.analysis.standard.StandardTokenizer;
> import org.apache.lucene.util.Version;
> import org.junit.Test;
> import static org.hamcrest.Matchers.is;
> import static org.junit.Assert.assertThat;
> public class TestStandardTokenizerStandalone
> {
>     @Test
>     public void testLucene4_6_1() throws Exception
>     {
>         doTest(Version.LUCENE_4_6_1);
>     }
>     @Test
>     public void testLucene4_7_0() throws Exception
>     {
>         doTest(Version.LUCENE_4_7_0);
>     }
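>     // Tokenises a single 2550-character run of 'x' with the given version
>     // and asserts that no token is produced.
>     public void doTest(Version version) throws Exception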
>     {
>         try (TokenStream stream = new StandardTokenizer(version,
>                 new StringReader(makeLongString(2550))))
>         {
>             stream.reset();
>             assertThat(stream.incrementToken(), is(false));
>         }
>     }
>     private String makeLongString(int length)
>     {
>         StringBuilder builder = new StringBuilder(length);
>         for (int i = 0; i < length; i++)
>         {
>             builder.append('x');
>         }
>         return builder.toString();
>     }
> }
> {code}
> However, the Javadoc only mentions the behaviour changes in versions 3.1 and 
> 3.4.
> The constructor that takes the version is deprecated, presumably under the 
> false impression that no changes occurred during Lucene 4. I know the Version 
> parameter was removed entirely in version 5, which presumably means that 
> anyone who tokenised text with Lucene 4.6 or earlier is now stuck and has to 
> copy the tokeniser from Lucene 4 to keep their queries working.




