[ https://issues.apache.org/jira/browse/LUCENE-2074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12854885#action_12854885 ]
Shai Erera commented on LUCENE-2074:
------------------------------------

Uwe, must this be coupled with that issue? This one has been waiting for a long time (why? for the JFlex 1.5 release?), and protecting against a huge buffer allocation can be a really quick and tiny fix. This one also focuses on getting Unicode 5 to work, which is unrelated to the buffer size. But the buffer size is not a critical issue that we need to move fast on either ... so it's your call. I just thought they are two unrelated problems.

> Use a separate JFlex generated Unicode 4 by Java 5 compatible StandardTokenizer
> -------------------------------------------------------------------------------
>
>                 Key: LUCENE-2074
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2074
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 3.0
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 3.1
>
>         Attachments: jflex-1.4.1-vs-1.5-snapshot.diff, jflexwarning.patch,
>                      LUCENE-2074-lucene30.patch, LUCENE-2074.patch, LUCENE-2074.patch,
>                      LUCENE-2074.patch, LUCENE-2074.patch, LUCENE-2074.patch,
>                      LUCENE-2074.patch, LUCENE-2074.patch
>
>
> The current trunk version of StandardTokenizerImpl was generated by Java 1.4
> (according to the warning). In Lucene 3.0 we switch to Java 1.5, so we should
> regenerate the file.
> After regeneration the Tokenizer behaves differently for some characters.
> Because of that, we should only use the new TokenizerImpl when
> Version.LUCENE_30 or LUCENE_31 is used as matchVersion.