[ https://issues.apache.org/jira/browse/LUCENE-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992014#comment-12992014 ]
Steven Rowe commented on LUCENE-2911:
-------------------------------------

The generated top-level domain macro file has a bunch of new entries when I run this, but these are not included in your patch, and I think we should keep this list up-to-date.

The patch is missing HangulSupp macro generation in modules/icu/src/tools/.../GenerateJFlexSupplementaryMacros.java, but since the Hangul macro is not used in the jflex grammar, this doesn't cause a problem.

It would be nice to remove the hard-coded ranges for the intersection of Hangul & ALetter, but when I tried to use JFlex negation and union to produce the equivalent, memory usage exploded and I couldn't get JFlex to generate, so I guess we'll have to wait on native JFlex supplementary character support before we can change it.

> synchronize grammar/token types across StandardTokenizer,
> UAX29EmailURLTokenizer, ICUTokenizer, add CJK types.
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2911
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2911
>             Project: Lucene - Java
>          Issue Type: Sub-task
>          Components: Analysis
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>             Fix For: 3.1
>
>         Attachments: LUCENE-2911.patch
>
>
> I'd like to do LUCENE-2906 (better cjk support for these tokenizers) for a
> future target such as 3.2
> But, in 3.1 I would like to do a little cleanup first, and synchronize all
> these token types, etc.