[ https://issues.apache.org/jira/browse/LUCENE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17446609#comment-17446609 ]
Robert Muir commented on LUCENE-10239:
--------------------------------------

I manually ran the analyzers benchmark from luceneutil (http://people.apache.org/~mikemccand/lucenebench/analyzers.html) to make sure there was no regression with the upgrade. Any difference looks to be in the noise:

{noformat}
jflex 1.7.0:
Standard time=2125.72 msec hash=-1293025428516531 tokens=16275933
Standard time=2153.42 msec hash=-1293025428516531 tokens=16275933
Standard time=2094.67 msec hash=-1293025428516531 tokens=16275933
Standard time=2182.98 msec hash=-1293025428516531 tokens=16275933
Standard time=2138.22 msec hash=-1293025428516531 tokens=16275933

jflex 1.8.2:
Standard time=2135.17 msec hash=-1293025428516531 tokens=16275933
Standard time=2076.18 msec hash=-1293025428516531 tokens=16275933
Standard time=2018.94 msec hash=-1293025428516531 tokens=16275933
Standard time=2090.48 msec hash=-1293025428516531 tokens=16275933
Standard time=2129.55 msec hash=-1293025428516531 tokens=16275933
{noformat}

> upgrade jflex (1.7.0 -> 1.8.2)
> ------------------------------
>
>                 Key: LUCENE-10239
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10239
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Robert Muir
>            Priority: Major
>             Fix For: 9.1
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When reviewing LUCENE-10238, I noticed we still had unicode 9.0 data
> specified for our jflex tokenizers.
> According to the changelog I see some key benefits from upgrading to jflex
> 1.8.2:
> * unicode 9 -> unicode 12.1
> * remove our custom emoji regeneration via ICU, as jflex supports emoji
> properties directly now.
> * Less RAM at runtime to users (two stage tables):
> https://github.com/jflex-de/jflex/pull/697
> https://www.jflex.de/changelog.html
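
For anyone who wants to check the "in the noise" reading of the timings in the comment, here is a minimal sketch that summarizes the two sets of runs. The numbers are copied from the benchmark output quoted in the comment; the helper names (`summarize`, `ranges_overlap`) are made up for this illustration and are not part of luceneutil.

```python
from statistics import mean, stdev

# Timings in msec, copied from the benchmark output quoted in the comment.
JFLEX_1_7_0 = [2125.72, 2153.42, 2094.67, 2182.98, 2138.22]
JFLEX_1_8_2 = [2135.17, 2076.18, 2018.94, 2090.48, 2129.55]


def summarize(times):
    """Return (mean, sample standard deviation, min, max) for a list of timings."""
    return mean(times), stdev(times), min(times), max(times)


def ranges_overlap(a, b):
    """True if the [min, max] intervals of the two samples intersect."""
    return min(a) <= max(b) and min(b) <= max(a)


old = summarize(JFLEX_1_7_0)
new = summarize(JFLEX_1_8_2)

# Relative change of the mean, in percent (negative means 1.8.2 was faster).
delta_pct = 100.0 * (new[0] - old[0]) / old[0]

for label, s in (("jflex 1.7.0", old), ("jflex 1.8.2", new)):
    print(f"{label}: mean={s[0]:.2f} stdev={s[1]:.2f} min={s[2]:.2f} max={s[3]:.2f}")
print(f"change in mean: {delta_pct:+.2f}%")
print(f"ranges overlap: {ranges_overlap(JFLEX_1_7_0, JFLEX_1_8_2)}")
```

With these inputs the mean moves by a little over 2% in favor of 1.8.2, and the per-run ranges of the two versions overlap, which is consistent with the "in the noise" assessment. Five runs per version is a small sample, so this is only a sanity check and not a significance test. The hash and token count are identical in every run of both versions.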