[ 
https://issues.apache.org/jira/browse/LUCENE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17446609#comment-17446609
 ] 

Robert Muir commented on LUCENE-10239:
--------------------------------------

I manually ran the analyzers benchmark from luceneutil 
(http://people.apache.org/~mikemccand/lucenebench/analyzers.html) to make sure 
there was no regression with the upgrade.

Any difference looks to be in the noise, and the hash and token count are identical across all runs:

{noformat}
jflex 1.7.0:
Standard time=2125.72 msec hash=-1293025428516531 tokens=16275933
Standard time=2153.42 msec hash=-1293025428516531 tokens=16275933
Standard time=2094.67 msec hash=-1293025428516531 tokens=16275933
Standard time=2182.98 msec hash=-1293025428516531 tokens=16275933
Standard time=2138.22 msec hash=-1293025428516531 tokens=16275933

jflex 1.8.2:
Standard time=2135.17 msec hash=-1293025428516531 tokens=16275933
Standard time=2076.18 msec hash=-1293025428516531 tokens=16275933
Standard time=2018.94 msec hash=-1293025428516531 tokens=16275933
Standard time=2090.48 msec hash=-1293025428516531 tokens=16275933
Standard time=2129.55 msec hash=-1293025428516531 tokens=16275933
{noformat}
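
As a rough sanity check on the "in the noise" reading, the timings above can be summarized with a few lines of Python. This is only a sketch and not part of luceneutil: the variable and function names are made up for illustration, and five runs per version is too small a sample for a real significance test.

```python
from statistics import mean, stdev

# Timings in msec, copied from the runs above.
jflex_170 = [2125.72, 2153.42, 2094.67, 2182.98, 2138.22]
jflex_182 = [2135.17, 2076.18, 2018.94, 2090.48, 2129.55]


def summarize(times):
    """Return (mean, sample standard deviation) for a list of timings."""
    return mean(times), stdev(times)


mean_170, sd_170 = summarize(jflex_170)
mean_182, sd_182 = summarize(jflex_182)

# Relative change of the mean; a negative value means 1.8.2 was faster.
delta_pct = 100.0 * (mean_182 - mean_170) / mean_170

# Crude noise check: do the min..max ranges of the two versions overlap?
ranges_overlap = (
    min(jflex_170) <= max(jflex_182) and min(jflex_182) <= max(jflex_170)
)

print(f"1.7.0: mean={mean_170:.2f} msec, stdev={sd_170:.2f}")
print(f"1.8.2: mean={mean_182:.2f} msec, stdev={sd_182:.2f}")
print(f"change in mean: {delta_pct:+.2f}%  ranges overlap: {ranges_overlap}")
```

The means differ by roughly 2%, and the run-to-run ranges of the two versions overlap, which is consistent with treating the difference as noise.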

> upgrade jflex (1.7.0 -> 1.8.2)
> ------------------------------
>
>                 Key: LUCENE-10239
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10239
>             Project: Lucene - Core
>          Issue Type: Task
>            Reporter: Robert Muir
>            Priority: Major
>             Fix For: 9.1
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When reviewing LUCENE-10238, I noticed we still had Unicode 9.0 data 
> specified for our jflex tokenizers. 
> According to the changelog (https://www.jflex.de/changelog.html), I see some 
> key benefits from upgrading to jflex 1.8.2:
> * Unicode 9 -> Unicode 12.1
> * Remove our custom emoji regeneration via ICU, as jflex now supports emoji 
> properties directly.
> * Less RAM at runtime for users (two-stage tables): 
> https://github.com/jflex-de/jflex/pull/697
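
For readers unfamiliar with the "two-stage tables" mentioned in the description, the general idea can be sketched as follows. This is a generic illustration of the technique in Python, not jflex's generated code: the block size, the function names, and the toy character-class table are all assumptions chosen for the example.

```python
BLOCK_BITS = 8                 # assumed block size of 256 entries
BLOCK_SIZE = 1 << BLOCK_BITS


def build_two_stage(flat):
    """Compress a flat per-code-point table into (stage1, stage2).

    stage1 maps a block number to an offset into stage2;
    stage2 stores each distinct block only once.
    """
    assert len(flat) % BLOCK_SIZE == 0
    stage1 = []
    stage2 = []
    seen = {}                  # block contents -> offset into stage2
    for start in range(0, len(flat), BLOCK_SIZE):
        block = tuple(flat[start:start + BLOCK_SIZE])
        if block not in seen:
            seen[block] = len(stage2)
            stage2.extend(block)
        stage1.append(seen[block])
    return stage1, stage2


def lookup(stage1, stage2, code_point):
    """Return the table value for code_point via the two stages."""
    offset = stage1[code_point >> BLOCK_BITS]
    return stage2[offset + (code_point & (BLOCK_SIZE - 1))]


# Toy table over all code points: class 1 for a-z, class 0 otherwise.
flat = [0] * 0x110000
for cp in range(ord("a"), ord("z") + 1):
    flat[cp] = 1

stage1, stage2 = build_two_stage(flat)
print(len(flat), "flat entries vs", len(stage1) + len(stage2), "two-stage entries")
```

Because most blocks of a sparse table are identical, they collapse into a single shared block, so the two stages together hold far fewer entries than the flat table while each lookup costs only one extra array access.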



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
