[ https://issues.apache.org/jira/browse/LUCENE-2023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772054#action_12772054 ]
Robert Muir commented on LUCENE-2023:
-------------------------------------

Hi DM, I think the bounds checks are actually redundant. In both situations, the bounds are calculated up front in the constructor.

bq. Is it an invariant that tokenPair.to will always be in bounds?

Yes, in this case. I added the checks for isToExist, etc. because those methods are public, but this code is package-private anyway, so maybe I should delete the bounds checks altogether?

> Improve performance of SmartChineseAnalyzer
> -------------------------------------------
>
>                 Key: LUCENE-2023
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2023
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/analyzers
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>            Priority: Minor
>             Fix For: 3.0
>
>         Attachments: LUCENE-2023.patch
>
>
> I've noticed SmartChineseAnalyzer is a bit slow compared to, say, CJKAnalyzer
> on Chinese text.
> This patch improves the internal hhmm implementation.
> Time to index my Chinese corpus is 75% of the previous time.
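
To illustrate the point in the comment above about bounds being fixed in the constructor, here is a minimal sketch. It is not the code from the patch: the class PairTable, its field, and addPair are hypothetical names made up for this example, and only the method name isToExist is borrowed from the discussion. It assumes a table whose size is set once at construction, so a caller inside the same package can rely on the invariant instead of re-checking it on every access.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a package-private lookup table; NOT the Lucene class.
final class PairTable {
  // One bucket per "to" position; the number of buckets never changes.
  private final List<List<String>> pairsByTo;

  // The bounds are established here, once.
  PairTable(int size) {
    if (size < 0) {
      throw new IllegalArgumentException("size must be >= 0");
    }
    pairsByTo = new ArrayList<>(size);
    for (int i = 0; i < size; i++) {
      pairsByTo.add(new ArrayList<>());
    }
  }

  // Defensive variant: tolerates any index, as a widely visible method might.
  boolean isToExist(int to) {
    return to >= 0 && to < pairsByTo.size() && !pairsByTo.get(to).isEmpty();
  }

  // Trusting variant: callers in this package guarantee 0 <= to < size,
  // so the check is only an assertion (active only when run with -ea).
  void addPair(int to, String pair) {
    assert to >= 0 && to < pairsByTo.size() : "caller broke the bounds invariant";
    pairsByTo.get(to).add(pair);
  }
}

final class BoundsCheckSketch {
  public static void main(String[] args) {
    PairTable table = new PairTable(3);
    table.addPair(1, "a->b");
    System.out.println(table.isToExist(1));
    System.out.println(table.isToExist(7));
  }
}
```

Whether to drop the defensive check entirely or keep it as an assertion is a judgment call. The assertion costs nothing in production runs, where assertions are disabled by default, but still documents the invariant.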