[
https://issues.apache.org/jira/browse/LUCENE-8153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350602#comment-16350602
]
Adrien Grand commented on LUCENE-8153:
--------------------------------------
I confirmed this is due to the additional checks that I added for impacts.
CheckIndex now runs in 180 seconds locally vs. 74 seconds before LUCENE-4198.
It is slower because, e.g., ImpactsEnum doesn't provide a way to reuse existing
instances (it shouldn't be needed) and the checks we perform are a bit more
costly than comparing the freqs, but they are still important.
I started looking into short-circuiting long postings lists, but it doesn't buy
much (the run still takes around 150 seconds) because most postings lists are
still very short. Another option could be to only check 1/16th of the terms,
for instance. Any opinions?
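
To make the sampling option concrete, here is a minimal sketch in plain Java.
It is not CheckIndex code and does not use any Lucene API: the class
TermSampling and the helper sampleEveryNth are hypothetical names, and the
"terms" are just strings standing in for whatever the terms enumeration would
return. It assumes the sample is deterministic (every 16th term, starting with
the first) so that repeated runs check the same terms.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class TermSampling {

    /**
     * Hypothetical helper: returns every interval-th element of the iterator,
     * starting with the first one. With interval = 16 this selects roughly
     * 1/16th of the terms for the expensive checks.
     */
    static <T> List<T> sampleEveryNth(Iterator<T> terms, int interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        List<T> sampled = new ArrayList<>();
        long ord = 0; // position of the current term in iteration order
        while (terms.hasNext()) {
            T term = terms.next();
            if (ord % interval == 0) {
                sampled.add(term); // this term would get the costly checks
            }
            ord++;
        }
        return sampled;
    }

    public static void main(String[] args) {
        // Stand-in for a segment's terms; real terms would come from the index.
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            terms.add("term" + i);
        }
        List<String> checked = sampleEveryNth(terms.iterator(), 16);
        System.out.println(checked.size() + " of " + terms.size() + " terms checked");
    }
}
```

The trade-off is the obvious one: corruption that only affects terms outside
the sample would go unnoticed, so the cheaper per-term checks would presumably
still need to run on every term.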
> checkindex time more than doubles for wikipedia index with recent change
> ------------------------------------------------------------------------
>
> Key: LUCENE-8153
> URL: https://issues.apache.org/jira/browse/LUCENE-8153
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Priority: Major
>
> See [http://people.apache.org/~mikemccand/lucenebench/checkIndexTime.html]
> Since this is a pretty basic index, seems like something might be wrong?