[ 
https://issues.apache.org/jira/browse/LUCENE-8153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350602#comment-16350602
 ] 

Adrien Grand commented on LUCENE-8153:
--------------------------------------

I confirmed this is due to the additional checks that I added for impacts. 
CheckIndex now runs in 180 seconds locally vs. 74 seconds before LUCENE-4198. 
It is slower because, e.g., ImpactsEnum doesn't provide a way to reuse existing 
instances (it shouldn't be needed) and the checks we perform are a bit more 
costly than comparing the freqs, but they are still important.

I started looking into short-circuiting long postings lists, but it doesn't buy 
much (the run still takes around 150 seconds) because most postings lists are 
very short. Another option would be to check only a subset of terms, for 
instance 1/16th of them. Any opinions?
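
To make the 1/16th idea concrete, here is a minimal sketch of deterministic 
term sampling. It is not the actual CheckIndex code: TermSampler, shouldCheck 
and sample are hypothetical names, and a plain List stands in for the terms 
dictionary. The assumption is that the cheap checks still run on every term 
while the costly impacts checks run only on the sampled ones.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Lucene): selects every Nth term for costly checks. */
public final class TermSampler {
  private TermSampler() {}

  /** Returns true if the term at position termOrd should get the expensive check. */
  static boolean shouldCheck(long termOrd, int sampleEvery) {
    if (sampleEvery <= 0) {
      throw new IllegalArgumentException("sampleEvery must be positive");
    }
    // Deterministic choice: the same terms are picked on every run,
    // so a reported failure can be reproduced.
    return termOrd % sampleEvery == 0;
  }

  /** Returns the subset of terms that would be fully checked. */
  static <T> List<T> sample(List<T> terms, int sampleEvery) {
    List<T> picked = new ArrayList<>();
    for (int i = 0; i < terms.size(); i++) {
      if (shouldCheck(i, sampleEvery)) {
        picked.add(terms.get(i));
      }
    }
    return picked;
  }
}
```

With sampleEvery = 16, terms at positions 0, 16, 32, ... get the full check. 
The trade-off is that corruption limited to the skipped terms would go 
undetected by the costly checks.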

> checkindex time more than doubles for wikipedia index with recent change
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-8153
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8153
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Priority: Major
>
> See [http://people.apache.org/~mikemccand/lucenebench/checkIndexTime.html]
> Since this is a pretty basic index, seems like something might be wrong?



