[ 
https://issues.apache.org/jira/browse/LUCENE-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-6320:
--------------------------------
    Attachment: LUCENE-6320.patch

Here is a patch. We use the codec APIs to do these checks, so the optimizations 
we already made for merging help a lot (esp. stored fields, norms, docvalues).

When we checked postings without deletes, we weren't reusing the PostingsEnum 
and were clone()'ing for every term.

FieldInfos.get(int) is a CPU hog for stored fields and vectors, since it's 
called for every field in the doc and we do an O(log N) lookup each time. It's 
usually wasteful in memory too (it always uses a TreeMap, when in most cases a 
simple array is smaller and faster).
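
To illustrate the array-vs-TreeMap point, here is a minimal standalone sketch. 
It is not the code from the patch: FieldLookup, its 1/16 density heuristic, and 
the use of plain String names in place of FieldInfo objects are all 
assumptions made for the example. The idea is to index by field number in an 
array when the numbers are dense, and fall back to a TreeMap only when they 
are sparse.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a field-number lookup; NOT Lucene's FieldInfos.
final class FieldLookup {
    private final String[] byNumber;               // dense case: index == field number
    private final TreeMap<Integer, String> sparse; // sparse case: O(log N) lookup

    FieldLookup(Map<Integer, String> fields) {
        int max = -1;
        for (int number : fields.keySet()) {
            if (number < 0) {
                throw new IllegalArgumentException("negative field number: " + number);
            }
            max = Math.max(max, number);
        }
        // Assumed heuristic: use the array if at least 1/16 of its slots are filled.
        if ((long) max + 1 <= 16L * fields.size()) {
            byNumber = new String[max + 1];
            for (Map.Entry<Integer, String> e : fields.entrySet()) {
                byNumber[e.getKey()] = e.getValue();
            }
            sparse = null;
        } else {
            byNumber = null;
            sparse = new TreeMap<>(fields);
        }
    }

    boolean isDense() {
        return byNumber != null;
    }

    // Returns the field name for this number, or null if there is no such field.
    String get(int number) {
        if (number < 0) {
            throw new IllegalArgumentException("negative field number: " + number);
        }
        if (byNumber != null) {
            return number < byNumber.length ? byNumber[number] : null; // O(1)
        }
        return sparse.get(number);
    }
}
```

With field numbers 0, 1, 2 the sketch picks the array; with numbers 0 and 1000 
it falls back to the TreeMap, so a pathological numbering does not allocate a 
mostly empty array.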

> speed up checkindex
> -------------------
>
>                 Key: LUCENE-6320
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6320
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>         Attachments: LUCENE-6320.patch
>
>
> This is fairly slow today, very RAM intensive, and has some buggy stuff (e.g. 
> PostingsEnum reuse bugs). We can do better...


