Michael McCandless created LUCENE-7905:
------------------------------------------
Summary: Optimizations for OrdinalMap
Key: LUCENE-7905
URL: https://issues.apache.org/jira/browse/LUCENE-7905
Project: Lucene - Core
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 7.1
Attachments: LUCENE-7905.patch
{{OrdinalMap}} is a useful class for quickly mapping per-segment ordinals to the
global ordinal space, but it's fairly costly to build, and it must typically be
rebuilt on every NRT refresh.
I'm using it quite heavily in two different places: one is
{{SortedSetDocValuesFacetCounts}} and the other is a custom usage. I found some
small optimizations that improve its construction time.
I switched it to use a simple priority queue to merge the terms instead of the
more general {{MultiTermsEnum}}, which does extra work since it must also
provide postings, implement seekExact, etc.
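
As a rough illustration of the priority-queue merge idea (not the code in the attached patch), here is a minimal, self-contained sketch. It assumes each segment's terms are available as a sorted, de-duplicated list of strings; the class {{OrdinalMergeSketch}}, the {{Cursor}} helper and {{buildSegmentToGlobal}} are hypothetical names invented for this example, and plain {{String}} ordering stands in for Lucene's byte-wise term ordering.

```java
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: not Lucene's OrdinalMap, just the k-way merge idea.
final class OrdinalMergeSketch {

    /** Cursor over one segment's sorted, de-duplicated term list. */
    private static final class Cursor {
        final int segment;
        final List<String> terms;
        int ord = 0; // current per-segment ordinal

        Cursor(int segment, List<String> terms) {
            this.segment = segment;
            this.terms = terms;
        }

        String current() {
            return terms.get(ord);
        }
    }

    /**
     * Returns segToGlobal, where segToGlobal[s][o] is the global ordinal of
     * the term that has ordinal o in segment s.
     */
    static long[][] buildSegmentToGlobal(List<List<String>> segments) {
        long[][] segToGlobal = new long[segments.size()][];
        // Min-heap ordered by each cursor's current term.
        PriorityQueue<Cursor> queue =
            new PriorityQueue<>((a, b) -> a.current().compareTo(b.current()));
        for (int s = 0; s < segments.size(); s++) {
            List<String> terms = segments.get(s);
            segToGlobal[s] = new long[terms.size()];
            if (!terms.isEmpty()) {
                queue.add(new Cursor(s, terms));
            }
        }

        long globalOrd = -1;
        String lastTerm = null;
        while (!queue.isEmpty()) {
            Cursor top = queue.poll();
            String term = top.current();
            // A term not seen before gets the next global ordinal;
            // the same term coming from another segment reuses it.
            if (!term.equals(lastTerm)) {
                globalOrd++;
                lastTerm = term;
            }
            segToGlobal[top.segment][top.ord] = globalOrd;
            // Advance this segment's cursor and re-insert it if terms remain.
            top.ord++;
            if (top.ord < top.terms.size()) {
                queue.add(top);
            }
        }
        return segToGlobal;
    }

    public static void main(String[] args) {
        List<List<String>> segments = Arrays.asList(
            Arrays.asList("apple", "cherry"),
            Arrays.asList("banana", "cherry", "date"));
        // prints [[0, 2], [1, 2, 3]]
        System.out.println(Arrays.deepToString(buildSegmentToGlobal(segments)));
    }
}
```

The merge only needs "current term" and "advance" from each segment, which is why a plain queue can stand in for a more general enum that must also support postings and seeking.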
I also pulled {{OrdinalMap}} out into its own class in the
org.apache.lucene.index package.
When testing construction time for my use case, the patch is ~16% faster
(159.9s -> 134.2s) in one case with 91.4 M terms and ~9% faster
(115.6s -> 105.7s) in another case with 26.6 M terms.