David Smiley created LUCENE-8041:
------------------------------------
Summary: All Fields.terms(fld) impls should be O(1) not O(log(N))
Key: LUCENE-8041
URL: https://issues.apache.org/jira/browse/LUCENE-8041
Project: Lucene - Core
Issue Type: Improvement
Reporter: David Smiley
I've seen apps that have a good number of fields -- hundreds. The O(log(N))
lookup cost of TreeMap definitely shows up in a profiler; sometimes 20% of
search time, if I recall. Many Fields implementations are impacted, in part
because Fields is the base class of FieldsProducer.
As an aside, I hope Fields goes away some day; FieldsProducer should be
TermsProducer and not have an iterator of fields. If DocValuesProducer doesn't
have one, then why should the terms index part of our API have it? If we did
this, then the issue here would be a simple transition to a HashMap.
Or maybe we can switch to HashMap and relax the definition of Fields.iterator
to not necessarily be sorted?
Perhaps the fix can be a relatively simple conversion over to LinkedHashMap in
many cases, if we can assume that when we initialize these internal maps we
consume the fields in sorted order to begin with.
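
To illustrate the LinkedHashMap idea, here is a minimal sketch, not Lucene code.
FieldLookupSketch and linkedFromSorted are hypothetical names, and the sketch
assumes field names are compared with String.compareTo. If the entries arrive
in sorted order, a LinkedHashMap keeps that order for iteration while get() is
expected O(1) instead of TreeMap's O(log(N)).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the internal field map of a Fields impl.
final class FieldLookupSketch {

    // Copies entries into a LinkedHashMap, verifying that keys arrive in
    // strictly increasing order so that insertion order equals sorted order.
    static <V> Map<String, V> linkedFromSorted(Iterable<Map.Entry<String, V>> entries) {
        Map<String, V> map = new LinkedHashMap<>();
        String prev = null;
        for (Map.Entry<String, V> e : entries) {
            if (prev != null && prev.compareTo(e.getKey()) >= 0) {
                throw new IllegalArgumentException(
                    "fields not in sorted order: " + prev + " >= " + e.getKey());
            }
            map.put(e.getKey(), e.getValue());
            prev = e.getKey();
        }
        return map;
    }

    public static void main(String[] args) {
        // A TreeMap stands in for any source that yields fields in sorted order.
        TreeMap<String, Integer> sorted = new TreeMap<>();
        sorted.put("title", 1);
        sorted.put("body", 2);
        sorted.put("author", 3);

        Map<String, Integer> linked = linkedFromSorted(sorted.entrySet());
        System.out.println(linked.keySet());   // prints [author, body, title]
        System.out.println(linked.get("body")); // prints 2
    }
}
```

The order check is there only to make the assumption explicit; if a source
cannot guarantee sorted order, this approach would not preserve a sorted
Fields.iterator.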