Gautam Worah created LUCENE-10068:
-------------------------------------
Summary: Switch to a "double barrel" HPPC cache for the taxonomy
LRU cache
Key: LUCENE-10068
URL: https://issues.apache.org/jira/browse/LUCENE-10068
Project: Lucene - Core
Issue Type: Improvement
Components: modules/facet
Affects Versions: 8.8.1
Reporter: Gautam Worah
While working on an unrelated getBulkPath API
[PR|https://github.com/apache/lucene/pull/179], [~mikemccand] and I came across
a nice optimization that could be made to the taxonomy cache.
The taxonomy cache today caches frequently used ordinals and their
corresponding FacetLabels. It is implemented with the existing LRUHashMap
class, which extends java.util.LinkedHashMap.
This implementation performs suboptimally when a large number of threads
access it concurrently, and it consumes a large amount of RAM.
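For context, an access-ordered LRU map of this general kind can be sketched as
below. This is a minimal illustration, not the actual LRUHashMap source; the
class name SimpleLruMap is hypothetical. With access ordering enabled, even
get() reorders the internal linked list, so every reader needs external
synchronization, which is one plausible reason for the contention noted above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of an LRU map built on an access-ordered LinkedHashMap. */
final class SimpleLruMap<K, V> extends LinkedHashMap<K, V> {
  private final int maxSize;

  SimpleLruMap(int maxSize) {
    // accessOrder = true: iteration order runs from least to most recently accessed.
    super(16, 0.75f, true);
    this.maxSize = maxSize;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
    // Called after each insertion; evict the least recently used entry when over capacity.
    return size() > maxSize;
  }
}
```

The map itself is not thread-safe, so callers must guard both reads and writes
with a lock.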
[~mikemccand] suggested the idea of a two-array-backed HPPC int->FacetLabel
cache. The basic idea behind the cache is:
# We use two hash maps, a primary and a secondary.
# On a cache miss in the primary map and a cache hit in the secondary map, we
add the key to the primary map as well.
# On a cache miss in both maps, we add the key to the primary map.
# When the primary map holds a large number of elements (say, more than the
existing DEFAULT_CACHE_VALUE=4000; should we make this check on every
insert?), we dump the secondary map and copy all the entries of the primary
map into it.
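The four steps above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the proposed patch: the class name
DoubleBarrelCache is hypothetical, java.util.HashMap stands in for an HPPC
int->object map so the example is self-contained, the size check runs on every
insert, and the primary map is assumed to start empty again after a rotation
(the steps above do not say what happens to it). Coarse synchronized methods
keep the sketch simple; a real implementation would presumably want something
cheaper under many threads.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the two-map ("double barrel") cache described above. */
final class DoubleBarrelCache<V> {
  private final int maxPrimarySize; // rotation threshold, e.g. 4000
  private Map<Integer, V> primary = new HashMap<>();
  private Map<Integer, V> secondary = new HashMap<>();

  DoubleBarrelCache(int maxPrimarySize) {
    if (maxPrimarySize < 1) {
      throw new IllegalArgumentException("maxPrimarySize must be >= 1");
    }
    this.maxPrimarySize = maxPrimarySize;
  }

  /** Returns the cached value, or null on a miss in both maps. */
  synchronized V get(int key) {
    V value = primary.get(key);
    if (value == null) {
      value = secondary.get(key);
      if (value != null) {
        put(key, value); // step 2: secondary hit, so promote to the primary map
      }
    }
    return value;
  }

  /** Step 3: after a miss in both maps, the caller loads the value and inserts it here. */
  synchronized void put(int key, V value) {
    primary.put(key, value);
    if (primary.size() > maxPrimarySize) { // assumption: check on every insert
      rotate();
    }
  }

  /** Step 4: drop the old secondary map and let the primary contents take its place. */
  private void rotate() {
    // Swapping references has the same effect as clearing the secondary map and
    // copying the primary map into it, without the copy.
    secondary = primary;
    primary = new HashMap<>(); // assumption: the primary map restarts empty
  }
}
```

Entries that are not touched between two consecutive rotations fall out of the
cache, which approximates LRU eviction without maintaining a per-access linked
list.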
The idea was originally explained in
[this|https://github.com/apache/lucene/pull/179#discussion_r692907559] comment.