Adrien Grand created LUCENE-5780:
------------------------------------
Summary: OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
Key: LUCENE-5780
URL: https://issues.apache.org/jira/browse/LUCENE-5780
Project: Lucene - Core
Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
Fix For: 4.9, 5.0
Robert found a case where the ordinal map can be quite wasteful in terms of
memory usage: in order to be able to resolve values given a global ordinal, it
stores two things per global ordinal:
- an identifier of the segment where the value is
- the difference between the ordinal on the segment and the global ordinal
The issue is that OrdinalMap currently picks any of the segments that contain
the value, but we can do better: we can always pick the first segment that has
the value. This will help for two reasons:
- it will potentially require fewer bits per value to store the segment ids if
NRT segments don't introduce new values
- if all values happen to appear in the first segment, then the map from
global ords to deltas only stores zeros.
I just tested on an index where all values are in the first segment and this
helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB).
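
To illustrate the idea, here is a minimal, self-contained sketch of the
"pick the first segment" selection. It is not the actual OrdinalMap code:
the class name, the Mapping holder, and the bitsRequired helper are all
hypothetical, segments are modeled as sorted lists of distinct strings, and
the delta is assumed to be the global ordinal minus the segment ordinal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch, not Lucene's OrdinalMap implementation.
class FirstSegmentOrdinalSketch {

    /** Per global ordinal: the chosen segment and (globalOrd - segmentOrd). */
    static final class Mapping {
        final int[] firstSegments;
        final long[] deltas;

        Mapping(int[] firstSegments, long[] deltas) {
            this.firstSegments = firstSegments;
            this.deltas = deltas;
        }
    }

    /**
     * Each segment is a sorted list of distinct values; a value's segment
     * ordinal is its index in that list. Global ordinals are the indexes in
     * the sorted union of all segments.
     */
    static Mapping build(List<List<String>> segments) {
        TreeSet<String> union = new TreeSet<>();
        for (List<String> segment : segments) {
            union.addAll(segment);
        }
        List<String> global = new ArrayList<>(union);

        int[] firstSegments = new int[global.size()];
        long[] deltas = new long[global.size()];
        for (int globalOrd = 0; globalOrd < global.size(); globalOrd++) {
            String value = global.get(globalOrd);
            // Scan segments in order and stop at the FIRST one holding the value.
            for (int seg = 0; seg < segments.size(); seg++) {
                int segmentOrd = Collections.binarySearch(segments.get(seg), value);
                if (segmentOrd >= 0) {
                    firstSegments[globalOrd] = seg;
                    // Assumption: delta is global ordinal minus segment ordinal.
                    deltas[globalOrd] = globalOrd - segmentOrd;
                    break;
                }
            }
        }
        return new Mapping(firstSegments, deltas);
    }

    /** Illustrative helper: bits needed to store values in [0, maxValue]. */
    static int bitsRequired(long maxValue) {
        return maxValue == 0 ? 0 : 64 - Long.numberOfLeadingZeros(maxValue);
    }

    public static void main(String[] args) {
        // Every value already appears in segment 0; segment 1 adds nothing new.
        Mapping m = build(List.of(
            List.of("a", "b", "c", "d"),
            List.of("b", "d")));
        System.out.println("segments: " + Arrays.toString(m.firstSegments));
        System.out.println("deltas:   " + Arrays.toString(m.deltas));
    }
}
```

In this toy example every value is found in segment 0, so both arrays
contain only zeros, which is the case where a packed representation can be
smallest. If the mapping had instead picked segment 1 for "b" and "d", both
the segment ids and the deltas would have contained non-zero entries.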