[ https://issues.apache.org/jira/browse/LUCENE-9536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17200148#comment-17200148 ]
Michael McCandless commented on LUCENE-9536:
--------------------------------------------
+1
It should be simple and quick to detect when this happens? I.e., when building
the {{OrdinalMap}} we can check whether the cardinality of the global ords is
the same as the cardinality of this segment, and if so, know that this
segment's ords map perfectly to the global ords?
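
A minimal sketch of that cardinality check, in plain Java with no Lucene dependency. It assumes each segment's terms are already sorted and distinct (as in a segment's term dictionary), so a segment whose cardinality equals the global cardinality must contain every term, and its segment ords equal the global ords. The class and method names ({{IdentityOrdCheck}}, {{globalTerms}}, {{ordDeltas}}, {{matchesGlobalOrds}}) are hypothetical and are not part of the {{OrdinalMap}} API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

public class IdentityOrdCheck {

    /** Merges all segments' terms into one sorted, de-duplicated global term list. */
    public static List<String> globalTerms(List<List<String>> segments) {
        TreeSet<String> merged = new TreeSet<>();
        for (List<String> terms : segments) {
            merged.addAll(terms);
        }
        return new ArrayList<>(merged);
    }

    /** Returns deltas[segOrd] = globalOrd - segOrd for each term of one segment. */
    public static long[] ordDeltas(List<String> segmentTerms, List<String> global) {
        long[] deltas = new long[segmentTerms.size()];
        for (int segOrd = 0; segOrd < deltas.length; segOrd++) {
            // Every segment term is present in the global list, so the search always hits.
            int globalOrd = Collections.binarySearch(global, segmentTerms.get(segOrd));
            deltas[segOrd] = globalOrd - segOrd;
        }
        return deltas;
    }

    /**
     * Assumes segmentTerms is a sorted, distinct subset of global. Equal sizes
     * then imply equal lists, so every delta is zero and need not be stored.
     */
    public static boolean matchesGlobalOrds(List<String> segmentTerms, List<String> global) {
        return segmentTerms.size() == global.size();
    }

    public static void main(String[] args) {
        List<String> large = List.of("a", "b", "c", "d"); // contains all distinct values
        List<String> small = List.of("b", "d");           // missing some values
        List<String> global = globalTerms(List.of(large, small));

        System.out.println(matchesGlobalOrds(large, global));            // true
        System.out.println(Arrays.toString(ordDeltas(large, global)));   // [0, 0, 0, 0]
        System.out.println(matchesGlobalOrds(small, global));            // false
        System.out.println(Arrays.toString(ordDeltas(small, global)));   // [1, 2]
    }
}
```

The check itself is a single integer comparison per segment; the delta computation is only there to show that the deltas are all zero exactly when the cardinalities match.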
> Optimize OrdinalMap when one segment contains all distinct values?
> ------------------------------------------------------------------
>
> Key: LUCENE-9536
> URL: https://issues.apache.org/jira/browse/LUCENE-9536
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Julie Tibshirani
> Priority: Minor
>
> For doc values that are not too high cardinality, it seems common to have
> some large segments that contain all distinct values (plus many small
> segments that are missing some values). In this case, we could check if the
> first segment ords map perfectly to global ords and if so store
> `globalOrdDeltas` and `firstSegments` as `LongValues.ZEROES`. This could save
> a small amount of space.
> I don’t think it would help a huge amount, especially since the optimization
> might only kick in with small/medium cardinalities, which don’t create huge
> `OrdinalMap` instances anyway? But it is simple and seemed worth mentioning.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)