Adrien Grand created LUCENE-8705:
------------------------------------
Summary: Compress BKD trees by only encoding the difference
between two dimensions
Key: LUCENE-8705
URL: https://issues.apache.org/jira/browse/LUCENE-8705
Project: Lucene - Core
Issue Type: Bug
Reporter: Adrien Grand
When serializing BKD trees to disk, for each block we look at the common prefix
of each dimension in isolation and encode that common prefix only once for the
entire block. Now that we have range fields and shapes, several dimensions
often store related data, so we might occasionally find longer common prefixes
by comparing a value with the values of other dimensions of the same point. For
instance when indexing narrow ranges in a range field, we might get better
compression on the second dimension by encoding only the suffixes that differ
from the first dimension. This is also an obvious win if we are indexing lines
or points as shapes, since in that case some dimensions record exactly the same
values.
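
To make the idea concrete, here is a minimal sketch, not the actual BKD writer
code. It compares the prefix shared by all values of one dimension in a block
with the prefix that each value shares with another dimension of the same
point. The class and method names (CrossDimPrefix, perDimensionPrefix,
crossDimensionPrefix) are hypothetical, and values are encoded as plain
big-endian non-negative ints rather than with Lucene's sortable byte encoding.

```java
final class CrossDimPrefix {

    /** Big-endian bytes of a non-negative int (assumption: no sign-bit flip). */
    static byte[] encode(int v) {
        return new byte[] {(byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v};
    }

    /** Encodes points[i][d] into block[i][d]; every point has the same number of dimensions. */
    static byte[][][] encodeBlock(int[][] points) {
        byte[][][] block = new byte[points.length][][];
        for (int i = 0; i < points.length; i++) {
            block[i] = new byte[points[i].length][];
            for (int d = 0; d < points[i].length; d++) {
                block[i][d] = encode(points[i][d]);
            }
        }
        return block;
    }

    /** Number of leading bytes that a and b have in common. */
    static int commonPrefixLength(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        int i = 0;
        while (i < n && a[i] == b[i]) {
            i++;
        }
        return i;
    }

    /** Prefix shared by all values of dimension dim in a non-empty block. */
    static int perDimensionPrefix(byte[][][] block, int dim) {
        int prefix = block[0][dim].length;
        for (byte[][] point : block) {
            prefix = Math.min(prefix, commonPrefixLength(block[0][dim], point[dim]));
        }
        return prefix;
    }

    /** Shortest prefix that dimension dim shares with refDim of the same point. */
    static int crossDimensionPrefix(byte[][][] block, int dim, int refDim) {
        int prefix = block[0][dim].length;
        for (byte[][] point : block) {
            prefix = Math.min(prefix, commonPrefixLength(point[refDim], point[dim]));
        }
        return prefix;
    }

    public static void main(String[] args) {
        // Three narrow ranges (min, max) that are far apart from each other.
        int[][] ranges = {{1000, 1003}, {70000, 70010}, {5000000, 5000002}};
        byte[][][] block = encodeBlock(ranges);
        System.out.println("per-dimension prefix of max: " + perDimensionPrefix(block, 1));
        System.out.println("prefix of max shared with min: " + crossDimensionPrefix(block, 1, 0));
    }
}
```

With these three ranges the upper bounds share only 1 leading byte with each
other, while every upper bound shares 3 leading bytes with its own lower bound,
so under this sketch's assumptions the second dimension would need 1 suffix
byte per value instead of 3.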