iverase commented on issue #730: LUCENE-8868: New storing strategy for BKD tree leaves with low cardinality
URL: https://github.com/apache/lucene-solr/pull/730#issuecomment-504969937

> I wonder whether we should apply the run-length compression on the sorted dimension in the
> low-cardinality case as well, this could save some additional bytes, and might make the logic to
> decide whether to use the low-cardinality or high-cardinality encoding a bit easier?

I don't think this would be beneficial when the cardinality is very low. Imagine that those few values differ only in the last byte of the sorted dimension. For each value you add at least two bytes for the run-length compression and save just one when writing the sorted dimension. All in all, the final size of the leaf will be bigger, and I think this approach should favour this case.
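
The byte accounting in this argument can be sketched with a toy cost model. This is only an illustration, not Lucene's actual leaf format: the function `sorted_dim_bytes` and its parameters are hypothetical, and the model assumes that every distinct value starts its own run, that a run header costs exactly two bytes (the "at least two bytes" above), and that the header replaces one byte of the value itself. Per-value cardinality counts are ignored because they are the same in both layouts.

```python
def sorted_dim_bytes(num_distinct: int, suffix_len: int, run_length: bool) -> int:
    """Toy estimate of bytes spent on the sorted dimension in one leaf.

    num_distinct: number of distinct values in the leaf (hypothetical input)
    suffix_len:   bytes per value left after the common prefix (must be >= 1)
    run_length:   whether run-length compression is applied on top
    """
    if suffix_len < 1:
        raise ValueError("suffix_len must be at least 1")
    if run_length:
        # Assumed cost: 2 header bytes per run, minus the 1 byte of the
        # value that the header makes redundant.
        per_value = 2 + (suffix_len - 1)
    else:
        per_value = suffix_len
    return num_distinct * per_value


# Four distinct values that differ only in their last byte.
plain = sorted_dim_bytes(4, 1, run_length=False)
with_runs = sorted_dim_bytes(4, 1, run_length=True)
print(plain, with_runs)
```

Under these assumptions the run-length variant costs one extra byte per distinct value (8 bytes instead of 4 in the example), which is the overhead described above.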
