[ https://issues.apache.org/jira/browse/LUCENE-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-7563:
---------------------------------------
    Attachment: LUCENE-7563.patch

Another iteration on the patch; I think it's ready.

I tested on the 20M sparse taxis data set and this change gives a sizable (~56% - ~59%) reduction in heap usage:

  * sparse-sorted: 6.14 MB -> 2.49 MB
  * sparse: 4.93 MB -> 2.17 MB
  * dense: 4.88 MB -> 2.09 MB

> BKD index should compress unused leading bytes
> ----------------------------------------------
>
>                 Key: LUCENE-7563
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7563
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>             Fix For: master (7.0), 6.4
>
>         Attachments: LUCENE-7563.patch, LUCENE-7563.patch
>
>
> Today the BKD (points) in-heap index always uses {{dimensionNumBytes}} per
> dimension, but if e.g. you are indexing {{LongPoint}} yet only use the bottom
> two bytes in a given segment, we shouldn't store all those leading 0s in the
> index.
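
For illustration, the idea in the issue description can be sketched as below. This is only a minimal sketch and not the code in the attached patch: the class LeadingBytePrefix and its methods are hypothetical names, and values are encoded as plain big-endian longs rather than with Lucene's sortable LongPoint encoding. It finds how many leading bytes every value in a set shares, then keeps only the remaining suffix bytes.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, not part of Lucene: illustrates dropping the
// leading bytes that all values for one dimension have in common.
final class LeadingBytePrefix {

    // Encodes a long as 8 big-endian bytes (assumption: plain encoding,
    // without the sign-bit flip that Lucene's LongPoint applies).
    static byte[] encode(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // Returns the number of leading bytes shared by every value,
    // between 0 and bytesPerDim inclusive.
    static int commonPrefixLength(byte[][] values, int bytesPerDim) {
        if (values.length == 0) {
            return 0;
        }
        int prefix = bytesPerDim;
        byte[] first = values[0];
        for (int i = 1; i < values.length && prefix > 0; i++) {
            int j = 0;
            while (j < prefix && values[i][j] == first[j]) {
                j++;
            }
            prefix = j; // the shared prefix can only shrink
        }
        return prefix;
    }

    // Returns copies of the values with the first `prefix` bytes removed.
    static byte[][] stripPrefix(byte[][] values, int prefix) {
        byte[][] out = new byte[values.length][];
        for (int i = 0; i < values.length; i++) {
            out[i] = Arrays.copyOfRange(values[i], prefix, values[i].length);
        }
        return out;
    }
}
```

With the values 1, 258 and 65535, all of which fit in the bottom two bytes, commonPrefixLength returns 6, so each stripped value occupies 2 bytes instead of 8. The shared prefix would need to be stored once so the full values can be rebuilt.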