[ https://issues.apache.org/jira/browse/LUCENE-6901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014081#comment-15014081 ]
Michael McCandless commented on LUCENE-6901:
--------------------------------------------

OK for the 2D case this patch brings indexing time from 737.1 sec (trunk) to 441.5 sec (this patch), which is nice :)

Note that the test is entirely single threaded: one indexing thread, SerialMergeScheduler.

Trying {{TimSorter}} next ...

> Optimize 1D dimensional value indexing
> --------------------------------------
>
> Key: LUCENE-6901
> URL: https://issues.apache.org/jira/browse/LUCENE-6901
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: Trunk
>
> Attachments: LUCENE-6901.patch
>
>
> Dimensional values give a smaller index, and faster search times, for
> indexing ordered byte[] values across one or more dimensions, vs our existing
> approaches, but the indexing time is substantially slower.
> Since the 1D case is so important/common (numeric fields, range query) I
> think it's worth optimizing its indexing time. It should also be possible to
> optimize the N > 1 dimensions case too, but it's more complex ... we can
> postpone that.
> So for the 1D case, I changed the merge method to do a merge sort (like
> postings) of the already sorted segments' dimensional values, instead of
> simply re-indexing all values from the incoming segments, and this was a big
> speedup.
> I also changed from {{InPlaceMergeSorter}} to {{IntroSorter}} (this is what
> postings use, and it's faster but still safe) and this was another good
> speedup, which should also help the > 1D cases.
> Finally, I added a {{BKDReader.verify}} method (currently it's dark: NOT
> called) that walks the index and then checks that every value in each leaf
> block does in fact fall within what the index expected/claimed. This is
> useful for finding bugs! Maybe we can cleanly fold it into {{CheckIndex}}
> somehow later.
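
For readers unfamiliar with the 1D merge idea described in the issue, here is a rough sketch of a k-way merge over per-segment values that are already sorted. It is not the code from the attached patch: the class {{SortedSegmentMerge}}, the {{Cursor}} helper and the in-memory lists standing in for segment readers are all assumptions made for illustration, and values are compared as unsigned bytes on the assumption that this matches the ordered byte[] encoding. It needs Java 9+ for {{Arrays.compareUnsigned}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class SortedSegmentMerge {

  /** Cursor over one segment's already-sorted values (hypothetical helper). */
  private static final class Cursor {
    final Iterator<byte[]> it;
    byte[] current;

    Cursor(Iterator<byte[]> it) {
      this.it = it;
      this.current = it.next(); // caller guarantees the segment is non-empty
    }

    /** Moves to the next value; returns false once the segment is exhausted. */
    boolean advance() {
      if (it.hasNext()) {
        current = it.next();
        return true;
      }
      return false;
    }
  }

  /** Merges per-segment sorted values into one sorted list using a min-heap. */
  public static List<byte[]> merge(List<List<byte[]>> segments) {
    // Heap ordered by each cursor's current value, compared as unsigned bytes.
    PriorityQueue<Cursor> queue =
        new PriorityQueue<>((a, b) -> Arrays.compareUnsigned(a.current, b.current));
    for (List<byte[]> segment : segments) {
      if (!segment.isEmpty()) {
        queue.add(new Cursor(segment.iterator()));
      }
    }

    List<byte[]> merged = new ArrayList<>();
    while (!queue.isEmpty()) {
      Cursor top = queue.poll();   // segment holding the smallest remaining value
      merged.add(top.current);
      if (top.advance()) {
        queue.add(top);            // re-insert with its next value
      }
    }
    return merged;
  }
}
```

With k segments and n total values this does O(n log k) comparisons, versus re-sorting all n values from scratch, which is the intuition behind the speedup reported for the 1D case.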