[
https://issues.apache.org/jira/browse/LUCENE-8928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888722#comment-16888722
]
Adrien Grand commented on LUCENE-8928:
--------------------------------------
I played with this idea a bit at
https://github.com/jpountz/lucene-solr/commit/16e6594af44b753c9ac498a063eb9b9d6102e020
and
https://github.com/mikemccand/luceneutil/blob/master/src/main/perf/IndexAndSearchOpenStreetMaps.java
with shapes. It's a bit artificial since we are using shapes to index points.
Nevertheless, indexing got 62% slower (130 seconds instead of 80), while box
queries got 45% faster (63.0 QPS instead of 43.5).
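
To make the idea concrete, here is a minimal, hypothetical sketch. It is not
the code from the commit above and not the actual BKDWriter implementation:
values are plain ints instead of packed byte[] values, and the class and
method names (SplitDimSketch, computeBounds, widestDim) are made up for
illustration. It encodes 1D ranges as 2D points (min, max) and compares the
split dimension chosen for a child node from bounds inherited from the parent
with the one chosen from bounds recomputed over the child's own points.

```java
import java.util.Arrays;

/** Hypothetical sketch, not Lucene code: plain ints stand in for packed byte[] values. */
final class SplitDimSketch {

  /** Returns {min[], max[]} per dimension over points[from..to). */
  static int[][] computeBounds(int[][] points, int from, int to) {
    int numDims = points[0].length;
    int[] min = new int[numDims];
    int[] max = new int[numDims];
    Arrays.fill(min, Integer.MAX_VALUE);
    Arrays.fill(max, Integer.MIN_VALUE);
    for (int i = from; i < to; i++) {
      for (int d = 0; d < numDims; d++) {
        min[d] = Math.min(min[d], points[i][d]);
        max[d] = Math.max(max[d], points[i][d]);
      }
    }
    return new int[][] {min, max};
  }

  /** Returns the dimension with the widest span (max - min); ties go to the lowest dimension. */
  static int widestDim(int[] min, int[] max) {
    int best = 0;
    for (int d = 1; d < min.length; d++) {
      if ((long) max[d] - min[d] > (long) max[best] - min[best]) {
        best = d;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    // 1D ranges encoded as 2D points (min, max), already sorted by dimension 0.
    int[][] points = {
      {0, 2}, {10, 11}, {20, 24}, {30, 31}, {60, 62}, {70, 95}, {80, 83}, {100, 101}
    };
    int mid = points.length / 2;

    // The root splits on dimension 0 (span 100 vs 99); the split value is points[mid][0].
    int[][] root = computeBounds(points, 0, points.length);
    int rootDim = widestDim(root[0], root[1]);

    // Inherited bounds for the left child: only the split dimension's max is tightened.
    int[] inheritedMin = root[0].clone();
    int[] inheritedMax = root[1].clone();
    inheritedMax[rootDim] = points[mid][rootDim];

    // Recomputed bounds for the left child: scan the points that actually landed there.
    int[][] recomputed = computeBounds(points, 0, mid);

    System.out.println("inherited bounds pick dim " + widestDim(inheritedMin, inheritedMax));
    System.out.println("recomputed bounds pick dim " + widestDim(recomputed[0], recomputed[1]));
  }
}
```

With this data the inherited bounds pick dimension 1 for the left child, while
the recomputed bounds pick dimension 0, because the ranges in the left half
have much smaller max values than the parent's bounds suggest. The extra scan
over each node's points is also a plausible source of the additional indexing
cost, though the sketch does not measure that.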
> BKDWriter could make splitting decisions based on the actual range of values
> ----------------------------------------------------------------------------
>
> Key: LUCENE-8928
> URL: https://issues.apache.org/jira/browse/LUCENE-8928
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Priority: Minor
>
> Currently BKDWriter assumes that splitting on one dimension has no effect on
> values in other dimensions. While this may be ok for geo points, this is
> usually not true for ranges (or geo shapes, which are ranges too). Maybe we
> could get better indexing by re-computing the range of values on each
> dimension before making the choice of the split dimension?