[ 
https://issues.apache.org/jira/browse/LUCENE-7401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401354#comment-15401354
 ] 

Adrien Grand commented on LUCENE-7401:
--------------------------------------

bq. Are you trying to address the adversarial case of indexing e.g. a narrow 
sliver of points?

Yes indeed. It might feel like a corner case when dimensions represent similar 
data like longitudes and latitudes, but what happens if, e.g., you want to index 
all towns in the world alongside their population as a 3rd dimension? Given 
that there are very large areas that only have small towns, it could happen 
that the population dimension does not get indexed at all in these areas.
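
To make the concern concrete, here is a minimal sketch of a largest-span split 
heuristic. It is not the actual BKDWriter code: the class and method names are 
hypothetical, spans are in arbitrary comparable units, and each split is 
assumed to simply halve the span of the chosen dimension. It only illustrates 
how a dimension with a comparatively small span can go many levels without 
ever being split.

```java
import java.util.Arrays;

// Hypothetical illustration only; not Lucene's BKDWriter implementation.
class SplitHeuristicSketch {

  // Returns the index of the dimension with the largest span.
  // Ties go to the lowest dimension index.
  static int chooseSplitDim(double[] spans) {
    int best = 0;
    for (int dim = 1; dim < spans.length; dim++) {
      if (spans[dim] > spans[best]) {
        best = dim;
      }
    }
    return best;
  }

  // Follows one path down the tree for `levels` levels, assuming each split
  // halves the span of the chosen dimension, and counts splits per dimension.
  static int[] countSplits(double[] initialSpans, int levels) {
    double[] spans = initialSpans.clone();
    int[] counts = new int[spans.length];
    for (int level = 0; level < levels; level++) {
      int dim = chooseSplitDim(spans);
      counts[dim]++;
      spans[dim] /= 2;
    }
    return counts;
  }

  public static void main(String[] args) {
    // Assumed spans for a region with only small towns:
    // longitude, latitude, population (same arbitrary units).
    double[] spans = {40, 20, 0.5};
    System.out.println(Arrays.toString(countSplits(spans, 10)));
    // prints [6, 4, 0]
  }
}
```

With these made-up numbers, ten levels of splitting never touch the population 
dimension, so a query that only filters on population could not use the tree 
to skip any of the cells below this point.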

bq. Another fun one is if all indexed points are equidistant from an origin. 
I've wondered whether cells should be "shrink wrapped" during indexing to 
handle this one...

Hmm this got me curious, why is it an adversarial case if all points are 
equidistant from an origin?

bq. There are quite a few papers that explore different splitting techniques to 
have better behavior with challenging cases.

Good point, I should take a look!

> BKDWriter should ensure all dimensions are indexed
> --------------------------------------------------
>
>                 Key: LUCENE-7401
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7401
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Adrien Grand
>            Priority: Minor
>
> The current heuristic is to split on the dimension that has the largest span, 
> so if dimensions have different distributions of values, we could 
> theoretically (but maybe in practice too?) end up with one dimension that is 
> not indexed at all, and queries that are mostly selective on this dimension 
> would need to scan lots of blocks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
