[ https://issues.apache.org/jira/browse/LUCENE-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ignacio Vera resolved LUCENE-8888.
----------------------------------
       Resolution: Fixed
         Assignee: Ignacio Vera
    Fix Version/s: 8.2
                   master (9.0)

> Improve distribution of points with data dimension in BKD tree leaves
> ---------------------------------------------------------------------
>
>                 Key: LUCENE-8888
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8888
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Ignacio Vera
>            Assignee: Ignacio Vera
>            Priority: Major
>             Fix For: master (9.0), 8.2
>
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> LUCENE-8688 introduced a new storage strategy for leaves that contain
> duplicated points. This works well with indexed dimensions, because
> partitioning the space and the final sorting of the leaves group together
> points with equal indexed dimensions.
> This is not always the case if the points contain data dimensions. If two
> points have the same indexed dimensions but different data dimensions, the
> distribution in the leaves may not be optimal.
> A good example is a user indexing a bounding box using LatLonShape.
> The resulting tessellation of a bounding box is two triangles with the same
> indexed dimensions but different data dimensions. If two documents index the
> same bounding box, the leaf contains the triangles from one document followed
> by the triangles from the second document. This is because the current
> sorting/selection algorithms use one indexed dimension and tie-break on the
> docID.
> The optimal distribution in the case above is to group the equal triangles
> together. Therefore, what is proposed here is to update the
> selection/sorting algorithms to use the data dimensions, when they exist, as
> tie-breakers before using the docID.
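
For illustration, here is a minimal sketch of the proposed tie-breaking order. It is not the actual Lucene patch: `TieBreakSketch`, `Point`, `comparator` and `sortAndDescribe` are hypothetical names, and the packed layout (indexed dimensions first, then data dimensions, one byte per dimension) is an assumption made to keep the example small. It compares on one indexed dimension, then optionally on the data dimensions, and only then on the docID.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class TieBreakSketch {

    /** Hypothetical stand-in for one point in a leaf; not a Lucene class. */
    static final class Point {
        final byte[] packedValue; // assumed layout: indexed dims first, then data dims
        final int docID;

        Point(byte[] packedValue, int docID) {
            this.packedValue = packedValue;
            this.docID = docID;
        }
    }

    /**
     * Orders points by one indexed dimension, then (optionally) by the data
     * dimensions, and finally by docID.
     */
    static Comparator<Point> comparator(int sortDim, int numIndexDims, int numTotalDims,
                                        int bytesPerDim, boolean useDataDims) {
        return (a, b) -> {
            // 1. Compare the indexed dimension currently used for sorting/selection.
            int from = sortDim * bytesPerDim;
            int cmp = Arrays.compareUnsigned(a.packedValue, from, from + bytesPerDim,
                                             b.packedValue, from, from + bytesPerDim);
            if (cmp != 0) {
                return cmp;
            }
            // 2. Proposed tie-breaker: the data dimensions, when they exist.
            if (useDataDims && numTotalDims > numIndexDims) {
                int dataFrom = numIndexDims * bytesPerDim;
                int dataTo = numTotalDims * bytesPerDim;
                cmp = Arrays.compareUnsigned(a.packedValue, dataFrom, dataTo,
                                             b.packedValue, dataFrom, dataTo);
                if (cmp != 0) {
                    return cmp;
                }
            }
            // 3. Last resort: the docID.
            return Integer.compare(a.docID, b.docID);
        };
    }

    /** Two documents indexing the same box: two triangles each, equal indexed dim. */
    static String sortAndDescribe(boolean useDataDims) {
        List<Point> points = new ArrayList<>(List.of(
            new Point(new byte[] {5, 1}, 0),   // doc 0, triangle 1
            new Point(new byte[] {5, 2}, 0),   // doc 0, triangle 2
            new Point(new byte[] {5, 1}, 1),   // doc 1, triangle 1
            new Point(new byte[] {5, 2}, 1))); // doc 1, triangle 2
        // 1 indexed dimension + 1 data dimension, 1 byte each.
        points.sort(comparator(0, 1, 2, 1, useDataDims));
        StringBuilder sb = new StringBuilder();
        for (Point p : points) {
            sb.append("doc").append(p.docID).append(":tri").append(p.packedValue[1]).append(' ');
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println("docID tie-break only:  " + sortAndDescribe(false));
        System.out.println("data dims, then docID: " + sortAndDescribe(true));
    }
}
```

With the docID-only tie-break the triangles stay grouped per document, whereas with the data dimensions as tie-breaker the equal triangles from both documents end up adjacent, which is the distribution the issue describes as optimal for leaves with duplicated points.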