[
https://issues.apache.org/jira/browse/LUCENE-8521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16653306#comment-16653306
]
Ignacio Vera commented on LUCENE-8521:
--------------------------------------
It seems there is a typo in LatLonShapeBoundingBoxQuery#relateRangeBBoxToQuery:
{code:java}
FutureArrays.compareUnsigned(maxTriangle, maxYOffset, maxYOffset + BYTES, bbox,
2 * BYTES, 2 * BYTES) < 0{code}
Apart from that, +1.
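For context, a minimal sketch of why the quoted call looks suspicious. It uses `java.util.Arrays.compareUnsigned` (Java 9+), on the assumption that `FutureArrays.compareUnsigned` takes the same arguments. When the last two arguments are equal, the bbox range is empty, so the result only reflects the length difference and can never be negative. The class name, the 4-byte dimension width and the `3 * BYTES` upper bound are assumptions for illustration, not the actual patch:

```java
import java.util.Arrays;

final class EmptyRangeCompare {
    // Assumed width in bytes of one encoded dimension.
    static final int BYTES = Integer.BYTES;

    // Mirrors the quoted call: from == to, so the bbox range is empty.
    static int compareWithEmptyRange(byte[] triangle, int offset, byte[] bbox) {
        return Arrays.compareUnsigned(triangle, offset, offset + BYTES,
                                      bbox, 2 * BYTES, 2 * BYTES);
    }

    // Assumed intent: compare against the whole third dimension of the bbox.
    static int compareWithFullRange(byte[] triangle, int offset, byte[] bbox) {
        return Arrays.compareUnsigned(triangle, offset, offset + BYTES,
                                      bbox, 2 * BYTES, 3 * BYTES);
    }

    public static void main(String[] args) {
        byte[] triangle = new byte[4 * BYTES]; // all zeros
        byte[] bbox = new byte[4 * BYTES];
        bbox[2 * BYTES] = 1; // third bbox dimension is larger than the triangle value

        // Empty range: the result is positive, so "< 0" is false.
        System.out.println(compareWithEmptyRange(triangle, 0, bbox) < 0); // prints "false"
        // Full range: the triangle value sorts before the bbox value.
        System.out.println(compareWithFullRange(triangle, 0, bbox) < 0);  // prints "true"
    }
}
```

With the empty range the `< 0` test is always false, whatever the data, which is consistent with this being a typo rather than intended behaviour.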
Just for fun, I have been playing with limiting the number of indexed dimensions to 8
and allowing more data dimensions, so we can have an encoding using integers: 10
data dimensions, of which the first 4 are the bounding box. That seems more
natural than using longs.
Running the performance test on a 30M polygon sample (dev = 10d, base = 7d):
{code:java}
||Index time (sec)||Index size (GB)||Reader heap (MB)||
||Dev||Base||Diff||Dev||Base||Diff||Dev||Base||Diff||
|122.1|119.5|2%|2.77|2.83|-2%|0.77|0.92|-16%|

||Shape||M hits/sec||QPS||Hit count||
|| ||Dev||Base||Diff||Dev||Base||Diff||Dev||Base||Diff||
|box|4.23|3.86|10%|26.22|23.93|10%|36262002|36262002|0%|
|poly 10|2.75|3.52|-22%|9.47|12.11|-22%|65382571|65382571|0%|{code}
The results are mixed:
1) The index size does not change much, which I guess is due to good
compression when using longs.
2) Memory footprint is lower when using integers, though it is still pretty low
in both cases.
3) Bounding box queries are faster but polygon queries are clearly slower. I am
not sure why there is this difference.
So there is no clear benefit to using integers.
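For reference, the Diff columns appear to be the relative change of Dev against Base, rounded to a whole percent. That is an assumption about how the benchmark reports them, but it reproduces the listed values. A small sketch with a hypothetical helper:

```java
final class RelativeDiff {
    // Relative change of dev versus base, rounded to a whole percent.
    static long percentDiff(double dev, double base) {
        return Math.round((dev - base) / base * 100.0);
    }

    public static void main(String[] args) {
        System.out.println(percentDiff(4.23, 3.86)); // box M hits/sec, prints "10"
        System.out.println(percentDiff(2.75, 3.52)); // poly 10 M hits/sec, prints "-22"
        System.out.println(percentDiff(0.77, 0.92)); // reader heap, prints "-16"
    }
}
```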
> Change LatLonShape encoding to use selective indexing
> -----------------------------------------------------
>
> Key: LUCENE-8521
> URL: https://issues.apache.org/jira/browse/LUCENE-8521
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Nicholas Knize
> Priority: Major
> Attachments: LUCENE-8521.patch
>
>
> LUCENE-8496 allows for selecting the first n dimensions to be used for
> building the index and the remaining dimensions to be used as data
> dimensions. This feature changes {{LatLonShape}} encoding to a 7 dimension
> encoding instead of 6; where the first 4 are index dimensions defining the
> bounding box of the {{LatLonShape.Triangle}} and the remaining 3 data
> dimensions defining the vertices of the triangle.