[
https://issues.apache.org/jira/browse/LUCENE-8126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16320976#comment-16320976
]
David Smiley commented on LUCENE-8126:
--------------------------------------
_(an aside: I wish the JIRA GitHub integration didn't put so much code context
around the feedback text!)_
It's nice to see a new RPT SpatialPrefixTree implementation :-) The API is a
little crusty; perhaps sometime we could kick around some ideas to make it
nicer.
It'll be interesting to see how well this performs. This appears to be a 6-ary
tree, as opposed to 4-ary (quad) or 32-ary (geohash). One could build a
variable arity prefixTree by the way (i.e. first level has 256, next 128,
etc.), and I recently tweaked one of ours to do that (not contributed back yet).
For point data, the higher the arity, the smaller the index but slower search
as it must scan more.
For non-point data, it's not clear since distErrPct caps the resolution of a
shape relative to its size, and I believe (though not 100% sure) that it yields
a roughly normal distribution around a certain number of cells (given fixed
distErrPct, random shape type & size, near equator, random tree arity). It'd
be neat to empirically validate my theory. If I'm right, then the optimal
arity is probably 4 for non-point shapes, and we have two of those
implementations. RE "near equator" above, see LUCENE-5056 though it has an easy
fix in my last comment.
Given the way S2 divides a world into 6 sides recursively, it seems it would
place shapes at a balanced depth in the tree no matter where in the world the
data is. That's a nice benefit... making the cell depth for a shape a bit more
shallow than the probable depth in the other tree implementations (assuming a
target precision for a given shape). That's a bonus.
CC [~nknize] you may find this issue interesting
> Spatial prefix tree based on S2 geometry
> ----------------------------------------
>
> Key: LUCENE-8126
> URL: https://issues.apache.org/jira/browse/LUCENE-8126
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/spatial-extras
> Reporter: Ignacio Vera
>
> Hi [~dsmiley],
> I have been working on a prefix tree based on goggle S2 geometry
> (https://s2geometry.io/) to be used mainly with Geo3d shapes with very
> promising results, in particular for complex shapes (e.g polygons). Using
> this pixelization scheme reduces the size of the index, improves the
> performance of the queries and reduces the loading time for non-point shapes.
> If you are ok with this contribution and before providing any code I would
> like to understand what is the correct/prefered approach:
> 1) Add new depency to the S2 library
> (https://mvnrepository.com/artifact/io.sgr/s2-geometry-library-java). It has
> Apache 2.0 license so it should be ok.
> 2) Create a utility class with all methods necessary to navigate the S2 tree
> and create shapes from S2 cells (basically port what we need from the library
> into Lucene).
> What do you think?
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]