Ignacio Vera commented on LUCENE-8126:

[~jpountz], the important thing here is if the shape is close to the equator 
and close to the poles. When using bounding boxes, the further away you are 
from the equator, the more cells you need to describe a shape. S2 cells are 
more constant around the globe so it should use the similar number of cells 
regardless where you are on the sphere.

I put together an example (attached) where I index a square polygon on the 
equator and at a 60 degrees latitude with trees with same set-up. You can see 
that geohash tree is the more inneficient as it needs quite a lot of cells to 
describe the polygon, 260 at the equator and 390 at 60 degrees. S2 and Quad 
trees use the same number of cells to describe the polygon at the equator (108) 
but at 60 degrees, S2 uses a similar number of cells  and Quad tree almost 
double the number of cells required. Here is where the benefit comes.

I have attached as well a small graph showing query performance of my data 
depending on the SPT. The queries use composite strategy and are random cone 
searches (query shape is a random circle). Horizontal axis represents the 
number of hits of the query and the vertical axis the query execution time.



> Spatial prefix tree based on S2 geometry
> ----------------------------------------
>                 Key: LUCENE-8126
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8126
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/spatial-extras
>            Reporter: Ignacio Vera
>            Assignee: Ignacio Vera
>            Priority: Major
>         Attachments: SPT-query.jpeg, STP-cell.pdf
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
> Hi [~dsmiley],
> I have been working on a prefix tree based on goggle S2 geometry 
> (https://s2geometry.io/) to be used mainly with Geo3d shapes with very 
> promising results, in particular for complex shapes (e.g polygons). Using 
> this pixelization scheme reduces the size of the index, improves the 
> performance of the queries and reduces the loading time for non-point shapes. 
> If you are ok with this contribution and before providing any code I would 
> like to understand what is the correct/prefered approach:
> 1) Add new depency to the S2 library 
> (https://mvnrepository.com/artifact/io.sgr/s2-geometry-library-java). It has 
> Apache 2.0 license so it should be ok.
> 2) Create a utility class with all methods necessary to navigate the S2 tree 
> and create shapes from S2 cells (basically port what we need from the library 
> into Lucene).
> What do you think?

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to