Kristin Cowalcijk created SEDONA-453:
----------------------------------------

             Summary: Performance degrade when indexing points using Quadtree
                 Key: SEDONA-453
                 URL: https://issues.apache.org/jira/browse/SEDONA-453
             Project: Apache Sedona
          Issue Type: Bug
            Reporter: Kristin Cowalcijk
             Fix For: 1.5.1
         Attachments: image-2023-12-21-21-11-22-571.png, 
image-2023-12-21-21-12-21-706.png

We found that the index efficiency of Quadtree drastically degrades when 
indexing datasets made up of points. The index returns way more candidates than 
expected when querying the Quadtree using envelopes. The reason is that JTS 
Quadtree automatically expands indexed envelopes by 0.5 if the envelope has 
zero width and height (see 
[Quadtree.java#L61-L96|https://github.com/locationtech/jts/blob/1.19.0/modules/core/src/main/java/org/locationtech/jts/index/quadtree/Quadtree.java#L61-L96]),
 this makes the indexed envelopes of points are way larger than necessary, 
especially when indexed points are WGS84 coordinates.

Suppose that we are indexing the following dataset using Quadtree:

!image-2023-12-21-21-11-22-571.png|width=412,height=243!

The envelopes indexed by Quadtree happens to be something like this:

!image-2023-12-21-21-12-21-706.png|width=426,height=301!

One possible workaround for this problem is to manually extend envelopes with 0 
width or height by 1e-6. This will prevent JTS Quadtree from extending the 
envelopes by 0.5, and 1e-6 is small enough to cope with the most common use 
cases.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to