Kontinuation opened a new pull request, #1158:
URL: https://github.com/apache/sedona/pull/1158

   ## Did you read the Contributor Guide?
   
   - Yes, I have read [Contributor 
Rules](https://sedona.apache.org/latest-snapshot/community/rule/) and 
[Contributor Development 
Guide](https://sedona.apache.org/latest-snapshot/community/develop/)
   
   ## Is this PR related to a JIRA ticket?
   
   - Yes, the URL of the associated JIRA ticket is 
https://issues.apache.org/jira/browse/SEDONA-453. The PR name follows the 
format `[SEDONA-XXX] my subject`.
   
   ## What changes were proposed in this PR?
   
   We found that the efficiency of the Quadtree index degrades drastically when 
indexing datasets made up of points: querying the Quadtree with an envelope 
returns far more candidates than expected. The reason is that JTS Quadtree 
automatically expands an indexed envelope by 0.5 if the envelope has zero 
width and height (see 
[Quadtree.java#L61-L96](https://github.com/locationtech/jts/blob/1.19.0/modules/core/src/main/java/org/locationtech/jts/index/quadtree/Quadtree.java#L61-L96)).
This makes the indexed envelopes of points far larger than necessary, 
especially when the indexed points are WGS84 coordinates.
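
   The expansion described above can be sketched in plain Java. This is a 
paraphrase of the linked JTS logic, not a copy of it: the class name 
`EnsureExtentSketch` and the `double[]` envelope layout are stand-ins invented 
for illustration, and the default minimum extent of 1.0 is assumed from the 
0.5 expansion described above.

```java
public class EnsureExtentSketch {
    // Envelope as {minX, maxX, minY, maxY}; a hypothetical stand-in for
    // org.locationtech.jts.geom.Envelope so the sketch has no dependencies.
    static double[] ensureExtent(double[] env, double minExtent) {
        double minX = env[0], maxX = env[1], minY = env[2], maxY = env[3];
        // Envelopes with non-zero width and height are indexed unchanged.
        if (minX != maxX && minY != maxY) {
            return env;
        }
        // A degenerate dimension is padded by minExtent / 2 on each side.
        if (minX == maxX) {
            minX -= minExtent / 2.0;
            maxX += minExtent / 2.0;
        }
        if (minY == maxY) {
            minY -= minExtent / 2.0;
            maxY += minExtent / 2.0;
        }
        return new double[] {minX, maxX, minY, maxY};
    }

    public static void main(String[] args) {
        // A point in WGS84 degrees has a zero-width, zero-height envelope.
        double[] point = {-122.3321, -122.3321, 47.6062, 47.6062};
        double[] indexed = ensureExtent(point, 1.0); // assumed default extent
        System.out.println("indexed width  = " + (indexed[1] - indexed[0]));
        System.out.println("indexed height = " + (indexed[3] - indexed[2]));
    }
}
```

   With a minimum extent of 1.0, each point is indexed as a box roughly one 
degree on a side, which is why envelope queries over WGS84 points match so 
many unrelated items.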
   
   Suppose that we are indexing the following dataset using Quadtree:
   
   <img width="611" alt="Screenshot 2023-12-21 at 9 10 58 PM" 
src="https://github.com/apache/sedona/assets/5501374/c92e22b4-ad9b-457b-bc97-9e2586ecdd5d">
   
   The envelopes actually indexed by the Quadtree look something like this:
   
   <img width="611" alt="Screenshot 2023-12-21 at 9 12 16 PM" 
src="https://github.com/apache/sedona/assets/5501374/32cc440e-9905-4a5f-96e2-9ec503352465">
   
   This PR works around the problem by manually extending envelopes with 0 
width or height by 1e-6 before they are indexed. This prevents JTS Quadtree 
from extending the envelopes by 0.5, and 1e-6 is small enough to cope with the 
most common use cases.
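
   A minimal sketch of the workaround idea, under stated assumptions: the 
class `PadEnvelopeSketch`, the helper names, and the choice to spread the 1e-6 
symmetrically around the point are all hypothetical and are not taken from the 
PR's diff. The `double[]` envelope again stands in for a JTS `Envelope`.

```java
public class PadEnvelopeSketch {
    // Padding taken from the description above; how it is split between the
    // two sides of the envelope is an assumption of this sketch.
    static final double PAD = 1e-6;

    // Envelope as {minX, maxX, minY, maxY}.
    static double[] padDegenerate(double[] env) {
        double minX = env[0], maxX = env[1], minY = env[2], maxY = env[3];
        if (minX == maxX) {          // zero width: widen to PAD in total
            minX -= PAD / 2.0;
            maxX += PAD / 2.0;
        }
        if (minY == maxY) {          // zero height: widen to PAD in total
            minY -= PAD / 2.0;
            maxY += PAD / 2.0;
        }
        return new double[] {minX, maxX, minY, maxY};
    }

    // True when an envelope has non-zero width and height, i.e. when the
    // index would no longer need to expand it.
    static boolean hasExtent(double[] env) {
        return env[0] != env[1] && env[2] != env[3];
    }

    public static void main(String[] args) {
        double[] point = {-122.3321, -122.3321, 47.6062, 47.6062};
        double[] padded = padDegenerate(point);
        System.out.println("point has extent:  " + hasExtent(point));
        System.out.println("padded has extent: " + hasExtent(padded));
    }
}
```

   Because the padded envelope is no longer degenerate, it is indexed as is, 
so the indexed box stays about 1e-6 wide instead of growing by 0.5.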
   
   ## How was this patch tested?
   
   Added a test to verify Quadtree index efficiency for PointRDD.
   
   ## Did this PR include necessary documentation updates?
   
   - No, this PR does not affect any public API so no need to change the docs.
   

