Kristin Cowalcijk created SEDONA-207:
----------------------------------------

             Summary: Faster serialization/deserialization of geometry objects
                 Key: SEDONA-207
                 URL: https://issues.apache.org/jira/browse/SEDONA-207
             Project: Apache Sedona
          Issue Type: Improvement
            Reporter: Kristin Cowalcijk
         Attachments: image-2022-12-02-20-19-15-597.png, 
image-2022-12-02-20-19-36-449.png

We recently looked into the performance of geometry serdes, since it greatly 
impacts the performance of Spatial SQL. After benchmarking and assessing the 
geometry serializers currently in Apache Sedona (ShapeSerde, WKB-based 
GeometrySerializer, etc.), we came up with a high-performance geometry serde 
implementation that outperforms the existing serdes in both benchmarks and 
Spatial SQL end-to-end tests. It speeds up simple range queries like the 
following by 2x:
 
{code:sql}
SELECT COUNT(1) FROM traj_points WHERE ST_Within(geom, 
ST_GeomFromText('POLYGON((120.40586018622339 
31.429636201527515,120.84256672919214 31.429636201527515,120.84256672919214 
31.089198624963103,120.40586018622339 31.089198624963103,120.40586018622339 
31.429636201527515))'))
{code}
The benchmark code and results for the geometry serdes are available 
[here|https://github.com/Kontinuation/play-with-geometry-serde]. The benchmark 
was run on an ECS instance with 4 Intel(R) Xeon(R) Platinum 8269CY CPUs @ 
2.50GHz, using OpenJDK 1.8.0_352.

!image-2022-12-02-20-19-15-597.png|width=481,height=311! 
!image-2022-12-02-20-19-36-449.png|width=482,height=307!

I'll write a detailed description of the proposed geometry serde in the next 
few days. There is still a lot to do to integrate it into Apache Sedona. We'll 
implement a Python version of the proposed serde as a C extension, and also 
implement a pure Python version using the {{struct}} module as a fallback.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
