2010YOUY01 opened a new pull request, #90:
URL: https://github.com/apache/sedona-db/pull/90

   Hi 👋🏼, I’m new to the project and still learning my way around. `sedona-db` looks great, and I’d really appreciate any feedback.
   
   ## Rationale
   Previously, `st_geometrytype()` handled each row by first parsing the `WKB` binary into a `WKB` object and then extracting the base type from that object. This parses fields in the `WKB` binary that are never used, since only the geometry type is needed.
   
   This PR lets it iterate over the raw `WKB` bytes and parse the geometry type directly from them.
   
   ## Implementation
   1. Extend `GenericExecutor` with a new API, `execute_wkb_bytes_void()`, that iterates over raw `WKB` bytes.
   2. Implement a utility that parses the geometry type from the `WKB` binary according to the spec.
   3. Update `st_geometrytype()` to use 1 and 2.
   
   I think it would be better to move `2` to the `wkb` crate, but it doesn't have such a public interface yet 🤔
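
To illustrate step 2, here is a minimal sketch of reading only the WKB header. It is not the code in this PR: the `GeometryType` enum and the `parse_wkb_geometry_type` function are hypothetical names chosen for illustration. It assumes the standard layout (one byte-order byte followed by a 4-byte type code), ISO dimension offsets in multiples of 1000, and the EWKB flags in the top three bits.

```rust
/// Base geometry types with WKB codes 1 through 7.
/// Hypothetical enum for illustration; not the type used in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GeometryType {
    Point,
    LineString,
    Polygon,
    MultiPoint,
    MultiLineString,
    MultiPolygon,
    GeometryCollection,
}

/// Hypothetical helper: inspect only the 5-byte header of a WKB value
/// (byte order + type code) and return the base geometry type.
fn parse_wkb_geometry_type(wkb: &[u8]) -> Result<GeometryType, String> {
    if wkb.len() < 5 {
        return Err(format!("WKB too short: {} bytes", wkb.len()));
    }
    let code_bytes = [wkb[1], wkb[2], wkb[3], wkb[4]];
    // Byte 0 selects the endianness of the type code: 0 = big, 1 = little.
    let raw = match wkb[0] {
        0 => u32::from_be_bytes(code_bytes),
        1 => u32::from_le_bytes(code_bytes),
        other => return Err(format!("invalid byte order marker: {}", other)),
    };
    // Clear the EWKB flag bits (Z, M, SRID), then drop the ISO dimension
    // offset (1000 = Z, 2000 = M, 3000 = ZM) to get the base type code.
    let code = (raw & 0x1FFF_FFFF) % 1000;
    match code {
        1 => Ok(GeometryType::Point),
        2 => Ok(GeometryType::LineString),
        3 => Ok(GeometryType::Polygon),
        4 => Ok(GeometryType::MultiPoint),
        5 => Ok(GeometryType::MultiLineString),
        6 => Ok(GeometryType::MultiPolygon),
        7 => Ok(GeometryType::GeometryCollection),
        other => Err(format!("unsupported geometry type code: {}", other)),
    }
}
```

Because the function never reads past the fifth byte, its cost does not depend on how many coordinates or nested geometries follow the header. This is consistent with the complex collections showing the larger gain in the benchmark.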
   
   ## Benchmark
   ### Command
   ```
   pytest --benchmark-group-by=param:table --benchmark-columns=median,mean,stddev test_functions.py::TestBenchFunctions::test_st_geometrytype
   ```
   ### Result
   5x faster for complex collections, 30% faster for simple collections. The first block below is before this PR and the second is after:
   ```sh
   -------------------------------- benchmark 'table=collections_complex': 3 tests -------------------------------
   Name (time in ms)                                        Median                Mean            StdDev
   ---------------------------------------------------------------------------------------------------------------
   test_st_geometrytype[collections_complex-SedonaDB]       2.3656 (1.0)        2.4929 (1.0)      0.3857 (1.0)
   test_st_geometrytype[collections_complex-DuckDB]        34.2037 (14.46)     34.3980 (13.80)    0.8402 (2.18)
   test_st_geometrytype[collections_complex-PostGIS]      304.6275 (128.77)   306.7333 (123.04)   5.8908 (15.27)
   ---------------------------------------------------------------------------------------------------------------

   ------------------------------ benchmark 'table=collections_simple': 3 tests -------------------------------
   Name (time in ms)                                      Median               Mean            StdDev
   ------------------------------------------------------------------------------------------------------------
   test_st_geometrytype[collections_simple-SedonaDB]      1.3585 (1.0)       1.7419 (1.0)      1.2142 (9.41)
   test_st_geometrytype[collections_simple-DuckDB]        5.1103 (3.76)      5.1443 (2.95)     0.1291 (1.0)
   test_st_geometrytype[collections_simple-PostGIS]      46.8870 (34.51)    46.9021 (26.93)    0.3712 (2.88)
   ------------------------------------------------------------------------------------------------------------
   ```
   
   ```sh
   -------------------------------------- benchmark 'table=collections_complex': 3 tests -------------------------------------
   Name (time in us)                                            Median                    Mean                StdDev
   ---------------------------------------------------------------------------------------------------------------------------
   test_st_geometrytype[collections_complex-SedonaDB]         419.2500 (1.0)          450.9272 (1.0)        124.1193 (1.0)
   test_st_geometrytype[collections_complex-DuckDB]        32,422.7921 (77.34)     32,917.7395 (73.00)    2,088.4215 (16.83)
   test_st_geometrytype[collections_complex-PostGIS]      295,752.0001 (705.43)   294,866.8750 (653.91)   3,872.8562 (31.20)
   ---------------------------------------------------------------------------------------------------------------------------

   ------------------------------------ benchmark 'table=collections_simple': 3 tests -------------------------------------
   Name (time in us)                                          Median                   Mean                StdDev
   ------------------------------------------------------------------------------------------------------------------------
   test_st_geometrytype[collections_simple-SedonaDB]        613.2090 (1.0)       1,144.3652 (1.0)      1,073.4389 (3.42)
   test_st_geometrytype[collections_simple-DuckDB]        5,502.5411 (8.97)      5,556.3829 (4.86)       314.2311 (1.0)
   test_st_geometrytype[collections_simple-PostGIS]      36,191.1250 (59.02)    36,322.7638 (31.74)      730.0613 (2.32)
   ------------------------------------------------------------------------------------------------------------------------
   ```
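
As a rough cross-check, the SedonaDB speedup can be derived from the medians in the two result blocks. The sketch below assumes the first block (in ms) is the baseline and the second block (in us) is with this PR. The constant and function names are made up for illustration.

```rust
// SedonaDB medians copied from the tables above, converted to microseconds.
// Assumption: the first result block is the baseline, the second is this PR.
const COMPLEX_BEFORE_US: f64 = 2365.6; // 2.3656 ms
const COMPLEX_AFTER_US: f64 = 419.25;
const SIMPLE_BEFORE_US: f64 = 1358.5; // 1.3585 ms
const SIMPLE_AFTER_US: f64 = 613.209;

/// How many times faster the new run is than the baseline (same units).
fn speedup(before_us: f64, after_us: f64) -> f64 {
    before_us / after_us
}

/// Print the median-based speedup for both benchmark tables.
fn print_speedups() {
    println!("complex: {:.2}x", speedup(COMPLEX_BEFORE_US, COMPLEX_AFTER_US));
    println!("simple:  {:.2}x", speedup(SIMPLE_BEFORE_US, SIMPLE_AFTER_US));
}
```

The ratio depends on which statistic is compared. The means give different figures from the medians, particularly for the simple collections, where the SedonaDB standard deviation is large relative to the mean.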


