[I] [Sedona Core] Sedona doesn't clip raster to geometry extent in zonal statistics which can lead to inefficient queries [sedona]

via GitHub Sun, 19 Oct 2025 14:48:07 -0700


Imbruced opened a new issue, #2409:
URL: https://github.com/apache/sedona/issues/2409


   The following Sedona core raster function is used by the zonal statistics:
   
   ```java
   private static List<Object> getStatObjects(
       GridCoverage2D raster, Geometry roi, int band,
       boolean allTouched, boolean excludeNoData, boolean lenient)
   ```
   
   It loops through every element of the raster data array instead of first 
clipping the raster to the geometry's extent. This leads to long processing 
times, especially when the geometry's area is much smaller than the raster's. 
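
The idea of clipping before scanning can be sketched in plain Python. This is an illustration only, not Sedona's implementation: the `Window` tuple, `clip_window`, and `pixels_in_window` are hypothetical names, the raster is a list of lists, and the geometry is assumed to be already reduced to a bounding box in pixel coordinates.

```python
from typing import List, Optional, Tuple

# Hypothetical pixel-space bounding box: (min_col, min_row, max_col, max_row), inclusive.
Window = Tuple[int, int, int, int]


def clip_window(width: int, height: int, envelope: Window) -> Optional[Window]:
    """Intersect the geometry's envelope with the raster extent."""
    min_col, min_row, max_col, max_row = envelope
    min_col, min_row = max(min_col, 0), max(min_row, 0)
    max_col, max_row = min(max_col, width - 1), min(max_row, height - 1)
    if min_col > max_col or min_row > max_row:
        return None  # geometry lies entirely outside the raster
    return (min_col, min_row, max_col, max_row)


def pixels_in_window(raster: List[List[float]], envelope: Window) -> List[float]:
    """Collect only the pixels inside the clipped window."""
    height = len(raster)
    width = len(raster[0]) if height else 0
    window = clip_window(width, height, envelope)
    if window is None:
        return []
    min_col, min_row, max_col, max_row = window
    return [
        raster[row][col]
        for row in range(min_row, max_row + 1)
        for col in range(min_col, max_col + 1)
    ]


# A 1000 x 1000 raster where each pixel stores its row index.
raster = [[float(row)] * 1000 for row in range(1000)]
# A small zone covering columns 10..19 and rows 20..29.
values = pixels_in_window(raster, (10, 20, 19, 29))
visited = len(values)         # 100 pixels visited instead of 1,000,000
mean = sum(values) / visited  # 24.5
```

A real implementation would still need to test each pixel in the window against the actual polygon (and honor `allTouched` and `excludeNoData`); the window only bounds how many pixels have to be examined.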
   
   Example of zonal stats for a polygon that is small relative to the raster:
   
   <img width="725" height="510" alt="Image" 
src="https://github.com/user-attachments/assets/b5c4c4b3-4de6-4823-aea9-908ba26d7f72"
 />
   
   Splitting the rasters into tiles and exploding them before calculating 
zonal stats, as below, improves the processing speed considerably:
   
   ```python
         .selectExpr("rp", "Explode(RS_Tile(rast, 64, 64)) AS col")
   ```
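
A rough back-of-the-envelope calculation shows why tiling can help. It assumes that only tiles intersecting the zone end up being scanned (for example because the query filters or joins tiles against the zone geometry), which the issue does not state explicitly. The raster size, the envelope, and the helper `tiles_touching` are hypothetical values chosen for illustration.

```python
from typing import Tuple

# Hypothetical pixel-space bounding box: (min_col, min_row, max_col, max_row), inclusive.
Window = Tuple[int, int, int, int]


def tiles_touching(envelope: Window, tile_size: int) -> int:
    """Count square tiles that overlap the envelope.

    Assumes non-negative pixel coordinates that lie inside the raster.
    """
    min_col, min_row, max_col, max_row = envelope
    cols = max_col // tile_size - min_col // tile_size + 1
    rows = max_row // tile_size - min_row // tile_size + 1
    return cols * rows


raster_width, raster_height = 4096, 4096  # assumed raster size
tile_size = 64                            # matches RS_Tile(rast, 64, 64)
envelope = (100, 100, 199, 149)           # assumed zone extent in pixels

touched = tiles_touching(envelope, tile_size)          # 3 x 2 = 6 tiles
pixels_with_tiling = touched * tile_size * tile_size   # 24,576 pixels
pixels_without = raster_width * raster_height          # 16,777,216 pixels
```

Under these assumptions the tiled query examines a few tens of thousands of pixels rather than the whole raster, which is the same saving that clipping inside the zonal statistics function would provide without requiring users to tile first.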
   
   Clipping before zonal stats is an easy improvement we can add.
   
   cc: @jiayuasu 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

