Imbruced opened a new issue, #2409: URL: https://github.com/apache/sedona/issues/2409
In the Sedona core raster function used for zonal statistics,

```java
private static List<Object> getStatObjects(GridCoverage2D raster, Geometry roi, int band, boolean allTouched, boolean excludeNoData, boolean lenient)
```

we loop through every element of the raster data array instead of first clipping the raster to the geometry's boundary. This leads to long processing times, especially when the geometry's area is much smaller than the raster's extent.

Example of zonal stats for a polygon that is small relative to the raster:

<img width="725" height="510" alt="Image" src="https://github.com/user-attachments/assets/b5c4c4b3-4de6-4823-aea9-908ba26d7f72" />

Exploding the rasters into tiles before calculating zonal stats, as below,

```python
.selectExpr("rp", "Explode(RS_Tile(rast, 64, 64)) AS col")
```

improves the processing speed considerably.

Clipping before zonal stats is an easy improvement we can add.

cc: @jiayuasu
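To illustrate the proposed clipping, here is a minimal pure-Python sketch. It is not Sedona's implementation. It assumes a north-up raster with square pixels and an upper-left origin, and the names `envelope_to_pixel_window` and `windowed_values` are hypothetical helpers invented for this example. It only narrows the scan to the geometry's bounding box; an exact point-in-polygon (or `allTouched`) test would still be applied to the pixels inside that window.

```python
import math


def envelope_to_pixel_window(envelope, origin_x, origin_y, pixel_size, width, height):
    """Map a world-coordinate bounding box to a clamped pixel window.

    envelope is (min_x, min_y, max_x, max_y). The raster is assumed to be
    north-up, with its upper-left corner at (origin_x, origin_y) and square
    pixels of side pixel_size. Returns (col_start, row_start, col_stop,
    row_stop) with exclusive stop indices.
    """
    min_x, min_y, max_x, max_y = envelope
    col_start = max(0, math.floor((min_x - origin_x) / pixel_size))
    col_stop = min(width, math.ceil((max_x - origin_x) / pixel_size))
    # Row indices grow downward, so the envelope's max_y maps to the first row.
    row_start = max(0, math.floor((origin_y - max_y) / pixel_size))
    row_stop = min(height, math.ceil((origin_y - min_y) / pixel_size))
    return col_start, row_start, col_stop, row_stop


def windowed_values(grid, window):
    """Collect only the pixel values that fall inside the window."""
    col_start, row_start, col_stop, row_stop = window
    return [
        grid[row][col]
        for row in range(row_start, row_stop)
        for col in range(col_start, col_stop)
    ]


# Hypothetical 100 x 100 raster whose upper-left corner is at (0, 100).
width, height = 100, 100
grid = [[row * width + col for col in range(width)] for row in range(height)]

# A small region of interest: 10 x 10 world units.
roi_envelope = (10.0, 80.0, 20.0, 90.0)
window = envelope_to_pixel_window(roi_envelope, 0.0, 100.0, 1.0, width, height)
candidates = windowed_values(grid, window)

print(window)           # pixel window covering the envelope
print(len(candidates))  # pixels visited, versus width * height for a full scan
```

In this sketch the window covers 100 pixels, while a full scan would visit all 10,000, and the gap widens as the raster grows relative to the geometry. An envelope that does not intersect the raster yields an empty range, so no pixels are visited.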
