jiayuasu opened a new pull request, #2905: URL: https://github.com/apache/sedona/pull/2905
## Did you read the Contributor Guide?

- Yes, I have read the [Contributor Rules](https://sedona.apache.org/latest/community/rule/) and [Contributor Development Guide](https://sedona.apache.org/latest/community/develop/)

## Is this PR related to a ticket?

- Yes, and the PR name follows the format `[GH-XXX] my subject`. Closes part of #2700. Final notebook in the docker-image refresh series alongside #2876, #2879, #2889, #2896, #2900.

## What changes were proposed in this PR?

`docs/usecases/04-flood-snapshot.ipynb`: an end-to-end disaster-response workflow that closes the planned series. Demonstrates the full raster-mask-to-vector-result pipeline.

> **A flood happened over this AOI. Which buildings are inside the flood extent, and how many of them are hospitals, schools, or fire stations?**

Pipeline:

1. SedonaContext setup.
2. Synthesize a Sentinel-1-VV-style backscatter raster (low values where flat water sits, which is the SAR signature). Tiled GeoTIFF, 128×128 px, EPSG:4326.
3. Threshold with single-raster `RS_MapAlgebra` (`out[0] = rast[0] < 0.2 ? 1 : 0`) → binary water mask.
4. Vectorize: `RS_PixelAsPolygons(rast, 1)` + `explode` + `WHERE pixel.value = 1` + `ST_Union_Agg` → a single dissolved flood `MultiPolygon`. `ST_Buffer(..., 0.0005)` absorbs SAR speckle along the edge.
5. Synthesize 16 building footprints on a 4×4 grid, each tagged with `type ∈ {residential, hospital, school, fire_station}`. `ST_SetSRID(..., 4326)` so the GeoParquet 1.1 writer can populate projjson CRS.
6. `ST_Intersects(building, flood)` spatial join surfaces affected buildings; `type IN ('hospital', 'school', 'fire_station')` flags critical infrastructure.
7. Persist as GeoParquet 1.1 (auto covering-bbox + projjson), round-trip read-back.
8. Single-axes matplotlib plot: VV backscatter as basemap, dissolved flood polygon (blue), buildings color-coded by status (red = affected critical, orange = affected residential, cream = unaffected).

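The decision logic behind steps 3, 6, and 8 can be sketched in plain Python. This is only an illustration of the rules stated in the pipeline, not the notebook's Sedona SQL: the function and constant names (`water_mask`, `building_status`, `STATUS_COLORS`) are hypothetical, and the nested-list "raster" stands in for the real tiled GeoTIFF.

```python
# Illustrative sketch only: mirrors the rules from the pipeline list,
# not the notebook's actual Sedona code. All names here are hypothetical.

WATER_THRESHOLD = 0.2  # from the map-algebra script: rast[0] < 0.2 ? 1 : 0
CRITICAL_TYPES = {"hospital", "school", "fire_station"}

# Plot colors as described in step 8.
STATUS_COLORS = {
    "affected_critical": "red",
    "affected_residential": "orange",
    "unaffected": "cream",
}


def water_mask(backscatter):
    """Return a binary mask: 1 where backscatter is strictly below the threshold."""
    return [[1 if value < WATER_THRESHOLD else 0 for value in row]
            for row in backscatter]


def building_status(affected, building_type):
    """Classify a building given its flood-intersection flag and its type tag."""
    if not affected:
        return "unaffected"
    if building_type in CRITICAL_TYPES:
        return "affected_critical"
    return "affected_residential"


# Tiny 2x2 example: only the pixels strictly below 0.2 count as water.
tile = [[0.05, 0.20], [0.19, 0.69]]
print(water_mask(tile))
print(STATUS_COLORS[building_status(True, "hospital")])
```

Note that the comparison is strict, so a pixel at exactly 0.2 is classified as dry, matching the `<` in the map-algebra expression.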
All inputs synthesized in the notebook → runs offline, ships no new bytes. The intro markdown links to the [STAC tutorial](https://github.com/apache/sedona/blob/master/docs/tutorial/files/stac-sedona-spark.md) for the production path: swap the synthesis cell for `sedona.read.format("stac")` + `sedona.read.format("raster")` of the chosen scene's VV asset URL and the rest of the pipeline runs unchanged on real Copernicus data.

Notebook is structured as numbered markdown sections (`## 1.` through `## 8.`), matching the convention from earlier notebooks. Notebook intro flags `**Requires Sedona ≥ 1.9.0**` for the auto-tiling raster reader (GH-2672) and the GeoParquet 1.1 writer's auto-populated covering-bbox + projjson CRS (GH-2646, GH-2664).

Built-in ground truth: a circular flood patch in the SW quadrant of the AOI overlaps exactly the SW 2×2 building grid (B00, B01, B10, B11), of which B10 is tagged hospital. The harness confirms `4 affected, 1 critical`.

## How was this patch tested?

End-to-end through the local mirror of `docker/test-notebooks.sh` (matched docker stack: Python 3.10, `pyspark==4.0.1`, `apache-sedona==1.9.0`, JDK 17, `local[*]`, `DRIVER_MEM=4g`, Sedona JAR via `PYSPARK_SUBMIT_ARGS` Maven coords).

```
PASS 04-flood-snapshot 14s elapsed
```

Output sanity-checked:

- backscatter min=0.00, max=0.69, flood pixels = 2,449 / 16,384 (≈15% of AOI, matching the synthesized circle)
- mask raster: 128×128 `UNSIGNED_8BITS`
- flood polygon dissolved correctly via `ST_Union_Agg(RS_PixelAsPolygons(...))`
- affected = {B00, B01, B10, B11}: exactly the four building centers inside the flood circle (verified: flood center (-91.075, 41.525), radius ≈0.022°, building grid step 0.025°)
- 1 critical (B10 = hospital), matches the `types_grid` layout
- GeoParquet 1.1 round-trip read-back identical

One real failure mode surfaced and was fixed during local verification: the `RS_PixelAsPolygons` return struct field is `value`, not `val` as the doc example suggests.

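The ground-truth numbers quoted in the sanity checks (flood center, radius, grid step) can be cross-checked with a little arithmetic. The sketch below is not taken from the notebook. It assumes the AOI is a 0.1° × 0.1° square (four grid steps per side), that the flood center sits at the corner shared by the SW 2×2 building cells, and that IDs follow a `B{row}{col}` scheme counted from the SW corner. These assumptions are inferred from the description, not confirmed by it.

```python
import math

# Values quoted in the PR description.
FLOOD_CENTER = (-91.075, 41.525)  # lon, lat
FLOOD_RADIUS = 0.022              # degrees (approximate, per the PR)
GRID_STEP = 0.025                 # degrees between building centers

# ASSUMPTIONS (inferred, not stated in the PR): the AOI is 4 grid steps wide,
# and the SW-most building center lies half a step SW of the flood center.
AOI_SIDE = 4 * GRID_STEP
FIRST_CENTER = (FLOOD_CENTER[0] - GRID_STEP / 2,
                FLOOD_CENTER[1] - GRID_STEP / 2)


def building_centers():
    """Map hypothetical IDs B{row}{col} to centers on the assumed 4x4 grid."""
    return {
        f"B{row}{col}": (FIRST_CENTER[0] + col * GRID_STEP,
                         FIRST_CENTER[1] + row * GRID_STEP)
        for row in range(4)
        for col in range(4)
    }


def centers_in_flood():
    """Return sorted IDs of building centers within the flood radius."""
    cx, cy = FLOOD_CENTER
    return sorted(
        bid for bid, (x, y) in building_centers().items()
        if math.hypot(x - cx, y - cy) <= FLOOD_RADIUS
    )


def flooded_fraction():
    """Area of the ideal flood circle as a fraction of the assumed AOI."""
    return math.pi * FLOOD_RADIUS ** 2 / AOI_SIDE ** 2


print(centers_in_flood())
print(round(flooded_fraction(), 3))
```

Under these assumptions the four SW centers lie about 0.0177° from the flood center (inside the 0.022° radius), while the next-nearest centers are about 0.0395° away, so exactly four centers qualify. The ideal circle covers roughly 15% of the assumed AOI, in line with the reported 2,449 / 16,384 flood pixels.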
Correcting the struct field name to `value` was a one-character notebook fix, after which the run was green. The Docker-build CI workflow will run on this PR (path-filter from #2889) and execute `test-notebooks.sh` against the built image, so the in-container PASS line lands directly in CI.

## Did this PR include necessary documentation updates?

- The notebook is itself the documentation; intro markdown calls out `**Requires Sedona ≥ 1.9.0**` and links to the STAC tutorial for the live-data swap-in path.
- No new shipped data, so no `docs/usecases/data/README.md` updates.
- No public API changes.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
