This is an automated email from the ASF dual-hosted git repository.

jiayu pushed a commit to branch fix/2407-raster-jupyter-display
in repository https://gitbox.apache.org/repos/asf/sedona.git

commit 4d63f08c03dc1a683788b231a3fe38345d71f691
Author: Jia Yu <[email protected]>
AuthorDate: Sun Feb 8 21:33:45 2026 -0800

    [GH-2407] Optimize raster image display in Jupyter notebooks
    
    SedonaUtils.display_image() was slow for raster DataFrames because it
    routed through SedonaMapUtils.__convert_to_gdf_or_pdf__(), which performs
    Arrow conversion, geopandas import attempts, and DataFrame-to-HTML-table
    wrapping — all unnecessary when the input is already HTML <img> strings
    from RS_AsImage().
    
    - Add fast path: collect rows directly and render HTML <img> strings
      without intermediate Arrow/Pandas/to_html() conversion
    - Keep fallback to original path for non-image DataFrames
    - Add docstring to display_image()
---
 docs/api/sql/Raster-visualizer.md               | 12 ++++++++--
 python/sedona/spark/raster_utils/SedonaUtils.py | 32 +++++++++++++++++++++++++
 2 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/docs/api/sql/Raster-visualizer.md b/docs/api/sql/Raster-visualizer.md
index ce58eb986c..a7ec6f0774 100644
--- a/docs/api/sql/Raster-visualizer.md
+++ b/docs/api/sql/Raster-visualizer.md
@@ -72,9 +72,9 @@ Output:
 ```
 
 !!!Tip
-    RS_AsImage can be paired with SedonaUtils.display_image(df) wrapper inside a Jupyter notebook to directly print the raster as an image in the output, where the 'df' parameter is the dataframe containing the HTML data provided by RS_AsImage
+    RS_AsImage can be paired with SedonaUtils.display_image(df) wrapper inside a Jupyter notebook to directly print the raster as an image in the output. You can pass either a raw raster DataFrame or a DataFrame with pre-applied RS_AsImage HTML.
 
-Example:
+Example — direct raster display (recommended):
 
 ```python
 from sedona.spark import SedonaUtils
@@ -86,6 +86,14 @@ df = (
     .load(DATA_DIR + "raster.tiff")
     .selectExpr("RS_FromGeoTiff(content) as raster")
 )
+
+# Pass the raw raster DataFrame directly — RS_AsImage is applied automatically
+SedonaUtils.display_image(df)
+```
+
+Example — with explicit RS_AsImage:
+
+```python
 htmlDF = df.selectExpr("RS_AsImage(raster, 500) as raster_image")
 SedonaUtils.display_image(htmlDF)
 ```
diff --git a/python/sedona/spark/raster_utils/SedonaUtils.py b/python/sedona/spark/raster_utils/SedonaUtils.py
index f292bd490a..c68e03f641 100644
--- a/python/sedona/spark/raster_utils/SedonaUtils.py
+++ b/python/sedona/spark/raster_utils/SedonaUtils.py
@@ -21,7 +21,39 @@ from sedona.spark.maps.SedonaMapUtils import SedonaMapUtils
 class SedonaUtils:
     @classmethod
     def display_image(cls, df):
+        """Display raster images in a Jupyter notebook.
+
+        Accepts DataFrames with either:
+        - A raster column (GridCoverage2D) — auto-applies RS_AsImage
+        - An HTML image column from RS_AsImage() — renders directly
+
+        Falls back to the SedonaMapUtils HTML table path for other DataFrames.
+        """
         from IPython.display import HTML, display
 
+        schema = df.schema
+
+        # Detect raster UDT columns and auto-apply RS_AsImage.
+        # Without this, passing a raw raster DataFrame to the fallback path
+        # causes __convert_to_gdf_or_pdf__ to Arrow-serialize the full raster
+        # grid, which hangs for large rasters (e.g., 1400x800).
+        raster_cols = [
+            f.name
+            for f in schema.fields
+            if hasattr(f.dataType, "typeName") and f.dataType.typeName() == "rastertype"
+        ]
+        if raster_cols:
+            # Replace each raster column with its RS_AsImage() HTML representation,
+            # preserving all other columns in the DataFrame.
+            select_exprs = [
+                (
+                    f"RS_AsImage(`{f.name}`) as `{f.name}`"
+                    if f.name in raster_cols
+                    else f"`{f.name}`"
+                )
+                for f in schema.fields
+            ]
+            df = df.selectExpr(*select_exprs)
+
         pdf = SedonaMapUtils.__convert_to_gdf_or_pdf__(df, rename=False)
         display(HTML(pdf.to_html(escape=False)))
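
The column-rewriting step in the hunk above can be tried without a Spark session. The following is a minimal sketch, not the committed code: `build_select_exprs` is a hypothetical helper name, and the schema is assumed to be reduced to plain `(column name, type name)` pairs rather than real `StructField` objects. It only reproduces the string-building logic that feeds `df.selectExpr(...)`.

```python
from typing import List, Tuple


def build_select_exprs(
    fields: List[Tuple[str, str]], raster_type: str = "rastertype"
) -> List[str]:
    """Build selectExpr strings, wrapping raster columns in RS_AsImage().

    `fields` is an assumed stand-in for df.schema.fields: each entry is
    (column name, result of dataType.typeName()).
    """
    exprs = []
    for name, type_name in fields:
        if type_name == raster_type:
            # Raster column: swap it for its HTML image, keeping the name.
            exprs.append(f"RS_AsImage(`{name}`) as `{name}`")
        else:
            # Any other column passes through unchanged.
            exprs.append(f"`{name}`")
    return exprs


# Hypothetical two-column schema: one plain column, one raster column.
fields = [("path", "string"), ("raster", "rastertype")]
exprs = build_select_exprs(fields)
print(exprs)
```

Backtick-quoting every column name, as the diff does, keeps names containing spaces or dots valid inside the SQL expression strings.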
