[PR] [SPARK-48942] Disable TransformNestedUDTParquet on Spark 4.1+ [sedona]

via GitHub Tue, 10 Mar 2026 10:14:09 -0700


james-willis opened a new pull request, #2703:
URL: https://github.com/apache/sedona/pull/2703


   ## Summary
   
   - Disable the `TransformNestedUDTParquet` optimizer rule on Spark 4.1+, where the root cause (SPARK-48942) has been fixed natively by SPARK-52651.
   - Use defensive version parsing (`Try`/`getOrElse`, `.lift()`) to avoid exceptions on malformed version strings.
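
A minimal sketch of what such defensive parsing could look like. The object and method names here are hypothetical and are not taken from the PR diff. It assumes the version string has the usual `major.minor[.patch][-suffix]` shape and treats any missing or non-numeric component as 0:

```scala
import scala.util.Try

// Hypothetical helper, not the code from the PR.
object SparkVersionCheck {

  /** Parses "major.minor[...]" into (major, minor).
    * Missing or non-numeric components fall back to 0 instead of throwing.
    */
  def majorMinor(version: String): (Int, Int) = {
    val parts = version.split("\\.")
    // `lift` returns None for an out-of-range index; `Try` absorbs NumberFormatException.
    def part(i: Int): Int =
      parts.lift(i).flatMap(p => Try(p.trim.toInt).toOption).getOrElse(0)
    (part(0), part(1))
  }

  /** True when the parsed version is 4.1 or newer. */
  def isAtLeast41(version: String): Boolean = {
    val (major, minor) = majorMinor(version)
    major > 4 || (major == 4 && minor >= 1)
  }
}
```

With this fallback, an unparseable version string is read as `(0, 0)`, so the check returns `false` and the workaround would stay enabled. That default is a design assumption of this sketch, not something the PR description states.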
   
   ## Context
   
   PR #2359 introduced the `TransformNestedUDTParquet` workaround, which transforms nested `GeometryUDT` to `BinaryType` in `LogicalRelation` output attributes to avoid the vectorized Parquet reader crash (SPARK-48942).
   
   SPARK-52651 (merged in Spark 4.1) fixes this at the Spark level by recursively stripping UDTs in `ColumnVector`, making our workaround unnecessary on 4.1+.
   
   This PR version-gates the workaround so it is only registered on Spark < 4.1.
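
The gating itself can be illustrated with a small sketch. `OptimizerRule` below is a hypothetical stand-in for a Catalyst rule, and `rulesToRegister` is an assumed name; the actual registration code in Sedona may be structured differently:

```scala
import scala.math.Ordering.Implicits._

// Hypothetical illustration of version-gated rule registration.
object WorkaroundGate {

  // Stand-in for a Catalyst optimizer rule; real code would use Rule[LogicalPlan].
  final case class OptimizerRule(name: String)

  val transformNestedUDTParquet: OptimizerRule =
    OptimizerRule("TransformNestedUDTParquet")

  // First (major, minor) release that contains SPARK-52651, per the PR description.
  val fixedIn: (Int, Int) = (4, 1)

  /** Returns the rules to register for the given Spark (major, minor) version.
    * The workaround is appended only when the version is older than `fixedIn`.
    */
  def rulesToRegister(
      sparkVersion: (Int, Int),
      baseRules: Seq[OptimizerRule]): Seq[OptimizerRule] =
    if (sparkVersion < fixedIn) baseRules :+ transformNestedUDTParquet
    else baseRules
}
```

Comparing `(major, minor)` tuples lexicographically keeps the check correct across major versions: `(3, 5)` and `(4, 0)` get the workaround, while `(4, 1)` and `(5, 0)` do not.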
   
   Cherry-pick of wherobots/wherobots-compute#614.


