tugceozberk opened a new pull request, #54337:
URL: https://github.com/apache/spark/pull/54337

   ### What changes were proposed in this pull request?
   
   `ColumnarBatchRow.copy()` and `MutableColumnarRow.copy()`/`get()` do not 
handle `VariantType`. As a result, streaming custom data sources that rely on 
columnar batch row copying fail with `RuntimeException: Not implemented. 
VariantType` whenever a column is of `VariantType`.
   
   PR #53137 (SPARK-54427) added `VariantType` support to `ColumnarRow` but 
missed `ColumnarBatchRow` and `MutableColumnarRow`. PR #54006 attempted this 
fix but was closed.
   
   This patch adds:
   - `PhysicalVariantType` branch in `ColumnarBatchRow.copy()`
   - `VariantType` branch in `MutableColumnarRow.copy()` and `get()`
   - Test in `ColumnarBatchSuite` validating `VariantVal` round-trip through 
`copy()`
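   
   For readers unfamiliar with the failure mode, the following is a minimal, 
self-contained sketch of the per-type dispatch pattern that a row `copy()` 
typically follows. It is not Spark's code: `VariantCopySketch`, `ColType`, 
`Variant`, `copyRow`, and the `variantSupported` flag are all hypothetical 
names invented for illustration. The flag only exists to show the behaviour 
with and without a variant branch side by side.

```java
import java.util.Arrays;

public class VariantCopySketch {
    // Hypothetical stand-ins for column data types; NOT Spark's real classes.
    enum ColType { INT, STRING, VARIANT }

    // Hypothetical model of a variant value as two byte arrays.
    static final class Variant {
        final byte[] value;
        final byte[] metadata;

        Variant(byte[] value, byte[] metadata) {
            this.value = value;
            this.metadata = metadata;
        }

        // Clone both arrays so the copy does not alias the source buffers.
        Variant deepCopy() {
            return new Variant(value.clone(), metadata.clone());
        }
    }

    // Copies a row field by field. With variantSupported == false the VARIANT
    // type reaches the final else branch, which models the missing-branch bug.
    static Object[] copyRow(ColType[] types, Object[] row, boolean variantSupported) {
        Object[] out = new Object[row.length];
        for (int i = 0; i < row.length; i++) {
            if (row[i] == null) {
                out[i] = null; // nulls are preserved regardless of type
            } else if (types[i] == ColType.INT || types[i] == ColType.STRING) {
                out[i] = row[i]; // immutable boxed values can be shared
            } else if (types[i] == ColType.VARIANT && variantSupported) {
                out[i] = ((Variant) row[i]).deepCopy();
            } else {
                throw new RuntimeException("Not implemented. " + types[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ColType[] types = { ColType.INT, ColType.VARIANT, ColType.VARIANT };
        Object[] row = { 42, new Variant(new byte[] {1, 2}, new byte[] {3}), null };

        // With the variant branch present, the value and the null round-trip.
        Object[] copy = copyRow(types, row, true);
        Variant v = (Variant) copy[1];
        System.out.println(Arrays.equals(v.value, new byte[] {1, 2}) && copy[2] == null);

        // Without it, the copy fails on the first non-null variant field.
        try {
            copyRow(types, row, false);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

   In this model the fix amounts to adding one branch to the dispatch so that 
variant fields are copied instead of reaching the fallback error. The actual 
change in this PR lives in `ColumnarBatchRow` and `MutableColumnarRow`.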
   
   ### Why are the changes needed?
   
   Without this fix, any streaming pipeline that returns `VariantType` columns 
from a custom columnar data source fails with the `RuntimeException` described 
above as soon as Spark attempts to copy a batch row.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No. This is a bug fix for an existing feature.
   
   ### How was this patch tested?
   
   Added a new test `[SPARK-55552] Variant` in `ColumnarBatchSuite` that 
creates a `VariantType` column vector, populates it with `VariantVal` data 
(including a null), wraps it in a `ColumnarBatchRow`, calls `copy()`, and 
verifies the values round-trip correctly.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Yes. GitHub Copilot was used to assist in drafting portions of this 
contribution.
   
   CC @cloud-fan @dongjoon-hyun
   
   This contribution is my original work and I license it under the Apache 2.0 
license.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

