[
https://issues.apache.org/jira/browse/SPARK-55552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tugce Ozberk Yener updated SPARK-55552:
---------------------------------------
Description:
ColumnarBatchRow.copy() and MutableColumnarRow.copy() / get() do not handle
{{VariantType}}, causing a {{RuntimeException: Not implemented.
VariantType}} when using {{VariantType}} columns in streaming custom data
sources that rely on columnar batch row copying.
*Related:* [SPARK-54427|https://issues.apache.org/jira/browse/SPARK-54427], the
original issue that added VariantType support to ColumnarRow via PR
[#53137|https://github.com/apache/spark/pull/53137], but missed
ColumnarBatchRow and MutableColumnarRow.
*Previous attempt:* PR [#54006|https://github.com/apache/spark/pull/54006]
attempted this fix but was closed/abandoned. A reviewer (@cloud-fan) requested
that a JIRA ticket be created.
*Fix:*
* Add a {{PhysicalVariantType}} branch in ColumnarBatchRow.copy() (catalyst
module)
* Add a {{VariantType}} branch in MutableColumnarRow.copy() and
MutableColumnarRow.get() (core module)
* Add a test in {{ColumnarBatchSuite}} to validate the round-trip of
VariantVal through ColumnarBatchRow.copy()
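*Illustration (not Spark code):* The failure mode and the shape of the fix can be sketched with a small stand-alone toy model. Every name below ({{VariantCopySketch}}, {{ColType}}, {{Variant}}, {{copyRow}}) is hypothetical and invented for this sketch; it assumes only that a row copy dispatches on the column type and throws for types it has no branch for, and it does not reproduce the actual ColumnarBatchRow or MutableColumnarRow implementation.

```java
/** Toy model of a type-dispatching row copy. Hypothetical names; NOT Spark's actual code. */
final class VariantCopySketch {

    /** Hypothetical stand-in for the column types a row may contain. */
    enum ColType { INT, STRING, VARIANT }

    /** Hypothetical stand-in for a variant value: a value buffer plus a metadata buffer. */
    static final class Variant {
        final byte[] value;
        final byte[] metadata;

        Variant(byte[] value, byte[] metadata) {
            this.value = value;
            this.metadata = metadata;
        }

        /** Deep copy, so the result does not share buffers with the source. */
        Variant copy() {
            return new Variant(value.clone(), metadata.clone());
        }
    }

    /**
     * Copies a row by dispatching on each column's type.
     * With variantBranch == false the VARIANT case is missing and falls through
     * to the "Not implemented" error, mirroring the reported behavior.
     */
    static Object[] copyRow(ColType[] types, Object[] row, boolean variantBranch) {
        Object[] out = new Object[row.length];
        for (int i = 0; i < row.length; i++) {
            ColType t = types[i];
            if (row[i] == null) {
                out[i] = null;
            } else if (t == ColType.INT || t == ColType.STRING) {
                out[i] = row[i]; // Integer and String are immutable, so sharing is safe
            } else if (t == ColType.VARIANT && variantBranch) {
                out[i] = ((Variant) row[i]).copy(); // the added branch
            } else {
                throw new RuntimeException("Not implemented. " + t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ColType[] types = {ColType.INT, ColType.VARIANT};
        Variant v = new Variant(new byte[] {1, 2}, new byte[] {9});
        Object[] row = {42, v};

        try {
            copyRow(types, row, false);
        } catch (RuntimeException e) {
            System.out.println("without branch: " + e.getMessage());
        }

        Object[] copied = copyRow(types, row, true);
        v.value[0] = 7; // mutate the source buffer after copying
        boolean independent = ((Variant) copied[1]).value[0] == 1;
        System.out.println("copy unaffected by source mutation: " + independent);
    }
}
```

The last check in {{main}} is the round-trip property the proposed test is meant to cover: the copied value must survive later changes to the source buffers, which matters when the underlying batch is reused.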
> ColumnarBatchRow.copy() and MutableColumnarRow do not handle VariantType
> -------------------------------------------------------------------------
>
> Key: SPARK-55552
> URL: https://issues.apache.org/jira/browse/SPARK-55552
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 4.1.1
> Reporter: Tugce Ozberk Yener
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.2.0