[
https://issues.apache.org/jira/browse/SPARK-48921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
L. C. Hsieh reassigned SPARK-48921:
-----------------------------------
Assignee: L. C. Hsieh
> ScalaUDF in subquery should run through analyzer
> ------------------------------------------------
>
> Key: SPARK-48921
> URL: https://issues.apache.org/jira/browse/SPARK-48921
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.4.3
> Reporter: L. C. Hsieh
> Assignee: L. C. Hsieh
> Priority: Major
>
> We got a customer report that a `MergeInto` query on an Iceberg table worked
> before but fails after upgrading to Spark 3.4.
> The error looks like:
> ```
> Caused by: org.apache.spark.SparkRuntimeException: Error while decoding:
> org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to
> nullable on unresolved object
> upcast(getcolumnbyordinal(0, StringType), StringType, - root class:
> java.lang.String).toString.
> ```
> The source table of the `MergeInto` uses a `ScalaUDF`. The error happens when
> Spark invokes the deserializer of the `ScalaUDF`'s input encoder while the
> deserializer is not resolved yet.
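> For illustration, a minimal sketch of the failing shape (the table and UDF
> names `target`, `source`, and `to_upper` are hypothetical; the actual
> customer query is not included in this report):
> ```
> // Hypothetical repro sketch; `target` is an Iceberg table, and `source`
> // and `to_upper` are made-up names, not taken from the report.
> import org.apache.spark.sql.SparkSession
>
> val spark = SparkSession.builder().getOrCreate()
>
> // A Scala UDF: its input encoder must be resolved by the
> // ResolveEncodersInUDF rule at the end of analysis.
> spark.udf.register("to_upper", (s: String) => s.toUpperCase)
>
> // The source side of the MERGE uses the ScalaUDF; rewriting this
> // MergeInto to ReplaceData puts the UDF inside an Exists subquery.
> spark.sql("""
>   MERGE INTO target t
>   USING (SELECT id, to_upper(name) AS name FROM source) s
>   ON t.id = s.id
>   WHEN MATCHED THEN UPDATE SET t.name = s.name
>   WHEN NOT MATCHED THEN INSERT *
> """)
> ```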
> The encoders of a `ScalaUDF` are resolved by the rule `ResolveEncodersInUDF`,
> which is applied at the end of the analysis phase.
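> A simplified sketch of what that rule does (paraphrased from memory, not the
> verbatim Spark source): once a plan is resolved, it resolves and binds every
> `ScalaUDF`'s input encoders in place.
> ```
> // Paraphrased sketch of ResolveEncodersInUDF (not the exact Spark source).
> import org.apache.spark.sql.catalyst.expressions.ScalaUDF
> import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
> import org.apache.spark.sql.catalyst.rules.Rule
>
> object ResolveEncodersInUDFSketch extends Rule[LogicalPlan] {
>   override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp {
>     // Only fully resolved operators are touched, which is why the rule
>     // runs at the end of the analysis phase.
>     case p if p.resolved =>
>       p.transformExpressionsUp {
>         case udf: ScalaUDF =>
>           udf.copy(inputEncoders = udf.inputEncoders.map(_.map(_.resolveAndBind())))
>       }
>   }
> }
> ```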
> While rewriting the `MergeInto` into a `ReplaceData` query, Spark creates an
> `Exists` subquery, and the `ScalaUDF` is part of the subquery's plan. Note
> that the `ScalaUDF` has already been resolved by the analyzer.
> Then the `ResolveSubquery` rule, which resolves subqueries, only runs the
> analyzer on a subquery plan if that plan is not resolved yet. Because the
> subquery containing the `ScalaUDF` is already resolved, the rule skips it, so
> `ResolveEncodersInUDF` is never applied to it. As a result, the analyzed
> `ReplaceData` query contains a `ScalaUDF` with unresolved encoders, which
> causes the error.
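> The skip can be pictured like this (a paraphrased sketch of the guard, not
> the exact Spark source; `analyze` stands in for the nested analyzer run that
> `ResolveSubquery` performs on subquery plans):
> ```
> // Paraphrased sketch of the guard described above (not the exact source).
> import org.apache.spark.sql.catalyst.expressions.Exists
> import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
>
> def resolveSubqueries(plan: LogicalPlan,
>     analyze: LogicalPlan => LogicalPlan): LogicalPlan =
>   plan.transformAllExpressions {
>     case e: Exists if !e.plan.resolved =>
>       // Only an unresolved subquery plan goes through the analyzer,
>       // where rules like ResolveEncodersInUDF would run.
>       e.withNewPlan(analyze(e.plan))
>     // An Exists whose plan is already resolved (as built by the MergeInto
>     // rewrite) matches no case and is left untouched, so the ScalaUDF
>     // inside keeps its unresolved encoders.
>   }
> ```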