Jubin Soni created SPARK-57517:
----------------------------------
Summary: schema_of_json throws ClassCastException instead of
proper error on non-string literal input
Key: SPARK-57517
URL: https://issues.apache.org/jira/browse/SPARK-57517
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 4.1.2, 4.2.0, 4.3.0, 4.1.3
Reporter: Jubin Soni
Calling {{schema_of_json}} with a non-string literal (e.g., {{SELECT
schema_of_json(42)}}) throws a {{ClassCastException}} during analysis instead
of a clear, user-facing error message.
The root cause is that {{SchemaOfJson.checkInputDataTypes()}} references a lazy
val {{json = child.eval().asInstanceOf[UTF8String]}} before validating that the
input type is {{StringType}}. When the child is an integer literal, the
{{asInstanceOf[UTF8String]}} cast fails with a {{ClassCastException}}.
The companion functions {{schema_of_csv}} and {{schema_of_xml}} were fixed for
the same issue in SPARK-52234, but {{schema_of_json}} was missed.
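The ordering problem described above can be illustrated with a small, self-contained model. This is only a sketch and not Spark's actual code: {{Literal}}, {{DataTypeMismatch}}, {{SchemaOfBuggy}} and {{SchemaOfFixed}} are hypothetical stand-ins for the Catalyst classes, and a plain {{String}} stands in for {{UTF8String}}. It assumes the fix amounts to validating the input type before the lazy val is first touched.

```scala
// Hypothetical, simplified stand-ins for Catalyst types (not Spark's real classes).
sealed trait DataType
case object StringType extends DataType
case object IntegerType extends DataType

// A literal carrying an untyped value plus its declared type.
final case class Literal(value: Any, dataType: DataType) {
  def eval(): Any = value
}

sealed trait TypeCheckResult
case object TypeCheckSuccess extends TypeCheckResult
final case class DataTypeMismatch(errorSubClass: String) extends TypeCheckResult

// Buggy ordering: the lazy val is forced before the input type is validated,
// so a non-string literal fails inside the cast with a ClassCastException.
final class SchemaOfBuggy(child: Literal) {
  private lazy val json: String = child.eval().asInstanceOf[String]

  def checkInputDataTypes(): TypeCheckResult =
    if (json == null) DataTypeMismatch("UNEXPECTED_NULL")
    else if (child.dataType != StringType) DataTypeMismatch("UNEXPECTED_INPUT_TYPE")
    else TypeCheckSuccess
}

// Assumed fix: validate the type first, and only then touch the lazy val.
final class SchemaOfFixed(child: Literal) {
  private lazy val json: String = child.eval().asInstanceOf[String]

  def checkInputDataTypes(): TypeCheckResult =
    if (child.dataType != StringType) DataTypeMismatch("UNEXPECTED_INPUT_TYPE")
    else if (json == null) DataTypeMismatch("UNEXPECTED_NULL")
    else TypeCheckSuccess
}
```

In this model, calling {{checkInputDataTypes()}} on {{new SchemaOfBuggy(Literal(42, IntegerType))}} throws from the cast, while the same call on {{SchemaOfFixed}} returns a {{DataTypeMismatch}} value that an analyzer could report as a normal error.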
*Steps to reproduce:*
{{SELECT schema_of_json(42);}}
*Expected:*
{{DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE}} analysis error (same behavior as {{schema_of_csv(42)}})
*Actual:*
{{ClassCastException: class java.lang.Integer cannot be cast to class
org.apache.spark.unsafe.types.UTF8String}}