Jubin Soni created SPARK-57517:
----------------------------------

             Summary: schema_of_json throws ClassCastException instead of 
proper error on non-string literal input
                 Key: SPARK-57517
                 URL: https://issues.apache.org/jira/browse/SPARK-57517
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 4.1.2, 4.2.0, 4.3.0, 4.1.3
            Reporter: Jubin Soni


Calling {{schema_of_json}} with a non-string literal (e.g., 
{{SELECT schema_of_json(42)}}) throws a {{ClassCastException}} during analysis 
instead of a clear, user-facing error message.

The root cause is that {{SchemaOfJson.checkInputDataTypes()}} references a lazy 
val {{json = child.eval().asInstanceOf[UTF8String]}} before validating that the 
input type is {{StringType}}. When the child is an integer literal, the 
{{asInstanceOf[UTF8String]}} cast fails with a {{ClassCastException}}.
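
The ordering problem described above can be sketched in plain Scala without any Spark dependency. This is only an illustration under stated assumptions: {{Literal}}, {{DataTypeMismatch}}, {{SchemaOfJsonBuggy}} and {{SchemaOfJsonFixed}} are simplified, hypothetical stand-ins rather than the Catalyst classes, {{String}} stands in for {{UTF8String}}, and the "fixed" ordering is one possible arrangement, not necessarily the patch that will be applied.

```scala
import scala.util.Try

// Hypothetical, simplified stand-ins for Catalyst types (not Spark's real classes).
sealed trait DataType
case object StringType extends DataType
case object IntegerType extends DataType

final case class Literal(value: Any, dataType: DataType)

sealed trait TypeCheckResult
case object TypeCheckSuccess extends TypeCheckResult
final case class DataTypeMismatch(errorSubClass: String) extends TypeCheckResult

// Buggy ordering: the lazy val is forced (and the cast runs) before the
// declared input type has been validated.
final class SchemaOfJsonBuggy(child: Literal) {
  private lazy val json: String = child.value.asInstanceOf[String]

  def checkInputDataTypes(): TypeCheckResult =
    if (json == null) DataTypeMismatch("UNEXPECTED_NULL") // forces the cast first
    else if (child.dataType != StringType) DataTypeMismatch("UNEXPECTED_INPUT_TYPE")
    else TypeCheckSuccess
}

// One possible ordering: validate the type first, so the lazy val is only
// forced once the child is known to be a string.
final class SchemaOfJsonFixed(child: Literal) {
  private lazy val json: String = child.value.asInstanceOf[String]

  def checkInputDataTypes(): TypeCheckResult =
    if (child.dataType != StringType) DataTypeMismatch("UNEXPECTED_INPUT_TYPE")
    else if (json == null) DataTypeMismatch("UNEXPECTED_NULL")
    else TypeCheckSuccess
}

object Demo {
  def main(args: Array[String]): Unit = {
    val intLiteral = Literal(42, IntegerType)
    // The buggy check fails with a ClassCastException instead of returning a result.
    println(Try(new SchemaOfJsonBuggy(intLiteral).checkInputDataTypes()))
    // The reordered check returns a type-mismatch result.
    println(new SchemaOfJsonFixed(intLiteral).checkInputDataTypes())
  }
}
```

In this sketch the only difference between the two classes is the order of the two conditions, which is enough to turn the exception into an ordinary type-check result.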

The companion functions {{schema_of_csv}} and {{schema_of_xml}} were fixed for 
the same issue in SPARK-52234, but {{schema_of_json}} was missed.

*Steps to reproduce:*
{{SELECT schema_of_json(42);}}

*Expected:*
A {{DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE}} analysis error (same behavior as 
{{schema_of_csv(42)}}).

*Actual:*
{{ClassCastException: class java.lang.Integer cannot be cast to class 
org.apache.spark.unsafe.types.UTF8String}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
