HyukjinKwon opened a new pull request, #36375: URL: https://github.com/apache/spark/pull/36375
### What changes were proposed in this pull request?

This PR is a backport of https://github.com/apache/spark/pull/36351

This PR proposes to remove the usage of `toSQLValue(v)` without an explicit type.

That method relies on `Literal(v)`, which is intended to be used by end users, so it cannot handle our internal types such as `UTF8String` and `ArrayBasedMapData`. Using this method can lead to unexpected error messages such as:

```
Caused by: org.apache.spark.SparkRuntimeException: [UNSUPPORTED_FEATURE] The feature is not supported: literal for 'hair' of class org.apache.spark.unsafe.types.UTF8String.
  at org.apache.spark.sql.errors.QueryExecutionErrors$.literalTypeUnsupportedError(QueryExecutionErrors.scala:241)
  at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:99)
  at org.apache.spark.sql.errors.QueryErrorsBase.toSQLValue(QueryErrorsBase.scala:45)
  ...
```

Since it is impossible to derive the corresponding data type from an internal type, as one internal type can map to multiple external types (e.g., `Long` for `Timestamp`, `TimestampNTZ`, and `LongType`), the removal approach was taken.

### Why are the changes needed?

To provide the error messages as intended.

### Does this PR introduce _any_ user-facing change?

Yes.
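The user-facing change comes from the type ambiguity described earlier: a raw internal value does not say which SQL type it belongs to. The following is a minimal, dependency-free sketch of that idea, not Spark's implementation. `SketchType`, `SketchLongType`, `SketchTimestampType`, and `render` are hypothetical names invented for illustration, and the sketch assumes the timestamp is stored as microseconds since the epoch.

```scala
import java.time.Instant
import java.time.temporal.ChronoUnit

// Hypothetical stand-ins for SQL data types; these are NOT Spark classes.
sealed trait SketchType
case object SketchLongType extends SketchType
case object SketchTimestampType extends SketchType

object ExplicitTypeSketch {
  // Render an internal value as SQL-like text. The declared type, not the
  // runtime class of the value, decides how the raw Long is interpreted.
  def render(value: Any, dataType: SketchType): String = (value, dataType) match {
    case (v: Long, SketchLongType) =>
      s"${v}L"
    case (v: Long, SketchTimestampType) =>
      // Assumption for this sketch: the Long holds microseconds since the epoch.
      val instant = Instant.EPOCH.plus(v, ChronoUnit.MICROS)
      s"TIMESTAMP '$instant'"
    case (other, t) =>
      throw new IllegalArgumentException(s"cannot render $other as $t")
  }

  def main(args: Array[String]): Unit = {
    val raw = 1000000L
    // The same raw value produces different text depending on the declared type.
    println(render(raw, SketchLongType))
    println(render(raw, SketchTimestampType))
  }
}
```

Without the second argument, `render` would have to guess between the two interpretations, which is the situation an untyped `toSQLValue(v)` is in. The reproduction for the user-facing change follows.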
```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.types.StringType
import org.apache.spark.sql.types.DataTypes

val arrayStructureData = Seq(
  Row(Map("hair"->"black", "eye"->"brown")),
  Row(Map("hair"->"blond", "eye"->"blue")),
  Row(Map()))

val mapType = DataTypes.createMapType(StringType, StringType)
val arrayStructureSchema = new StructType().add("properties", mapType)
val mapTypeDF = spark.createDataFrame(
  spark.sparkContext.parallelize(arrayStructureData), arrayStructureSchema)

spark.conf.set("spark.sql.ansi.enabled", true)
mapTypeDF.selectExpr("element_at(properties, 'hair')").show
```

Before:

```
Caused by: org.apache.spark.SparkRuntimeException: [UNSUPPORTED_FEATURE] The feature is not supported: literal for 'hair' of class org.apache.spark.unsafe.types.UTF8String.
  at org.apache.spark.sql.errors.QueryExecutionErrors$.literalTypeUnsupportedError(QueryExecutionErrors.scala:241)
  at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:99)
  at org.apache.spark.sql.errors.QueryErrorsBase.toSQLValue(QueryErrorsBase.scala:45)
  ...
```

After:

```
Caused by: org.apache.spark.SparkNoSuchElementException: [MAP_KEY_DOES_NOT_EXIST] Key 'hair' does not exist. To return NULL instead, use 'try_element_at'. If necessary set spark.sql.ansi.enabled to false to bypass this error.
== SQL(line 1, position 0) ==
element_at(properties, 'hair')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```

### How was this patch tested?

A unit test was added. Otherwise, existing test cases should cover the change.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
