cloud-fan commented on code in PR #48986:
URL: https://github.com/apache/spark/pull/48986#discussion_r1897232451
##########
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala:
##########
@@ -1092,13 +1090,45 @@ private[hive] object HiveClientImpl extends Logging {
// When reading data in parquet, orc, or avro file format with string type for char,
// the trailing spaces may be lost if we do not pad them.
val typeString = if (SQLConf.get.charVarcharAsString) {
- c.dataType.catalogString
+ catalogString(c.dataType)
} else {
-     CharVarcharUtils.getRawTypeString(c.metadata).getOrElse(c.dataType.catalogString)
+     CharVarcharUtils.getRawTypeString(c.metadata).getOrElse(catalogString(c.dataType))
}
new FieldSchema(c.name, typeString, c.getComment().orNull)
}
+ /**
+  * This is a variant of `DataType.catalogString` that does the same thing in general, but
+  * it does not quote the field names in struct types. The HMS API uses unquoted field
+  * names to store the schema of a struct type. This is fine in the write path, but we
+  * might encounter issues in the read path when parsing the unquoted schema strings with
+  * the Spark SQL parser. You can see the tricks we play in the `getSparkSQLDataType`
+  * method to handle this. To avoid the flakiness of those tricks, we quote the field
+  * names, making them unrecognized by the HMS API, and
Review Comment:
Are you saying that, even if we quote the field names now, the Hive table creation will
still fail and Spark will retry with non-Hive-compatible table creation?
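For context, here is a minimal toy sketch of the quoting issue the doc comment describes. The `DType`/`StructT` types and the `catalogString` helper below are hypothetical illustrations, not Spark's actual classes; they only show why a backtick-quoted struct schema string differs from the unquoted form that HMS stores.

```scala
// Hypothetical sketch (not Spark's actual classes): a toy data-type tree
// showing quoted vs. unquoted struct field names in a catalog string.
sealed trait DType
case object StringT extends DType
final case class StructT(fields: Seq[(String, DType)]) extends DType

// Renders a catalog string; `quote = true` mimics the backtick-quoted style
// of `DataType.catalogString`, `quote = false` mimics what HMS stores.
def catalogString(t: DType, quote: Boolean): String = t match {
  case StringT => "string"
  case StructT(fields) =>
    fields
      .map { case (name, fieldType) =>
        val n = if (quote) s"`$name`" else name
        s"$n:${catalogString(fieldType, quote)}"
      }
      .mkString("struct<", ",", ">")
}

// A field name containing a space is unambiguous when quoted, but the
// unquoted form is hard for a SQL parser to round-trip:
println(catalogString(StructT(Seq("a b" -> StringT)), quote = true))  // struct<`a b`:string>
println(catalogString(StructT(Seq("a b" -> StringT)), quote = false)) // struct<a b:string>
```

The quoted form survives a parser round-trip for unusual field names, which is the motivation the doc comment gives for quoting even though HMS itself stores unquoted names.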
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]