cloud-fan commented on code in PR #45039:
URL: https://github.com/apache/spark/pull/45039#discussion_r1857542583


##########
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala:
##########
@@ -1056,11 +1056,22 @@ private[hive] object HiveClientImpl extends Logging {
 
   /** Get the Spark SQL native DataType from Hive's FieldSchema. */
   private def getSparkSQLDataType(hc: FieldSchema): DataType = {

Review Comment:
   This is also called by `verifyColumnDataType`, which is in turn called by 
`createTable`, `alterTable`, etc. in `HiveClientImpl`.
   
   This means an unintentional behavior change: now we can save some Spark data 
source tables as Hive-compatible tables. That seems good on its own, but it 
causes trouble if a user creates a table with Spark 4.0 and then tries to read 
it with an older Spark version.
   
   It's common for users to upgrade the Spark version for a few workloads 
first and then expand the rollout, rather than upgrading everything at once. 
Can we add a boolean parameter to this function to disable this fix, and set it 
to `false` when calling the function from `verifyColumnDataType`? Then we don't 
change the way we create tables in Spark 4.0.
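   A minimal, self-contained sketch of the suggested shape (the real 
`getSparkSQLDataType` parses a Hive `FieldSchema` into a Spark `DataType`; the 
`parseOld`/`parseNew` stand-ins and the type-string rewriting below are purely 
illustrative, not the actual parsing logic):
   
   ```scala
   object DataTypeParsingSketch {
     // Stand-in for the pre-fix parser (the behavior older Spark versions expect).
     private def parseOld(typeString: String): String =
       typeString.toLowerCase
   
     // Stand-in for the fixed parser; the rewrite here is only an example of
     // "new behavior" gated behind the flag.
     private def parseNew(typeString: String): String =
       typeString.toLowerCase.replace("timestamp_ntz", "timestamp")
   
     // The suggested shape: callers on the table-creation path (e.g.
     // verifyColumnDataType) would pass useFix = false, so Spark 4.0 keeps
     // creating tables exactly as before; read paths keep the fix.
     def getSparkSQLDataType(typeString: String, useFix: Boolean = true): String =
       if (useFix) parseNew(typeString) else parseOld(typeString)
   }
   ```
   
   With the default `useFix = true`, existing read-path callers are unaffected, 
and only `verifyColumnDataType` opts out.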



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

