cloud-fan commented on code in PR #45039:
URL: https://github.com/apache/spark/pull/45039#discussion_r1857542583


##########
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala:
##########
@@ -1056,11 +1056,22 @@ private[hive] object HiveClientImpl extends Logging {
 
   /** Get the Spark SQL native DataType from Hive's FieldSchema. */
   private def getSparkSQLDataType(hc: FieldSchema): DataType = {

Review Comment:
   This is also called by `verifyColumnDataType`, which is invoked by `createTable`, `alterTable`, etc. in `HiveClientImpl`.
   
   That introduces an unintentional behavior change: we can now save some Spark data source tables as Hive-compatible tables. This seems good on its own, but it causes trouble if a user creates a table with Spark 4.0 and then tries to read it with an older Spark version.
   
   It's common for users to upgrade Spark for a few workloads first and expand from there, rather than upgrading everything at once. Can we add a flag to this function to disable this fix, and set it to false when calling the function from `verifyColumnDataType`? Then we don't change the way tables are created in Spark 4.0.



##########
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala:
##########
@@ -1056,11 +1056,22 @@ private[hive] object HiveClientImpl extends Logging {
 
   /** Get the Spark SQL native DataType from Hive's FieldSchema. */
   private def getSparkSQLDataType(hc: FieldSchema): DataType = {

Review Comment:
   cc @yaooqinn @dongjoon-hyun 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

