cloud-fan commented on code in PR #45039:
URL: https://github.com/apache/spark/pull/45039#discussion_r1857542583
##########
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala:
##########
@@ -1056,11 +1056,22 @@ private[hive] object HiveClientImpl extends Logging {
/** Get the Spark SQL native DataType from Hive's FieldSchema. */
private def getSparkSQLDataType(hc: FieldSchema): DataType = {
Review Comment:
This is also called by `verifyColumnDataType`, which is in turn called by
`createTable`, `alterTable`, etc. in `HiveClientImpl`.
This means an unintentional behavior change: some Spark data source tables can
now be saved as Hive-compatible tables. That seems good on its own, but it
causes trouble if a user creates a table with Spark 4.0 and then tries to read
it with an older Spark version.
It's common for users to upgrade Spark for a few workloads first and expand
from there, rather than upgrading everything at once. Can we add a boolean
parameter to this function to disable this fix, and pass `false` when calling
it from `verifyColumnDataType`? Then we don't change the way tables are
created in Spark 4.0.
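To make the suggestion concrete, here is a minimal, self-contained sketch of the shape I have in mind. This is not the actual `HiveClientImpl` code: the parameter name `applyFix`, the stand-in `FieldSchema`/`DataType` types, and the type strings are all illustrative.

```scala
// Standalone model of the proposed gating, NOT Spark's actual implementation.
// FieldSchema and DataType below are simplified stand-ins for the real classes.
object GetSparkSQLDataTypeSketch {
  case class FieldSchema(name: String, typeString: String)

  sealed trait DataType
  case object IntegerType extends DataType
  case class ParsedType(typeString: String) extends DataType

  // Proposed shape: `applyFix` (name is illustrative) gates the new parsing
  // behavior. Read paths keep the default `true`; `verifyColumnDataType`
  // would pass `false` so createTable/alterTable behave as before.
  def getSparkSQLDataType(hc: FieldSchema, applyFix: Boolean = true): DataType =
    hc.typeString match {
      case "int" => IntegerType
      case other if applyFix =>
        // New behavior: accept type strings the old parser rejected.
        ParsedType(other)
      case other =>
        // Old behavior: reject, so table creation is unchanged in 4.0.
        throw new IllegalArgumentException(s"Unsupported Hive type: $other")
    }

  def main(args: Array[String]): Unit = {
    val col = FieldSchema("c1", "some_new_type")
    println(getSparkSQLDataType(col)) // read path: fix applies
    try getSparkSQLDataType(col, applyFix = false) // write path: still rejected
    catch { case e: IllegalArgumentException => println(e.getMessage) }
  }
}
```

The key point is just the extra parameter with a `true` default, so only the `verifyColumnDataType` call site needs to change and all read paths keep the fix.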
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]