cloud-fan commented on a change in pull request #33328:
URL: https://github.com/apache/spark/pull/33328#discussion_r670694172
##########
File path:
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala
##########
@@ -244,7 +244,11 @@ private[hive] class HiveMetastoreCatalog(sparkSession: SparkSession) extends Log
paths = rootPath.toString :: Nil,
userSpecifiedSchema = Option(updatedTable.dataSchema),
bucketSpec = None,
- options = options,
+ // Do not interpret the 'path' option at all when tables are read using the Hive
+ // source, since the URIs will already have been read from the table's LOCATION.
+ // The `Map() ++` is necessary to force materialization since the return of
+ // `filterKeys` is a view which is not serializable
+ options = Map() ++ options.filterKeys(!_.equalsIgnoreCase("path")),
Review comment:
I agree. `path` is a special data source option in Spark. For Hive
tables not written by Spark, it is wrong to apply that Spark-specific
knowledge here and treat the `path` option specially.
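
For illustration only, here is a minimal standalone sketch of the filtering idea in the diff above. It is not the code in this PR: `PathOptionSketch` and `withoutPath` are hypothetical names, the sample option values are made up, and it uses `filter` instead of `Map() ++ ... filterKeys`, since `filter` on an immutable `Map` already returns a strict `Map` rather than a view.

```scala
// Hypothetical helper for illustration; not part of HiveMetastoreCatalog.
object PathOptionSketch {

  // Drops every entry whose key equals "path", ignoring case.
  // `filter` on an immutable Map builds a new strict Map, so no lazy
  // view is left behind that would need to be materialized later.
  def withoutPath(options: Map[String, String]): Map[String, String] =
    options.filter { case (key, _) => !key.equalsIgnoreCase("path") }

  def main(args: Array[String]): Unit = {
    // Made-up options: one 'path' entry in upper case, one unrelated entry.
    val options = Map("PATH" -> "/some/location", "mergeSchema" -> "true")
    println(withoutPath(options))
  }
}
```

With a helper like this, the table's LOCATION stays the only source of the read path, and any `path` entry stored in the table properties is ignored regardless of its casing.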
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]