cloud-fan commented on a change in pull request #33328:
URL: https://github.com/apache/spark/pull/33328#discussion_r670694172



##########
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala
##########
@@ -244,7 +244,11 @@ private[hive] class HiveMetastoreCatalog(sparkSession: SparkSession) extends Log
                 paths = rootPath.toString :: Nil,
                 userSpecifiedSchema = Option(updatedTable.dataSchema),
                 bucketSpec = None,
-                options = options,
+                // Do not interpret the 'path' option at all when tables are read using the Hive
+                // source, since the URIs will already have been read from the table's LOCATION.
+                // The `Map() ++` is necessary to force materialization since the return of
+                // `filterKeys` is a view which is not serializable
+                options = Map() ++ options.filterKeys(!_.equalsIgnoreCase("path")),

Review comment:
       I agree. `path` is a special data source option in Spark. For Hive
tables not written by Spark, it is wrong to apply Spark-specific knowledge here
and treat the `path` option specially.
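
For illustration only, here is a minimal standalone sketch of the filtering step discussed above: it drops any `path` key case-insensitively and returns a strict map. The object and method names (`PathOptionSketch`, `dropPathOption`) are hypothetical and not part of the PR. The sketch also uses `filter` rather than the `Map() ++ options.filterKeys(...)` form from the diff, on the assumption that a strict `filter` yields an equivalent, already materialized map without going through a view.

```scala
object PathOptionSketch {
  // Hypothetical helper, not from the PR: removes the `path` option
  // regardless of key casing and returns a strict immutable Map.
  // `filter` builds a new Map eagerly, so no lazy view is involved.
  def dropPathOption(options: Map[String, String]): Map[String, String] =
    options.filter { case (key, _) => !key.equalsIgnoreCase("path") }

  def main(args: Array[String]): Unit = {
    // Example table properties as they might arrive from a Hive table.
    val options = Map("PATH" -> "/tmp/ignored", "compression" -> "snappy")
    val cleaned = dropPathOption(options)
    println(cleaned) // prints Map(compression -> snappy)
  }
}
```

Whether to prefer `filter` or the `Map() ++ filterKeys(...)` form is a style choice; the point of either is that the map handed to the data source no longer contains `path` and is a concrete collection rather than a view.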




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.



