This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new a51dd1820da [SPARK-39203][SQL][FOLLOWUP] Do not qualify view location
a51dd1820da is described below
commit a51dd1820dad8f24f99a979d5998b999ae4a3c25
Author: Wenchen Fan <[email protected]>
AuthorDate: Thu Oct 20 15:16:04 2022 -0700
[SPARK-39203][SQL][FOLLOWUP] Do not qualify view location
### What changes were proposed in this pull request?
This fixes a corner-case regression caused by
https://github.com/apache/spark/pull/36625. Users may have existing views with
invalid locations for historical reasons. The location is useless for a view,
but after https://github.com/apache/spark/pull/36625 reading such views fails
because qualifying the location fails. We should simply skip qualifying view
locations.
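The shape of the fix can be sketched as follows. This is an illustrative Scala sketch, not the actual Spark code: `resolveLocation`, its parameters, and the plain `java.net.URI` handling are hypothetical stand-ins for what `HiveClientImpl` does with `stringToURI` and `HiveExternalCatalog.toAbsoluteURI` when building `CatalogStorageFormat.locationUri`.

```scala
import java.net.URI

// Hypothetical helper mimicking the patched logic: views keep their stored
// location untouched; only tables get relative locations qualified.
def resolveLocation(rawLocation: String, isView: Boolean, dbLocation: String): URI = {
  val tableUri = new URI(rawLocation)
  if (isView) {
    // A view's data location is never read, so return it as-is; attempting to
    // qualify a possibly-invalid path is what caused the regression.
    tableUri
  } else if (tableUri.isAbsolute) {
    // Already absolute: nothing to do.
    tableUri
  } else {
    // Qualify relative table locations against the database location
    // (the behavior introduced for SPARK-19257).
    new URI(dbLocation.stripSuffix("/") + "/").resolve(tableUri)
  }
}
```

With this shape, a "broken" view location passes through unchanged, while table locations keep the post-SPARK-19257 qualification behavior.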
### Why are the changes needed?
Avoid a regression.
### Does this PR introduce _any_ user-facing change?
Spark can read views with an invalid location again.
### How was this patch tested?
Manually tested. A view with an invalid location is somewhat "broken" and can't
be dropped (HMS fails to drop it), so we can't write a unit test for it.
Closes #38321 from cloud-fan/follow.
Authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
.../apache/spark/sql/hive/client/HiveClientImpl.scala | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala b/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
index f6b06b08cbc..213d930653d 100644
--- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
+++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
@@ -537,12 +537,18 @@ private[hive] class HiveClientImpl(
       storage = CatalogStorageFormat(
         locationUri = shim.getDataLocation(h).map { loc =>
           val tableUri = stringToURI(loc)
-          // Before SPARK-19257, created data source table does not use absolute uri.
-          // This makes Spark can't read these tables across HDFS clusters.
-          // Rewrite table location to absolute uri based on database uri to fix this issue.
-          val absoluteUri = Option(tableUri).filterNot(_.isAbsolute)
-            .map(_ => stringToURI(client.getDatabase(h.getDbName).getLocationUri))
-          HiveExternalCatalog.toAbsoluteURI(tableUri, absoluteUri)
+          if (h.getTableType == HiveTableType.VIRTUAL_VIEW) {
+            // Data location of SQL view is useless. Do not qualify it even if it's present, as
+            // it can be an invalid path.
+            tableUri
+          } else {
+            // Before SPARK-19257, created data source table does not use absolute uri.
+            // This makes Spark can't read these tables across HDFS clusters.
+            // Rewrite table location to absolute uri based on database uri to fix this issue.
+            val absoluteUri = Option(tableUri).filterNot(_.isAbsolute)
+              .map(_ => stringToURI(client.getDatabase(h.getDbName).getLocationUri))
+            HiveExternalCatalog.toAbsoluteURI(tableUri, absoluteUri)
+          }
         },
         // To avoid ClassNotFound exception, we try our best to not get the format class, but get
         // the class name directly. However, for non-native tables, there is no interface to get
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]