This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new a51dd1820da [SPARK-39203][SQL][FOLLOWUP] Do not qualify view location
a51dd1820da is described below

commit a51dd1820dad8f24f99a979d5998b999ae4a3c25
Author: Wenchen Fan <wenc...@databricks.com>
AuthorDate: Thu Oct 20 15:16:04 2022 -0700

    [SPARK-39203][SQL][FOLLOWUP] Do not qualify view location
    
    ### What changes were proposed in this pull request?
    
    This fixes a corner-case regression caused by 
https://github.com/apache/spark/pull/36625. Users may have existing views with 
invalid locations for historical reasons. The location is useless for a view, 
but after https://github.com/apache/spark/pull/36625 , reading such a view 
fails because qualifying the location fails. We should simply skip qualifying 
view locations.
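    The failure mode can be illustrated outside of Spark: qualifying a stored
    location first requires parsing it as a URI, and an invalid location (for
    example, one containing a space) already fails at that step. The sketch
    below is plain Java using java.net.URI rather than Spark's actual
    stringToURI helper (which wraps Hadoop's Path), and the describeLocation
    helper is hypothetical; it only mirrors the branch added by this patch:
    views skip qualification entirely, while tables still parse and qualify
    their location.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class ViewLocationSketch {
    // Hypothetical stand-in for the qualification step in HiveClientImpl.
    // For a view, the location is never read, so we return without parsing it;
    // for a table, we must parse the location as a URI before qualifying it.
    static String describeLocation(String rawLocation, boolean isView) {
        if (isView) {
            // A view's data location is never used, so an invalid value is harmless.
            return "skipped (view)";
        }
        try {
            URI uri = new URI(rawLocation);
            return uri.isAbsolute() ? "already absolute" : "needs qualifying";
        } catch (URISyntaxException e) {
            // Before this fix, a view with such a location hit this failure too.
            return "invalid: " + e.getReason();
        }
    }

    public static void main(String[] args) {
        // A location containing a space is not a valid URI.
        String broken = "/user/hive/warehouse/my view";
        System.out.println(describeLocation(broken, false)); // table: parsing fails
        System.out.println(describeLocation(broken, true));  // view: location ignored
    }
}
```

    With the patch applied, only the table branch can fail; a view with the
    same broken location is read successfully because its location is returned
    untouched.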
    
    ### Why are the changes needed?
    
    Avoid a regression.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Spark can read views with invalid locations again.
    
    ### How was this patch tested?
    
    Manually tested. A view with an invalid location is effectively "broken" 
and cannot even be dropped (HMS fails to drop it), so we cannot write a UT for 
it.
    
    Closes #38321 from cloud-fan/follow.
    
    Authored-by: Wenchen Fan <wenc...@databricks.com>
    Signed-off-by: Dongjoon Hyun <dongj...@apache.org>
---
 .../apache/spark/sql/hive/client/HiveClientImpl.scala  | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala b/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
index f6b06b08cbc..213d930653d 100644
--- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
+++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
@@ -537,12 +537,18 @@ private[hive] class HiveClientImpl(
       storage = CatalogStorageFormat(
         locationUri = shim.getDataLocation(h).map { loc =>
           val tableUri = stringToURI(loc)
-          // Before SPARK-19257, created data source table does not use absolute uri.
-          // This makes Spark can't read these tables across HDFS clusters.
-          // Rewrite table location to absolute uri based on database uri to fix this issue.
-          val absoluteUri = Option(tableUri).filterNot(_.isAbsolute)
-            .map(_ => stringToURI(client.getDatabase(h.getDbName).getLocationUri))
-          HiveExternalCatalog.toAbsoluteURI(tableUri, absoluteUri)
+          if (h.getTableType == HiveTableType.VIRTUAL_VIEW) {
+            // Data location of SQL view is useless. Do not qualify it even if it's present, as
+            // it can be an invalid path.
+            tableUri
+          } else {
+            // Before SPARK-19257, created data source table does not use absolute uri.
+            // This makes Spark can't read these tables across HDFS clusters.
+            // Rewrite table location to absolute uri based on database uri to fix this issue.
+            val absoluteUri = Option(tableUri).filterNot(_.isAbsolute)
+              .map(_ => stringToURI(client.getDatabase(h.getDbName).getLocationUri))
+            HiveExternalCatalog.toAbsoluteURI(tableUri, absoluteUri)
+          }
         },
        // To avoid ClassNotFound exception, we try our best to not get the format class, but get
        // the class name directly. However, for non-native tables, there is no interface to get

