[GitHub] [spark] cloud-fan commented on a change in pull request #28511: [SPARK-31684][SQL] Overwrite partition failed with 'WRONG FS' when the target partition is not belong to the filesystem as same as the table

GitBox Mon, 18 May 2020 06:19:15 -0700


cloud-fan commented on a change in pull request #28511:
URL: https://github.com/apache/spark/pull/28511#discussion_r426618965




##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala
##########
@@ -281,11 +283,26 @@ case class InsertIntoHiveTable(
             oldPart.flatMap(_.storage.locationUri.map(uri => new Path(uri)))
           }
 
-          // SPARK-18107: Insert overwrite runs much slower than hive-client.
+          val hiveVersion = externalCatalog.asInstanceOf[ExternalCatalogWithListener]
+            .unwrapped.asInstanceOf[HiveExternalCatalog]
+            .client
+            .version
+          // SPARK-31684:
+          // For Hive 2.0.0 and onwards, as https://issues.apache.org/jira/browse/HIVE-11940
+          // has been fixed, and there is no performance issue anymore. We should leave the
+          // overwrite logic to hive to avoid failure in `FileSystem#checkPath` when the table
+          // and partition locations do not belong to the same `FileSystem`
+          // TODO(SPARK-31675): For Hive 2.2.0 and earlier, if the table and partition locations
+          // do not belong together, we will still get the same error thrown by hive encryption
+          // check. see https://issues.apache.org/jira/browse/HIVE-14380.
+          // So we still disable for Hive overwrite for Hive 1.x for better performance because
+          // the partition and table are on the same cluster in most cases.
+          // SPARK-18107:
+          // Insert overwrite runs much slower than hive-client.

Review comment:
       nit: this should be on the same line as `SPARK-18107:`
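
       To illustrate, here is a minimal sketch of the version gate that the comment block describes, with the `SPARK-18107:` note kept on a single line. It is not the code in the PR: `OverwriteGate`, `majorVersion` and `letHiveOverwrite` are hypothetical names, and it assumes the Hive version is available as a dotted string such as "1.2.1" or "2.3.7".

```scala
// Hypothetical helper, not part of Spark: decides whether Hive performs the overwrite.
object OverwriteGate {
  // Assumes a dotted version string such as "1.2.1" or "2.3.7".
  def majorVersion(version: String): Int =
    version.takeWhile(_ != '.').toInt

  // SPARK-31684: HIVE-11940 is fixed in Hive 2.0.0 and onwards, so leave the overwrite to Hive.
  // SPARK-18107: Insert overwrite runs much slower than hive-client.
  // For Hive 1.x the caller is assumed to clear the partition itself, so this returns false.
  def letHiveOverwrite(overwrite: Boolean, hiveVersion: String): Boolean =
    overwrite && majorVersion(hiveVersion) >= 2
}
```

       Under these assumptions, `OverwriteGate.letHiveOverwrite(true, "1.2.1")` evaluates to `false` and `OverwriteGate.letHiveOverwrite(true, "2.3.7")` to `true`.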






