rdblue commented on a change in pull request #25465: [SPARK-28747][SQL] merge
the two data source v2 fallback configs
URL: https://github.com/apache/spark/pull/25465#discussion_r316409612
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
##########
@@ -251,37 +251,17 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
     assertNotBucketed("save")
-    val session = df.sparkSession
-    val cls = DataSource.lookupDataSource(source, session.sessionState.conf)
-    val canUseV2 = canUseV2Source(session, cls) && partitioningColumns.isEmpty
-
-    // In Data Source V2 project, partitioning is still under development.
-    // Here we fallback to V1 if partitioning columns are specified.
-    // TODO(SPARK-26778): use V2 implementations when partitioning feature is supported.
-    if (canUseV2) {
-      val provider = cls.getConstructor().newInstance().asInstanceOf[TableProvider]
+    val maybeV2Provider = lookupV2Provider()
+    // TODO(SPARK-26778): use V2 implementations when partition columns are specified
+    if (maybeV2Provider.isDefined && partitioningColumns.isEmpty) {
Review comment:
This case can throw an exception instead of silently falling back. If the provider is v2 and there is no catalog, then Spark can only append or overwrite (see the cases below). Append and overwrite rely on existing tables, so they must fail in v2 if the table does not exist. And if the user specified partition columns for a table that already exists, that is an error.
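To make the suggested behavior concrete, here is a minimal, self-contained Scala sketch of the decision the review describes. It is not the actual Spark implementation: `decide`, its parameters, and the `Decision` ADT are hypothetical names; only `maybeV2Provider`-style presence and `partitioningColumns` mirror the diff, and the error branch is the reviewer's proposal rather than existing code.

```scala
// Hypothetical sketch of the save-path fallback logic under discussion.
// Assumption: when a v2 provider is found but the user specified partition
// columns for an existing table, the operation should error rather than
// silently fall back to the v1 path.
object SaveFallbackSketch {
  sealed trait Decision
  case object UseV2 extends Decision
  case object FallBackToV1 extends Decision
  final case class Fail(msg: String) extends Decision

  def decide(
      hasV2Provider: Boolean,
      partitioningColumns: Seq[String],
      tableExists: Boolean): Decision = {
    if (hasV2Provider && partitioningColumns.isEmpty) {
      // Safe to use the v2 path: no partitioning was requested.
      UseV2
    } else if (hasV2Provider && tableExists) {
      // Reviewer's point: append/overwrite target an existing table, so
      // user-specified partition columns are an error, not a fallback case.
      Fail("Partition columns cannot be specified for an existing table")
    } else {
      // No v2 provider, or v2 with partitioning but no existing table:
      // fall back to the v1 write path.
      FallBackToV1
    }
  }
}
```

The key design point is that falling back to v1 would hide a user mistake: append and overwrite in v2 require the table to already exist, so partition columns supplied at write time cannot take effect and should be rejected loudly.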