yuchenhuo commented on a change in pull request #26957: [SPARK-30314] Add identifier and catalog information to DataSourceV2Relation
URL: https://github.com/apache/spark/pull/26957#discussion_r366672547
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
##########
@@ -554,12 +556,14 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
     }
     val command = (mode, tableOpt) match {
-      case (_, Some(table: V1Table)) =>
+      case (_, Some(_: V1Table)) =>
         return saveAsTable(TableIdentifier(ident.name(), ident.namespace().headOption))
       case (SaveMode.Append, Some(table)) =>
         checkPartitioningMatchesV2Table(table)
-        AppendData.byName(DataSourceV2Relation.create(table), df.logicalPlan, extraOptions.toMap)
+        val v2Relation =
+          DataSourceV2Relation.create(table, catalogManager.catalogIdentifier(catalog), Seq(ident))
+        AppendData.byName(v2Relation, df.logicalPlan, extraOptions.toMap)
       case (SaveMode.Overwrite, _) =>
Review comment:
Still probably a dumb question: why does DDL/DML affect how we generate the query plan? I'm asking because in the `save()` function of `DataFrameWriter`, we do generate a `DataSourceV2Relation` for `Overwrite` mode, so I'm curious why there is such a difference here.
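To make the question concrete, here is a minimal sketch (with hypothetical stand-in types, not the real Spark classes) of the branching that the diff above touches: V1 tables fall back to the legacy `saveAsTable` path for every mode, while the V2 `Append` branch builds a relation that, after this change, also carries the catalog identifier.

```scala
// Hypothetical stand-in types for illustration only -- NOT real Spark classes.
sealed trait Mode
case object Append extends Mode
case object Overwrite extends Mode

sealed trait Table { def name: String }
case class V1Table(name: String) extends Table
case class V2Table(name: String) extends Table

sealed trait Command
case class SaveAsTableV1(name: String) extends Command
case class AppendData(name: String, catalogId: Option[String]) extends Command
case class OverwriteByExpression(name: String) extends Command

def buildCommand(mode: Mode, tableOpt: Option[Table],
                 catalogId: Option[String]): Command =
  (mode, tableOpt) match {
    // V1 tables always take the legacy saveAsTable path, regardless of mode
    case (_, Some(t: V1Table)) => SaveAsTableV1(t.name)
    // V2 append: the relation now also carries the catalog identifier
    case (Append, Some(t)) => AppendData(t.name, catalogId)
    // Overwrite builds its own command; details elided in this sketch
    case (Overwrite, _) =>
      OverwriteByExpression(tableOpt.map(_.name).getOrElse("<unresolved>"))
    case _ => sys.error("unsupported mode/table combination")
  }
```

The sketch only mirrors the shape of the match; the real code additionally threads `df.logicalPlan` and the write options into each command.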