rdblue commented on a change in pull request #23606: [SPARK-26666][SQL] Support 
DSv2 overwrite and dynamic partition overwrite.
URL: https://github.com/apache/spark/pull/23606#discussion_r256635849
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Exec.scala
 ##########
 @@ -41,18 +46,114 @@ case class WriteToDataSourceV2(batchWrite: BatchWrite, query: LogicalPlan)
   override def output: Seq[Attribute] = Nil
 }
 
+case class AppendDataExec(
+    table: SupportsBatchWrite,
+    writeOptions: DataSourceOptions,
+    query: SparkPlan) extends V2TableWriteExec with BatchWriteHelper {
+
+  override protected def doExecute(): RDD[InternalRow] = {
+    val batchWrite = newWriteBuilder() match {
+      case builder: SupportsSaveMode =>
+        builder.mode(SaveMode.Append).buildForBatch()
+
+      case builder =>
+        builder.buildForBatch()
+    }
+    doWrite(batchWrite)
+  }
+}
+
+case class OverwriteByExpressionExec(
+    table: SupportsBatchWrite,
+    filters: Array[Filter],
+    writeOptions: DataSourceOptions,
+    query: SparkPlan) extends V2TableWriteExec with BatchWriteHelper {
+
+  private def isTruncate(filters: Array[Filter]): Boolean = {
+    filters.length == 1 && filters(0).isInstanceOf[AlwaysTrue]
 
 Review comment:
   > Return true if one of the filters is AlwaysTrue, right?
   
   No, this uses the same convention as `SupportsPushDownFilters`, where the 
array of filters should be interpreted as ANDed together. The original 
expression is split using `splitConjunctivePredicates`.
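
To make the convention concrete, here is a minimal sketch that runs without Spark on the classpath. The `Filter`, `AlwaysTrue`, `EqualTo`, and `And` types below are simplified stand-ins for the classes in `org.apache.spark.sql.sources`, and `splitConjuncts` is a hypothetical helper that only mimics the idea behind `splitConjunctivePredicates` (which operates on Catalyst expressions, not on `Filter`s). Only `isTruncate` mirrors the code in the diff.

```scala
// Simplified stand-ins for Spark's org.apache.spark.sql.sources.Filter hierarchy.
// These are assumptions for illustration, not the real Spark classes.
sealed trait Filter
final case class AlwaysTrue() extends Filter
final case class EqualTo(attribute: String, value: Any) extends Filter
final case class And(left: Filter, right: Filter) extends Filter

object FilterConvention {
  // Hypothetical helper: flatten nested ANDs into a flat list of conjuncts.
  // The resulting sequence is read as f1 AND f2 AND ... AND fn.
  def splitConjuncts(filter: Filter): Seq[Filter] = filter match {
    case And(left, right) => splitConjuncts(left) ++ splitConjuncts(right)
    case other => Seq(other)
  }

  // Same check as in the diff: the overwrite is a truncation only when the
  // whole conjunction is exactly one AlwaysTrue filter.
  def isTruncate(filters: Array[Filter]): Boolean =
    filters.length == 1 && filters(0).isInstanceOf[AlwaysTrue]
}
```

Under this reading, `Array(AlwaysTrue(), EqualTo("day", "2019-02-12"))` means `true AND day = '2019-02-12'`, which replaces only the matching rows, so `isTruncate` returns false for it. Returning true whenever any element is `AlwaysTrue` would wrongly treat that overwrite as a full truncation.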
   
   I've added a note to both `SupportsPushDownFilters` and `SupportsOverwrite` 
to document the expected behavior explicitly.
   
   > Or we assume the optimizer rule will do it? Do we have an end-to-end test 
case?
   
   The existing DataFrameWriter overwrite test is an end-to-end test case.
   
   I plan to add more end-to-end test cases when more of this functionality is 
exposed through SQL and an improved DataFrame write API. Right now, the only 
case that can be called from the API is table truncation, via 
`SaveMode.Overwrite`.
