[GitHub] [spark] xuanyuanking commented on a change in pull request #31842: [SPARK-34748][SS] Create a rule of the analysis logic for streaming write

GitBox Wed, 17 Mar 2021 06:21:40 -0700


xuanyuanking commented on a change in pull request #31842:
URL: https://github.com/apache/spark/pull/31842#discussion_r596008656




##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala
##########
@@ -23,31 +23,26 @@ import org.apache.spark.sql.{Dataset, SparkSession}
 import org.apache.spark.sql.catalyst.encoders.RowEncoder
 import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, CurrentBatchTimestamp, CurrentDate, CurrentTimestamp}
 import org.apache.spark.sql.catalyst.plans.logical.{LeafNode, LocalRelation, LogicalPlan, Project}
-import org.apache.spark.sql.catalyst.streaming.StreamingRelationV2
+import org.apache.spark.sql.catalyst.streaming.{StreamingRelationV2, WriteToStream}
 import org.apache.spark.sql.catalyst.util.truncatedString
-import org.apache.spark.sql.connector.catalog.{SupportsRead, SupportsWrite, Table, TableCapability}
+import org.apache.spark.sql.connector.catalog.{SupportsRead, SupportsWrite, TableCapability}
 import org.apache.spark.sql.connector.read.streaming.{MicroBatchStream, Offset => OffsetV2, ReadLimit, SparkDataStream, SupportsAdmissionControl}
 import org.apache.spark.sql.execution.SQLExecution
 import org.apache.spark.sql.execution.datasources.v2.{StreamingDataSourceV2Relation, StreamWriterCommitProgress, WriteToDataSourceV2Exec}
 import org.apache.spark.sql.execution.streaming.sources.WriteToMicroBatchDataSource
 import org.apache.spark.sql.internal.SQLConf
-import org.apache.spark.sql.streaming.{OutputMode, Trigger}
+import org.apache.spark.sql.streaming.Trigger
 import org.apache.spark.util.{Clock, Utils}
 
 class MicroBatchExecution(
     sparkSession: SparkSession,
-    name: String,
-    checkpointRoot: String,
-    analyzedPlan: LogicalPlan,
-    sink: Table,
     trigger: Trigger,
     triggerClock: Clock,
-    outputMode: OutputMode,
     extraOptions: Map[String, String],
-    deleteCheckpointOnStop: Boolean)
+    plan: WriteToStream)
   extends StreamExecution(
-    sparkSession, name, checkpointRoot, analyzedPlan, sink,
-    trigger, triggerClock, outputMode, deleteCheckpointOnStop) {
+    sparkSession, plan.name, plan.checkpointLocation, plan.queryPlan, plan.sink, trigger,
+    triggerClock, plan.outputMode, plan.deleteCheckpointOnStop) {
 

Review comment:
       Actually, I'm already heading in this direction, building on this PR. I
plan to move all the logic for generating `logicalPlan` in both
MicroBatchExecution and ContinuousExecution. The patch is still in development,
and some issues (like the queryExecutionThread assertion) are still under
investigation. Maybe we can split this task into a separate PR. WDYT?




