HyukjinKwon commented on a change in pull request #30491:
URL: https://github.com/apache/spark/pull/30491#discussion_r530054985
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Exec.scala
##########
@@ -213,15 +213,14 @@ case class AtomicReplaceTableAsSelectExec(
* Rows in the output data set are appended.
*/
case class AppendDataExec(
- session: SparkSession,
table: SupportsWrite,
- relation: DataSourceV2Relation,
writeOptions: CaseInsensitiveStringMap,
- query: SparkPlan) extends V2TableWriteExec with BatchWriteHelper {
+ query: SparkPlan,
+ afterWrite: () => Unit = () => ()) extends V2TableWriteExec with BatchWriteHelper {
override protected def run(): Seq[InternalRow] = {
val writtenRows = writeWithV2(newWriteBuilder().buildForBatch())
- session.sharedState.cacheManager.recacheByPlan(session, relation)
Review comment:
What about fixing `DropTableExec` and `RefreshTableExec` together here, under a separate JIRA? From a cursory look, those appear to be the only two DSv2 execution plans left that pass the Spark session.
A broader refactoring across the codebase could also be done separately, I believe.
One thing, though: the callback always does the same thing, and it's unlikely that `afterWrite` will ever do anything else, at least so far. I felt it's a bit much to generalise it, but it's probably okay.
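For reference, the callback shape under discussion can be sketched roughly like this. All names here are hypothetical stand-ins, not the real Spark classes: the point is only that the exec node no longer needs a `SparkSession`/`DataSourceV2Relation` pair just to recache, because the planner supplies an `afterWrite` thunk (defaulting to a no-op) that fires after the write completes.

```scala
// Minimal sketch (hypothetical names, not the actual Spark classes) of the
// afterWrite-callback pattern: the exec node runs the write, then invokes a
// planner-supplied callback, defaulting to a no-op.
object AfterWriteSketch {
  final case class AppendDataExecSketch(afterWrite: () => Unit = () => ()) {
    // Stand-in for V2TableWriteExec.run(): perform the "write", then fire
    // the callback (in Spark this would be something like
    // () => cacheManager.recacheByPlan(session, relation)).
    def run(): Long = {
      val writtenRows = 3L // pretend rows were written
      afterWrite()
      writtenRows
    }
  }

  // Demonstrates that the callback fires exactly once per run.
  def demo(): (Long, Int) = {
    var recacheCalls = 0
    val rows = AppendDataExecSketch(afterWrite = () => recacheCalls += 1).run()
    (rows, recacheCalls)
  }
}
```

With the default argument, call sites that don't care about recaching simply omit the parameter, which keeps the generalisation cheap even if `afterWrite` only ever has one real use.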