HyukjinKwon commented on a change in pull request #30491:
URL: https://github.com/apache/spark/pull/30491#discussion_r530054985
##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Exec.scala
##########
@@ -213,15 +213,14 @@ case class AtomicReplaceTableAsSelectExec(
* Rows in the output data set are appended.
*/
case class AppendDataExec(
- session: SparkSession,
table: SupportsWrite,
- relation: DataSourceV2Relation,
writeOptions: CaseInsensitiveStringMap,
- query: SparkPlan) extends V2TableWriteExec with BatchWriteHelper {
+ query: SparkPlan,
+ afterWrite: () => Unit = () => ()) extends V2TableWriteExec with BatchWriteHelper {
override protected def run(): Seq[InternalRow] = {
val writtenRows = writeWithV2(newWriteBuilder().buildForBatch())
- session.sharedState.cacheManager.recacheByPlan(session, relation)
Review comment:
What about fixing `DropTableExec` and `RefreshTableExec` together here, under a separate JIRA? From a cursory look, those appear to be the only two DSv2 execution plans left that pass the Spark session.
A broader refactoring across the codebase could also be done separately, I believe.
One thing, though: the callback always does the same thing, and it's unlikely that `afterWrite` will ever do anything else, at least so far. I felt it's a bit much to generalise it, but it's probably okay.
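For reference, the callback shape under discussion can be sketched roughly like this. All names here are hypothetical stand-ins, not the real Spark classes: the point is only that the exec node no longer needs a `SparkSession`/`DataSourceV2Relation` pair just to recache, because the planner supplies an `afterWrite` thunk (defaulting to a no-op) that fires after the write completes.

```scala
// Minimal sketch (hypothetical names, not the actual Spark classes) of the
// afterWrite-callback pattern: the exec node runs the write, then invokes a
// planner-supplied callback, defaulting to a no-op.
object AfterWriteSketch {
  final case class AppendDataExecSketch(afterWrite: () => Unit = () => ()) {
    // Stand-in for V2TableWriteExec.run(): perform the "write", then fire
    // the callback (in Spark this would be something like
    // () => cacheManager.recacheByPlan(session, relation)).
    def run(): Long = {
      val writtenRows = 3L // pretend rows were written
      afterWrite()
      writtenRows
    }
  }

  // Demonstrates that the callback fires exactly once per run.
  def demo(): (Long, Int) = {
    var recacheCalls = 0
    val rows = AppendDataExecSketch(afterWrite = () => recacheCalls += 1).run()
    (rows, recacheCalls)
  }
}
```

With the default argument, call sites that don't care about recaching simply omit the parameter, which keeps the generalisation cheap even if `afterWrite` only ever has one real use.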