HeartSaVioR commented on code in PR #39082:
URL: https://github.com/apache/spark/pull/39082#discussion_r1053715659


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala:
##########
@@ -183,16 +172,80 @@ object LogicalRDD {
       }
     }
 
+    val logicalPlan = originDataset.logicalPlan
     val optimizedPlan = originDataset.queryExecution.optimizedPlan
     val executedPlan = originDataset.queryExecution.executedPlan
 
+    val (stats, constraints) = rewriteStatsAndConstraints(logicalPlan, optimizedPlan)
+
     LogicalRDD(
       originDataset.logicalPlan.output,
       rdd,
       firstLeafPartitioning(executedPlan.outputPartitioning),
       executedPlan.outputOrdering,
       isStreaming
-    )(originDataset.sparkSession, Some(optimizedPlan.stats), Some(optimizedPlan.constraints))

Review Comment:
   We tried this before and found that it can break existing use cases where 
someone checkpoints a "subtree" of the logical plan. Since we know the exprId 
can differ, it would break expressions in the node(s) above.
   
   A concrete example is the source materialization in Delta Lake's MERGE INTO. 
It joins the source DF with the target table, using the merge condition as the 
join condition (the condition is built against the logical plan). The source DF 
can be checkpointed and replaced with a LogicalRDD, which must produce the same 
output so that the join condition does not break.
   
   
https://github.com/delta-io/delta/blob/4e51a9969708080b9ac002462f20f64000288978/core/src/main/scala/org/apache/spark/sql/delta/commands/MergeIntoCommand.scala#L458-L472
   
   
https://github.com/delta-io/delta/blob/master/core/src/main/scala/org/apache/spark/sql/delta/commands/merge/MergeIntoMaterializeSource.scala#L245-L251
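
   To illustrate the concern, here is a minimal sketch in plain Scala. It does 
not use Spark's actual classes: `Attr`, `EqualTo`, `resolves`, and all the 
exprId values are hypothetical stand-ins. It assumes a parent join condition 
refers to child attributes by exprId rather than by name, and that the node 
replacing the checkpointed subtree exposes the same column names with different 
exprIds.

```scala
object ExprIdSketch {
  // Toy stand-in for an attribute: identity is the exprId, not the name.
  final case class Attr(name: String, exprId: Long)

  // Toy stand-in for an equality join condition built on top of the children.
  final case class EqualTo(left: Attr, right: Attr)

  // The condition resolves only if every attribute it references
  // is present (by exprId) in the combined output of the children.
  def resolves(cond: EqualTo, childOutput: Seq[Attr]): Boolean = {
    val available = childOutput.map(_.exprId).toSet
    Seq(cond.left, cond.right).forall(a => available.contains(a.exprId))
  }

  // Output of the source subtree as seen when the condition was built.
  val sourceLogicalOutput: Seq[Attr] = Seq(Attr("id", 1L), Attr("value", 2L))

  // Hypothetical replacement output: same names, different exprIds.
  val replacementOutput: Seq[Attr] = Seq(Attr("id", 11L), Attr("value", 12L))

  val targetOutput: Seq[Attr] = Seq(Attr("id", 3L))

  // Merge condition built against the logical plan: source.id = target.id.
  val mergeCondition: EqualTo = EqualTo(sourceLogicalOutput.head, targetOutput.head)

  def main(args: Array[String]): Unit = {
    // Replacement keeps the logical plan's output: condition still resolves.
    println(resolves(mergeCondition, sourceLogicalOutput ++ targetOutput)) // true
    // Replacement exposes different exprIds: condition no longer resolves.
    println(resolves(mergeCondition, replacementOutput ++ targetOutput)) // false
  }
}
```

   In this toy model the second check fails even though the column names match, 
which is the kind of breakage described above for expressions in the node(s) 
above the checkpointed subtree.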
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

