JingsongLi commented on code in PR #8081:
URL: https://github.com/apache/paimon/pull/8081#discussion_r3348787975
##########
paimon-spark/paimon-spark-common/src/main/scala/org/apache/paimon/spark/commands/MergeIntoPaimonDataEvolutionTable.scala:
##########
@@ -426,7 +451,8 @@ case class MergeIntoPaimonDataEvolutionTable(
     val sourceTableProjExprs =
       allReadFieldsOnSource.toSeq :+ Alias(TrueLiteral, ROW_FROM_SOURCE)()
-    val sourceTableProj = Project(sourceTableProjExprs, sourceTable)
+    val sourceChild =
+      persistSourceDss.map(_.queryExecution.logical).getOrElse(sourceTable)
Review Comment:
This only wires the cached source into the matched/update path. For a MERGE
that has both matched and not-matched clauses, `insertActionInvoke` still
builds its left-anti join from `sourceTable`, so the source is scanned again
after the update path. Could you pass the persisted source into the insert path
too, so the new option avoids repeated source loading for the whole merge
action?
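
To illustrate the shape of the change, here is a minimal toy sketch in plain Scala. It is not Paimon or Spark code: `Plan`, `persist`, `updatePath`, `insertPath`, and `runMerge` are hypothetical stand-ins, and only the `Option#getOrElse` fallback mirrors the diff above. The point is that both paths resolve their source through one helper, so a persisted source is scanned once for the whole merge.

```scala
// Toy model only: a "plan" is a function that yields rows, and reading the
// raw source bumps a counter so repeated loads become visible.
object SharedSourceSketch {
  type Plan = () => Seq[Int]

  var sourceScans = 0
  val sourceTable: Plan = () => { sourceScans += 1; Seq(1, 2, 3, 4) }

  // Hypothetical stand-in for persisting: materialize once, then serve from memory.
  def persist(plan: Plan): Plan = {
    lazy val rows = plan()
    () => rows
  }

  // Single place that picks the source, same fallback as in the diff.
  def sourceChild(persisted: Option[Plan]): Plan =
    persisted.getOrElse(sourceTable)

  // Keys assumed to already exist in the target table.
  val targetKeys = Set(2, 3)

  // Matched clause: source rows whose key exists in the target.
  def updatePath(source: Plan): Seq[Int] =
    source().filter(targetKeys.contains)

  // Not-matched clause: stand-in for the left-anti join against the target.
  def insertPath(source: Plan): Seq[Int] =
    source().filterNot(targetKeys.contains)

  // Both paths receive the same resolved source instead of `sourceTable` directly.
  def runMerge(persisted: Option[Plan]): (Seq[Int], Seq[Int]) = {
    val src = sourceChild(persisted)
    (updatePath(src), insertPath(src))
  }
}
```

In this toy model, calling `runMerge(None)` reads the source twice (once per path), while `runMerge(Some(persist(sourceTable)))` reads it once. In the real command the equivalent would presumably be threading `sourceChild` (or `persistSourceDss`) into `insertActionInvoke`, but the exact signature there is for the PR author to decide.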