szehon-ho commented on code in PR #53207:
URL: https://github.com/apache/spark/pull/53207#discussion_r2558582195
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveMergeIntoSchemaEvolution.scala:
##########
@@ -42,15 +45,19 @@ object ResolveMergeIntoSchemaEvolution extends
Rule[LogicalPlan] {
if (changes.isEmpty) {
m
} else {
- m transformUpWithNewOutput {
- case r @ DataSourceV2Relation(_: SupportsRowLevelOperations, _, _,
_, _, _) =>
+ val finalAttrMapping = ArrayBuffer.empty[(Attribute, Attribute)]
Review Comment:
There is a bug here: the pattern actually matches _both_ the sourceTable and the targetTable and attempts schema evolution on both, when schema evolution should only ever be performed on the target table.
I had done it this way because of a limitation of transformUpWithNewOutput:
it does not re-map the attributes of the top-level object (MergeIntoTable).
See https://github.com/apache/spark/pull/52866#discussion_r2512952222 for my
finding. So I transformed all children of MergeIntoTable and assumed that
matching on a SupportsRowLevelOperations table would be enough to restrict
schema evolution to the targetTable, but I was wrong. So I added an extra
rewriteAttrs step to rewrite the top-level object (MergeIntoTable).
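The shape of the fix can be sketched with a toy model (a hedged illustration only: `Attr`, `Relation`, `Merge`, `evolve`, and `resolve` below are invented stand-ins for Spark's attribute ids, DataSourceV2Relation, MergeIntoTable, schema evolution, and rewriteAttrs, not the actual classes):

```scala
// Toy model: "schema evolution" re-creates the target relation's
// attributes with fresh ids, so the top-level node's own references
// must be rewritten with the accumulated mapping afterwards.
// All names are illustrative stand-ins, not Spark's real API.

case class Attr(id: Int, name: String)
case class Relation(output: Seq[Attr], supportsRowLevelOps: Boolean)
// Merge references attributes of both children, like MergeIntoTable.
case class Merge(source: Relation, target: Relation, condition: Seq[Attr])

var nextId = 100

// Stand-in for schema evolution: rebuild the relation with fresh
// attribute ids and return the old -> new attribute mapping.
def evolve(r: Relation): (Relation, Map[Attr, Attr]) = {
  val mapping = r.output.map { a => nextId += 1; a -> Attr(nextId, a.name) }.toMap
  (Relation(r.output.map(mapping), r.supportsRowLevelOps), mapping)
}

// Correct shape: evolve ONLY the target child (never the source),
// then rewrite the top-level node's own attribute references --
// the role played by the extra rewriteAttrs step in the PR.
def resolve(m: Merge): Merge = {
  val (newTarget, attrMapping) = evolve(m.target)
  Merge(m.source, newTarget, m.condition.map(a => attrMapping.getOrElse(a, a)))
}
```

The point of the sketch is the second step: transforming only the target child is not enough, because the parent node still holds the stale attribute ids until they are explicitly remapped.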
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]