szehon-ho commented on code in PR #52866:
URL: https://github.com/apache/spark/pull/52866#discussion_r2512952222
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveMergeIntoSchemaEvolution.scala:
##########
@@ -34,24 +35,104 @@ import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Relation
object ResolveMergeIntoSchemaEvolution extends Rule[LogicalPlan] {
override def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
- case m @ MergeIntoTable(_, _, _, _, _, _, _)
- if m.needSchemaEvolution =>
+ // This rule should run only if all assignments are resolved, except those
+ // that will be satisfied by schema evolution
+ case m @ MergeIntoTable(_, _, _, _, _, _, _) if m.needSchemaEvolution =>
val newTarget = m.targetTable.transform {
- case r : DataSourceV2Relation => performSchemaEvolution(r, m.sourceTable)
+ case r : DataSourceV2Relation => performSchemaEvolution(r, m)
}
- m.copy(targetTable = newTarget)
+
+ // Unresolve all references based on old target output
Review Comment:
Originally I selectively matched the DataSourceV2Relation under the
targetTable field:
```
m.targetTable transform {
case r: DataSourceV2Relation => ...
}
```
But now I need to run the rule on the top-level node (m), because the
attributes to rewrite are children of m.
I could not figure out how to get it to rewrite those attributes if I match
m itself, i.e.
m transformWithNewOutput {
case mt: MergeIntoTable => mt.targetTable transform { case r: DataSourceV2Relation => ... }
}
```
because I think this method only populates attributeMap if the rule targets
a child, and not m itself:
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala#L349
(unless I missed something). So instead I now match the relation directly
from m:
```
m transformWithNewOutput {
case r @ DataSourceV2Relation(_: SupportsRowLevelOperations, ...) => ...
}
```
This means I now assume that the matched DataSourceV2Relation is the target
table. That currently holds because only the target table wraps a
SupportsRowLevelOperations table, but I wanted to call it out.
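To illustrate the ordering issue described above, here is a minimal toy
sketch. It does not use Spark: Attr, Relation, Merge and
transformUpWithMapping are hypothetical stand-ins that only mimic my
understanding of how a bottom-up rewrite hands attribute mappings to parent
nodes, so treat it as an approximation rather than the actual QueryPlan code.

```scala
// Toy stand-ins for Catalyst classes; none of these are Spark APIs.
final case class Attr(name: String, id: Int)

sealed trait Plan {
  def children: Seq[Plan]
  def withChildren(newChildren: Seq[Plan]): Plan
  def mapRefs(f: Attr => Attr): Plan
}

// Leaf that produces attributes (plays the role of DataSourceV2Relation).
final case class Relation(output: Seq[Attr]) extends Plan {
  def children: Seq[Plan] = Nil
  def withChildren(newChildren: Seq[Plan]): Plan = this
  def mapRefs(f: Attr => Attr): Plan = this
}

// Parent that references child attributes (plays the role of MergeIntoTable).
final case class Merge(target: Plan, refs: Seq[Attr]) extends Plan {
  def children: Seq[Plan] = Seq(target)
  def withChildren(newChildren: Seq[Plan]): Plan = copy(target = newChildren.head)
  def mapRefs(f: Attr => Attr): Plan = copy(refs = refs.map(f))
}

object ToyRewrite {
  type Mapping = Seq[(Attr, Attr)]
  type Rule = PartialFunction[Plan, (Plan, Mapping)]

  def transformUpWithMapping(plan: Plan)(rule: Rule): (Plan, Mapping) = {
    // 1. Rewrite children first and collect their old -> new attribute pairs.
    val rewritten = plan.children.map(c => transformUpWithMapping(c)(rule))
    val childMapping = rewritten.flatMap(_._2)
    val lookup = childMapping.toMap
    // 2. Apply the children's mapping to this node's own references.
    val updated =
      plan.withChildren(rewritten.map(_._1)).mapRefs(a => lookup.getOrElse(a, a))
    // 3. Apply the rule to this node; the mapping it returns is only handed upward.
    val (result, ownMapping) =
      rule.applyOrElse(updated, (p: Plan) => (p, Seq.empty[(Attr, Attr)]))
    (result, childMapping ++ ownMapping)
  }
}

object ToyDemo {
  val oldCol: Attr = Attr("c", 1)
  val newCol: Attr = Attr("c", 2)
  val m: Merge = Merge(Relation(Seq(oldCol)), refs = Seq(oldCol))

  // Rule on the child: the parent's references follow the new attribute.
  val childRule: ToyRewrite.Rule = {
    case Relation(_) => (Relation(Seq(newCol)), Seq(oldCol -> newCol))
  }
  // viaChild is Merge(Relation(Seq(newCol)), Seq(newCol))
  val viaChild: Plan = ToyRewrite.transformUpWithMapping(m)(childRule)._1

  // Rule on the parent: the mapping arrives too late to rewrite m's own refs.
  val parentRule: ToyRewrite.Rule = {
    case mt: Merge => (mt.copy(target = Relation(Seq(newCol))), Seq(oldCol -> newCol))
  }
  // viaParent is Merge(Relation(Seq(newCol)), Seq(oldCol)): refs are stale
  val viaParent: Plan = ToyRewrite.transformUpWithMapping(m)(parentRule)._1
}
```

In this toy model, matching the relation lets the parent's references pick
up the new attribute, while matching the parent itself leaves them pointing
at the old one, which is the behaviour I believe I am seeing.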
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]