rdblue commented on pull request #1947: URL: https://github.com/apache/iceberg/pull/1947#issuecomment-748305937
I looked into resolution and there is a rule in Spark: https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L1682-L1710

It looks like if the assignments are out of order or cover only a subset of the output columns, the expressions are left as-is. If there are no assignments at all, the source table's columns are used to set the output columns by position, with an `Attribute` from the target table as the LHS of each assignment.

We will need an analyzer rule that fills in the missing assignments for UPDATE actions, checks the order of assignments by name, and validates that INSERT actions are complete. I also think this rule should convert the plan to a different MergeInto logical plan. The plan in Spark is not sufficient because it considers itself resolved as soon as the assignments are resolved, not when the assignments actually produce the expected output.

That behavior is strange: resolution produces assignments when there aren't any, but allows them to be missing when some are present.
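To make the proposed rule concrete, here is a minimal sketch of the alignment logic it would need. The types and names (`Attr`, `Assignment`, `AlignAssignments`) are simplified stand-ins rather than Spark's actual catalyst classes, and expressions are plain strings for brevity; the real rule would operate on resolved `Expression`s inside a `Rule[LogicalPlan]`:

```scala
// Simplified stand-ins for catalyst's attribute and assignment types.
case class Attr(name: String)
case class Assignment(key: Attr, value: String)

object AlignAssignments {
  // UPDATE: reorder assignments to match the target's column order and
  // fill any column that has no assignment with an identity assignment
  // (col = col), so the output always covers every target column.
  def alignUpdate(targetCols: Seq[Attr], assignments: Seq[Assignment]): Seq[Assignment] = {
    val byName = assignments.map(a => a.key.name -> a).toMap
    targetCols.map(col => byName.getOrElse(col.name, Assignment(col, col.name)))
  }

  // INSERT: every target column must be assigned; fail analysis otherwise.
  def validateInsert(targetCols: Seq[Attr], assignments: Seq[Assignment]): Unit = {
    val assigned = assignments.map(_.key.name).toSet
    val missing = targetCols.map(_.name).filterNot(assigned)
    require(missing.isEmpty,
      s"INSERT must assign all target columns; missing: ${missing.mkString(", ")}")
  }
}

// Example: an UPDATE that only sets `data` is expanded to cover all columns.
val target = Seq(Attr("id"), Attr("data"), Attr("ts"))
val aligned = AlignAssignments.alignUpdate(target, Seq(Assignment(Attr("data"), "s.data")))
// => Seq(Assignment(id, "id"), Assignment(data, "s.data"), Assignment(ts, "ts"))
```

With the assignments fully aligned like this, the new MergeInto logical plan can treat "resolved" as "produces exactly the target table's output columns, in order", rather than just "all assignment expressions are resolved".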
