rdblue commented on pull request #1947:
URL: https://github.com/apache/iceberg/pull/1947#issuecomment-748305937


   I looked into resolution and there is a rule in Spark: 
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala#L1682-L1710
   
   It looks like if the assignments are out of order, or cover only a subset of the output columns, the expressions are left as-is. If there are no assignments at all, the source table's columns are used to set the output columns by position, using an `Attribute` from the target table as the LHS of each assignment.
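   For illustration, here is a minimal sketch of that positional fallback when an action carries no assignments (this is not Spark's actual code, just a restatement of the behavior above using Spark's catalyst classes):
   
   ```scala
   import org.apache.spark.sql.catalyst.plans.logical.{Assignment, LogicalPlan}
   
   // When a MERGE action has no assignments, pair each target column with the
   // source column in the same position: the target Attribute becomes the LHS
   // and the source column the RHS of each generated assignment.
   def positionalAssignments(target: LogicalPlan, source: LogicalPlan): Seq[Assignment] =
     target.output.zip(source.output).map {
       case (targetAttr, sourceCol) => Assignment(targetAttr, sourceCol)
     }
   ```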
   
   We will need an analyzer rule that fills in the missing assignments for UPDATE, checks the order of assignments against the target columns by name, and validates that inserts are complete (every target column has an assignment). I also think that this rule should convert to a different MergeInto logical plan. The plan in Spark is not sufficient because it considers itself resolved when the assignment expressions are resolved, not when the assignments actually produce the expected output. That's inconsistent: resolution generates assignments when there are none, but allows them to be missing when some are present.
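   Roughly, such a rule could look like the sketch below. This is only an outline of the idea, not a working implementation; `AlignMergeAssignments`, `alignUpdate`, and `alignInsert` are hypothetical names, while `MergeIntoTable`, `UpdateAction`, `InsertAction`, and `Assignment` are Spark's catalyst classes:
   
   ```scala
   import org.apache.spark.sql.catalyst.expressions.Attribute
   import org.apache.spark.sql.catalyst.plans.logical._
   import org.apache.spark.sql.catalyst.rules.Rule
   
   // Hypothetical rule: once the assignment expressions are resolved, align them
   // with the target table's schema so that each action actually produces the
   // expected output columns.
   object AlignMergeAssignments extends Rule[LogicalPlan] {
     override def apply(plan: LogicalPlan): LogicalPlan = plan transform {
       case m @ MergeIntoTable(target, _, _, matched, notMatched) if m.resolved =>
         val attrs = target.output
         m.copy(
           matchedActions = matched.map {
             case u @ UpdateAction(_, assignments) =>
               u.copy(assignments = alignUpdate(attrs, assignments))
             case other => other
           },
           notMatchedActions = notMatched.map {
             case i @ InsertAction(_, assignments) =>
               i.copy(assignments = alignInsert(attrs, assignments))
             case other => other
           })
     }
   
     // UPDATE: reorder assignments to match the target schema and fill in a
     // no-op assignment (column := column) for any column that was not set.
     private def alignUpdate(attrs: Seq[Attribute], assignments: Seq[Assignment]): Seq[Assignment] =
       attrs.map { attr =>
         assignments.find(_.key.semanticEquals(attr)).getOrElse(Assignment(attr, attr))
       }
   
     // INSERT: every target column must be assigned; anything missing is an error.
     private def alignInsert(attrs: Seq[Attribute], assignments: Seq[Assignment]): Seq[Assignment] =
       attrs.map { attr =>
         assignments.find(_.key.semanticEquals(attr)).getOrElse(
           throw new IllegalArgumentException(s"No assignment for target column ${attr.name}"))
       }
   }
   ```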

