rdblue opened a new pull request #2116:
URL: https://github.com/apache/iceberg/pull/2116


   This updates MERGE INTO to avoid using a full outer join if it is not needed.
   
   * If there are no "matched" conditions, then all of the actions must be 
inserts and the merge can be rewritten to append rows instead of rewriting 
existing data files. In this case, the join used is a left anti join to remove 
any source row that has a matching target row.
   * If there are no "not matched" conditions, then all actions must have a 
target row. In this case, the join used is a right outer join to discard any 
source rows that do not match a target row.
   * Otherwise, the original full outer join is used.
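
   The three cases above can be sketched as a small selection function. This is an illustrative Python sketch, not the code in this PR: `JoinType` and `choose_join_type` are hypothetical names, and the source is assumed to be the left side of the join with the target on the right.

```python
from enum import Enum


class JoinType(Enum):
    LEFT_ANTI = "left_anti"      # keep only source rows with no matching target row
    RIGHT_OUTER = "right_outer"  # keep every target row, drop unmatched source rows
    FULL_OUTER = "full_outer"    # keep unmatched rows from both sides


def choose_join_type(num_matched_actions: int, num_not_matched_actions: int) -> JoinType:
    """Pick the cheapest join that still feeds every MERGE action (hypothetical helper)."""
    if num_matched_actions == 0:
        # Only inserts are possible, so the merge can append new rows
        # instead of rewriting existing data files.
        return JoinType.LEFT_ANTI
    if num_not_matched_actions == 0:
        # Every action needs a target row; unmatched source rows are never used.
        return JoinType.RIGHT_OUTER
    # Both kinds of action are present, so both sides must be preserved.
    return JoinType.FULL_OUTER
```

   For example, a merge with only `WHEN NOT MATCHED THEN INSERT` would map to `choose_join_type(0, 1)`, which selects the left anti join in this sketch.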
   
   This commit also updates how rows are handled in `MergeIntoExec`. These 
changes are needed because target columns are not available when the join is a 
left anti join. To avoid a failure when building a projection for target 
columns that will not be used, the target projection is now optional. If 
there is no projection, `null` is emitted instead of the target row with 
an extra `true` column, and only the non-null results are output.
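
   The optional projection described above could look roughly like the following. This is a simplified Python sketch under stated assumptions, not the Scala code in `MergeIntoExec`: the names `emit_target` and `output_rows` are hypothetical, rows are modelled as tuples, and the per-action logic of the real operator is left out.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

Row = Tuple[object, ...]
Projection = Callable[[Row], Row]


def emit_target(row: Row, target_projection: Optional[Projection]) -> Optional[Row]:
    """Return the projected target row plus an extra True column, or None."""
    if target_projection is None:
        # Left anti join case: no target columns exist, so nothing can be projected.
        return None
    return target_projection(row) + (True,)


def output_rows(joined: Iterable[Row],
                target_projection: Optional[Projection]) -> Iterator[Row]:
    """Yield only the non-null results."""
    for row in joined:
        result = emit_target(row, target_projection)
        if result is not None:
            yield result
```

   With a projection such as `lambda r: r[:2]`, each joined row is emitted as its first two columns followed by `True`. With `None` as the projection, no target rows are emitted at all.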




