rdblue opened a new pull request #2116: URL: https://github.com/apache/iceberg/pull/2116
This updates MERGE INTO to avoid using a full outer join when it is not needed.

* If there are no "matched" conditions, then all of the actions must be inserts and the merge can be rewritten to append rows instead of rewriting existing data files. In this case, the join used is a left anti join to remove any source row that has a matching target row.
* If there are no "not matched" conditions, then all actions must have a target row. In this case, the join used is a right outer join to discard any source rows that do not match a target row.
* Otherwise, the original full outer join is used.

This commit also updates how rows are handled in `MergeIntoExec`. These changes are needed because target columns are not available if the join is a left anti join. To avoid a failure when building a projection for target columns that will not be used, this makes the projection optional. If there is no projection, `null` is emitted instead of the target row with an extra `true` column, and the output rows are the non-null results.
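
As a rough illustration of the join selection described in the list above, here is a minimal Python sketch. It is not the code in this PR: the names `JoinType` and `choose_join_type` are hypothetical, and the sketch assumes the source relation is the left side of the join and the target table is the right side.

```python
from enum import Enum


class JoinType(Enum):
    """Join types named in the description (source = left, target = right)."""
    LEFT_ANTI = "left_anti"
    RIGHT_OUTER = "right_outer"
    FULL_OUTER = "full_outer"


def choose_join_type(has_matched_actions: bool,
                     has_not_matched_actions: bool) -> JoinType:
    """Pick the cheapest join that still produces every row the merge needs.

    Hypothetical helper: the checks follow the order of the bullets in the
    description, so the "no matched conditions" case is tested first.
    """
    if not has_matched_actions:
        # Insert-only merge: keep only source rows with no matching target row.
        return JoinType.LEFT_ANTI
    if not has_not_matched_actions:
        # Every action needs a target row: drop source rows without a match.
        return JoinType.RIGHT_OUTER
    # Both kinds of action are present: fall back to the original join.
    return JoinType.FULL_OUTER
```

For example, a merge with only `WHEN NOT MATCHED THEN INSERT` clauses would call `choose_join_type(False, True)` and get `JoinType.LEFT_ANTI`, which is the case that lets the merge append rows instead of rewriting data files.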

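The optional target projection can be pictured with another small Python sketch. This is a simplified model under stated assumptions, not the `MergeIntoExec` implementation: rows are plain tuples, `project_target_row` and `emit_rows` are hypothetical names, and only the path that carries an unchanged target row forward is modeled.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

Row = Tuple
Projection = Callable[[Row], Row]


def project_target_row(row: Row,
                       target_projection: Optional[Projection]) -> Optional[Row]:
    """Return the target row plus an extra True column, or None.

    When no projection exists (for example, a left anti join exposes no
    target columns), None is emitted instead of failing.
    """
    if target_projection is None:
        return None
    return target_projection(row) + (True,)


def emit_rows(joined_rows: Iterable[Row],
              target_projection: Optional[Projection]) -> Iterator[Row]:
    """Yield only the non-null results, as the description states."""
    for row in joined_rows:
        out = project_target_row(row, target_projection)
        if out is not None:
            yield out
```

With `target_projection=None`, `emit_rows` yields nothing from this path. With a projection such as `lambda r: r[:2]`, each row is emitted with a trailing `True` column.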