[GitHub] [iceberg] dilipbiswal opened a new pull request #1947: [WIP] Spark MERGE INTO Support (copy-on-write implementation)

GitBox Wed, 16 Dec 2020 14:36:23 -0800


dilipbiswal opened a new pull request #1947:
URL: https://github.com/apache/iceberg/pull/1947



   - Adds WIP support for MERGE INTO in Spark, building on the work Anton did for DELETE.
   - This PR implements MERGE INTO using copy-on-write.
   
   Plan:
   ```
   == Optimized Logical Plan ==
   ReplaceData RelationV2[key1#50, value1#51] file:///..., IcebergWrite(table=file:///..., format=PARQUET)
   +- MergeInto org.apache.spark.sql.catalyst.plans.logical.MergeIntoProcessor@e1a150c, RelationV2[key1#50, value1#51] file:///...
      +- Join FullOuter, (key1#50 = key2#65)
         :- Project [key2#65, value2#66, true AS _source_row_present_#138]
         :  +- RelationV2[key2#65, value2#66] file:///...
         +- Project [key1#50, value1#51, true AS _target_row_present_#139]
            +- DynamicFileFilter
               :- RelationV2[key1#50, value1#51] file:///...
               +- Aggregate [_file_name_#137], [_file_name_#137]
                  +- Project [_file_name_#137]
                     +- Join Inner, (key1#50 = key2#65)
                        :- Filter isnotnull(key2#65)
                        :  +- RelationV2[key2#65] file:///...
                        +- Project [key1#50, input_file_name() AS _file_name_#137]
                           +- Filter isnotnull(key1#50)
                              +- RelationV2[key1#50] file:///...
   
   ``` 
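
   Reading the plan bottom-up, the copy-on-write flow can be illustrated with a small plain-Python sketch. This is not the code in this PR and does not use Spark or Iceberg. The function name `merge_copy_on_write`, the in-memory "files", and the output file name `merged-0` are all hypothetical. It also assumes upsert semantics (update all columns when matched, insert when not matched) and unique keys in the target, neither of which is stated in the description above.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # (key, value), mirroring (key1, value1) in the plan


def merge_copy_on_write(
    target_files: Dict[str, List[Row]], source: List[Row]
) -> Dict[str, List[Row]]:
    """Upsert `source` into `target_files`, rewriting only the affected files.

    Hypothetical illustration; assumes unique keys in the target.
    """
    source_by_key = dict(source)

    # Inner join + aggregate on file name: find the target files that
    # contain at least one key present in the source.
    affected = {
        name
        for name, rows in target_files.items()
        if any(key in source_by_key for key, _ in rows)
    }

    # Dynamic file filter: read target rows only from the affected files.
    target_rows = {
        key: value for name in affected for key, value in target_files[name]
    }

    # Full outer join + merge: every key from either side yields one output row.
    merged: List[Row] = []
    for key in sorted(set(target_rows) | set(source_by_key)):
        if key in source_by_key:
            # Source row present: update if the target row exists, else insert.
            merged.append((key, source_by_key[key]))
        else:
            # Target row only: copy it unchanged into the rewritten file.
            merged.append((key, target_rows[key]))

    # Replace data: drop the affected files and add the rewritten rows.
    result = {n: rows for n, rows in target_files.items() if n not in affected}
    if merged:
        result["merged-0"] = merged  # hypothetical name for the new file
    return result
```

   In this sketch, a file with no matching key is carried over untouched, while unmatched rows that share a file with a matched row are copied into the rewritten file. That copying is the cost copy-on-write pays to avoid tracking row-level deletes.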

