wirybeaver opened a new issue, #2201:
URL: https://github.com/apache/iceberg-rust/issues/2201

   ### What feature are you trying to implement?
   
   Add support for SQL `MERGE INTO` (UPSERT) operations in the 
iceberg-datafusion integration. This would enable atomic row-level updates 
and inserts driven by a join condition, which is essential for CDC pipelines, 
incremental updates, and data synchronization. I already have a PoC branch.
   
   **SQL Example:**
   ```sql
   MERGE INTO target_table t
   USING source_table s
   ON t.id = s.id
   WHEN MATCHED THEN
     UPDATE SET t.value = s.value
   WHEN NOT MATCHED THEN
     INSERT (id, value) VALUES (s.id, s.value)
   ```
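
For reference, the intended semantics of the statement above can be sketched with plain in-memory maps. This is only an illustration, not the PoC code and not any iceberg-rust or DataFusion API: `MergeAction`, `classify`, and `apply` are hypothetical names, and the sketch assumes each `id` appears at most once in the source.

```rust
use std::collections::BTreeMap;

/// Outcome of classifying one source row against the target (hypothetical type).
#[derive(Debug, PartialEq)]
enum MergeAction {
    /// Source key exists in the target: WHEN MATCHED THEN UPDATE.
    Update { id: i64, old_value: String, new_value: String },
    /// Source key is absent from the target: WHEN NOT MATCHED THEN INSERT.
    Insert { id: i64, value: String },
}

/// Classify each source row by probing the target on `id` (the ON condition).
fn classify(target: &BTreeMap<i64, String>, source: &[(i64, String)]) -> Vec<MergeAction> {
    source
        .iter()
        .map(|(id, value)| match target.get(id) {
            Some(old) => MergeAction::Update {
                id: *id,
                old_value: old.clone(),
                new_value: value.clone(),
            },
            None => MergeAction::Insert { id: *id, value: value.clone() },
        })
        .collect()
}

/// Apply the classified actions to produce the post-merge table state.
fn apply(target: &mut BTreeMap<i64, String>, actions: &[MergeAction]) {
    for action in actions {
        match action {
            MergeAction::Update { id, new_value, .. } => {
                target.insert(*id, new_value.clone());
            }
            MergeAction::Insert { id, value } => {
                target.insert(*id, value.clone());
            }
        }
    }
}

fn main() {
    let mut target = BTreeMap::from([(1, "a".to_string()), (2, "b".to_string())]);
    let source = vec![(2, "B".to_string()), (3, "c".to_string())];

    let actions = classify(&target, &source);
    apply(&mut target, &actions);

    println!("{:?}", actions);
    println!("{:?}", target);
}
```

Here row 2 is classified as an update and row 3 as an insert, while row 1 is left untouched because no source row matches it. Classification is kept separate from application so that updates and inserts can be handled differently later on.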
   
   **Task List:**
   - [ ] Implement RowDeltaAction transaction action for row-level modifications
   - [ ] Add IcebergMergeExec with HashJoinExec integration and row 
classification
   - [ ] Add IcebergMergeWriteExec and IcebergMergeCommitExec nodes
   - [ ] Implement full MERGE execution logic with file tracking
   - [ ] Integrate MERGE INTO into IcebergTableProvider (`table/mod.rs`)
   - [ ] Add comprehensive MERGE INTO integration tests (`table/mod.rs`)
   - [ ] Add partition-aware merge optimization (in the style of Spark's 
storage-partitioned join)
   
   ### Willingness to contribute
   
   I would be willing to contribute to this feature with guidance from the 
Iceberg Rust community.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

