glitchy commented on issue #2186:
URL: https://github.com/apache/iceberg-rust/issues/2186#issuecomment-3990973559

   hey @wirybeaver--thanks for the interest in this area! a few things worth 
flagging:
   
   ### `RowDeltaAction` overlaps with `OverwriteAction`
   
   the `OverwriteAction` in my PR (#2185)  already handles adding data files 
and marking existing ones as deleted via manifest 
rewriting--`add_data_files()`, `delete_data_files()`, `Operation::Overwrite`, 
`ManifestStatus::Deleted` entries with preserved sequence numbers. looking at 
your `RowDeltaAction` (#2201), it reimplements the same core logic. 
`RowDeltaAction` is already planned as the next PR in this series (see 
checklist above)
   
   ### datafusion is a consumer, not a dependency:
   
   the core Iceberg transaction primitives (`OverwriteAction`, 
`RowDeltaAction`, scan-side delete reconciliation) do not depend on 
datafusion--they live in the iceberg crate and need to work independently of 
any query engine. datafusion integration (`INSERT OVERWRITE`, `MERGE INTO`) is 
a layer on top that consumes these APIs. coupling them together makes both 
harder to review, and datafusion doesn't even support `MERGE INTO` parsing yet 
(there's no logical plan node for it and it falls through to 
`not_impl_err!("Unsupported SQL statement")` in their planner).
   
   the sequencing for the core primitives is already outlined in the checklist 
above on this issue. once they go in then datafusion integration should be 
considered.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to