glitchy commented on issue #2186:
URL: https://github.com/apache/iceberg-rust/issues/2186#issuecomment-3990973559
hey @wirybeaver--thanks for the interest in this area! a few things worth
flagging:
### `RowDeltaAction` overlaps with `OverwriteAction`
the `OverwriteAction` in my PR (#2185) already handles adding data files
and marking existing ones as deleted via manifest
rewriting--`add_data_files()`, `delete_data_files()`, `Operation::Overwrite`,
`ManifestStatus::Deleted` entries with preserved sequence numbers. looking at
your `RowDeltaAction` (#2201), it reimplements the same core logic.
`RowDeltaAction` is already planned as the next PR in this series (see
checklist above)
### datafusion is a consumer, not a dependency:
the core Iceberg transaction primitives (`OverwriteAction`,
`RowDeltaAction`, scan-side delete reconciliation) do not depend on
datafusion--they live in the iceberg crate and need to work independently of
any query engine. datafusion integration (`INSERT OVERWRITE`, `MERGE INTO`) is
a layer on top that consumes these APIs. coupling them together makes both
harder to review, and datafusion doesn't even support `MERGE INTO` parsing yet
(there's no logical plan node for it and it falls through to
`not_impl_err!("Unsupported SQL statement")` in their planner).
the sequencing for the core primitives is already outlined in the checklist
above on this issue. once they go in then datafusion integration should be
considered.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]