xbattlax opened a new pull request, #2089:
URL: https://github.com/apache/iceberg-rust/pull/2089
## Summary
Implement 10 new metadata table types for the inspect module, addressing the
metadata tables epic #823.
## Changes
- Add `crates/iceberg/src/inspect/refs.rs`:
  - `RefsTable` showing all branches and tags with retention policies
  - Reads from `TableMetadata.refs` HashMap
- Add `crates/iceberg/src/inspect/history.rs`:
  - `HistoryTable` showing snapshot log with ancestry tracking
  - Computes `is_current_ancestor` by walking parent chain from current snapshot
- Add `crates/iceberg/src/inspect/metadata_log_entries.rs`:
  - `MetadataLogEntriesTable` showing metadata file history
  - Appends current metadata entry with snapshot/schema/sequence info
- Add `crates/iceberg/src/inspect/files.rs` with shared infrastructure:
  - `files_schema()` building dynamic schema with partition struct
  - `scan_files()` for current snapshot manifest scanning
  - `scan_all_files()` for all-snapshot scanning with manifest path deduplication
  - `ContentFilter` enum for data-only, deletes-only, or all content
  - `FilesTable`, `DataFilesTable`, `DeleteFilesTable` as thin wrappers
- Add `crates/iceberg/src/inspect/all_manifests.rs`:
  - `AllManifestsTable` iterating manifests across all snapshots
  - Adds `reference_snapshot_id` field to track which snapshot references each manifest
- Add `crates/iceberg/src/inspect/all_files.rs`:
  - `AllFilesTable`, `AllDataFilesTable`, `AllDeleteFilesTable`
  - Thin wrappers around `scan_all_files()` with different content filters
- Update `crates/iceberg/src/inspect/metadata_table.rs`:
  - Add `AllManifests`, `AllFiles`, `AllDataFiles`, `AllDeleteFiles`, `History`, `MetadataLogEntries`, `Refs`, `Files`, `DataFiles`, `DeleteFiles` to `MetadataTableType` enum
  - Add accessor methods on `MetadataTable`
- Update `crates/integrations/datafusion/src/table/metadata_table.rs`:
  - Add match arms for all new `MetadataTableType` variants in schema and scan
- Update `crates/iceberg/src/scan/mod.rs`:
  - Add `setup_all_snapshot_manifest_files()` test helper for all-snapshot tests
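For reviewers, the `is_current_ancestor` computation can be pictured roughly as follows. This is a minimal, self-contained sketch and not the code in this PR: the `current_ancestors` function and the plain `HashMap` of snapshot ID to parent ID are hypothetical stand-ins for the real `TableMetadata` types.

```rust
use std::collections::{HashMap, HashSet};

/// Collect the current snapshot ID and every ancestor reachable through
/// parent links. `parents` maps a snapshot ID to its optional parent ID
/// (a hypothetical simplification of the snapshot metadata).
fn current_ancestors(parents: &HashMap<i64, Option<i64>>, current: Option<i64>) -> HashSet<i64> {
    let mut ancestors = HashSet::new();
    let mut next = current;
    while let Some(id) = next {
        // `insert` returns false if the ID was already seen; stopping here
        // guards against a cycle in malformed metadata.
        if !ancestors.insert(id) {
            break;
        }
        // A missing entry (for example an expired parent) ends the walk.
        next = parents.get(&id).copied().flatten();
    }
    ancestors
}
```

A history row would then report `is_current_ancestor` as `ancestors.contains(&snapshot_id)`, so snapshots on abandoned branches or rolled-back chains come out as `false`.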
## Notes
This matches the Java Iceberg implementation's metadata table types:
- Group A (metadata-only, no I/O): `refs`, `history`, `metadata_log_entries`
- Group B (file-level, shared infra): `files`, `data_files`, `delete_files`
- Group C (all-snapshot variants): `all_manifests`, `all_files`,
`all_data_files`, `all_delete_files`
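
The Group C tables visit every snapshot, and consecutive snapshots usually share most of their manifests, so each manifest path should be processed only once. The sketch below illustrates that deduplication together with a content filter. It is an assumption-laden simplification, not the PR's code: `Content`, `unique_manifests`, and the `(path, content)` tuples are hypothetical, and the filter is applied per manifest here purely for brevity.

```rust
use std::collections::HashSet;

/// Hypothetical content kind attached to each manifest in this sketch.
#[derive(Clone, Copy, PartialEq, Eq)]
enum Content {
    Data,
    Deletes,
}

/// Illustrative filter with the three modes named in this PR.
#[derive(Clone, Copy)]
enum ContentFilter {
    DataOnly,
    DeletesOnly,
    All,
}

impl ContentFilter {
    fn matches(self, content: Content) -> bool {
        match self {
            ContentFilter::DataOnly => content == Content::Data,
            ContentFilter::DeletesOnly => content == Content::Deletes,
            ContentFilter::All => true,
        }
    }
}

/// Return each matching manifest path once, in first-seen order, given one
/// manifest list per snapshot.
fn unique_manifests<'a>(
    snapshots: &[Vec<(&'a str, Content)>],
    filter: ContentFilter,
) -> Vec<&'a str> {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for manifests in snapshots {
        for &(path, content) in manifests {
            // `seen.insert` is false for a path already emitted by an
            // earlier snapshot, so shared manifests are skipped.
            if filter.matches(content) && seen.insert(path) {
                out.push(path);
            }
        }
    }
    out
}
```

Keeping the seen-set keyed on the manifest path means the cost of an all-snapshot scan grows with the number of distinct manifests rather than with snapshots times manifests.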
Remaining tables (`all_entries`, `partitions`, `position_deletes`) are
blocked or deferred:
- `all_entries` depends on ENTRIES PR #863
- `partitions` requires complex aggregation logic
- `position_deletes` requires reading delete file contents (needs design
discussion)
Part of #823