Thanks for the proposal, Ashvin! I see value in adding this to support read-only access from Snowflake. Currently we push updates with an ALTER ICEBERG TABLE ... REFRESH command <https://docs.snowflake.com/en/sql-reference/sql/alter-iceberg-table-refresh> to synchronize our internally hosted catalog with Snowflake, so a version-hint file could potentially eliminate this need.
One question I have is "how could we prevent the version-hint file from being removed during the delete orphan files procedure?" If version-hint is an optional file that is not tracked in the table's metadata, it seems this file could be removed during table maintenance.

On Mon, Nov 11, 2024 at 2:03 PM Ashvin A <ash...@apache.org> wrote:

> Hello Community,
>
> We would like to share a proposal to standardize a file system based
> method to identify Iceberg tables’ current snapshot.
>
> Proposal doc: Adding a File System based Consistent Method to Identify
> Iceberg Tables’ Current Snapshot
> <https://docs.google.com/document/d/1yzLXSOtzBXyaWHfeVsWsMu4xmOH8rV6QyM5ZAnJZjMQ/edit?pli=1&tab=t.0#heading=h.yhvnt89pggpj>
>
> The proposal aims to enhance the interoperability and self-sufficiency of
> Iceberg tables by replicating the snapshot's metadata file name
> (version-hint) from the catalog to the file system. This will make the
> table representation on the file system complete and eliminate the need for
> catalog dependency in certain read-only scenarios.
>
> Use Case: Microsoft Fabric now supports Iceberg tables in OneLake,
> allowing users to leverage Iceberg tables in addition to Delta Lake tables
> with Microsoft Fabric’s compute engines. Having a file system based
> integration reduces the number of components required in the read query
> execution path, especially when the catalog is inaccessible or during
> pre-production scenarios.
>
> Please review the proposal document and share your suggestions in the
> comments. We look forward to discussing this further.
>
> Best,
> Ashvin
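
P.S. To make the orphan-file concern in my reply concrete, here is a minimal sketch of how I picture it. It is a simplified model, not Iceberg's actual implementation: orphan detection is treated as a plain set difference between the files found on storage and the files reachable from table metadata, and it ignores details such as any file-age threshold. The paths, the file name `version-hint.text`, the `find_orphans` helper, and the `protected_names` parameter are all assumptions made up for illustration.

```python
# Simplified model of orphan-file detection (NOT Iceberg's real code).
# All paths, file names, and helper names here are hypothetical.

def find_orphans(files_on_storage, files_referenced_by_metadata, protected_names=()):
    """Return files on storage that no table metadata references.

    protected_names is a hypothetical allow-list of file names that the
    cleanup should never treat as orphans.
    """
    referenced = set(files_referenced_by_metadata)
    return sorted(
        path
        for path in files_on_storage
        if path not in referenced
        and path.rsplit("/", 1)[-1] not in protected_names
    )


# Files physically present under the table location.
on_storage = [
    "metadata/v2.metadata.json",
    "metadata/snap-1.avro",
    "metadata/version-hint.text",  # written next to the metadata, but untracked
    "data/part-0.parquet",
    "data/stale-part.parquet",     # a genuine orphan
]

# Files reachable from the current table metadata.
referenced = [
    "metadata/v2.metadata.json",
    "metadata/snap-1.avro",
    "data/part-0.parquet",
]

# Without special handling, the hint file is flagged alongside the real orphan.
print(find_orphans(on_storage, referenced))

# With an explicit allow-list, only the genuine orphan is flagged.
print(find_orphans(on_storage, referenced, protected_names={"version-hint.text"}))
```

In this model the hint file survives only if the maintenance job is told to skip it, or if the file is tracked in table metadata. That is the gap I would like the proposal to address.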