This is an automated email from the ASF dual-hosted git repository.
blue pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/iceberg.git
The following commit(s) were added to refs/heads/main by this push:
new 479f468c5f Spec: Deprecate the file system table scheme (#10833)
479f468c5f is described below
commit 479f468c5f389bd7a30938114f8e79445c48f179
Author: Ryan Blue <[email protected]>
AuthorDate: Sun Aug 4 14:32:52 2024 -0700
Spec: Deprecate the file system table scheme (#10833)
---
format/spec.md | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/format/spec.md b/format/spec.md
index 5a90f6fd97..daef7538e7 100644
--- a/format/spec.md
+++ b/format/spec.md
@@ -779,7 +779,9 @@ When two commits happen at the same time and are based on the same version, only
#### File System Tables
-An atomic swap can be implemented using atomic rename in file systems that support it, like HDFS or most local file systems [1].
+_Note: This file system based scheme to commit a metadata file is **deprecated** and will be removed in version 4 of this spec. The scheme is **unsafe** in object stores and local file systems._
+
+An atomic swap can be implemented using atomic rename in file systems that support it, like HDFS [1].
Each version of table metadata is stored in a metadata folder under the
table’s base location using a file naming scheme that includes a version
number, `V`: `v<V>.metadata.json`. To commit a new metadata version, `V+1`, the
writer performs the following steps:
@@ -1393,4 +1395,4 @@ This section covers topics not required by the specification but recommendations
Iceberg supports two types of histories for tables. A history of previous
"current snapshots" stored in ["snapshot-log" table
metadata](#table-metadata-fields) and [parent-child lineage stored in
"snapshots"](#table-metadata-fields). These two histories
might indicate different snapshot IDs for a specific timestamp. The
discrepancies can be caused by a variety of table operations (e.g. updating the
`current-snapshot-id` can be used to set the snapshot of a table to any
arbitrary snapshot, which might have a lineage derived from a table branch or
no lineage at all).
-When processing point in time queries implementations should use "snapshot-log" metadata to lookup the table state at the given point in time. This ensures time-travel queries reflect the state of the table at the provided timestamp. For example a SQL query like `SELECT * FROM prod.db.table TIMESTAMP AS OF '1986-10-26 01:21:00Z';` would find the snapshot of the Iceberg table just prior to '1986-10-26 01:21:00 UTC' in the snapshot logs and use the metadata from that snapshot to perform th [...]
\ No newline at end of file
+When processing point in time queries implementations should use "snapshot-log" metadata to lookup the table state at the given point in time. This ensures time-travel queries reflect the state of the table at the provided timestamp. For example a SQL query like `SELECT * FROM prod.db.table TIMESTAMP AS OF '1986-10-26 01:21:00Z';` would find the snapshot of the Iceberg table just prior to '1986-10-26 01:21:00 UTC' in the snapshot logs and use the metadata from that snapshot to perform th [...]
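
Illustrative note, not part of the commit: the point-in-time lookup described in the last hunk can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the reference implementation. It assumes the "snapshot-log" is available as a list of dicts with `timestamp-ms` and `snapshot-id` keys, and it reads "just prior to" as "at or before" the query timestamp. The function name `snapshot_id_as_of` is hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def snapshot_id_as_of(snapshot_log, timestamp_ms):
    """Return the snapshot ID of the last log entry at or before timestamp_ms.

    snapshot_log is assumed to be a list of dicts shaped like
    {"timestamp-ms": int, "snapshot-id": int}. Returns None when the
    table had no current snapshot yet at that time.
    """
    # Sort defensively; the log is normally already in time order.
    entries = sorted(snapshot_log, key=lambda e: e["timestamp-ms"])
    times = [e["timestamp-ms"] for e in entries]
    # bisect_right counts the entries whose timestamp is <= timestamp_ms.
    idx = bisect_right(times, timestamp_ms)
    if idx == 0:
        return None
    return entries[idx - 1]["snapshot-id"]


# The timestamp from the SQL example, converted to epoch milliseconds.
query_ms = int(
    datetime(1986, 10, 26, 1, 21, tzinfo=timezone.utc).timestamp() * 1000
)
```

The lookup deliberately consults only the log of previous current snapshots and ignores parent-child lineage, which is the behavior the spec text recommends for time-travel queries.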