jackye1995 commented on a change in pull request #3425:
URL: https://github.com/apache/iceberg/pull/3425#discussion_r765220466



##########
File path: site/docs/spec.md
##########
@@ -593,16 +625,16 @@ Table metadata consists of the following fields:
 | _optional_ | _required_ | **`default-spec-id`**| ID of the "current" spec 
that writers should use by default. |
 | _optional_ | _required_ | **`last-partition-id`**| An integer; the highest 
assigned partition field ID across all partition specs for the table. This is 
used to ensure partition fields are always assigned an unused ID when evolving 
specs. |
 | _optional_ | _optional_ | **`properties`**| A string to string map of table 
properties. This is used to control settings that affect reading and writing 
and is not intended to be used for arbitrary metadata. For example, 
`commit.retry.num-retries` is used to control the number of commit retries. |
-| _optional_ | _optional_ | **`current-snapshot-id`**| `long` ID of the 
current table snapshot. |
+| _optional_ | _optional_ | **`current-snapshot-id`**| `long` ID of the 
current table snapshot; must be the same as the current ID of the `main` branch 
in `refs`. |
 | _optional_ | _optional_ | **`snapshots`**| A list of valid snapshots. Valid 
snapshots are snapshots for which all data files exist in the file system. A 
data file must not be deleted from the file system until the last snapshot in 
which it was listed is garbage collected. |
 | _optional_ | _optional_ | **`snapshot-log`**| A list (optional) of timestamp 
and snapshot ID pairs that encodes changes to the current snapshot for the 
table. Each time the current-snapshot-id is changed, a new entry should be 
added with the last-updated-ms and the new current-snapshot-id. When snapshots 
are expired from the list of valid snapshots, all entries before a snapshot 
that has expired should be removed. |
 | _optional_ | _optional_ | **`metadata-log`**| A list (optional) of timestamp 
and metadata file location pairs that encodes changes to the previous metadata 
files for the table. Each time a new metadata file is created, a new entry of 
the previous metadata file location should be added to the list. Tables can be 
configured to remove oldest metadata log entries and keep a fixed-size log of 
the most recent entries after a commit. |
 | _optional_ | _required_ | **`sort-orders`**| A list of sort orders, stored 
as full sort order objects. |
 | _optional_ | _required_ | **`default-sort-order-id`**| Default sort order id 
of the table. Note that this could be used by writers, but is not used when 
reading because reads use the specs stored in manifest files. |
+|            | _optional_ | **`refs`** | A map of snapshot references. The map 
keys are the unique snapshot reference names in the table, and the map values 
are snapshot reference objects. There is always a `main` branch reference 
pointing to the `current-snapshot-id` even if the `refs` map is null. |

Review comment:
       Yes I think the work in TableMetadata builder and REST catalog is to try 
have an alternative way of dealing with table metadata updates from as a diff 
rather than rewriting the entire table metadata file. And that could open the 
door for potentially some other implementations of the TableMetadata spec 
backed by services and do not require full metadata rewrite. That is a 
development I would be happy to see, and with that we will be able to have fast 
commits to the metadata for small changes like tagging.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to