rymurr commented on a change in pull request #3425:
URL: https://github.com/apache/iceberg/pull/3425#discussion_r757674048
##########
File path: site/docs/spec.md
##########
@@ -566,6 +566,38 @@ Notes:
1. An alternative, *strict projection*, creates a partition predicate that
will match a file if all of the rows in the file must match the scan predicate.
These projections are used to calculate the residual predicates for each file
in a scan.
2. For example, if `file_a` has rows with `id` between 1 and 10 and a delete
file contains rows with `id` between 1 and 4, a scan for `id = 9` may ignore
the delete file because none of the deletes can match a row that will be
selected.
+#### Snapshot Reference
+
+Iceberg tables keep track of branches and tags using snapshot references.
+Tags are labels for individual snapshots. Branches are mutable named
references that can be updated by committing a new snapshot as the branch's
referenced snapshot using the [Commit Conflict Resolution and
Retry](#commit-conflict-resolution-and-retry) procedures.
+
+The snapshot reference object records all the information of a reference
including snapshot ID, reference type and [Snapshot Retention
Policy](#snapshot-retention-policy).
+
Review comment:
As discussed on the google doc I would be in favour of splitting this
into two structures. From the google doc:
To me this is coupling the concept of snapshot expiry to the branch/tag
definition when we know we probably don't want to long term. Which could make
it hard to make changes in the future. Maybe I am thinking about it too hard
but to me there are two concepts: the list of branches/tags and the expiration
policies. Having these as two separate data structures allows for both to grow
independently in my mind: named policies, more complex policies etc as well as
more flexibility in how the list of existing references is maintained in the
presence of multi-table branching or catalog owned branching etc.
##########
File path: site/docs/spec.md
##########
@@ -593,16 +625,16 @@ Table metadata consists of the following fields:
| _optional_ | _required_ | **`default-spec-id`**| ID of the "current" spec
that writers should use by default. |
| _optional_ | _required_ | **`last-partition-id`**| An integer; the highest
assigned partition field ID across all partition specs for the table. This is
used to ensure partition fields are always assigned an unused ID when evolving
specs. |
| _optional_ | _optional_ | **`properties`**| A string to string map of table
properties. This is used to control settings that affect reading and writing
and is not intended to be used for arbitrary metadata. For example,
`commit.retry.num-retries` is used to control the number of commit retries. |
-| _optional_ | _optional_ | **`current-snapshot-id`**| `long` ID of the
current table snapshot. |
+| _optional_ | _optional_ | **`current-snapshot-id`**| `long` ID of the
current table snapshot; must be the same as the current ID of the `main` branch
in `refs`. |
| _optional_ | _optional_ | **`snapshots`**| A list of valid snapshots. Valid
snapshots are snapshots for which all data files exist in the file system. A
data file must not be deleted from the file system until the last snapshot in
which it was listed is garbage collected. |
| _optional_ | _optional_ | **`snapshot-log`**| A list (optional) of timestamp
and snapshot ID pairs that encodes changes to the current snapshot for the
table. Each time the current-snapshot-id is changed, a new entry should be
added with the last-updated-ms and the new current-snapshot-id. When snapshots
are expired from the list of valid snapshots, all entries before a snapshot
that has expired should be removed. |
| _optional_ | _optional_ | **`metadata-log`**| A list (optional) of timestamp
and metadata file location pairs that encodes changes to the previous metadata
files for the table. Each time a new metadata file is created, a new entry of
the previous metadata file location should be added to the list. Tables can be
configured to remove oldest metadata log entries and keep a fixed-size log of
the most recent entries after a commit. |
| _optional_ | _required_ | **`sort-orders`**| A list of sort orders, stored
as full sort order objects. |
| _optional_ | _required_ | **`default-sort-order-id`**| Default sort order id
of the table. Note that this could be used by writers, but is not used when
reading because reads use the specs stored in manifest files. |
+| | _optional_ | **`refs`** | A map of snapshot references. The map
keys are the unique snapshot reference names in the table, and the map values
are snapshot reference objects. There is always a `main` branch reference
pointing to the `current-snapshot-id` even if the `refs` map is null. |
Review comment:
My concern raised on the google doc was that the act of re-writing
`metadata.json` and doing a full transaction just to add or change a branch/tag
feels strange. Now that `current-branch` is gone I am a bit less concerned. The
only time where there is a transaction just to modify the `refs` field is when
a branch or tag is created, is that right?
The optimistic retry means that at transaction commit time things will
unlikely conflict or cause issues. I think it just feels counterintuitive to me
to do a re-write of the table metadata just to eg change the branch or add a
tag. I also have concerns about extensibility in the future. For example
extending to multiple tables, or implementing a `git log` like feature.
One potential solution is to give the catalogs control over the
implementation. For example the HMS catalog may store the proposed branch
related data structures in the table definition in Hive rather in the
metadata.json. This allows individual catalogs more latitude in how they deal
with branching, and the ability to evolve that strategy over time. It does have
the downside that migrating between catalogs could be more troublesome.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]