RussellSpitzer commented on code in PR #16025:
URL: https://github.com/apache/iceberg/pull/16025#discussion_r3227470666
##########
format/spec.md:
##########
@@ -590,85 +592,182 @@ A data or delete file is associated with a sort order by
the sort order's id wit
### Manifests
-A manifest is an immutable Avro file that lists data files or delete files,
along with each file’s partition data tuple, metrics, and tracking information.
One or more manifest files are used to store a [snapshot](#snapshots), which
tracks all of the files in a table at some point in time. Manifests are tracked
by a [manifest list](#manifest-lists) for each table snapshot.
+A manifest is an immutable file that lists data files or delete files, along
with each file’s partition data, metrics, and tracking information. One or more
manifest files are used to store a [snapshot](#snapshots), which tracks all of
the files in a table at some point in time. In V1-V3, manifests are tracked by
a [manifest list](#manifest-lists) for each table snapshot. In V4, a single
root manifest per snapshot can directly reference data files, delete files, and
other data and delete manifests.
-A manifest is a valid Iceberg data file: files must use valid Iceberg formats,
schemas, and column projection.
+Manifests are valid Iceberg data files: files must use valid Iceberg formats,
schemas, and column projection.
A manifest may store either data files or delete files, but not both because
manifests that contain delete files are scanned first during job planning.
Whether a manifest is a data manifest or a delete manifest is stored in
manifest metadata.
-A manifest stores files for a single partition spec. When a table’s partition
spec changes, old files remain in the older manifest and newer files are
written to a new manifest. This is required because a manifest file’s schema is
based on its partition spec (see below). The partition spec of each manifest is
also used to transform predicates on the table's data rows into predicates on
partition values that are used during job planning to select files from a
manifest.
+**Partition Spec Binding:**
-A manifest file must store the partition spec and other metadata as properties
in the Avro file's key-value metadata:
-
-| v1 | v2 | Key | Value |
-|------------|------------|---------------------|---------------------------------------------------------------------------------------------------------------------------------------------|
-| _required_ | _required_ | `schema` | JSON representation of the table schema at the time the manifest was written |
-| _optional_ | _required_ | `schema-id` | ID of the schema used to write the manifest as a string |
-| _required_ | _required_ | `partition-spec` | JSON representation of only the partition fields array of the partition spec used to write the manifest. See [Appendix C](#partition-specs) |
-| _optional_ | _required_ | `partition-spec-id` | ID of the partition spec used to write the manifest as a string |
-| _optional_ | _required_ | `format-version` | Table format version number of the manifest as a string |
-| | _required_ | `content` | Type of content files tracked by the manifest: "data" or "deletes" |
-
-The schema of a manifest file is defined by the `manifest_entry` struct,
described in the following section.
-
-#### Manifest Entry Fields
-
-The `manifest_entry` struct consists of the following fields:
-
-| v1 | v2 | Field id, name | Type | Description |
-| ---------- | ---------- |-------------------------------|-----------------------------------------------------------|-------------|
-| _required_ | _required_ | **`0 status`** | `int` with meaning: `0: EXISTING` `1: ADDED` `2: DELETED` | Used to track additions and deletions. Deletes are informational only and not used in scans. |
-| _required_ | _optional_ | **`1 snapshot_id`** | `long` | Snapshot id where the file was added, or deleted if status is 2. Inherited when null. |
-| | _optional_ | **`3 sequence_number`** | `long` | Data sequence number of the file. Inherited when null and status is 1 (added). |
-| | _optional_ | **`4 file_sequence_number`** | `long` | File sequence number indicating when the file was added. Inherited when null and status is 1 (added). |
-| _required_ | _required_ | **`2 data_file`** | `data_file` `struct` (see below) | File path, partition tuple, metrics, ... |
-
-The manifest entry fields are used to keep track of the snapshot in which
files were added or logically deleted. The `data_file` struct, defined below,
is nested inside the manifest entry so that it can be easily passed to job
planning without the manifest entry fields.
-
-When a file is added to the dataset, its manifest entry should store the
snapshot ID in which the file was added and set status to 1 (added).
-
-When a file is replaced or deleted from the dataset, its manifest entry fields
store the snapshot ID in which the file was deleted and status 2 (deleted). The
file may be deleted from the file system when the snapshot in which it was
deleted is garbage collected, assuming that older snapshots have also been
garbage collected [1].
-
-Iceberg v2 adds data and file sequence numbers to the entry and makes the
snapshot ID optional. Values for these fields are inherited from manifest
metadata when `null`. That is, if the field is `null` for an entry, then the
entry must inherit its value from the manifest file's metadata, stored in the
manifest list.
-The `sequence_number` field represents the data sequence number and must never
change after a file is added to the dataset. The data sequence number
represents a relative age of the file content and should be used for planning
which delete files apply to a data file.
-The `file_sequence_number` field represents the sequence number of the
snapshot that added the file and must also remain unchanged upon assigning at
commit. The file sequence number can't be used for pruning delete files as the
data within the file may have an older data sequence number.
-The data and file sequence numbers are inherited only if the entry status is 1
(added). If the entry status is 0 (existing) or 2 (deleted), the entry must
include both sequence numbers explicitly.
-
-Notes:
-
-1. Technically, data files can be deleted when the last snapshot that contains
the file as “live” data is garbage collected. But this is harder to detect and
requires finding the diff of multiple snapshots. It is easier to track what
files are deleted in a snapshot and delete them when that snapshot expires. It
is not recommended to add a deleted file back to a table. Adding a deleted file
can lead to edge cases where incremental deletes can break table snapshots.
-2. Manifest list files are required in v2, so that the `sequence_number` and
`snapshot_id` to inherit are always available.
+- V1-V3: A manifest stores files for a single partition spec. When a table’s
partition spec changes, old files remain in the older manifest and newer files
are written to a new manifest. This is required because a manifest file’s
schema is based on its partition spec. The partition spec of each manifest is
used to transform predicates on the table’s data rows into predicates on
partition values during job planning.
+- V4: Manifests are not bound to a single partition spec. Files with different
partition specs can coexist in the same manifest because partition values are
stored in column statistics using source column IDs rather than in a
partition-spec-specific struct. The `partition-spec-id` in manifest metadata is
tracked for informational purposes but does not constrain the contents.
+
+#### Manifest File Format
+
+Manifests are Avro files in V1-V3. Starting in V4, writers must produce
manifests in Parquet.
+
+In V1-V3, a manifest file must store the partition spec and other metadata as
properties in the file’s key-value metadata:
+
+=== "v1 - v3"
+ | v1 | v2 and v3 | Key | Value |
+ |------------|------------|---------------------|---------------------------------------------------------------------------------------------------------------------------------------------|
+ | _required_ | _required_ | `schema` | JSON representation of the table schema at the time the manifest was written |
+ | _optional_ | _required_ | `schema-id` | ID of the schema used to write the manifest as a string |
+ | _required_ | _required_ | `partition-spec` | JSON representation of only the partition fields array of the partition spec used to write the manifest. See [Appendix C](#partition-specs) |
+ | _optional_ | _required_ | `partition-spec-id` | ID of the partition spec used to write the manifest as a string |
+ | _optional_ | _required_ | `format-version` | Table format version number of the manifest as a string |
+ | | _required_ | `content` | Type of content files tracked by the manifest: "data" or "deletes" |
+
+=== "v4"
+ | Write | Read | Key | Value |
+ |------------|------------|---------------------|---------------------------------------------------------------------------------------------------------------------------------------------|
+ | _optional_ | _optional_ | `schema-id` | ID of the schema used to write the manifest as a string |
+ | _optional_ | _optional_ | `partition-spec-id` | ID of the partition spec used to write the manifest as a string |
Review Comment:
Not sure this one makes sense now? Entries should all have a spec, but I'm
not sure it makes sense to have a global spec ID for the manifest anymore.
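
To make the concern concrete, here is a minimal sketch, not taken from the spec or the Iceberg codebase. `Entry`, `spec_id`, `group_by_spec`, and `manifest_level_spec_id` are hypothetical names, and the sketch assumes each V4 entry carries its own partition spec ID. Once a manifest mixes specs, no single manifest-level value can describe it:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    # Hypothetical, simplified manifest entry; not the spec's actual schema.
    file_path: str
    spec_id: int  # assumption: every entry records its own partition spec ID


def group_by_spec(entries):
    """Group entry file paths by the partition spec ID each entry carries."""
    groups = defaultdict(list)
    for entry in entries:
        groups[entry.spec_id].append(entry.file_path)
    return dict(groups)


def manifest_level_spec_id(entries):
    """Return the spec ID shared by all entries, or None if mixed or empty."""
    ids = {entry.spec_id for entry in entries}
    return ids.pop() if len(ids) == 1 else None


entries = [
    Entry("a.parquet", 0),
    Entry("b.parquet", 1),
    Entry("c.parquet", 1),
]
```

With the mixed `entries` above, `manifest_level_spec_id` has nothing meaningful to return, while `group_by_spec` still recovers the per-spec file sets that planning would need.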
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]