RussellSpitzer commented on code in PR #16025:
URL: https://github.com/apache/iceberg/pull/16025#discussion_r3227391903


##########
format/spec.md:
##########
@@ -75,9 +75,9 @@ This table format tracks individual data files in a table 
instead of directories
 
 Table state is maintained in metadata files. All changes to table state create 
a new metadata file and replace the old metadata with an atomic swap. The table 
metadata file tracks the table schema, partitioning config, custom properties, 
and snapshots of the table contents. A snapshot represents the state of a table 
at some time and is used to access the complete set of data files in the table.
 
-Data files in snapshots are tracked by one or more manifest files that contain 
a row for each data file in the table, the file's partition data, and its 
metrics. The data in a snapshot is the union of all files in its manifests. 
Manifest files are reused across snapshots to avoid rewriting metadata that is 
slow-changing. Manifests can track data files with any subset of a table and 
are not associated with partitions.
+Data files in snapshots are tracked by one or more manifest files that contain 
a row for each data file in the table, the file's partition data, and its 
metrics. The data in a snapshot is the union of all files in its manifests. 
Manifest files are reused across snapshots to avoid rewriting metadata that is 
slow-changing. Data manifests and delete manifests can track files with any 
subset of a table and are not associated with partitions.
 
-The manifests that make up a snapshot are stored in a manifest list file. Each 
manifest list stores metadata about manifests, including partition stats and 
data file counts. These stats are used to avoid reading manifests that are not 
required for an operation.
+In V1-V3, the manifests that make up a snapshot are stored in a manifest list 
file. Each manifest list stores metadata about manifests, including partition 
stats and data file counts. These stats are used to avoid reading manifests 
that are not required for an operation. In V4, manifest lists are replaced by a 
single root manifest per snapshot, which can contain references to data files, 
delete files, and other data and delete manifests in a unified structure.

Review Comment:
   I thought that, technically, we don't allow a reference to a delete file in 
the Root Manifest or in any V4 manifest, except in V4 delete manifests for 
equality deletes. Shouldn't it always be a coupled entry: a DV with a DataFile, 
or a DV with a Manifest?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


