This is an automated email from the ASF dual-hosted git repository.

fokko pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/iceberg.git


The following commit(s) were added to refs/heads/main by this push:
     new 5821efcdd5 Spec: Clarify missing fields when writing (#8672)
5821efcdd5 is described below

commit 5821efcdd521fa4d0f244500d3edb5e1c9e06311
Author: Fokko Driesprong <[email protected]>
AuthorDate: Fri Apr 26 08:50:30 2024 +0200

    Spec: Clarify missing fields when writing (#8672)
    
    * Spec: Clarify missing fields when writing
    
    Jan raised a point on Slack about the semantic meaning of a field
    that can be written:
    
    https://apache-iceberg.slack.com/archives/C03LG1D563F/p1695834739711569
    
    There are two options:
    
    - The field is not part of the schema, and omitted from the file
    - The field is part of the schema, but the value is not written (nullable)
    
    My personal take on this is that we should use static schemas when
    writing Avro files, so that all the fields that are either optional or
    required are in the schema.
    
    I'm well aware that this doesn't cause any issues if you dogfood
    the Iceberg Avro reader, where you can add required fields, for example
    the `134: content` field in the manifest.
    
    However, I think we should try to stick to the concept of write strict,
    read permissive, where we encourage people to write all the fields
    that are in the spec (even if the value itself is all null).
    
    * Add manifest-list explicitly
    
    Co-authored-by: JFinis <[email protected]>
    
    * Update wording
    
    * Comments
    
    * Retain formatting
    
    * Thanks Steven
    
    ---------
    
    Co-authored-by: JFinis <[email protected]>
---
 format/spec.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/format/spec.md b/format/spec.md
index aa905e7032..b00c63256a 100644
--- a/format/spec.md
+++ b/format/spec.md
@@ -127,12 +127,12 @@ Tables do not require rename, except for tables that use atomic rename to implem
 
 #### Writer requirements
 
-Some tables in this spec have columns that specify requirements for v1 and v2 tables. These requirements are intended for writers when adding metadata files to a table with the given version.
+Some tables in this spec have columns that specify requirements for v1 and v2 tables. These requirements are intended for writers when adding metadata files (including manifests files and manifest lists) to a table with the given version.
 
 | Requirement | Write behavior |
 |-------------|----------------|
 | (blank)     | The field should be omitted |
-| _optional_  | The field can be written |
+| _optional_  | The field can be written or omitted |
 | _required_  | The field must be written |
 
 Readers should be more permissive because v1 metadata files are allowed in v2 tables so that tables can be upgraded to v2 without rewriting the metadata tree. For manifest list and manifest files, this table shows the expected v2 read behavior:

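The writer requirements table in this change can be read as a small rule set. The following is a minimal sketch of that reading, not code from Iceberg: the `Requirement` enum, the `check_written_fields` helper, and the field names in the usage note are all hypothetical and exist only for illustration. It treats a field as "written" when its name appears in the set of fields present in the file schema, which is an assumption about how one might model the table.

```python
from enum import Enum


class Requirement(Enum):
    """Hypothetical labels for the three rows of the writer requirements table."""
    BLANK = "(blank)"        # the field should be omitted
    OPTIONAL = "optional"    # the field can be written or omitted
    REQUIRED = "required"    # the field must be written


def check_written_fields(requirements, written):
    """Return a list of messages for fields that do not follow the table.

    requirements: dict mapping field name -> Requirement (illustrative input)
    written: set of field names present in the written file schema
    """
    problems = []
    for name, requirement in requirements.items():
        present = name in written
        if requirement is Requirement.REQUIRED and not present:
            problems.append(f"{name}: required field must be written")
        elif requirement is Requirement.BLANK and present:
            problems.append(f"{name}: field should be omitted")
        # OPTIONAL fields are acceptable whether written or omitted.
    return problems
```

For example, with the made-up mapping `{"a": Requirement.REQUIRED, "b": Requirement.OPTIONAL, "c": Requirement.BLANK}`, a writer that emits only `a` passes the check with or without `b`, while a writer that emits `b` and `c` but not `a` is flagged twice.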