rdblue commented on code in PR #16446:
URL: https://github.com/apache/iceberg/pull/16446#discussion_r3284351352


##########
format/spec.md:
##########
@@ -514,12 +514,16 @@ Partition field IDs must be reused if an existing partition spec contains an equ
 | **`truncate[W]`** | Value truncated to width `W` (see below) | `int`, `long`, `decimal`, `string`, `binary` | Source type |
 | **`year`** | Extract a date or timestamp year, as years from 1970 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
 | **`month`** | Extract a date or timestamp month, as months from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
-| **`day`** | Extract a date or timestamp day, as days from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
+| **`day`** | Extract a date or timestamp day, as days from 1970-01-01 | `date`, `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `date` [1] |
 | **`hour`** | Extract a timestamp hour, as hours from 1970-01-01 00:00:00 | `timestamp`, `timestamptz`, `timestamp_ns`, `timestamptz_ns` | `int` |
 | **`void`** | Always produces `null` | Any | Source type or `int` |
 
 All transforms must return `null` for a `null` input value.
 
+Notes:
+
+1. The result type for `day` has been documented as both `int` and `date` in earlier revisions of this spec. The physical representation has always been a 4-byte integer counting days from `1970-01-01`, regardless of whether the Avro field is annotated with `logicalType: date`. Readers may encounter manifests in either form; per the Avro specification, unrecognized logical type annotations are ignored, so the bytes on disk are identical.

Review Comment:
   I don't agree with the conclusion "so the bytes on disk are identical". 
The statement itself is true, but it is not a consequence of the Avro spec's 
requirement. How about "Writers have produced fields with the 
`logicalType: date` annotation because Avro requires that readers ignore 
unknown logical types. The bytes on disk are identical, so readers must accept 
both forms and can choose how to pass on the data."
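
   To make the suggested wording concrete, here is a minimal Python sketch of a 
reader that accepts both forms. It is an illustration only, not code from any 
Iceberg implementation: `day_transform`, `read_day_partition`, and the 
`as_date` flag are hypothetical names, and the two schema dicts are assumed 
shapes for a plain Avro `int` field and one annotated with 
`logicalType: date`.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH = date(1970, 1, 1)

# Assumed shapes of the two Avro field types a reader may encounter.
PLAIN_INT = "int"
ANNOTATED_DATE = {"type": "int", "logicalType": "date"}


def day_transform(value):
    """Hypothetical helper: days from 1970-01-01 for a date or datetime."""
    if value is None:
        return None  # all transforms return null for a null input
    if isinstance(value, datetime):
        if value.tzinfo is not None:
            value = value.astimezone(timezone.utc)  # normalize to UTC first
        value = value.date()
    return (value - EPOCH).days


def read_day_partition(field_type, stored_days, as_date=False):
    """Hypothetical reader: accept either form, return int or date."""
    is_plain = field_type == PLAIN_INT
    is_annotated = (
        isinstance(field_type, dict)
        and field_type.get("type") == "int"
        and field_type.get("logicalType") == "date"
    )
    if not (is_plain or is_annotated):
        raise ValueError(f"unexpected type for day partition: {field_type!r}")
    if stored_days is None:
        return None
    # The stored integer is the same in both forms; the annotation only
    # affects how the reader chooses to surface it.
    return EPOCH + timedelta(days=stored_days) if as_date else stored_days


days = day_transform(date(2024, 3, 15))
same_int = read_day_partition(PLAIN_INT, days) == read_day_partition(ANNOTATED_DATE, days)
```

   In this sketch the annotation never changes the stored integer; it only 
informs whether the caller receives a day count or a calendar date.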



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

