rdblue commented on code in PR #14234:
URL: https://github.com/apache/iceberg/pull/14234#discussion_r3197679037


##########
format/spec.md:
##########
@@ -707,6 +714,131 @@ For `geography` only, xmin (X value of `lower_bounds`) may be greater than xmax
 
 When calculating upper and lower bounds for `geometry` and `geography`, null or NaN values in a coordinate dimension are skipped; for example, POINT (1 NaN) contributes a value to X but no values to Y, Z, or M dimension bounds. If a dimension has only null or NaN values, that dimension is omitted from the bounding box. If either the X or Y dimension is missing then the bounding box itself is not produced.
 
+##### Content Stats
+
+In Iceberg v4 stats have been redesigned and are represented by using nested structs (`struct<struct<...>>`). The statistics for fields are tracked inside a nested struct of value counts and bounds (described in the next section). Each field-level statistics struct is a field of the `content_stats` struct, which holds all statistics for table fields.

Review Comment:
   ```suggestion
   In Iceberg v4, column metrics for a data field are typed fields in a metrics struct that corresponds to the field. These stats structs are nested within the `content_stats` struct in manifest files.
   ```
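
As a rough illustration of the nesting described in the suggestion, here is a minimal sketch. It is not taken from the spec: the class names, the metric names `value_count`, `lower_bound`, and `upper_bound`, and the use of a dict keyed by field ID in place of one struct field per table field are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FieldStats:
    """Hypothetical per-field metrics struct; metric names are assumed."""
    value_count: Optional[int] = None
    lower_bound: Optional[int] = None
    upper_bound: Optional[int] = None


@dataclass
class ContentStats:
    """Hypothetical stand-in for `content_stats`.

    A dict keyed by table field ID approximates "one nested struct per
    table field"; a real struct would carry one typed field per column.
    """
    by_field_id: Dict[int, FieldStats] = field(default_factory=dict)


# One data field (ID 1) with typed metrics nested under content_stats.
stats = ContentStats({1: FieldStats(value_count=100, lower_bound=3, upper_bound=42)})
```

Because each metric is a typed attribute of the per-field struct, a reader can access `stats.by_field_id[1].upper_bound` directly rather than decoding a serialized bound.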



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

