scovich commented on code in PR #520:
URL: https://github.com/apache/parquet-format/pull/520#discussion_r2307492178


##########
VariantShredding.md:
##########
@@ -110,18 +119,22 @@ Shredded values must use the following Parquet types:
 
 Primitive values can be shredded using the equivalent Parquet primitive type 
from the table above for `typed_value`.
 
-Unless the value is shredded as an object (see [Objects](#objects)), 
`typed_value` or `value` (but not both) must be non-null.
+Unless the value is shredded as an object (see [Objects](#objects)), exactly 
one of `typed_value` or `value` must be present in each row.
+
+NULL values in an optional group are encoded in the usual way, with a 
definition level that excludes
+that group. For example, if `measurement` above were a top-level column in the 
parquet file, NULL
+measurements would have definition level 0.
 
 #### Arrays
 
 Arrays can be shredded by using a 3-level Parquet list for `typed_value`.
 
-If the value is not an array, `typed_value` must be null.
-If the value is an array, `value` must be null.
+If the value is not an array, `typed_value` must be NULL.
+If the value is an array, `value` must be NULL.
 
 The list `element` must be a required group.
 The `element` group can contain `value` and `typed_value` fields.
-The element's `value` field stores the element as Variant-encoded `binary` 
when the `typed_value` is not present or cannot represent it.
+The element's `value` field stores the element as Variant-encoded `binary` 
when the `typed_value` field is missing or cannot represent it.

Review Comment:
   The spec seems to have quite a collection of similarly-named concepts:
   * [parquet] null vs. [Variant] null
   * [parquet schema] field [in a group] vs. [Variant object] field
   * [parquet schema] field vs. [parquet data] column
   * missing [field of Variant object] vs. missing [column of parquet file]
   
   Not sure the best way to clearly delineate them all without getting wordy...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to