gene-db commented on code in PR #460:
URL: https://github.com/apache/parquet-format/pull/460#discussion_r1809672976


##########
LogicalTypes.md:
##########
@@ -563,6 +563,23 @@ defined by the [BSON specification][bson-spec].
 
 The sort order used for `BSON` is unsigned byte-wise comparison.
 
+### VARIANT
+
+`VARIANT` is used for a Variant value. It must annotate a group. The group must
+contain a `binary` field named `metadata`, and a `binary` field named `value`.
+The `VARIANT` annotated group can be used to store either an unshredded Variant
+value, or a shredded Variant value.
+
+* The top level must be a group annotated with `VARIANT` that contains a
+  `binary` field named `metadata`, and a `binary` field named `value`.
+* Additional fields which start with `_` (underscore) can be ignored.

Review Comment:
   This was desired in case there were some additional (but redundant) metadata 
or values we might store, and still allow it to be a valid Variant value 
(group).



##########
LogicalTypes.md:
##########
@@ -563,6 +563,23 @@ defined by the [BSON specification][bson-spec].
 
 The sort order used for `BSON` is unsigned byte-wise comparison.
 
+### VARIANT
+
+`VARIANT` is used for a Variant value. It must annotate a group. The group must
+contain a `binary` field named `metadata`, and a `binary` field named `value`.
+The `VARIANT` annotated group can be used to store either an unshredded Variant
+value, or a shredded Variant value.
+
+* The top level must be a group annotated with `VARIANT` that contains a
+  `binary` field named `metadata`, and a `binary` field named `value`.
+* Additional fields which start with `_` (underscore) can be ignored.
+* If `metadata` and `value` are the only fields in the group, then the group
+  is an unshredded Variant value. The `metadata` and `value` fields are
+  interpreted as an encoded Variant value as defined by the
+  [Variant binary encoding specification](VariantEncoding.md).
+* If the group contains additional fields, it is a shredded Variant, and must
+  adhere to the scheme detailed in the [Variant shredding 
specification](VariantShredding.md).

Review Comment:
   I figured the Variant shredding doc already has that disclaimer. I think 
this statement of having to follow the shredding spec will still be accurate, 
regardless of how we change the shredding spec, right?



##########
LogicalTypes.md:
##########
@@ -563,6 +563,23 @@ defined by the [BSON specification][bson-spec].
 
 The sort order used for `BSON` is unsigned byte-wise comparison.
 
+### VARIANT
+
+`VARIANT` is used for a Variant value. It must annotate a group. The group must
+contain a `binary` field named `metadata`, and a `binary` field named `value`.
+The `VARIANT` annotated group can be used to store either an unshredded Variant
+value, or a shredded Variant value.
+
+* The top level must be a group annotated with `VARIANT` that contains a
+  `binary` field named `metadata`, and a `binary` field named `value`.

Review Comment:
   I folded in those changes.
   
   One thing I had a question about is that for an unshredded variant, the 
`value` should be required. For unshredded, it can be optional. Is there any 
issue with that?



##########
VariantEncoding.md:
##########
@@ -42,7 +42,7 @@ This document describes the Variant Binary Encoding scheme.
 [VariantShredding.md](VariantShredding.md) describes the details of the 
Variant shredding scheme.
 
 # Variant in Parquet
-A Variant value in Parquet is represented by a group with 2 fields, named 
`value` and `metadata`.
+A Variant value in Parquet is represented by a group annotated with `VARIANT`, 
with 2 fields, named `value` and `metadata`.

Review Comment:
   Sure, I reverted the spec changes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to