JFinis commented on code in PR #469:
URL: https://github.com/apache/parquet-format/pull/469#discussion_r1827580560


##########
LogicalTypes.md:
##########
@@ -709,13 +709,18 @@ to values. `MAP` must annotate a 3-level structure:
 
 * The outer-most level must be a group annotated with `MAP` that contains a
   single field named `key_value`. The repetition of this level must be either
-  `optional` or `required` and determines whether the list is nullable.
+  `optional` or `required` and determines whether the map is nullable.
 * The middle level, named `key_value`, must be a repeated group with a `key`
-  field for map keys and, optionally, a `value` field for map values.
+  field for map keys and, optionally, a `value` field for map values. It must
+  not contain any other values.
 * The `key` field encodes the map's key type. This field must have
-  repetition `required` and must always be present.
+  repetition `required` and must always be present. It must be placed at the
+  0th position of the `key_value` group. It is suggested to use a primitive as
+  the type.

Review Comment:
   This suggestion is weird IMHO. First, a spec "suggesting" something is weird 
in general, either something should be allowed or forbidden. A suggestion might 
be okay, if the rationale of the suggestion is clearly stated, so an 
implementor can decide wether that rationale applies to them.
   
   So either add a good reason why this is suggested, or drop the suggestion.
   
   IMHO, I would drop it. The fact that most implementations don't support a 
non-primitive type here seems to be a pure limitation of the implementations. 
There is no logical reason why the type couldn't be nested. I see that 
sometimes composite key types can make sense (e.g., a map of name + birth date 
to identify an individual).
   
   I always find it weird when specifications are bound to implementations 
rather the other way round. I know that Parquet already does that in a few 
places where it mentions that some things are not well supported by 
implementations (IIRC, it was the v2 encodings or v2 data page?). Even in the 
existing places, I find this to be an anti-pattern. Implementations change over 
time. Such a sentence might very well be outdated.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to