gszadovszky opened a new issue, #468:
URL: https://github.com/apache/parquet-format/issues/468

   ### Describe the enhancement requested
   
   There are a couple of issues with the specification of the logical type 
[MAP](https://github.com/apache/parquet-format/blob/apache-parquet-format-2.10.0/LogicalTypes.md#maps):
   
   * typo: "(...) `optional` or `required` and determines whether the **list** 
is nullable." This should say **map**, not list.
   * According to the spec, a nested `key` column is allowed. Does that make 
sense? Most engines/libs require primitive keys.
   * It is clear that `value` is not required; however, I did not find an 
implementation that handles this properly. Do we want to suggest anything for 
this case (e.g. treat the value column as null)?
   * The [Backward-compatibility 
rules](https://github.com/apache/parquet-format/blob/apache-parquet-format-2.10.0/LogicalTypes.md#backward-compatibility-rules-1)
 suggest that `key` and `value` might not be named according to the spec, but 
do not say anything about how to identify them. I've seen multiple 
implementations (e.g. [the Avro binding of 
parquet-java](https://github.com/apache/parquet-java/blob/master/parquet-avro/src/main/java/org/apache/parquet/avro/AvroSchemaConverter.java#L446))
 that simply choose the `0`th element as key and the `1`st one as value 
without actually checking their names. This does not seem to be correct 
according to the spec.
   * The spec mentions that `MAP_KEY_VALUE` might appear in place of `MAP` but 
doesn't mention its original purpose of tagging the `key_value` level.
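
To make the key/value identification point more concrete, here is a rough sketch of one rule the spec could state: prefer the names `key` and `value` when they are present, and fall back to field position only for legacy files. This is only an illustration. `SchemaNode` and `resolve_map_fields` are hypothetical names made up for this example, not parquet-java APIs, and the name-first/position-fallback order is an assumption, not something the spec currently says.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SchemaNode:
    """Hypothetical, simplified stand-in for a Parquet schema node."""
    name: str
    repetition: str  # "required", "optional" or "repeated"
    children: List["SchemaNode"] = field(default_factory=list)  # empty for primitives


def resolve_map_fields(
    key_value: SchemaNode,
) -> Tuple[SchemaNode, Optional[SchemaNode], str]:
    """Return (key, value, how) for the repeated group inside a MAP.

    `how` is "by-name" when the fields use the spec names `key`/`value`,
    and "by-position" when we fall back to 0th = key, 1st = value.
    `value` is None when the group only carries keys.
    """
    if key_value.repetition != "repeated":
        raise ValueError("the middle level of a MAP must be a repeated group")
    if not 1 <= len(key_value.children) <= 2:
        raise ValueError("expected a key field and at most one value field")

    by_name = {child.name: child for child in key_value.children}
    has_spec_names = "key" in by_name and (
        len(key_value.children) == 1 or "value" in by_name
    )
    if has_spec_names:
        # Names match the spec, so field order does not matter.
        return by_name["key"], by_name.get("value"), "by-name"

    # Assumed legacy fallback: names are unreliable, rely on field order.
    key = key_value.children[0]
    value = key_value.children[1] if len(key_value.children) == 2 else None
    return key, value, "by-position"
```

With this rule, a group whose fields are named `value` and `key` in that order would still resolve correctly, whereas the purely positional approach would swap them. A group with a single field yields `None` for the value, which is where a suggestion for the missing-`value` case would have to plug in.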


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

