wgtmac commented on PR #466:
URL: https://github.com/apache/parquet-format/pull/466#issuecomment-2449039626
There is another issue at https://github.com/apache/parquet-format/blob/master/LogicalTypes.md?plain=1#L696-L712.

> `MAP` is used to annotate types that should be interpreted as a map from keys
> to values. `MAP` must annotate a 3-level structure:
>
> ```
> <map-repetition> group <name> (MAP) {
>   repeated group key_value {
>     required <key-type> key;
>     <value-repetition> <value-type> value;
>   }
> }
> ```
>
> The outer-most level must be a group annotated with `MAP` that contains a single field named `key_value`. The repetition of this level must be either `optional` or `required` and determines whether the list is nullable.

Does this mean that `MAP` cannot be nested in a two-level encoded `LIST`? If so, the first schema below is valid and the second is invalid. What should a reader implementation do when reading a Parquet file with the latter schema? @emkornfield @gszadovszky

```
List<Map<String, String>> in three-level list encoding:
optional group my_list (LIST) {
  repeated group list {
    required group element (MAP) {
      repeated group key_value {
        required binary key (STRING);
        optional binary value (STRING);
      }
    }
  }
}

List<Map<String, String>> in two-level list encoding:
optional group my_list (LIST) {
  repeated group array (MAP) {
    repeated group key_value {
      required binary key (STRING);
      optional binary value (STRING);
    }
  }
}
```
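
To make the question concrete, here is a minimal Python sketch that applies the four `LIST` backward-compatibility rules, as I paraphrase them from LogicalTypes.md, to the two schemas above. The `Node` class and both helper functions are hypothetical and exist only for illustration; they are not part of any Parquet library, and the reading of rule 4 as "the single child is the element" is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical, minimal stand-in for a Parquet schema node."""
    name: str
    repetition: str                           # "required", "optional" or "repeated"
    annotation: Optional[str] = None          # e.g. "LIST", "MAP", "STRING"
    children: Optional[List["Node"]] = None   # None means a primitive field

    @property
    def is_group(self) -> bool:
        return self.children is not None


def resolve_list_element(list_group: Node) -> Node:
    """Return the node treated as the list element (paraphrased rules 1-4)."""
    assert list_group.annotation == "LIST" and list_group.is_group
    assert len(list_group.children) == 1
    repeated = list_group.children[0]
    assert repeated.repetition == "repeated"
    if not repeated.is_group:                  # rule 1: repeated primitive
        return repeated
    if len(repeated.children) > 1:             # rule 2: group with multiple fields
        return repeated
    if repeated.name in ("array", list_group.name + "_tuple"):
        return repeated                        # rule 3: legacy two-level names
    return repeated.children[0]                # rule 4 (assumed): three-level


def map_outer_repetition_ok(map_group: Node) -> bool:
    """Check the quoted MAP requirement: outer group is optional or required."""
    return map_group.annotation == "MAP" and map_group.repetition in ("optional", "required")


def key_value() -> Node:
    return Node("key_value", "repeated", children=[
        Node("key", "required", "STRING"),
        Node("value", "optional", "STRING"),
    ])


three_level = Node("my_list", "optional", "LIST", [
    Node("list", "repeated", children=[
        Node("element", "required", "MAP", [key_value()]),
    ]),
])

two_level = Node("my_list", "optional", "LIST", [
    Node("array", "repeated", "MAP", [key_value()]),
])

for label, schema in (("three-level", three_level), ("two-level", two_level)):
    element = resolve_list_element(schema)
    print(label, element.name, element.repetition, map_outer_repetition_ok(element))
```

Under these assumptions, the three-level schema resolves to the `required` group `element`, which satisfies the quoted `MAP` text. The two-level schema resolves via rule 3 to the `repeated` group `array`, whose repetition is neither `optional` nor `required`. That is exactly the conflict I am asking about.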
