nevi-me commented on issue #1642: URL: https://github.com/apache/arrow-rs/issues/1642#issuecomment-1159366876
@tustvold here's Arrow's spec: https://github.com/apache/arrow/blob/master/format/Schema.fbs#L103-L131

````rust
/// A Map is a logical nested type that is represented as
///
/// List<entries: Struct<key: K, value: V>>
///
/// In this layout, the keys and values are each respectively contiguous. We do
/// not constrain the key and value types, so the application is responsible
/// for ensuring that the keys are hashable and unique. Whether the keys are sorted
/// may be set in the metadata for this field.
///
/// In a field with Map type, the field has a child Struct field, which then
/// has two children: key type and the second the value type. The names of the
/// child fields may be respectively "entries", "key", and "value", but this is
/// not enforced.
///
/// Map
/// ```text
///   - child[0] entries: Struct
///     - child[0] key: K
///     - child[1] value: V
/// ```
/// Neither the "entries" field nor the "key" field may be nullable.
///
/// The metadata is structured so that Arrow systems without special handling
/// for Map can make Map an alias for List. The "layout" attribute for the Map
/// field must have the same contents as a List.
table Map {
  /// Set to true if the keys within each value are sorted
  keysSorted: bool;
}
````

Parquet seems to allow both `HashMap` and `HashSet`, while I interpret `Neither the "entries" field nor the "key" field may be nullable.` to mean that Arrow's `Map` requires both keys and values.

____

@frolovdev I suppose a solution is to check whether a map has both a key and a value, and otherwise fall back to parsing it as a list.

I think

```
message table {
  required group map (MAP) {
    repeated group key_value {
      REQUIRED BYTE_ARRAY key;
    }
  }
}
```

would then be read in as `list[map]<Binary[key]>`
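
The fallback suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions: `KeyValueGroup`, `Interpretation`, and `interpret` are hypothetical names invented for this example and are not part of the `parquet` or `arrow` crates, and the real schema conversion works on much richer type information than a list of child names.

```rust
/// Hypothetical, simplified view of the repeated group inside a
/// MAP-annotated Parquet field. Not a type from the `parquet` crate.
#[derive(Debug, Clone, PartialEq)]
struct KeyValueGroup {
    /// Names of the child fields of the repeated group, in schema order.
    children: Vec<String>,
}

/// Hypothetical result of deciding how to read the group into Arrow.
#[derive(Debug, PartialEq)]
enum Interpretation {
    /// Both key and value are present: read as an Arrow `Map`.
    Map { key: String, value: String },
    /// Only a key is present: fall back to a list of the key type.
    List { item: String },
    /// Anything else is left unhandled in this sketch.
    Unsupported,
}

/// Pick `Map` when the group has exactly a key and a value,
/// and fall back to `List` when it has only a key.
fn interpret(group: &KeyValueGroup) -> Interpretation {
    match group.children.as_slice() {
        [key, value] => Interpretation::Map {
            key: key.clone(),
            value: value.clone(),
        },
        [key] => Interpretation::List { item: key.clone() },
        _ => Interpretation::Unsupported,
    }
}
```

With this sketch, a group whose only child is `key` (as in the schema above) yields `Interpretation::List`, while a group with `key` and `value` yields `Interpretation::Map`.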
