Hi dev@parquet,

The Logical Type Specification [1] has the following to say about duplicate keys:

> If there are multiple key-value pairs for the same key, then the final
> value for that key must be the last value. Other values may be ignored or
> may be added with replacement to the map container in the order that they
> are encoded. The MAP annotation should not be used to encode multi-maps
> using duplicate keys.

I was wondering if anybody was aware of systems that use this in practice (i.e. write out duplicate keys and rely on the reader to deduplicate them).

Thanks,
Micah

[1] https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#maps
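
For concreteness, the reader-side rule quoted from the specification could be sketched roughly as follows. This is only an illustration of "last value wins" under my reading of the text: the function name `dedupe_map_entries` and the list-of-tuples input are hypothetical and do not correspond to any Parquet library API.

```python
from typing import Dict, Hashable, Iterable, Tuple, TypeVar

K = TypeVar("K", bound=Hashable)
V = TypeVar("V")


def dedupe_map_entries(entries: Iterable[Tuple[K, V]]) -> Dict[K, V]:
    """Collapse key-value pairs so that the last value for each key wins.

    Hypothetical helper: `entries` stands in for the key-value pairs of one
    MAP value, in the order they were encoded.
    """
    result: Dict[K, V] = {}
    for key, value in entries:
        # "Added with replacement ... in the order that they are encoded":
        # a later pair for the same key overwrites the earlier one.
        result[key] = value
    return result


# A writer emitted "a" twice; the reader keeps only the last value for "a".
entries = [("a", 1), ("b", 2), ("a", 3)]
print(dedupe_map_entries(entries))  # {'a': 3, 'b': 2}
```

A reader that instead ignores the earlier values (the other behavior the specification permits) would produce the same final map, since only the last value per key is required to survive.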