Hi dev@parquet,
The Logical Type Specification [1] says the following about duplicate
keys:

> If there are multiple key-value pairs for the same key, then the final
> value for that key must be the last value. Other values may be ignored or
> may be added with replacement to the map container in the order that they
> are encoded. The MAP annotation should not be used to encode multi-maps
> using duplicate keys.


I was wondering whether anyone is aware of systems that rely on this in
practice (i.e., that write out duplicate keys and rely on the reader to
deduplicate them).
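
To make the question concrete, here is a minimal sketch of the reader-side
behaviour I understand the spec to describe. It is illustrative only and not
taken from any Parquet implementation: the function name read_map and the
sample data are hypothetical, and the key-value pairs are assumed to arrive
in the order they were encoded.

```python
def read_map(key_value_pairs):
    """Build a map from key-value pairs given in encoded order.

    Each pair is added with replacement, so when a key appears more than
    once, the value from its last occurrence is the one that survives.
    """
    result = {}
    for key, value in key_value_pairs:
        # Assigning to an existing key overwrites the earlier value.
        result[key] = value
    return result


# Hypothetical encoded data with a duplicate key "a".
encoded = [("a", 1), ("b", 2), ("a", 3)]
print(read_map(encoded))  # {'a': 3, 'b': 2}
```

A reader that instead kept the first value for "a", or exposed both, would
not match the quoted text as I read it.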

Thanks,
Micah


[1]
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#maps
