edgarRd opened a new issue #767: Clarify / Document metrics contract
URL: https://github.com/apache/incubator-iceberg/issues/767

The metrics contract is unclear from the implementation. It is not defined in the spec, and Parquet is the only format with fully implemented metrics. While working on ORC metrics, I have found it hard to tell what contract is expected, since file formats seem to implement this differently. For instance:

* `Map<Integer, Long> valueCounts()` - it's not clear whether the counts include null or repeated values. Judging by `TestMetrics`, value counts *include null and repeated values*, which would make them pretty much the same as the row count, except for nested structures (e.g. lists, maps). However, this is not defined.

This issue is to track the discussion about the expected metrics contract and get a clear definition.
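To make the `valueCounts()` question concrete, here is a minimal, self-contained sketch of the interpretation that `TestMetrics` appears to imply: nulls and repeated values are counted, so top-level columns match the row count and only nested elements differ. It does not use the Iceberg API; the class `ValueCountsSketch`, the `Row` type, the field IDs, and the sample data are all hypothetical, and the handling of null lists is deliberately left out because the issue does not say how they should be counted.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: not Iceberg code. Models one possible reading of valueCounts().
class ValueCountsSketch {
  // Hypothetical field IDs for a schema like: 1: id (int), 2: tags (list<string>), 3: tags.element
  static final int ID_FIELD = 1;
  static final int TAGS_ELEMENT_FIELD = 3;

  // Hypothetical row type; tags is assumed to be non-null in this sketch.
  static final class Row {
    final Integer id;
    final List<String> tags;

    Row(Integer id, List<String> tags) {
      this.id = id;
      this.tags = tags;
    }
  }

  // Counts every value per field ID, including nulls and repeated values.
  static Map<Integer, Long> valueCounts(List<Row> rows) {
    Map<Integer, Long> counts = new HashMap<>();
    counts.put(ID_FIELD, 0L);
    counts.put(TAGS_ELEMENT_FIELD, 0L);
    for (Row row : rows) {
      // A top-level column contributes one value per row, even when that value is null.
      counts.merge(ID_FIELD, 1L, Long::sum);
      // A list element column contributes one value per element, nulls and duplicates included.
      counts.merge(TAGS_ELEMENT_FIELD, (long) row.tags.size(), Long::sum);
    }
    return counts;
  }

  static List<Row> sampleRows() {
    return Arrays.asList(
        new Row(1, Arrays.asList("a", "a", null)), // repeated value and a null element
        new Row(null, Arrays.asList("b")),         // null top-level value
        new Row(3, Collections.<String>emptyList()));
  }

  public static void main(String[] args) {
    List<Row> rows = sampleRows();
    Map<Integer, Long> counts = valueCounts(rows);
    System.out.println("row count            = " + rows.size());                     // 3
    System.out.println("valueCounts[id]      = " + counts.get(ID_FIELD));            // 3
    System.out.println("valueCounts[element] = " + counts.get(TAGS_ELEMENT_FIELD));  // 4
  }
}
```

Under this reading, the `id` count equals the row count (3) even though one `id` is null, while the list element count (4) differs from it. A non-null-only reading would instead give 2 and 3, which is the ambiguity this issue asks to resolve.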