yyanyy opened a new pull request #1963: URL: https://github.com/apache/iceberg/pull/1963
This change is a smaller PR split out from #1935. It adds a field id to the constructors of the Avro primitive value writers, makes these writers track stats such as value count, min, and max, and exposes a `metrics` method that can be called to collect `FieldMetrics`. Nothing calls this method yet. This change doesn't include any tests; they will come in the next PR, once the end-to-end integration is set up.

Please note: regarding the change to the signature of `FieldMetrics`, the alternative would be to keep `ByteBuffer` as the return type for the lower/upper bounds of `FieldMetrics` and pass each field's metrics mode into its leaf value writer during construction, so that truncation and conversion to byte buffer could happen when metrics are collected from these writers. I think it's doable, but it would touch a lot of method signatures, including adding the metrics mode to the constructor of every leaf writer and adding the metrics config to every datum writer (e.g. `DataWriter`, `GenericAppenderFactory`). It would, however, let us skip computing min/max for fields that don't need them. Please let me know if you are interested, and I'll post a new commit to this PR so that the two implementations can be compared.
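To make the shape of the change easier to picture, here is a minimal sketch of a leaf writer that takes a field id in its constructor, tracks value count, min, and max as values are written, and exposes a `metrics` method. It is not the code in this PR: `TrackingValueWriter` and `SimpleFieldMetrics` are hypothetical names, the in-memory list stands in for the Avro encoder, and keeping the bounds as typed values rather than `ByteBuffer` is only my reading of the signature change described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a per-field metrics holder; NOT Iceberg's actual FieldMetrics class.
final class SimpleFieldMetrics<T> {
  final int fieldId;
  final long valueCount;
  final T lowerBound; // null when no value has been written
  final T upperBound; // null when no value has been written

  SimpleFieldMetrics(int fieldId, long valueCount, T lowerBound, T upperBound) {
    this.fieldId = fieldId;
    this.valueCount = valueCount;
    this.lowerBound = lowerBound;
    this.upperBound = upperBound;
  }
}

// Hypothetical leaf value writer: the field id is supplied at construction,
// and stats are updated on every write.
final class TrackingValueWriter<T extends Comparable<T>> {
  private final int fieldId;
  private final List<T> sink = new ArrayList<>(); // assumption: stands in for the Avro encoder
  private long valueCount = 0;
  private T min = null;
  private T max = null;

  TrackingValueWriter(int fieldId) {
    this.fieldId = fieldId;
  }

  void write(T value) {
    sink.add(value);
    valueCount++;
    // Bounds are kept as typed values; truncation or conversion to bytes
    // would be left to whoever consumes the metrics.
    if (min == null || value.compareTo(min) < 0) {
      min = value;
    }
    if (max == null || value.compareTo(max) > 0) {
      max = value;
    }
  }

  // Returns a snapshot of the stats collected so far.
  SimpleFieldMetrics<T> metrics() {
    return new SimpleFieldMetrics<>(fieldId, valueCount, min, max);
  }
}
```

In this sketch the writer always pays for the min/max comparison, which is the cost the alternative design would avoid by passing a metrics mode to each leaf writer at construction.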
