kou opened a new pull request, #43801: URL: https://github.com/apache/arrow/pull/43801
### Rationale for this change

If we can attach associated statistics to an array via `ArrayData`, we can use them in later processing such as query planning. If `ArrayData`, rather than `Array`, holds the statistics, we can also use them in compute kernels.

There was a concern that the associated `arrow::ArrayStatistics` may become outdated if `arrow::ArrayData` is mutated after `arrow::ArrayStatistics` is attached. But `arrow::ArrayData` isn't mutated after it is first populated, so `arrow::ArrayStatistics` will not become outdated. We can require mutators to take responsibility for keeping the statistics up to date.

### What changes are included in this PR?

* Embed `arrow::ArrayStatistics` into `arrow::ArrayData`, because `arrow::ArrayStatistics` is lightweight data
* Add `arrow::Array::statistics()` to get the statistics attached to `arrow::ArrayData`

This doesn't provide a new `arrow::ArrayData` constructor (`arrow::ArrayData::Make()`) that accepts `arrow::ArrayStatistics`. Instead, `arrow::ArrayData::statistics` can be set after `arrow::ArrayData` is created.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes. `arrow::Array::statistics()` is a new public API.
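To make the design in this description concrete, here is a minimal, self-contained sketch of the idea: statistics live in the data object, the array wrapper only forwards to them, and a later consumer such as a planner reads them without touching the values. Everything in the `sketch` namespace is a hypothetical, simplified stand-in written for illustration. It is not the real `arrow::ArrayStatistics`, `arrow::ArrayData`, or `arrow::Array`, and the field names (`null_count`, `min`, `max`) and the `CanSkipScan()` helper are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>
#include <utility>

// Hypothetical stand-ins for illustration only; NOT the real Arrow classes.
namespace sketch {

// Lightweight statistics: every field is optional because it may be unknown.
struct ArrayStatistics {
  std::optional<int64_t> null_count;
  std::optional<int64_t> min;
  std::optional<int64_t> max;
};

// The data object owns the statistics, so anything that only sees the data
// (for example a compute kernel) can still read them.
struct ArrayData {
  int64_t length = 0;
  ArrayStatistics statistics;  // embedded directly; assumed cheap to copy
};

// The array wrapper exposes the statistics stored in its data.
class Array {
 public:
  explicit Array(std::shared_ptr<ArrayData> data) : data_(std::move(data)) {}

  const ArrayStatistics& statistics() const { return data_->statistics; }

 private:
  std::shared_ptr<ArrayData> data_;
};

// No constructor takes statistics; they are set after the data is created.
inline Array MakeExampleArray() {
  auto data = std::make_shared<ArrayData>();
  data->length = 3;
  data->statistics.null_count = 0;
  data->statistics.min = 1;
  data->statistics.max = 3;
  return Array(std::move(data));
}

// A planner-style check: if the maximum is known and below the threshold,
// no value can satisfy `value >= threshold`, so the scan can be skipped.
inline bool CanSkipScan(const Array& array, int64_t threshold) {
  const ArrayStatistics& stats = array.statistics();
  return stats.max.has_value() && *stats.max < threshold;
}

// Small usage example exercising the pieces above.
inline void Demo() {
  Array array = MakeExampleArray();
  assert(CanSkipScan(array, 10));  // max (3) < 10, so nothing can match
  assert(!CanSkipScan(array, 2));  // max (3) >= 2, so a scan is still needed
}

}  // namespace sketch
```

In this sketch, any code that mutated `ArrayData` after population would also have to update or clear `statistics`, which mirrors the "mutators take responsibility" point in the rationale.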
