cjc0013 opened a new pull request, #50071: URL: https://github.com/apache/arrow/pull/50071
### Rationale for this change

The statistics schema documentation describes statistics arrays for Arrow arrays and nested field column indexes, but C++ only exposed `RecordBatch::MakeStatisticsArray()` and only enumerated top-level record batch columns. This leaves two related gaps:

* callers cannot ask an `Array` to produce its statistics schema representation directly;
* `RecordBatch::MakeStatisticsArray()` drops statistics attached to nested child arrays.

### What changes are included in this PR?

* Add `Array::MakeStatisticsArray()`.
* Share the statistics-array construction path between `Array` and `RecordBatch`.
* Traverse nested `ArrayData::child_data` when enumerating record batch column statistics, using the same depth-first column index order described by the IPC record batch message rules.
* Preserve existing record batch row-count behavior.

### Are these changes tested?

Yes. Locally:

* `ninja arrow-table-test -j2`
* `./debug/arrow-table-test --gtest_filter="*MakeStatisticsArray*"`: 24 tests passed
* `./debug/arrow-table-test`: 181 tests passed

### Are there any user-facing changes?

Yes. This adds the public C++ `Array::MakeStatisticsArray()` API and lets `RecordBatch::MakeStatisticsArray()` include nested child-array statistics when present. This is not a breaking API change.

Closes #45804. Addresses part of #45474. Refs #45806.
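
To illustrate the depth-first column index order mentioned above, here is a minimal, self-contained sketch. It is not the code in this PR and does not use the Arrow library: `Node`, `EnumerateColumns`, `EnumerateBatch`, and `ExampleBatch` are hypothetical names, with `Node` standing in for an array and its nested `ArrayData::child_data`. It assumes pre-order numbering (a parent receives its index before its children), as in the flattened field order of an IPC record batch message.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an array with nested children
// (playing the role of ArrayData::child_data in this sketch).
struct Node {
  std::string name;
  std::vector<Node> children;
};

using IndexedColumn = std::pair<int32_t, std::string>;

// Assigns the next free index to `node`, then recurses into its children,
// so indexes follow a depth-first, pre-order traversal.
void EnumerateColumns(const Node& node, int32_t* next_index,
                      std::vector<IndexedColumn>* out) {
  out->emplace_back((*next_index)++, node.name);
  for (const Node& child : node.children) {
    EnumerateColumns(child, next_index, out);
  }
}

// Enumerates every column of a batch, top-level and nested, starting at 0.
std::vector<IndexedColumn> EnumerateBatch(const std::vector<Node>& top_level) {
  std::vector<IndexedColumn> out;
  int32_t next_index = 0;
  for (const Node& column : top_level) {
    EnumerateColumns(column, &next_index, &out);
  }
  return out;
}

// Illustrative layout only:
//   col1: struct<a: int32, b: list<item: int64>, c: float64>
//   col2: utf8
std::vector<Node> ExampleBatch() {
  Node item{"item", {}};
  Node a{"a", {}};
  Node b{"b", {item}};
  Node c{"c", {}};
  Node col1{"col1", {a, b, c}};
  Node col2{"col2", {}};
  return {col1, col2};
}
```

With this layout, `EnumerateBatch(ExampleBatch())` numbers the columns `col1`=0, `a`=1, `b`=2, `item`=3, `c`=4, `col2`=5. An enumeration limited to top-level columns would visit only `col1` and `col2`, which is the gap for nested child statistics described in the rationale.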
