kylebarron opened a new pull request, #7365: URL: https://github.com/apache/arrow-rs/pull/7365
# Which issue does this PR close?

Closes https://github.com/apache/arrow-rs/issues/7364. Also related to https://github.com/apache/datafusion/issues/10609 and https://github.com/apache/datafusion/issues/8334.

# Rationale for this change

Support some way to handle struct-typed columns in `StatisticsConverter`.

# What changes are included in this PR?

I think there are two ways to handle this:

1. Support struct-typed columns directly, where the user can pass the name of a struct column, and all children are automatically handled. So `row_group_mins` would return a struct-typed Arrow array.
2. Adjust `StatisticsConverter::try_new` to handle a nested column name. So the user would pass `a.b.c`, which designates a primitive type, and we reuse existing converters. So `row_group_mins` would return a **primitive-typed** Arrow array.

This PR prototypes approach 2. In principle both approaches would be good to have, but approach 1 looked more complex, and approach 2 at least provides a valid workaround.

# Are there any user-facing changes?

Yes, a breaking change to the signature of `StatisticsConverter` to use `ColumnPath` instead of `&str` for the `column_name`.

This isn't _technically_ required; we could assume that a `.` in the column name is a field delimiter, but this seems unnecessary when `ColumnPath` already exists as a type. (Similarly, it's weird to me that [`ProjectionMask::columns`](https://docs.rs/parquet/latest/parquet/arrow/struct.ProjectionMask.html#method.columns) takes in `&str` and not `&ColumnPath` as input.)
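To illustrate the trade-off between a `.`-delimited string and an explicit path, here is a minimal, std-only sketch. It does not use the parquet crate: `PathSketch`, `find_leaf`, and `example_leaves` are hypothetical names made up for this example, and the schema is invented. It only shows that two different leaf columns can render as the same dotted string, which an explicit list of path parts keeps apart.

```rust
/// Hypothetical stand-in for a column path: one entry per nesting level.
/// (The parquet crate has its own `ColumnPath`; this sketch does not use it.)
#[derive(Debug, Clone, PartialEq, Eq)]
struct PathSketch {
    parts: Vec<String>,
}

impl PathSketch {
    /// Build a path from explicit parts; dots inside a part are preserved.
    fn from_parts(parts: &[&str]) -> Self {
        Self {
            parts: parts.iter().map(|p| p.to_string()).collect(),
        }
    }

    /// Build a path by treating every `.` as a field delimiter.
    fn from_dotted(name: &str) -> Self {
        Self {
            parts: name.split('.').map(|p| p.to_string()).collect(),
        }
    }

    /// Render the path as a single dotted string.
    fn dotted(&self) -> String {
        self.parts.join(".")
    }
}

/// Find the index of the leaf column whose path matches exactly.
fn find_leaf(leaves: &[PathSketch], wanted: &PathSketch) -> Option<usize> {
    leaves.iter().position(|leaf| leaf == wanted)
}

/// Leaf columns of a made-up schema (an assumption for illustration only).
fn example_leaves() -> Vec<PathSketch> {
    vec![
        PathSketch::from_parts(&["id"]),
        // struct `a` containing struct `b` containing leaf `c`
        PathSketch::from_parts(&["a", "b", "c"]),
        // struct literally named `a.b` containing leaf `c`
        PathSketch::from_parts(&["a.b", "c"]),
    ]
}
```

With this schema, `PathSketch::from_dotted("a.b.c")` can only ever resolve to the second leaf, while the third leaf is reachable only through `from_parts(&["a.b", "c"])`, even though both leaves have the same `dotted()` rendering.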