tustvold opened a new issue, #5177: URL: https://github.com/apache/arrow-rs/issues/5177
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

Currently `ColumnValueDecoderImpl`, and by extension `ColumnReader`, accept slices of `[T::T]` where `T: DataType`. This was preserved by #1041, which extracted generics to allow using owned buffer constructions instead for the arrow read path, whilst preserving the existing API for non-arrow readers.

However, preserving this API has a couple of fairly substantial drawbacks:

* A lot of the test coverage in the parquet crate uses the arrow APIs, which use different implementations of `ColumnValueDecoder`
* The finite capacity of the output buffers introduces challenges related to record truncation - https://github.com/apache/arrow-rs/issues/5150
* The generics are pretty arcane and require some gymnastics to allow for slices that don't have a size separate from their capacity
* Buffers must be pre-allocated and zeroed ahead of time, which is not only an unnecessary overhead, but for lists will likely necessitate re-allocation once the correct number of values is ascertained

**Describe the solution you'd like**

I would like to update `ColumnValueDecoderImpl` to accept `Vec<T::T>` instead of `[T::T]`. This would not only simplify `RecordReader` and improve its performance for nested data, but would also eliminate issues like #5150.

**Describe alternatives you've considered**

**Additional context**
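To illustrate the difference between the two buffer styles discussed above, here is a minimal sketch. It does not use the parquet crate: `ToyDecoder`, `read_into_slice`, `read_into_vec` and `demo` are hypothetical names invented for this example, and `i32` stands in for `T::T`.

```rust
/// Hypothetical stand-in for a decoder over a page of values.
/// This is NOT a parquet type; it only models the two output styles.
struct ToyDecoder {
    values: Vec<i32>,
    pos: usize,
}

impl ToyDecoder {
    fn new(values: Vec<i32>) -> Self {
        Self { values, pos: 0 }
    }

    /// Slice style: the caller must pre-allocate (and zero) `out`.
    /// The slice has no length separate from its capacity, so the
    /// number of values written has to be returned and tracked by the caller.
    fn read_into_slice(&mut self, out: &mut [i32]) -> usize {
        let n = out.len().min(self.values.len() - self.pos);
        out[..n].copy_from_slice(&self.values[self.pos..self.pos + n]);
        self.pos += n;
        n
    }

    /// Vec style: values are appended, so the Vec's length always equals
    /// the number of decoded values and no zeroing is needed up front.
    fn read_into_vec(&mut self, out: &mut Vec<i32>, max: usize) -> usize {
        let n = max.min(self.values.len() - self.pos);
        out.extend_from_slice(&self.values[self.pos..self.pos + n]);
        self.pos += n;
        n
    }
}

/// Decodes the same five values through both styles and returns the results.
fn demo() -> (Vec<i32>, Vec<i32>) {
    // Slice style: allocate and zero 8 slots, then trim to what was written.
    let mut slice_decoder = ToyDecoder::new(vec![1, 2, 3, 4, 5]);
    let mut from_slice = vec![0; 8];
    let written = slice_decoder.read_into_slice(&mut from_slice);
    from_slice.truncate(written);

    // Vec style: start empty and let the decoder append.
    let mut vec_decoder = ToyDecoder::new(vec![1, 2, 3, 4, 5]);
    let mut from_vec = Vec::new();
    vec_decoder.read_into_vec(&mut from_vec, 8);

    (from_slice, from_vec)
}
```

Calling `demo()` from a `main` function or a test yields the same values from both paths; the difference is that the slice path pays for zeroing and needs separate bookkeeping of how many values were actually written.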
