alamb commented on code in PR #7055:
URL: https://github.com/apache/arrow-rs/pull/7055#discussion_r2083489167
##########
parquet/src/arrow/array_reader/primitive_array.rs:
##########
@@ -261,6 +262,45 @@ where
// - date64: cast int32 to date32, then date32 to date64.
// - decimal: cast int32 to decimal, int64 to decimal
let array = match target_type {
+ // Using `arrow_cast::cast` has been found to be very slow for converting
Review Comment:
> Several of the clickbench queries (not sure what data types, but it was spending like 20% of samples in casting during parquet reading).
FWIW many of the clickbench columns are Int16, as I found when working on
https://github.com/apache/arrow-rs/pull/7470.
I started running some benchmarks on a draft update to parquet in this PR
(hopefully it will show some improvements)
- https://github.com/apache/datafusion/pull/16012
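
For context on why the cast shows up in profiles for Int16 columns: Parquet stores them with the physical type INT32, so the reader has to narrow every value. The snippet below is only a rough sketch of the two kinds of narrowing, using plain slices instead of Arrow arrays. It is not the code in this PR, and `narrow_truncating` and `narrow_checked` are hypothetical names chosen for illustration.

```rust
/// Hypothetical helper: narrow physical INT32 values to i16 with a plain
/// `as` cast. `as` keeps only the low 16 bits, so out-of-range inputs wrap
/// instead of being reported.
fn narrow_truncating(values: &[i32]) -> Vec<i16> {
    values.iter().map(|&v| v as i16).collect()
}

/// Hypothetical helper: checked narrowing that yields `None` for any value
/// that does not fit in i16, at the cost of a range check per element.
fn narrow_checked(values: &[i32]) -> Vec<Option<i16>> {
    values.iter().map(|&v| i16::try_from(v).ok()).collect()
}
```

With the input `[1, -2, 32_767, 40_000]`, the truncating version returns `[1, -2, 32767, -25536]`, while the checked version returns `None` for the last element. Which behavior a reader should use depends on how far it trusts the file's logical type annotation.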