alamb commented on code in PR #7055:
URL: https://github.com/apache/arrow-rs/pull/7055#discussion_r2083489167


##########
parquet/src/arrow/array_reader/primitive_array.rs:
##########
@@ -261,6 +262,45 @@ where
         // - date64: cast int32 to date32, then date32 to date64.
         // - decimal: cast int32 to decimal, int64 to decimal
         let array = match target_type {
+            // Using `arrow_cast::cast` has been found to be very slow for converting

Review Comment:
   > Several of the clickbench queries (not sure what data types, but it was spending like 20% of samples in casting during parquet reading).
   
   FWIW many of the clickbench columns are Int16, as I found when working on https://github.com/apache/arrow-rs/pull/7470.
   
   I have started running benchmarks on a draft parquet update in the following PR (hopefully they will show some improvement):
   - https://github.com/apache/datafusion/pull/16012
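
For context on why Int16 columns involve a conversion at all: Parquet has no 16-bit physical type, so Int16 values are stored as physical INT32 and must be narrowed when read. The following is a minimal, standard-library-only sketch of such a narrowing pass. It is not the code from this PR; the function names are hypothetical, and the unchecked variant assumes every decoded value already fits in `i16`.

```rust
/// Hypothetical helper (not from the PR): narrow values decoded from
/// Parquet's physical INT32 type to i16 in a single pass.
/// Assumes every value fits in i16; out-of-range values would wrap.
fn narrow_i32_to_i16(values: &[i32]) -> Vec<i16> {
    values.iter().map(|&v| v as i16).collect()
}

/// Hypothetical checked variant: returns None if any value is out of
/// range for i16 instead of silently wrapping.
fn narrow_i32_to_i16_checked(values: &[i32]) -> Option<Vec<i16>> {
    values.iter().map(|&v| i16::try_from(v).ok()).collect()
}
```

The unchecked version is a tight loop with no per-value branching, which is the kind of path that may be cheaper than a general-purpose cast kernel. The checked version trades some of that speed for validation.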



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.