Rachelint commented on issue #11281:
URL: https://github.com/apache/datafusion/issues/11281#issuecomment-2212907839

   > > But something still confuses me... When using the builder directly, looping the flattened way is clearly slower than looping the nested way...
   > 
   > Maybe the compiler is not smart enough to avoid bounds checks or something 
when using `flatten` 🤔
   
   Maybe it really is due to the compiler...
   I found the same thing in another benchmark, the one for Boolean...
   See 
https://github.com/Rachelint/arrow-datafusion/blob/82490204e6afecc3f9e15c2ce5d35d7e407f422f/datafusion/core/src/datasource/physical_plan/parquet/statistics.rs#L752-L777
   - using `flatten`
   ```
   Extract data page statistics for Boolean/extract_statistics/Boolean
                           time:   [11.990 µs 11.993 µs 11.997 µs]
                           change: [+18.603% +18.783% +18.917%] (p = 0.00 < 0.05)
   ```
   - using nested loops
   ```
   Extract data page statistics for Boolean/extract_statistics/Boolean
                           time:   [10.143 µs 10.146 µs 10.149 µs]
                           change: [-15.525% -15.464% -15.403%] (p = 0.00 < 0.05)
   ```
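
For readers who have not opened the linked code, here is a minimal sketch of the two loop shapes being compared. It is not the DataFusion implementation: `PageMins`, `collect_flatten`, and `collect_nested` are hypothetical names, and a plain `Vec` stands in for the Arrow builder. Both functions produce the same output. Whether the compiler optimizes them equally well is a separate question, and it is the one the benchmarks above are probing.

```rust
/// Hypothetical stand-in for the statistics being extracted:
/// one inner Vec per row group, one optional value per data page.
type PageMins = Vec<Vec<Option<bool>>>;

/// "Flatten" style: a single `for` loop over a flattened iterator.
fn collect_flatten(groups: &PageMins) -> Vec<Option<bool>> {
    let mut out = Vec::new();
    for v in groups.iter().flatten() {
        out.push(*v);
    }
    out
}

/// "Nested" style: an explicit outer loop and inner loop.
fn collect_nested(groups: &PageMins) -> Vec<Option<bool>> {
    let mut out = Vec::new();
    for group in groups {
        for v in group {
            out.push(*v);
        }
    }
    out
}

fn main() {
    let groups: PageMins = vec![vec![Some(true), None], vec![], vec![Some(false)]];
    // Both styles visit the same elements in the same order.
    assert_eq!(collect_flatten(&groups), collect_nested(&groups));
    println!("{:?}", collect_nested(&groups));
}
```

One possible explanation, offered only as a guess: the `for` loop over `flatten()` calls `next()` on an adapter that has to track which inner iterator it is currently in, and that can be harder for the optimizer to turn into a tight inner loop than the explicit nesting.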


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

