lyang24 commented on issue #9059: URL: https://github.com/apache/arrow-rs/issues/9059#issuecomment-3751969853
> I think we could probably sprinkle some "reserves" in there locally, but I think we can probably do much better if we reserve the correct capacity when creating the output buffers in the first place (as we know the (max) output size = `batch_size`)
>
> I have some ideas on how to wire this in.
>
> Basically, it will involve https://github.com/apache/arrow-rs/blob/9a1e8b572d11078e813fffe3d5c7106b6953d58c/parquet/src/arrow/record_reader/buffer.rs#L21-L20
>
> Changing `ValuesBuffer` so that instead of
>
>     pub trait ValuesBuffer: Default {
>         ...
>     }
>
> it must be explicitly created with capacity
>
>     pub trait ValuesBuffer {
>         /// Create a new buffer with an allocation for `capacity` items
>         fn new_with_capacity(capacity: usize) -> Self;
>     }
>
> And then use the compiler to find all the relevant places and add the appropriate capacities

Hi @alamb, it looks like the bench results are mixed.

The change does well on large-scan (full table scan) queries:

    arrow_reader_clickbench/async/Q20    1.18    130.2±1.59ms    ? ?/sec    1.00    110.7±0.82ms    ? ?/sec
    arrow_reader_clickbench/async/Q21    1.29    165.5±0.99ms    ? ?/sec    1.00    128.6±0.93ms    ? ?/sec
    arrow_reader_clickbench/async/Q22    1.24    318.7±11.80ms   ? ?/sec    1.00    257.3±6.26ms    ? ?/sec

There are some regressions with high-selectivity filters:

    arrow_reader_row_filter/int64 > 90/exclude_filter_column/async    1.00    2.6±0.02ms    ? ?/sec    1.31    3.5±0.08ms    ? ?/sec
    arrow_reader_row_filter/int64 > 90/exclude_filter_column/sync     1.00    2.4±0.02ms    ? ?/sec    1.35    3.2±0.03ms    ? ?/sec

There are also regressions in the following array reader cases. My guess is that preallocating large blocks hurts CPU cache behavior:

    arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, mandatory, no NULLs      1.00    75.9±0.46µs     ? ?/sec    1.56    118.1±0.43µs    ? ?/sec
    arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, half NULLs     1.00    232.7±2.34µs    ? ?/sec    1.23    285.9±3.00µs    ? ?/sec
    arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, no NULLs       1.00    80.8±0.46µs     ? ?/sec    1.53    123.7±0.34µs    ? ?/sec

Is it possible to preallocate conditionally? For example, we could skip preallocation for plain-encoded numeric types (Float16, Int32, Float64) and, if possible, also skip it when a high-selectivity filter is applied.
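
To make the conditional preallocation question concrete, here is a minimal, self-contained sketch. It is not the arrow-rs implementation. Only `ValuesBuffer`, `new_with_capacity`, and `batch_size` come from the discussion. `ReadHint`, `initial_capacity`, and `new_buffer` are hypothetical names invented for illustration, and reading "high selectivity" as "the filter keeps only a small fraction of rows" is an assumption. The sketch models only the filter case, not the per-type (plain numeric) case.

```rust
/// Trait shape proposed in the quoted comment: no `Default` bound, so every
/// buffer has to be created with an explicit capacity.
pub trait ValuesBuffer {
    /// Create a new buffer with an allocation for `capacity` items.
    fn new_with_capacity(capacity: usize) -> Self;
}

/// Illustrative implementation for a plain `Vec<T>`.
impl<T> ValuesBuffer for Vec<T> {
    fn new_with_capacity(capacity: usize) -> Self {
        Vec::with_capacity(capacity)
    }
}

/// Hypothetical hint describing the read (not part of arrow-rs).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ReadHint {
    /// Most rows of each batch are expected to be produced (full scan).
    FullScan,
    /// A row filter is expected to keep only a small fraction of rows.
    HighlySelectiveFilter,
}

/// Hypothetical policy: reserve the full `batch_size` only when the
/// buffer is likely to be filled, otherwise let it grow on demand.
pub fn initial_capacity(batch_size: usize, hint: ReadHint) -> usize {
    match hint {
        ReadHint::FullScan => batch_size,
        ReadHint::HighlySelectiveFilter => 0,
    }
}

/// Build any `ValuesBuffer` using the policy above.
pub fn new_buffer<B: ValuesBuffer>(batch_size: usize, hint: ReadHint) -> B {
    B::new_with_capacity(initial_capacity(batch_size, hint))
}
```

With this shape, a call such as `new_buffer::<Vec<i64>>(8192, ReadHint::FullScan)` reserves room for the whole batch up front, while the `HighlySelectiveFilter` hint starts from an empty allocation. Whether such a policy actually recovers the regressions above would need to be confirmed by rerunning the benchmarks.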
