ClSlaid opened a new pull request, #9758:
URL: https://github.com/apache/arrow-rs/pull/9758

   ## Summary
   - add a direct `BatchCoalescer::push_batch_with_indices` path for primitive, `Utf8View`, and `BinaryView` columns when the indices are integer typed and non-null
   - specialise indexed copying for primitive and byte-view in-progress arrays so supported schemas can coalesce rows directly without materialising an intermediate taken `RecordBatch`
   - keep other data types on the existing `take_record_batch` fallback; benchmark work on this branch showed widening the direct path beyond primitive and view arrays regressed `Utf8` and dictionary-backed cases
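
The idea in the list above can be sketched without any Arrow types. The block below is a minimal, standard-library-only illustration, not the code in this PR: `InProgressPrimitive`, `ColumnKind`, `supports_direct_path`, `extend_with_indices`, and `extend_via_take` are hypothetical names invented for the example, and the rule that every column must be supported before the direct path is used is an assumption drawn from the phrase "supported schemas".

```rust
/// Hypothetical column categories for the sketch (not arrow-rs `DataType`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ColumnKind {
    Primitive,
    Utf8View,
    BinaryView,
    Other, // e.g. Utf8 or dictionary-backed columns, which stay on the fallback
}

/// Assumed dispatch rule: use the direct path only when the indices contain
/// no nulls and every column is primitive or a byte-view type.
pub fn supports_direct_path(columns: &[ColumnKind], indices_have_nulls: bool) -> bool {
    !indices_have_nulls
        && columns.iter().all(|k| {
            matches!(
                k,
                ColumnKind::Primitive | ColumnKind::Utf8View | ColumnKind::BinaryView
            )
        })
}

/// Hypothetical stand-in for an in-progress primitive column.
#[derive(Debug, Default)]
pub struct InProgressPrimitive {
    pub values: Vec<i64>,
    pub validity: Vec<bool>,
}

impl InProgressPrimitive {
    /// Direct path: copy the selected rows straight into the output buffers.
    pub fn extend_with_indices(
        &mut self,
        src_values: &[i64],
        src_validity: Option<&[bool]>,
        indices: &[u32],
    ) {
        self.values.reserve(indices.len());
        self.validity.reserve(indices.len());
        for &i in indices {
            let i = i as usize;
            self.values.push(src_values[i]);
            // A missing validity slice means every source row is valid.
            self.validity.push(src_validity.map_or(true, |v| v[i]));
        }
    }

    /// Fallback-shaped path: build an intermediate "taken" column first,
    /// then append it. Same result, one extra allocation and copy.
    pub fn extend_via_take(
        &mut self,
        src_values: &[i64],
        src_validity: Option<&[bool]>,
        indices: &[u32],
    ) {
        let taken: Vec<i64> = indices.iter().map(|&i| src_values[i as usize]).collect();
        let taken_validity: Vec<bool> = indices
            .iter()
            .map(|&i| src_validity.map_or(true, |v| v[i as usize]))
            .collect();
        self.values.extend_from_slice(&taken);
        self.validity.extend_from_slice(&taken_validity);
    }
}
```

Both methods produce the same output buffers; the direct one simply skips the temporary vectors, which is the allocation and copy the summary describes avoiding for the intermediate taken `RecordBatch`.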
   
   ## Testing
   - `cargo test -p arrow-select coalesce --lib`
   - `cargo clippy -p arrow-select --lib --tests -- -D warnings`
   - `cargo clippy -p arrow --bench coalesce_kernels --features test_utils -- -D warnings`
   - `cargo clippy --workspace --all-targets -- -D warnings`
   
   ## Benchmarks
   - `take: primitive, 8192, nulls: 0, selectivity: 0.01`: `3.5194-3.5796 ms` -> `1.8780-1.9136 ms`
   - `take: primitive, 8192, nulls: 0.1, selectivity: 0.01`: `5.5208-5.5708 ms` -> `4.0016-4.1647 ms`
   - `take: primitive, 8192, nulls: 0, selectivity: 0.001`: `23.684-23.813 ms` -> `5.9713-6.0137 ms`
   - `take: single_utf8view, 8192, nulls: 0, selectivity: 0.01`: `3.0301-3.0830 ms` -> `2.4513-2.4854 ms`
   - `take: mixed_utf8view (max_string_len=20), 8192, nulls: 0, selectivity: 0.01`: `1.8643-1.8823 ms` -> `1.2706-1.2856 ms`
   - `take: single_binaryview, 8192, nulls: 0, selectivity: 0.01`: `3.1346-3.2991 ms` -> `2.7578-2.8539 ms`
   - `take: mixed_binaryview (max_string_len=20), 8192, nulls: 0, selectivity: 0.01`: `1.9634-2.0215 ms` -> `1.4117-1.4383 ms`

