onursatici commented on PR #6808: URL: https://github.com/apache/arrow-rs/pull/6808#issuecomment-2515095767
@alamb @tustvold I added a string view case to the interleave benchmark and ran it on main, this PR (interleave-deduplicated), and https://github.com/apache/arrow-rs/pull/6779 (interleave-specific-impl):

```
❯ critcmp interleave-main interleave-deduplicated interleave-specific-impl
group                                                                              interleave-deduplicated       interleave-main               interleave-specific-impl
-----                                                                              -----------------------       ---------------               ------------------------
interleave string_view(0.5, 50, true) 100 [0..100, 100..230, 450..1000]            2.33      2.3±0.02µs ? ?/sec  1.82  1767.6±25.30ns ? ?/sec  1.00    968.7±4.97ns ? ?/sec
interleave string_view(0.5, 50, true) 1024 [0..100, 100..230, 450..1000, 0..1000]  1.80     13.9±0.17µs ? ?/sec  1.35     10.4±0.11µs ? ?/sec  1.00      7.7±0.11µs ? ?/sec
interleave string_view(0.5, 50, true) 1024 [0..100, 100..230, 450..1000]           1.80     13.3±0.13µs ? ?/sec  1.39     10.3±0.10µs ? ?/sec  1.00      7.4±0.09µs ? ?/sec
interleave string_view(0.5, 50, true) 400 [0..100, 100..230, 450..1000]            1.93      5.8±0.05µs ? ?/sec  1.49      4.5±0.05µs ? ?/sec  1.00      3.0±0.02µs ? ?/sec
```

I believe the penalty this PR introduces would be mitigated for the interleave case if we also merge #6779. For other cases, it feels like the improvements to reads and transfer over the wire might outweigh the cost. Happy to hear your thoughts.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
