alamb opened a new pull request, #7620: URL: https://github.com/apache/arrow-rs/pull/7620
# Which issue does this PR close?

- Part of https://github.com/apache/arrow-rs/issues/7615
- Follow on to https://github.com/apache/arrow-rs/pull/7617

# Rationale for this change

@Dandandan and @zhuqi-lucas pointed out some places we could improve gc_string_view on https://github.com/apache/arrow-rs/pull/7617, so let's do that

# What changes are included in this PR?

Avoid allocations and re-validating RecordBatch schema

# Are there any user-facing changes?

Faster performance (these are the larger string benchmarks added in https://github.com/apache/arrow-rs/pull/7619):

```
filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.001
                        time:   [34.474 ms 34.571 ms 34.678 ms]
                        change: [−8.7777% −8.1016% −7.4194%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.01
                        time:   [4.5034 ms 4.5153 ms 4.5282 ms]
                        change: [−7.4270% −6.8866% −6.3152%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 23 outliers among 100 measurements (23.00%)
  6 (6.00%) low severe
  2 (2.00%) low mild
  4 (4.00%) high mild
  11 (11.00%) high severe

filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.1
                        time:   [2.1658 ms 2.1953 ms 2.2236 ms]
                        change: [−5.6248% −4.0141% −2.3053%] (p = 0.00 < 0.05)
                        Performance has improved.

Benchmarking filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.8: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.6s, enable flat sampling, or reduce sample count to 50.
filter: mixed_utf8view (max_string_len=128), 8192, nulls: 0, selectivity: 0.8
                        time:   [1.6407 ms 1.6450 ms 1.6491 ms]
                        change: [−12.806% −11.526% −10.507%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 30 outliers among 100 measurements (30.00%)
  1 (1.00%) low severe
  21 (21.00%) low mild
  6 (6.00%) high mild
  2 (2.00%) high severe
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
