ndemir opened a new pull request, #7274: URL: https://github.com/apache/arrow-rs/pull/7274
# Which issue does this PR close?

This PR addresses the issue opened here: [https://github.com/apache/arrow-rs/issues/7273](https://github.com/apache/arrow-rs/issues/7273). While looking at it, I realized that string coercion could be optimized, and I implemented that optimization (though not in the way suggested in the issue).

Closes #7273

# Rationale for this change

This PR introduces performance improvements for string coercion.

Performance metrics:

Step 1: Save baseline metrics

```
> git checkout main
> cargo bench --bench json_reader -- --save-baseline before
```

Step 2: Compare the changes against the baseline metrics

```
> git checkout ndemir/string-coerce-optimization
> cargo bench --bench json_reader -- --baseline before

small_bench_primitive   time:   [8.5790 µs 8.5890 µs 8.6029 µs]
                        change: [-3.0652% -2.4764% -1.9676%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

large_bench_primitive   time:   [3.2421 ms 3.2951 ms 3.3615 ms]
                        change: [+0.9382% +2.6439% +4.5593%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) high mild
  8 (8.00%) high severe

small_bench_list        time:   [17.444 µs 17.655 µs 18.076 µs]
                        change: [-6.4619% -4.0195% -2.1687%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low severe
  1 (1.00%) high mild
  3 (3.00%) high severe
```

# What changes are included in this PR?

- Shared buffer for numeric -> string: Added a `number_buffer` that is reused across numeric -> string conversions.
- Performance: This reduces overhead by writing numeric values to a single buffer and then converting the result to `&str`.
- Safety comment: Added a comment explaining why the use of `from_utf8_unchecked` is valid.

# Are there any user-facing changes?

No.
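For readers who want a picture of the shared-buffer idea, here is a minimal sketch. It is not the code from this PR: the `NumberFormatter` type and its `format` method are hypothetical names chosen for illustration, and only `number_buffer` and `from_utf8_unchecked` come from the description above. It assumes the buffer is cleared before each value and filled only through the standard formatting machinery.

```rust
use std::fmt::Display;
use std::io::Write;

/// Hypothetical helper (not the PR's actual type): owns one scratch buffer
/// that is reused for every numeric -> string conversion.
struct NumberFormatter {
    number_buffer: Vec<u8>,
}

impl NumberFormatter {
    fn new() -> Self {
        Self {
            number_buffer: Vec::with_capacity(32),
        }
    }

    /// Formats `value` into the shared buffer and returns a view of it.
    /// The returned `&str` is only valid until the next call.
    fn format<T: Display>(&mut self, value: T) -> &str {
        // Drop the previous contents but keep the allocation.
        self.number_buffer.clear();
        write!(&mut self.number_buffer, "{value}")
            .expect("writing to a Vec<u8> does not fail");
        // SAFETY: the buffer was just cleared and then filled only by
        // `write!`, which emits the bytes of `str` fragments produced by
        // the `Display` impl, so the contents are valid UTF-8.
        unsafe { std::str::from_utf8_unchecked(&self.number_buffer) }
    }
}
```

Calling `format` repeatedly on one `NumberFormatter` reuses the same allocation, so the caller can append each returned `&str` to its output before formatting the next value. Whether this matches the PR's exact structure is an assumption; the diff is the authoritative source.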
