ndemir opened a new pull request, #7274:
URL: https://github.com/apache/arrow-rs/pull/7274

   # Which issue does this PR close?
   
   I saw the issue opened here: 
[https://github.com/apache/arrow-rs/issues/7273](https://github.com/apache/arrow-rs/issues/7273).
   I realized this can be optimized and worked on it (though not in the way 
suggested in the issue).
   
   Closes #7273
   
   # Rationale for this change
   
   This PR introduces performance improvements for string coercion.
   
   Performance Metrics:
   
   Step#1: Save Baseline Metrics
   ```
   > git checkout main
   > cargo bench --bench json_reader -- --save-baseline before        
   ```
   
   Step#2: Compare the changes with Baseline Metrics
   
   ```
   > git checkout ndemir/string-coerce-optimization
   > cargo bench --bench json_reader -- --baseline before
   
   small_bench_primitive   time:   [8.5790 µs 8.5890 µs 8.6029 µs]
                           change: [-3.0652% -2.4764% -1.9676%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 4 outliers among 100 measurements (4.00%)
     2 (2.00%) high mild
     2 (2.00%) high severe
   
   large_bench_primitive   time:   [3.2421 ms 3.2951 ms 3.3615 ms]
                           change: [+0.9382% +2.6439% +4.5593%] (p = 0.00 < 0.05)
                           Change within noise threshold.
   Found 9 outliers among 100 measurements (9.00%)
     1 (1.00%) high mild
     8 (8.00%) high severe
   
   small_bench_list        time:   [17.444 µs 17.655 µs 18.076 µs]
                           change: [-6.4619% -4.0195% -2.1687%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 5 outliers among 100 measurements (5.00%)
     1 (1.00%) low severe
     1 (1.00%) high mild
     3 (3.00%) high severe
   ```
   
   # What changes are included in this PR?
   
   - Shared buffer for numeric -> string: Added a `number_buffer` that is reused 
across numeric -> string conversions.
   - Performance: Each numeric value is written into this single buffer and then 
viewed as a `&str`, which reduces overhead.
   - Safety comment: Added a comment explaining why `from_utf8_unchecked` is 
valid.
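
The idea in the list above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `StringColumn` and `append_numbers` are hypothetical names, only `i64` input is handled, and the only piece taken from the description is the reuse of one `number_buffer` followed by `from_utf8_unchecked`.

```rust
use std::io::Write;

/// Hypothetical stand-in for a string array builder: all values live in one
/// `String`, and `offsets[i]` marks the end of value `i`.
#[derive(Default)]
pub struct StringColumn {
    data: String,
    offsets: Vec<usize>,
}

impl StringColumn {
    pub fn push_str(&mut self, s: &str) {
        self.data.push_str(s);
        self.offsets.push(self.data.len());
    }

    pub fn len(&self) -> usize {
        self.offsets.len()
    }

    pub fn get(&self, i: usize) -> &str {
        let start = if i == 0 { 0 } else { self.offsets[i - 1] };
        &self.data[start..self.offsets[i]]
    }
}

/// Appends each number as text, reusing a single scratch buffer instead of
/// allocating a fresh `String` per value (hypothetical helper).
pub fn append_numbers(col: &mut StringColumn, values: &[i64]) {
    // Allocated once; cleared (not freed) between values.
    let mut number_buffer: Vec<u8> = Vec::with_capacity(32);
    for value in values {
        number_buffer.clear();
        write!(number_buffer, "{}", value).expect("writing to a Vec<u8> cannot fail");
        // SAFETY: the `Display` output of an `i64` consists only of ASCII
        // digits and an optional leading '-', which is always valid UTF-8.
        let s = unsafe { std::str::from_utf8_unchecked(&number_buffer) };
        col.push_str(s);
    }
}
```

Under these assumptions, the scratch buffer is allocated once per call and only its length is reset between values, so the per-value work is limited to formatting and a copy into the column. Replacing the unchecked call with `std::str::from_utf8` would give the same result with an extra validation pass.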
   
   # Are there any user-facing changes?
   
   No.

