JakeDern commented on PR #10128:
URL: https://github.com/apache/arrow-rs/pull/10128#issuecomment-4685992738

   Pretty good improvement: ~42% for the dictionary case and ~20% for the 
delta dictionary cases. I'm not 100% sure yet why the delta side improves less, 
but I think this is worth taking on its own, and I can investigate further later.
   
   Perf results from #10122:
   
   ```
   ➜  arrow-ipc git:(ipc-writer-dict-benches) cargo bench "(StreamWriter|FileWriter)/write_10" --features zstd
       Finished `bench` profile [optimized] target(s) in 0.07s
        Running benches/ipc_reader.rs (/home/jakedern/repos/arrow-rs/target/release/deps/ipc_reader-a1b491f58c77bb6a)
        Running benches/ipc_writer.rs (/home/jakedern/repos/arrow-rs/target/release/deps/ipc_writer-6612be2d7eba35b1)
   Benchmarking arrow_ipc_stream_writer/StreamWriter/write_10: Collecting 100 samples in estimated 5.5019 s (50k iterati
   arrow_ipc_stream_writer/StreamWriter/write_10
                           time:   [107.53 µs 108.06 µs 108.61 µs]
                           change: [−2.9828% −0.9112% +0.7341%] (p = 0.39 > 0.05)
                           No change in performance detected.
   Found 5 outliers among 100 measurements (5.00%)
     3 (3.00%) high mild
     2 (2.00%) high severe
   Benchmarking arrow_ipc_stream_writer/StreamWriter/write_10/zstd: Collecting 100 samples in estimated 5.0248 s (1100 i
   arrow_ipc_stream_writer/StreamWriter/write_10/zstd
                           time:   [4.5765 ms 4.6054 ms 4.6355 ms]
                           change: [−0.7831% +0.1488% +1.0639%] (p = 0.75 > 0.05)
                           No change in performance detected.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high mild
   Benchmarking arrow_ipc_stream_writer/FileWriter/write_10: Collecting 100 samples in estimated 5.3861 s (50k iteration
   arrow_ipc_stream_writer/FileWriter/write_10
                           time:   [106.14 µs 106.82 µs 107.54 µs]
                           change: [+1.1887% +2.7126% +4.6164%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 12 outliers among 100 measurements (12.00%)
     6 (6.00%) high mild
     6 (6.00%) high severe
   Benchmarking arrow_ipc_stream_writer/StreamWriter/write_10/dict: Collecting 100 samples in estimated 5.2009 s (71k it
   arrow_ipc_stream_writer/StreamWriter/write_10/dict
                           time:   [60.775 µs 62.004 µs 63.440 µs]
                           change: [−6.4822% −3.5063% −0.6010%] (p = 0.03 < 0.05)
                           Change within noise threshold.
   Found 1 outliers among 100 measurements (1.00%)
     1 (1.00%) high severe
   Benchmarking arrow_ipc_stream_writer/StreamWriter/write_10/dict/delta: Collecting 100 samples in estimated 5.0870 s (
   arrow_ipc_stream_writer/StreamWriter/write_10/dict/delta
                           time:   [128.47 µs 129.73 µs 130.88 µs]
                           change: [−1.8693% −0.0642% +1.7216%] (p = 0.95 > 0.05)
                           No change in performance detected.
   Found 3 outliers among 100 measurements (3.00%)
     2 (2.00%) high mild
     1 (1.00%) high severe
   Benchmarking arrow_ipc_stream_writer/FileWriter/write_10/dict/delta: Collecting 100 samples in estimated 5.5440 s (45
   arrow_ipc_stream_writer/FileWriter/write_10/dict/delta
                           time:   [130.29 µs 131.33 µs 132.26 µs]
                           change: [+1.8877% +2.8406% +3.8001%] (p = 0.00 < 0.05)
                           Performance has regressed.
   Found 9 outliers among 100 measurements (9.00%)
     1 (1.00%) low severe
     3 (3.00%) low mild
     3 (3.00%) high mild
     2 (2.00%) high severe
   
   ```
   
   Perf results from this branch:
   
   ```
   ➜  arrow-ipc git:(ipc-writer-collect-dicts) ✗ cargo bench "(StreamWriter|FileWriter)/write_10" --features zstd
      Compiling arrow-ipc v59.0.0 (/home/jakedern/repos/arrow-rs/arrow-ipc)
       Finished `bench` profile [optimized] target(s) in 2.55s
        Running benches/ipc_reader.rs (/home/jakedern/repos/arrow-rs/target/release/deps/ipc_reader-a1b491f58c77bb6a)
        Running benches/ipc_writer.rs (/home/jakedern/repos/arrow-rs/target/release/deps/ipc_writer-6612be2d7eba35b1)
   Benchmarking arrow_ipc_stream_writer/StreamWriter/write_10: Collecting 100 samples in estimated 5.3935 s (50k iterati
   arrow_ipc_stream_writer/StreamWriter/write_10
                           time:   [106.95 µs 107.85 µs 108.76 µs]
                           change: [−2.2269% −1.1032% −0.0394%] (p = 0.06 > 0.05)
                           No change in performance detected.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) high mild
     1 (1.00%) high severe
   Benchmarking arrow_ipc_stream_writer/StreamWriter/write_10/zstd: Collecting 100 samples in estimated 5.0249 s (1100 i
   arrow_ipc_stream_writer/StreamWriter/write_10/zstd
                           time:   [4.5629 ms 4.5901 ms 4.6184 ms]
                           change: [−1.1939% −0.3327% +0.5704%] (p = 0.47 > 0.05)
                           No change in performance detected.
   Found 3 outliers among 100 measurements (3.00%)
     3 (3.00%) high mild
   Benchmarking arrow_ipc_stream_writer/FileWriter/write_10: Collecting 100 samples in estimated 5.0247 s (45k iteration
   arrow_ipc_stream_writer/FileWriter/write_10
                           time:   [109.86 µs 110.45 µs 111.11 µs]
                           change: [+0.3417% +2.0979% +3.7750%] (p = 0.01 < 0.05)
                           Change within noise threshold.
   Found 5 outliers among 100 measurements (5.00%)
     2 (2.00%) high mild
     3 (3.00%) high severe
   Benchmarking arrow_ipc_stream_writer/StreamWriter/write_10/dict: Collecting 100 samples in estimated 5.0650 s (136k i
   arrow_ipc_stream_writer/StreamWriter/write_10/dict
                           time:   [37.300 µs 37.543 µs 37.807 µs]
                           change: [−43.963% −42.283% −40.548%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 6 outliers among 100 measurements (6.00%)
     2 (2.00%) high mild
     4 (4.00%) high severe
   Benchmarking arrow_ipc_stream_writer/StreamWriter/write_10/dict/delta: Collecting 100 samples in estimated 5.4418 s (
   arrow_ipc_stream_writer/StreamWriter/write_10/dict/delta
                           time:   [103.36 µs 104.28 µs 105.18 µs]
                           change: [−19.764% −18.621% −17.508%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 8 outliers among 100 measurements (8.00%)
     3 (3.00%) low mild
     2 (2.00%) high mild
     3 (3.00%) high severe
   Benchmarking arrow_ipc_stream_writer/FileWriter/write_10/dict/delta: Collecting 100 samples in estimated 5.4730 s (56
   arrow_ipc_stream_writer/FileWriter/write_10/dict/delta
                           time:   [104.84 µs 105.65 µs 106.44 µs]
                           change: [−20.651% −20.021% −19.377%] (p = 0.00 < 0.05)
                           Performance has improved.
   Found 2 outliers among 100 measurements (2.00%)
     1 (1.00%) low mild
     1 (1.00%) high mild
   ```
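
As a rough cross-check of the headline numbers, the relative change can be recomputed directly from the point estimates (the middle value of each `time:` triple) in the two runs above. This is only a sketch: `percent_change` is a hypothetical helper, not part of arrow-rs or Criterion, and Criterion's own `change:` figure is measured against whatever baseline it had saved from a previous run, so the two need not match exactly.

```rust
/// Relative change of `new` versus `old`, in percent (negative means faster).
/// Hypothetical helper for illustration only.
fn percent_change(old: f64, new: f64) -> f64 {
    (new - old) / old * 100.0
}

fn main() {
    // Point estimates in microseconds, copied from the two runs above:
    // (benchmark name, #10122 run, this branch).
    let cases = [
        ("StreamWriter/write_10/dict", 62.004, 37.543),
        ("StreamWriter/write_10/dict/delta", 129.73, 104.28),
        ("FileWriter/write_10/dict/delta", 131.33, 105.65),
    ];
    for (name, old, new) in cases {
        println!("{name}: {:+.1}%", percent_change(old, new));
    }
}
```

Recomputed this way, the dict case comes out a little over 39% faster and the two delta cases just under 20% faster, in the same range as the figures Criterion reported.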
   

