kou opened a new pull request, #49657:
URL: https://github.com/apache/arrow/pull/49657

   ### Rationale for this change
   
   Performance is important in Apache Arrow, so benchmarks are useful when developing an Apache Arrow implementation.
   
   ### What changes are included in this PR?
   
   * Add benchmarks for file and streaming writers.
   * Remove redundant type arguments from array constructors.
   
   Here are the benchmark results in my environment.
   
   The pure Ruby implementation is about 2-2.5x slower than the release build of the C++ implementation but about 2-2.5x faster than the debug build of the C++ implementation.
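
   As a quick check of that summary, the ratios can be recomputed from the `Comparison` figures reported in this message. This is only a sketch: the `RESULTS` and `RATIOS` names are made up for illustration, and the C++ value is taken from the fastest C++/GLib entry in each run.

   ```ruby
   # Iterations per second copied from the "Comparison" sections of this message.
   # :cpp is the fastest C++/GLib entry of each run (an illustrative choice).
   RESULTS = {
     "release/file"      => {ruby: 134.6, cpp: 338.7},
     "release/streaming" => {ruby: 160.0, cpp: 350.7},
     "debug/file"        => {ruby: 134.7, cpp: 63.2},
     "debug/streaming"   => {ruby: 160.4, cpp: 63.0},
   }

   # Ratio of the faster throughput to the slower one, rounded to two digits.
   RATIOS = RESULTS.transform_values do |r|
     slower, faster = r.values.minmax
     (faster / slower).round(2)
   end

   RATIOS.each do |run, ratio|
     winner = RESULTS[run][:ruby] > RESULTS[run][:cpp] ? "pure Ruby" : "C++/GLib"
     puts "#{run}: #{winner} is #{ratio}x faster"
   end
   ```

   All four ratios fall between roughly 2.1x and 2.6x, which matches the "about 2-2.5x" summary in both directions.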
   
   Release build C++/GLib:
   
   File format:
   
   ```console
   $ ruby -v -S benchmark-driver ruby/red-arrow-format/benchmark/file-writer.yaml
   ruby 4.1.0dev (2026-03-26T07:27:31Z master c5ab2114df) +PRISM [x86_64-linux]
   Warming up --------------------------------------
              Arrow::Table#save    348.499 i/s -     374.000 times in 1.073175s (2.87ms/i)
   Arrow::RecordBatchFileWriter    353.426 i/s -     385.000 times in 1.089337s (2.83ms/i)
        ArrowFormat::FileWriter    133.293 i/s -     140.000 times in 1.050314s (7.50ms/i)
   Calculating -------------------------------------
              Arrow::Table#save    336.984 i/s -      1.045k times in 3.101035s (2.97ms/i)
   Arrow::RecordBatchFileWriter    338.695 i/s -      1.060k times in 3.129655s (2.95ms/i)
        ArrowFormat::FileWriter    134.640 i/s -     399.000 times in 2.963462s (7.43ms/i)
   
   Comparison:
   Arrow::RecordBatchFileWriter:       338.7 i/s
              Arrow::Table#save:       337.0 i/s - 1.01x  slower
        ArrowFormat::FileWriter:       134.6 i/s - 2.52x  slower
   ```
   
   Streaming format:
   
   ```console
   $ ruby -v -S benchmark-driver ruby/red-arrow-format/benchmark/streaming-writer.yaml
   ruby 4.1.0dev (2026-03-26T07:27:31Z master c5ab2114df) +PRISM [x86_64-linux]
   Warming up --------------------------------------
                Arrow::Table#save    356.995 i/s -     385.000 times in 1.078447s (2.80ms/i)
   Arrow::RecordBatchStreamWriter    347.891 i/s -     374.000 times in 1.075050s (2.87ms/i)
     ArrowFormat::StreamingWriter    156.709 i/s -     160.000 times in 1.021004s (6.38ms/i)
   Calculating -------------------------------------
                Arrow::Table#save    350.743 i/s -      1.070k times in 3.050665s (2.85ms/i)
   Arrow::RecordBatchStreamWriter    345.821 i/s -      1.043k times in 3.016011s (2.89ms/i)
     ArrowFormat::StreamingWriter    160.022 i/s -     470.000 times in 2.937090s (6.25ms/i)
   
   Comparison:
                Arrow::Table#save:       350.7 i/s
   Arrow::RecordBatchStreamWriter:       345.8 i/s - 1.01x  slower
     ArrowFormat::StreamingWriter:       160.0 i/s - 2.19x  slower
   ```
   
   Debug build C++/GLib:
   
   File format:
   
   ```console
   $ ruby -v -S benchmark-driver ruby/red-arrow-format/benchmark/file-writer.yaml
   ruby 4.1.0dev (2026-03-26T07:27:31Z master c5ab2114df) +PRISM [x86_64-linux]
   Warming up --------------------------------------
              Arrow::Table#save     63.290 i/s -      66.000 times in 1.042815s (15.80ms/i)
   Arrow::RecordBatchFileWriter     62.655 i/s -      66.000 times in 1.053389s (15.96ms/i)
        ArrowFormat::FileWriter    138.082 i/s -     140.000 times in 1.013891s (7.24ms/i)
   Calculating -------------------------------------
              Arrow::Table#save     63.165 i/s -     189.000 times in 2.992143s (15.83ms/i)
   Arrow::RecordBatchFileWriter     61.773 i/s -     187.000 times in 3.027220s (16.19ms/i)
        ArrowFormat::FileWriter    134.709 i/s -     414.000 times in 3.073285s (7.42ms/i)
   
   Comparison:
        ArrowFormat::FileWriter:       134.7 i/s
              Arrow::Table#save:        63.2 i/s - 2.13x  slower
   Arrow::RecordBatchFileWriter:        61.8 i/s - 2.18x  slower
   ```
   
   Streaming format:
   
   ```console
   $ ruby -v -S benchmark-driver ruby/red-arrow-format/benchmark/streaming-writer.yaml
   ruby 4.1.0dev (2026-03-26T07:27:31Z master c5ab2114df) +PRISM [x86_64-linux]
   Warming up --------------------------------------
                Arrow::Table#save     63.252 i/s -      66.000 times in 1.043439s (15.81ms/i)
   Arrow::RecordBatchStreamWriter     61.272 i/s -      66.000 times in 1.077162s (16.32ms/i)
     ArrowFormat::StreamingWriter    152.598 i/s -     160.000 times in 1.048506s (6.55ms/i)
   Calculating -------------------------------------
                Arrow::Table#save     61.016 i/s -     189.000 times in 3.097525s (16.39ms/i)
   Arrow::RecordBatchStreamWriter     63.024 i/s -     183.000 times in 2.903642s (15.87ms/i)
     ArrowFormat::StreamingWriter    160.416 i/s -     457.000 times in 2.848846s (6.23ms/i)
   
   Comparison:
     ArrowFormat::StreamingWriter:       160.4 i/s
   Arrow::RecordBatchStreamWriter:        63.0 i/s - 2.55x  slower
                Arrow::Table#save:        61.0 i/s - 2.63x  slower
   ```
   
   ### Are these changes tested?
   
   Yes.
   
   ### Are there any user-facing changes?
   
   Yes.

