kou opened a new pull request, #49657:
URL: https://github.com/apache/arrow/pull/49657
### Rationale for this change
Performance is important in Apache Arrow, so benchmarks are useful when
developing an Apache Arrow implementation.
### What changes are included in this PR?
* Add benchmarks for file and streaming writers.
* Remove redundant type arguments from array constructors.
Here are benchmark results from my environment.
The pure Ruby implementation is about 2.2-2.5x slower than the release build
C++ implementation but about 2.1-2.6x faster than the debug build C++
implementation.
Release build C++/GLib:
File format:
```console
$ ruby -v -S benchmark-driver ruby/red-arrow-format/benchmark/file-writer.yaml
ruby 4.1.0dev (2026-03-26T07:27:31Z master c5ab2114df) +PRISM [x86_64-linux]
Warming up --------------------------------------
           Arrow::Table#save  348.499 i/s -  374.000 times in 1.073175s (2.87ms/i)
Arrow::RecordBatchFileWriter  353.426 i/s -  385.000 times in 1.089337s (2.83ms/i)
     ArrowFormat::FileWriter  133.293 i/s -  140.000 times in 1.050314s (7.50ms/i)
Calculating -------------------------------------
           Arrow::Table#save  336.984 i/s -   1.045k times in 3.101035s (2.97ms/i)
Arrow::RecordBatchFileWriter  338.695 i/s -   1.060k times in 3.129655s (2.95ms/i)
     ArrowFormat::FileWriter  134.640 i/s -  399.000 times in 2.963462s (7.43ms/i)
Comparison:
Arrow::RecordBatchFileWriter:  338.7 i/s
           Arrow::Table#save:  337.0 i/s - 1.01x slower
     ArrowFormat::FileWriter:  134.6 i/s - 2.52x slower
```
Streaming format:
```console
$ ruby -v -S benchmark-driver ruby/red-arrow-format/benchmark/streaming-writer.yaml
ruby 4.1.0dev (2026-03-26T07:27:31Z master c5ab2114df) +PRISM [x86_64-linux]
Warming up --------------------------------------
             Arrow::Table#save  356.995 i/s -  385.000 times in 1.078447s (2.80ms/i)
Arrow::RecordBatchStreamWriter  347.891 i/s -  374.000 times in 1.075050s (2.87ms/i)
  ArrowFormat::StreamingWriter  156.709 i/s -  160.000 times in 1.021004s (6.38ms/i)
Calculating -------------------------------------
             Arrow::Table#save  350.743 i/s -   1.070k times in 3.050665s (2.85ms/i)
Arrow::RecordBatchStreamWriter  345.821 i/s -   1.043k times in 3.016011s (2.89ms/i)
  ArrowFormat::StreamingWriter  160.022 i/s -  470.000 times in 2.937090s (6.25ms/i)
Comparison:
             Arrow::Table#save:  350.7 i/s
Arrow::RecordBatchStreamWriter:  345.8 i/s - 1.01x slower
  ArrowFormat::StreamingWriter:  160.0 i/s - 2.19x slower
```
Debug build C++/GLib:
File format:
```console
$ ruby -v -S benchmark-driver ruby/red-arrow-format/benchmark/file-writer.yaml
ruby 4.1.0dev (2026-03-26T07:27:31Z master c5ab2114df) +PRISM [x86_64-linux]
Warming up --------------------------------------
           Arrow::Table#save   63.290 i/s -   66.000 times in 1.042815s (15.80ms/i)
Arrow::RecordBatchFileWriter   62.655 i/s -   66.000 times in 1.053389s (15.96ms/i)
     ArrowFormat::FileWriter  138.082 i/s -  140.000 times in 1.013891s (7.24ms/i)
Calculating -------------------------------------
           Arrow::Table#save   63.165 i/s -  189.000 times in 2.992143s (15.83ms/i)
Arrow::RecordBatchFileWriter   61.773 i/s -  187.000 times in 3.027220s (16.19ms/i)
     ArrowFormat::FileWriter  134.709 i/s -  414.000 times in 3.073285s (7.42ms/i)
Comparison:
     ArrowFormat::FileWriter:  134.7 i/s
           Arrow::Table#save:   63.2 i/s - 2.13x slower
Arrow::RecordBatchFileWriter:   61.8 i/s - 2.18x slower
```
Streaming format:
```console
$ ruby -v -S benchmark-driver ruby/red-arrow-format/benchmark/streaming-writer.yaml
ruby 4.1.0dev (2026-03-26T07:27:31Z master c5ab2114df) +PRISM [x86_64-linux]
Warming up --------------------------------------
             Arrow::Table#save   63.252 i/s -   66.000 times in 1.043439s (15.81ms/i)
Arrow::RecordBatchStreamWriter   61.272 i/s -   66.000 times in 1.077162s (16.32ms/i)
  ArrowFormat::StreamingWriter  152.598 i/s -  160.000 times in 1.048506s (6.55ms/i)
Calculating -------------------------------------
             Arrow::Table#save   61.016 i/s -  189.000 times in 3.097525s (16.39ms/i)
Arrow::RecordBatchStreamWriter   63.024 i/s -  183.000 times in 2.903642s (15.87ms/i)
  ArrowFormat::StreamingWriter  160.416 i/s -  457.000 times in 2.848846s (6.23ms/i)
Comparison:
  ArrowFormat::StreamingWriter:  160.4 i/s
Arrow::RecordBatchStreamWriter:   63.0 i/s - 2.55x slower
             Arrow::Table#save:   61.0 i/s - 2.63x slower
```
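The "x slower" figures in the Comparison sections are ratios of iterations per second. As a rough sketch, they can be recomputed from the "Calculating" numbers above. `slowdown` is a hypothetical helper written only for this illustration; it is not part of benchmark-driver or of this PR.

```ruby
# Hypothetical helper: how many times slower `slower_ips` is than
# `faster_ips`, rounded to two decimal places like the Comparison output.
def slowdown(faster_ips, slower_ips)
  (faster_ips / slower_ips).round(2)
end

# Iterations per second copied from the "Calculating" sections above.
# Release build C++/GLib: fastest C++/GLib entry vs. the pure Ruby writer.
puts "release, file:      #{slowdown(338.695, 134.640)}x slower (pure Ruby)"
puts "release, streaming: #{slowdown(350.743, 160.022)}x slower (pure Ruby)"

# Debug build C++/GLib: the pure Ruby writer vs. the slowest C++/GLib entry.
puts "debug, file:        #{slowdown(134.709, 61.773)}x slower (C++/GLib)"
puts "debug, streaming:   #{slowdown(160.416, 61.016)}x slower (C++/GLib)"
```

The ratios are taken against the fastest entry in each run, which is how the Comparison sections order their rows.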
### Are these changes tested?
Yes.
### Are there any user-facing changes?
Yes.