HippoBaro opened a new pull request, #9679:
URL: https://github.com/apache/arrow-rs/pull/9679
# Which issue does this PR close?
- None, but relates to #9653
# Rationale for this change
#9653 introduces optimizations related to non-null uniform workloads. This
PR adds benchmarks so we can quantify their effect.
# What changes are included in this PR?
Add three new benchmark cases to the arrow_writer benchmark suite for
evaluating write performance on struct columns at varying null densities:
* `struct_non_null`: a nullable struct with 0% null rows and non-nullable
primitive children;
* `struct_sparse_99pct_null`: a nullable struct with 99% null rows,
exercising null batching through one level of struct nesting;
* `struct_all_null`: a nullable struct with 100% null rows, exercising the
uniform-null path through struct nesting.
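
For illustration, the three null densities above can be sketched as a per-row validity mask. This is a minimal, dependency-free sketch and not the code added by this PR: `validity_mask` and `null_count` are hypothetical helper names, and the choice of placing the valid row at the end of each run of 100 is an assumption made only to keep the mask deterministic.

```rust
/// Build a per-row validity mask for a nullable struct column
/// (true = valid row, false = null row).
///
/// Hypothetical helper: `null_percent` is the share of null rows,
/// applied in repeating windows of 100 rows so the result is deterministic.
fn validity_mask(num_rows: usize, null_percent: usize) -> Vec<bool> {
    assert!(null_percent <= 100, "null_percent must be in 0..=100");
    (0..num_rows)
        // Within each window of 100 rows, the first `null_percent` rows are
        // null and the remaining rows are valid.
        .map(|row| row % 100 >= null_percent)
        .collect()
}

/// Count the null rows in a validity mask.
fn null_count(mask: &[bool]) -> usize {
    mask.iter().filter(|valid| !**valid).count()
}
```

With this sketch, `validity_mask(n, 0)` corresponds to the `struct_non_null` shape, `validity_mask(n, 99)` to `struct_sparse_99pct_null` (long runs of nulls with an occasional valid row), and `validity_mask(n, 100)` to `struct_all_null`. In arrow-rs such a mask would presumably be converted into a `NullBuffer` and passed to `StructArray::new` together with the child arrays; that step is left out here to avoid external dependencies.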
Baseline results (Apple M1 Max):
```
struct_non_null/default                  29.9 ms
struct_non_null/parquet_2                38.2 ms
struct_non_null/zstd_parquet_2           50.9 ms
struct_sparse_99pct_null/default          7.2 ms
struct_sparse_99pct_null/parquet_2        7.3 ms
struct_sparse_99pct_null/zstd_p2          8.1 ms
struct_all_null/default                  83.3 µs
struct_all_null/parquet_2                82.5 µs
struct_all_null/zstd_parquet_2          106.6 µs
```
# Are these changes tested?
N/A
# Are there any user-facing changes?
None