HippoBaro opened a new pull request, #9679:
URL: https://github.com/apache/arrow-rs/pull/9679

   # Which issue does this PR close?
   
   - None, but relates to #9653
   
   # Rationale for this change
   
   #9653 introduces optimizations for non-null uniform workloads. This PR 
adds benchmarks so their effect can be quantified.
   
   # What changes are included in this PR?
   
   Add three new benchmark cases to the arrow_writer benchmark suite for 
evaluating write performance on struct columns at varying null densities:
   
   * `struct_non_null`: a nullable struct with 0% null rows and non-nullable 
primitive children;
   * `struct_sparse_99pct_null`: a nullable struct with 99% null rows, 
exercising null batching through one level of struct nesting;
   * `struct_all_null`: a nullable struct with 100% null rows, exercising the 
uniform-null path through struct nesting.
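   
   As a rough illustration of the three null densities above, the sketch below builds a per-row validity mask for each case. It is not the benchmark code from this PR: `validity_mask` and `null_count` are hypothetical helper names, the row count is arbitrary, and the deterministic "one valid row per 100" pattern is an assumption (the actual benchmark may place nulls differently, for example at random).
   
   ```rust
   /// Hypothetical helper: build a validity mask of `len` rows, where `true`
   /// means the struct row is non-null. `null_pct` is the share of null rows.
   fn validity_mask(len: usize, null_pct: u32) -> Vec<bool> {
       assert!(null_pct <= 100, "null_pct must be in 0..=100");
       // Within each block of 100 rows, the first `null_pct` rows are null.
       (0..len).map(|i| (i % 100) as u32 >= null_pct).collect()
   }

   /// Hypothetical helper: count the null (false) entries in a mask.
   fn null_count(mask: &[bool]) -> usize {
       mask.iter().filter(|valid| !**valid).count()
   }

   fn main() {
       let rows = 10_000; // arbitrary row count chosen for illustration
       let cases = [
           ("struct_non_null", 0u32),
           ("struct_sparse_99pct_null", 99),
           ("struct_all_null", 100),
       ];
       for (name, pct) in cases {
           let mask = validity_mask(rows, pct);
           println!("{name}: {} of {rows} rows null", null_count(&mask));
       }
   }
   ```
   
   In a real benchmark, a mask like this would typically be turned into the struct array's null buffer, while the non-nullable primitive children keep a value for every row.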
   
   Baseline results (Apple M1 Max):
   ```
     struct_non_null/default              29.9 ms
     struct_non_null/parquet_2            38.2 ms
     struct_non_null/zstd_parquet_2       50.9 ms
     struct_sparse_99pct_null/default      7.2 ms
     struct_sparse_99pct_null/parquet_2    7.3 ms
     struct_sparse_99pct_null/zstd_p2      8.1 ms
     struct_all_null/default              83.3 µs
     struct_all_null/parquet_2            82.5 µs
     struct_all_null/zstd_parquet_2      106.6 µs
   ```
   
   # Are these changes tested?
   
   N/A
   
   # Are there any user-facing changes?
   
   None

