geoffreyclaude opened a new pull request, #19376:
URL: https://github.com/apache/datafusion/pull/19376

   ## Which issue does this PR close?
   
   - Related to #19241
   
   ## Rationale for this change
   
   This PR enhances the `in_list` benchmark suite to provide more comprehensive 
performance measurements across a wider range of data types and list sizes. 
These improvements are necessary groundwork for evaluating optimizations 
proposed in #19241.
   
   The existing benchmarks are limited in scope, making it difficult to assess 
the performance impact of potential `in_list` optimizations across different 
data types and scenarios.
   
   ## What changes are included in this PR?
   
   - Added benchmarks for `UInt8Array`, `Int16Array`, and 
`TimestampNanosecondArray`
   - Added `28` to `IN_LIST_LENGTHS` (now `[3, 8, 28, 100]`) to better cover 
the range between small and large lists
   - Increased `ARRAY_LENGTH` from `1024` to `8192` to match the default 
DataFusion batch size
   - Configured criterion with shorter warm-up (100ms) and measurement times 
(500ms) for faster iteration
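
   As a rough illustration of the resulting benchmark grid, here is a 
std-only sketch. It is not the code from this PR: `benchmark_cases`, 
`nominal_time_budget`, `NEW_ARRAY_TYPES`, `WARM_UP`, `MEASUREMENT`, and the 
label format are hypothetical names chosen for this example. Only the numeric 
values and the three array type names come from the list above. In criterion 
itself, the two durations would typically be set through 
`Criterion::warm_up_time` and `Criterion::measurement_time`.

   ```rust
   use std::time::Duration;

   // Values stated in the PR description.
   const IN_LIST_LENGTHS: [usize; 4] = [3, 8, 28, 100];
   const ARRAY_LENGTH: usize = 8192;
   const WARM_UP: Duration = Duration::from_millis(100);
   const MEASUREMENT: Duration = Duration::from_millis(500);

   // Array types added by this PR. Types that were already benchmarked
   // are not named in the description, so they are left out here.
   const NEW_ARRAY_TYPES: [&str; 3] =
       ["UInt8Array", "Int16Array", "TimestampNanosecondArray"];

   /// Hypothetical helper: one label per (array type, IN-list length) pair.
   fn benchmark_cases() -> Vec<String> {
       let mut cases = Vec::new();
       for &ty in NEW_ARRAY_TYPES.iter() {
           for &len in IN_LIST_LENGTHS.iter() {
               cases.push(format!(
                   "in_list/{}/rows={}/list={}",
                   ty, ARRAY_LENGTH, len
               ));
           }
       }
       cases
   }

   /// Nominal warm-up plus measurement time summed over all cases.
   /// Real runs take longer because of setup and analysis overhead.
   fn nominal_time_budget() -> Duration {
       (WARM_UP + MEASUREMENT) * benchmark_cases().len() as u32
   }
   ```

   Under these assumptions the three new types alone contribute 12 cases, so 
the shorter timings keep their nominal budget to a few seconds per run.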
   
   ## Are these changes tested?
   
   Yes, this PR adds benchmark coverage. The benchmarks can be run with:
   ```bash
   cargo bench --bench in_list
   ```
   
   The benchmarks verify that the `in_list` expression evaluates correctly for 
all the new data types.
   
   ## Are there any user-facing changes?
   
   No user-facing changes. This PR only affects the benchmark suite used for 
performance testing and development.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

