neilconway commented on PR #20385:
URL: https://github.com/apache/datafusion/pull/20385#issuecomment-3945356897

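   @Jefffrey Got it; the default `hashbrown` hash function does seem like a
better choice. Interestingly, the benchmarks are significantly better in some
cases: the `array_has_any_scalar` cases with 10 or more elements run roughly
1.6x to 2x faster, while the other cases are unchanged or up to 10% slower.
This compares the feature branch with `hashbrown` (target) against the std
`HashSet` (base):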
   
   ```
     group                                          base                                   target
     -----                                          ----                                   ------
     array_has_any/no_match/10                      1.00      7.4±0.08ms        ? ?/sec    1.00      7.4±0.04ms        ? ?/sec
     array_has_any/no_match/100                     1.00     23.0±0.05ms        ? ?/sec    1.02     23.6±0.05ms        ? ?/sec
     array_has_any/no_match/500                     1.00     91.7±0.11ms        ? ?/sec    1.05     96.2±0.53ms        ? ?/sec
     array_has_any/scalar_no_match/10               1.00      2.1±0.02ms        ? ?/sec    1.00      2.1±0.00ms        ? ?/sec
     array_has_any/scalar_no_match/100              1.00     20.3±0.07ms        ? ?/sec    1.01     20.5±0.06ms        ? ?/sec
     array_has_any/scalar_no_match/500              1.00    134.4±1.10ms        ? ?/sec    1.00    134.6±0.37ms        ? ?/sec
     array_has_any/scalar_some_match/10             1.00  1038.6±12.05µs        ? ?/sec    1.00   1036.2±4.87µs        ? ?/sec
     array_has_any/scalar_some_match/100            1.00     10.7±0.09ms        ? ?/sec    1.00     10.7±0.07ms        ? ?/sec
     array_has_any/scalar_some_match/500            1.00     83.1±0.39ms        ? ?/sec    1.00     83.2±0.40ms        ? ?/sec
     array_has_any/some_match/10                    1.01      6.5±0.03ms        ? ?/sec    1.00      6.4±0.04ms        ? ?/sec
     array_has_any/some_match/100                   1.00     14.6±0.06ms        ? ?/sec    1.01     14.8±0.05ms        ? ?/sec
     array_has_any/some_match/500                   1.00     50.1±0.13ms        ? ?/sec    1.06     52.9±0.22ms        ? ?/sec
     array_has_any_scalar/i64_no_match/1            1.00    359.7±1.46µs        ? ?/sec    1.04    373.2±2.89µs        ? ?/sec
     array_has_any_scalar/i64_no_match/10           1.91    844.9±9.22µs        ? ?/sec    1.00    441.6±9.23µs        ? ?/sec
     array_has_any_scalar/i64_no_match/100          1.59  1003.3±34.17µs        ? ?/sec    1.00   629.3±21.51µs        ? ?/sec
     array_has_any_scalar/i64_no_match/1000         1.77   955.1±12.20µs        ? ?/sec    1.00   540.2±12.02µs        ? ?/sec
     array_has_any_scalar/string_no_match/1         1.01    256.7±1.83µs        ? ?/sec    1.00    255.1±1.92µs        ? ?/sec
     array_has_any_scalar/string_no_match/10        1.97   826.3±13.46µs        ? ?/sec    1.00    420.2±8.06µs        ? ?/sec
     array_has_any_scalar/string_no_match/100       1.65   910.6±19.59µs        ? ?/sec    1.00    552.9±17.14µs        ? ?/sec
     array_has_any_scalar/string_no_match/1000      1.90   874.5±12.71µs        ? ?/sec    1.00    459.8±8.70µs        ? ?/sec
     array_has_any_strings/no_match/10              1.00      5.0±0.01ms        ? ?/sec    1.00      5.0±0.02ms        ? ?/sec
     array_has_any_strings/no_match/100             1.01     22.2±0.05ms        ? ?/sec    1.00     22.0±0.03ms        ? ?/sec
     array_has_any_strings/no_match/500             1.00    128.7±0.18ms        ? ?/sec    1.03    132.1±1.15ms        ? ?/sec
     array_has_any_strings/scalar_no_match/10       1.00    863.4±2.22µs        ? ?/sec    1.07    920.9±1.92µs        ? ?/sec
     array_has_any_strings/scalar_no_match/100      1.00      7.3±0.02ms        ? ?/sec    1.10      8.0±0.02ms        ? ?/sec
     array_has_any_strings/scalar_no_match/500      1.00     87.1±0.14ms        ? ?/sec    1.05     91.4±0.14ms        ? ?/sec
     array_has_any_strings/scalar_some_match/10     1.00    769.2±2.00µs        ? ?/sec    1.03    790.9±3.03µs        ? ?/sec
     array_has_any_strings/scalar_some_match/100    1.00      4.1±0.17ms        ? ?/sec    1.04      4.3±0.22ms        ? ?/sec
     array_has_any_strings/scalar_some_match/500    1.00     16.9±0.08ms        ? ?/sec    1.08     18.2±0.07ms        ? ?/sec
     array_has_any_strings/some_match/10            1.00      4.3±0.02ms        ? ?/sec    1.00      4.3±0.01ms        ? ?/sec
     array_has_any_strings/some_match/100           1.01     14.3±0.05ms        ? ?/sec    1.00     14.1±0.04ms        ? ?/sec
     array_has_any_strings/some_match/500           1.00     53.5±0.11ms        ? ?/sec    1.00     53.6±0.07ms        ? ?/sec
   ```
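
For anyone reading the ratio columns: they appear to be each side's mean time divided by the faster of the two, so the faster side shows 1.00 and the other shows its relative slowdown. Here is a minimal sketch of that arithmetic, assuming this interpretation is right. The helper name `relative` is hypothetical and is not part of any benchmark tool; the timings are copied from the table above.

```rust
/// Relative time of each side versus the faster one, returned as (base, target).
/// The faster side comes out as exactly 1.0. Both inputs must use the same unit.
fn relative(base: f64, target: f64) -> (f64, f64) {
    let fastest = base.min(target);
    (base / fastest, target / fastest)
}

fn main() {
    // Mean times in microseconds, taken from the table above.
    let rows = [
        ("array_has_any_scalar/i64_no_match/10", 844.9, 441.6),
        ("array_has_any_scalar/string_no_match/10", 826.3, 420.2),
        ("array_has_any_strings/scalar_no_match/100", 7300.0, 8000.0),
    ];
    for (name, base, target) in rows {
        let (b, t) = relative(base, target);
        println!("{name:<45} base {:.2}  target {:.2}", b, t);
    }
}
```

Under that reading, a base ratio of 1.91 means the std `HashSet` run took about 1.91 times as long as the `hashbrown` run for that case, and a target ratio of 1.10 means the `hashbrown` run was about 10% slower.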


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

