Dandandan commented on pull request #808:
URL: https://github.com/apache/arrow-datafusion/pull/808#issuecomment-898881106


   I believe hashbrown already implements many tricks like this; it's one of 
the fastest hash table implementations:
   https://docs.rs/hashbrown/0.11.2/hashbrown/hash_map/index.html
   
   There is also a nightly-only `RawTable` API, `get_each_mut`, that retrieves 
multiple values at once, which might be a bit faster.
   
   So far, in profiling results, I haven't seen the probing/hashmap itself 
being a very expensive part. AFAIK it's mostly other parts that could be 
optimized: updating the states/values, collision checks, converting to 
arrays, creating hash values, the actual `sum` over the array, etc.
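   
   To show roughly where those steps sit relative to the table lookup, here is 
a minimal sketch of a grouped sum using the standard library's `HashMap`. The 
function name `grouped_sum` and the `u64`/`i64` types are assumptions made for 
illustration; this is not DataFusion's actual hash aggregate code.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not a DataFusion API): sums `values` per group key.
/// Returns the distinct keys and their sums, in first-seen order.
fn grouped_sum(keys: &[u64], values: &[i64]) -> (Vec<u64>, Vec<i64>) {
    assert_eq!(keys.len(), values.len());

    // Maps each group key to its slot in the state vectors below.
    let mut index: HashMap<u64, usize> = HashMap::new();
    let mut group_keys: Vec<u64> = Vec::new();
    let mut sums: Vec<i64> = Vec::new();

    for (&k, &v) in keys.iter().zip(values) {
        // Creating the hash value, probing, and the key equality
        // (collision) check all happen inside `entry`.
        let idx = *index.entry(k).or_insert_with(|| {
            // First time this key is seen: allocate a new state slot.
            group_keys.push(k);
            sums.push(0);
            sums.len() - 1
        });
        // Updating the per-group state.
        sums[idx] += v;
    }

    // Converting the accumulated state to output columns would happen here;
    // in this sketch the vectors are simply returned as-is.
    (group_keys, sums)
}
```

   For example, `grouped_sum(&[1, 2, 1], &[10, 20, 30])` yields the keys 
`[1, 2]` with sums `[40, 20]`. Even in this small version, only one line is 
the table lookup; the rest is state bookkeeping.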

