Dandandan commented on pull request #808: URL: https://github.com/apache/arrow-datafusion/pull/808#issuecomment-898881106
I believe hashbrown already implements many tricks like this; it is one of the fastest hash table implementations: https://docs.rs/hashbrown/0.11.2/hashbrown/hash_map/index.html

There is also a nightly-only raw table API, `get_each_mut`, for retrieving multiple values at once, which might be a bit faster.

So far, profiling results have not shown the probing or the hash map itself to be a very expensive part. AFAIK it is mostly other parts that could be optimized: updating the states/values, collision checks, converting to arrays, creating hash values, the actual `sum` over the array, etc.
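To make the cost breakdown above more concrete, here is a minimal sketch of a grouped sum. It is not DataFusion's actual code: the function `grouped_sum` and its signature are hypothetical, and it uses the standard library `HashMap` (which is built on hashbrown) rather than the raw table API. The comments mark roughly where hashing and probing, state updates, and output conversion happen.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of DataFusion): sums `values` per key and
/// returns the distinct keys and their sums as two parallel vectors,
/// in first-seen key order.
fn grouped_sum(keys: &[u32], values: &[i64]) -> (Vec<u32>, Vec<i64>) {
    assert_eq!(keys.len(), values.len());

    // Maps each group key to the index of its accumulator in `sums`.
    let mut index: HashMap<u32, usize> = HashMap::new();
    let mut group_keys: Vec<u32> = Vec::new();
    let mut sums: Vec<i64> = Vec::new();

    for (&k, &v) in keys.iter().zip(values) {
        // Creating the hash value and probing the table both happen
        // inside `entry`; a new group gets a fresh accumulator slot.
        let slot = *index.entry(k).or_insert_with(|| {
            group_keys.push(k);
            sums.push(0);
            sums.len() - 1
        });

        // Updating the state/value for the group.
        sums[slot] += v;
    }

    // "Converting to array": here the accumulators are already columnar,
    // so the two vectors are returned as they are.
    (group_keys, sums)
}
```

For example, calling `grouped_sum(&[1, 2, 1, 3, 2], &[10, 20, 30, 40, 50])` groups the five rows into three keys. In a sketch like this the table lookup is a single `entry` call per row, while the work around it (hashing, accumulator updates, building the output) is done for every row as well.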
