sunchao commented on issue #4973: URL: https://github.com/apache/arrow-datafusion/issues/4973#issuecomment-1613758230
@Dandandan thanks. I agree that computing the hashes themselves is probably not as critical to performance as the other hash table operations. We are running some experiments on switching to a vectorized hash table to see whether it delivers better performance. So far, without SIMD, it performs roughly the same as the `hashbrown` table used by DF, and I'm really hoping that vectorizing each of the steps will get us to the next level. I will share the results once I have some numbers.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
