[GitHub] [arrow-datafusion] Dandandan commented on issue #790: Rework GroupByHash for faster performance and support grouping by nulls

GitBox Mon, 09 Aug 2021 10:32:18 -0700


Dandandan commented on issue #790:
URL: https://github.com/apache/arrow-datafusion/issues/790#issuecomment-895407042



   @sundy-li
   
   Yes, I think for some types the hashing method could be specialized further, 
either to speed up the hashing or to reduce the memory needed per hash value 
instead of always using `u64`. I think in our case that is currently a small 
cost / win compared to the other gains we might get, but it is still 
interesting to try at some point.
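
   A rough sketch of what a per-type hash width could look like. The 
`GroupHash` trait and `hash_column` helper are hypothetical names for 
illustration only and are not part of DataFusion. The assumption is that narrow 
integer keys can use their own value as the hash, while wider keys keep `u64`:

```rust
use std::fmt::Debug;

/// Hypothetical trait (not a DataFusion API): each key type picks the
/// narrowest hash type that still distinguishes its values.
trait GroupHash {
    type Hash: Copy + Eq + Debug;
    fn group_hash(&self) -> Self::Hash;
}

impl GroupHash for u8 {
    // A u8 key is its own collision-free hash: 1 byte per row instead of 8.
    type Hash = u8;
    fn group_hash(&self) -> u8 {
        *self
    }
}

impl GroupHash for u16 {
    // Same idea for u16: 2 bytes per row.
    type Hash = u16;
    fn group_hash(&self) -> u16 {
        *self
    }
}

impl GroupHash for i64 {
    // Wider keys keep a full u64 hash; here a simple multiplicative mix
    // with an odd constant, chosen only for illustration.
    type Hash = u64;
    fn group_hash(&self) -> u64 {
        (*self as u64).wrapping_mul(0x9E37_79B9_7F4A_7C15)
    }
}

/// Hashes a whole column; the output element width depends on the key type.
fn hash_column<T: GroupHash>(values: &[T]) -> Vec<T::Hash> {
    values.iter().map(|v| v.group_hash()).collect()
}
```

   With this shape, the hash buffer for a `u8` column is one eighth the size of 
a `u64` buffer. Whether that matters in practice would need measuring.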
   
   I think for small value ranges or narrow data types we can even avoid using 
a hash table and move to direct indexing instead. That might be interesting for 
`u8` values or small dictionaries.
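
   As a minimal sketch of direct indexing, assuming a nullable `u8` key column 
and a simple count aggregate: the key value itself is the slot index, so no 
hashing or probing is needed. The function name `count_by_u8` and the extra 
null slot are assumptions made for this example, not existing DataFusion code:

```rust
/// Counts rows per group for a nullable `u8` key column using direct indexing.
/// Slots 0..=255 hold the groups for each possible value; slot 256 holds the
/// null group, so grouping by nulls needs no special hash value.
fn count_by_u8(keys: &[Option<u8>]) -> Vec<(Option<u8>, u64)> {
    const NULL_SLOT: usize = 256;
    let mut counts = [0u64; 257];

    for key in keys {
        // The key is the index: no hash computation, no collision handling.
        let slot = match key {
            Some(v) => *v as usize,
            None => NULL_SLOT,
        };
        counts[slot] += 1;
    }

    // Emit only the groups that actually occurred, in slot order
    // (values ascending, null group last).
    counts
        .iter()
        .enumerate()
        .filter(|(_, c)| **c > 0)
        .map(|(slot, &c)| {
            let key = if slot == NULL_SLOT {
                None
            } else {
                Some(slot as u8)
            };
            (key, c)
        })
        .collect()
}
```

   The same idea would apply to a small dictionary by indexing with the 
dictionary key instead of the raw value, with the slot array sized to the 
dictionary length plus one.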


