coderfender commented on PR #21561:
URL: https://github.com/apache/datafusion/pull/21561#issuecomment-4247315039

   @Dandandan, I am trying a vector of (group ID, value) pairs approach 
to see whether a SIMD sort that returns group counts only during `evaluate` 
would cost less than computing hashes in the `update_batch` method. Results 
were promising on my local machine, but I would be happy if you could run 
the benchmarks on GH runners for more accuracy. I also moved the distinct 
accumulators to a separate cold path (they were probably causing the minor 
slowness in other count queries).
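
For readers following along, here is a minimal sketch of the idea under some assumptions: `update_batch` only appends (group ID, value) pairs, and the sort plus per-group distinct count is deferred to `evaluate`. The function name `distinct_counts_per_group`, the `i64` value type, and the use of the standard library's `sort_unstable` in place of a SIMD sort are all hypothetical stand-ins, not the code in this PR.

```rust
/// Hypothetical helper (not a DataFusion API): counts distinct values per
/// group from a flat vector of (group ID, value) pairs.
/// Assumes every group ID is smaller than `num_groups`; otherwise it panics.
fn distinct_counts_per_group(mut pairs: Vec<(usize, i64)>, num_groups: usize) -> Vec<u64> {
    // Sorting orders pairs by group ID first, then by value, so equal pairs
    // become adjacent. A SIMD sort could be swapped in here.
    pairs.sort_unstable();
    // Dropping adjacent duplicates leaves one pair per distinct (group, value).
    pairs.dedup();

    // Each remaining pair contributes exactly one distinct value to its group.
    let mut counts = vec![0u64; num_groups];
    for (group, _) in &pairs {
        counts[*group] += 1;
    }
    counts
}
```

With this shape, no hashing happens while batches arrive; the cost moves to a single sort at the end, which is the trade-off the benchmarks would need to confirm.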


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

