alamb commented on issue #4813:
URL: https://github.com/apache/arrow-rs/issues/4813#issuecomment-1721314706

   > Certainly in the case of a single column grouping, the dictionary encoding
   > is pure overhead, as each value will only appear once.
   
   I am not sure about this -- the group keys for each incoming batch are 
converted to Row format first, to compare them with existing group keys. 
   
   However, we could potentially add a special-case `GroupValues` for 
single-column `Dictionary` grouping that knows how to avoid that step. That 
would likely be close to optimal in terms of performance, as we would skip 
hydration entirely.
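
   A rough sketch of what such a special case might look like, using only the 
standard library. `DictBatch` and `SingleDictGroupValues` are hypothetical 
stand-ins invented for illustration; they are not the arrow-rs `DictionaryArray` 
or the actual DataFusion `GroupValues` trait, and nulls are ignored. The idea 
is that each dictionary entry is resolved to a group id at most once per batch, 
so the per-row work is an index lookup rather than a Row conversion:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one batch of a dictionary-encoded string column:
/// `keys[i]` is the index into `values` for row `i` (nulls omitted).
pub struct DictBatch {
    pub keys: Vec<usize>,
    pub values: Vec<String>,
}

/// Hypothetical single-column group store (not DataFusion's real trait).
#[derive(Default)]
pub struct SingleDictGroupValues {
    /// Distinct group value -> group id.
    map: HashMap<String, usize>,
    /// Group id -> group value, in first-seen order.
    groups: Vec<String>,
}

impl SingleDictGroupValues {
    /// Returns the group id of every row in `batch`.
    pub fn intern(&mut self, batch: &DictBatch) -> Vec<usize> {
        // Cache: dictionary index -> group id, filled lazily so that
        // dictionary entries no row refers to never create a group.
        let mut key_to_group: Vec<Option<usize>> = vec![None; batch.values.len()];
        let mut out = Vec::with_capacity(batch.keys.len());
        for &k in &batch.keys {
            let gid = match key_to_group[k] {
                // Already resolved in this batch: no hashing, no value access.
                Some(g) => g,
                None => {
                    let v = &batch.values[k];
                    let g = match self.map.get(v).copied() {
                        Some(g) => g,
                        None => {
                            // First time this value is seen in any batch.
                            let g = self.groups.len();
                            self.groups.push(v.clone());
                            self.map.insert(v.clone(), g);
                            g
                        }
                    };
                    key_to_group[k] = Some(g);
                    g
                }
            };
            out.push(gid);
        }
        out
    }

    /// The distinct group values, indexed by group id.
    pub fn groups(&self) -> &[String] {
        &self.groups
    }
}
```

   In this sketch a value is hashed once per distinct dictionary entry per 
batch and copied only when it starts a new group; whether that holds up in 
practice would need benchmarking.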


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

