mingmwang commented on issue #1570:
URL: 
https://github.com/apache/arrow-datafusion/issues/1570#issuecomment-1521564406

   @alamb @yjshen 
   Can we make the `GroupState` and the Accumulator states serializable?
   With this approach, we do not need to sort anything when spilling data to
disk. When we read the data back, we can quickly reconstruct the raw hash
table from the hash values and indexes, because our hashmap is very
lightweight. The hash value can be recomputed from the grouping rows, or we
can cache it inside the `GroupState` to avoid the recomputation.
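
   A rough sketch of the idea, not the actual DataFusion code: the
`GroupState` below is a simplified, hypothetical stand-in (encoded group key,
cached hash, and a flat `Vec<i64>` as the accumulator state), the byte layout
is made up for illustration, and a std `HashMap<u64, Vec<usize>>` plays the
role of the raw table of (hash, group index) entries. It writes the states in
their existing order with no sort, and rebuilds the table on read from the
cached hashes alone.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified group state (assumption, not DataFusion's type).
#[derive(Debug, Clone, PartialEq)]
pub struct GroupState {
    /// Cached hash of the group key, so it need not be recomputed on reload.
    pub hash: u64,
    /// Encoded grouping row.
    pub group_key: Vec<u8>,
    /// Flattened accumulator state, e.g. [sum, count].
    pub accumulator_state: Vec<i64>,
}

/// Hash a group key; only needed when the hash is not cached.
pub fn hash_key(key: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Serialize the group states in their current order (no sorting).
pub fn serialize(states: &[GroupState]) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(&(states.len() as u64).to_le_bytes());
    for s in states {
        buf.extend_from_slice(&s.hash.to_le_bytes());
        buf.extend_from_slice(&(s.group_key.len() as u64).to_le_bytes());
        buf.extend_from_slice(&s.group_key);
        buf.extend_from_slice(&(s.accumulator_state.len() as u64).to_le_bytes());
        for v in &s.accumulator_state {
            buf.extend_from_slice(&v.to_le_bytes());
        }
    }
    buf
}

/// Read one little-endian u64 and advance the cursor.
fn read_u64(buf: &[u8], pos: &mut usize) -> u64 {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&buf[*pos..*pos + 8]);
    *pos += 8;
    u64::from_le_bytes(bytes)
}

/// Read the states back and rebuild the hash -> group index table
/// directly from the cached hash values, without rehashing any key.
pub fn deserialize(buf: &[u8]) -> (Vec<GroupState>, HashMap<u64, Vec<usize>>) {
    let mut pos = 0;
    let n = read_u64(buf, &mut pos) as usize;
    let mut states = Vec::with_capacity(n);
    let mut table: HashMap<u64, Vec<usize>> = HashMap::with_capacity(n);
    for idx in 0..n {
        let hash = read_u64(buf, &mut pos);
        let key_len = read_u64(buf, &mut pos) as usize;
        let group_key = buf[pos..pos + key_len].to_vec();
        pos += key_len;
        let acc_len = read_u64(buf, &mut pos) as usize;
        let mut accumulator_state = Vec::with_capacity(acc_len);
        for _ in 0..acc_len {
            // Reinterpret the stored bits as i64 (round-trips negatives).
            accumulator_state.push(read_u64(buf, &mut pos) as i64);
        }
        // Groups sharing a hash land in the same bucket of indexes.
        table.entry(hash).or_default().push(idx);
        states.push(GroupState { hash, group_key, accumulator_state });
    }
    (states, table)
}
```

   Caching the hash costs 8 bytes per group in the spill file; dropping the
field and calling `hash_key` on each `group_key` during `deserialize` would
trade that space for extra CPU on reload.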
    
   


