mingmwang commented on issue #1570:
URL:
https://github.com/apache/arrow-datafusion/issues/1570#issuecomment-1521564406
@alamb @yjshen
Can we make the `GroupState` and the Accumulator states serializable?
With this approach, we do not need to sort anything when spilling data to
disk. When we read the data back, we can quickly rebuild the raw hash table
from the hash values and indexes, because our hashmap is very lightweight.
The hash values can be recalculated from the grouping rows, or we can
cache the hash value inside the `GroupState` to avoid recalculating them.
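
To make the idea concrete, here is a rough sketch of what I have in mind. It is not the actual DataFusion code: the `GroupState` struct below is a simplified, hypothetical stand-in (one byte-encoded group key plus a single `i64` accumulator value), the byte layout is made up for illustration, and `hash_key`, `serialize`, `deserialize`, `rebuild_index` and `lookup` are illustrative names only. It uses the standard library's `DefaultHasher` in place of whatever hash function the real aggregation uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified stand-in for a group state (not DataFusion's type).
#[derive(Debug, Clone, PartialEq)]
struct GroupState {
    hash: u64,          // cached hash of `group_key`
    group_key: Vec<u8>, // row-encoded grouping values
    acc_state: i64,     // accumulator state, e.g. a running SUM
}

/// Hash a row-encoded group key (assumption: any stable hash would do here).
fn hash_key(key: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Spill: write the states in their current order, with no sorting.
/// Assumed layout per state: hash (8) | key length (4) | key | accumulator (8).
fn serialize(states: &[GroupState]) -> Vec<u8> {
    let mut buf = Vec::new();
    for s in states {
        buf.extend_from_slice(&s.hash.to_le_bytes());
        buf.extend_from_slice(&(s.group_key.len() as u32).to_le_bytes());
        buf.extend_from_slice(&s.group_key);
        buf.extend_from_slice(&s.acc_state.to_le_bytes());
    }
    buf
}

/// Copy `N` bytes starting at `*pos` into a fixed-size array and advance `*pos`.
fn read_array<const N: usize>(buf: &[u8], pos: &mut usize) -> [u8; N] {
    let mut out = [0u8; N];
    out.copy_from_slice(&buf[*pos..*pos + N]);
    *pos += N;
    out
}

/// Read back: decode the states in the order they were written.
fn deserialize(buf: &[u8]) -> Vec<GroupState> {
    let mut states = Vec::new();
    let mut pos = 0;
    while pos < buf.len() {
        let hash = u64::from_le_bytes(read_array::<8>(buf, &mut pos));
        let len = u32::from_le_bytes(read_array::<4>(buf, &mut pos)) as usize;
        let group_key = buf[pos..pos + len].to_vec();
        pos += len;
        let acc_state = i64::from_le_bytes(read_array::<8>(buf, &mut pos));
        states.push(GroupState { hash, group_key, acc_state });
    }
    states
}

/// Rebuild the lightweight table (hash -> indexes into `states`) from the
/// cached hash values, without re-hashing any grouping rows.
fn rebuild_index(states: &[GroupState]) -> HashMap<u64, Vec<usize>> {
    let mut index: HashMap<u64, Vec<usize>> = HashMap::new();
    for (i, s) in states.iter().enumerate() {
        index.entry(s.hash).or_default().push(i);
    }
    index
}

/// Find the index of the group with this key, comparing keys on hash collision.
fn lookup(
    states: &[GroupState],
    index: &HashMap<u64, Vec<usize>>,
    key: &[u8],
) -> Option<usize> {
    index
        .get(&hash_key(key))?
        .iter()
        .copied()
        .find(|&i| states[i].group_key.as_slice() == key)
}
```

The point is that only the flat list of states goes to disk; the table itself is just hashes and indexes, so it can be rebuilt in one pass over the states after reading them back. Dropping the `hash` field from the spilled bytes and calling `hash_key` in `deserialize` would be the "re-calculate" variant, trading a little CPU for smaller spill files.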