Re: [I] Why hashing benefits from partitioning? [arrow-datafusion]

via GitHub Mon, 16 Oct 2023 23:30:32 -0700


yukkit commented on issue #7834:
URL: 
https://github.com/apache/arrow-datafusion/issues/7834#issuecomment-1765752478


   `RoundRobin` partitions at the granularity of a RecordBatch, whereas `Hash` 
partitions at the granularity of a row. When there are very few input 
partitions and a relatively large amount of data, adding a `RoundRobin` 
partitioner before hashing increases the parallelism of the hash-value 
computation and so improves speed. I think it is beneficial only in this case.
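
To make the granularity difference concrete, here is a minimal Python sketch. It is not DataFusion's implementation: plain lists stand in for Arrow RecordBatches, and `Row`, `Batch`, `hash_key`, `round_robin_partition` and `hash_partition` are hypothetical names chosen for illustration. The multiplicative hash is an arbitrary deterministic choice, not the hash DataFusion uses.

```python
from typing import List, Tuple

Row = Tuple[int, str]   # (key, payload); hypothetical row shape
Batch = List[Row]       # stand-in for an Arrow RecordBatch


def hash_key(key: int) -> int:
    """Arbitrary deterministic 32-bit multiplicative hash (assumption)."""
    return (key * 2654435761) & 0xFFFFFFFF


def round_robin_partition(batches: List[Batch], n: int) -> List[List[Batch]]:
    """Hand out whole batches in turn; no row is inspected."""
    out: List[List[Batch]] = [[] for _ in range(n)]
    for i, batch in enumerate(batches):
        out[i % n].append(batch)  # the batch moves as a single unit
    return out


def hash_partition(batches: List[Batch], n: int) -> List[List[Row]]:
    """Route every row individually by the hash of its key."""
    out: List[List[Row]] = [[] for _ in range(n)]
    for batch in batches:
        for key, payload in batch:
            # Per-row work: this is the cost that benefits from parallelism.
            out[hash_key(key) % n].append((key, payload))
    return out


batches: List[Batch] = [
    [(0, "a"), (1, "b")],
    [(2, "c"), (3, "d")],
    [(0, "e"), (1, "f")],
]
by_batch = round_robin_partition(batches, 2)
by_row = hash_partition(batches, 2)
```

Round-robin only counts batches, so it is cheap even on a single input partition. Hash partitioning touches every row, so with one input partition all of that work lands on one task unless the batches are first spread out, for example by a round-robin step.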

