andygrove commented on code in PR #3076:
URL: https://github.com/apache/datafusion-comet/pull/3076#discussion_r2695427739
##########
docs/source/contributor-guide/native_shuffle.md:
##########
@@ -187,6 +186,18 @@ For range partitioning:
 The simplest case: all rows go to partition 0. Uses `SinglePartitionShufflePartitioner` which simply concatenates batches to reach the configured batch size.
 
+### Round Robin Partitioning
+
+Round robin partitioning distributes rows evenly across partitions in a deterministic way:
+
+1. Computes a Murmur3 hash of **all columns** in each row (using seed 42)
+2. Sorts rows by their hash values to ensure deterministic ordering

Review Comment:
   It was true when I first started on this PR, but then the behavior changed. I will update this.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
