saadtajwar commented on issue #23236:
URL: https://github.com/apache/datafusion/issues/23236#issuecomment-4844959622

   @gene-bordegaray Ooooh I see! I wasn't aware of the `Distribution` enum 
before, so that makes sense! Thanks so much for the thorough explanation here, I 
really appreciate it!
   
   Looking through the API bridge in #23259, it looks like the snippet below is 
where an actual `Partitioning` scheme is created. So, per our conversation in 
#23231, the question we're essentially trying to answer is: when do we pick a 
`Hash` partitioning scheme for the `Key` distribution, as opposed to our new 
`Range` scheme? If that's the case, are there any downsides to deciding this 
purely on whether the caller specifies `Hash` or `Range`? I'm struggling to 
think of situations where we would want to pick one arbitrarily on the caller's 
behalf, given that they would have to provide either the hash function and the 
column to hash on, or the ordering and split points. Or am I missing something 
here? 
   
   ```
               Distribution::HashPartitioned(expr) | Distribution::KeyPartitioned(expr) => {
                   Partitioning::Hash(expr, partition_count)
               }
   ```
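
   To make the question concrete, here is a minimal sketch of the "caller 
decides" alternative. Everything in it is a hypothetical stand-in, not 
DataFusion's actual API: `KeyScheme`, the simplified `Partitioning` enum (plain 
column names and `i64` split points instead of physical expressions), and 
`create_partitioning` are all made up for illustration.

```rust
// Hypothetical stand-ins for illustration only; these are NOT DataFusion's real types.

/// What the caller asks for when it requires a key-based distribution.
#[derive(Debug, Clone, PartialEq)]
enum KeyScheme {
    /// Caller supplies the columns to hash on.
    Hash { columns: Vec<String> },
    /// Caller supplies the ordering column and the split points.
    Range { column: String, split_points: Vec<i64> },
}

/// Simplified partitioning: column names instead of physical expressions.
#[derive(Debug, Clone, PartialEq)]
enum Partitioning {
    Hash(Vec<String>, usize),
    Range(String, Vec<i64>),
}

/// Builds the partitioning purely from what the caller specified,
/// never choosing Hash or Range on the caller's behalf.
fn create_partitioning(scheme: KeyScheme, partition_count: usize) -> Partitioning {
    match scheme {
        KeyScheme::Hash { columns } => Partitioning::Hash(columns, partition_count),
        // In this sketch the range layout is fully described by the caller's
        // split points, so `partition_count` is not consulted here.
        KeyScheme::Range { column, split_points } => Partitioning::Range(column, split_points),
    }
}
```

   With this shape, a `KeyScheme::Hash` request always maps to 
`Partitioning::Hash` and a `KeyScheme::Range` request always maps to 
`Partitioning::Range`, so the planner has no arbitrary choice left to make.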


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

