gene-bordegaray commented on PR #20331: URL: https://github.com/apache/datafusion/pull/20331#issuecomment-3897713669
> This makes sense to me and will be very helpful for use cases where we want to avoid repartitioning data. My only concern is that API users would need to align the probe and build side partitions, but this seems like a reasonable tradeoff. Let’s see what other contributors think.
>
> (this is a partial review I will finish later today or early next week) but until now it's looking good to me :)

💯 I know we have discussed this, but I want to document it here: for the API, it is clear that the partitioning structure is a bit vague. I would like to start an effort to make partitioning a trait that more clearly defines how data is partitioned, which would eliminate the overloading of Hash partitioning.
