gene-bordegaray opened a new issue, #23236: URL: https://github.com/apache/datafusion/issues/23236
### Is your feature request related to a problem or challenge?

`Distribution::HashPartitioned` is currently documented as requiring rows with equal key values to land in the same partition. That is a key-colocation contract, not necessarily a requirement that the existing input is physically hash partitioned.

As range partitioning support expands, this name is increasingly confusing: range partitioning can satisfy some single-input key-colocation requirements, while multi-input operators such as joins still need additional co-partitioning checks.

### Describe the solution you'd like

Clarify the long-term API direction for this distribution requirement. Options include:

- keep `HashPartitioned` but document it as historical naming for key colocation
- deprecate / migrate to a `KeyPartitioned` name
- separately model cross-input co-partitioning requirements for joins instead of encoding them as independent per-child distributions

This should be handled independently from operator-specific range partitioning support so each operator can opt in deliberately.

### Additional context

Range partitioning epic: #22395

Related work:

- #23191
- #23184
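
The difference between key colocation (a per-input property) and co-partitioning (a cross-input property) described in this issue can be illustrated with a small standalone sketch. It uses only the Rust standard library and does not touch DataFusion's actual API: `partition_by`, `hash_partition`, `range_partition`, `is_key_colocated`, and `is_co_partitioned` are hypothetical names chosen for illustration, and partitions are modeled as plain vectors of `i64` keys.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical hash partitioner: partition = hash(key) % n.
fn hash_partition(key: i64, n: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % n as u64) as usize
}

/// Hypothetical range partitioner: `bounds` are sorted split points.
/// Keys below bounds[0] go to partition 0, keys in [bounds[0], bounds[1]) to 1, etc.
fn range_partition(key: i64, bounds: &[i64]) -> usize {
    bounds.partition_point(|b| *b <= key)
}

/// Distributes keys into `n` partitions; `assign` sees the row index and the key.
fn partition_by(keys: &[i64], n: usize, assign: impl Fn(usize, i64) -> usize) -> Vec<Vec<i64>> {
    let mut parts = vec![Vec::new(); n];
    for (row, &key) in keys.iter().enumerate() {
        parts[assign(row, key)].push(key);
    }
    parts
}

/// Maps each key to its partition, or returns None if a key spans several partitions.
fn key_locations(parts: &[Vec<i64>]) -> Option<HashMap<i64, usize>> {
    let mut locations = HashMap::new();
    for (idx, part) in parts.iter().enumerate() {
        for &key in part {
            if *locations.entry(key).or_insert(idx) != idx {
                return None;
            }
        }
    }
    Some(locations)
}

/// Single-input contract: equal keys live in the same partition.
fn is_key_colocated(parts: &[Vec<i64>]) -> bool {
    key_locations(parts).is_some()
}

/// Cross-input contract: both inputs are colocated AND every shared key
/// sits at the same partition index on both sides.
fn is_co_partitioned(left: &[Vec<i64>], right: &[Vec<i64>]) -> bool {
    match (key_locations(left), key_locations(right)) {
        (Some(l), Some(r)) => l.iter().all(|(k, p)| r.get(k).map_or(true, |q| q == p)),
        _ => false,
    }
}

fn main() {
    let keys = [1, 15, 1, 30, 15, 7];
    let round_robin = partition_by(&keys, 3, |row, _| row % 3);
    let hashed = partition_by(&keys, 3, |_, k| hash_partition(k, 3));
    let ranged = partition_by(&keys, 3, |_, k| range_partition(k, &[10, 20]));
    let ranged_other = partition_by(&keys, 3, |_, k| range_partition(k, &[5, 16]));

    // Round-robin scatters duplicate keys; hash and range both colocate them.
    assert!(!is_key_colocated(&round_robin));
    assert!(is_key_colocated(&hashed));
    assert!(is_key_colocated(&ranged));

    // Two range-partitioned inputs with different split points are each
    // colocated, yet not co-partitioned with each other (key 7 moves).
    assert!(is_key_colocated(&ranged_other));
    assert!(!is_co_partitioned(&ranged, &ranged_other));
    assert!(is_co_partitioned(&ranged, &ranged));
}
```

In this toy model, a single-input operator would only need `is_key_colocated`, which range partitioning satisfies, whereas a join would need something like `is_co_partitioned` across its children. That is the gap the third option above proposes to model separately.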
