cloud-fan edited a comment on pull request #31083: URL: https://github.com/apache/spark/pull/31083#issuecomment-758713975
I recalled why the existing framework puts `requiredChildDistribution`/`requiredChildOrdering` in the physical plan: the partitioning/ordering information can only be determined at the physical phase. For example, sort-merge join and shuffle-hash join have different output orderings. It's still OK to add the shuffle/sort at the optimizer phase and eliminate it later at the physical phase, but it's more natural not to add the shuffle/sort in the first place.

The cast is a problem, but if we can fix it in `AliasAwareOutputOrdering`, it can benefit many other queries as well. We should do it anyway, regardless of this feature.
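To make the point about physical-phase ordering concrete, here is a minimal toy sketch. It does not use Spark's classes: `Plan`, `Scan`, `Sort`, `SortMergeJoin`, `ShuffledHashJoin`, `Write`, and `ensureOrdering` are all hypothetical names invented for illustration, and the hash join is simply assumed to guarantee no output ordering. The idea is loosely analogous to what the `EnsureRequirements` rule does: a sort is inserted only when a child's output ordering does not already satisfy the parent's requirement, and that can only be checked once the join strategy is known.

```scala
// Toy model only: these types are hypothetical and are NOT Spark's classes.
object OrderingSketch {
  sealed trait Plan {
    def children: Seq[Plan]
    // Ordering this operator needs from each child (empty = no requirement).
    def requiredChildOrdering: Seq[Seq[String]] = children.map(_ => Seq.empty[String])
    // Ordering this operator guarantees for its own output.
    def outputOrdering: Seq[String] = Seq.empty
  }

  case class Scan(table: String) extends Plan {
    val children: Seq[Plan] = Seq.empty
  }

  case class Sort(keys: Seq[String], child: Plan) extends Plan {
    val children: Seq[Plan] = Seq(child)
    override def outputOrdering: Seq[String] = keys
  }

  // Requires both inputs sorted on the join keys and emits rows in key order.
  case class SortMergeJoin(keys: Seq[String], left: Plan, right: Plan) extends Plan {
    val children: Seq[Plan] = Seq(left, right)
    override def requiredChildOrdering: Seq[Seq[String]] = Seq(keys, keys)
    override def outputOrdering: Seq[String] = keys
  }

  // Assumption for this sketch: no ordering requirement and no ordering guarantee.
  case class ShuffledHashJoin(keys: Seq[String], left: Plan, right: Plan) extends Plan {
    val children: Seq[Plan] = Seq(left, right)
  }

  // A hypothetical sink that needs its input sorted on the given keys.
  case class Write(requiredKeys: Seq[String], child: Plan) extends Plan {
    val children: Seq[Plan] = Seq(child)
    override def requiredChildOrdering: Seq[Seq[String]] = Seq(requiredKeys)
  }

  // An ordering satisfies a requirement if the requirement is a prefix of it.
  def satisfies(actual: Seq[String], required: Seq[String]): Boolean =
    actual.startsWith(required)

  private def withChildren(plan: Plan, cs: Seq[Plan]): Plan = plan match {
    case s: Scan                   => s
    case Sort(k, _)                => Sort(k, cs.head)
    case SortMergeJoin(k, _, _)    => SortMergeJoin(k, cs(0), cs(1))
    case ShuffledHashJoin(k, _, _) => ShuffledHashJoin(k, cs(0), cs(1))
    case Write(k, _)               => Write(k, cs.head)
  }

  // Bottom-up: add a Sort above a child only when its ordering is insufficient.
  def ensureOrdering(plan: Plan): Plan = {
    val fixed = plan.children.map(ensureOrdering).zip(plan.requiredChildOrdering).map {
      case (child, required) =>
        if (satisfies(child.outputOrdering, required)) child else Sort(required, child)
    }
    withChildren(plan, fixed)
  }

  def countSorts(plan: Plan): Int =
    (plan match { case _: Sort => 1; case _ => 0 }) + plan.children.map(countSorts).sum
}
```

In this toy model, `Write(Seq("id"), SortMergeJoin(...))` gets sorts only below the join, because the join's output is already ordered by `id`, while `Write(Seq("id"), ShuffledHashJoin(...))` gets a sort directly under the write. An optimizer-phase rule that cannot see which join was chosen would have to add the top-level sort in both cases and rely on a later pass to remove the redundant one.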
