Chao Sun created SPARK-41413:
--------------------------------

             Summary: Storage-Partitioned Join should avoid shuffle when partition keys mismatch, but join expressions are compatible
                 Key: SPARK-41413
                 URL: https://issues.apache.org/jira/browse/SPARK-41413
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.3.1
            Reporter: Chao Sun

Currently, when checking whether the two sides of a Storage-Partitioned Join are compatible, we require that both the partition expressions and the partition keys be compatible. However, this condition could be relaxed so that only the partition expressions need to be compatible. When the partition keys are not compatible, we can compute a common superset of keys, push that information down to both sides of the join, and use empty partitions for the missing keys.
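
To make the proposal concrete, here is a minimal sketch of the key-alignment idea in plain Python. It is not Spark's implementation: the names `align_partitions` and `join_partitionwise`, the dict-of-lists representation of partitions, and the assumption that the first column of each row is the join column are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

PartitionKey = Tuple[int, ...]        # values of the partition expressions (assumed)
Row = Tuple[int, str]                 # (join column, payload), a hypothetical row shape
Partitions = Dict[PartitionKey, List[Row]]


def align_partitions(left: Partitions, right: Partitions):
    """Regroup both sides over the sorted union (common superset) of their keys.

    A key that one side does not report gets an empty partition on that side,
    so both sides end up with the same keys in the same order.
    """
    common_keys = sorted(set(left) | set(right))
    left_aligned = [(key, left.get(key, [])) for key in common_keys]
    right_aligned = [(key, right.get(key, [])) for key in common_keys]
    return left_aligned, right_aligned


def join_partitionwise(left_aligned, right_aligned):
    """Inner-join the i-th left partition with the i-th right partition.

    Because the partitions are aligned by key, no rows have to move between
    partitions (that is, no shuffle is simulated here).
    """
    joined = []
    for (lkey, lrows), (rkey, rrows) in zip(left_aligned, right_aligned):
        assert lkey == rkey  # guaranteed by align_partitions
        for lrow in lrows:
            for rrow in rrows:
                if lrow[0] == rrow[0]:
                    joined.append((lrow[0], lrow[1], rrow[1]))
    return joined


# The two sides report different partition keys: {1, 2} versus {2, 3}.
left = {(1,): [(1, "a")], (2,): [(2, "b")]}
right = {(2,): [(2, "x")], (3,): [(3, "y")]}

left_aligned, right_aligned = align_partitions(left, right)
result = join_partitionwise(left_aligned, right_aligned)
```

In this sketch an inner join produces nothing for the padded partitions; the empty partitions mainly keep the two sides positionally aligned, and they would carry the unmatched rows' counterparts in an outer join.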