Prakhar Jain created SPARK-33758:
------------------------------------

             Summary: Prune unnecessary output partitioning when the attribute is not part of output.
                 Key: SPARK-33758
                 URL: https://issues.apache.org/jira/browse/SPARK-33758
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.0.1, 3.1.0
            Reporter: Prakhar Jain

Consider the query:

    select t1.id from t1 JOIN t2 on t1.id = t2.id

This query has a top-level Project node that projects only t1.id. However, the outputPartitioning of this Project node is:

    PartitioningCollection(HashPartitioning(t1.id), HashPartitioning(t2.id))

Because t2.id is not part of the Project node's output, we should drop HashPartitioning(t2.id) from the outputPartitioning of the Project node.

cc - [~maropu] [~cloud_fan]
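
The proposed pruning can be illustrated with a small toy model. This is only a sketch, written in plain Python rather than against Spark's Scala classes: the names HashPartitioning, PartitioningCollection and UnknownPartitioning mirror the concepts in the description, but the classes below, the helper prune_output_partitioning, and the choice to fall back to UnknownPartitioning when nothing survives are all assumptions made for illustration, not Spark's actual implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple, Union


@dataclass(frozen=True)
class HashPartitioning:
    """Toy stand-in: rows are hash-distributed on these attribute names."""
    attributes: Tuple[str, ...]


@dataclass(frozen=True)
class PartitioningCollection:
    """Toy stand-in: the data satisfies every partitioning in the collection."""
    partitionings: Tuple["Partitioning", ...]


@dataclass(frozen=True)
class UnknownPartitioning:
    """Toy stand-in: nothing useful is known about the distribution."""


Partitioning = Union[HashPartitioning, PartitioningCollection, UnknownPartitioning]


def _prune(partitioning: Partitioning, output: FrozenSet[str]) -> Optional[Partitioning]:
    """Return the part of `partitioning` expressible with `output`, or None."""
    if isinstance(partitioning, HashPartitioning):
        # Keep a hash partitioning only if every attribute it uses is still output.
        return partitioning if set(partitioning.attributes) <= output else None
    if isinstance(partitioning, PartitioningCollection):
        pruned = (_prune(child, output) for child in partitioning.partitionings)
        kept = tuple(p for p in pruned if p is not None)
        if not kept:
            return None
        # A collection with a single survivor collapses to that survivor.
        return kept[0] if len(kept) == 1 else PartitioningCollection(kept)
    # Any other partitioning is left untouched in this sketch.
    return partitioning


def prune_output_partitioning(partitioning: Partitioning, output: FrozenSet[str]) -> Partitioning:
    """Hypothetical helper: drop partitionings that reference non-output attributes."""
    pruned = _prune(partitioning, output)
    # Assumption: fall back to "unknown" when no partitioning survives.
    return pruned if pruned is not None else UnknownPartitioning()


if __name__ == "__main__":
    # Output partitioning of the join from the example query.
    join_partitioning = PartitioningCollection(
        (HashPartitioning(("t1.id",)), HashPartitioning(("t2.id",)))
    )
    # The Project node outputs only t1.id.
    print(prune_output_partitioning(join_partitioning, frozenset({"t1.id"})))
    # prints: HashPartitioning(attributes=('t1.id',))
```

In this model, a partitioning that references an attribute the Project node no longer outputs is dropped, while a projection that keeps both t1.id and t2.id leaves the collection unchanged.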