Prakhar Jain created SPARK-33758:
------------------------------------

             Summary: Prune unnecessary output partitioning when the attribute is not part of output.
                 Key: SPARK-33758
                 URL: https://issues.apache.org/jira/browse/SPARK-33758
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.0.1, 3.1.0
            Reporter: Prakhar Jain


Consider the query:

select t1.id from t1 JOIN t2 on t1.id = t2.id

This query will have a top-level Project node that projects only t1.id. However,
the outputPartitioning of this Project node will be:

PartitioningCollection(HashPartitioning(t1.id), HashPartitioning(t2.id))

We should drop HashPartitioning(t2.id) from the outputPartitioning of the
Project node, since t2.id is not part of its output.
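
To illustrate the proposed pruning, here is a minimal sketch in plain Python. It is a toy model, not Spark code: the classes below only borrow the names HashPartitioning, PartitioningCollection and UnknownPartitioning, and prune_partitioning is a hypothetical helper invented for this example. Attributes are modelled as plain strings, and details such as the number of partitions are left out.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class HashPartitioning:
    # Toy stand-in: only the names of the partitioning attributes are tracked.
    attrs: Tuple[str, ...]


@dataclass(frozen=True)
class PartitioningCollection:
    # Several partitionings that all hold for the same output.
    partitionings: Tuple["Partitioning", ...]


@dataclass(frozen=True)
class UnknownPartitioning:
    # Nothing is known about how the data is distributed.
    pass


Partitioning = Union[HashPartitioning, PartitioningCollection, UnknownPartitioning]


def prune_partitioning(p: Partitioning, output: FrozenSet[str]) -> Partitioning:
    """Hypothetical helper: drop partitionings that reference attributes
    missing from `output` (the attributes the node actually produces)."""
    if isinstance(p, HashPartitioning):
        # Keep a hash partitioning only if every attribute it uses is output.
        if all(a in output for a in p.attrs):
            return p
        return UnknownPartitioning()
    if isinstance(p, PartitioningCollection):
        # Prune each member, then discard the ones that became unknown.
        pruned = (prune_partitioning(c, output) for c in p.partitionings)
        kept = [q for q in pruned if not isinstance(q, UnknownPartitioning)]
        if not kept:
            return UnknownPartitioning()
        if len(kept) == 1:
            # A collection with a single member collapses to that member.
            return kept[0]
        return PartitioningCollection(tuple(kept))
    return p


# The Project node from the query above outputs only t1.id.
before = PartitioningCollection(
    (HashPartitioning(("t1.id",)), HashPartitioning(("t2.id",)))
)
after = prune_partitioning(before, frozenset({"t1.id"}))
```

In this model, `after` is just the hash partitioning on t1.id. If neither attribute were part of the output, the result would fall back to UnknownPartitioning.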

cc - [~maropu] [~cloud_fan]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
