Ankur Dave created SPARK-11077:
----------------------------------

             Summary: Join elimination in Catalyst
                 Key: SPARK-11077
                 URL: https://issues.apache.org/jira/browse/SPARK-11077
             Project: Spark
          Issue Type: New Feature
          Components: SQL
            Reporter: Ankur Dave
            Assignee: Ankur Dave


Join elimination is a query optimization that removes a join when two 
conditions hold: the join is followed by a projection that keeps columns from 
only one side of the join, and the join columns are known to be unique keys or 
foreign keys. It is especially useful for queries over views and for 
machine-generated queries.
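
To illustrate why the uniqueness condition matters, here is a minimal sketch in 
plain Python (not Spark code). The tables, column names, and helper functions 
are hypothetical and exist only for this example. It shows that a left outer 
join on a unique key, followed by a projection onto left-side columns, returns 
the same rows as the left table alone, and that this stops being true once the 
key is not unique.

```python
# Toy rows; all table and column names are hypothetical.
orders = [
    {"order_id": 1, "customer_id": 10},
    {"order_id": 2, "customer_id": 11},
    {"order_id": 3, "customer_id": None},  # no matching customer
]
customers = [
    {"customer_id": 10, "name": "Ada"},
    {"customer_id": 11, "name": "Grace"},
]


def left_outer_join(left, right, key):
    """Naive nested-loop left outer join on a single key column."""
    out = []
    for l in left:
        matches = [r for r in right if l[key] is not None and r[key] == l[key]]
        if matches:
            out.extend({**r, **l} for r in matches)
        else:
            out.append(dict(l))  # right-side columns would be NULL here
    return out


def project(rows, columns):
    """Keep only the named columns of each row."""
    return [{c: row[c] for c in columns} for row in rows]


left_columns = ["order_id", "customer_id"]

# customers.customer_id is unique, so the join neither adds nor drops rows.
with_join = project(left_outer_join(orders, customers, "customer_id"), left_columns)
without_join = project(orders, left_columns)
assert with_join == without_join

# If the key is NOT unique, the join duplicates left rows and cannot be removed.
duplicated = customers + [{"customer_id": 10, "name": "Ada II"}]
assert len(left_outer_join(orders, duplicated, "customer_id")) == 4
```

Without a uniqueness guarantee the optimizer cannot tell these two cases apart, 
which is why the key information has to be supplied to it.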

Adding join elimination to Catalyst requires (1) support for unique and foreign 
key hints in logical plans, (2) methods in the DataFrame API that let users 
provide these hints, and (3) an optimizer rule that eliminates unique-key outer 
joins and referential-integrity joins when they are followed by an appropriate 
projection.
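
The shape of such a rule can be sketched with a toy logical plan in Python. 
This is not Catalyst's API: the classes `Relation`, `Join`, and `Project`, the 
`unique_keys` hint field, and the function `eliminate_unique_key_outer_join` 
are all hypothetical names made up for this illustration. The sketch assumes 
column names are distinct across both sides of the join, and it covers only the 
unique-key left outer join case, not referential-integrity joins.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Relation:
    name: str
    columns: Tuple[str, ...]
    unique_keys: FrozenSet[str] = frozenset()  # hypothetical uniqueness hint


@dataclass(frozen=True)
class Join:
    left: object
    right: object
    join_type: str  # only "left_outer" is handled by the rule below
    left_key: str
    right_key: str


@dataclass(frozen=True)
class Project:
    columns: Tuple[str, ...]
    child: object


def output_columns(plan):
    """Return the set of column names a plan node produces."""
    if isinstance(plan, (Relation, Project)):
        return set(plan.columns)
    if isinstance(plan, Join):
        return output_columns(plan.left) | output_columns(plan.right)
    raise TypeError(f"unknown plan node: {plan!r}")


def eliminate_unique_key_outer_join(plan):
    """Rewrite Project(Join(left, right)) to Project(left) when it is safe."""
    if not (isinstance(plan, Project) and isinstance(plan.child, Join)):
        return plan
    join = plan.child
    if (
        join.join_type == "left_outer"
        and isinstance(join.right, Relation)
        and join.right_key in join.right.unique_keys  # at most one match per left row
        and set(plan.columns) <= output_columns(join.left)  # only left columns kept
    ):
        return Project(plan.columns, join.left)
    return plan


orders = Relation("orders", ("order_id", "cust_ref"))
customers = Relation(
    "customers", ("customer_id", "name"), unique_keys=frozenset({"customer_id"})
)
join = Join(orders, customers, "left_outer", "cust_ref", "customer_id")

# Projection keeps only left-side columns: the join is removed.
optimized = eliminate_unique_key_outer_join(Project(("order_id",), join))
assert optimized == Project(("order_id",), orders)

# Projection needs a right-side column: the plan is left unchanged.
kept = eliminate_unique_key_outer_join(Project(("order_id", "name"), join))
assert kept == Project(("order_id", "name"), join)
```

In this sketch the hint lives on the relation itself, loosely corresponding to 
requirement (1). How the hints are actually represented and exposed through the 
DataFrame API is left to the linked proposal.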

This proposal is described in detail here: 
https://docs.google.com/document/d/1-YgQSQywHfAo4PhAT-zOOkFZtVcju99h3dYQq-i9GWQ/edit?usp=sharing



