alamb opened a new issue #1670:
URL: https://github.com/apache/arrow-datafusion/issues/1670


   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   
   LEFT OUTER, RIGHT OUTER, and FULL OUTER JOINs are often more expensive to 
evaluate than INNER JOINs, and they preclude other optimizations such as 
predicate pushdown (as can be seen in #1618).
   
   As such, sophisticated optimizers rewrite OUTER joins to INNER joins when 
the predicates of the query allow it, which improves performance.
   
   
   **Describe the solution you'd like**
   
   Add an optimizer pass that will attempt to convert OUTER joins to INNER 
joins.
   
   This will require some non-trivial research to figure out under what 
conditions the joins can be rewritten / converted.
   
   **Additional context**
   Relevant discussion: 
https://github.com/apache/arrow-datafusion/pull/1618#discussion_r790020079
   
   
   You can see a version of this code in Spark here: 
https://github.com/apache/spark/blob/aaf0e5e71509a2324e110e45366b753c7926c64b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala#L119-L135
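
As a rough illustration of the kind of rule involved, here is a minimal sketch in Rust. It assumes the conversion is driven by "null-rejecting" predicates: a filter above the join that can never be true when every column from one side is NULL. The `JoinType` enum and the `simplify_join` function are hypothetical names made up for this sketch, not DataFusion's API, and deciding whether a predicate really is null-rejecting (the hard part mentioned above) is left out entirely.

```rust
/// Join types relevant to this sketch (hypothetical enum, not DataFusion's own type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum JoinType {
    Inner,
    Left,
    Right,
    Full,
}

/// Returns the cheapest join type that is equivalent to `join_type`,
/// given which sides are filtered by a null-rejecting predicate above the join.
///
/// `left_null_rejected` / `right_null_rejected` are assumed to be computed
/// elsewhere by analysing the filter expressions.
fn simplify_join(
    join_type: JoinType,
    left_null_rejected: bool,
    right_null_rejected: bool,
) -> JoinType {
    match (join_type, left_null_rejected, right_null_rejected) {
        // A LEFT join pads unmatched left rows with NULLs on the right;
        // if the filter rejects those rows, only matched rows survive.
        (JoinType::Left, _, true) => JoinType::Inner,
        // Mirror image for RIGHT joins.
        (JoinType::Right, true, _) => JoinType::Inner,
        // Both kinds of padded rows are rejected.
        (JoinType::Full, true, true) => JoinType::Inner,
        // Rows padded with NULLs on the left (unmatched right rows) are rejected,
        // so only matched rows and unmatched left rows remain.
        (JoinType::Full, true, false) => JoinType::Left,
        // Mirror image of the previous case.
        (JoinType::Full, false, true) => JoinType::Right,
        // No applicable null-rejecting predicate: keep the join as-is.
        (other, _, _) => other,
    }
}
```

For example, in `SELECT * FROM a LEFT JOIN b ON a.id = b.id WHERE b.x > 5`, the filter `b.x > 5` is never true when `b.x` is NULL, so `simplify_join(JoinType::Left, false, true)` would return `JoinType::Inner`.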
   

