alamb opened a new issue #1670: URL: https://github.com/apache/arrow-datafusion/issues/1670
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

LEFT OUTER, RIGHT OUTER, and FULL OUTER JOINs are often more expensive to evaluate and preclude other optimizations (such as pushing down predicates, as can be seen in #1618). As such, sophisticated optimizers rewrite OUTER joins to INNER joins, depending on the predicates of the query, to improve performance.

**Describe the solution you'd like**

Add an OptimizerPass that will attempt to convert OUTER joins to INNER joins. This will require some non-trivial research to figure out under what conditions the joins can be rewritten / converted.

**Additional context**

Relevant discussion: https://github.com/apache/arrow-datafusion/pull/1618#discussion_r790020079

You can see a version of this code in Spark here: https://github.com/apache/spark/blob/aaf0e5e71509a2324e110e45366b753c7926c64b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala#L119-L135
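
To make the idea concrete, here is a minimal sketch of the join-type decision such a pass might make. It assumes the commonly used null-rejection condition: if a filter above the join can never evaluate to true when all columns from one input are NULL, the NULL-padded rows produced for that input cannot survive the filter, so the join does not need to produce them. For example, in `SELECT * FROM a LEFT JOIN b ON a.id = b.id WHERE b.x > 5`, the predicate `b.x > 5` rejects the NULL-padded `b` rows, so the join could be evaluated as an INNER join. The `JoinType` enum and `rewrite_join_type` function below are hypothetical toy names for illustration, not DataFusion's API, and the two boolean flags are assumed to be computed elsewhere by analyzing the filter predicates (which is the part that needs the research mentioned above).

```rust
/// Toy join type for illustration only (hypothetical, not DataFusion's own enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JoinType {
    Inner,
    Left,
    Right,
    Full,
}

/// Decide which join type is sufficient, given what the filter above the join rejects.
///
/// Assumptions (both flags must be derived elsewhere from the filter predicates):
/// - `left_null_rejected`: the filter can never be true when every left column is NULL.
/// - `right_null_rejected`: the filter can never be true when every right column is NULL.
pub fn rewrite_join_type(
    join_type: JoinType,
    left_null_rejected: bool,
    right_null_rejected: bool,
) -> JoinType {
    match (join_type, left_null_rejected, right_null_rejected) {
        // A LEFT join pads unmatched left rows with NULLs on the right side;
        // if the filter rejects those rows, the padding is never observable.
        (JoinType::Left, _, true) => JoinType::Inner,
        // Symmetric case: a RIGHT join pads the left side with NULLs.
        (JoinType::Right, true, _) => JoinType::Inner,
        // A FULL join pads both sides, so each flag removes one kind of padded row.
        (JoinType::Full, true, true) => JoinType::Inner,
        (JoinType::Full, true, false) => JoinType::Left,
        (JoinType::Full, false, true) => JoinType::Right,
        // Otherwise the join type must stay as written.
        (other, _, _) => other,
    }
}
```

With these definitions, `rewrite_join_type(JoinType::Left, false, true)` yields `JoinType::Inner`, while a LEFT join whose filter only constrains left-side columns is left unchanged. The match arms follow the same case analysis as the Spark rule linked above; a real pass would also need to rebuild the plan node and decide which predicates count as null-rejecting.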
