Re: [I] Convert inner joins to semi joins [datafusion]

via GitHub Fri, 29 May 2026 11:02:44 -0700


neilconway commented on issue #22594:
URL: https://github.com/apache/datafusion/issues/22594#issuecomment-4578273241


   Working on this, there are some shortcomings we'll probably need to resign 
ourselves to in the initial implementation:
   
   * We don't currently compute the equivalence relation that is implied by the 
predicates in the WHERE clause. So for example, `SELECT a.x, max(b.y) FROM a, b 
WHERE a.y = b.y GROUP BY a.x;` _should_ be optimizable, but we don't currently 
realize that `b.y` is equivalent to `a.y`, so we can't conclude that `b` 
contributes no columns to the parent plan.
   * We don't currently identify which aggregates are duplicate-insensitive. So 
even if the equivalence were known, we would _still_ fail to optimize the query 
above, because we don't recognize that `max` is duplicate-insensitive (its 
result does not change when input rows are repeated).
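
   To make the two points concrete, here is a minimal sketch using Python's 
built-in `sqlite3` rather than DataFusion. The table contents are invented for 
illustration, and the `EXISTS` query is a hand-written stand-in for the semi 
join we would like the optimizer to produce; it is not output from any actual 
rewrite rule. It checks that `max` gives the same answer under both plans, 
while `sum` (which is duplicate-sensitive) does not.

```python
import sqlite3

# In-memory database with made-up rows; b contains duplicate join keys
# so that the inner join repeats rows of a.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (x INTEGER, y INTEGER);
    CREATE TABLE b (y INTEGER);
    INSERT INTO a VALUES (1, 10), (1, 20), (2, 30), (3, 40);
    INSERT INTO b VALUES (10), (10), (20), (30), (30), (50);
    """
)


def run(sql):
    """Run a query and return its rows in a deterministic order."""
    return sorted(conn.execute(sql).fetchall())


# Hand-written semi-join form (assumption: this is the shape we are after).
SEMI = "FROM a WHERE EXISTS (SELECT 1 FROM b WHERE b.y = a.y) GROUP BY a.x"

# The query from the comment, as an inner join.
inner_max = run("SELECT a.x, max(b.y) FROM a, b WHERE a.y = b.y GROUP BY a.x")
# Same query with b.y replaced by the equivalent a.y, and the join as a semi join.
semi_max = run(f"SELECT a.x, max(a.y) {SEMI}")

# A duplicate-sensitive aggregate, for contrast.
inner_sum = run("SELECT a.x, sum(b.y) FROM a, b WHERE a.y = b.y GROUP BY a.x")
semi_sum = run(f"SELECT a.x, sum(a.y) {SEMI}")

print("max:", inner_max, semi_max)
print("sum:", inner_sum, semi_sum)
```

   With this data the two `max` results agree, but the two `sum` results 
differ because the inner join counts each matching row of `b` separately. That 
is why the rewrite needs both pieces of information: the `a.y = b.y` 
equivalence and the duplicate-insensitivity of the aggregate.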



