mingmwang commented on PR #6457: URL: https://github.com/apache/arrow-datafusion/pull/6457#issuecomment-1572297060
> Why we don't implement the unnesting arbitrary subquery paper? I think it's state of art.🤔

This PR, and the earlier PRs I implemented or refactored, still belong to the simple unnesting method. They cover the predicate subquery (IN/EXISTS) and scalar subquery cases in which the correlated expressions can be pulled up and the correlation can be converted to outer joins or semi/anti joins. Other, more complex cases can be decorrelated using the methods described in the unnesting arbitrary subquery paper. I will try to implement it later this year.

The reason not to implement the unnesting arbitrary subquery paper directly is that its method might introduce additional joins compared to the simple unnesting method. The additional join comes from joining the inner table with the distinct set (magic set). You can play with the `Hyper` web interface (Hyper implements this paper) and check the plan.
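To make the difference concrete, here is a minimal sketch of the two rewrites for a correlated scalar subquery. It runs in SQLite through Python only to check that the three forms return the same rows; it does not show the plans DataFusion or Hyper actually produce. The `customers` and `orders` tables, their columns, and the data are hypothetical.

```python
import sqlite3

# Hypothetical schema and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (c_id INTEGER PRIMARY KEY);
    CREATE TABLE orders (o_id INTEGER PRIMARY KEY, cust_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1), (2), (3);
    INSERT INTO orders VALUES (1, 1, 10), (2, 1, 5), (3, 2, 7);
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Original form: scalar subquery correlated on c.c_id.
correlated = run("""
    SELECT c.c_id,
           (SELECT SUM(o.amount) FROM orders o WHERE o.cust_id = c.c_id) AS total
    FROM customers c
    ORDER BY c.c_id
""")

# Simple unnesting: pull the correlated predicate up into a left outer join
# over the inner table grouped by the correlation column.
simple = run("""
    SELECT c.c_id, t.total
    FROM customers c
    LEFT JOIN (SELECT cust_id, SUM(amount) AS total
               FROM orders
               GROUP BY cust_id) t
           ON t.cust_id = c.c_id
    ORDER BY c.c_id
""")

# General unnesting (sketch): first build the distinct set of correlation
# values (the "magic set"), join the inner table with it, then join the
# result back to the outer table. Note the extra join against `d`.
magic = run("""
    SELECT c.c_id, t.total
    FROM customers c
    LEFT JOIN (SELECT d.c_id AS c_id, SUM(o.amount) AS total
               FROM (SELECT DISTINCT c_id FROM customers) d
               JOIN orders o ON o.cust_id = d.c_id
               GROUP BY d.c_id) t
           ON t.c_id = c.c_id
    ORDER BY c.c_id
""")

print(correlated)
print(simple == correlated, magic == correlated)
```

In this simple case both rewrites give the same result, but the magic-set form needs one more join (the inner table against the distinct set), which is the overhead mentioned above.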
