avantgardnerio commented on issue #5808: URL: https://github.com/apache/arrow-datafusion/issues/5808#issuecomment-1492124630
> take a look if you have time

Unfortunately my availability is low right now. If @mingmwang's claim is correct (which I have no reason to doubt) that:

```
SELECT t1.id, t1.name FROM t1 WHERE t1.id in (SELECT t2.id FROM t2 where t1.name = t2.name limit 10)
```

> can not be de-correlated

then I think we'll need to have the ability to execute plans even if this rule fails (i.e. nested loop execution).

I don't think I ever intended it to decorrelate _all_ subqueries - it was designed to hit the 80% case and get TPC-H working. At the time, returning an error was considered the proper thing to do. The API changed, so now the rule needs to be updated to plumb `Ok(None)` down through all the layers of recursion, which can be verbose and non-trivial.

My recommendation at the time (which I would still assert) is that it would make the life of optimizer rule authors considerably simpler if we add a `DataFusionError::CanNotOptimize` error and simply return that in this case. It would get treated the same as `Ok(None)`, so it keeps the code readable and simplifies plumbing.
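
To illustrate the `CanNotOptimize` idea, here is a minimal, self-contained sketch. It does not use the real DataFusion API: `Plan`, `RuleError`, `decorrelate`, and `try_optimize` are hypothetical stand-ins invented for this example, and `RuleError::CanNotOptimize` only mirrors the proposed `DataFusionError::CanNotOptimize` variant. The point is that the recursive helper can bail out with `?` from any depth, and a single boundary function turns that sentinel into `Ok(None)`.

```rust
/// Hypothetical stand-in for a logical plan (not DataFusion's `LogicalPlan`).
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    /// A correlated subquery; `has_limit` marks the unsupported shape.
    Subquery { input: Box<Plan>, has_limit: bool },
    /// The decorrelated form of a subquery.
    Join(Box<Plan>),
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum RuleError {
    /// Mirrors the proposed `DataFusionError::CanNotOptimize`:
    /// "this rule does not apply", not a real failure.
    CanNotOptimize(String),
    /// A genuine failure that must still reach the caller.
    #[allow(dead_code)]
    Internal(String),
}

/// Recursive helper. It returns a plain `Result<Plan, _>`, so nested calls
/// propagate "cannot optimize" with `?` instead of threading `Option`.
fn decorrelate(plan: &Plan) -> Result<Plan, RuleError> {
    match plan {
        Plan::Scan(_) => Ok(plan.clone()),
        Plan::Join(input) => Ok(Plan::Join(Box::new(decorrelate(input)?))),
        Plan::Subquery { has_limit: true, .. } => Err(RuleError::CanNotOptimize(
            "correlated subquery with LIMIT".to_string(),
        )),
        Plan::Subquery { input, has_limit: false } => {
            Ok(Plan::Join(Box::new(decorrelate(input)?)))
        }
    }
}

/// Boundary function: the only place where the sentinel becomes `Ok(None)`.
/// Every other error is passed through unchanged.
fn try_optimize(plan: &Plan) -> Result<Option<Plan>, RuleError> {
    match decorrelate(plan) {
        Ok(new_plan) => Ok(Some(new_plan)),
        Err(RuleError::CanNotOptimize(_)) => Ok(None),
        Err(other) => Err(other),
    }
}
```

With this shape, calling `try_optimize` on a plan that contains a limited subquery at any depth yields `Ok(None)`, so the caller keeps the original plan, while real errors still surface as `Err`.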
