avantgardnerio commented on issue #5808:
URL: https://github.com/apache/arrow-datafusion/issues/5808#issuecomment-1492124630

   > take a look if you have time
   
   Unfortunately my availability is low right now. If @mingmwang's claim is 
correct (which I have no reason to doubt) that:
   
   ```
   SELECT t1.id, t1.name
   FROM t1
   WHERE t1.id IN (SELECT t2.id FROM t2 WHERE t1.name = t2.name LIMIT 10)
   ```
   
   > can not be de-correlated
   
   then I think we'll need the ability to execute plans even when this rule 
fails (i.e. nested loop execution). I don't think I ever intended it to 
decorrelate _all_ subqueries; it was designed to hit the 80% case and get 
TPC-H working.
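
To make "nested loop execution" concrete, here is a minimal sketch of evaluating the query above by re-running the subquery once per outer row. It is not DataFusion code: the `Row` type and the `nested_loop_in_subquery` function are hypothetical names made up for illustration, and both tables are assumed to have just the `id` and `name` columns. Since the SQL has no `ORDER BY`, the sketch simply takes the first matching rows in scan order.

```rust
/// Hypothetical row type standing in for both `t1` and `t2`
/// (assumed columns: `id`, `name`).
#[derive(Debug, Clone, PartialEq)]
struct Row {
    id: i64,
    name: String,
}

/// Evaluates
///   SELECT t1.id, t1.name FROM t1
///   WHERE t1.id IN (SELECT t2.id FROM t2 WHERE t1.name = t2.name LIMIT <limit>)
/// with a plain nested loop: the inner scan is repeated for every outer row.
fn nested_loop_in_subquery(t1: &[Row], t2: &[Row], limit: usize) -> Vec<Row> {
    t1.iter()
        .filter(|outer| {
            t2.iter()
                // Correlated predicate: t1.name = t2.name
                .filter(|inner| inner.name == outer.name)
                // LIMIT applies to the subquery result for *this* outer row
                .take(limit)
                // IN: is the outer id among the (limited) inner ids?
                .any(|inner| inner.id == outer.id)
        })
        .cloned()
        .collect()
}
```

Note that the `take(limit)` happens separately for each outer row, so the result can change with the limit: with two `t2` rows sharing a name, a limit of 1 only ever sees the first of them.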
   
   At the time, returning an error was considered the proper thing to do. The 
API has since changed, so the rule now needs to be updated to plumb `Ok(None)` 
down through all the layers of recursion, which can be verbose and non-trivial.
   
   My recommendation at the time (which I would still assert) is that it would 
make the life of optimizer rule authors considerably simpler if we added a 
`DataFusionError::CanNotOptimize` error and simply returned that in this case. 
It would be treated the same as `Ok(None)`, which keeps the code readable and 
simplifies the plumbing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
