duongcongtoai commented on issue #5492: URL: https://github.com/apache/datafusion/issues/5492#issuecomment-2903387737
@logan-keede I expect to implement an optimizer that does 2 things (split into 2 PRs):

- translate all subqueries (recursion-aware) into dependent joins
- decorrelate the dependent joins into `DelimJoin` and `DelimGet`

In the first PR I don't expect to integrate this optimizer into the main flow yet; its behavior is tested through in-code tests instead of sqllogictests.

The second PR is where all the integration happens: we can integrate this optimizer into the main branch and test it with sqllogictest. But as you mention, a lot of sqllogictests will break if we completely deprecate the following rules:

```
Arc::new(DecorrelatePredicateSubquery::new()),
Arc::new(ScalarSubqueryToJoin::new())
```

So I think we should divide the 2nd phase into 2 sub-phases:

**2nd phase**

Let the 3 optimizers coexist temporarily, such as:

```
impl Optimizer {
    /// Create a new optimizer using the recommended list of rules
    pub fn new() -> Self {
        ...
        Arc::new(DecorrelatePredicateSubquery::new()),
        Arc::new(ScalarSubqueryToJoin::new()),
        Arc::new(GeneralSubqueryDecorrelation::new()), // <-- newly added
```

If a query cannot be decorrelated by the previous 2 optimizers, the new optimizer comes into play. In this phase a lot of work can be done in parallel to support different complex use cases, as support for more and more complex subqueries is added (`DelimGetRemoval` can also be implemented in this phase).

**3rd phase**

Deprecate the old optimizers.
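
To illustrate the fallback ordering proposed for the 2nd and 3rd phases, here is a minimal toy sketch in plain Rust. It does not use DataFusion's API: `Plan`, `Rule`, `ExistingRule`, `GeneralRule`, `optimize`, `phase2_rules`, and `phase3_rules` are all hypothetical names invented for this example, and the `simple` flag is just an assumed stand-in for "a shape the existing rules can already decorrelate".

```rust
// Toy model only: every type here is hypothetical and is NOT DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// A correlated subquery that has not been rewritten yet.
    /// `simple` marks shapes the existing rules are assumed to handle.
    CorrelatedSubquery { simple: bool },
    /// The subquery after being rewritten into a join; `by` records which rule did it.
    Join { by: &'static str },
}

trait Rule {
    /// Returns Some(new_plan) if the rule rewrote the plan, None if it does not apply.
    fn rewrite(&self, plan: &Plan) -> Option<Plan>;
}

/// Stands in for the existing rules (DecorrelatePredicateSubquery, ScalarSubqueryToJoin).
struct ExistingRule;

/// Stands in for the proposed GeneralSubqueryDecorrelation rule.
struct GeneralRule;

impl Rule for ExistingRule {
    fn rewrite(&self, plan: &Plan) -> Option<Plan> {
        match plan {
            // Assumption: the existing rules only handle "simple" subqueries.
            Plan::CorrelatedSubquery { simple: true } => Some(Plan::Join { by: "existing" }),
            _ => None,
        }
    }
}

impl Rule for GeneralRule {
    fn rewrite(&self, plan: &Plan) -> Option<Plan> {
        match plan {
            // Handles any subquery that is still present when this rule runs.
            Plan::CorrelatedSubquery { .. } => Some(Plan::Join { by: "general" }),
            _ => None,
        }
    }
}

/// Applies each rule once, in order; rules that do not apply leave the plan unchanged.
fn optimize(rules: &[Box<dyn Rule>], mut plan: Plan) -> Plan {
    for rule in rules {
        if let Some(next) = rule.rewrite(&plan) {
            plan = next;
        }
    }
    plan
}

/// 2nd phase: existing rules run first, the general rule picks up what is left.
fn phase2_rules() -> Vec<Box<dyn Rule>> {
    vec![Box::new(ExistingRule), Box::new(GeneralRule)]
}

/// 3rd phase: the old rules are removed and the general rule handles everything.
fn phase3_rules() -> Vec<Box<dyn Rule>> {
    vec![Box::new(GeneralRule)]
}
```

With `phase2_rules()`, a simple subquery is rewritten by the existing rule and the general rule then finds nothing left to do, so plans (and tests) that already work keep their current shape, while a complex subquery falls through to the general rule. With `phase3_rules()`, every subquery goes through the general rule, which is the point at which existing test expectations may change.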