duongcongtoai commented on issue #5492:
URL: https://github.com/apache/datafusion/issues/5492#issuecomment-2903387737
@logan-keede I plan to implement an optimizer that does 2 things (split
into 2 PRs):
- translate all subqueries (recursion-aware) into dependent joins
- decorrelate the dependent joins into `DelimJoin` and `DelimGet`
In the first PR I don't expect to integrate this optimizer into the main flow
yet; its behavior is tested through in-code tests instead of sqllogictests.
The second PR is where all the integration happens: we can integrate this
optimizer into the main branch and test it with sqllogictests.
But as you mentioned, a lot of sqllogictests will break if we completely
deprecate the following:
```
Arc::new(DecorrelatePredicateSubquery::new()),
Arc::new(ScalarSubqueryToJoin::new())
```
So I think we should divide the 2nd phase into 2 more subphases:
**2nd phase**
Let the 3 optimizers exist side by side temporarily, such as:
```
impl Optimizer {
    /// Create a new optimizer using the recommended list of rules
    pub fn new() -> Self {
        ...
        Arc::new(DecorrelatePredicateSubquery::new()),
        Arc::new(ScalarSubqueryToJoin::new()),
        Arc::new(GeneralSubqueryDecorrelation::new()), // <-- newly added
```
If a query cannot be decorrelated by the previous 2 optimizers,
the new optimizer will come into play.
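To make the fallback ordering concrete, here is a minimal toy sketch. It does not use DataFusion's real API: `Plan`, `Rule`, `ExistingRule`, `GeneralRule`, and `optimize` are hypothetical names made up for illustration, and the `simple` flag is just an assumed stand-in for "the existing rules know how to handle this subquery".

```rust
// Toy model only: every type here is hypothetical and NOT DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// A correlated subquery that still needs decorrelation.
    /// `simple` is an assumed marker for "the existing rules can handle it".
    Subquery { simple: bool },
    /// Output of one of the existing rules.
    Join,
    /// Output of the new general rule.
    DependentJoin,
}

trait Rule {
    /// Returns the rewritten plan, or the input unchanged if the rule does not apply.
    fn rewrite(&self, plan: Plan) -> Plan;
}

/// Stands in for the two existing rules: only handles simple subqueries.
struct ExistingRule;

/// Stands in for the new general rule: handles any subquery still left.
struct GeneralRule;

impl Rule for ExistingRule {
    fn rewrite(&self, plan: Plan) -> Plan {
        match plan {
            Plan::Subquery { simple: true } => Plan::Join,
            other => other,
        }
    }
}

impl Rule for GeneralRule {
    fn rewrite(&self, plan: Plan) -> Plan {
        match plan {
            Plan::Subquery { .. } => Plan::DependentJoin,
            other => other,
        }
    }
}

/// Applies the rules in list order, so later rules only see what earlier ones left.
fn optimize(rules: &[Box<dyn Rule>], plan: Plan) -> Plan {
    rules.iter().fold(plan, |p, r| r.rewrite(p))
}

fn main() {
    // The general rule is registered last, so it acts purely as a fallback.
    let rules: Vec<Box<dyn Rule>> = vec![Box::new(ExistingRule), Box::new(GeneralRule)];
    println!("{:?}", optimize(&rules, Plan::Subquery { simple: true }));
    println!("{:?}", optimize(&rules, Plan::Subquery { simple: false }));
}
```

Because the new rule sits last in the list, plans that the existing rules already rewrite never reach it, which is why existing sqllogictests should keep their current plans in this phase.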
In this phase a lot of work can be done in parallel to support different
complex use cases, as more and more complex subqueries become supported
(`DelimGetRemoval` can also be implemented in this phase).
**3rd phase**
Deprecate the old optimizers.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]