irenjj commented on issue #5492:
URL: https://github.com/apache/datafusion/issues/5492#issuecomment-2894078258

   > Thank you everyone for your opinions. Looks like my implementation is 
trying to wrap everything inside a single optimizer, which is hard to follow 
and reduces space for collaborative work, and since we can use duckdb's 
implementation details, the next steps can be as follows:
   > 
   > * Define a new `DelimGet` LogicalPlan type and implement existing methods 
for a standard LogicalPlan. For `DelimJoin` we can reuse existing 
`LogicalPlan::Join` with a new join_type, because its purpose is only to detect 
if a join comes from a dependentJoin or not
   > * Rewrite the existing subqueries into 2 operators `DelimJoin` and 
`DelimGet` (either at planning stage or optimizer stage, duckdb does this at 
planning stage)
   > * Decorrelation (duckdb does this at planning stage)
   > * DelimGet removal
   
   Thanks @duongcongtoai! This is a very good idea. I also think we can start 
with simple unnest. We may need to introduce some DuckDB structures: new 
logical plan/expr nodes (`DelimScan`, ...) and some new structures 
(`delim_offset`, `has_correlated_expressions`, ...). By trying to refactor 
simple unnest first, we can discover some limitations of DataFusion early and 
adjust our follow-up plans in a timely manner.
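
To make the quoted steps a bit more concrete, here is a minimal toy sketch in Rust of how a join marked as coming from a dependent join and a `DelimGet` node might sit in a plan tree. Everything in it is an assumption for illustration: `Plan`, `JoinKind`, `to_delim_join`, `is_delim_join` and `count_delim_gets` are hypothetical names, not DataFusion's `LogicalPlan` API or DuckDB's operators, and the plan shape is deliberately simplified.

```rust
// Toy model only: these types are hypothetical and are NOT DataFusion's
// `LogicalPlan` or DuckDB's operators.

#[derive(Debug, Clone, PartialEq)]
enum JoinKind {
    Inner,
    /// Hypothetical marker: this join originated from a dependent join.
    Delim,
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String },
    /// Hypothetical `DelimGet`: stands in for the correlated columns
    /// supplied by the outer side of the enclosing delim join.
    DelimGet { columns: Vec<String> },
    Join { kind: JoinKind, left: Box<Plan>, right: Box<Plan> },
}

/// Wrap a correlated subquery as a delim join (assumed, simplified shape):
/// the outer plan goes on the left, and the subquery is paired with a
/// `DelimGet` on the right.
fn to_delim_join(outer: Plan, subquery: Plan, correlated: Vec<String>) -> Plan {
    let inner = Plan::Join {
        kind: JoinKind::Inner,
        left: Box::new(Plan::DelimGet { columns: correlated }),
        right: Box::new(subquery),
    };
    Plan::Join {
        kind: JoinKind::Delim,
        left: Box::new(outer),
        right: Box::new(inner),
    }
}

/// True if this node is a join marked as coming from a dependent join.
fn is_delim_join(plan: &Plan) -> bool {
    matches!(plan, Plan::Join { kind: JoinKind::Delim, .. })
}

/// Count the `DelimGet` nodes still present in the tree. A later removal
/// pass would try to reduce this number where that is valid.
fn count_delim_gets(plan: &Plan) -> usize {
    match plan {
        Plan::Scan { .. } => 0,
        Plan::DelimGet { .. } => 1,
        Plan::Join { left, right, .. } => count_delim_gets(left) + count_delim_gets(right),
    }
}
```

In this sketch, calling `to_delim_join` on two scans with one correlated column yields a tree whose root satisfies `is_delim_join` and which contains a single `DelimGet`. The real rewrite, decorrelation and removal passes would be considerably more involved.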
   cc @alamb @jayzhan211 @suibianwanwank @xudong963 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

