alamb commented on issue #10048:
URL: https://github.com/apache/datafusion/issues/10048#issuecomment-2541729713

   There is some more discussion about this here: 
https://github.com/apache/datafusion/issues/13659#issuecomment-2541722186
   
   Summarizing: Postgres appears to plan this type of query as a NestedLoops 
join:
   
   ```sql
   postgres=# create table unnest_test (c1 int[]);
   CREATE TABLE
   postgres=# insert into unnest_test values (ARRAY [1,2,3]);
   INSERT 0 1
   postgres=# select * from unnest_test u, unnest(u.c1);
      c1    | unnest 
   ---------+--------
    {1,2,3} |      1
    {1,2,3} |      2
    {1,2,3} |      3
   (3 rows)
   ```
   
   ```sql
    Nested Loop  (cost=0.00..295.60 rows=13600 width=36)
      Output: u.c1, unnest.unnest
      ->  Seq Scan on public.unnest_test u  (cost=0.00..23.60 rows=1360 
width=32)
            Output: u.c1
      ->  Function Scan on pg_catalog.unnest  (cost=0.00..0.10 rows=10 width=4)
            Output: unnest.unnest
            Function Call: unnest(u.c1)
   ```
   
   🤔 
   
   I think the postgres join operator re-evaluates the right input 
(`pg_catalog.unnest`) on each input row to the join which is how the lateral 
joins can refer to the outer query
   
   The current DataFusion join operators don't run the input over and over 
again -- they normally run each input and then combine them (they don't restart 
inputs)
   
   So in a query like
   ```sql
   select * from t1, unnest(t1.c1)
   ```
   
   I think we would need a (new) type of JoinExec that actually does run the 
outer query (aka `unnest(t1.c1)` for each input row of `t1`) 🤔 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to