houqp opened a new issue #1293:
URL: https://github.com/apache/arrow-datafusion/issues/1293


   **Describe the bug**
   
   Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.
   
   tpch query 8 is taking a long time to execute in ballista because one of its 
table joins in the subquery is planned as a cross join.
   
   **To Reproduce**
   
   ```
   ❯ create table part as select 1 as p_partkey;                                
                                                                                
                                
   0 rows in set. Query took 0.003 seconds.                                     
                 
   ❯ create table lineitem as select 1 as l_partkey, 2 as l_suppkey;            
                 
   0 rows in set. Query took 0.005 seconds.                                     
                 
   ❯ create table supplier as select 1 as s_suppkey;                            
                 
   0 rows in set. Query took 0.002 seconds.                                     
                                                           
   ❯ explain select * from part, supplier, lineitem where p_partkey = l_partkey 
and s_suppkey = l_suppkey;                                                      
                                
   
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   --------------------------------------------------+                       
   | plan_type     | plan                                                       
                                                                                
                                                                                
  |               
   
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
   --------------------------------------------------+  
   | logical_plan  | Projection: #part.p_partkey, #supplier.s_suppkey, 
#lineitem.l_partkey, #lineitem.l_suppkey                                        
                                         
                                                     | 
   |               |   Join: #part.p_partkey = #lineitem.l_partkey, 
#supplier.s_suppkey = #lineitem.l_suppkey                                       
                                            
                                                     |                          
                 
   |               |     CrossJoin:                                             
                                                                                
                                                                                
  |                          
   |               |       TableScan: part projection=Some([0])                 
                 
                                                     |                          
                                                                                
                                
   |               |       TableScan: supplier projection=Some([0])             
                                                                                
                                
                                                     |                          
                                                                                
                                
   |               |     TableScan: lineitem projection=Some([0, 1])      
   ```
   
   **Expected behavior**
   
   The query should be planned into the following join statements:
   
   ```
   ❯ explain select * from part join lineitem on l_partkey = p_partkey join 
supplier on s_suppkey = l_suppkey;
   
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
   | plan_type     | plan                                                       
                                                                                
        |
   
+---------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
   | logical_plan  | Projection: #part.p_partkey, #lineitem.l_partkey, 
#lineitem.l_suppkey, #supplier.s_suppkey                                        
                 |
   |               |   Join: #lineitem.l_suppkey = #supplier.s_suppkey          
                                                                                
        |
   |               |     Join: #part.p_partkey = #lineitem.l_partkey            
                                                                                
        |
   |               |       TableScan: part projection=Some([0])                 
                                                                                
        |
   |               |       TableScan: lineitem projection=Some([0, 1])          
                                                                                
        |
   |               |     TableScan: supplier projection=Some([0])      
   ```
   
   **Additional context**
   
   Recreated from https://github.com/apache/arrow-datafusion/issues/892
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to