DamonZhao-sfu commented on issue #588:
URL: 
https://github.com/apache/datafusion-comet/issues/588#issuecomment-2240209375

   > > Hi @DamonZhao-sfu. For query 72, are you enabling CBO in Spark or using any form of join reordering or are you using the official version of the query that joins catalog_sales to inventory first? I am asking because your times for q72 (in both Spark and Comet) are faster than I am seeing locally.
   > 
   > No, I did not enable CBO or join reordering. I'm using the official version. Here's my SQL:
   > 
   > ```
   > select  i_item_desc
   >       ,w_warehouse_name
   >       ,d1.d_week_seq
   >       ,sum(case when p_promo_sk is null then 1 else 0 end) no_promo
   >       ,sum(case when p_promo_sk is not null then 1 else 0 end) promo
   >       ,count(*) total_cnt
   > from catalog_sales
   > join inventory on (cs_item_sk = inv_item_sk)
   > join warehouse on (w_warehouse_sk=inv_warehouse_sk)
   > join item on (i_item_sk = cs_item_sk)
   > join customer_demographics on (cs_bill_cdemo_sk = cd_demo_sk)
   > join household_demographics on (cs_bill_hdemo_sk = hd_demo_sk)
   > join date_dim d1 on (cs_sold_date_sk = d1.d_date_sk)
   > join date_dim d2 on (inv_date_sk = d2.d_date_sk)
   > join date_dim d3 on (cs_ship_date_sk = d3.d_date_sk)
   > left outer join promotion on (cs_promo_sk=p_promo_sk)
   > left outer join catalog_returns on (cr_item_sk = cs_item_sk and cr_order_number = cs_order_number)
   > where d1.d_week_seq = d2.d_week_seq
   >   and inv_quantity_on_hand < cs_quantity 
   >   and d3.d_date > d1.d_date + 5
   >   and hd_buy_potential = '1001-5000'
   >   and d1.d_year = 2001
   >   and cd_marital_status = 'M'
   > group by i_item_desc,w_warehouse_name,d1.d_week_seq
   > order by total_cnt desc, i_item_desc, w_warehouse_name, d_week_seq
   >  LIMIT 100 ;
   > ```
   
   @andygrove could you provide the version of q72 with the optimized join order? Thanks!
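   
   For reference, a minimal sketch of how the two settings asked about above can be inspected in a Spark SQL session. It assumes the standard Spark configuration keys `spark.sql.cbo.enabled` and `spark.sql.cbo.joinReorder.enabled`; it is not taken from the benchmark setup used in this thread.
   
   ```
   -- Show the current value of each setting (both default to false in Spark)
   SET spark.sql.cbo.enabled;
   SET spark.sql.cbo.joinReorder.enabled;
   ```
   
   With both left at false, Spark does not apply cost-based join reordering.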


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

