comphead commented on issue #901:
URL:
https://github.com/apache/datafusion-comet/issues/901#issuecomment-2332866179
Im running slightly changed Q72 in DF
```
select i_item_desc
,w_warehouse_name
,d1.d_week_seq
,sum(case when p_promo_sk is null then 1 else 0 end) no_promo
,sum(case when p_promo_sk is not null then 1 else 0 end) promo
,count(*) total_cnt
from catalog_sales
join date_dim d1 on (cs_sold_date_sk = d1.d_date_sk)
join customer_demographics on (cs_bill_cdemo_sk = cd_demo_sk)
join household_demographics on (cs_bill_hdemo_sk = hd_demo_sk)
join item on (i_item_sk = cs_item_sk)
join inventory on (cs_item_sk = inv_item_sk)
join warehouse on (w_warehouse_sk=inv_warehouse_sk)
join date_dim d2 on (inv_date_sk = d2.d_date_sk)
join date_dim d3 on (cs_ship_date_sk = d3.d_date_sk)
left outer join promotion on (cs_promo_sk=p_promo_sk)
left outer join catalog_returns on (cr_item_sk = cs_item_sk and
cr_order_number = cs_order_number)
where d1.d_week_seq = d2.d_week_seq
and inv_quantity_on_hand < cs_quantity
and d3.d_date > d1.d_date + interval '5' day
and d1.d_year = 1999
group by i_item_desc,w_warehouse_name,d1.d_week_seq
order by total_cnt desc, i_item_desc, w_warehouse_name, d_week_seq
LIMIT 100
```
HashJoin - 9sec, SMJ - 20 sec
Having LEFT OUTER joins removed the results are still the same
HashJoin - 8.5 SMJ - 19 sec
I'll build a flamegraph for SMJ soon
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]