alamb opened a new issue, #17259: URL: https://github.com/apache/datafusion/issues/17259
### Is your feature request related to a problem or challenge?

@MrPowers reported on Discord: https://discord.com/channels/885562378132000778/1290751484807352412/1407568961561952277

> I ran the TPC-H queries on my Macbook M3 with 16GB of RAM with different scale factors. The DataFusion, DuckDB, and Polars Streaming results are similar for scale factor 5:

<img width="1070" height="850" alt="Image" src="https://github.com/user-attachments/assets/21aba060-b4b3-4f0b-85ba-66e56c48f1a3" />

> I ran the TPC-H queries on my Macbook M3 with 16GB of RAM with different scale factors. The DataFusion, DuckDB, and Polars Streaming results are similar for scale factor 5:

<img width="1054" height="844" alt="Image" src="https://github.com/user-attachments/assets/a7813828-2d3f-4d3d-bb29-9373841ce9ac" />

### Describe the solution you'd like

Figure out why q4, q7, and q9 are very slow.

The TPC-H queries are here: https://github.com/apache/datafusion/tree/main/benchmarks/queries

[q4](https://github.com/apache/datafusion/blob/main/benchmarks/queries/q4.sql)

```sql
select
    o_orderpriority,
    count(*) as order_count
from
    orders
where
    o_orderdate >= '1993-07-01'
    and o_orderdate < date '1993-07-01' + interval '3' month
    and exists (
        select
            *
        from
            lineitem
        where
            l_orderkey = o_orderkey
            and l_commitdate < l_receiptdate
    )
group by
    o_orderpriority
order by
    o_orderpriority;
```

[q7](https://github.com/apache/datafusion/blob/main/benchmarks/queries/q7.sql)

```sql
select
    supp_nation,
    cust_nation,
    l_year,
    sum(volume) as revenue
from
    (
        select
            n1.n_name as supp_nation,
            n2.n_name as cust_nation,
            extract(year from l_shipdate) as l_year,
            l_extendedprice * (1 - l_discount) as volume
        from
            supplier,
            lineitem,
            orders,
            customer,
            nation n1,
            nation n2
        where
            s_suppkey = l_suppkey
            and o_orderkey = l_orderkey
            and c_custkey = o_custkey
            and s_nationkey = n1.n_nationkey
            and c_nationkey = n2.n_nationkey
            and (
                (n1.n_name = 'FRANCE' and n2.n_name = 'GERMANY')
                or (n1.n_name = 'GERMANY' and n2.n_name = 'FRANCE')
            )
            and l_shipdate between date '1995-01-01' and date '1996-12-31'
    ) as shipping
group by
    supp_nation,
    cust_nation,
    l_year
order by
    supp_nation,
    cust_nation,
    l_year;
```

[q9](https://github.com/apache/datafusion/blob/main/benchmarks/queries/q9.sql)

```sql
select
    nation,
    o_year,
    sum(amount) as sum_profit
from
    (
        select
            n_name as nation,
            extract(year from o_orderdate) as o_year,
            l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
        from
            part,
            supplier,
            lineitem,
            partsupp,
            orders,
            nation
        where
            s_suppkey = l_suppkey
            and ps_suppkey = l_suppkey
            and ps_partkey = l_partkey
            and p_partkey = l_partkey
            and o_orderkey = l_orderkey
            and s_nationkey = n_nationkey
            and p_name like '%green%'
    ) as profit
group by
    nation,
    o_year
order by
    nation,
    o_year desc;
```

### Describe alternatives you've considered

_No response_

### Additional context

_No response_