alamb opened a new issue, #13188: URL: https://github.com/apache/datafusion/issues/13188
### Describe the bug

While enabling `StringView` reading from Parquet in https://github.com/apache/datafusion/pull/13101, @Dandandan noticed a slight regression for TPCH 18: https://github.com/apache/datafusion/pull/13101#issuecomment-2437865910

Here is the query:

```sql
select
    c_name,
    c_custkey,
    o_orderkey,
    o_orderdate,
    o_totalprice,
    sum(l_quantity)
from
    customer,
    orders,
    lineitem
where
    o_orderkey in (
        select l_orderkey
        from lineitem
        group by l_orderkey
        having sum(l_quantity) > 300
    )
    and c_custkey = o_custkey
    and o_orderkey = l_orderkey
group by
    c_name,
    c_custkey,
    o_orderkey,
    o_orderdate,
    o_totalprice
order by
    o_totalprice desc,
    o_orderdate;
```

### To Reproduce

Make the data:

```shell
# make the data and get to the correct location
cd datafusion/benchmarks
./bench.sh data tpch
cd data/tpch_sf1
```

Run the query:

```
datafusion-cli -f ../../queries/q18.sql | grep Elapsed
Elapsed 0.088 seconds.
```

When StringView is enabled, the query seems to be slightly slower.

### Expected behavior

StringView should always be faster.

### Additional context

I took a brief look at the flamegraphs -- it seems like one difference could be `BatchCoalescer::push_batch`.

There is a special case for StringView here: https://github.com/apache/datafusion/blob/6034be42808b43e3f48f6e58ec38cc35fa253abb/datafusion/physical-plan/src/coalesce/mod.rs#L117-L116

Here are the explain plans for the query before and after the change:
- [q18-before.txt](https://github.com/user-attachments/files/17577287/q18-before.txt)
- [q18-after.txt](https://github.com/user-attachments/files/17577286/q18-after.txt)

Here are the flamegraphs for the query before/after the change: [flamegraph images not preserved]
