I've been running TPC-H and comparing query performance between cached tables (serialized in memory) and uncached ones (DataFrames read from Parquet). Several SQL queries take much longer when the tables are cached.
Some physical plans have an extra layer of shuffle/sort/merge in the cached scenario. I could filter by key to optimize, but I'm curious why the out-of-the-box plan is more complex and slower when the tables are cached in memory. Thanks!
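
To make the plan difference concrete, here is a minimal sketch of how the two physical plans could be compared, assuming this is Spark SQL and that the plan text is pasted from what `df.explain()` prints. The helper names (`count_operators`, `extra_operators`) and both sample plans are hypothetical: they are hand-written to illustrate the shape of the difference, not output from a real TPC-H run.

```python
import re
from collections import Counter

# Operator names as they appear in Spark physical plan text.
OPERATORS = {
    "SortMergeJoin", "BroadcastHashJoin", "BroadcastExchange",
    "Exchange", "Sort", "InMemoryTableScan", "FileScan",
}

# Strips the tree-drawing prefix (":", "+-", spaces) and an optional
# whole-stage-codegen marker such as "*(2) ", then captures the operator name.
_NODE = re.compile(r"^[\s:+\-]*(?:\*\(\d+\)\s*)?(\w+)")


def count_operators(plan_text):
    """Count how often each known operator appears as a plan node."""
    counts = Counter()
    for line in plan_text.splitlines():
        match = _NODE.match(line)
        if match and match.group(1) in OPERATORS:
            counts[match.group(1)] += 1
    return counts


def extra_operators(cached_plan, uncached_plan):
    """Operators that occur more often in the cached plan than in the uncached one."""
    return count_operators(cached_plan) - count_operators(uncached_plan)


# Hypothetical plan for a join over cached tables (illustrative only).
CACHED_PLAN = """\
*(5) SortMergeJoin [l_orderkey#1L], [o_orderkey#2L], Inner
:- *(2) Sort [l_orderkey#1L ASC NULLS FIRST], false, 0
:  +- Exchange hashpartitioning(l_orderkey#1L, 200)
:     +- InMemoryTableScan [l_orderkey#1L]
+- *(4) Sort [o_orderkey#2L ASC NULLS FIRST], false, 0
   +- Exchange hashpartitioning(o_orderkey#2L, 200)
      +- InMemoryTableScan [o_orderkey#2L]
"""

# Hypothetical plan for the same join over Parquet (illustrative only).
UNCACHED_PLAN = """\
*(2) BroadcastHashJoin [l_orderkey#1L], [o_orderkey#2L], Inner, BuildRight
:- *(2) FileScan parquet [l_orderkey#1L]
+- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, true]))
   +- *(1) FileScan parquet [o_orderkey#2L]
"""

if __name__ == "__main__":
    for name, n in sorted(extra_operators(CACHED_PLAN, UNCACHED_PLAN).items()):
        print(f"{name}: +{n}")
```

Running this on the real `explain()` output of a slow query and its uncached counterpart should show whether the extra work is a shuffle (`Exchange`), a `Sort`, or a different join strategy, which narrows down where the planner diverges.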