peter-toth commented on PR #11683: URL: https://github.com/apache/datafusion/pull/11683#issuecomment-2271363280
I believe I understood what happened here. In https://github.com/apache/datafusion/pull/10835 the `CommonSubexprEliminate` rule [was converted to a `ApplyOrder::TopDown` rule](https://github.com/apache/datafusion/pull/10835/files#diff-351499880963d6a383c92e156e75019cd9ce33107724a9635853d7d4cd1898d0R577-R579), but also [the `self.rewrite()` call was kept](https://github.com/apache/datafusion/pull/10835/files#diff-351499880963d6a383c92e156e75019cd9ce33107724a9635853d7d4cd1898d0R195) and caused a weird plan traversal where some of the nodes are traversed 2 times. The main problem with the top-down conversion is that in some cases `CommonSubexprEliminate` collects adjacent nodes (e.g. `Window`) into groups and once we eliminated subexpression among those groups we transform back the result into adjacent nodes. This kind of transformations can't be definied with a simple top-down optimizer rule. Most likely this is why the rule handled recursion itself before https://github.com/apache/datafusion/pull/10835. Then I added `test_non_top_level_common_expression` test in this PR and noticed that a non root project node was not CSEd. But I made a mistake as the unit tests in `common_subexpr_eliminate.rs` don't actually use an `Optimizer` (configured to use `CommonSubexprEliminate`) but just invoke `CommonSubexprEliminate::rewrite()` directly. So despite the `CommonSubexprEliminate` was top-down, the test couldn't succeed. So I think the right fix is to: - Let the rule handle recursion (i.e. change `apply_order()` return `None`) to avoid double recursion. - Change `assert_optimized_plan_eq` to use an `Optimizer` to be able to test cases like `test_non_top_level_common_expression`. I've rebased the previous commit on the latest `main` and added a 2nd [commit](https://github.com/apache/datafusion/pull/11683/commits/6a6281120ff29952ce5b34656ff8c3645e863d8e) with the above 2 points. I ran some benchmarks and this PR is now sometimes better than `main`: ``` % critcmp main make-cse-top-down-like group main make-cse-top-down-like ----- ---- ---------------------- logical_aggregate_with_join 1.01 565.4±14.92µs ? ?/sec 1.00 561.0±9.20µs ? ?/sec logical_plan_tpcds_all 1.03 77.5±2.96ms ? ?/sec 1.00 75.2±1.10ms ? ?/sec logical_plan_tpch_all 1.00 7.7±0.09ms ? ?/sec 1.00 7.7±0.08ms ? ?/sec logical_select_all_from_1000 1.00 15.0±0.26ms ? ?/sec 1.01 15.2±0.18ms ? ?/sec logical_select_one_from_700 1.00 405.6±10.82µs ? ?/sec 1.00 404.3±10.31µs ? ?/sec logical_trivial_join_high_numbered_columns 1.01 402.7±5.89µs ? ?/sec 1.00 399.7±6.28µs ? ?/sec logical_trivial_join_low_numbered_columns 1.01 383.7±7.82µs ? ?/sec 1.00 378.9±5.03µs ? ?/sec physical_plan_tpcds_all 1.09 590.3±107.45ms ? ?/sec 1.00 543.2±3.93ms ? ?/sec physical_plan_tpch_all 1.01 34.1±0.47ms ? ?/sec 1.00 33.9±0.58ms ? ?/sec physical_plan_tpch_q1 1.00 1105.2±12.43µs ? ?/sec 1.00 1110.3±23.70µs ? ?/sec physical_plan_tpch_q10 1.01 1550.2±38.57µs ? ?/sec 1.00 1537.9±19.03µs ? ?/sec physical_plan_tpch_q11 1.01 1328.8±16.32µs ? ?/sec 1.00 1313.9±17.56µs ? ?/sec physical_plan_tpch_q12 1.01 1108.5±18.29µs ? ?/sec 1.00 1096.9±16.36µs ? ?/sec physical_plan_tpch_q13 1.00 755.8±10.98µs ? ?/sec 1.00 752.3±10.62µs ? ?/sec physical_plan_tpch_q14 1.04 942.4±73.52µs ? ?/sec 1.00 908.6±11.28µs ? ?/sec physical_plan_tpch_q16 1.01 1345.2±26.82µs ? ?/sec 1.00 1334.1±34.09µs ? ?/sec physical_plan_tpch_q17 1.01 1200.3±50.67µs ? ?/sec 1.00 1188.0±22.75µs ? ?/sec physical_plan_tpch_q18 1.02 1433.4±102.83µs ? ?/sec 1.00 1402.0±15.42µs ? ?/sec physical_plan_tpch_q19 1.02 2.6±0.20ms ? ?/sec 1.00 2.6±0.17ms ? ?/sec physical_plan_tpch_q2 1.00 2.9±0.04ms ? ?/sec 1.00 2.9±0.04ms ? ?/sec physical_plan_tpch_q20 1.04 1673.2±79.81µs ? ?/sec 1.00 1616.4±20.52µs ? ?/sec physical_plan_tpch_q21 1.04 2.4±0.10ms ? ?/sec 1.00 2.3±0.03ms ? ?/sec physical_plan_tpch_q22 1.02 1165.8±27.17µs ? ?/sec 1.00 1147.1±35.94µs ? ?/sec physical_plan_tpch_q3 1.00 1102.3±16.77µs ? ?/sec 1.00 1099.9±12.63µs ? ?/sec physical_plan_tpch_q4 1.00 846.3±8.77µs ? ?/sec 1.00 844.0±9.90µs ? ?/sec physical_plan_tpch_q5 1.01 1661.4±23.93µs ? ?/sec 1.00 1641.8±19.28µs ? ?/sec physical_plan_tpch_q6 1.00 584.7±9.64µs ? ?/sec 1.00 583.2±20.63µs ? ?/sec physical_plan_tpch_q7 1.00 2.1±0.03ms ? ?/sec 1.00 2.1±0.21ms ? ?/sec physical_plan_tpch_q8 1.01 2.7±0.04ms ? ?/sec 1.00 2.6±0.04ms ? ?/sec physical_plan_tpch_q9 1.01 1932.5±23.29µs ? ?/sec 1.00 1906.6±21.08µs ? ?/sec physical_select_all_from_1000 1.00 35.2±1.20ms ? ?/sec 1.01 35.5±0.48ms ? ?/sec physical_select_one_from_700 1.00 1781.3±59.97µs ? ?/sec 1.01 1805.4±43.25µs ? ?/sec ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
