Hi Xiaoju,

Thanks for sending this to the dev list.

The current join reordering rule is just a stats-based optimizer rule; either a top-down or a bottom-up search can reach equally well optimized plans (DB2, for example, uses bottom-up). In the future we plan to move the stats-based join reordering rule into the cost-based planner, which is where it belongs according to the original design of Spark SQL.
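For anyone who wants to see the shape of the bottom-up, level-by-level enumeration Xiaoju describes in the quoted mail below, here is a toy sketch of that style of search. It is purely illustrative: the names (Relation, JoinPlan, joinable, joinCost) are made up for this mail, and this is not the actual CostBasedJoinReorder code.

// Toy, System-R style, level-by-level join enumeration (illustrative only).
case class Relation(name: String, rows: Long)
case class JoinPlan(items: Set[Relation], cost: Long)

def reorder(
    relations: Seq[Relation],
    joinable: (Set[Relation], Set[Relation]) => Boolean, // any join condition connecting the two sides?
    joinCost: (JoinPlan, JoinPlan) => Long): JoinPlan = {

  // Level 0: each single relation is its own best plan (toy cost = its row count).
  var bestPlans: Map[Set[Relation], JoinPlan] =
    relations.map(r => Set(r) -> JoinPlan(Set(r), r.rows)).toMap

  // Build plans joining `level` items out of the best smaller plans, keeping
  // only the cheapest plan per set of items. Left-deep, right-deep and bushy
  // shapes all fall out of combining arbitrary smaller subsets.
  for (level <- 2 to relations.size) {
    val candidates = for {
      (leftItems, left)   <- bestPlans.toSeq
      (rightItems, right) <- bestPlans.toSeq
      if leftItems.intersect(rightItems).isEmpty
      if (leftItems ++ rightItems).size == level
      if joinable(leftItems, rightItems)              // prune cartesian products
    } yield JoinPlan(leftItems ++ rightItems, joinCost(left, right))

    for (c <- candidates) {
      if (bestPlans.get(c.items).forall(_.cost > c.cost)) bestPlans += (c.items -> c)
    }
  }
  // Assumes the full set is reachable without a cartesian product.
  bestPlans(relations.toSet)
}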
Actually, building a good cost model is much harder than implementing such a classic framework, especially since Spark does not own the data. We also need to compute incremental stats instead of always recomputing them from scratch.

Cheers,
Xiao

吴晓菊 <chrysan...@gmail.com> wrote on Monday, September 24, 2018 at 7:53 PM:

> Hi All,
>
> The current Spark CBO implements a cost-based multi-way join reordering
> algorithm based on the System-R paper [Access Path-SIGMOD'79]
> <http://x86.cs.duke.edu/courses/spring03/cps216/papers/selinger-etal-1979.pdf>.
> When building m-way joins, it uses a bottom-up approach: it puts all items
> (basic joined nodes) at level 0, builds all two-way joins at level 1 from
> the plans at level 0 (single items), then builds all three-way joins, and
> so on. The algorithm considers all combinations, including left-deep
> trees, bushy trees, and right-deep trees, and it prunes cartesian product
> candidates.
>
> However, we have found several *limitations* in the current CBO
> implementation:
>
> 1. The current CBO is a rule in the logical phase; it outputs only one
> logical plan to the physical optimization phase, so we cannot be sure that
> the best plan in the logical phase is still the best after physical
> optimization.
>
> 2. The current bottom-up approach keeps only one best plan at each level,
> so we cannot be sure that the overall best plan can be built from the best
> plan at each level.
>
> 3. The current cost formula is cost = weight * cardinality + (1.0 - weight)
> * size, where the first term roughly corresponds to CPU cost and the second
> term roughly corresponds to I/O cost. This formula is oversimplified: it
> treats all join implementations the same way and does not take shuffle and
> sort costs into consideration, even though Shuffle Exchange is one of the
> heaviest physical operators in Spark SQL.
>
> 4. Equivalent join conditions are not supported. For example, (A join B
> join C on a=b and b=c) can be reordered to (A join C join B on a=c and
> c=b), which may be more efficient. But in the current implementation we
> never derive the condition "a=c", so "A join C" is treated as a cartesian
> product and excluded.
>
> The bottom-up approach originated in the System-R optimizer (1979). It
> quickly became a standard, and many modern relational database optimizers
> are "System-R style", for example Oracle, PostgreSQL, MySQL, and DB2.
>
> Later, new styles of optimizer were invented: Volcano (1993) and Cascades
> (1995). They are not as famous as System-R but are still widely used in
> practice, for example in Microsoft SQL Server, Greenplum Orca, and Apache
> Calcite. They implement a top-down, transformational search algorithm and
> provide an extensible optimization framework.
>
> A top-down optimization framework can help us address the limitations
> above, since it has a more complete search space and combines the logical
> and physical phases for more accurate cost estimation. As for the cost of
> enumerating all alternative plans, Cascades also provides pruning to limit
> the search space.
>
> What about implementing *a new Cascades-style CBO for Spark SQL*?
> It could be a new rule in the current "Planner" that reads a logical plan
> after the heuristic rules and outputs the best (least-cost) physical plan
> after reordering and physical implementation rules.
>
> Xiaoju Wu
> Phone: +86 17717640807
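To make point 3 in the quoted mail concrete, the cost comparison boils down to something like the sketch below. The names are made up and this is not the exact definition in CostBasedJoinReorder; the weight corresponds, if I remember correctly, to spark.sql.cbo.joinReorder.card.weight (0.7 by default).

// Sketch of the cost formula from point 3: cost = weight * cardinality + (1 - weight) * size.
case class Cost(cardinality: BigInt, size: BigInt)

def combinedCost(c: Cost, weight: Double = 0.7): Double =
  weight * c.cardinality.toDouble + (1 - weight) * c.size.toDouble

// A plan is considered "better" simply when this number is smaller. Nothing here
// distinguishes a broadcast hash join from a sort-merge join, and no shuffle or
// sort cost appears anywhere, which is exactly the limitation being pointed out.
def betterThan(a: Cost, b: Cost): Boolean = combinedCost(a) < combinedCost(b)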
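Point 4 is usually solved by keeping equivalence classes of join columns and taking their transitive closure, so that an implied condition like a = c is still available after reordering. A tiny hypothetical sketch (none of these names exist in Spark):

// Hypothetical sketch: keep join columns in equivalence classes (a tiny
// union-find) so the implied condition a = c can be derived from a = b and b = c.
class EquivalenceClasses {
  private var parent = Map.empty[String, String]

  private def find(x: String): String = parent.get(x) match {
    case Some(p) if p != x => find(p)
    case _                 => x
  }

  def addEquality(a: String, b: String): Unit = {
    val (ra, rb) = (find(a), find(b))
    parent ++= Seq(a -> ra, b -> rb)          // register both columns
    if (ra != rb) parent += (ra -> rb)        // merge the two classes
  }

  def implied(a: String, b: String): Boolean = find(a) == find(b)
}

val eq = new EquivalenceClasses
eq.addEquality("a", "b")   // from ... join ... on a = b
eq.addEquality("b", "c")   // from ... join ... on b = c
// a = c is now implied, so reordering to (A join C) first is not a cartesian product:
assert(eq.implied("a", "c"))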
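And for people who have not read the Volcano/Cascades papers, here is a very rough skeleton of the memo structure behind a Cascades-style, top-down, transformation-based optimizer. It is illustrative pseudocode only, not a concrete API proposal; a real implementation would also deduplicate work inside the memo and prune with branch-and-bound against the best complete plan found so far.

// Rough skeleton of a Cascades-style memo (illustration only).
trait Operator                                   // a logical or physical operator
case class GroupExpr(op: Operator, children: Seq[Group])

// A Group memoizes all logically equivalent expressions for one sub-plan,
// plus the cheapest physical alternative found so far (costing omitted here).
class Group(var expressions: Vector[GroupExpr],
            var best: Option[(GroupExpr, Double)] = None)

trait Rule {
  def matches(e: GroupExpr): Boolean
  // A transformation rule yields equivalent logical expressions; an
  // implementation rule yields physical ones. Both land in the same group.
  def apply(e: GroupExpr): Seq[GroupExpr]
}

// Top-down driver: fire all applicable rules on a group's expressions, memoize
// the derived alternatives, then recurse into the child groups of every
// alternative. Costing, deduplication and pruning are left out of this sketch.
def explore(group: Group, rules: Seq[Rule]): Unit = {
  val derived = for {
    expr <- group.expressions
    rule <- rules if rule.matches(expr)
    alt  <- rule(expr)
  } yield alt
  group.expressions ++= derived
  for (expr <- group.expressions; child <- expr.children)
    explore(child, rules)
}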