I’m not aware of any research papers that we followed (except in the area of materialized view matching). Calcite’s strategy is driven by our use of the Volcano planning algorithm: let people write transformation rules, each of which is valid, and let people combine rules to achieve the desired effects.
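To make the idea of small, individually valid rules concrete, here is a minimal toy sketch in Java (16 or later, for records). Everything in it is hypothetical and written for illustration: the `Rel`, `Scan`, `Filter`, `Union` and `Rule` types are not Calcite's `RelNode` or `RelOptRule` classes, every union is assumed to be UNION ALL, and the driver simply applies rules greedily until nothing matches, which is closer in spirit to a heuristic planner than to Volcano's cost-based search. The two rules only mimic what the names UnionMergeRule and FilterSetOpTransposeRule suggest.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class ToyUnionRules {
    // Toy relational expressions (hypothetical; NOT Calcite's RelNode classes).
    interface Rel {}
    record Scan(String table) implements Rel {}
    record Filter(String condition, Rel input) implements Rel {}
    record Union(List<Rel> inputs) implements Rel {} // assumed to be UNION ALL

    // A rule looks only at the root of an expression and may return an equivalent one.
    interface Rule { Optional<Rel> apply(Rel rel); }

    // Union(a, Union(b, c)) => Union(a, b, c): flattens one level of nested unions.
    static final Rule UNION_MERGE = rel -> {
        if (!(rel instanceof Union u)) return Optional.empty();
        List<Rel> flat = new ArrayList<>();
        boolean changed = false;
        for (Rel in : u.inputs()) {
            if (in instanceof Union nested) { flat.addAll(nested.inputs()); changed = true; }
            else flat.add(in);
        }
        return changed ? Optional.of(new Union(flat)) : Optional.empty();
    };

    // Filter(c, Union(a, b)) => Union(Filter(c, a), Filter(c, b)).
    static final Rule FILTER_UNION_TRANSPOSE = rel -> {
        if (!(rel instanceof Filter f) || !(f.input() instanceof Union u)) return Optional.empty();
        List<Rel> pushed = new ArrayList<>();
        for (Rel in : u.inputs()) pushed.add(new Filter(f.condition(), in));
        return Optional.of(new Union(pushed));
    };

    // Greedy driver: rewrite the inputs first, then fire the first matching rule
    // at this node and start over on the result, until no rule matches.
    static Rel rewrite(Rel rel, List<Rule> rules) {
        Rel current = rel;
        if (rel instanceof Filter f) {
            current = new Filter(f.condition(), rewrite(f.input(), rules));
        } else if (rel instanceof Union u) {
            List<Rel> inputs = new ArrayList<>();
            for (Rel in : u.inputs()) inputs.add(rewrite(in, rules));
            current = new Union(inputs);
        }
        for (Rule rule : rules) {
            Optional<Rel> result = rule.apply(current);
            if (result.isPresent()) return rewrite(result.get(), rules);
        }
        return current;
    }

    public static void main(String[] args) {
        Rel plan = new Filter("x > 1",
            new Union(List.of(new Scan("a"),
                new Union(List.of(new Scan("b"), new Scan("c"))))));
        // The nested union is flattened, then the filter is copied onto each input.
        System.out.println(rewrite(plan, List.of(UNION_MERGE, FILTER_UNION_TRANSPOSE)));
    }
}
```

Neither rule knows about the other, yet together they turn a filter over a nested binary union into a flat three-way union of filtered scans. A cost-based planner would instead register both forms as alternatives and pick between them.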
Here are the rules that pertain to Union (and to general set-operations, which generally have an instance for Union):

$ git ls-files | egrep '(Union|SetOp).*Rule'
core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableMergeUnionRule.java
core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableRepeatUnionRule.java
core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableUnionRule.java
core/src/main/java/org/apache/calcite/rel/rules/AggregateUnionAggregateRule.java
core/src/main/java/org/apache/calcite/rel/rules/AggregateUnionTransposeRule.java
core/src/main/java/org/apache/calcite/rel/rules/FilterSetOpTransposeRule.java
core/src/main/java/org/apache/calcite/rel/rules/JoinUnionTransposeRule.java
core/src/main/java/org/apache/calcite/rel/rules/ProjectSetOpTransposeRule.java
core/src/main/java/org/apache/calcite/rel/rules/SortUnionTransposeRule.java
core/src/main/java/org/apache/calcite/rel/rules/UnionEliminatorRule.java
core/src/main/java/org/apache/calcite/rel/rules/UnionMergeRule.java
core/src/main/java/org/apache/calcite/rel/rules/UnionPullUpConstantsRule.java
core/src/main/java/org/apache/calcite/rel/rules/UnionToDistinctRule.java

Rules are often “no brainer” improvements, and therefore can be applied with or without a cost model. But a few have to be applied with care. Generally Calcite provides the ingredients and it’s up to people how they use those ingredients.

Most of the rules could be written for n-ary (not just binary) unions but I’m not sure that they have been. Volcano rule matching is a bit tricky for variable numbers of inputs.

Julian

> On May 7, 2022, at 5:02 AM, Amela Fejza <[email protected]> wrote:
>
> Dear Madam, Sir,
>
> I am curious about the way you deal with the Union binary operator in the
> Apache Calcite query optimizer.
> Is there any research paper that has been used as a reference for the
> implementation?
>
> Do you use any pulling techniques for Union?
> Or do you deal with terms in
> which the Union can appear anywhere?
>
> Amela Fejza
> PhD student at Inria
