Re: Make RelNode attribute order part of MEMO

Julian Hyde Fri, 19 Mar 2021 12:52:16 -0700

> There are hundreds of usages of the RexInputRef, so the implementation of
> this idea might be prohibitively expensive. But putting this problem aside,
> do you envision any other potential blockers for the implementation of that
> idea?


Suppose we wanted to build permutations into every registered RelNode. It’s 
rather difficult to put aside this problem, given that it would affect every 
RelOptRule (including ones in user code that we can’t fix).

Practically, we could achieve this by introducing a new API for rules (say an 
alternative to RelOptRuleCall.rels that returns the matched relational 
expressions and their field permutations.

One issue I see with unique field IDs is self-joins. Consider the query

SELECT *
FROM Emp AS e
  JOIN Emp AS m ON e.mgr = m.empno
WHERE e.sal > m.sal

Do e.sal and m.sal have the same field ID? For some purposes we would want them 
to have the same ID. But then the self-join brings them side-by-side, and we 
can’t have the same field ID twice in the result, so we have to arbitrarily 
give one of them a new ID. That arbitrariness is concerning.

We would save a lot of effort figuring out mappings between input and output 
fields. But we would have to put a lot of effort figuring out when to split 
fields (the self-join example above), combine them (when two columns become 
equal due to an = predicate) and other mappings (applying aggregate functions 
and expressions with certain identities).

We would end up maintaining equivalence sets of fields, because we did not know 
that two fields were identical when we created the RelNodes and discover the 
fact later. In the current scheme, based on int field offsets, all of that 
reasoning is local to rules and RelNodes, and that is a good thing. 

Julian


> On Mar 13, 2021, at 8:55 AM, Vladimir Ozerov <[email protected]> wrote:
> 
> Hi Julian,
> 
> Thank you for sharing these issues. We end up with almost the same ideas.
> We attempted to add an input permute to Join, which allowed us to avoid
> projects. However, that complicates the integration with other rules, just
> as you mention in [2]. Globally unique column IDs seem like a better option
> from that perspective. Moreover, unique IDs may simplify the implementation
> of other optimizations. For example, many join enumeration techniques,
> whether DP-based or top-down, require the decomposition of the join graph
> into independent vertices (inputs) and edges (conditions) and careful
> reconstruction of the alternative join trees, and RexInputRef is not very
> suitable for that process. Another possible example is recently reported
> [3].
> 
> There are hundreds of usages of the RexInputRef, so the implementation of
> this idea might be prohibitively expensive. But putting this problem aside,
> do you envision any other potential blockers for the implementation of that
> idea?
> 
> Regards,
> Vladimir.
> 
> [1] https://github.com/apache/calcite/pull/2359
> [2] https://issues.apache.org/jira/browse/CALCITE-62
> [3] https://issues.apache.org/jira/browse/CALCITE-4534
> 
> вт, 9 мар. 2021 г. в 22:40, Julian Hyde <[email protected]>:
> 
>> I investigated something similar a long time ago. We noticed that a
>> lot of trivial Project operators were being generated to compensate
>> for field re-ordering due to join transposition. And so the idea was
>> to allow each RelNode (and especially Join) to permute its output
>> fields.
>> 
>> Here is the case: https://issues.apache.org/jira/browse/CALCITE-62.
>> https://issues.apache.org/jira/browse/CALCITE-55 is related.
>> 
>> The problem, as I noted in CALCITE-62, is that people writing rules
>> have another mapping to deal with.
>> 
>> I believe that other systems, such as Spark's Catalyst planner, use
>> globally unique IDs for columns (as opposed to Calcite, whose column
>> references are only locally unique, ordinals of the input
>> operator(s)). Globally unique IDs would be superior for this problem
>> but perhaps introduce other challenges.
>> 
>> Julian
>> 
>> On Sun, Mar 7, 2021 at 11:25 PM Vladimir Ozerov <[email protected]>
>> wrote:
>>> 
>>> Hi,
>>> 
>>> Currently, the order of attributes is used to define operators
>> equivalence.
>>> This leads to several interesting problems, such as possible duplication
>> of
>>> expressions (e.g., "a,a,b") or additional work in rules to detect trivial
>>> projects and/or input permutations (e.g. "a,b -> b,a").
>>> 
>>> But the biggest problem is the join order planning. In Calcite, AxB is
>> not
>>> equivalent to BxA:
>>> 
>>>   1. It makes the ruleset [JoinCommuteRule, JoinAssociateRule]
>>>   insufficient to explore all bushy trees because the commute operation
>> adds
>>>   a project on top of the new join (AxB -> project(BxA)), thus not
>>>   allowing for the associate rule to be executed on the upper join. The
>>>   solution is to add the ProjectJoinTransposeRule to the ruleset, but
>> this
>>>   increases the planning time dramatically, making Apache Calcite
>> unsuitable
>>>   for the cost-based join planning even for relatively simple join
>> graphs.
>>>   2. It increases the number of join-related rule calls, which
>> complicates
>>>   the implementation of the new join enumeration planning rules (e.g.
>>>   top-down enumerators) because duplicate derivations compromise
>> performance.
>>> 
>>> My question is - has the community considered an idea to make the order
>> of
>>> columns a common property of all operators, somewhat similar to the
>> trait,
>>> but without an explicit enforcer?
>>> 
>>> For example, consider the following MEMO which the planner creates when
>>> trying to transform the join AxB to BxA:
>>> 
>>> G1: { Scan1[table=t1, cols=a,b] }
>>> G2: { Scan2[table=t2, cols=c,d] }
>>> G3: { AxB[G1, G2], Project[G4, cols=$2,$3,$0,$1] }
>>> G4: { BxA[G2, G1] }
>>> 
>>> However, if we make the column order part of the MEMO, we may potentially
>>> have something like this:
>>> 
>>> G1: { Scan1[table=t1, cols=a,b] }
>>> G2: { Scan2[table=t2, cols=c,d] }
>>> G3 [cols=G1.$0, G1.$1, G2.$0, G2.$1]: { AxB[G1, G2], BxA[G2, G1] }
>>> 
>>> Notice, how we were able to put AxB and BxA to the same equivalence
>> group.
>>> To my knowledge, CockroachDB uses a somewhat similar design.
>>> 
>>> I realize that this is rather a radical idea, and more design work is
>>> required to come with a proper proposal. At this point, I just would like
>>> to kindly ask the community to share high-level feedback on that. Were
>>> similar ideas proposed before?
>>> 
>>> Thank you,
>>> Vladimir.
>>

Re: Make RelNode attribute order part of MEMO

Reply via email to