Hello Haisheng,

thanks for bringing up this discussion. I was also part of the people who
expressed concerns about CALCITE-3997 due to a backwards compatibility
issue in a very specific scenario in a downstream project. This issue was
satisfactory handled with a temporary workaround for the RC, and now this
workaround needs to be removed for the next version (CALCITE-4032).
I agree with your analysis, and I think my specific scenario should not
prevent this new general approach to be fully implemented (including
CalcMergerRule). Thus, I am ok with CALCITE-4032, I will implement my own
rule on my side to solve my specific problem, under no circumstances this
should be the responsibility of the Calcite library.

Best regards,
Ruben




Le lun. 1 juin 2020 à 23:31, Stamatis Zampetakis <[email protected]> a
écrit :

> Hello,
>
> Since I was among the people who raised some concerns about this during the
> vote I would like to mention that it was mostly about the fact it came very
> late in the release cycle.
> It is better to avoid breaking changes in the RC phase since it may be
> difficult for people to react or test it.
>
> I agree with Haisheng that Calcite code should follow the best practice so
> that other projects can draw inspiration from it. Nevertheless, I would
> like to mention again that Enumerable is not a convention that is used only
> for demonstration purposes but it is also used in production systems so we
> have to be careful about backward compatibility. Still nobody complaint so
> far so it means that we are doing pretty good on this aspect :)
>
> The release brought some very cool features and I fully support the
> direction on which we are going. I don't think there is a need to open a
> new thread here since we can discuss under the respective jira(s) if
> necessary.
>
> Best,
> Stamatis
>
>
>
>
>
>
> On Wed, May 27, 2020 at 5:07 AM Haisheng Yuan <[email protected]> wrote:
>
> > Hi all,
> >
> > During vote of 1.23.0 rc1, there are some discussions about the solution
> > of CALCITE-3997.
> >
> > Some may think it is a matter of best practice, but in many cases it is a
> > matter of right or wrong.
> >
> > Here is the rationale.
> >
> > The original query in CALCITE-3997 is
> >
> > SELECT t1.k1, t2.k1 FROM t1, t2
> > ON t1.k1=t2.k1
> > WHERE t1.n1 + 1 = t2.n3
> >
> > After FilterIntoJoin and MergeJoinRule, planner generates an alternative:
> > MergeJoin on t1.k1=t2.k1 and t1.n1 + 1 = t2.n3
> >
> > The mergejoin's collation is [k1].
> >
> > This mergejoin matches JoinPushExpressionsRule, which pushes expression
> in
> > join condition, and generates a Project in child.
> > Project t1.k1, t2.k1
> >   -> MergeJoin on t1.k1=t2.k1 and t1.col = t2.n3
> >        -> Project k1, n1+1 as col
> >             -> TableScan t1
> >        -> TableScan t2
> >
> > The new mergejoin now has 2 equi-conditions, but the collation is the
> same
> > with old mergejoin, still [k1], which is wrong.
> > It should be updated to [k1,col] and [k1, n3]. Hence error.
> >
> > The similar issue already happened before, see CALCITE-3353. And it is
> > very easy to trigger same error using different query even without
> > FilterIntoJoin. If we change WHERE to AND in the query,
> > SELECT t1.k1, t2.k1 FROM t1, t2
> > ON t1.k1=t2.k1
> > AND t1.n1 + 1 = t2.n3
> >
> > or with the following query with join condition const reductions on, we
> > can still see the issue.
> > SELECT t1.k1, t2.k1 FROM t1, t2
> > ON t1.k1=t2.k1
> > AND (t1.n1 + 1 = t2.n3 or t1.n1 + 1 = t2.n3)
> >
> > Can we use the similar approach that in CALCITE-3353 to fix the issue?
> > We can, but it just hide the issue. Doing that way can generate mergejoin
> > with correct traitset, but the rule is still generating an invalid plan.
> > Because the MergeJoin's new input is a LogicalProject, which is generated
> > by RelBuilder.
> >
> > In Calcite, RelNode is a very special class. Unlike operators in Cascades
> > or Volcano paper, where logical/physical operator and plan expression are
> > separated, when constructing logical/physical expression tree, each
> > operator doesn't require anything trait from its child. But RelNode in
> > Calcite not only represents operator, but also plan expression, and can
> > even represent a MEMO group with required traitset, which is RelSubset.
> > When constructing a RelNode XXX, we need to specify the input RelNodes,
> the
> > input nodes' traitsets will become node XXX's requirement on those
> inputs,
> > even we are not aware that we are requesting it, which is extremely
> > error-prone.
> >
> > In above case, the new MergeJoin's left input is a LogicalProject, which
> > means the physical MergeJoin is requesting its left input to be logical
> > operator with NONE convention, which has {inf} large cost. That means the
> > newly generated MergeJoin is never implementable.
> >
> > What if we have a physical RelBuilder to generate the Project in the
> rule?
> > The consequence is worse.
> > With logical RelBuilder, it just generates an invalid plan, we are still
> > safe because it won't be selected as the best plan. If we get a physical
> > RelBuilder, which generates a physical plan that is implementable, but
> this
> > is a wrong plan. Because in RelOptUtil.pushDownJoinConditions, when
> > creating the MergeJoin, it just specifies its traitset and inputs, but it
> > doesn't request the correct collation on the MergeJoin's inputs. The
> > Project's collation is EMPTY, that means the MergeJoin doesn't require
> any
> > collation on the Project. Apparently this will be a wrong plan.
> >
> > This kind of issue doesn't limit to JoinPushExpressionsRule, any rule
> that
> > has Join as operand may have the same issue. In fact, all rules that has
> > Aggregate as operand may suffer the same issue. Because in 1.23.0, we
> add a
> > new physical operator EnumerableSortedAggregate, which requires its input
> > to be sorted by group keys. AggregateProjectMergeRule can generate wrong
> > plan if we apply it on EnumerableSortedAggregate. In addition, we are now
> > discussing about adding a new EnumerableMergeUnion, to require the
> UNION's
> > input sorted. Maybe in next few releases, we will have sort based
> > EnumerableWindow operator. Finally we may find that only ProjectMerge,
> > FilterMerge, CalcMerge can work correctly with physical operators,
> because
> > these operators doesn't require any trait from input, so it just happens
> to
> > work. The change of TransformationRule looks risky at first glance,
> > however, it makes those rules less risky.
> >
> > Some thinks we should provide different rule instances / types to match
> > logical/physical operators. But IMO, these built-in transformation rules
> > shouldn't match physical operators at all in VolcanoPlanner, because
> most,
> > if not all, of them can't worked correctly with physical operators. Even
> > for Project/Filter/Calc merge rules, all the merge work can be done
> through
> > logical operators, applying on physical operators will duplicate the
> > process again.
> >
> > Even if we just view this as a best practice, and we (as Calcite project
> > members) don't follow the best practice on Calcite's built-in
> > EnumerableConvention, how can we tell downstream projects to follow the
> > best practices? Is there any document stating the best practices? Most of
> > them just follow what Calcite does. Just like children follow the example
> > of their parents. If we checkout the code of Apache Druid, Apache Beam,
> we
> > will find that they are doing the similar way as Calcite's Enumerable
> > Convention. Beam even adds JoinCommuteRule, JoinAssociateRule,
> > JoinPushThroughJoinRule.RIGHT, JoinPushThroughJoinRule.LEFT all together
> in
> > its ruleset. I guess join performance is not a problem for BEAM.
> >
> > Finally, I don't think CalcMergeRule should be an exception, nothing is
> > impossible without physical CalcMergeRule.
> >
> > Thanks,
> > Haisheng
> >
> >
> >
>

Reply via email to