Sorry for the late reply. I've reproduced the problem here:
https://github.com/cshuo/calcite/commit/b9a7fb5f536825d3a577bf42a5fc6cc7d4df7929
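
For anyone skimming the thread, here is a minimal toy sketch of the
workaround discussed in the quoted messages: dropping a top-level CAST
whose only effect is to widen a NOT NULL BOOLEAN to a nullable BOOLEAN.
It is not Calcite code. The Expr class and stripNullabilityWidening are
hypothetical names made up for illustration, and the model assumes every
expression is BOOLEAN, so a CAST can differ from its operand only in
nullability.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Toy model of a BOOLEAN expression tree; these are NOT Calcite classes. */
final class NullabilityCastSketch {

  /** Expression node: an operator name, a nullability flag, and operands. */
  static final class Expr {
    final String op;
    final boolean nullable;
    final List<Expr> operands;

    Expr(String op, boolean nullable, Expr... operands) {
      this.op = op;
      this.nullable = nullable;
      this.operands = Arrays.asList(operands);
    }

    @Override public String toString() {
      if (operands.isEmpty()) {
        return op; // leaf, e.g. an input reference such as "$1"
      }
      return op + operands.stream()
          .map(Expr::toString)
          .collect(Collectors.joining(", ", "(", ")"));
    }
  }

  /**
   * Removes a top-level CAST whose only effect is to turn a NOT NULL BOOLEAN
   * into a nullable BOOLEAN. Every type in this toy model is BOOLEAN, so a
   * CAST can differ from its operand only in nullability.
   */
  static Expr stripNullabilityWidening(Expr condition) {
    boolean wideningCast = condition.op.equals("CAST")
        && condition.operands.size() == 1
        && condition.nullable
        && !condition.operands.get(0).nullable;
    return wideningCast ? condition.operands.get(0) : condition;
  }

  public static void main(String[] args) {
    // IS_NOT_DISTINCT_FROM never yields UNKNOWN, so it is NOT NULL.
    Expr equiv = new Expr("IS_NOT_DISTINCT_FROM", false,
        new Expr("$1", true), new Expr("$2", true));
    Expr condition = new Expr("CAST", true, equiv);
    System.out.println(condition + " -> " + stripNullabilityWidening(condition));
  }
}
```

The sketch only looks at the top of the condition because that is the one
place where, per the discussion below, the nullability of the expression
does not matter: a join or filter condition treats UNKNOWN the same as
FALSE.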

On Wed, Mar 25, 2020 at 12:38 AM Julian Hyde <[email protected]> wrote:

> It does seem to be something that RelBuilder could do. (RexSimplify can’t
> really do it, because it doesn’t know how the expression is being used.)
>
> It’s also worth discovering why the CAST was added in the first place. It
> doesn’t seem to be helpful. I think we should strive to eliminate all of
> the slightly unhelpful things that Calcite does; those things can add up
> and cause major inefficiencies in the planning process and/or sub-optimal
> plans.
>
> Julian
>
>
> > On Mar 24, 2020, at 1:47 AM, Zoltan Haindrich <[email protected]> wrote:
> >
> > Hey,
> >
> > That's a great diagnosis :)
> > I would guess that newCondition became non-nullable for some reason
> > (RexSimplify runs under RexProgramBuilder, so it might be able to narrow
> > the nullability).
> > You could try invoking simplify.simplifyPreservingType() on it to see if
> > that would help.
> >
> > > I know it's necessary to preserve the nullability when simplifying a
> > > boolean expression in project columns, but as for the condition in
> > > Filter/Calc, maybe we can omit the nullability?
> > I think that could probably work: we can't change the nullability on
> > project columns because those could be referenced (and the reference also
> > has the type), but for filter/join conditions we don't need to care about
> > it.
> > It seems we already have a "matchNullability" in ReduceExpressionsRule;
> > for FILTER/JOIN we should probably turn that off...  :)
> >
> > cheers,
> > Zoltan
> >
> >
> > On 3/24/20 9:15 AM, Shuo Cheng wrote:
> >> Hi Zoltan,
> >> I encountered the problem when running TPC tests, and have not
> >> reproduced it in Calcite master.
> >> But I figured out how the problem is produced:
> >> There is a semi join with the condition: AND(EXPANDED_INDF1,
> >> EXPANDED_INDF2), type of AND is BOOLEAN with nullable `true`
> >> After JoinPushExpressionsRule --> join condition: AND(INDF1, INDF2),
> >> type of AND is BOOLEAN with nullable `true`
> >> After SemiJoinProjectTransposeRule --> join condition: CAST(AND(INDF1,
> >> INDF2)), type of AND is BOOLEAN with nullable `false`
> >> Just as you suspected, it's in `SemiJoinProjectTransposeRule` that the
> >> forced type correction is added, by `RexProgramBuilder#addCondition`,
> >> which calls `RexSimplify#simplifyPreservingType` before registering an
> >> expression.
> >> I know it's necessary to preserve the nullability when simplifying a
> >> boolean expression in project columns, but as for the condition in
> >> Filter/Calc, maybe we can omit the nullability?
> >> Best Regards,
> >> Shuo
> >> On Tue, Mar 24, 2020 at 3:35 PM Zoltan Haindrich <[email protected] <mailto:
> [email protected]>> wrote:
> >>    Hey Shuo!
> >>    I think that simplification should have been made on join conditions;
> >>    I've done a quick check, and it seems to be working for me.
> >>    I suspect it is either a missing call to RexSimplify for some reason,
> >>    or it is added by a forced return type correction: IIRC there are
> >>    some cases in which the RexNode type should be retained after
> >>    simplification.
> >>    Is this reproducible on current master? Could you share a testcase?
> >>    cheers,
> >>    Zoltan
> >>    On 3/24/20 7:28 AM, Shuo Cheng wrote:
> >>     > Hi Julian, that's what we do as a workaround: we remove CASTs
> >>     > that only widen nullability, as CALCITE-2695 does, before applying
> >>     > the hash-join/sort-merge-join rules, so that the equi predicate can
> >>     > be split out. I'm not sure whether it's proper for Calcite to do
> >>     > the 'convert back' job, for example by simplifying the join
> >>     > condition when creating a Join. Or maybe other systems that use
> >>     > Calcite should do the "convert back" job as an optimization? What
> >>     > do you think?
> >>     >
> >>     > On Tue, Mar 24, 2020 at 2:04 PM Julian Hyde <
> [email protected] <mailto:[email protected]>> wrote:
> >>     >
> >>     >> Or convert it back to a not-nullable BOOLEAN? The join condition
> >>     >> treats UNKNOWN the same as FALSE, and besides UNKNOWN will never
> >>     >> occur, so the conditions with and without the CAST are equivalent.
> >>     >>
> >>     >> Julian
> >>     >>
> >>     >>> On Mar 23, 2020, at 9:34 PM, Shuo Cheng <[email protected]
> <mailto:[email protected]>> wrote:
> >>     >>>
> >>     >>> Hi all,
> >>     >>>
> >>     >>> Consider the Join condition 'CAST(IS_NOT_DISTINCT_FROM($1, $2),
> >>     >>> BOOLEAN)', which casts a non-nullable BOOLEAN to a nullable
> >>     >>> BOOLEAN. Calcite cannot split out the equi predicate, so some join
> >>     >>> operations such as hash join / sort merge join may not be used.
> >>     >>> Maybe we can expand RelOptUtil#splitJoinCondition to support this
> >>     >>> scenario?
> >>     >>
> >>     >
>
>
