Hey,

That's a great diagnosis :)
I would guess that newCondition became non-nullable for some reason 
(rexSimplify runs under RexProgramBuilder so it might be able to narrow the 
nullability)
you could try invoking simplify.simplifyPreservingType() on it to see if that 
would help.

> I know it's necessary to preserve the nullability when simplifying a boolean 
expression in project columns, but as for condition in Filter/Calc, may be we can 
omit the
> nullability?
I think that could probably work - we can't change the nullability on project columns because those could be referenced (and the reference also has the type) ; but for filter/join conditions we don't need to care with it.
It seems we already have a "matchnullability" in ReduceExpressionsRule ; for 
FILTER/JOIN we should probably turn that off...  :)

cheers,
Zoltan


On 3/24/20 9:15 AM, Shuo Cheng wrote:
Hi Zoltan,

I encountered the problem when running TPC tests, and have not reproduced it in 
Calcite master.

But I figured it out how the problem is produced:

There is semi join with the condition:AND(EXPANDED_INDF1, EXPANDED_INDF2), type 
of AND is BOOLEAN with nullable `true`

After JoinPushExpressionsRule -->> join condition: AND(INDF1, INDF2), type of 
AND is BOOLEAN with nullable `true`

After  SemiJoinProjectTransposeRule --> Join condition: CAST(AND(INDF1, 
INDF2)), type of AND is BOOLEAN with nullable `false`

Just as what you suspected, It's in `SemiJoinProjectTransposeRule` where forced type correction is added by `RexProgramBuilder#addCondition`, which will call `RexSimplify#simplifyPreservingType` before registering an expression.

I know it's necessary to preserve the nullability when simplifying a boolean expression in project columns, but as for condition in Filter/Calc, may be we can omit the nullability?


Best Regards,
Shuo

On Tue, Mar 24, 2020 at 3:35 PM Zoltan Haindrich <[email protected] 
<mailto:[email protected]>> wrote:

    Hey Shuo!

    I think that simplification should been made on join conditions - I've done 
a quick check; and it seems to be working for me.
    I suspected that it will be either a missing call to RexSimplify for some 
reason - or it is added by a forced return type correction: IIRC there are some 
cases in which
    the
    RexNode type should retained after simplification.
    Is this reproducible on current master; could you share a testcase?

    cheers,
    Zoltan


    On 3/24/20 7:28 AM, Shuo Cheng wrote:
     > Hi, Julian, That's what we do as a workaround way. we remove CAST which 
are
     > only widening nullability as what CALCITE-2695 does before applying
     > hash-join/sort-merge-join rule, such that equiv predicate can be split
     > out.  I'm not sure whether it's properly for Calcite to do the 'convert
     > back' job, for example, simplify the join condition when create a Join; 
Or
     > maybe let other systems what use Calcite to do the "convert back" job as 
an
     > optimization? What do you think?
     >
     > On Tue, Mar 24, 2020 at 2:04 PM Julian Hyde <[email protected] 
<mailto:[email protected]>> wrote:
     >
     >> Or convert it back to a not-nullable BOOLEAN? The join condition treats
     >> UNKNOWN the same as FALSE, and besides UNKNOWN will never occur, so the
     >> conditions with and without the CAST are equivalent.
     >>
     >> Julian
     >>
     >>> On Mar 23, 2020, at 9:34 PM, Shuo Cheng <[email protected] 
<mailto:[email protected]>> wrote:
     >>>
     >>> Hi all,
     >>>
     >>> Considering the Join condition 'CAST(IS_NOT_DISTINCT_FROM($1, $2),
     >>> BOOLEAN)', which cast the non-nullable BOOLEAN to nullable BOOLEAN,
     >> Calcite
     >>> can not split out equiv predicate, thus some join operation like hash
     >> join
     >>> / sort merge join may not be used. Maybe we can
     >>> expand RelOptUtil#splitJoinCondition to support this scenario?
     >>
     >

Reply via email to