It does seem to be something that RelBuilder could do. (RexSimplify can’t 
really do it, because it doesn’t know how the expression is being used.)
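
A minimal sketch of the idea, using a toy expression model rather than Calcite's actual RexNode/RelBuilder API. The class, fields and method names below are all hypothetical, and it assumes the caller knows the expression is in condition position:

```java
import java.util.Objects;

// Toy expression model; NOT Calcite's RexNode API. All names are hypothetical.
final class NullabilityCastSketch {

    /** An expression with a SQL type name and a nullability flag. */
    static final class Expr {
        final String op;        // operator name, e.g. "IS_NOT_DISTINCT_FROM" or "CAST"
        final String typeName;  // result type name, e.g. "BOOLEAN"
        final boolean nullable; // whether the result type admits NULL
        final Expr operand;     // the CAST operand; null for every other operator

        Expr(String op, String typeName, boolean nullable, Expr operand) {
            this.op = Objects.requireNonNull(op);
            this.typeName = Objects.requireNonNull(typeName);
            this.nullable = nullable;
            this.operand = operand;
        }
    }

    /** True if e is a CAST that keeps the type name, so only nullability can differ. */
    static boolean isNullabilityOnlyCast(Expr e) {
        return e.op.equals("CAST")
            && e.operand != null
            && e.operand.typeName.equals(e.typeName);
    }

    /**
     * Strips nullability-only CASTs from the root of a condition.
     * Intended only for filter/join conditions, where UNKNOWN is treated
     * the same as FALSE; project columns must keep their declared type.
     */
    static Expr stripForCondition(Expr condition) {
        Expr e = condition;
        while (isNullabilityOnlyCast(e)) {
            e = e.operand;
        }
        return e;
    }
}
```

The point of putting this in the builder is that the caller knows the expression is a condition; a general-purpose simplifier does not have that context.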

It’s also worth discovering why the CAST was added in the first place. It 
doesn’t seem to be helpful. I think we should strive to eliminate all of the 
slightly unhelpful things that Calcite does; those things can add up and cause 
major inefficiencies in the planning process and/or sub-optimal plans.

Julian


> On Mar 24, 2020, at 1:47 AM, Zoltan Haindrich <k...@rxd.hu> wrote:
> 
> Hey,
> 
> That's a great diagnosis :)
> I would guess that newCondition became non-nullable for some reason 
> (RexSimplify runs under RexProgramBuilder, so it might be able to narrow the 
> nullability).
> You could try invoking simplify.simplifyPreservingType() on it to see if that 
> would help.
> 
> > I know it's necessary to preserve the nullability when simplifying a 
> > boolean expression in project columns, but as for the condition in 
> > Filter/Calc, maybe we can omit the nullability?
> I think that could probably work - we can't change the nullability on project 
> columns because those could be referenced (and the reference also carries the 
> type); but for filter/join conditions we don't need to care about it.
> It seems we already have a "matchNullability" in ReduceExpressionsRule; for 
> FILTER/JOIN we should probably turn that off...  :)
> 
> cheers,
> Zoltan
> 
> 
> On 3/24/20 9:15 AM, Shuo Cheng wrote:
>> Hi Zoltan,
>> I encountered the problem when running TPC tests, and have not reproduced it 
>> in Calcite master.
>> But I figured out how the problem is produced:
>> There is a semi join with the condition: AND(EXPANDED_INDF1, EXPANDED_INDF2), 
>> type of AND is BOOLEAN with nullable `true`
>> After JoinPushExpressionsRule --> join condition: AND(INDF1, INDF2), type 
>> of AND is BOOLEAN with nullable `true`
>> After SemiJoinProjectTransposeRule --> join condition: CAST(AND(INDF1, 
>> INDF2)), type of AND is BOOLEAN with nullable `false`
>> Just as you suspected, it's in `SemiJoinProjectTransposeRule` that the 
>> forced type correction is added, by `RexProgramBuilder#addCondition`, which 
>> calls `RexSimplify#simplifyPreservingType` before registering an 
>> expression.
>> I know it's necessary to preserve the nullability when simplifying a boolean 
>> expression in project columns, but as for the condition in Filter/Calc, maybe 
>> we can omit the nullability?
>> Best Regards,
>> Shuo
>> On Tue, Mar 24, 2020 at 3:35 PM Zoltan Haindrich <k...@rxd.hu 
>> <mailto:k...@rxd.hu>> wrote:
>>    Hey Shuo!
>>    I think that simplification should have been made on join conditions - I've
>>    done a quick check, and it seems to be working for me.
>>    I suspect that it is either a missing call to RexSimplify for some
>>    reason - or it is added by a forced return type correction: IIRC there are
>>    some cases in which the RexNode type should be retained after
>>    simplification.
>>    Is this reproducible on current master; could you share a testcase?
>>    cheers,
>>    Zoltan
>>    On 3/24/20 7:28 AM, Shuo Cheng wrote:
>>     > Hi, Julian, that's what we do as a workaround: we remove CASTs that
>>     > only widen nullability, as CALCITE-2695 does, before applying the
>>     > hash-join/sort-merge-join rules, so that the equiv predicate can be
>>     > split out. I'm not sure whether it's proper for Calcite to do the
>>     > 'convert back' job, for example, simplifying the join condition when
>>     > creating a Join; or maybe let other systems that use Calcite do the
>>     > "convert back" job as an optimization? What do you think?
>>     >
>>     > On Tue, Mar 24, 2020 at 2:04 PM Julian Hyde <jhyde.apa...@gmail.com 
>> <mailto:jhyde.apa...@gmail.com>> wrote:
>>     >
>>     >> Or convert it back to a not-nullable BOOLEAN? The join condition
>>     >> treats UNKNOWN the same as FALSE, and besides UNKNOWN will never
>>     >> occur, so the conditions with and without the CAST are equivalent.
>>     >>
>>     >> Julian
>>     >>
>>     >>> On Mar 23, 2020, at 9:34 PM, Shuo Cheng <njucs...@gmail.com 
>> <mailto:njucs...@gmail.com>> wrote:
>>     >>>
>>     >>> Hi all,
>>     >>>
>>     >>> Consider the join condition 'CAST(IS_NOT_DISTINCT_FROM($1, $2),
>>     >>> BOOLEAN)', which casts the non-nullable BOOLEAN to a nullable BOOLEAN.
>>     >>> Calcite cannot split out the equiv predicate, so some join operations
>>     >>> like hash join / sort merge join may not be used. Maybe we can
>>     >>> expand RelOptUtil#splitJoinCondition to support this scenario?
>>     >>
>>     >
