lgbo-ustc opened a new issue, #6868:
URL: https://github.com/apache/incubator-gluten/issues/6868

   ### Backend
   
   CH (ClickHouse)
   
   ### Bug description
   
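   When `right join` is enabled, the nullability of the left-side columns is mismatched. In the failing test below, the unmatched row returns `0` instead of `null` for the left `int2` column.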
   
   ```
   [2024-08-15T09:05:15.545Z] 09:05:15.446 WARN 
org.apache.gluten.execution.CHBroadcastHashJoinExecTransformer: Can't not trace 
broadcast hash table data BuiltHashTable-577467 because execution id is null. 
Will clean up until expire time.
   [2024-08-15T09:05:15.545Z] - join - join using multiple columns and 
specifying join type *** FAILED ***
   [2024-08-15T09:05:15.546Z]   Results do not match for query:
   [2024-08-15T09:05:15.546Z]   Timezone: 
sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
   [2024-08-15T09:05:15.546Z]   Timezone Env: 
   [2024-08-15T09:05:15.546Z]   
   [2024-08-15T09:05:15.546Z]   == Parsed Logical Plan ==
   [2024-08-15T09:05:15.546Z]   'Join UsingJoin(RightOuter,List(int, str))
   [2024-08-15T09:05:15.546Z]   :- Project [_1#345574 AS int#345581, _2#345575 
AS int2#345582, _3#345576 AS str#345583]
   [2024-08-15T09:05:15.546Z]   :  +- LocalRelation [_1#345574, _2#345575, 
_3#345576]
   [2024-08-15T09:05:15.546Z]   +- Project [_1#345590 AS int#345597, _2#345591 
AS int2#345598, _3#345592 AS str#345599]
   [2024-08-15T09:05:15.546Z]      +- LocalRelation [_1#345590, _2#345591, 
_3#345592]
   [2024-08-15T09:05:15.546Z]   
   [2024-08-15T09:05:15.546Z]   == Analyzed Logical Plan ==
   [2024-08-15T09:05:15.546Z]   int: int, str: string, int2: int, int2: int
   [2024-08-15T09:05:15.546Z]   Project [int#345597, str#345599, int2#345582, 
int2#345598]
   [2024-08-15T09:05:15.546Z]   +- Join RightOuter, ((int#345581 = int#345597) 
AND (str#345583 = str#345599))
   [2024-08-15T09:05:15.546Z]      :- Project [_1#345574 AS int#345581, 
_2#345575 AS int2#345582, _3#345576 AS str#345583]
   [2024-08-15T09:05:15.546Z]      :  +- LocalRelation [_1#345574, _2#345575, 
_3#345576]
   [2024-08-15T09:05:15.546Z]      +- Project [_1#345590 AS int#345597, 
_2#345591 AS int2#345598, _3#345592 AS str#345599]
   [2024-08-15T09:05:15.546Z]         +- LocalRelation [_1#345590, _2#345591, 
_3#345592]
   [2024-08-15T09:05:15.546Z]   
   [2024-08-15T09:05:15.546Z]   == Optimized Logical Plan ==
   [2024-08-15T09:05:15.546Z]   Project [int#345597, str#345599, int2#345582, 
int2#345598]
   [2024-08-15T09:05:15.546Z]   +- Join RightOuter, ((int#345581 = int#345597) 
AND (str#345583 = str#345599))
   [2024-08-15T09:05:15.546Z]      :- Project [_1#345574 AS int#345581, 
_2#345575 AS int2#345582, _3#345576 AS str#345583]
   [2024-08-15T09:05:15.546Z]      :  +- Filter isnotnull(_3#345576)
   [2024-08-15T09:05:15.546Z]      :     +- LocalRelation [_1#345574, 
_2#345575, _3#345576]
   [2024-08-15T09:05:15.546Z]      +- Project [_1#345590 AS int#345597, 
_2#345591 AS int2#345598, _3#345592 AS str#345599]
   [2024-08-15T09:05:15.546Z]         +- LocalRelation [_1#345590, _2#345591, 
_3#345592]
   [2024-08-15T09:05:15.546Z]   
   [2024-08-15T09:05:15.546Z]   == Physical Plan ==
   [2024-08-15T09:05:15.546Z]   AdaptiveSparkPlan isFinalPlan=true
   [2024-08-15T09:05:15.546Z]   +- == Final Plan ==
   [2024-08-15T09:05:15.546Z]      CHNativeColumnarToRow
   [2024-08-15T09:05:15.546Z]      +- ^(11514) ProjectExecTransformer 
[int#345597, str#345599, int2#345582, int2#345598]
   [2024-08-15T09:05:15.546Z]         +- ^(11514) 
CHBroadcastHashJoinExecTransformer [int#345581, str#345583], [int#345597, 
str#345599], RightOuter, BuildLeft, false
   [2024-08-15T09:05:15.546Z]            :- ^(11514) 
InputIteratorTransformer[int#345581, int2#345582, str#345583]
   [2024-08-15T09:05:15.546Z]            :  +- BroadcastQueryStage 0
   [2024-08-15T09:05:15.546Z]            :     +- ColumnarBroadcastExchange 
HashedRelationBroadcastMode(List(input[0, int, false], input[2, string, 
true]),false), [plan_id=577503]
   [2024-08-15T09:05:15.546Z]            :        +- ^(11513) 
ProjectExecTransformer [_1#345574 AS int#345581, _2#345575 AS int2#345582, 
_3#345576 AS str#345583]
   [2024-08-15T09:05:15.546Z]            :           +- ^(11513) 
FilterExecTransformer isnotnull(_3#345576)
   [2024-08-15T09:05:15.546Z]            :              +- ^(11513) 
InputIteratorTransformer[_1#345574, _2#345575, _3#345576]
   [2024-08-15T09:05:15.546Z]            :                 +- 
RowToCHNativeColumnar
   [2024-08-15T09:05:15.546Z]            :                    +- LocalTableScan 
[_1#345574, _2#345575, _3#345576]
   [2024-08-15T09:05:15.546Z]            +- ^(11514) 
InputIteratorTransformer[int#345597, int2#345598, str#345599]
   [2024-08-15T09:05:15.546Z]               +- RowToCHNativeColumnar
   [2024-08-15T09:05:15.546Z]                  +- LocalTableScan [int#345597, 
int2#345598, str#345599]
   [2024-08-15T09:05:15.546Z]   +- == Initial Plan ==
   [2024-08-15T09:05:15.546Z]      Project [int#345597, str#345599, 
int2#345582, int2#345598]
   [2024-08-15T09:05:15.546Z]      +- BroadcastHashJoin [int#345581, 
str#345583], [int#345597, str#345599], RightOuter, BuildLeft, false
   [2024-08-15T09:05:15.546Z]         :- BroadcastExchange 
HashedRelationBroadcastMode(List(input[0, int, false], input[2, string, 
true]),false), [plan_id=577333]
   [2024-08-15T09:05:15.546Z]         :  +- Project [_1#345574 AS int#345581, 
_2#345575 AS int2#345582, _3#345576 AS str#345583]
   [2024-08-15T09:05:15.546Z]         :     +- Filter isnotnull(_3#345576)
   [2024-08-15T09:05:15.546Z]         :        +- LocalTableScan [_1#345574, 
_2#345575, _3#345576]
   [2024-08-15T09:05:15.546Z]         +- Project [_1#345590 AS int#345597, 
_2#345591 AS int2#345598, _3#345592 AS str#345599]
   [2024-08-15T09:05:15.546Z]            +- LocalTableScan [_1#345590, 
_2#345591, _3#345592]
   [2024-08-15T09:05:15.546Z]   
   [2024-08-15T09:05:15.546Z]   == Results ==
   [2024-08-15T09:05:15.546Z]   
   [2024-08-15T09:05:15.546Z]   == Results ==
   [2024-08-15T09:05:15.546Z]   !== Correct Answer - 2 ==   == Gluten Answer - 
2 ==
   [2024-08-15T09:05:15.546Z]   !struct<>                   
struct<int:int,str:string,int2:int,int2:int>
   [2024-08-15T09:05:15.546Z]    [1,1,2,3]                  [1,1,2,3]
   [2024-08-15T09:05:15.546Z]   ![5,5,null,6]               [5,5,0,6] 
(GlutenSQLTestsTrait.scala:97)
   ```
   
   For a right outer join, all left-side columns should be converted to nullable.
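
The rule can be illustrated with a minimal, self-contained sketch. It is a simplified model and not Gluten or Spark code: `Attr`, `JoinType`, and `JoinOutput.output` are hypothetical names introduced only for illustration. The idea is that the side of an outer join that may have no match must expose its columns as nullable, whatever their nullability was in the child plan.

```scala
// Hypothetical, simplified model of join output nullability.
// None of these names come from Gluten or Spark; they are illustrative only.
final case class Attr(name: String, dataType: String, nullable: Boolean) {
  def withNullability(n: Boolean): Attr = copy(nullable = n)
}

sealed trait JoinType
case object Inner extends JoinType
case object LeftOuter extends JoinType
case object RightOuter extends JoinType
case object FullOuter extends JoinType

object JoinOutput {
  // Columns of the side that can be unmatched are forced to nullable.
  def output(left: Seq[Attr], right: Seq[Attr], joinType: JoinType): Seq[Attr] =
    joinType match {
      case Inner      => left ++ right
      case LeftOuter  => left ++ right.map(_.withNullability(true))
      // The case reported in this issue: every left column becomes nullable.
      case RightOuter => left.map(_.withNullability(true)) ++ right
      case FullOuter  => (left ++ right).map(_.withNullability(true))
    }
}
```

Under this model, a non-nullable left `int2` column is reported as nullable after a `RightOuter` join, so an unmatched right row can carry `null` there rather than a default value such as `0`.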
   
   ### Spark version
   
   None
   
   ### Spark configurations
   
   _No response_
   
   ### System information
   
   _No response_
   
   ### Relevant logs
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

