lgbo-ustc opened a new issue, #6868: URL: https://github.com/apache/incubator-gluten/issues/6868
### Backend

CH (ClickHouse)

### Bug description

[Expected behavior] and [actual behavior].

When `right join` is enabled, the nullability of the left columns is mismatched.

```
[2024-08-15T09:05:15.545Z] 09:05:15.446 WARN org.apache.gluten.execution.CHBroadcastHashJoinExecTransformer: Can't not trace broadcast hash table data BuiltHashTable-577467 because execution id is null. Will clean up until expire time.
[2024-08-15T09:05:15.545Z] - join - join using multiple columns and specifying join type *** FAILED ***
[2024-08-15T09:05:15.546Z] Results do not match for query:
[2024-08-15T09:05:15.546Z] Timezone: sun.util.calendar.ZoneInfo[id="America/Los_Angeles",offset=-28800000,dstSavings=3600000,useDaylight=true,transitions=185,lastRule=java.util.SimpleTimeZone[id=America/Los_Angeles,offset=-28800000,dstSavings=3600000,useDaylight=true,startYear=0,startMode=3,startMonth=2,startDay=8,startDayOfWeek=1,startTime=7200000,startTimeMode=0,endMode=3,endMonth=10,endDay=1,endDayOfWeek=1,endTime=7200000,endTimeMode=0]]
[2024-08-15T09:05:15.546Z] Timezone Env:
[2024-08-15T09:05:15.546Z]
[2024-08-15T09:05:15.546Z] == Parsed Logical Plan ==
[2024-08-15T09:05:15.546Z] 'Join UsingJoin(RightOuter,List(int, str))
[2024-08-15T09:05:15.546Z] :- Project [_1#345574 AS int#345581, _2#345575 AS int2#345582, _3#345576 AS str#345583]
[2024-08-15T09:05:15.546Z] :  +- LocalRelation [_1#345574, _2#345575, _3#345576]
[2024-08-15T09:05:15.546Z] +- Project [_1#345590 AS int#345597, _2#345591 AS int2#345598, _3#345592 AS str#345599]
[2024-08-15T09:05:15.546Z]    +- LocalRelation [_1#345590, _2#345591, _3#345592]
[2024-08-15T09:05:15.546Z]
[2024-08-15T09:05:15.546Z] == Analyzed Logical Plan ==
[2024-08-15T09:05:15.546Z] int: int, str: string, int2: int, int2: int
[2024-08-15T09:05:15.546Z] Project [int#345597, str#345599, int2#345582, int2#345598]
[2024-08-15T09:05:15.546Z] +- Join RightOuter, ((int#345581 = int#345597) AND (str#345583 = str#345599))
[2024-08-15T09:05:15.546Z]    :- Project [_1#345574 AS int#345581, _2#345575 AS int2#345582, _3#345576 AS str#345583]
[2024-08-15T09:05:15.546Z]    :  +- LocalRelation [_1#345574, _2#345575, _3#345576]
[2024-08-15T09:05:15.546Z]    +- Project [_1#345590 AS int#345597, _2#345591 AS int2#345598, _3#345592 AS str#345599]
[2024-08-15T09:05:15.546Z]       +- LocalRelation [_1#345590, _2#345591, _3#345592]
[2024-08-15T09:05:15.546Z]
[2024-08-15T09:05:15.546Z] == Optimized Logical Plan ==
[2024-08-15T09:05:15.546Z] Project [int#345597, str#345599, int2#345582, int2#345598]
[2024-08-15T09:05:15.546Z] +- Join RightOuter, ((int#345581 = int#345597) AND (str#345583 = str#345599))
[2024-08-15T09:05:15.546Z]    :- Project [_1#345574 AS int#345581, _2#345575 AS int2#345582, _3#345576 AS str#345583]
[2024-08-15T09:05:15.546Z]    :  +- Filter isnotnull(_3#345576)
[2024-08-15T09:05:15.546Z]    :     +- LocalRelation [_1#345574, _2#345575, _3#345576]
[2024-08-15T09:05:15.546Z]    +- Project [_1#345590 AS int#345597, _2#345591 AS int2#345598, _3#345592 AS str#345599]
[2024-08-15T09:05:15.546Z]       +- LocalRelation [_1#345590, _2#345591, _3#345592]
[2024-08-15T09:05:15.546Z]
[2024-08-15T09:05:15.546Z] == Physical Plan ==
[2024-08-15T09:05:15.546Z] AdaptiveSparkPlan isFinalPlan=true
[2024-08-15T09:05:15.546Z] +- == Final Plan ==
[2024-08-15T09:05:15.546Z]    CHNativeColumnarToRow
[2024-08-15T09:05:15.546Z]    +- ^(11514) ProjectExecTransformer [int#345597, str#345599, int2#345582, int2#345598]
[2024-08-15T09:05:15.546Z]       +- ^(11514) CHBroadcastHashJoinExecTransformer [int#345581, str#345583], [int#345597, str#345599], RightOuter, BuildLeft, false
[2024-08-15T09:05:15.546Z]          :- ^(11514) InputIteratorTransformer[int#345581, int2#345582, str#345583]
[2024-08-15T09:05:15.546Z]          :  +- BroadcastQueryStage 0
[2024-08-15T09:05:15.546Z]          :     +- ColumnarBroadcastExchange HashedRelationBroadcastMode(List(input[0, int, false], input[2, string, true]),false), [plan_id=577503]
[2024-08-15T09:05:15.546Z]          :        +- ^(11513) ProjectExecTransformer [_1#345574 AS int#345581, _2#345575 AS int2#345582, _3#345576 AS str#345583]
[2024-08-15T09:05:15.546Z]          :           +- ^(11513) FilterExecTransformer isnotnull(_3#345576)
[2024-08-15T09:05:15.546Z]          :              +- ^(11513) InputIteratorTransformer[_1#345574, _2#345575, _3#345576]
[2024-08-15T09:05:15.546Z]          :                 +- RowToCHNativeColumnar
[2024-08-15T09:05:15.546Z]          :                    +- LocalTableScan [_1#345574, _2#345575, _3#345576]
[2024-08-15T09:05:15.546Z]          +- ^(11514) InputIteratorTransformer[int#345597, int2#345598, str#345599]
[2024-08-15T09:05:15.546Z]             +- RowToCHNativeColumnar
[2024-08-15T09:05:15.546Z]                +- LocalTableScan [int#345597, int2#345598, str#345599]
[2024-08-15T09:05:15.546Z] +- == Initial Plan ==
[2024-08-15T09:05:15.546Z]    Project [int#345597, str#345599, int2#345582, int2#345598]
[2024-08-15T09:05:15.546Z]    +- BroadcastHashJoin [int#345581, str#345583], [int#345597, str#345599], RightOuter, BuildLeft, false
[2024-08-15T09:05:15.546Z]       :- BroadcastExchange HashedRelationBroadcastMode(List(input[0, int, false], input[2, string, true]),false), [plan_id=577333]
[2024-08-15T09:05:15.546Z]       :  +- Project [_1#345574 AS int#345581, _2#345575 AS int2#345582, _3#345576 AS str#345583]
[2024-08-15T09:05:15.546Z]       :     +- Filter isnotnull(_3#345576)
[2024-08-15T09:05:15.546Z]       :        +- LocalTableScan [_1#345574, _2#345575, _3#345576]
[2024-08-15T09:05:15.546Z]       +- Project [_1#345590 AS int#345597, _2#345591 AS int2#345598, _3#345592 AS str#345599]
[2024-08-15T09:05:15.546Z]          +- LocalTableScan [_1#345590, _2#345591, _3#345592]
[2024-08-15T09:05:15.546Z]
[2024-08-15T09:05:15.546Z] == Results ==
[2024-08-15T09:05:15.546Z]
[2024-08-15T09:05:15.546Z] == Results ==
[2024-08-15T09:05:15.546Z] !== Correct Answer - 2 ==   == Gluten Answer - 2 ==
[2024-08-15T09:05:15.546Z] !struct<>                   struct<int:int,str:string,int2:int,int2:int>
[2024-08-15T09:05:15.546Z]  [1,1,2,3]                  [1,1,2,3]
[2024-08-15T09:05:15.546Z] ![5,5,null,6]               [5,5,0,6] (GlutenSQLTestsTrait.scala:97)
```

The unmatched right-side row should yield `[5,5,null,6]`, but Gluten returns `[5,5,0,6]`: the left `int2` column comes back as `0` instead of `null`. All left columns should be converted to nullable.

### Spark version

None

### Spark configurations

_No response_

### System information

_No response_

### Relevant logs

_No response_
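
For readers unfamiliar with the expected semantics, here is a minimal sketch in plain Python (not Gluten or Spark code) of what a right outer join should do to the left side: every left column becomes nullable in the output schema, and unmatched right rows get `None` for the left columns. The input rows, the `Field` type, the helper names, and the assumption that `int2` is non-nullable on the input side are all hypothetical, chosen only so that the output matches the two rows of the "Correct Answer" column in the log.

```python
from typing import List, NamedTuple, Optional, Tuple


class Field(NamedTuple):
    name: str
    nullable: bool


def right_outer_schema(left: List[Field], right: List[Field]) -> List[Field]:
    """Output schema of a right outer join: all left fields become nullable."""
    return [Field(f.name, True) for f in left] + list(right)


def right_outer_using(
    left_rows: List[Tuple[int, int, str]],
    right_rows: List[Tuple[int, int, str]],
) -> List[Tuple[int, str, Optional[int], int]]:
    """Naive right outer join on (int, str), projected as [int, str, left.int2, right.int2]."""
    out = []
    for r_int, r_int2, r_str in right_rows:
        matched = [l for l in left_rows if (l[0], l[2]) == (r_int, r_str)]
        if matched:
            for _, l_int2, _ in matched:
                out.append((r_int, r_str, l_int2, r_int2))
        else:
            # No left match: the left column must be NULL, not a default such as 0.
            out.append((r_int, r_str, None, r_int2))
    return out


# Assumed inputs (columns: int, int2, str); not taken from the issue itself.
LEFT_ROWS = [(1, 2, "1"), (3, 4, "3")]
RIGHT_ROWS = [(1, 3, "1"), (5, 6, "5")]

# int/str nullability follows the broadcast mode in the log; int2 is an assumption.
LEFT_SCHEMA = [Field("int", False), Field("int2", False), Field("str", True)]
RIGHT_SCHEMA = [Field("int", False), Field("int2", False), Field("str", True)]

joined = right_outer_using(LEFT_ROWS, RIGHT_ROWS)
schema = right_outer_schema(LEFT_SCHEMA, RIGHT_SCHEMA)

print(joined)
print(schema[:3])
```

If the left `int2` keeps a non-nullable type through the join, a backend has no way to represent the missing value and may fall back to the type's default, which would be consistent with the `0` seen in the Gluten answer; this is an interpretation of the log, not a confirmed root cause.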
