wangyum commented on a change in pull request #28642:
URL: https://github.com/apache/spark/pull/28642#discussion_r435725267
##########
File path:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/InferFiltersFromConstraintsSuite.scala
##########
@@ -316,4 +316,19 @@ class InferFiltersFromConstraintsSuite extends PlanTest {
condition)
}
}
+
+ test("Infer IsNotNull for non null-intolerant child of null intolerant join condition") {
+ testConstraintsAfterJoin(
+ testRelation.subquery('left),
+ testRelation.subquery('right),
+ testRelation.where(IsNotNull(Coalesce(Seq('a, 'b)))).subquery('left),
Review comment:
```
hive> EXPLAIN SELECT t1.* FROM t1 JOIN t2 ON coalesce(t1.a, t1.b)=t2.a;
OK
STAGE DEPENDENCIES:
Stage-4 is a root stage
Stage-3 depends on stages: Stage-4
Stage-0 depends on stages: Stage-3
STAGE PLANS:
Stage: Stage-4
Map Reduce Local Work
Alias -> Map Local Tables:
$hdt$_0:t1
Fetch Operator
limit: -1
Alias -> Map Local Operator Tree:
$hdt$_0:t1
TableScan
alias: t1
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Filter Operator
predicate: COALESCE(a,b) is not null (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
expressions: a (type: string), b (type: string), c (type: string)
outputColumnNames: _col0, _col1, _col2
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
HashTable Sink Operator
keys:
0 COALESCE(_col0,_col1) (type: string)
1 _col0 (type: string)
Stage: Stage-3
Map Reduce
Map Operator Tree:
TableScan
alias: t2
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Filter Operator
predicate: a is not null (type: boolean)
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Select Operator
expressions: a (type: string)
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
Map Join Operator
condition map:
Inner Join 0 to 1
keys:
0 COALESCE(_col0,_col1) (type: string)
1 _col0 (type: string)
outputColumnNames: _col0, _col1, _col2
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
table:
input format: org.apache.hadoop.mapred.SequenceFileInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
Execution mode: vectorized
Local Work:
Map Reduce Local Work
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
ListSink
```