GitHub user maropu commented on a diff in the pull request:
https://github.com/apache/spark/pull/22141#discussion_r214007983
--- Diff:
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RewriteSubquerySuite.scala
---
@@ -52,4 +52,21 @@ class RewriteSubquerySuite extends PlanTest {
comparePlans(optimized, correctAnswer)
}
+ test("NOT-IN subquery nested inside OR") {
+ val relation1 = LocalRelation('a.int, 'b.int)
+ val relation2 = LocalRelation('c.int, 'd.int)
+ val exists = 'exists.boolean.notNull
+
+ val query = relation1.where('b === 1 ||
Not('a.in(ListQuery(relation2.select('c))))).select('a)
+
+ val plan = relation1.select('a).where('b === 1 || Not('exists))
+ val correctAnswer = relation1
+ .join(relation2.select('c), ExistenceJoin(exists), Some('a === 'c ||
IsNull('a === 'c)))
+ .where('b === 1 || Not(exists))
+ .select('a)
+ .analyze
+ val optimized = Optimize.execute(query.analyze)
+
--- End diff --
End-to-end tests for the conflicting-attribute case already exist, but IMO
it'd be better to add deduplication tests here, too.
---