GitHub user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/21416#discussion_r190722260
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala ---
    @@ -220,6 +219,7 @@ object OptimizeIn extends Rule[LogicalPlan] {
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
         case q: LogicalPlan => q transformExpressionsDown {
           case In(v, list) if list.isEmpty && !v.nullable => FalseLiteral
    +      case In(v, list) if list.length == 1 => EqualTo(v, list.head)
--- End diff --
Could you add the following test case, too?
```scala
scala> sql("select * from t group by a having count(*) = (select count(*) from t)").explain
== Physical Plan ==
*(2) Project [a#2L]
+- *(2) Filter (count(1)#75L = Subquery subquery62)
   :  +- Subquery subquery62
   :     +- *(2) HashAggregate(keys=[], functions=[count(1)])
   :        +- Exchange SinglePartition
   :           +- *(1) HashAggregate(keys=[], functions=[partial_count(1)])
   :              +- *(1) Project
   :                 +- *(1) Range (0, 1, step=1, splits=8)
   +- *(2) HashAggregate(keys=[a#2L], functions=[count(1)])
      +- Exchange hashpartitioning(a#2L, 200)
         +- *(1) HashAggregate(keys=[a#2L], functions=[partial_count(1)])
            +- *(1) Project [id#0L AS a#2L]
               +- *(1) Range (0, 1, step=1, splits=8)

scala> sql("select * from t group by a having count(*) in (select count(*) from t)").explain
java.lang.StackOverflowError
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
```
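
For illustration only, here is a toy model of the rewrite, not Catalyst code. `Attr`, `Lit`, `ListQuery`, `In`, and `EqualTo` below are simplified stand-ins, and `ToyOptimizeIn` is a hypothetical name. It assumes the `IN (subquery)` reaches the rule as an `In` whose list holds a single subquery expression, which I have not verified against this branch. Under that assumption, one possible guard keeps the single-element rule from firing on subqueries:

```scala
// Toy expression tree: hypothetical stand-ins, not the real Catalyst classes.
sealed trait Expr
case class Attr(name: String) extends Expr
case class Lit(value: Long) extends Expr
case class ListQuery(sql: String) extends Expr // models the operand of IN (subquery)
case class In(value: Expr, list: Seq[Expr]) extends Expr
case class EqualTo(left: Expr, right: Expr) extends Expr

object ToyOptimizeIn {
  // Rewrite a single-element IN to equality, but leave IN (subquery) untouched.
  def rewrite(e: Expr): Expr = e match {
    case In(v, Seq(single)) if !single.isInstanceOf[ListQuery] => EqualTo(v, single)
    case other => other
  }
}

object Demo extends App {
  import ToyOptimizeIn.rewrite

  val a = Attr("a")

  // A literal single-element list becomes an equality.
  assert(rewrite(In(a, Seq(Lit(1L)))) == EqualTo(a, Lit(1L)))

  // A subquery operand is left as IN.
  val inSubquery = In(a, Seq(ListQuery("select count(*) from t")))
  assert(rewrite(inSubquery) == inSubquery)
}
```

Whether the actual fix should be a guard like this or something else is up to you; the point is only that the new case should be covered by a test with a subquery on the right-hand side.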