Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22518#discussion_r232725686

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala ---
    @@ -1268,4 +1269,16 @@ class SubquerySuite extends QueryTest with SharedSQLContext {
           assert(getNumSortsInQuery(query5) == 1)
         }
       }
    +
    +  test("SPARK-25482: Reuse same Subquery in order to execute it only once") {
    +    withTempView("t1", "t2") {
    +      sql("create temporary view t1(a int) using parquet")
    +      sql("create temporary view t2(b int) using parquet")
    +      val plan = sql("select * from t2 where b > (select max(a) from t1)")
    --- End diff --

    > it also means the data source scan must wait until the subquery is finished

    The subquery should be executed anyway sooner or later, right? So I don't see the problem here: am I missing something?

    Ok, thanks, I'll follow your suggestion and forbid it here and create a new ticket about pushing it down to data sources. Thanks.