AngersZhuuuu commented on issue #26437: [SPARK-29800][SQL] Plan non-correlated Exists 's subquery in PlanSubqueries
URL: https://github.com/apache/spark/pull/26437#issuecomment-570445224

**With current pr**
```
scala> (1 to 10000).toDF("id").createOrReplaceTempView("s1")

scala> (0 to 50000).toDF("id").createOrReplaceTempView("s2")

scala> (0 to 1000000).map(_ * 2).toDF("id").createOrReplaceTempView("s3")

scala> val df = sql("SELECT s1.id FROM s1 WHERE EXISTS (SELECT * from s3)")
df: org.apache.spark.sql.DataFrame = [id: int]

scala> var start = System.currentTimeMillis()
start: Long = 1578018739056

scala> df.show(5)
+---+
| id|
+---+
|  1|
|  2|
|  3|
|  4|
|  5|
+---+
only showing top 5 rows

scala> var end = System.currentTimeMillis()
end: Long = 1578018740882

scala> println(s"duration = ${end - start}")
duration = 1826

scala>
```

**Without pr**
```
scala> (1 to 10000).toDF("id").createOrReplaceTempView("s1")

scala> (0 to 50000).toDF("id").createOrReplaceTempView("s2")

scala> (0 to 1000000).map(_ * 2).toDF("id").createOrReplaceTempView("s3")

scala> val df = sql("SELECT s1.id FROM s1 WHERE EXISTS (SELECT * from s3)")
df: org.apache.spark.sql.DataFrame = [id: int]

scala> var start = System.currentTimeMillis()
start: Long = 1578020812055

scala> df.show(5)
20/01/03 11:07:00 dispatcher-event-loop-4 WARN TaskSetManager: Stage 0 contains a task of very large size (4035 KiB). The maximum recommended task size is 1000 KiB.
+---+
| id|
+---+
|  1|
|  2|
|  3|
|  4|
|  5|
+---+
only showing top 5 rows

scala> var end = System.currentTimeMillis()
end: Long = 1578020823600

scala> println(s"duration = ${end - start}")
duration = 11545
```
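
The start/end bookkeeping in the runs above could be wrapped in a small helper so both measurements are taken the same way. This is only a sketch: `timed` is a hypothetical name, not part of Spark, and a single wall-clock measurement like this still includes JVM warm-up and job scheduling overhead.

```
// Hypothetical helper (not a Spark API): runs `body` once and prints the
// elapsed wall-clock time in milliseconds, then returns the body's result.
def timed[T](label: String)(body: => T): T = {
  val start = System.nanoTime()
  val result = body
  val elapsedMs = (System.nanoTime() - start) / 1000000L
  println(s"$label duration = $elapsedMs ms")
  result
}
```

In a spark-shell session with the views above registered, it might be used as `timed("exists")(df.show(5))`.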
