maropu edited a comment on issue #25710: [SPARK-29008][SQL] Define an individual method for each common subexpression in HashAggregateExec
URL: https://github.com/apache/spark/pull/25710#issuecomment-530144242

```
(base) maropu@~:$system_profiler SPHardwareDataType
Hardware:
    Hardware Overview:
      Processor Name: Intel Core i5
      Processor Speed: 2 GHz
      Number of Processors: 1
      Total Number of Cores: 2
      L2 Cache (per Core): 256 KB
      L3 Cache: 4 MB
      Memory: 8 GB

(base) maropu@~:$java -version
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

(base) maropu@~:$ /bin/spark-shell --master=local[1] --conf spark.driver.memory=8g --conf spark.sql.shuffle.partitions=1 -v

val numCols = 40
val colExprs = "id AS key" +: (0 until numCols).map { i => s"id AS _c$i" }
spark.range(3000000).selectExpr(colExprs: _*).createOrReplaceTempView("t")
val aggExprs = (2 until numCols).map { i =>
  (0 until i).map(d => s"_c$d")
    .mkString("AVG(", " + ", ")")
}

// Drop the time of the first run, then pick that of the second run
timer { sql(s"SELECT ${aggExprs.mkString(", ")} FROM t").write.format("noop").save() }

// the master
maxCodeGen: 12957
Elapsed time: 36.309858661s

// this pr
maxCodeGen=4184
Elapsed time: 2.399490285s
```
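The snippet above calls `timer { ... }` without showing its definition. A minimal sketch of such a helper follows, assuming it only measures wall-clock time and prints it in the same `Elapsed time: ...s` form as the reported output. The name and signature are inferred from the call site, not taken from the PR.

```scala
// Hypothetical helper: the comment uses `timer { ... }` but does not define it.
// Runs the block once, prints the wall-clock time in seconds, and returns the block's result.
def timer[R](block: => R): R = {
  val start = System.nanoTime()
  val result = block
  val elapsedSec = (System.nanoTime() - start) / 1e9
  println(s"Elapsed time: ${elapsedSec}s")
  result
}
```

Pasted into the same `spark-shell` session before the query, this would let the benchmark line run as written. Running it twice and keeping the second measurement matches the note about dropping the first run. For reference, the two reported timings (36.3 s on master, 2.4 s with this PR) differ by roughly 15x.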
