This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new a4a83a31ed3 [SPARK-39545][SQL] Override `concat` method for `ExpressionSet` in Scala 2.13 to improve the performance
a4a83a31ed3 is described below

commit a4a83a31ed355c85097bce284eac05dbfd06d039
Author: yangjie01 <yangji...@baidu.com>
AuthorDate: Wed Jun 22 18:39:07 2022 -0500

    [SPARK-39545][SQL] Override `concat` method for `ExpressionSet` in Scala 2.13 to improve the performance

    ### What changes were proposed in this pull request?
    The `ExpressionSet ++` method in the master branch is a little slower than in branch-3.3 with Scala 2.13, so this PR overrides the `concat` method for `ExpressionSet` in Scala 2.13.

    ### Why are the changes needed?
    Improve the performance of `ExpressionSet ++` with Scala 2.13.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?

    - Pass GitHub Actions (GA)
    - Manual test 1: ran the following microbenchmark with Scala 2.13:

    ```scala
    val valuesPerIteration = 100000
    val benchmark = new Benchmark("Test ExpressionSet ++ ", valuesPerIteration, output = output)
    val aUpper = AttributeReference("A", IntegerType)(exprId = ExprId(1))
    val initialSet = ExpressionSet(aUpper + 1 :: Rand(0) :: Nil)
    val setToAddWithSameDeterministicExpression = ExpressionSet(aUpper + 1 :: Rand(0) :: Nil)

    benchmark.addCase("Test ++") { _: Int =>
      for (_ <- 0L until valuesPerIteration) {
        initialSet ++ setToAddWithSameDeterministicExpression
      }
    }

    benchmark.run()
    ```

    **branch-3.3 result:**

    ```
    OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
    Intel(R) Xeon(R) Gold 6XXXC CPU 2.60GHz
    Test ExpressionSet ++ :                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)  Relative
    ------------------------------------------------------------------------------------------------------------------------
    Test ++                                               14             16           4          7.2         139.1      1.0X
    ```

    **master result before this PR:**

    ```
    OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
    Intel(R) Xeon(R) Gold 6XXXC CPU 2.60GHz
    Test ExpressionSet ++ :                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)  Relative
    ------------------------------------------------------------------------------------------------------------------------
    Test ++                                               16             19           5          6.1         163.9      1.0X
    ```

    **master result after this PR:**

    ```
    OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
    Intel(R) Xeon(R) Gold 6XXXC CPU 2.60GHz
    Test ExpressionSet ++ :                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)  Relative
    ------------------------------------------------------------------------------------------------------------------------
    Test ++                                               12             13           3          8.6         115.7      1.0X
    ```

    - Manual test 2:

    ```
    dev/change-scala-version.sh 2.13
    mvn clean install -pl sql/core -am -DskipTests -Pscala-2.13
    mvn test -pl sql/catalyst -Pscala-2.13
    mvn test -pl sql/core -Pscala-2.13
    ```

    ```
    Run completed in 10 minutes, 40 seconds.
    Total number of tests run: 6584
    Suites: completed 285, aborted 0
    Tests: succeeded 6584, failed 0, canceled 0, ignored 5, pending 0
    All tests passed.
    ```

    ```
    Run completed in 1 hour, 27 minutes, 16 seconds.
    Total number of tests run: 11745
    Suites: completed 520, aborted 0
    Tests: succeeded 11745, failed 0, canceled 7, ignored 57, pending 0
    All tests passed.
    ```

    Closes #36942 from LuciferYang/ExpressionSet.

    Authored-by: yangjie01 <yangji...@baidu.com>
    Signed-off-by: Sean Owen <sro...@gmail.com>
---
 .../org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala b/sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala
index e38deedec6d..a615223ef79 100644
--- a/sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala
+++ b/sql/catalyst/src/main/scala-2.13/org/apache/spark/sql/catalyst/expressions/ExpressionSet.scala
@@ -132,6 +132,12 @@ class ExpressionSet protected(
     newSet
   }
 
+  override def concat(that: IterableOnce[Expression]): ExpressionSet = {
+    val newSet = clone()
+    that.iterator.foreach(newSet.add)
+    newSet
+  }
+
   override def --(that: IterableOnce[Expression]): ExpressionSet = {
     val newSet = clone()
     that.iterator.foreach(newSet.remove)
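
The clone-then-add pattern in the hunk above can be tried without a Spark checkout. What follows is a minimal sketch on a toy collection, not Spark code: `ConcatSketch`, `CiSet`, `normalize`, `add` and `copy` are hypothetical names introduced only for illustration, and case-insensitive string matching stands in for `ExpressionSet`'s own notion of equality. It relies on the Scala 2.13 standard library defining `++` on `SetOps` as an alias for `concat`, so overriding `concat` is enough to change what `++` does.

```scala
import scala.collection.mutable

object ConcatSketch {

  // Hypothetical toy set (not Spark code): strings compared case-insensitively.
  final class CiSet private (private val keys: mutable.HashSet[String])
      extends scala.collection.Set[String]
      with scala.collection.SetOps[String, scala.collection.Set, CiSet] {

    def this() = this(mutable.HashSet.empty[String])

    private def normalize(s: String): String = s.toLowerCase
    private def add(s: String): Unit = keys += normalize(s)
    // Plays the role of clone() in the diff: copy the backing state once.
    private def copy(): CiSet = new CiSet(keys.clone())

    override def contains(elem: String): Boolean = keys.contains(normalize(elem))
    override def iterator: Iterator[String] = keys.iterator
    override def empty: CiSet = new CiSet()

    override def diff(that: scala.collection.Set[String]): CiSet = {
      val out = copy()
      that.foreach(e => out.keys -= normalize(e))
      out
    }

    // Needed so inherited operations (filter, etc.) also return CiSet.
    override protected def fromSpecific(coll: IterableOnce[String]): CiSet = {
      val out = new CiSet()
      coll.iterator.foreach(out.add)
      out
    }

    override protected def newSpecificBuilder: mutable.Builder[String, CiSet] =
      new mutable.Builder[String, CiSet] {
        private var acc = new CiSet()
        def addOne(elem: String): this.type = { acc.add(elem); this }
        def clear(): Unit = { acc = new CiSet() }
        def result(): CiSet = acc
      }

    // Same shape as the override in the diff: copy the receiver, then add
    // only the incoming elements. `++` delegates to this method.
    override def concat(that: IterableOnce[String]): CiSet = {
      val out = copy()
      that.iterator.foreach(out.add)
      out
    }
  }

  def main(args: Array[String]): Unit = {
    val a: CiSet = new CiSet() ++ Seq("Alpha", "beta")
    val b: CiSet = a ++ Seq("ALPHA", "Gamma") // "ALPHA" is already present
    println(s"${a.size} ${b.size}")
  }
}
```

The receiver is left unchanged because `concat` works on a copy, which matches the non-mutating contract of `++`. Whether this pattern pays off for another custom set depends on how cheap the copy is, so it is worth measuring, as the benchmark in the commit message does.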