wankunde commented on PR #41119:
URL: https://github.com/apache/spark/pull/41119#issuecomment-1571755196
I added a microbenchmark to evaluate the overhead of the additional function call.
The expressions `a * b`, `a / b`, and `a * b / c` are each wrapped in a
function that is called only once.
For the cheap expression (`a * b`), the average query time increases from
1165 ms to 1270 ms.
For the more complex expressions (`a / b` and `a * b / c`), the average query
time is essentially unchanged (6527 ms vs. 6582 ms, and 7982 ms vs. 7997 ms).
```scala
override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
  spark.range(1, 20000000, 1, 1)
    .selectExpr(
      "cast(id + 1 as decimal) as a",
      "cast(id + 2 as decimal) as b",
      "cast(id + 3 as decimal) as c",
      "cast(id + 4 as decimal) as d")
    .createOrReplaceTempView("tab")

  runBenchmark("Subexpression elimination in FilterExec") {
    val benchmark =
      new Benchmark("Subexpression elimination in FilterExec", 20000000,
        output = output)
    for (expr <- Seq("a * b", "a / b", "a * b / c")) {
      benchmark.addCase(s"Test $expr expr") { _ =>
        val query =
          s"""
             |SELECT a, b, c, d
             |FROM tab
             |WHERE $expr < 0 AND $expr < 1
             |""".stripMargin
        spark.sql(query).noop()
      }
    }
    benchmark.run()
  }
}
```
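To make the idea concrete: the predicate `$expr < 0 AND $expr < 1` mentions the same expression twice, and the change evaluates it through a single function call whose result is reused. The following is a minimal plain-Scala sketch of that idea only, not Spark's generated code; `Row`, `CountingExpr`, `naive`, and `shared` are hypothetical names introduced for illustration.

```scala
object SubexprSketch {
  // Hypothetical row type standing in for the benchmark's a, b, c columns.
  final case class Row(a: BigDecimal, b: BigDecimal, c: BigDecimal)

  // Wraps the repeated expression `a * b / c` and counts its evaluations.
  final class CountingExpr {
    var calls = 0
    def apply(r: Row): BigDecimal = {
      calls += 1
      r.a * r.b / r.c
    }
  }

  // Naive predicate: each conjunct re-evaluates the expression.
  def naive(r: Row, e: CountingExpr): Boolean =
    e(r) < 0 && e(r) < 1

  // Shared predicate: the expression is evaluated once and the value reused.
  def shared(r: Row, e: CountingExpr): Boolean = {
    val v = e(r)
    v < 0 && v < 1
  }

  def main(args: Array[String]): Unit = {
    // A row whose expression value is negative, so both conjuncts are checked.
    val row = Row(BigDecimal(-2), BigDecimal(3), BigDecimal(4))
    val n = new CountingExpr
    naive(row, n)
    val s = new CountingExpr
    shared(row, s)
    println(s"naive calls = ${n.calls}, shared calls = ${s.calls}")
  }
}
```

Because `&&` short-circuits, the naive form evaluates the expression a second time only when the first conjunct holds, so the saving depends on the data.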
```
Before this change:
Subexpression elimination in FilterExec:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test a * b expr                                    1046           1165         169         19.1          52.3       1.0X
Test a / b expr                                    6519           6527          12          3.1         325.9       0.2X
Test a * b / c expr                                7634           7982         492          2.6         381.7       0.1X

After this change:
Subexpression elimination in FilterExec:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test a * b expr                                    1245           1270          35         16.1          62.3       1.0X
Test a / b expr                                    6469           6582         160          3.1         323.4       0.2X
Test a * b / c expr                                7751           7997         348          2.6         387.6       0.2X
```
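As a rough way to read these numbers, the relative change in average time per case can be computed directly from the two tables. This is a small sketch using only the Avg Time(ms) values reported above; `OverheadSketch` and `relativeChangePct` are hypothetical helper names, not part of the PR.

```scala
object OverheadSketch {
  // Avg Time(ms) copied from the benchmark output: (case, before, after).
  val avgMs: Seq[(String, Double, Double)] = Seq(
    ("a * b", 1165.0, 1270.0),
    ("a / b", 6527.0, 6582.0),
    ("a * b / c", 7982.0, 7997.0))

  // Relative change of `after` versus `before`, in percent.
  def relativeChangePct(before: Double, after: Double): Double =
    (after - before) / before * 100.0

  def main(args: Array[String]): Unit = {
    avgMs.foreach { case (name, before, after) =>
      val pct = relativeChangePct(before, after)
      println(f"$name%-10s $before%.0f ms -> $after%.0f ms ($pct%.1f%%)")
    }
  }
}
```

This works out to roughly a 9% increase for `a * b` and under 1% for the two division cases. The 105 ms difference for `a * b` is smaller than the 169 ms standard deviation reported for that case before the change, so it should be read with some caution.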