ZiyaZa commented on code in PR #55967:
URL: https://github.com/apache/spark/pull/55967#discussion_r3272163961
##########
sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/SqlBasedBenchmark.scala:
##########
@@ -46,17 +48,38 @@ trait SqlBasedBenchmark extends BenchmarkBase with
SQLHelper {
.getOrCreate()
}
- /** Runs function `f` with whole stage codegen on and off. */
- final def codegenBenchmark(name: String, cardinality: Long)(f: => Unit):
Unit = {
- val benchmark = new Benchmark(name, cardinality, output = output)
+ /**
+ * Runs function `f` with whole stage codegen on and off.
+ *
+ * @param warmupTime JIT warm-up duration per case before timed iterations.
+ * @param minTime minimum total timed duration per case when `numIters` is 0.
+ * @param wholestageOffNumIters if non-zero, run exactly this many timed
iterations
+ * (wholestage off); otherwise use `minNumIters` and `minTime`.
+ * @param wholestageOnNumIters same for wholestage on.
+ */
+ final def codegenBenchmark(
+ name: String,
+ cardinality: Long,
+ warmupTime: FiniteDuration = 2.seconds,
+ minTime: FiniteDuration = 2.seconds,
+ minNumIters: Int = 2,
Review Comment:
nit: I would order these parameters the same way as in `Benchmark`
constructor to prevent confusion:
```suggestion
minNumIters: Int = 2,
warmupTime: FiniteDuration = 2.seconds,
minTime: FiniteDuration = 2.seconds,
```
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/MergeRowsExec.scala:
##########
@@ -579,28 +592,27 @@ case class MergeRowsExec(
null
}
- }
- // For group based merge, copy is inserted if row matches no other case
- private def incrementCopyMetric(): Unit = longMetric("numTargetRowsCopied")
+= 1
+ private def incrementCopyMetric(): Unit = numTargetRowsCopied += 1
Review Comment:
I'm now wondering if these `+= 1` can also be optimized be further. Doing
`+=` on a `SQLMetric` is technically doing a function call to `SQLMetric.add`
which does some more work in addition to simply adding the value. Maybe JVM is
smart enough to optimize it, I'm not sure. Do you think it's worth exploring?
We could replace the new `private lazy val` SQLMetric fields above with
simple integers initially all set to 0, and increment those only. And at the
end of `applyInstructions`, we can lookup and increment all metrics for which
the value is > 0.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]