sririshindra commented on a change in pull request #31477:
URL: https://github.com/apache/spark/pull/31477#discussion_r594431673
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala
##########
@@ -57,7 +57,8 @@ case class BroadcastHashJoinExec(
}
override lazy val metrics = Map(
-    "numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"))
+    "numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"),
+    "numMatchedRows" -> SQLMetrics.createMetric(sparkContext, "number of matched rows"))
Review comment:
Update: In my most recent commit I fixed a perf issue that was causing a
significant degradation for query-16 & query-94. However, there still seems to
be some perf degradation that I cannot fix. It may be overhead from the new
metric itself, or there may be something wrong with the way I am running the
tests. I will run the benchmark again on a bare-metal machine and see if it
makes any difference. I will also see if there is anything else that can be
done to improve the performance. My most recent runs are shared in this doc:
https://docs.google.com/spreadsheets/d/1A0jCx5BL0wcN28oDQDOs73QhVZ71uNbKDg9Q8S0IE6g/edit?usp=sharing
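For context on why a new metric can show up in join benchmarks at all: a
per-row `metric.add(1)` inside the join's hot loop is extra work on every
matched row, and one common mitigation is to count into a cheap local variable
and flush it into the metric once per batch/partition. The sketch below is a
hypothetical, simplified illustration of that pattern, not Spark's actual
`BroadcastHashJoinExec` code; the stub `SQLMetric` class here only stands in
for `org.apache.spark.sql.execution.metric.SQLMetric`.

```scala
// Stand-in for Spark's SQLMetric: a simple mutable long counter.
final class SQLMetric(var value: Long = 0L) {
  def add(v: Long): Unit = value += v
}

object MetricBatchingSketch {
  val numMatchedRows = new SQLMetric()

  // Naive approach: touch the metric once per matched row in the hot loop.
  def consumeNaive(matches: Iterator[Long]): Unit =
    matches.foreach { _ => numMatchedRows.add(1) }

  // Batched approach: count into a local var, flush once when the
  // iterator is drained, amortizing the metric update across all rows.
  def consumeBatched(matches: Iterator[Long]): Unit = {
    var local = 0L
    matches.foreach { _ => local += 1 }
    numMatchedRows.add(local)
  }

  def main(args: Array[String]): Unit = {
    consumeBatched(Iterator.range(0, 1000).map(_.toLong))
    println(numMatchedRows.value) // 1000
  }
}
```

Both variants record the same final count; the batched form just keeps the
per-row cost to a local increment, which is the kind of hoisting a perf fix
for a new metric would typically involve.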
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]