sririshindra commented on a change in pull request #31477:
URL: https://github.com/apache/spark/pull/31477#discussion_r570708920
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala
##########
@@ -57,7 +57,8 @@ case class BroadcastHashJoinExec(
}
override lazy val metrics = Map(
- "numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"))
+ "numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"),
+ "numMatchedPairs" -> SQLMetrics.createMetric(sparkContext, "number of matched pairs"))
Review comment:
How about `number of matched rows` instead? I wanted to highlight the fact
that this metric measures the number of keys matched between the stream side
and the build side. I also felt that users might not recognize the difference
between `number of joined rows` and `number of output rows`. That said,
`number of joined rows` is fine as well: if you think users will understand
it better, I will rename the metric to `number of joined rows`.
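
To make the distinction concrete, here is a minimal sketch in plain Scala
collections (no Spark). It assumes the two counts diverge when an extra join
condition filters out some key matches; that is my reading of the metric, not
something confirmed by the diff above. `JoinMetricSketch`, `JoinCounts`, and
`countJoin` are hypothetical names used only for illustration.

```scala
// Hypothetical illustration only: none of these names exist in Spark.
object JoinMetricSketch {
  final case class JoinCounts(matchedPairs: Long, outputRows: Long)

  def countJoin(
      stream: Seq[(Int, Int)],
      build: Seq[(Int, Int)],
      condition: (Int, Int) => Boolean): JoinCounts = {
    // Build side: hash table from join key to every value carrying that key.
    val hashed: Map[Int, Seq[Int]] =
      build.groupBy(_._1).map { case (k, rows) => k -> rows.map(_._2) }

    var matchedPairs = 0L // stream/build pairs whose keys are equal
    var outputRows = 0L   // pairs that also satisfy the extra condition
    for ((key, streamValue) <- stream; buildValue <- hashed.getOrElse(key, Seq.empty)) {
      matchedPairs += 1
      if (condition(streamValue, buildValue)) outputRows += 1
    }
    JoinCounts(matchedPairs, outputRows)
  }
}
```

For example, with stream rows `Seq((1, 10), (1, 20), (2, 30))`, build rows
`Seq((1, 15), (1, 5), (3, 99))`, and the condition `streamValue > buildValue`,
four pairs match on the key but only three satisfy the condition.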