[GitHub] [spark] sririshindra commented on a change in pull request #31477: [SPARK-34369][SQL][WEBUI] Track number of pairs processed out of Join.

GitBox Thu, 04 Feb 2021 20:13:40 -0800


sririshindra commented on a change in pull request #31477:
URL: https://github.com/apache/spark/pull/31477#discussion_r570708920




##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala
##########
@@ -57,7 +57,8 @@ case class BroadcastHashJoinExec(
   }
 
   override lazy val metrics = Map(
-    "numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"))
+    "numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"),
+    "numMatchedPairs" -> SQLMetrics.createMetric(sparkContext, "number of matched pairs"))

Review comment:
       How about `number of matched rows` instead? I wanted to highlight the
fact that this metric measures the number of keys matched between the stream
side and the build side. I also felt that users might not recognize the
difference between `number of joined rows` and `number of output rows`. But
`number of joined rows` is fine as well. If you think users will understand
that better, I will change the name of the metric to `number of joined rows`.
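
To illustrate the distinction being discussed, here is a minimal sketch using plain Scala collections. It is not Spark's implementation, and it assumes that the new metric counts stream/build row pairs whose join keys match, while `number of output rows` counts only the pairs that also pass any extra join condition. The names `MatchedPairsSketch`, `JoinCounts`, `hashJoinCounts`, and `extraCondition` are hypothetical and exist only for this example.

```scala
// Toy hash join over (key, value) tuples; illustrative only, not Spark code.
object MatchedPairsSketch {
  // Hypothetical holder for the two counters being compared.
  final case class JoinCounts(matchedPairs: Long, outputRows: Long)

  def hashJoinCounts(
      stream: Seq[(Int, Int)],
      build: Seq[(Int, Int)],
      extraCondition: ((Int, Int), (Int, Int)) => Boolean): JoinCounts = {
    // Build side: group rows by join key, as a hash table would.
    val buildByKey: Map[Int, Seq[(Int, Int)]] = build.groupBy(_._1)

    var matchedPairs = 0L
    var outputRows = 0L
    for {
      s <- stream
      b <- buildByKey.getOrElse(s._1, Seq.empty)
    } {
      // Keys are equal here, so this pair counts as matched (assumed meaning).
      matchedPairs += 1
      // Only pairs that also satisfy the extra condition become output rows.
      if (extraCondition(s, b)) outputRows += 1
    }
    JoinCounts(matchedPairs, outputRows)
  }

  // Sample data: key 1 appears twice on each side, key 3 has no build match.
  val stream: Seq[(Int, Int)] = Seq((1, 10), (1, 20), (2, 30), (3, 40))
  val build: Seq[(Int, Int)] = Seq((1, 15), (1, 25), (2, 5))

  // Extra (non-equi) condition: stream value must be less than build value.
  val counts: JoinCounts = hashJoinCounts(stream, build, (s, b) => s._2 < b._2)
}
```

With this sample data, `MatchedPairsSketch.counts` holds 5 matched pairs but only 3 output rows, which is the kind of gap a separate metric would make visible in the UI.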




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



