sririshindra commented on a change in pull request #31477:
URL: https://github.com/apache/spark/pull/31477#discussion_r594431673
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala
##########
@@ -57,7 +57,8 @@ case class BroadcastHashJoinExec(
}
override lazy val metrics = Map(
-    "numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"))
+    "numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"),
+    "numMatchedRows" -> SQLMetrics.createMetric(sparkContext, "number of matched rows"))
Review comment:
Update: In my most recent commit I fixed a perf issue that was causing a
significant degradation for query-16 & query-94. However, there still seems to
be some perf degradation that I cannot fix. It may be overhead from the new
metric itself, or there may be something wrong with the way I am running the
tests. I will run the benchmark again on a bare-metal machine and see if it
makes any difference. I will also see if there is anything else that can be
done to improve the performance. My most recent runs are shared in this doc:
https://docs.google.com/spreadsheets/d/1A0jCx5BL0wcN28oDQDOs73QhVZ71uNbKDg9Q8S0IE6g/edit?usp=sharing
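For context on why a new metric can show up in join benchmarks at all: a
per-row `metric.add(1)` inside the join's hot loop is extra work on every
matched row, and one common mitigation is to count into a cheap local variable
and flush it into the metric once per batch/partition. The sketch below is a
hypothetical, simplified illustration of that pattern, not Spark's actual
`BroadcastHashJoinExec` code; the stub `SQLMetric` class here only stands in
for `org.apache.spark.sql.execution.metric.SQLMetric`.

```scala
// Stand-in for Spark's SQLMetric: a simple mutable long counter.
final class SQLMetric(var value: Long = 0L) {
  def add(v: Long): Unit = value += v
}

object MetricBatchingSketch {
  val numMatchedRows = new SQLMetric()

  // Naive approach: touch the metric once per matched row in the hot loop.
  def consumeNaive(matches: Iterator[Long]): Unit =
    matches.foreach { _ => numMatchedRows.add(1) }

  // Batched approach: count into a local var, flush once when the
  // iterator is drained, amortizing the metric update across all rows.
  def consumeBatched(matches: Iterator[Long]): Unit = {
    var local = 0L
    matches.foreach { _ => local += 1 }
    numMatchedRows.add(local)
  }

  def main(args: Array[String]): Unit = {
    consumeBatched(Iterator.range(0, 1000).map(_.toLong))
    println(numMatchedRows.value) // 1000
  }
}
```

Both variants record the same final count; the batched form just keeps the
per-row cost to a local increment, which is the kind of hoisting a perf fix
for a new metric would typically involve.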
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]