Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/18301#discussion_r124714544
--- Diff:
sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala
---
@@ -267,10 +298,111 @@ class SQLMetricsSuite extends SparkFunSuite with SharedSQLContext {
val df = df1.join(broadcast(df2), "key")
testSparkPlanMetrics(df, 2, Map(
1L -> ("BroadcastHashJoin", Map(
- "number of output rows" -> 2L)))
+ "number of output rows" -> 2L,
+ "avg hash probe (min, med, max)" -> "\n(1, 1, 1)")))
)
}
+ test("BroadcastHashJoin metrics: track avg probe") {
+ // The executed plan looks like:
+ // Project [a#210, b#211, b#221]
+ // +- BroadcastHashJoin [a#210], [a#220], Inner, BuildRight
+ // :- Project [_1#207 AS a#210, _2#208 AS b#211]
+ // : +- Filter isnotnull(_1#207)
+ // : +- LocalTableScan [_1#207, _2#208]
+ // +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, binary, true]))
+ // +- Project [_1#217 AS a#220, _2#218 AS b#221]
+ // +- Filter isnotnull(_1#217)
+ // +- LocalTableScan [_1#217, _2#218]
+ //
+ // Assume the execution plan is
+ // WholeStageCodegen disabled:
+ // ... -> BroadcastHashJoin(nodeId = 1) -> Project(nodeId = 0)
+ //
+ // WholeStageCodegen enabled:
+ // ... ->
+ // WholeStageCodegen(nodeId = 0, Filter(nodeId = 4) -> Project(nodeId = 3) ->
--- End diff --
Can you format this a bit to show that there is only one
`WholeStageCodegen` node, and that all the other plans are its inner
children?