Reynold Xin created SPARK-20953:
-----------------------------------
Summary: Add hash map metrics to aggregate and join
Key: SPARK-20953
URL: https://issues.apache.org/jira/browse/SPARK-20953
Project: Spark
Issue Type: New Feature
Components: SQL
Affects Versions: 2.2.0
Reporter: Reynold Xin
It would be useful if we could identify hash map collision issues early on.
We should add an average hash map probe metric to the aggregate operator and
the hash join operator and report it. If the average probe count is greater
than a specific (configurable) threshold, we should log an error at runtime.
The primary classes to look at are UnsafeFixedWidthAggregationMap,
HashAggregateExec, HashedRelation, and HashJoin.
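
To make the proposed metric concrete, here is a minimal sketch of an open-addressing map that counts probes per lookup and logs an error when the average exceeds a configurable threshold. This is an illustration only, not Spark code: ProbeCountingMap, avgProbesPerLookup, exceedsThreshold, and the threshold constructor parameter are all hypothetical names, and linear probing over long keys is an assumption chosen for brevity.

```java
import java.util.logging.Logger;

/** Hypothetical linear-probing map (long -> long) that tracks probe counts. */
public class ProbeCountingMap {
    private static final Logger LOG =
        Logger.getLogger(ProbeCountingMap.class.getName());

    private final long[] keys;
    private final long[] values;
    private final boolean[] used;
    private final int mask;
    private final double avgProbeThreshold; // assumed to come from configuration
    private int size = 0;
    private long numLookups = 0; // lookups performed by put() and get()
    private long numProbes = 0;  // slots inspected across all lookups

    public ProbeCountingMap(int capacity, double avgProbeThreshold) {
        if (capacity < 2 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two >= 2");
        }
        this.keys = new long[capacity];
        this.values = new long[capacity];
        this.used = new boolean[capacity];
        this.mask = capacity - 1;
        this.avgProbeThreshold = avgProbeThreshold;
    }

    /** Returns the slot holding key, or the first empty slot on its probe path. */
    private int findSlot(long key) {
        numLookups++;
        int pos = Long.hashCode(key) & mask;
        numProbes++; // the first slot inspected counts as one probe
        while (used[pos] && keys[pos] != key) {
            pos = (pos + 1) & mask; // linear probing
            numProbes++;
        }
        return pos;
    }

    public void put(long key, long value) {
        int pos = findSlot(key);
        if (!used[pos]) {
            // Keep one slot empty so that findSlot always terminates.
            if (size == keys.length - 1) {
                throw new IllegalStateException("map is full");
            }
            used[pos] = true;
            keys[pos] = key;
            size++;
        }
        values[pos] = value;
    }

    public long get(long key, long defaultValue) {
        int pos = findSlot(key);
        return used[pos] ? values[pos] : defaultValue;
    }

    /** Average number of slots inspected per lookup; 0.0 before any lookup. */
    public double avgProbesPerLookup() {
        return numLookups == 0 ? 0.0 : (double) numProbes / numLookups;
    }

    /** Logs an error and returns true if the average is above the threshold. */
    public boolean exceedsThreshold() {
        double avg = avgProbesPerLookup();
        if (avg > avgProbeThreshold) {
            LOG.severe("Average hash map probes per lookup is " + avg
                + ", above the threshold " + avgProbeThreshold);
            return true;
        }
        return false;
    }
}
```

In this sketch an operator would call exceedsThreshold() once after it finishes building or probing the map and report avgProbesPerLookup() as a metric. An average close to 1 means most keys are found in their first slot, while a larger value points to heavy collisions.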