GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/23105
[SPARK-26140] Pull TempShuffleReadMetrics creation out of shuffle reader

## What changes were proposed in this pull request?

This patch defines an internal Spark interface for reporting shuffle metrics and uses it in the shuffle reader. Before this patch, shuffle metrics are tied to a specific implementation (a thread-local temporary data structure plus accumulators). After this patch, callers that define their own shuffle RDDs can supply a custom metrics implementation.

With this patch, we will be able to build better metrics for the SQL layer, e.g. reporting shuffle metrics in the SQL UI for each exchange operator.

## How was this patch tested?

No behavior change is expected, as this is a straightforward refactoring. Updated all existing test cases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rxin/spark SPARK-26140

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/23105.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #23105

----
commit da253b57c14bc0174f0330ae6fa5d3a61647269b
Author: Reynold Xin <rxin@...>
Date:   2018-11-21T14:56:23Z

    [SPARK-26140] Pull TempShuffleReadMetrics creation out of shuffle reader

----
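
As a rough illustration of the refactoring described under "What changes were proposed", the sketch below shows a reader that receives its metrics reporter from the caller instead of creating one internally. All names here (`ShuffleMetricsSink`, `CountingSink`, `SketchShuffleReader`) are hypothetical and are not taken from the patch. It is written in Java for brevity, and counting one record per block is a simplifying assumption.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical reporting interface: the reader only sees these callbacks.
interface ShuffleMetricsSink {
    void incRecordsRead(long n);
    void incBytesRead(long n);
}

// Hypothetical custom implementation, e.g. one instance per operator.
final class CountingSink implements ShuffleMetricsSink {
    private final AtomicLong records = new AtomicLong();
    private final AtomicLong bytes = new AtomicLong();

    @Override public void incRecordsRead(long n) { records.addAndGet(n); }
    @Override public void incBytesRead(long n) { bytes.addAndGet(n); }

    long recordsRead() { return records.get(); }
    long bytesRead() { return bytes.get(); }
}

// Hypothetical reader: the sink is passed in by the caller, so the reader
// is no longer tied to one metrics implementation.
final class SketchShuffleReader {
    private final ShuffleMetricsSink metrics;

    SketchShuffleReader(ShuffleMetricsSink metrics) {
        this.metrics = metrics;
    }

    // Assumption for illustration: each block counts as a single record.
    void read(byte[][] blocks) {
        for (byte[] block : blocks) {
            metrics.incBytesRead(block.length);
            metrics.incRecordsRead(1);
        }
    }
}
```

Under these assumptions, a caller that defines its own shuffle RDD could keep its own `CountingSink` and read the totals afterwards, which is the kind of per-operator reporting the description has in mind.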