Josh Rosen created SPARK-20776:
----------------------------------

Summary: Fix performance problems in TaskMetrics.nameToAccums map initialization
Key: SPARK-20776
URL: https://issues.apache.org/jira/browse/SPARK-20776
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 2.2.0
Reporter: Josh Rosen
Assignee: Josh Rosen
Attachments: screenshot-1.png

In a shell started with

{code}
./bin/spark-shell --master=local[64]
{code}

I ran

{code}
sc.parallelize(1 to 100000, 100000).count()
{code}

and profiled the time spent in the LiveListenerBus event-processing thread. I discovered that the majority of the time was being spent initializing the {{TaskMetrics.nameToAccums}} map (see attached screenshot).

By replacing the use of Scala's LinkedHashMap with a pre-sized Java HashMap, I was able to remove this bottleneck and prevent dropped listener events.
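
The idea behind the change can be sketched as follows. This is a minimal illustration, not the actual patch: the object name {{PreSizedMapSketch}}, the 32 placeholder metric names, and the {{Long}} values are all assumptions made for the example, and the real {{nameToAccums}} map holds accumulators rather than numbers. The sketch builds the same entries once with a Scala {{LinkedHashMap}} that grows as it is filled, and once with a {{java.util.HashMap}} whose initial capacity is chosen so that it never needs to resize.

```scala
import java.util.{HashMap => JHashMap}
import scala.collection.mutable

object PreSizedMapSketch {
  // Hypothetical stand-ins for the metric names; not the real Spark names.
  val names: Seq[String] = (1 to 32).map(i => s"metric.$i")

  // Baseline: a Scala LinkedHashMap that starts small and grows while it is filled.
  def buildScalaMap(): mutable.LinkedHashMap[String, Long] = {
    val m = mutable.LinkedHashMap.empty[String, Long]
    names.foreach(n => m.put(n, 0L))
    m
  }

  // Alternative: a Java HashMap sized up front. With the default load factor
  // of 0.75, this capacity is large enough that no resize happens on insert.
  def buildJavaMap(): JHashMap[String, java.lang.Long] = {
    val capacity = (names.size / 0.75).toInt + 1
    val m = new JHashMap[String, java.lang.Long](capacity)
    names.foreach(n => m.put(n, java.lang.Long.valueOf(0L)))
    m
  }
}
```

One trade-off to keep in mind: {{java.util.HashMap}} does not preserve insertion order the way {{LinkedHashMap}} does, so a swap like this is only safe if no caller depends on iteration order.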