[ https://issues.apache.org/jira/browse/SPARK-22147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16237775#comment-16237775 ]
Bryan Cutler commented on SPARK-22147:
--------------------------------------

Sorry, I linked the above PR to this JIRA accidentally

> BlockId.hashCode allocates a StringBuilder/String on each call
> --------------------------------------------------------------
>
>                 Key: SPARK-22147
>                 URL: https://issues.apache.org/jira/browse/SPARK-22147
>             Project: Spark
>          Issue Type: Improvement
>          Components: Block Manager
>    Affects Versions: 2.2.0
>            Reporter: Sergei Lebedev
>            Assignee: Sergei Lebedev
>            Priority: Minor
>             Fix For: 2.3.0
>
>
> The base class {{BlockId}}
> [defines|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/BlockId.scala#L44]
> {{hashCode}} and {{equals}} for all its subclasses in terms of {{name}}.
> This makes the definitions of different ID types [very
> concise|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/BlockId.scala#L52].
> The downside, however, is redundant allocations. While I don't think this
> could be the major issue, it is still a bit disappointing to increase GC
> pressure on the driver for nothing. For our machine learning workloads, we've
> seen as much as 10% of all allocations on the driver coming from
> {{BlockId.hashCode}} calls done for
> [BlockManagerMasterEndpoint.blockLocations|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/BlockManagerMasterEndpoint.scala#L54].
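
For readers unfamiliar with the pattern the issue describes, here is a minimal Scala sketch of it, together with one possible way to avoid the per-call allocation. The class names (NameBasedId, NameBasedRddId, FieldBasedId, FieldBasedRddId) and the "rdd_" name format are simplified stand-ins chosen for illustration. They are assumptions, not the actual Spark classes, and the second variant is not necessarily the change that was merged for this issue.

```scala
// Hypothetical stand-ins for illustration only; not the actual Spark code.

// Pattern described in the issue: the base class defines hashCode and
// equals in terms of `name` for all of its subclasses.
sealed abstract class NameBasedId {
  def name: String
  override def toString: String = name
  override def hashCode: Int = name.hashCode
  override def equals(other: Any): Boolean = other match {
    case o: NameBasedId => getClass == o.getClass && name == o.name
    case _ => false
  }
}

// `name` is a def, so each hashCode or equals call concatenates a fresh
// String (via a StringBuilder) before hashing or comparing it.
final case class NameBasedRddId(rddId: Int, splitIndex: Int) extends NameBasedId {
  override def name: String = "rdd_" + rddId + "_" + splitIndex
}

// One possible alternative: leave hashCode and equals out of the base
// class. The compiler then synthesizes both for each case class from its
// fields, so hashing does not need to build the name string.
sealed abstract class FieldBasedId {
  def name: String
  override def toString: String = name
}

final case class FieldBasedRddId(rddId: Int, splitIndex: Int) extends FieldBasedId {
  override def name: String = "rdd_" + rddId + "_" + splitIndex
}
```

Both variants keep the concise one-line subclass definitions and the name-based toString. The difference is that the second relies on the field-based hashCode and equals that Scala generates for case classes, which matters when the IDs are used as hash map keys on a hot path.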