[
https://issues.apache.org/jira/browse/SPARK-24437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16679745#comment-16679745
]
Marco Gaido commented on SPARK-24437:
-------------------------------------
[~dvogelbacher] the point is that a broadcast is never destroyed and
recomputed. There are several reasons for this: for instance, if you simply
re-execute a plan without caching it, the broadcast does not need to be
recomputed. This could certainly be changed by doing something like what I
did in the PR on the related JIRA (which is not enough on its own, since it
misses the recompute logic). Yes, I think your use case is rather unusual and
not well handled by Spark currently, but fixing it is not trivial either,
since it comes down to a trade-off between recomputation cost and resource
allocation.
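For context on that trade-off: the user-facing Broadcast API does expose explicit cleanup, while the broadcasts that SQL creates internally for broadcast joins have no equivalent user hook. A minimal Scala sketch of the explicit lifecycle, assuming a local SparkSession (the lookup map is illustrative only):

{code:scala}
import org.apache.spark.sql.SparkSession

object BroadcastLifecycleSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("broadcast-lifecycle-sketch")
      .getOrCreate()
    val sc = spark.sparkContext

    // Illustrative small lookup table, broadcast once to all executors.
    val lookup = sc.broadcast(Map(1 -> "a", 2 -> "b"))

    val joined = sc.parallelize(Seq(1, 2, 2, 1))
      .map(k => (k, lookup.value.getOrElse(k, "?")))
      .collect()
    println(joined.mkString(", "))

    // Explicit cleanup: unpersist drops the cached copies on the executors
    // (later tasks would re-fetch from the driver); destroy removes all data
    // and metadata, after which the broadcast can no longer be used.
    lookup.unpersist(blocking = true)
    lookup.destroy()

    spark.stop()
  }
}
{code}

This is the trade-off mentioned above: releasing a broadcast early frees memory, but anything that still needs it must recompute it (or fails outright after destroy), and the SQL layer currently has no such recompute path.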
> Memory leak in UnsafeHashedRelation
> -----------------------------------
>
> Key: SPARK-24437
> URL: https://issues.apache.org/jira/browse/SPARK-24437
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: gagan taneja
> Priority: Critical
> Attachments: Screen Shot 2018-05-30 at 2.05.40 PM.png, Screen Shot
> 2018-05-30 at 2.07.22 PM.png, Screen Shot 2018-11-01 at 10.38.30 AM.png
>
>
> There seems to be a memory leak with
> org.apache.spark.sql.execution.joins.UnsafeHashedRelation
> We have a long-running instance of STS (the Spark Thrift Server).
> With each query execution that requires a broadcast join, an
> UnsafeHashedRelation is registered for cleanup with the ContextCleaner.
> However, a reference to the UnsafeHashedRelation is held by some other
> collection, so it never becomes eligible for GC and the ContextCleaner
> cannot clean it up.
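A minimal sketch of the reported pattern, assuming a plain local SparkSession standing in for STS (table names, shapes and sizes are illustrative): each broadcast-join execution broadcasts the hashed build side again, and in a long-running driver these accumulate if the cleaner never reclaims them.

{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object BroadcastJoinLeakSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("broadcast-join-leak-sketch")
      .getOrCreate()

    // Illustrative build side, small enough to be broadcast.
    val dim  = spark.range(0, 1000).selectExpr("id", "concat('name_', id) as name")
    val fact = spark.range(0, 100000).selectExpr("id % 1000 as id", "id as value")

    // In a long-running driver (e.g. the Thrift Server), each execution below
    // broadcasts a new UnsafeHashedRelation for the build side. The report is
    // that these are registered with the ContextCleaner but a lingering
    // reference keeps them from ever becoming GC-eligible, so driver memory
    // grows with every broadcast-join query.
    for (_ <- 1 to 100) {
      fact.join(broadcast(dim), "id").count()
    }

    spark.stop()
  }
}
{code}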