Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/126#issuecomment-38607851
@yaoshengzhe
@andrewor14 helped me figure out that it's not too complicated to do it
with a reference queue. However, it does introduce another collection in the
implementation (to store the WeakRefs), which is what I wanted to avoid
(as we have to think about cleaning that up again, etc.). But introducing this
HashMap is probably okay because it will get automatically cleaned as RDDs,
shuffles, etc. go out of scope. The only scenario where this is not going to
happen is in spark-shell, where everything stays in scope. However, in that
scenario, this collection is the least of our worries (it does not consume much
memory or any other resources), as there are other, bigger problems (RDD objects
accumulating in memory, etc.). So introducing this collection is probably okay.
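
To make the trade-off above concrete, here is a minimal sketch of the reference-queue pattern in plain Java. It is not the code in this pull request: `WeakRefCleanerSketch`, `TrackedRef`, `register`, `drain`, and `pendingCount` are hypothetical names chosen for illustration. The `pending` map plays the role of the extra collection that stores the WeakRefs, and an entry leaves it only once its reference shows up on the queue. The `main` method calls `enqueue()` by hand to stand in for the garbage collector, which is an assumption made so the example behaves the same on every run.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Spark's implementation.
final class WeakRefCleanerSketch {

    // A weak reference that remembers which resource to clean up
    // once its owner (e.g. an RDD or shuffle handle) is collected.
    static final class TrackedRef extends WeakReference<Object> {
        final String resourceId;

        TrackedRef(Object owner, String resourceId, ReferenceQueue<Object> queue) {
            super(owner, queue);
            this.resourceId = resourceId;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    // The extra collection: it keeps each TrackedRef strongly reachable so
    // that it can still be enqueued after its owner becomes unreachable.
    private final Map<TrackedRef, String> pending = new HashMap<>();

    synchronized TrackedRef register(Object owner, String resourceId) {
        TrackedRef ref = new TrackedRef(owner, resourceId, queue);
        pending.put(ref, resourceId);
        return ref;
    }

    // Processes every reference currently on the queue and drops it from
    // the pending map. Returns how many entries were removed.
    synchronized int drain() {
        int cleaned = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            if (pending.remove(ref) != null) {
                // Real cleanup work for the resource would go here.
                cleaned++;
            }
        }
        return cleaned;
    }

    synchronized int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        WeakRefCleanerSketch cleaner = new WeakRefCleanerSketch();
        Object owner = new Object();
        TrackedRef ref = cleaner.register(owner, "rdd-1");

        // Simulate the owner going out of scope and being collected.
        ref.enqueue();

        System.out.println("cleaned: " + cleaner.drain());
        System.out.println("still pending: " + cleaner.pendingCount());
    }
}
```

In this sketch the map shrinks only when `drain()` sees a queued reference, which matches the spark-shell caveat: if owners never become unreachable, nothing is ever enqueued and the entries simply stay in the map.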