GitHub user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/126#issuecomment-38527275
  
    In case anyone else is interested, @tdas and I discussed this offline and 
decided that a cleaner approach than relying on `finalize` is to use 
`WeakReference`s registered with a `ReferenceQueue`. This requires us to maintain 
a few self-cleaning maps (or buffers) in `ContextCleaner` to keep track of these 
references, which should cost no more than on the order of 10 bytes per persisted 
RDD / shuffle dependency.
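
    For readers unfamiliar with the pattern, here is a minimal sketch of the 
`ReferenceQueue` + `WeakReference` idea in plain Java. It is not the code in this 
PR: the class `WeakRefCleanerSketch`, its `TrackedRef` type, and the methods 
`register`, `drain`, and `trackedCount` are hypothetical names chosen for 
illustration, and the integer id stands in for an RDD or shuffle id. The map 
plays the role of the self-cleaning bookkeeping mentioned above: it holds the 
reference objects themselves (not the tracked objects) and drops each entry 
once its reference shows up on the queue.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntConsumer;

// Hypothetical illustration only; names do not come from Spark.
class WeakRefCleanerSketch {

    // A weak reference that remembers which resource id to clean up.
    // It must not hold a strong reference to the tracked object itself.
    static final class TrackedRef extends WeakReference<Object> {
        final int id;

        TrackedRef(Object referent, int id, ReferenceQueue<Object> queue) {
            super(referent, queue);
            this.id = id;
        }
    }

    // The garbage collector enqueues a TrackedRef here after its referent
    // becomes unreachable.
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    // Keeps the TrackedRef objects strongly reachable; a reference object
    // that is itself collected is never enqueued.
    private final Map<Integer, TrackedRef> refs = new ConcurrentHashMap<>();

    // Start tracking `owner` under the given id.
    TrackedRef register(Object owner, int id) {
        TrackedRef ref = new TrackedRef(owner, id, queue);
        refs.put(id, ref);
        return ref;
    }

    // Number of references currently being tracked.
    int trackedCount() {
        return refs.size();
    }

    // Process every reference currently on the queue: forget it and invoke
    // the cleanup callback with its id. Returns how many were processed.
    int drain(IntConsumer cleanup) {
        int cleaned = 0;
        Reference<?> r;
        while ((r = queue.poll()) != null) {
            TrackedRef tracked = (TrackedRef) r;
            refs.remove(tracked.id);
            cleanup.accept(tracked.id);
            cleaned++;
        }
        return cleaned;
    }

    public static void main(String[] args) {
        WeakRefCleanerSketch cleaner = new WeakRefCleanerSketch();
        Object persisted = new Object();
        TrackedRef ref = cleaner.register(persisted, 1);

        // Simulate what the collector does once `persisted` is unreachable.
        // In real use the enqueue happens on its own at some point after GC.
        ref.enqueue();

        cleaner.drain(id -> System.out.println("cleaning up id " + id));
    }
}
```

    In practice a daemon thread would typically block on `queue.remove()` rather 
than poll, and the callback would do the actual unpersist or shuffle cleanup. 
Unlike `finalize`, the cleanup then runs on a thread the application controls 
instead of the JVM's finalizer thread.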
    
    The only tricky thing to be addressed in future PRs is dealing with shells, 
in which variables don't really go out of scope. Even in this case, however, 
the additional reference maps introduced in this approach amount to relatively 
insignificant memory use.

