GitHub user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3570#issuecomment-65733162
  
    Hi @nkronenfeld,
    
    Thanks for this PR.  These sorts of resource leaks can be tricky 
to debug, so nice catch.  It would be great to file a dedicated 
JIRA for the memory leak reported here.
    
    I find the current `Accumulators` object code to be difficult to 
understand, so I'd be open to a larger refactoring / reorganization of that 
code.  To handle cleanup after thread death, I think we can make `localAccums` 
into a thread-local `Map`, since thread-locals should get GC'd when threads 
die.  We still need to worry about threads staying in thread pools and being 
re-used, though, so we should ensure that the thread-local is cleared after 
each task.  I think we should be able to do this by moving the 
`Accumulators.clear()` call in `Executor.scala` from the start of the task into 
the `finally` block that handles task cleanup.

