Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/2183#issuecomment-53902169
  
    It looks like Spark core automatically registers JVM shutdown hooks for 
several different components, so it might be okay to have similar global logic 
here.
    
    One (perhaps minor) concern with the global hook is that the functions we 
register will contain strong references to the SparkContext, which might lead 
to resource leaks in a long-running Python process that creates and destroys 
many contexts.
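
    To make that concern concrete, here is a toy sketch (a stand-in class, not 
the actual PySpark code) of how registering a per-context hook with `atexit` 
keeps every context alive for the life of the process:

```python
import atexit


class FakeContext(object):
    """Stand-in for SparkContext, just to illustrate the reference issue."""

    def __init__(self, app_name):
        self.app_name = app_name
        # atexit holds a strong reference to the bound method, and therefore
        # to `self`, until the interpreter exits.
        atexit.register(self.stop)

    def stop(self):
        print("stopping %s" % self.app_name)


# A long-running process that creates and destroys many contexts accumulates
# one registration (and one live object) per context, even after stop():
for i in range(5):
    ctx = FakeContext("job-%d" % i)
    ctx.stop()
```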
    
    PySpark does not currently support running multiple SparkContexts at the 
same time, so one option would be to define a single shutdown hook that stops 
`SparkContext._active_spark_context`.  There's currently a lock 
(`SparkContext._lock`) guarding that field, and I'm not sure whether it's safe 
to attempt to acquire it during a shutdown hook (it's fine for shutdown hooks 
to throw exceptions, but they shouldn't block).  To guard against this, maybe 
we could try to acquire the lock and throw an exception if we can't get it 
within a short timeout.  This is a super-rare edge case, though, and I'd be 
shocked if anyone ran into it, since it requires a separate thread attempting 
to start or stop a SparkContext while the Python interpreter is exiting.
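
    Roughly what I have in mind, as a sketch only (the function name and the 
timeout value are made up, and this assumes the hook sits in 
`pyspark/context.py` next to the `SparkContext` class so the name is in scope):

```python
import atexit
import time


def _stop_active_context_at_exit(timeout=3.0):
    # Hypothetical hook: try to take SparkContext._lock without blocking
    # interpreter shutdown indefinitely by spinning on a non-blocking
    # acquire and giving up after `timeout` seconds.
    deadline = time.time() + timeout
    while not SparkContext._lock.acquire(False):
        if time.time() > deadline:
            raise RuntimeError("timed out acquiring SparkContext._lock at exit")
        time.sleep(0.05)
    try:
        active = SparkContext._active_spark_context
    finally:
        SparkContext._lock.release()
    # Call stop() outside the lock to avoid any re-entrant locking in stop().
    if active is not None:
        active.stop()


atexit.register(_stop_active_context_at_exit)
```

    Because only the single active context is ever reachable from the hook, 
this also sidesteps the strong-reference buildup from per-context 
registrations.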

