HyukjinKwon commented on a change in pull request #25315: 
[SPARK-28582][PYSPARK] Fix pyspark daemon exit failed when receive SIGTERM on 
Python 3.7
URL: https://github.com/apache/spark/pull/25315#discussion_r309958195
 
 

 ##########
 File path: python/pyspark/daemon.py
 ##########
 @@ -102,7 +102,7 @@ def shutdown(code):
         signal.signal(SIGTERM, SIG_DFL)
         # Send SIGHUP to notify workers of shutdown
         os.kill(0, SIGHUP)
-        sys.exit(code)
+        os._exit(code)
 
 Review comment:
   >  If this SystemExit exception is swallowed by user code somewhere or in 
finally block run into some blocking code, then the kill will fail.
   
   `os.kill(0, SIGHUP)` is a non-blocking call IIRC, and it seems `SystemExit` 
is not being swallowed in the current daemon. Why is this specific to Python 3.7?
   
   From my debugging (IIRC), workers were not killed by `os.kill(0, SIGHUP)` on 
Python 3.7 specifically. My impression was that this is related to the current bug.
   
   We should still run the cleanup procedure. Per 
https://docs.python.org/3/library/os.html#os._exit, `os._exit` exits without 
calling cleanup handlers or flushing stdio buffers, so it could cause another 
set of resource leak issues.
   
   I still think we should identify the root cause (and possibly file a bug on 
the Python side). `os._exit` is officially discouraged, so we need a concrete 
reason to use it instead of `sys.exit`.
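
To make the trade-off concrete, here is a minimal sketch (not taken from the PR or from `daemon.py`) that contrasts the two exit paths in a throwaway child process. The helper `run_child` and the child script are hypothetical and exist only for illustration. It assumes Python 3.7+ for `capture_output`.

```python
import subprocess
import sys
import textwrap


def run_child(exit_call):
    """Run a tiny child script that exits via `exit_call`.

    Hypothetical helper for illustration only. Returns the child's exit
    code and the lines it printed to stdout before terminating.
    """
    script = textwrap.dedent("""
        import atexit, os, sys
        atexit.register(lambda: print("atexit handler ran", flush=True))
        try:
            {exit_call}
        finally:
            print("finally block ran", flush=True)
    """).format(exit_call=exit_call)
    proc = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout.splitlines()


if __name__ == "__main__":
    # sys.exit raises SystemExit, so `finally` blocks and atexit handlers run.
    print(run_child("sys.exit(3)"))
    # os._exit terminates the process immediately and skips both.
    print(run_child("os._exit(3)"))
```

Both calls produce the same exit code, but only the `sys.exit` path runs the `finally` block and the `atexit` handler. That skipped cleanup is the resource-leak risk mentioned above, and it is also why `os._exit` cannot be blocked by a `finally` clause or a swallowed `SystemExit`.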
