Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/1680#discussion_r15725947
--- Diff: python/pyspark/daemon.py ---
@@ -174,20 +116,41 @@ def handle_sigchld(*args):
# Initialization complete
sys.stdout.close()
try:
- while not should_exit():
+ while True:
try:
- # Spark tells us to exit by closing stdin
- if os.read(0, 512) == '':
- shutdown()
- except EnvironmentError as err:
- if err.errno != EINTR:
- shutdown()
+ ready_fds = select.select([0, listen_sock], [], [])[0]
+ except select.error as ex:
+ if ex[0] == EINTR:
--- End diff ---
@davies raised a good question about whether `select.select()` can fail with
other errors here and whether we should handle those errors more gracefully.
According to `man select`:
```
An error return from select() indicates:

[EAGAIN]   The kernel was (perhaps temporarily) unable to
           allocate the requested number of file descriptors.

[EBADF]    One of the descriptor sets specified an invalid
           descriptor.

[EINTR]    A signal was delivered before the time limit expired
           and before any of the selected events occurred.

[EINVAL]   The specified time limit is invalid.  One of its
           components is negative or too large.

[EINVAL]   ndfs is greater than FD_SETSIZE and
           _DARWIN_UNLIMITED_SELECT is not defined.
```
I think `EINTR` is the only recoverable error here: it just means a signal
(e.g. `SIGCHLD`) interrupted the call, so retrying is safe, whereas the other
errors point to bugs or resource exhaustion that a retry won't fix. I've
updated this code to use the `EINTR` constant instead of the magic number `4`.
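For anyone following along, the retry-on-`EINTR` pattern boils down to
something like this minimal sketch (not the exact patch; `wait_for_ready_fds`
is a made-up name, and I've used `ex.args[0]` rather than `ex[0]`, which works
on both Python 2 and 3):
```python
import select
from errno import EINTR

def wait_for_ready_fds(listen_sock):
    # Block until stdin (fd 0) or the listening socket is readable,
    # retrying if select() is interrupted by a signal such as SIGCHLD.
    while True:
        try:
            return select.select([0, listen_sock], [], [])[0]
        except select.error as ex:
            if ex.args[0] == EINTR:
                continue  # interrupted by a signal; just retry
            # EAGAIN / EBADF / EINVAL indicate a real problem that
            # retrying won't fix, so let the error propagate.
            raise
```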