Github user JoshRosen commented on a diff in the pull request:
https://github.com/apache/spark/pull/1680#discussion_r15725947
--- Diff: python/pyspark/daemon.py ---
@@ -174,20 +116,41 @@ def handle_sigchld(*args):
# Initialization complete
sys.stdout.close()
try:
- while not should_exit():
+ while True:
try:
- # Spark tells us to exit by closing stdin
- if os.read(0, 512) == '':
- shutdown()
- except EnvironmentError as err:
- if err.errno != EINTR:
- shutdown()
+ ready_fds = select.select([0, listen_sock], [], [])[0]
+ except select.error as ex:
+ if ex[0] == EINTR:
--- End diff ---
@davies raised a good question about whether `select.select()` can fail with
other errors here and whether we should handle those errors more gracefully.
According to `man select`:
```
An error return from select() indicates:

[EAGAIN]   The kernel was (perhaps temporarily) unable to
           allocate the requested number of file descriptors.

[EBADF]    One of the descriptor sets specified an invalid
           descriptor.

[EINTR]    A signal was delivered before the time limit expired
           and before any of the selected events occurred.

[EINVAL]   The specified time limit is invalid.  One of its
           components is negative or too large.

[EINVAL]   ndfs is greater than FD_SETSIZE and
           _DARWIN_UNLIMITED_SELECT is not defined.
```
I think `EINTR` is the only recoverable error here: it just means a signal
(e.g. `SIGCHLD`) interrupted the call, so retrying is safe, whereas the other
errors point to bugs or resource exhaustion that a retry won't fix. I've
updated this code to use the `EINTR` constant instead of the magic number `4`.
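For anyone following along, the retry-on-`EINTR` pattern boils down to
something like this minimal sketch (not the exact patch; `wait_for_ready_fds`
is a made-up name, and I've used `ex.args[0]` rather than `ex[0]`, which works
on both Python 2 and 3):
```python
import select
from errno import EINTR

def wait_for_ready_fds(listen_sock):
    # Block until stdin (fd 0) or the listening socket is readable,
    # retrying if select() is interrupted by a signal such as SIGCHLD.
    while True:
        try:
            return select.select([0, listen_sock], [], [])[0]
        except select.error as ex:
            if ex.args[0] == EINTR:
                continue  # interrupted by a signal; just retry
            # EAGAIN / EBADF / EINVAL indicate a real problem that
            # retrying won't fix, so let the error propagate.
            raise
```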