Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18320#discussion_r122843971
  
    --- Diff: R/pkg/inst/worker/daemon.R ---
    @@ -31,7 +31,30 @@ inputCon <- socketConnection(
         port = port, open = "rb", blocking = TRUE, timeout = connectionTimeout)
     
     while (TRUE) {
    -  ready <- socketSelect(list(inputCon))
    +  ready <- socketSelect(list(inputCon), timeout = 1)
    +
    +  # Note that the children should be terminated in the parent. If each child terminates
    +  # itself, it appears that resources are not released properly, which causes an unexpected
    +  # termination of this daemon due to, for example, running out of file descriptors
    +  # (see SPARK-21093). Therefore, the current implementation tries to retrieve children
    +  # that have exited (but have not been terminated) and then sends a kill signal to
    +  # terminate them properly in the parent.
    +  #
    +  # There are two paths through which the parent sends a signal to terminate the children:
    +  #
    +  #   1. Every second, if no socket connection is available.
    +  #   2. Right after a socket connection becomes available.
    +  #
    +  # In other words, the parent sends the signal to the children every second, or right before
    +  # launching other worker children from the following new socket connection.
    +  #
    +  # Only the process IDs of exited children are returned, and their termination is attempted below.
    +  finishedChildren <- parallel:::selectChildren(timeout = 0)
    --- End diff ---
    
    This does not block at all. I tested this in https://github.com/apache/spark/pull/18320#discussion_r122605437 to be sure. Let me double-check.
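
    For context, `parallel:::selectChildren(timeout = 0)` polls for exited children and returns immediately. A minimal standalone sketch of the poll-then-kill pattern the comment describes (the fork/exit scaffolding here is illustrative, not the daemon's actual code; the triple-colon `parallel` functions are internal, Unix-only APIs, and the choice of `SIGUSR1` is an assumption for the example):

    ```r
    p <- parallel:::mcfork()
    if (inherits(p, "masterProcess")) {
      # In the child: exit immediately, like a finished worker.
      parallel:::mcexit(0L)
    }

    Sys.sleep(0.5)  # give the child time to exit

    # With timeout = 0 this does not block: it returns an integer vector of
    # child PIDs that have pending exit data, or a logical value when there
    # is nothing to collect.
    exited <- parallel:::selectChildren(timeout = 0)
    if (is.integer(exited)) {
      for (pid in exited) {
        # The parent, not the child, sends the kill signal (see SPARK-21093).
        tools::pskill(pid, tools::SIGUSR1)
      }
    }
    ```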

